CSCI 530 Lab
RSA public-key algorithm
operationally hard, formulaically easy
The RSA algorithm is centered on a short, simple piece of math: raising an integer to one number then dividing by a second one.
xy mod n
But you protest, shouldn't "dividing by a second one" be written as:
xy ÷ n ?
Usually in a division the "answer" is considered to be the quotient q. But a division has two results:
a = qb + r
The other result is the remainder r.
We tend to overlook it. Sometimes we explicitly discard it, focusing on the quotient as the purpose and outcome of dividing. But here, by contrast, the remainder interests us and the quotient doesn't. Just as the "÷" operation means determining the quotient, the "mod" operation means determining the remainder when doing a division. But it's the exact same operation. Division. The thing that differs is the choice of result:
| a ÷ b | ÷ means "division" - answer is the quotient, remainder is uninteresting |
| a mod b | mod means "division" - answer is the remainder, quotient is unintersting |
What's the answer when you divide 10 by 3? Why, it's 1 of course. Get it?
OK, now that we understand "same operation, different answer," what about the difficulty of doing RSA which means doing xy mod n? That's formulaically easy but operationally hard. It's easy because everybody knows about taking powers and dividing and remainders. If x is 4, y is 3 and n is 10 can you do the two steps? Can you raise 43? Sixty-four? Thank you. Can you divide 64 by 10? What's the answer? Four? Thank you. The steps are well-defined and familiar. How about if x is 65, y is 15141, and n is 212197? The challenge that now overcomes you is the size of the numbers. It's not that you don't know what to do, it's that you can't actually do it. You can't raise 65 to the power 15141 in your head. You can't do it on your calculator either. Or even your computer with a regular programming language. Those are made for reliable, accurate calculation. But not precise calculation. They offer some few digits of "holding capacity" for numbers and can't hold any more. They can hold a number as large as you like. The magnitude isn't the problem. But they can only hold a few digits of that big number. The precision is the problem. 6515141 is an integer with thousands of digits. The calculator can't hold it.
A tool you will like
There is an arbitrary precision calculator product called bc ("basic calculator"). It is installed by default in most linux distributions and a Windows version is available. bc contains a modest programming language. It can also be called from within shell scripts leaving the programming logic to the shell but calling on bc to do the heavy numeric lifting and return the results to the script. Unlike most numeric programs bc does not do its arithmetic in binary, which would limit it to the word size of the CPU and bus. Rather, it does the necessary internal contortions (unnatural to the computer) to conduct and maintain all its operations in decimal. It will give you as much precision as you want. It prints 6515141 on the screen down to the integer (which occupies several screensful of printed digits). It has no problem with performing such things as xy mod n, so it a great tool for doing RSA on a tutorial basis. For doing it on a practical basis bc is too slow. But it can do the operational part-- the hard part. It's just right for our purposes.
What RSA does
RSA does xy mod n twice, with different numbers. Once to encrypt, again to decrypt. Unlike secret key algorithms such as DES there is no excruciating sequence of intricate molestations forcing the plaintext through multiple meat grinders diagonally, laterally, and upside down. Just a single simple exponentiation divided. To make it work, you need only employ the right numbers. The numbers involved:
a public key consisting of 2 integers - e and n
a matching, precalculated private key consisting of 2 integers - d and n
an input integer, the plaintext message - m
the ciphered version of m, an integer - c
Then here's what RSA does.
encryption: me mod n
That is c, the ciphertext. Followed by
decryption: cd mod n
That will reproduce the original m. Integers are what RSA encrypts and decrypts. To process a substantial message, reduce it to a series of integers (ascii codes are a possibility) and individually treat those. That's it.
The exercise to perform:
As a convention, the text below printed like this signifies commands you are intended to type in and run as you see them. As you perform this exercise, when you determine values for p, q, e, and d retain them (write them down). You'll need them at the end.
In a linux environment with bc "basic calculator" installed, run it:
bc
You are now in an interactive environment. Though none is printed, you are at a prompt. If you type an expression and press enter, the value of the expression appears. Try
33
Suppose you are interested in one hundred divided by thirty-three:
100 = 3 x 33 + 1
If you want to do division and get the quotient:
100/33
You get the "3". If you want to do division and get the remainder:
100%33
You get the "1". In bc, " /" is the "division for quotient" operator while the "%" is the "division for remainder" operator. (While written texts use "mod", bc and other languages use "%" synonymously.)
Now do something heavier:
65^15141
Multiple screensful worth of answer show up. "^" is the exponentiation operator. (The language's few operators are documented in the bc man page, or the online manual.) Now, for the moment, quit from bc:
quit
Implementing RSA encryption and decryption in bc
As a preliminary, please obtain a file containing a needed library of functions for bc. What we need is the ability to find greatest common divisors. I found an online set of pre-written functions for bc including gcd(a,b), which returns the greatest common divisor of 2 integers (it uses Euclid's algorithm to do so). To be able to use the gcd function, acquire the file by the same name (which contains the desired function and a bunch of others) by
wget http://www-scf.usc.edu/~csci530l/downloads/gcd
or otherwise per your instructor. Put it in your current working directory. Then re-invoke bc together with gcd:
bc gcd
The gcd( ) function is now available, additively, within bc. So let's get back to business.
Doing the RSA algorithm boils down to 3 steps. The second one is encrypting an input with a public key and the third is decrypting the result with the matching private key. But don't forget step one, which is coming up with a pair of keys that match (i.e., that will actually work) in the first place. For right now, I will hand you a pair of keys that match so you can go ahead and perform encrypting/decrypting to satisfy that it works. Then we'll go back and learn how to generate keys if you don't already have them.
Encrypting an integer in RSA is just raising it to one number then dividing by a second one. The remainder, another integer, becomes the encrypted version of the first one. RSA is a substitution cipher, so it's used to replace the original integer in the data stream to be transmitted to a recipient. In order to do it, you just need to know what 2 numbers to utilize. They, as a pair, are "the key." Decrypting is the same thing with different numbers: the other key. As long as the numbers used as keys are right, you get correct recovery. Here you go:
Public key - 3 and 55
Matching private key - 27 and 55
These are suitable for encrypting and decrypting any integer up to 54. Let's try it on 43. In bc:
43^3%55
bc prints 32 as the result. That is, 43 encrypts to 32. Now, how do we revert 32 back into 43? In bc:
32^27%55
bc prints 43. It worked. Try it on some other integers below 55. You can't use this on any integer at all, and 55 is pretty limiting. In real calculations though you will use a much larger number in place of 55. But bear in mind, whatever that number, you can only process input below that threshold. Beyond it the math doesn't work.
Implementing RSA key generation in bc
This is the slightly tricky part. You can't just pick any numbers. My use of 3-55 and 27-55 was very calculated. To produce keys that work, here's what you do:
1. choose 2 primes - call them p, q
2. multiply them - call product n
3. multiply their “predecessors” (p-1,q-1) - call product φ
4. pick some integer between 1 and φ (exclusive) sharing no prime factor with φ
- call it e
5. find the integer (there’s only one) that, times e divided by φ, leaves
remainder 1 - call it d
then your keys are:
public: e together with n
private: d together with n
In the above set of keys, the primes I started with were 5 and 11. In general there are mulitple candidates for e, that is there are several or many integers less than φ that share no prime factor with it. It doesn't matter which one you adopt, they'll all work. Once you choose an e however, d is deterministically fixed. You just have to identify what number it is. bc takes the trouble out of all this. Please perform these steps and make keys of your own.
1. choose 2 primes
Your choice, your call. The prime numbers are widely published. Here are the primes under 2000.
2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 101 103 107 109 113 127 131 137 139 149 151 157 163 167 173 179 181 191 193 197 199 211 223 227 229 233 239 241 251 257 263 269 271 277 281 283 293 307 311 313 317 331 337 347 349 353 359 367 373 379 383 389 397 401 409 419 421 431 433 439 443 449 457 461 463 467 479 487 491 499 503 509 521 523 541 547 557 563 569 571 577 587 593 599 601 607 613 617 619 631 641 643 647 653 659 661 673 677 683 691 701 709 719 727 733 739 743 751 757 761 769 773 787 797 809 811 821 823 827 829 839 853 857 859 863 877 881 883 887 907 911 919 929 937 941 947 953 967 971 977 983 991 997 1009 1013 1019 1021 1031 1033 1039 1049 1051 1061 1063 1069 1087 1091 1093 1097 1103 1109 1117 1123 1129 1151 1153 1163 1171 1181 1187 1193 1201 1213 1217 1223 1229 1231 1237 1249 1259 1277 1279 1283 1289 1291 1297 1301 1303 1307 1319 1321 1327 1361 1367 1373 1381 1399 1409 1423 1427 1429 1433 1439 1447 1451 1453 1459 1471 1481 1483 1487 1489 1493 1499 1511 1523 1531 1543 1549 1553 1559 1567 1571 1579 1583 1597 1601 1607 1609 1613 1619 1621 1627 1637 1657 1663 1667 1669 1693 1697 1699 1709 1721 1723 1733 1741 1747 1753 1759 1777 1783 1787 1789 1801 1811 1823 1831 1847 1861 1867 1871 1873 1877 1879 1889 1901 1907 1913 1931 1933 1949 1951 1973 1979 1987 1993 1997 1999
You can choose from these or use a test-for-primality website to come up with others. (For performance, I suggest you pick a couple primes in the low hundreds.) In bc assign the primes you chose to variables named p and q:
p=<your 1st prime>
q=<your 2nd prime>
2. multiply them
You want to assign their product to a variable called n; it's also interesting to view it:
n=p*q
print n
3. multiply their “predecessors”
You want to assign this product to a variable called phi; it's also interesting to view it:
phi=(p-1)*(q-1)
print phi
4. pick some integer between 1 and φ (exclusive) sharing no prime factor with φ
Hmmmm.... how are we going to do this one? We have gcd(a,b), which returns the greatest common divisor of 2 integers. So my strategy is to run through all the integers from 1 to φ, test whether the gcd between each and phi is 1, and call those my set of candidates. (If an integer's gcd with φ is 1, it means it has no common factors with φ. That satisfies me, because if it has none at all then it has none that are prime, as required. This doesn't produce the whole set, but I don't need the whole set. I only need one and will pick from this qualifying subset.) Key in and run this implementing code.
for (i=1; i<phi; i++) { if (gcd(i,phi)==1) {print i," "} }
The numbers printed on the screen qualify, they are (some of) your candidates for e. You can use any. Pick one. (shift-PgUp and shift-PgDn allow vertical screen scrolling.) Assign it:
e=<your chosen candidate>
5. find the unique integer that, times e divided by φ, leaves remainder 1
Key in the following search, which fishes for (and catches) that integer:
for (d=1;d<phi;d++) { print d,"\t",d*e,"\t", d*e%phi,"\n"; if (d*e%phi==1) { print d,"\n"; break } }
print d
When the loop breaks, the value of d will be right. The 3 constituents of your keys-- e and n for the public, d and n for the private-- are now set in memory variables e, n, and d. Ready for use. As you did before with my keys, test out yours. Encrypt something, then decrypt it. Use 43 as your plaintext integer m, the same as you used with my keys. Then, again, do it with some other integer m of your choosing-- presumably much larger than 43 but make sure it's less than your phi. You know what to do: me mod n (which is c), followed by cd mod n. Make sure m is recovered as your last result-- m in, m out.
Be patient. Especially if you chose pretty big primes. If you aren't patient, start over with small primes or get another job. This is what they're talking about when they say "computationally expensive."
For this exercise I used:
p=643
q=919
n=590917
phi=589356
e=588911
d=411887
What to turn in:
What I want you to turn in will be produced by a shell script program named "save-rsa". It generates a file named "outfile". That's what I want you to turn in. Be sure to run this program and capture "outfile" on a medium for you to incorporate in your submittal. Obtain save-rsa by
wget http://www-scf.usc.edu/~csci530l/downloads/save-rsa
or otherwise per your instructor. Then run it:
chmod +x save-rsa
./save-rsa
It asks you for the integers p, q, e, d you used in this exercise. Do not use the sample values I provide above, nor those in the comments within the script. Use your own, from today's activity. It additionally asks you for a plaintext integer to encrypt. It encrypts, then decrypts (had better reproduce the plaintext).
When it is finished, capture/take file "outfile" for inclusion in your submittal. Here is a screenshot of save-rsa's operation:
What is your first prime? 131 What is your second prime? 373 What value did you choose for e? 15941 What as a result was d? 17741 What to encrypt? (must be less than 48863) 29292 Thu Sep 18 13:49:38 PDT 2008 My two primes p,q: 131 373 Their product n: 48863 "lesser product" phi: 48360 "relative prime" e: 15941 Resulting d I found: 17741 Number I am encrypting: 29292 Encrypted value: 46460 Recovered value: 29292