There really is a need for a secure crypto program that restricts itself to the basic functions, composed of only a small amount of highly readable code that will be clear and understandable not only for security experts helping to regain the transparency which is needed for users to get themselves into the driving seat again.
The main objective of the Pure Crypto Project is to provide such a program, consisting of just a few hundred lines of security relevant code based on a single, well understood cryptographic method (RSA) which can be fully analysed with respect to its security implications by a number of crypto-literate people. Because of the fact that the basic questions regarding the security of a crypto product remain the same I will explain the mechanisms behind PCP in a similar way as I have done for PGP years ago.
- Summary
- Explanation
- The Security of the Cryptographic Methods Used by PCP
- The Public Key Cryptosystem RSA
- The Hashfunction SDLH
- How the Secret Key Is Stored
- Practical Attacks on PCP
- Security-relevant Actions
- How to Deal with Your Passphrase
- Attacks on Your Passphrase
- What Is a Good Passphrase ?
- Long Live the Signing Key !
- Creating RSA Keys and Hash Keys
- Keep An Eye on the Threat
- References to Other Sources of Information
- A Final Request
There are two possible strategies to attack the security of the Pure Crypto Program:
which try to make use of the possible weakness of cryptographic methods used by PCP compromising PCP's reliability and security fundamentally, and
which try to exploit weaknesses of the practical use of PCP. These attacks intend to compromise the passphrase to gain access to the user's secret key or try to deceive other people by distributing faked public keys.
Theoretical attacks on PCP will be unsuccessful, as I will prove subsequently, because the use of RSA-keys of sufficient length makes it practically impossible to gain the user's secret key. The security of cryptographic methods is not merely based on religous faith but can be justified by the history of failed attempts to break them. As theoretical security is based on the results of recent research in the field of computer science it has to be constantly reviewed in the light of new knowledge.
With the pointlessness of attacks on the cryptographic methods the attention of potential datafiends is shifting towards the weaknesses due to the daily use of the secret key. You have to consider a huge number of practical problems if you do not want to put the security of PCP at risk unintentionally. Most important is maintaining the secrecy of your passphrase which protects your secret key because maintainig this secrecy is a problem most people do underestimate dramatically.
The asymmetric cipher RSA is used to enable persons to establish themselves as the only legitimate author or to ensure that only the intended person can read a message. Other crypto programs usually use a symmetric cipher and a randomly selected session key to ensure the confidentiality of a message because RSA is slow when long keys are being used to encrypt and decrypt relevant information. But a solution to the complexity-problem will make it necessary to reduce the number of different crypto-algorithms to a bare minimum, giving preference to the most concise and clear concepts available in the field. If one does not worry about performance, there is really no need for a symmetric cipher, which will burden the cryptosystem with a number of avoidable security weaknesses or risks only. Given a proper padding of the plaintext and sufficient key lengths, the public key cipher RSA can do all encryption with a simple, intuitive operation, modular exponentiation. Basically, RSA is used to link to a single person the ability to individually decrypt a message or to sign a document.
The second basic method which is put to use by PCP is the hash function SDLH, which relies on a similar clear and intuitive concept, so that the security of both methods are based on the same foundations. The main purpose of the hash function is to convert a very long text into a single number of a few hundred decimal digits which is used as a fingerprint to replace the long text in a digital signature.
One of the fundamental questions to be answered is of course:
Can you really be sure that only the legitimate recipient of your message is able to read it in plain text?
A satisfying answer will imply that one can be sure that someone who has really enormous computational powers will not be able to invert the encryption process in a reasonable time scale, i.e. that the clear text will remain unaccessible and secondly that only the person who is the intended addressee of the message has the necessary information to turn the encrypted message back into readable plain text.
Systems which use symmetric ciphers usually guarantee the first condition showing that a brute force attack on the session key is computationally infeasible, because of the large number of possible session keys that can be used. To estimate the amount of work we have to perform to find a session key by brute force it is helpful to realize the fact, that computers have limits and that there simply is not an endless source of improvement in the computational abilities of supercomputers.
EXCURSUS
Why should it remain unfeasible for a computer to test all possible keys of a 128 bit symmetric cipher in the near future rather quickly ?
It remains unfeasible because as a matter of principle a computer will not have the necessary efficiency. According to today's state of physical research which in future could change the spreading of information in every imaginable computer system is limited by the speed of light.
Not even the results of current "experiments measuring velocities higher than the speed of light" do contradict this fact.If you assume the size of a high-performance computer system performing keytests is 0.3 mm for electronic or optical transfer of information, it can only perform 1 000 000 000 000 operations (10^{12}) per second, otherwise it has to be smaller. Over a period of 317 years or 10 003 759 200 seconds that will sum up to 10 003 759 200 000 000 000 000 operations. This ability to perform no more than 10^{22} operations within 317 years is truly not sufficient for testing 10^{22} different symmetric keys because every keytest requires more than a single operation cycle. But even if it was possible to do a keytest that fast, the large number of 2^{128} keys would have been searched for no more than 0.000 000 000 000 000 029 percent. Thus a specific key will not be found even if "lots" of those high-performance computers will work parallel to test the keys.
This argument is based on empirical assumptions and therefore it can be refuted by experience. But without any doubt it is definitely wrong to suppose "efficiency of any size" for future computer systems.
It is interesting to compare the ability of our proposed supercomputer to a real machine, "Deep Crack", which was invented in 1998 by EFF to show that a brute force attack on DES, the standard symmetric cipher used by banks and other commercial institutions for many years, was indeed possible. Deep Crack is a parallel computer using special hardware of 1500 Deep Crack chips and managed to test 90 billion DES-keys per second, so that any 56-bit DES-key can be found in about 5 days. This single maching today provides 9 percent of the ability we assigned to our proposed 0.3 mm supercomputer.
Of course the above question must be answered differently if encryption is done with the public key cipher RSA because there are longer keys being used and different methods of attack against the security of the secret key are much more promising than brute force, but knowing that about 10^{22} operations is today's limit for a successful attack will help to assess the necessary key lengths for RSA.
The secure performance of the public key cryptosystem guarantees:
On the other hand the validity of a digital signature depends on the fact, that only the person itself was able to create the document by applying his or her secret key which also made him or her responsible for the contents of the document. A person's intellectual property can only be protected by proving the copyright in a document if nobody else is able to create an information that matches with this person's public key.
The RSA cryptosystem basically uses three different and very long numbers (e,d,n), even if one of those numbers actually is chosen very small due to the technical implementation of the cipher, the security of the cipher will not be reduced. Two of them, the public key e and the modulus n are used to encrypt a message. The message to be encrypted in principle is understood as a very long number itself which will be raised to the power of e, the recipient's public key, and the modulus n will then be subtracted from the result until it gets smaller than n. This remainder (mod n) is the cryptogram or the encrypted message ready to be sent to the recipient. The recipient himself applies the same procedure to the cryptogram to decrypt it. Again the encrypted message is understood as a very long number being raised to the power of d, the secret key, which miraculously yields the original message after the remainder has been computed using the modulus n. Just one single number d strictly has to be kept secret whereas the method of computation and both publicly known numbers e and n as well as the cryptogram need not to be kept secret and can safely be analysed by everybody.
But the RSA cryptosystem only works well, if the three numbers fit together
"in an appropriate way". Matching this condition is ensured by the
process of generating a pair of keys which also gives a fine recipe for cracking
the whole cryptosystem.
To generate a pair of keys you have to find two very large primes, p and q, which
make up the modulus as a product (n = p*q). You can derive both interesting numbers,
e and d, which form the actual pair of keys, very quickly from these two primes
which are thrown away finally.
An attacker who knows the modulus n, say p*q, only needs to find out which
primes the modulus is composed of, in other words he or she just has to
factor the modulus, after that computing the secret key is a matter of
seconds. Fortunately the problem of factoring
is a job causing an awful lot of work if numbers are concerned which consist of
several hundred decimal digits, so this is a practically insoluble problem
.
The security of the RSA cryptosystem therefore depends on the fact, that it is practically impossible to factor the large known modulus n and nobody can infer the two primes p and q from his or her knowledge of the publicly known modulus to gain the secret key.
It may be worth noting, that the security of RSA also depends on the absence of any "shortcut", so that there is no way to decrypt or sign a message without using the secret key. Because of the fact that the cryptogram was created by taking the message to the e-th power (modulo n) one might think that computing the e-th root of the cryptogram (modulo n) will produce the message. On one hand nobody has found a way to perform this calculation yet, given that really large numbers are involved but on the other hand it has not been proved that calculations of e-th roots modulo n is of the same complexity and requires as much effort as factoring the modulus. So there might be a shortcut not yet known to the cryptographic community that can render the use of RSA useless in future and it would be wise not only to have a close eye on the advances in factoring but also on the possibility of shortcuts like calculations of roots or even some completely different approach.
How it really works: RSA
Take an integer value M that represents a message text and a large prime p > M and start calculating powers of M, say M^{1}, M^{2}, M^{3}, M^{4} and so on. Taking the remainder of these powers of M modulo p will result in a sequence of numbers between 0 and (p-1), which will appear in a strange and unpredictable order. But within this sequence the number 1 will show up when the exponent is (p-1) regardless of the integer M you have chosen. With a different prime q the number 1 will appear exactly when you finish to calculate M^{(q-1)} mod q.
Now we start calculating powers of M modulo the composite number p*q. Which exponent would you expect to produce the number 1 again ? If (p-1)*(q-1) seems to be the only natural answer to you, you have reinvented a theorem due to EULER, stating that for every M < n = p*q
M ^{phi(n)} mod n = 1 where (p-1)*(q-1) is usually called phi(n) (Euler's totient function).
We now are only one step away from RSA.
If you contiue to multiply the power by M once again you get
M ^{phi(n) + 1} mod n = M And if you choose your public/secret key pair to be e * d = phi(n) + 1 then you have:
M ^{e * d} mod n = M Thus applying the simple operation of modular exponentiation to a message twice, first using the public key for encryption and secondly using the secret key will recover the original message !
The Previous History of Factoring
- 70-digit numbers have been factored in 1998 on a workstation within 10 hours.
- 100-digit numbers have been factored on a single workstation within 1 year.
- 129-digit numbers :
In August 1977 Martin Gardner asked the readers of Scientific American to factor
114 381 625 757 888 867 669 235 779 967 146 612 010 218 296 721 242 362 562 561 842 935 706 935 245 733 897 830 597 123 563 958 705 058 989 075 147 599 290 026 879 543 541 .
Some 16 years later, in April 1994, the factors were presented by Paul Leyland (University of Oxford), Michael Graff (University of Iowa) and Derek Atkins (MIT). They had been supported by over 600 volunteers running a computer program written by K. Lenstra (Bell Labs, Morristown, New Jersey) on their workstations at night sharing the work of factoring over the internet. Later the amount of work was estimated to approximately 5000 Mips years. One Mips year is equivalent to 3.15 * 10^{13} operations.This experience gave rise to the "RSA Factoring Challenge" a contest which aimed at encouraging the research into number theory and the exploration of the difficulty of factoring in practice.
- A 155-digit number had been factored on August 22, 1999
by a group of researchers using the general number field sieve method.
The work started on March 17, 1999 and kept a total of 292 computers busy for little more than 5 months. The process of sieving required some preparations in which CRAY supercomputers were involved and it took nearly 4 months time and was worth approximately 8000 Mips years of effort. This complies excellently with the prediction made in 1996.- 160-digit numbers:
In 1996 experts expected factoring to be possible within about 5 years using a new method of factoring known as number field sieve.Factoring of RSA-155 in 1999 was important, since it made RSA keys insecure which had a length of 512 bit, a key length which was considered as "Low commercial grade, fast but less secure" since PGP was first introduced in 1991.
- 200-digit numbers:
The time for factoring was estimated at 52 000 000 years in 1998.
Conclusion Using a minimal key length of 1024 bit corresponding to a RSA-modulus of at least 309 decimal digits, it is guaranteed that RSA can be considered safe for the near future as long as there will be no fundamental advance in the factoring of large numbers. Today lots of RSA keys are certified with key lenghts of even 2048 bit corresponding to 617 decimal digits.
Without any doubt mathematics has made considerable progress within recent years in factoring very large numbers which has been discussed in detail by J. Buchmann in the June-issue of Scientific American 1996. Using advanced computer technology massively, completely different methods of factoring have been developed which essentially differ from conventional methods.
In 1985 W. Lenstra developed a method based on elliptic curves, which made it possible to find factors of 30 decimal digits size within three weeks using current workstation technology. This method is successful even if the number which has to be factored consists of several thousand decimal digits. If the RSA-modulus is composed of a small and a large factor, the small one could be found within an acceptable period of time. So it is essential that two sufficiently large primes of almost the same size are used for RSA-key generation to prevent fast factoring of the modulus.
A different method known as quadratic sieve requires the solution of a system of linear equations consisting of a huge number of linear equations and unknown quantities. Using this method a 120-digit RSA-number was sucessfully factored in 1993, although 252222 equations with a total of 245810 unknown quantities had to be solved. This work which would have taken almost 50 years using a single workstation could be parallelized, so that it could be done within a much shorter period of time. The method of number field sieves which is based on a similar principle but works much faster than the quadratic sieve was successfully used in 1996 for the factoring of a 130-digit RSA-number.
Considering the progress of factoring over the last ten years one could expect 170-digit numbers to be factored in the near future even if today no known algorithm is able to perform this job within a reasonable amount of time. But it is still possible that a completely new algorithm will be found which would make todays secure RSA-keys absolutely worthless. So it will be essential to observe new developments in the field of factoring and to consider them carefully when evaluating the length of RSA-keys.
RSA has long been subject to a patent (No. 4,405,829) under the law of the United States of America which turned out to be an impediment to the future development of PGP and it seems to be the reason why newer versions proceded to introduce new Diffie-Hellman/ElGamal keys. But this patent has expired in September 2000 and RSA now is in the public domain even in the US.
Attacks on the RSA Cryptosystem
Looking at the Source Code: Modular Exponentiation
def ModExp (Base, Exp, Mod): Hash = 1 X = Exp Factor = Base while X > 0 : Remainder = X % 2 X = X / 2 if Remainder == 1: Hash = Hash * Factor % Mod Factor = Factor * Factor % Mod return Hash This function of 11 lines is the implementation of the basic function that performs modular exponentiation of very large integer numbers, the heart of the RSA cryptosystem. The basic idea behind this function is not to cancel out the multiples of n in the result of a very, very large number in the end but to get rid of multiples of n early during the stepwise process of computing the power so that the resulting numbers do not exceed a certain value. This is done by extracting the binary bits of the exponent beginning with the least significant one. If the bit is 1 a factor representing a certain number of bases multiplied which corresponds to the value of the bit is multiplied to the rest which has accumulated so far. When the most significant bit is reached, exactly as much bases have been multiplied as defined by the exponent and the computation is complete.
Unfortunately it is generally insecure to apply this function to blocks of plain text directly or to sign a message directly because cryptograms and more important signatures are multiplicative. That implies that you can generate a "new" valid signature simply by multiplying two old ones and the same holds for crypto blocks. To defeat this multiplicative characteristics of RSA a one-way hash function is used to convert a message into a large number which is being signed instead of the message, so multiplication is no longer a threat because no-one knows which plain text corresponds to the result of the multiplication as the function is one-way.
Preparing a plain text for safe encryption is more difficult and requires a mechanism called hash chain encryption padding which ensures that some random data is attached to the message and that a wilful or accidental manipulation of encrypted blocks can be detected to thwart oracle attacks which I will explain now.
Oracle attacks will work in a way similar to blinding attacks on signatures which were described by Boneh in detail. An attacker uses a random number r and computes C'=C*r mod N and sends C' for decryption to the victim. If the attacker knows that the plaintext will have a certain known structure, he can deduce information about the plaintext from the fact whether or not the victim accepts the message as a valid plaintext.
The recipient can form the same hash-chain during decryption and can recover the message at any stage with the current PAD1. And she can check, if the current CHALLENGE is the XOR of both pads at stage i. If not, an oracle attack is imminent and anyway integrity is violated. In this case PCP now continues to decrypt (to prevent timing attacks) and throws away the decrypted material leaving a warning in the output file, that an oracle attack may have been launched and the user will never ever see the decrypted material.
How it really works: Hash-Chain Encryption Padding
- Use standard OAEP (Optimal Asymmetric Encryption Padding) to transfer a single Nonce of 256 bit (a number used once) in the first crypto packet to the recipient.
OAEP is a method normally used to blind a small piece of information like a session key prior to encryption with RSA. It consists of two steps:
- Calculate the hash of a seed-value xored with the data block to be transmitted:
C1 = hash(seed) XOR message- Calculate the hash of the first step xored with the seed:
C2 = hash(C1) XOR seedThe recipient can learn the data block by recovering the seed from C2 XOR hash(C1) and finally gets the message from C1 XOR hash(seed).
- Prepare a hash-chain for the second block (first encrypted message block)
PAD1 = initial Nonce
PAD2 = hash(PAD1)
CHALLENGE = PAD1 XOR PAD2- Create a long number to be encrypted with the RSA primitive ModExp as follows:
Concatenate two numbers, the 256 bit CALLENGE and the padded message with PAD1 extended to the whole length of the message.i-th Block : Message XOR (repeated PAD1) || CHALLENGE- Prepare the hash-chain for following block i+1:
PAD1 = PAD2
PAD2 = hash(PAD2)
CHALLENGE = PAD1 XOR PAD2
- Continue at step 3 until all plaintext is processed.
An attacker will only be able to replace a crypted block by another one, if she can manage to keep the lower 256 bits inside the encrypted packet the same. Otherwise the hash-chain is broken and the decrypted material will disappear forever. But that would imply that she could fix the lower 256 bits (the CHALLENGE) of the new forged plaintext encrypted under the recipient's public key without knowing its value.
Of course it is possible that some other text will accidentally produce the same hash value because even if the number of different hash values is very large, it is also limited. But such a collision could not be constructed intentionally because you have to try about 10^{77} variations to find one. The one-way function does not give any clue for an appropriate choice of the forgery and the possibility to get the same hash value from any two different texts is almost close to zero (1/2^{256} or even less than 1/2^{1024} for the hash values used in signatures).
The Birthday-Attack
How many people are required to make the possibility for two of them having their birthday on the same day reach 50 percent ?
The wrong answer is: 365/2 = 182 people.Actually you only need 23 people which is far less than expected, and related to forging a digital signature, the forger can benefit from this situation considerably. If she does not intend to forge a certain existing document but rather confines herself to submit a document to the potential victim for signing and she has already found a forged document which hashes to the same fingerprint, forging someone's digital signature is possible, using such a "suitable pair of documents" (benign and malicious document).
The number of pairs of documents she has to check until she will probably find one working pair is no longer 2^{256} (115 792 089 237 316 195 423 570 985 008 687 907 853 269 984 665 640 564 039 457 584 007 913 129 639 936), "but merely" 2^{128} (170 141 183 460 469 231 731 687 303 715 884 105 728), that is 10^{39} pairs of documents. Finding a working pair of documents still is "bloody lotta work", and way beyond today's computational abilities (see my argumentation concerning the brute-force attack) but anyway, it far less work than expected.
Proof
With N different documents (benign and malicious all in all) there are N(N-1)/2 pairs. Given that there are K = 2^{160} different hash values the probability of two different documents sharing the same hash value is 1/K. To be on the safe side one must try at least half of N^{2} pairs of documents, for if the probability for a collision should be 50 percent, 2^{80} pairs of documents will be sufficient.
(N^{2} * 1/K) / 2 = 0,5
N = sqrt(K) = 2^{80}.Among those 1 208 925 819 614 629 174 706 176 different pairs one will probably be suitable for forging a digital signature. It might be necessary to check slightly more than 2^{80} documents, because of the fact that there are half malicious and half benign documents a matching pair can be within one of the groups which does not help to make a forgery. But since there are twice as many "cross-community" pairs than "single-community" pairs, 2^{81} documents will provide a fair chance for a forgery.
Note that this is a calculation which would apply for a hash function like SHA-1 with 160 bit hash values. In PCP the hash values used in signatures are all longer than 1000 bits as shorter hash moduli are not accepted by PCP.
How it really works: Shamir Discrete Logarithm Hashfunction
The SDLH is based on a simple idea that once the message is converted into a long integer a hash of the message can be computed as follows: hash(x) = g^{ x} (mod p*q)given, that both p and q are large primes which are being kept secret so that factoring n = p*q is computationally infeasible.There are a number of issues to be addressed, especially when the SDLH is used together with the RSA signature scheme and a full analysis of the SDLH's security can be found in the paper "A Discrete Logarithm Hash Function for RSA Signatures". The analysis shows, that SDLH can safely be used together with RSA once certain conditions are met with regard to the selection of the user's key material. For details I like to refer to the paper about SDLH.
Adi Shamir once proposed the following hash function: Let n = p*q be the product of two large primes, such that factoring n is believed to be infeasible. Let g be an element of maximum order in Z_n^* (i.e. an element of order lambda(n) = lcm(p-1,q-1)). Assume that n and g are fixed and public; p and q are secret. Let x be an input to be hashed, interpreted as a non-negative integer. (Of arbitrary length; this may be considerably larger than n.) Define hash(x) = g^x (mod n). Then this hash function is provably collision-resistant, since the ability to find a collision means that you have an x and an x' such that hash(x) = hash(x') which implies that x - x' = k * lambda(n) for some k. That is a collision implies that you can find a multiple of lambda(n). Being able to find a multiple of lambda(n) means that you can factor n. I would suggest this meets the specs of your query above. Cheers, Ron Rivest Ronald L. Rivest Room 324, 200 Technology Square, Cambridge MA 02139 Tel 617-253-5880, Fax 617-258-9738, Email
Attacks on the Hashfunction SDLH
Looking at the Source Code: SDLH
################################################################ def SDLH(Message) : # # Calculates hash(x) = Generator^x mod HashModulus # # precomputing table for step (1) table = [] table.append(1L) index = 1 Number = 1L while index < 256 : Number = (Number * Generator) % HashModulus table.append(Number) index = index + 1 # process the message character by character Hash = 1 for Character in Message: A = table[ord(Character)] # step (1) B = Hash # step (2) index = 8 while index > 0 : B = (B * B) % HashModulus index = index - 1 Hash = (A * B) % HashModulus # step (3) return Hash ################################################################ def SDLH256(Message) : Hash256 = 0L X = SDLH(Message) Size = pow(2,256L) while X > 0 : Section = X % Size X = X / Size Hash256 = Hash256 ^ Section return Hash256 Although calculating a hash value would require a single call to the function ModExp() only the message has to be converted into a long integer first. This would slow down the hashing even more because every character will be processed twice, first during convertation and second during exponentiation. As the hash is already the slowest part of PCP the above code performs convertation and exponentiation simultaneously while the characters are being fed into the loop. A message M = c1 c2 c3 ... cn will be converted into a number by consecutive multiplications
a modular exponentiation using this number as the exponent can be performed in a similar way m = (...((( 0 + c1 ) * 256 + c2 ) * 256 + c3 ) * 256 + ... + cn)
cancelling out any multiples of the hash modulus as soon as they appear in the intermediate results. g ^{m} = (...((( 1 * g ^{c1} ) ^{256} * g ^{c2} ) ^{256} * g ^{c3} ) ^{256} * ... * g ^{cn})
The process can be speeded up further by precalculating the 256 powers of g stored in a look-up table used inside the loop. Anyway one ModExp() operation per character makes hashing expensive but that is the price to be paid for a 1024 bit hash function which is collision-resistant and is based on an intuitive, clear and understandable concept. (If you don't believe me, look at the 400 lines of C code for SHA-1)
Finally a 256 bit version is created by xoring all 256 bit pieces of the hash value beginning with the least significant bit.
For the creation of signatures the most important property of a hashfunction is its "collision-resistance", it must be practically impossible to create a different document deliberately that hashes to the same value, otherwise the hashfunction is compromised and can no longer be used for digital signatures. Please keep in mind that always there will be collisions for every hashfunction because the number of possible documents which can be an input of a hashfunction are digested into a fixed number of hash values, but those collisions which are inevitable cannot be created at will and therefore it is extremely unlikely but nevertheless possible that collisions can be found accidentially. Producing collisions is not a problem as long as it is an extremely unlikely event and it is unpredictable. But any chance to create a collision using a certain method would certainly be the death of the hashfunction. As we saw SDLH is provably collision-resistant but an adversary who is able to factor the hash modulus can compute phi(n) and can create collisions. These collisions will certainly not make much sense but to protect signatures made by PCP it is essential that factoring of the hash modulus must be infeasible for the lifetime of the signature.
But the same method suitable to protect your secret key cannot be used to encrypt a message that is meant to be sent to another person for practical reasons which are generally known and accepted. So the intuitive and secure one-time-pad method can be used locally only without the need to transfer the pad to another party.
Let me first describe in detail how the secret decryption value d is protected on your local computer system before I will discuss the reasons against the plan to engage this method as a general symmetric substitution for the RSA-encryption of plaintext.
The following two functions are used to prepare the secret key stored encrypted in the local filesystem for use by PCP:
Looking at the Source Code: Secret Key Decryption
def load_secretkey(): import os, sys global Decryption # makes secretkey available in Decryption if protected == "otp" : print EOL + "Your secret key is protected." +EOL print "Please enter your passphrase to unlock it." Flags = Mode = 0 try: CONSOLE = os.open(Console, Flags, Mode) PASS = "" if OS == "unix" : os.system("stty -F " + Console + " -echo") CHAR = os.read(CONSOLE, 1) while CHAR != EOL : PASS = PASS + CHAR CHAR = os.read(CONSOLE, 1) if OS == "unix" : os.system("stty -F " + Console + " echo") except: print "Cannot open console for reading." print "Your secretkey is unavailable!" sys.exit(3) Hash = hash256(PASS) print EOL + "Unlocking your secret key." Decryption = unlock_secretkey(Hash) # Burn sensitive information PASS = str(Modulus) Hash = Modulus else: print EOL + "Your secret key is not protected !" + EOL # check if secretkey works well with a challenge Number = 1893947706487L EncryptedNumber = 0L EncryptedNumber = ModExp(Number, Encryption, Modulus) Challenge = 0L Challenge = ModExp(EncryptedNumber, Decryption, Modulus) if Challenge == Number : print "Secret key is good." else : print "Error : Your secret key is unavailable." sys.exit(4) def unlock_secretkey(H): import os, sys try : File = open(Home + "entropy", "r") Entropy = File.read()[:-1] File.close() # check if entropyfile is long enough if len(Entropy) < 1099999 : print "FATAL ERROR: There is not enough entropy", print " available!" print len(Entropy) sys.exit(4) except: print EOL + "Your entropy file is unavailable !" + EOL sys.exit(4) X = H IndexList = [] while X > 0 : Remainder = X % 1000000 X = X / 1000000 IndexList.append(Remainder) OTP = Modulus # Initial value of PAD for index in IndexList : PadString = Entropy[int(index) : int(index) + int(Block+1)] NewPad = StringToLong(PadString) print "*", OTP = OTP ^ NewPad print return Decryption ^ OTP # otp-method If the secret key is protected as it usually should be, the user's passphrase string is read in from the console and then hashed into a 256 bit number using the secret key's modulus for hashing. To unlock the secret key the resulting number of 78 decimal digits is then split into 13 pieces consisting of 6 decimal digits each, providing 13 pointers into a file which consists of at least 1100000 bytes of truly random data. Each pointer provides a different sequence of random data with the length of the secret key's modulus. All 13 random sequences are then xored producing the one-time-pad that had originally been used to encrypt the secret decryption value which is finally xored with the protected decryption value to recover the secret key. But before the secret key is actually used it is checked if some challenge can be decrypted correctly after having been encrypted with the corresponding public key,
The sequence of random data is clearly used as a one-time-pad because it will never be reused to encrypt anything else but the secret key and if the hash function does not cluster, every possible pointer is equally likely, depending only on the passphrase used, which must contain as much entropy as the pointers.
Entropy measures the amount of uncertainty being included in a passphrase. The amount of entropy can be reduced drastically if you can be certain that the passphrase is taken from a dictionary comprising no more than say 130 000 different words. Even if the passphrase is hashed the entropy is not 2^{256} but merely 2^{17} due to the fact that only a few of all possible combinations of characters are being used as a passphrase. But once the user has selected a passphrase with enough entropy to recover the pad one has to recover all 13 pointers used, i.e. one has to guess the passphrase, if the hash is a one-way function.
A brute force attack on the secret key encryption would roughly require to check all possible combinations of 13 sequences out of a million, which is roughly proportional to enumerating all 256 bit hash values. Usually it seems much more rewarding to make assumptions on the passphrase and to try dictionary attacks instead of mounting a fully fledged brute force attack, hoping that the user has a weak passphrase. But if the passphrase has enough entropy the strength of this encryption scheme is almost 256 bit, provided that the entropy file is made out of truly random data, which is not an easy job to do.
I would like to stress that the quality of the entropy file is of utmost importance for the security of the secret key so that it is prudent to spend as much care as possible to select the 1100000 Bytes of the random pool using a truly random source of entropy and to protect this file along with the secret keyfile using the operation system's protection mechanisms carefully. It can even work out to be a wise decision to store these sensible data on a removable storage medium to be locked away while not in use.
EXCURSUS
Why would a randomized stream cipher not be suitable to encrypt messages in transit ?
Ferguson, Schneier and Wagner published their cryptoanalytic research on a randomized stream cipher implemented by TriStrata which was designed as a symmetric encryption algorithm also using a 1 MByte random data pool but only 5 instead of 13 pointers to generate a keystream for encryption. They analysed this smaller variation of an encryption scheme first introduced by Maurer which promised to be both provably secure and at the same time practical using a very huge amount of publicly available random data with 50 pointers. In their analysis Ferguson et. al. showed that serveral attacks against the smaller Maurer-like variant of the cipher are possible with a complexity of 2^{93} for an exhaustive search attack on the pointer space and even 2^{57} when a meet-in-the-middle attack is being mounted. They also showed that improved attacks with a complexity of only 2^{39} are possible if one could induce faults into the random pool or if one could find related keys whose contributions to the keystream cancel out. They concluded that if the number of pointers used is increased to 14 the complexity of an attack would rise to 2^{128} but with only 5 pointers the TriStrata system will not always be secure. All attacks against the TriStrata variant are based on two assumptions which are realistic for a symmetric cipher. Firstly, that the random pool is publicly known, a fact we can suppose for PCP's entropy file as well, even if it is protected by the operation system's protection and ACL mechanisms and secondly, that the stream of encrypted data can be observed freely by the attacker who knows some bytes of the plaintext because of some known structures of certain plaintexts like WORD-documents and the like. Ferguson et. al. suppose that the adversary knows at least 16 bytes of plaintext revealing 16 bytes of the keystream. But this assumption of known plaintext is unrealistic in the context of PCP's protection of the secret key as the secret key does not only lack a known structure but also even if a fraction of the secret key is known there are different ways to reconstruct the rest of the secret decryption value and the protection is already broken. Thus the special purpose the Maurer-like encryption scheme is used for in PCP to protect the secret key will render the known-plaintext attacks described by Ferguson et. al. useless.
But at the same time their cryptanalysis shows that such an encryption scheme cannot be safely used as a general substitution for RSA encryption inside PCP to speed up its performance. As long as the random pool and the number of pointers are not increased considerably randomized stream ciphers a la Maurer cannot be regarded as a secure cipher suitable for the exchange of strongly encrypted messages.
But anyway there is one aspect of Ferguson's analysis which may affect even PCP's protection of the secret key. If the attacker is able to tamper with the stored pool and the adversary can observe enough ciphertext encrypted with the manipulated random pool she might locate the position of the pointers (see section 5.2). But in the context of PCP's protection mechanism the attacker will not be able to observe the faulty ciphertext, because PCP checks if the secret decryption value works correctly with the public key, and if not, it will display a warning that the secret key is unavailable and will simply stop working. On the other hand as long as the attacker constantly keeps changing the random pool and at the same time observes that Alice is happily signing her files she knows that the manipulated bytes of the pool are not inside the 13 sections which are used to create the one-time-pad. But if she hits a significant byte Alice will notice that her secret key does no longer work and if she is careful, she will create a different entropy file to protect her secret key taken from the safe again bringing the attacker back to square one.
RESULT Using sufficient long keys an attack on cryptographic methods used by PCP is practically pointless. This "theoretical security" has to be evaluated with respect to current results of cryptographic research.
Fortunately most of the everyday tasks can be carried out without any difficulties, because only public keys are used. Checking the signatures on your emails you have received will require the use of the sender's public key only and even if you send confidential information to other people, you have to encrypt them using the recipient's public key.
But adding a new public key of an unknown person to your pool of trusted keys, is definitely a security-relevant action, because by making the new public key a valid key for use by PCP you definitely recognize that the available public key you often have received from non-reliable sources (i.e. key-servers) is actually identical with the public key as it was initially generated by that person itself. This could be a problem if you do not know that person or if you can only communicate with him or her using an unreliable channel like the internet which frequently happens.
How can you assess the reliability of an unknown public key? There are two different ways to solve this problem.
The first one relies on some first-hand knowledge about the new public key. You may be able to obtain the public key's digital fingerprint through a direct and reliable contact with that person comparing this first-hand information with the public key's fingerprint you have received from unreliable sources. Having confirmed the public key's authenticity you can certify this key yourself signing the new key with your secret signing key and adding it to your pool of trusted keys.
As PCP can use public keys being stored in different key formats, including RSA keys in PGP-version-2, PGP-version-3, SSH and X509 formats at the moment, checking the new public key's fingerprint will also depend on the reliability of the corresponding software as well. But once you have established trust in the authenticity of a new key you can use PCP's extraction tools to extract the public key's modulus and encryption values in decimal digits together with the user's identification string from the binary or armored key file. These three lines build the foundation of a new PCP key file which has to be supplemented with a hash key (hash modulus and generator) the key owner is currently using. Of course you will need first-hand knowledge about the hash key too, to be able to complete the key file which will eventually look like this:
After having stored this file in the trusted-keys directory you have to sign it in order to enable its use.
Looking behind the scenes: Public Key Structure
Modulus = 60543512488910112606602583998 ... 671880494444421949488339311 Encryption = 65537 HashModulus = 9839391050497052426120625 ... 899354547842152832236191997 Generator = 115277462192246252358083703 ... 011718422799012403067626981 Ralf Senderek, Wassenberg PCP signingkey 2003 Securityhash = 494660087634075087624428 ... 086648493733366295231438448
Looking behind the scenes: Public Key Certificates
- - - - -----BEGIN PURE-CRYPTO SIGNED MESSAGE----- Modulus = 60543512488910112606602583998 ... 671880494444421949488339311 Encryption = 65537 HashModulus = 9839391050497052426120625 ... 899354547842152832236191997 Generator = 115277462192246252358083703 ... 011718422799012403067626981 Ralf Senderek, Wassenberg PCP signingkey 2003 Securityhash = 494660087634075087624428 ... 086648493733366295231438448 - - - - -----BEGIN PURE-CRYPTO SIGNATURE----- Hash: SDLH *** based on modular exponentiation and RSA alone *** Ralf Senderek, Wassenberg PCP signingkey 2003 196776708728465316800282963884340869037 ... 143076856763231124797789521 - - - - -----END PURE-CRYPTO SIGNATURE----- Note, that there is a single end-of-line character in line 8 following the message body in order to allow the signing of files that do not end with an EOL character.
Securityhash
You may have noticed line 6 which is the place designed to store the public key's security hash being a hash of the decimal values stored in line 1 to 4 with the user's identity string concatenated. The security hash will detect any manipulation of the key material reliably and of course - learning from the ADK problem - nothing else but these 6 lines will be an acceptable public key body, becoming valid only if it bears a signature made with your signing key. To ensure that your signature can be validated reliably PCP will always check the security hash of your signing key before anything else is done. Although this step will take time while using PCP it provides the necessary information confirming, that your signing key has not been tampered with during storage in the local filesystem.
Despite the clear advantages of the discrete logarithm hash function, which is used to compute the security hashes, there has not been much practical experience collected by now, a fact which raised concerns by some people that nothing new should be used but the standard hash functions like SHA-1 which are supposed to have been subject to intense scrutiny and review. Consequently PCP is designed to run in two different modes. The "conservative mode", being the default mode, which addresses the concerns of those who will only trust standard algorithms uses SHA-1 as the hash algorithm under all circumstances. Clearly this is a compromise solution because at this point the program relies on a precompiled module implementing SHA-1 breaking with the idea of pure crypto which should entirely be understandable by the user.
But the "pure mode", which can be selected on the command line or set permanently in a config file, will use the discrete logarithm hash function SDLH, which being used with moduli longer than 1024 bit, will produce hash values used in signatures of at least that length. With more scrutiny and experience I am convinced that sooner or later SDLH can be the default mode of PCP, making the "SHA-1 concession code" obsolete in the near future.
Meanwhile the public key files of users who prefer the conservative mode can contain the number 1 for the hash modulus and the generator indicating that the user has no hash key. Using such a key in PURE mode to verify a signature will cause PCP to switch to conservative hashing while everything else will be done with the discrete logarithm hash function.
The second way to establish trust in a public key is to receive a signed key body, usually called a certificate, which can be verified using an already trusted key. It will depend on your assessment of the person's ability to check the authenticity of the signed key material with care whether you will decide to replace your trusted introducer's signature with your own to enable the key's use within PCP or not. Accordingly you can simply remove your signature on a key to revoke the key, rendering it useless for PCP, if you have reasons to mistrust an already signed key.
Key Spoofing Attacks
Some carefully crafted key spoofing attacks have been described in detail by Ross Anderson and Roger Needham. Especially when users try to sign something that contains encrypted material already an opponent who is able to factor the modulus used for encryption (i.e. at least the recipient Bob) can work out a different message M' and publish a modified public key for Bob so that the new message M' looks like a valid signature made by Alice on something that was encrypted for Bob.
To prevent this kind of attack PCP checks every message the user is about to sign whether it contains lines of cryptographic material - which is everything made out of 0..9 on a line by itself - and issues an unequivocal warning that the user should not proceed with the signing process unless he knows what he is doing. The warning makes clear that it always is prudent first to sign before encrypting something.
Another form of key spoofing attacks require compatible weak keys that will work if an attacker is able to modify the modulus in a public key published in an unprotected place. This will show up in PCP because before any key is used the security hash is printed, so that the vigiliant user will notice that the hash has changed. If the manipulation has occured before the user signs the key and the correct unmanipulated security hash is on the key and has been checked with the user, then PCP will notice that the signed key material gives a different security hash than the one which is signed. This will lead to another unequivocal warning not to use the key before it has been checked with first-hand knowledge. But if the security hash is replaced as well - a daring attack, given the fact that he will have to trick you into signing this thing - everything looks fine, except that the length of the modulus is considerably longer than the original and this will be calculated and displayed to the user as well before the key is used. So if you know that your partner uses a 2003 bit signing key and the length of the public key shows more when it is used to verify, you know that something is rotten with this key.
It is of utmost importance to protect your secret keys and to be aware of its central role in the whole system but during your everyday use of different people's public keys your own secret key will never be used. So security-relevant actions do reduce to signing own documents (as the "crowning finish" of your creative work) and to decrypting messages meant only for your eyes and of course accepting new public keys.
Please pay attention to the fact, that whenever such a security-relevant action is carried out, you always have to enter your passphrase!
If somebody got to know your passphrase he or she just would need to get your encrypted signing key and your entropy file to be able to sign in your name or read your private data. Practically you could try to prevent someone copying your secret data files by using file-permissions provided by the operating system. Although with a suitable networking environment, one could restrict the access to sensible data, it cannot be guaranteed that the encrypted secret key will not fall into the hands of other people, i.e as a result of burglary. Of course during normal operation there is a fundamental difference between a responsibly administered multiuser system and an utterly unprotected desktop PC. But because of the fact, that in the worst case the theft of your secret signing data is to be expected, the quality of your passphrase is essential which protects your secret key from being (mis)used by others.
Some text editors create temporary files on their own, which hold a copy of the text during the process of editing. After finishing those "tempfiles" will usually only be deleted not "wiped", leaving sensitive data undestroyed on the medium.
Moreover your passphrase is at risk to appear on a storage medium, if your operating system is extending its memory by "swapping" . While UNIX does separate this swap-space from the rest of the filesystem, so that someone has to gain root access to analyse it, MS-WINDOWS directly stores parts of the memory to disk. Therefore with some luck everyone who has access to your harddisk could find your passphrase or parts of confidential information in the swapfile in plain text. Especially in a networking environment a swapfile is a serious danger to your passphrase. It definitely would be the least to avoid swapping while using WINDOWS or at least to wipe the swapfile before turning off your computer, because deleting does not destroy the contents of your swapfile.
UNIX restricts the ability to examine any blocks of data to the person of the system administrator. In principle he or she also has access to the memory (/dev/kmem) and can search for your passphrase in the memory during the execution of your PCP-process. Do not hesitate to speak to your system administrator about this problem. A responsible system administrator will put himself to a lot of trouble to prevent others from using the superuser's password and he or she will be able to assure you that your passphrase will not be at risk by root access. At this point the security of your secret key depends on the ethos and professionality of another person who is responsible for the UNIX system. At least since Bruce Schneier's epiphany we begin to understand that security is a process and not a product. Therefore do not bother to ask your system administator about monitoring, intrusion detection and risk management to make sure that the integrity of your working environement is carefully observed.
Peter Gutmann has examined the reliability of safe destruction of data stored on magnetic disks and solid-state memory and found surprisingly that data seem to be more resistant to destruction on these media than expected. In case your computer falls into the hands of an attacker procedures including "Magnetic Force Microscopy" can be applied to retain stored data which had even been overwritten several times. There are different reasons for the persistency of formerly stored data on magnetic media ranging from the inability of writing data to exactly the same track so that old data remains at the edges to a lack of overwrite performance which leaves progressivly diminishing images of older written data in present patterns on the disk. Based on an analysis of the way raw data is written to disks Gutmann proposes a method of truly erasing data which uses a sequence of 35 consecutive write circles to get rid of old data reliably.
He also found out that data present in RAM is recoverable after power is switched off depending on the time the data is held in memory unchanged. Interestingly not only static RAM could "remember" last stored states over days but also dynamic RAM can hold such information because the charge of cells is retained while the electric field stresses the thin oxide that makes up storage capacitor dielectric and thus changes its electric properties remaining in the cells. This effect is not exploitable unless the data remains constant for several minutes but after that the information remains for days depending on temperature once it has reached a critical threshold. Unfortunately overwriting RAM does not have the same effect as with magnetic media, rather the data used for erasure should be applied unchanged for as long as possible and should not be changed as often as possible to be effective. As a precaution Gutmann proposes that security relevant data like passphrases and secret keys should be held in specially designed RAM which has to flip constantly to prevent "burning in" of secret information at all.
But without any big effort "key-logging utilities" or "key-press snooping tools" can be installed in operating systems without access control (DOS/WINDOWS) which will detect every key pressed, storing them somewhere or sending them over the network. Those keyloggers replace some parts of the operating system, which control the keyboard input by tampered code thus leading to an alteration of the operating system. With computers runnig DOS or WINDOWS everybody will be able to install keyloggers without any difficulties, if he or she has access to your harddisk. But such an alteration could be done by a boot virus as well or based on a network attack with Back Orifice giving an adversary control over your machine using the network connection.
To get an efficient protection against these alterations of the operating system it is necessary to check the integrity of the corresponding routines of the operating system regularly before you use PCP. This could be done by generating fingerprints of relevant parts of the operating system comparing them with the actual software-fingerprints (may be automatically) to ensure a change of the operating system not to remain undetected. Tripwire can be of help in those situations.
With UNIX "key-logging" requires access to the corresponding device files (terminals) which normally is not permitted to other people. An alteration of the input routines of the operating system is possible only with root access. But the X Window Sytem (X11R6) which is the graphical user interface for UNIX features a conceptual design error, which could be exploited for "key-press-snooping" without root access. You will find comprehensive information about this in Rune Braathen's Crash Course in X Windows Security. You should always avoid entering your passphrase during a X Window session and you should switch to the character based console if possible. But if you cannot do so, you can restrict access to your X server to your localhost using the UNIX command "xhost -", so you can only be threatened by users who have already logged into your local computer.
Following the support of JAVA, JAVA-Script and ACTIVE-X by the new generation of
web browsers
a severe lack of security has become known, because unauthorized access to
private local data over the network has been permitted due to the implementation of
these new features. The concerted action of the software industry to deliver
"brand-new features" in a premature and imperfect condition will
certainly damage the security of the browsers in their further development -
particularly, if you take into account the rivalry between Microsoft and Netscape.
The demand for comfort and user-friendliness lots of users call for, causes
suspicion of an increase of security-relevant attacks using applets
and "hostile webpages" in future. Even today some
websites
intend to cause destructive effects based on JAVA-applets. So in future
those pages could also be dedicated to the purpose of hidden information gathering.
You can face these dangers best, if you do not use your system for PCP and as a
platform for testing web browsers simultaneously.
Usually a firewall is useful to protect against intruders, if configured correctly,
which has become much of an art today. Some products that offer personal firewalls
have been found to leak information out into the internet, a problem called
"internal extrusion". To check the abilty of personal firewalls
against such leakages Steve Gibson developed a free utility called
LeakTest which tries
to pass out through the firewall. Unfortunately most of the current products
today fail the leak test.
But even if firewalls work reliably they do not block faked DNS-requests which can
be used to send confidential information to the nameserver of a hostile domain,
as was reported by Richard Cox in a prominent
security mailinglist.
Because of the fact that you should never forget your passphrase, otherwise your secret key is useless and the protected data is lost, solutions have to be found that will work safely. At this point two different strategies can be offered to guarantee the memorability ot the passphrase, both leading to a loss of security.
Both strategies do raise the question what length will be required to make sure that a passphrase is still secure. My argument concerning the efficiency of a future computersystem has shown, that it will be sufficient to ensure, that the work for "systematic guessing" the passphrase will reach about 10^{22} operations. If you choose your passphrase in a way that in principle there are 10^{22} possible passphrases, you certainly are on the safe side. To cause this amount of work 72 bits or 9 Bytes of truly random data will be sufficient, because this will leave 2^{72} or 4,7 * 10^{21} variations.
The number of characters could be reduced further by using more than those 16
hexadecimal characters (0-9A-F). Using not only all lower and upper case letters
but also every special character you can enter through the keyboard resulting in
some 70 different symbols, 12 of those randomly chosen characters
will be sufficient to cause the required work
(70^{12} = 1,4 * 10^{22}).
Then may be your passphrase would be e8ZWr!ySFt?m.
By using the <SHIFT>, <CTRL> and <ALT>-keys and the combination
<CTRL><ALT> as well you can easily increase the number of different
characters to 160, which results in an unbeatable secure 10-character
passphrase like
G<ALT>)a<CTRL>#Hs<CTRL><ALT>e.3<ALT>wY
.
So the gain you have achieved using the meta keys is rather modest and you simply cannot construct a secure passphrase with less than 10 characters!
A dictionary contains maybe 60000 different items, so an arbitrary choice of five different items would again produce plenty of combinations (60000^{5} = 7,7 * 10^{23}). But your secure passphrase salmagundi_latrine_warning_gowk_marathon now has a lenght of 36 characters, a price you have to pay for the fact, that you only need to remember the order of the five different words, which all have meaning now.
One single word chosen from a dictionary is clearly useless as a passphrase. R. T. Williams describes the attempt of testing PGP-passphrases using lowtech computer technology (PC-486 and PGP-2.6.2). Performing at least two keytests per second one easily covers 172800 words a day, so every dictionary will be searched exhaustively within a short period of time.
If you still want to use a single word as a passphrase, it is advisable to choose a random-looking word, which has no meaning for anyone except you. Twcae,pwcab. would be no poor candidate for my passphrase, if nobody could get the idea that it reflects the quotation of I. Kant "Thoughts without content are empty, perceptions without concepts are blind". But those who know that I am busy with philosophy professionally might get this idea, so this passphrase is completely useless for me. But if you can invent such an "abbreviation", nobody who knows you would possibly assume, this would be a good passphrase given an appropriate length and spending plenty of <CTRL><ALT>.
RESULT The PROTECTION of your PASSPHRASE is the crucial factor which determines the security of PCP. The trouble you are taking to choose a good passphrase and to make your working environment secure but also your awareness of the various possibilities of practical attacks on the use of PCP builds the basis of your security and the basis of the trust of others in the value of cryptographic applications like PCP as well.
A key used to create signatures usually has a very long lifetime, because trusted key distribution is an awful expensive procedure and in case of a compromise of your secret key the secure revocation of your signing key takes equal pain. Moreover you would probably use your signing key more frequently so that spending an extra effort to protect your secret key would be only prudent. On the other hand keys which protect confidential communication should have a short lifetime to minimize the damage caused by a key compromise. The change of even revocation of a shortlife encryption key would be a pretty normal thing whereas you would probably go through much trouble to avoid the revocation of your longlife signing key. Governments have often toyed with the idea of demanding the disclosure of private encryption keys used to protect confidential communication but have confined themselves to include private signing keys in this demand up to now. Using two different keys for signing and encryption is terribly advisable in the light of those intentions of governments to establish the seizure of private keys as a natural means of law enforcement which are currently developing in different countries all around the globe and especially in the European Union.
You can easily specify the purpose of your longlife key by adding something like "high security signing key, not for encryption" to your User-ID and use this key to sign any short lifetime "encryption key [expires: date]" which can be published on a website and can be replaced even monthly without revocation so that everyone who has got your signing key reliably can obtain your current encryption key and check its validity before starting a confidential conversation. There is no need for a wide distribution of your encryption keys using keyservers because only your signing key has to be certified properly to ensure the authenticity of all your other keys which are to be replaced from time to time. To complete this picture there is also a need for a safe data haven which is immune against demands for cryptographic keys.
Well, to be honest, I do not really know. I have not (yet) included a key generator in PCP because there are several different software packages which generate RSA keys and it is a matter of trust which one you will employ to create your keys. All I can tell is, that as far as I know, it is important to get as much randomness as possible into the process of key generation and with respect to low public encryption exponent attacks it may be advisable not to select the lowest value for the encryption exponent, which is usually 17 to gain speed. As the openssl software uses both a large file containing random data to seed the random number generator and a longer encryption value I have created my signing key with openssl but, as I said, your milage may vary. Anyway, generating keys is still a little bit like counting on the egg in the hen's arse.
With respect to the hash key the choice is much easier, because you simply have to select two strong prime numbers to build the hash modulus and to select another prime of highest order, i.e. a few bits shorter than the hash modulus, to be used as the generator. Here again you will rely on the software of your choice to produce primes but you can use a tool that finds strong primes which comes with the PCP distribution.