2.2 - Models of Cryptosystems

2.2 - Models of Cryptosystems

Copyright(c), 1990, 1995 Fred Cohen - All Rights Reserved

The standard model of a cryptographic system [Shannon49] consists of two communicating parties 'A' and 'B' (often called Alice and Bob) who each posess a commonly known cryptographic transform and secret keys. The plaintext 'P' to be obscured is encrypted into cyphertext 'C' by applying the encryption transform 'Te' as controlled by the encryption key 'Ke'. To retrieve the plaintext from the ciphertext, the ciphertext is decrypted into plaintext by applying the decryption transform 'Td' as controlled by the decryption key 'Kd'. Figure 2.8 shows a typical secret communication, wherein A performs Te(Ke,P) to produce C which is transmitted to B, who applies Td(Kd,C) to reproduce the original P.

           A                         B
         ------                   ------
P------>| Te  |->C--transmit-->C->| Td |------>P
         ------                   ------
           |                        |
          Ke                       Kd

Figure 2.8 - The Standard Model of Cryptographic Communication
This is written in a more symbolic form as:
A: Te(Ke,P) => C
B: Td(Kd,C) => P

Since C, Te, and Td are common knowledge, the only secrets in this system are P, Ke, and Kd. The obscuring of information is therefore based on the secrecy of Ke and Kd and the properties of Te and Td. In order for such a system to be of practical value, several conditions must be met, some depending on the application. Here are some examples of necessary criteria for many applications.

This model is quite general in that it describes many different cryptographic applications, but it is insufficiently detailed for a precise understanding of how some systems operate in practice.

As an example of its generality, the standard model can be used to describe the operation of a system where is no "key". In such a system, the transforms Te and Td are kept secret. Whatever the secret parts of Te and Td are can be called a key, while the known parts can be called transforms. Such systems should be approached with extreme caution, since historically, they have failed to provide adequate protection. Keeping physical devices secret is just not very easy, and the problem is made much harder if the strength of an entire system lies purely in its implementation. In addition, changing keys becomes a hardware operation that may be quite expensive and time consuming.

Historically, cryptosystems continue to be used thousands of years after having been broken, and lives and fortunes have been lost because of such ignorance. If we ignore this, we are destined to repeat these mistakes and lose more lives and more fortunes.

There are also a set of standard extensions to this model which are used to model effects of attacks. When third parties are involved in attacks on a cryptosystem, they are typically called tappers, and indicated by the letter 'T'. In the literature, we often find the name Tom used for simplified exposition. When Tom is only able to observe signals transmitted over the channel, he is called a passive tapper (PT). When Tom can also introduce and/or eliminate signals, he is called an active tapper (AT). Variations on these situations can be modeled fairly easily by specifying Tom's capabilities in more detail.

By making a variety of assumptions we may show what can and cannot be done to make such systems safe from a variety of attacks. The most commonly analyzed types of attacks are:

The most commonly used type of cryptosystem in use today is the one-key cryptosystem in which Ke and Kd are identical. The "Data Encryption Standard" (DES for short) is the most widely known example of a one-key cryptosystem in widespread use, but there are many other examples. The DES actually uses a different Td than Te such that Td=Te**-1, and T**-1(K,T(K,M))=M. Many problems with the DES have been found, but with proper use it still presents a viable alternative for a large class of cryptosystem needs. The DES is covered more extensively in the cryptographic literature [DESDOC77] .

If Te and Td are also identical, in other words Ke=Kd=K and Te=Td=T, we now have the equations:

A: T(K,P) => C
B: T(K,C) => P

This implies that T(K,T(K,P))=P, which in other words means that reencryption with the same key yields the plaintext. This would lead us to believe that there may be weaknesses introduced by repetitive use of these cryptosystems, and thus we are introduced to a common pitfall of information protection in general, and cryptosystems in particular.

It is often assumed that more of a good thing improves it, when in fact, too much of a good thing may be worse than none of it at all. Consider that with no encryption, we assume that our communication is untrustworthy and act accordingly, while with the use of encryption, we may falsely assume that our communication is trustworthy. This brings up the questions of how we are to assess the quality of a cryptosystem, and how much trust we should associate with it.

One major problem with a one-key system is that the key is a shared secret between A and B. Thus we create a problem in that there must be a way for both parties to attain the secret without anyone else attaining it. This is called the key distribution problem, and is often more of a stumbling block to the proper implementation of cryptography than procurement of sound cryptosystems. Once the keys are distributed, they must be kept secret in order to protect the system from illicit use. This is called the key maintenance problem and it too is nontrivial.

In the case of a one-time-pad, the key distribution problem is particularly severe since the quantity of key that must be transmitted grows linearly with the size of the message. A simple solution to this problem would seem to be the use of a fixed length key for the distribution of the one-time-pad, and the use of the one time pad for encryption of messages. This seems rational because the unicity distance computation shows that any amount of purely random text can be sent with perfect secrecy using any length of key. In fact, this is not strictly true.

Actually, the use of a finite key limits the number of possible plaintexts to H(K). Even though we could be reasonably certain that a random key of 1000 bits would be unguessable in practice (1/2**1000 chance that any given guess is correct), and we could thus be certain that the one-time-pad would not be revealed with this distribution method, this system of key distribution has another flaw. The flaw comes as a result of the eventual use of the one-time-pad for encryption.

If we know that a certain bit is 100% likely to be a 1, we can determine the bit of the one-time-pad used to encrypt it, and thus derive one bit of the 1000 bit distribution key with perfect accuracy. With a lower degree of certainty, we can derive bits of the key from uncertain data. Given enough observations of statistical data, we can eventually derive, to within a high degree of certainty, the entire distribution key, and thus the entire one-time-pad. In the terms of cryptography, this key distribution system yields to a known plaintext attack and a selected plaintext attack.

The point of this example is not to provide cryptanalysis against a particular technique, but rather to drive home the point that cryptosystem design and analysis is not straight forward, and cannot simply by done by the application of a few formulas. This particular case is an example of a "cryptographic protocol" which, although it may appear at first glance to be secure, is in fact fundamentally flawed. Any solution to the key distribution problem is an example of a cryptographic protocol.

Protocols are sets of rules that specify how parties interact. Cryptographic protocols are protocols that maintain cryptosystem requirements in the exchange of information, and typically require that none of the information content of the plaintext be revealed through analysis of the communications behavior implemented by the protocol. Deadlock, livelock, and/or incorrect functional behavior are often experienced when protocol specification and/or implementation fail to handle variable delays, message losses, spurious message insertion, and other phenomena that occur in the environment [Bochmann77] . Research has been carried out for more than a decade on formal protocol design [Sunshine79] [Merlin79] [Sabnani85] , but cryptographic protocols have only recently come under intensive study. [Chaum83]

In a one-key cryptosystem, anything that one party can do, the other party can do, since they both have identical information. This means that such a system is not capable of allowing irrefutable and unforgeable digital signatures, but is applicable to secrecy and authentication of messages between two mutually trusting parties.

Another type of cryptosystem is the two-key cryptosystem in which Ke and Kd are not identical, but invert each other, so that Td(Kd,Te(Ke,P))=P. If the derivation of Ke from Kd, P, and C or the derivation of Kd from Ke, P, and C is very straight forward, there is little practical difference between a one-key system and a two-key system. If the derivation of one key from the other is very difficult, we may use the two-key system as a public-key system [Diffie76] . If both derivations are hard, we have a double-public-key system. If the system is to be of value, it must be good enough to assure that information is sufficiently obscured. In a public-key system, there are two senses in which a system may be good enough:

In this context, hard is taken to mean computationally expensive enough for the application at hand.

The basic principle of a public-key system is that one key is distributed to the public, and the other is kept secret. If the derivation of Ke from Kd is hard, we may distribute Kd to the public and sign messages using Ke. The signature is legible because the general public can read P by applying Kd to C to verify the contents. If the system also makes P,Kd=>C hard, the signature is hard to forge. If the derivation of Kd from Ke is hard, we may distribute Ke to the public, and allow secrets to be sent to us by anyone using Ke. The messages are legible to us because we can read P by applying Kd to C. If the system also makes C,Ke=>P hard, the messages are secret. A double-public-key cryptosystem can be used for both authentication and encryption. In the case of digital signatures, by having public authorities certify and publish public keys, we can irrefutably demonstrate the source of C (to within the accuracy of the public authority).

By publishing a public encryption key Ke, we also provide a secure means of key distribution for other cryptosystems. A session-key "Ks" can by secretly transmited by forming E(Ke,Ks) and sending the result. Since only the creator of Ke has Kd and determining Kd from Ke is hard, Ks is kept secret except to the creator of Ke. D(Kd,E(Ke,Ks)) can only be used by the creator of Ke to yield Ks, so Ks is kept secret even if E(Ke,K) is transmitted in plaintext.

In some cases, determining P from C is unimportant, and only verifying C's legitimacy is necessary. In this case, a class of systems called one-way ciphers (also called trap doors) may be used. Any public-key system can be used for this purpose by publishing Ke and destroying Kd at its creation, but the question may arise as to how we can be sure Kd was destroyed without copying, and thus assure that the system is not invertible. Trap doors may be made hard to forge by making them hard to invert many to one transformations on P. In this last case, by hard to invert, we mean that it is hard to find a P such that Te(Ke,P)=C for any given C. By many to one, we mean that for a given C, there may be many Ps that generate C. Since many possible messages can cause C, it is impossible to uniquely determine P from C, and since it is hard to find a P that transforms to C, it is hard to forge a P that would cause any particular C to result.

Many operating systems use one-way ciphers for storage of passwords, because it makes it easy to verify that a password is correct without having to keep a plaintext copy of passwords on line. Thus password protection in the system doesn't depend on the secrecy of the password file.

A simple symmetric reversible one-key cryptographic scheme is attained by taking the exclusive or (XOR) M XOR K to get C. M is then retrieved as C XOR K. For cases where the message is longer than the key, the key is repeated, and shorter messages are padded with random bits to fill out the key length. By example, choose M=101, and K=011. After encryption, you get 110. When we apply the same key to the coded message, we get 101 again. Without the key, we only know that M is one of 8 different combinations, and we have no way to know which one.

In a one-key system, if each of a set of N parties is to have secure communication with each other party, the number of keys is N*(N-1). For a system with 20 parties, this comes to 380 keys. If each key is 128 bits long, each party must store 16 bytes for each of 19 keys, or 304 bytes of data. The entire set of keys could be stored in just over 6K bytes. As N gets large, this becomes infeasible very quickly. In a public-key system, each party needs only 1 public key and one private key, so the number of keys goes as 2*N. 380 keys in a public-key system thus allows 190 parties to securely communicate. In large modern organizations, even internal communications may involve tens of thousands of communicating parties. At N=10,000, for a one-key system, we have almost 100 million keys, or about 1.6 billion bytes just to store the keys that allow parties to communicate. For a public-key system of the same size, we need only 20,000 keys, or 320 thousand bytes to store all of the keys.

Unfortunately, current secure public-key systems are far slower than private-key systems in operation, and the tradeoff is sufficient to make purely public-key transactions infeasible for most systems. The result is a combination of methods, wherein public keys are used to distribute session-keys. A session-key is typically a pseudo-randomly generated private key used for a short period of time to allow an exchange between two parties. After the exchange is completed, the private key is thrown away. Thus we may eliminate the need to store large numbers of keys, while providing reasonable performance and a high degree of protection.

The no-key system is actually a misnomer. In reality this system is an unshared key system whereby two users can exchange information without an attacker being able to get the information and without the users sharing a common secret key. The basic concept is that of a "two-lock-box" with an independent site for each lock.

        +---------------+
        |   _       _   +
        |--|-|-----|-|--+
        |  |_|     |_|  +
        |   La      Lb  +
        +---------------+

Figure 2.9 - A Two Lock Box

If A wants to send a message to B, A puts the message inside a box and locks it with lock La. A then sends the locked box to B who places lock Lb on the box. B then sends the doubly locked box back to A who removes La. A then sends the box back to B who removes Lb and reads the message. The critical property here is that the locking and unlocking processes must not interfere with each other. With a physical box, this is quite easy, but with mathematical transformations things are not quite so straight forward. A very straight forward analogy can be made between most cryptosystems and boxes with locks and keys.

A major problem with a no-key system is that an active tapper can trivially forge an identity. Suppose Alice is sending a message to Bob with a two-lock-box. If Tom wants to intercept the message, he simply steels the box as it is sent from Alice, puts on his lock, returns it to Alice, and when Alice removes her lock, Tom again intercepts the box, removes his lock, and reads the message. Tom then places his lock on another box, sends it to Bob, and the process repetes. This system has no inherent ability for authentication.

As we delve deeper and deeper into cryptosystems, we find fewer and fewer definitive answers, and more and more complex system designs. One way to resolve the complexity is to use a mathematical notation that shows how things are related to each other. With some simple comments, this can provide a very clear description of what is going on in a particular situation. We will begin to rely on this notation more and more from this point forward, so it will be useful to recap what we have just described verbally in terms of this notation.

The General Model

    A: Te(Ke,P) => C
    B: Td(Kd,C) => P

A One-Key System

    Ke=Kd=K
    A: Te(K,P) => C
    B: Td(K,C) => P

A Two-Key System

    A: Te(Ke,P) => C
    B: Td(Kd,C) => P
    Kd should not lead to Ke
    Ke should not lead to Kd

A Private-Key System

    A: Te(Ke,P) => C
    B: Td(Kd,C) => P
    Ke and Kd must be kept secret

A Public-Key System

    A: Te(Ke,P) => C
    B: Td(Kd,C) => P
    Ke is kept secret for authenticity
    Kd is kept secret for secrecy

A Double-Public-Key System

    A: Te(Ke,P) => C
    B: Td(Kd,C) => P
    Ke is kept secret for authenticity and secrecy

A One-Time-Pad

    Ke=Kd=K
    A: Te(K,P) => C
    B: Td(K,C) => P
    Each bit of K is used only once by A

A No-Key System

    A: Te(Ke,P) => C1
    B: Te(Ke2,C1) => C2
    A: Td(Kd,C2) => C1
    B: Td(Kd2,C1) => P

To quickly review, we list the major types of cryptosystems and their characteristics here.

The major problems we have uncovered in our discussion include:

Notice that the major factor that pervades the practical issues is the issue of how to effectively deal with keys. This is predominantly because everything else is presumed to be known to the attacker. Given a reasonably good cryptosystem designed for the application, the major issue is dealing with the keys.