Network Security
References
Security is a complicated business that wasn't given much thought until
uses of computer networks increased and the potential for abuse became
interesting (i.e. profitable).
There are five aspects of security. They aren't totally independent
of each other, and real applications usually have to worry about several
or all of them. We ignore authorization as an OS issue. Authorization clearly
relies on authenication.
Secrecy or Privacy
Keeping content hidden from the wrong people. Traditional systems:
locks, safes.
Authentication
Being confident about a persons or a documents identity. Traditional
systems: id cards, face/voice recognition, physical distinction of copies
Authorization
What a user/process is allowed to do. Distinct from authentication,
an OS issue if authentication can be done
Nonrepudiation
Applying the digital equivalent of a persons signature so they can't
later deny something. Traditional systems: signature
Integrity control
Being confident that a message wasn't altered. Traditional systems:
registered mail, courier
Where does security belong in the protocol stack? Probably at many levels.
Physical - protect wires, shield them, use fiber
Data link - encrypt/decrypt for each point-to-point hop
Network - firewalls may filter packet traffic
Transport - provide and end-to-end (process-to-process) encryption
Application - authentication and non-repudiation must be solved here
Cryptography
The need to be able to change the encryption quickly rules out using different
sorts of encryption algorithms. Instead, a single, public encryption algorithm
paramaterized by some short key is used. Then switching encryption is just
a matter of changing the key, not re-inventing a new technique.
Here's the model
Plaintext, P -----> | Encryption algorithm | -----> Ciphertext, C ------>
| Decryption algorithm | ------> Plaintext, P
Intruders monitor the channel and see the ciphertext. Active intruders
may introduce ciphertext message into the channel.
The ciphertext C = EK(P) which is to say that the ciphertext
is the output of a function E with two parameters, K and P.
Obviously DK(EK(P)) = P so E and D are
inverses of each other.
The encryption algorithm is public. Keeping it secret is too hard (equivalent
to the need to encrypt P in the first place). Making it public nowadays
also means that people all over the world try to crack it, so if they can't
crack it they world gains confidence that it is a good algorithm.
With a public encryption algorithm the protection comes from the secretness
of the key. An example is combination lock using a 3 digit key. Everyone
knows the algorithm (left, right, left) but you must have the key to open
a lock. The longer the key the larger the amount of work that must be done
to break the encryption with a brute force attack.
Two principles underly all cryptographic systems. These principles have
nothing to do with specific algorithms, but rather about how attacks on
algorithms and attacks on systems by active intruders work.
Need for redundancy
-
All encrypted messages must contain redundancy to prevent the sending of
false messages.
If the ciphertext is exactly as big as the plaintext, and all possible
ciphertext messages decrypt to legal plaintext messages, then it is trivial
for a bad guy to generate false messages. The bad guy doesn't need to know
what message he is sending with is random-bit false messages in order to
cause problems.
For example, suppose that a banking message had three fields for a transfer
operation: from account, to account, and amount. The field sizes are known
to the bad guys (say they are 4, 4, and 6 digits). If every possible 4
digit number is potetially a legitimate account number, then a random set
of 14 digits in a false ciphertext message will be decrypted to a legal
but bogus money transfer operation.
If the coded message contains redundancy, then bogus decrypted messages
can be detected. So to continue the example, each 4 digit account number
could be proceeded by some standard word or number string. Now it is impossible
to generate bogus ciphertext which maps to a legal plaintext account number.
-
Something must prevent the undetected re-transmission of intercepted messages.
The intruder may listen to the channel, record ciphertext messages,
then play them back at some later time. This can cause all sorts of damage
and problems, and isn't detectable since the re-played messages are perfectly
legal.
One approach is to use a timestamp and discard any message older than
some threshold.
Modern cryptography
Long keys are a hassle to manage. If you can't remember them you write
them down, which is bad. Modern encryption algorithms use complicated functions
that can be fed simple keys and can't be decrypted.
There are two major categories of algorithms: secret key and public
key.
Attacking an encryption algorithm is much easier if a sample of plaintext/ciphertext
is available. It should be assumed that this is the case when gauging the
safety of an algorithm.
Secret key
The key is kept secret, a complicated algorithm is used for encryption.
An algorithm which is fast to implement in both encryption and decryption
is important.
The problem with secret key is distributing the keys. This is particularly
hard if you want to communicate securely with large numbers of people,
and if you don't necessarily know who you want to communicate with in advance.
DES (Data Encryption Standard)
A standard in the US since 1977 widely used in the financial industry.
No longer secure in the original form, but has been modified to be secure.
Uses a 56 bit key, transposition, XOR operations, and swapping through
19 stages of bits in 64 bit blocks.
Estimated in 1994 that a machine could be built for $1 million to crack
DES through brute force search of the state space in 4 hours. More creative
idea: put a cheap DES chip capable of 1 million encryptions per second
in every radio/TV in one billion peoples homes (say everyone in China).
Each chip is assigned a tiny piece of the key space. Ciphertext/plaintext
pair messages are broadcast to each radio/TV, all chips starts doing encryptions
for their part of the key space and comparing. Within 60 seconds the key
is found.
Better secret key algorithms (e.g. IDEA) are availalble than DES. The
problem of distributing the key remains even with these cryptographically
secure algorithms.
IDEA
The important thing to remember about secret key approaches is that they
all require the secure exchange of a secret key. If you have a channel
to do this, why not just send the message that way too?
Public key
Diffie and Hellman proposed a scheme in 1976 that solved the problem of
secret key distribution. Their approach is asymmetrical, in that different
keys are used for encrypting and decrypting, and secure in that no secure
distribution of keys is needed. In other words, two people who have never
met, talked, or exchanged information before can now exchange securely
encrypted messages.
The requirements for this to work are the following, where E is the
encryption algorithm plus its key and D is the decryption algorithm plus
its key
-
D( E(P) ) = P
-
It is very hard to deduce D from E.
-
E cannot be broken by a "chosen plaintext" attack.
The first requirement just says that D and E are inverses of each other.
The second requirement says that knowing E isn't enough to tell you
D.
A "chosen plaintext" attack is one where the bad guy can encrypt as
much plaintext as he wants using the encryption algorithm E. This is the
strongest sort of attack since it lets the bad guy experiment with plaintext/ciphertext
pairs.
If these conditions all hold then we can make E and its key public without
loss of security. If we can do that, then we can establish secure communication
between two parties without exchanging secret keys.
-
Person X wants to communicate to person Y. Person X devises a algorithms
and a keys for encrypting and decrypting, call these Ex and Dx. Person
X makes Ex (algorithm + key) public by putting it on the web in a well-known
place. Person X then makes D known, but keeps the decryption key private.
-
Person Y does the same thing, making Ex and D public.
-
Person X takes a message P and encrypts it using with person Y's public
algorithm/key C = Ey(P) and sends C to person Y.
-
Person Y computes Dy(C) = Dy( Ey(P) ) = P to get the plaintext.
This works because only person Y knows the private key, and Dy can't
be derived from the public information Ey.
The decryption algorithms D are published to take advantage of people
trying to crack the scheme.
In real applications, everybody uses the same algorithms for E and D, and
the security is in the keys. The public key is widely available for everyone
who wishes to receive secure messages. The private key is held by each
person and never distributed.
RSA Algorithm
Rivest, Shamir, Adleman at MIT, 1978 found algorithms for E and D that
satisfied the three requirements. This has become the most important commercial
public key technology and is sold by a company called RSA Security.
First you need the following
-
Select two large (greater than 10100) prime numbers, p and q
-
Compute n = p * q, and z = (p-1) * (q-1).
-
Choose a number relatively prime to z and call it d (d and z must have
no common factors)
-
Find e such that (e * d) mod z = 1
You can get p and q from a book of prime numbers. This keep supercomputers
gainfully employed.
When you multiply two large primes you get a number with only two factors.
Because factoring a large number is very expensive, you have effectively
hidden p and q in the product n which you can safely make public.
To send a message
-
Represent the message as a plaintext bit string (just put the bits for
the ASCII codes together into a string).
-
Divide the plaintext bit string into blocks of k bits so that each block
when viewed as a k bit binary number P <= 2k has value less
than n. You want the biggest k you can get.
-
To encrypt the plaintext form the ciphertext C by C = Pe (mod
n)
-
To decrypt the ciphertext, calculate P = Cd (mod n)
The encryption and decryption algorithms are inverses of each other. To
do the encryption you need e and n, to do the decryption you need d and
n. The first is the public key, the latter is the private key.
To break the encryption you must factor n to p and q, hence z, and then
use Euclid's algorithm to get d. RSA calculate that it would take 4 billion
years of compute time using the best algorithm and a 1 usec calculation
time to factor a 200 digit number.
RSA example
The only problem with RSA is that it is slow to encrypt large messages
(100 to 1000 times slower than a secret key approach). In practice what
is done is that a secret key algorithm is used to encrypt the data, and
the secret key is sent to the receiver encrypted with the RSA algorithm.
Security Scenarios
Consider these scenarios when A wants to send a private message to B. Each
scenario achieves a different aspect, or a different combination of aspects,
of security.
Scenario 1 - privacy
A encrypts message with B's public key
B decrypts with private and public key combination
Result
Anyone can know your public key - how do you know who the message was
from? (no authentication)
Scenario 2 - authentication, no privacy
A encrypts message with A's private key (signing the message)
A sends message to B
B decrypts with A's public key
Result
Anyone can decrypt with A's public key, which is widely available (no
secrecy)
Scenario 3 - privacy, authentication
A encrypts message and a large random session number RA
with B's public key
B decrypts using private key
B generates a large random session number RB and sends
it along with RA back to A encrypted with A's public key
A decrypts with A's private key
A encrypts RB with B's public key and sends to B.
B decrypts with B's private key, recognizing the RB it previously
sent
Result
B doesn't know immediately that A actually sent the message, but forgery
is detected later in message 3
A knows that message 2 isn't a recorded replay because it contains
RA which was just generated by A
B knows that message 1 wasn't a forgery when it receives message 3
containing RB which only A could have decrypted and returned
to it.
A and B now have an authenticated, secure means of communication. In
practice a secret key for the session is also exchanged for subsequent
secure data transfer.
Scenario 4 - authentication, privacy, non-repudiation
A encrypts message with A's private key (signing the message)
A encrypts encrypted message with B's public key
A sends message to B
B decrypts with private key
B decrypts with A's public key
Result
Authentication, privacy, nonrepudiation assured - A and B only know
each others public keys
The public key scheme used must have the following property for this
to work:
in addition to the standard property of
RSA algorithm has this property and is widely used for digital signatures.
Use of random session numbers, as in scenario 3, guards against replay
attacks.
Scenario 5 - authentication, integrity, no privacy
Suppose you just want to sign your email in such a way that the receiver
knows for sure it is from you, and couldn't have been generated by someone
else. You don't care about the contents of the mail being private. Or imagine
that you want to put a binary program out on an ftp site and give people
a way to verify that what they have downloaded is exactly what you put
out there. Again, you don't care about encrypting the plaintext (email
or binary) itself, just signing it and guaranteeing integrity.
A message digest is a special hash function that takes an arbitrary
amount of plaintext and creates a fixed length bit string. The hash function
has the following properties:
-
Given P, it is easy to computer MD(P)
-
Given MD(P) it is impossible to find P
-
No two messages generate the same message digest (to a very small probability)
Here's how a message digest can be used with a public key scheme to digitally
sign something without the overhead of encrypting the entire plaintext
message:
A computes MD(P), then sends signs it with A's private key
A sends P and DA( MD(P) ) to B
B uses A's public key to decrypt MD(P) by doing EA( DA(
MD(P) ) )
B computes MD(P) for himself, from P, and compares. If there is a difference,
then P was tampered with in transit.
A common message digest algorithm is called MD5 and was invented by Rivest.
Digital Signatures: Authentication & Nonrepudiation
It is important in many scenarios that A not be able to repudiate the sending
of a message. This is what signatures do in the real world. Public key
cryptography provides a means of doing digital signatures. Here are the
desireable properties of a digital signature (Bruce Schneier, Applied
Cryptography John Wiley & Sons, 1996):
-
Verifiable: Anybody should be able to check a signature to see whether
it is valid.
-
Unforgeable: It should be impossible for anybody but you to attach your
signature to a document.
-
Non-reusable: It should be impossible to "lift" a signature off one document
and attach it to another.
-
Unalterable: It should be impossible for anybody to change the document
after it has been signed.
Non-deniable: It should be impossible for the signer to disavow the
signature once it is created.
Implementing public key
Major assumption: A and B know each others public keys. Attack is possible
by spoofing the key server.
Problems arise if A's private key is stolen or released to the public
(A can now repudiate), or if A changes its private key at some point. Some
kind of key repository helps solve these problems.
Distributing public keys: certificate authorities
Putting public keys in a public place somewhere sounds like a good idea,
but how do you prevent someone from putting a public key for X and labelling
it as A's public key? Then if X can intercept messages encrypted with this
key it can decode them. Also, what if someone can spoof the key repository?
Sending a public key as part of a request has similar problems. It would
be easy for X to send a message claiming to be A with a public key attached.
The receiver B can't tell just from the key that it is indeed A's key.
One way of insuring to whom a public key belongs is with a certificate.
A certificate has the following information
-
Person's name, address, organization, etc
-
Person's public key
-
Validity dates
-
Certificate number
The certificate is encrypted with the private key of the certificate authority
to guarantee that it can't be forged or corrupted. Now when you send your
public key you also send your certificate. The receiver uses the public
key of the certificate authority to decrypt the certificate, which then
lets them confirm the owner of the public key. With a certificate it becomes
impossible to claim a public key belongs to someone that it doesn't belong
to.
The certificate authority (CA) must be trusted to have determined the
identity of the party to whom the certificate was issued. The only public
key that must be widely accessible to everybody is that of the CA.
PGP - Pretty Good Privacy
Phil Zimmerman, first release in 1991. Too strong for US export (>40 bit
key), threatened with prosecution, stiff fine. Resolved in early
1996. Based on public key technology.
Pure public key is impractical because:
-
Algorithms are slow, so encrypting a large file is time consuming
-
The keys are long, so the private key can't be memorized
PGP and the speed problem
Encrypting the plaintext with a single-key algorithm is much
faster
A session key is generated (randomly) (IDEA 128 bit encryption)
The message is encrypted with this session key
The session key is encrypted with A's private key, then B's public
key
The cyphertext and the encrypted session key are sent to B
B decodes session key with B's private and A's public keys
B decodes message
PGP and the private key problem
Keys must be long to be safe (large prime products are hard to
factor, small ones are easy)
128 byte key = 1024 bits ~= 150 characters - too much to memorize
Where to store private key? Public key is not a problem
- directory on Net is most convenient
Store with computer - human error attacks
PGP encrypts the private key with a pass phrase
PGP weaknesses
Losing your private key/pass phrase
Tampering with public keys
Most of these are in the PGP shell you use
files that don't really get deleted from the disk
temporary copies of plaintext files (swap space, /tmp)
viruses within computer or PGP shell - could capture pass phrase
trojan horse versions of PGP shell
The limits of cryptography
Three sources of weakness to exploit
-
algorithms
-
implementation
-
human factors
RSA-129 Challenge
Issued by Martin Gardner in 1977 Scientific American article describing
RSA 129 digit key - expected to take 1015 years to crack.
Group of people used the Net to accumulate 5,000 mips-years in 1993/94
to break it
New means of factoring large primes - vector units. Accumulate enough
vector units, determine the factor. Makes estimations of difficulty of
factoring wrong since better algorithms now available.
Netscape SSL - direct attack
Export quality (40 bit) key cracked by several people using idle workstations,
spare supercomputer time
Took a long weekend of searching the entire key space (64 mips-years)
128 bit US version keys are much stronger - "billion machines, each
a million times more powerful, 6 billion years required"
Netscape SSL - indirect attack
Property of PRNG - they repeat if started from the same seed.
Two UCB students - looking at algorithm used to select session key to
encrypt messages noticed that the seed to the PRNG was determined from
time-of-day, pid, and parent pid
Process ids are easy to get, or even to break (15 bit number)
Huge mistake in implementation cause Netscape to do a new version, start
a "bugs bounty" prorgram
Secret key schemes
RSA Security issued a "secret key challenge" in 1997 to break various stengths
and algorithms of secret key encryption.
40-bit (export version)
From PC Week article:
"The 40-bit key that can readily be exported permits more than
a trillion possible values; the 56-bit DES key permits more than 72,000
trillion possible values. But even at this higher level of protection,
a professional DES-cracking machine (which may already exist, based on
a 1993 design) could crack a new key (on average) every 3.5 hours at a
present-day construction cost of about $600,000."
56-bit version
RSA Security sponsored a $10,000 challenge to anyone breaking
the 56bit DES code. It took 5 months of people using idle cycles of machines
connected to the net to do it.
brute force attack:
72 quadrillion keys possible
peak rate of 7 billion keys tested per second
25% of key space searched before key found
10s of thousands of cooperating hosts
all operating systems, huge range of compute power