SE 4472b
Midterm Study Guide
The following guide is presented as a convenience. Note: The midterm questions will be derived from topics covered in the lectures and assignments, the emphasis of which varies from year to year.
Security Notions
Concepts
 Rules to live by:
 Kerkhoff’s principle
 The security of an encryption scheme should be based on the secrecy of the key, not the secrecy of the encryption algorithm. More specifically: always assume the encryption scheme is publicly known.
 Don’t roll your own
 Crypto algorithms and software implementations are really really ridiculously easy to get wrong in subtle ways that can easily turn into big vulnerabilities. It only takes one line of code (e.g., Apple’s goto fail, or Heartbleed) to mean the difference between secure and vulnerable.
 Don’t assume something has a certain form unless you check it
 Kerkhoff’s principle
 Bruteforce attack
 Try every key and/or message until one “works”
 Bits of security
 Exponential value describing how many operations (encryption, hashes, etc) are necessary to recover the message and/or key in a particular cryptosystem. E.g. 128 bits of security means the attacker has to do on the order of 2^128 operations to succeed
 If a system can be broken in approximately 2^b operations, we say it has bbits of security.
 Negligible quantity
 A value that is less than one over any polynomial function with degree less than or equal to the security parameter
 Indistinguishability
 The probability that you can tell the difference between two things is less than a negligible quantity
 Pseudorandom functions
 A random looking mapping of inputs to outputs
 A potentially manytoone mapping
 Inverse does not necessarily exist
 Pseudorandom permutations
 Onetoone mapping
 Image and preimage sets are equivalent
 An unique inverse exists for every element
 Oracles
 An abstract black box. Ask a question, get an answer
 Oracles considered:
 Encryption Oracle
 Here’s the encryption of your query
 Decryption Oracle
 Here’s the decryption of your query
 Random Oracle
 Here’s a fixedlength random value that I associate with this particular query
 Padding Oracle
 Yes or No: Your query is the encryption of a plaintext with valid padding
 Encryption Oracle
Attack Games
Attack games are a way to qualitatively addressing the following question: how much information can about a plaintext by seeing its associated ciphertext.
Features common to all games
Games involve two players A and B. The game begins with B generating an independent random secret encryption key (n.b., every time the game is played, B picks a fresh key). A chooses two messages m1 and m2 and sends them to B. B flips a coin and chooses one of the messages, mb, and encrypts it. This value c=Enc(mb) is called the challenge. Challenge c is returned to A. The game concludes with A guessing if b=1 or b=2, i.e., if c=Enc(m1) or c=Enc(m2). If A guesses correctly, she is said to win the game. The encryption scheme is said be indistinguishable if A has negligible advantage over a random guess of guessing b.
Game types / Security levels
 Eavesdropping attack (EAV)
 A doesn’t get to make any queries.
 Chosen plaintext attack (CPA)
 B behaves as an encryption oracle. That is, at any point in the game, B will encrypt any plaintext and return the ciphertext to A. A variant is called an adaptive chosen plaintext attack in which B continues to behave as an encryption oracle
 Chosen ciphertext attack (CCA1)
 Prior to the challenge B will decrypt any ciphertext and return the plaintext to A
 Adaptive chosem ciphertext attack (CCA2)
 B with decrypt any ciphertext and return the plaintext to A both before and after the challenge. The only limitation is that A cannot submit the challenge to be decrypted (otherwise the game is trivial).
Implications of Security Levels
 INDCCA2 implies INDCCA1
 INDCCA1 implies INDCPA
 INDCPA implies INDEAV
How to achieve security levels (informal)
 INDCCA2
 It should not be possible to create a valid ciphertext without knowledge of a private/secret key
 Achieved in practice through the use of message authentication codes
 INDCPA
 Encrypting the same message twice should give two totally different looking ciphertexts
 Achieved in practice by using randomized encryption
 INDEAV
 You should have negligible advantage telling the difference between ciphertext, and random noise
Symmetrickey Primitives
Block Ciphers
Used for efficient bulk encryption of data. Encryption takes message (plaintext) and key and produces encryption (ciphertext). Decryption takes a ciphertext and key and produces a plaintext. One secret key used for both encryption and decryption.
 Ideal functionality:
 Pseudorandom permutation
 Imagine shuffling a deck of cards
 For a given key, every message has a unique ciphertext, and vice versa
 Secret key determines the particular permutation
 Fixed length input maps to fixedlength output
 Pseudorandom permutation
 Feistel Network
 A simple, provable way to take a pseudorandom function and turn it into a pseudo random permutation
 Cipher modes of operation examined:
 Electronic Codebook Mode (ECB)
 Not INDEAV secure
 Cipher Block Chaining (CBC)
 Requires an initialization vector (IV)
 Counter Mode (CTR)
 A block cipher that acts like a stream cipher
 Also requires an IV
 Pros: random access, no decryption function needed
 Electronic Codebook Mode (ECB)
 Initialization vectors (IVs)
 Works toward CPA security goal by making encryption of related messages different
 Public value transmitted with ciphertext, necessary for decryption
 Must be independently and randomly generated
 Must be unpredictable to adversary, otherwise encryption oracle attacks are possible
 Block ciphers examined:
 AES
 128bit block, 128, 192, and 256bit key options
 AES
Sample Problems

Fortuna is a pseudorandom number generator based on AES in counter mode. Recall that CTR mode encrypts a successively incrementing counter value and xors it with a message. Suppose you don’t xor the key stream with any message, you just output it. This could form a useful pseudorandom number generator. This is a actually a pretty decent approach, but it’s not perfect. After a certain amount of output you can distinguish between Fortuna’s output and that of a perfectly random sequence. How much output would you need to expect to be able to do this?
 A: recall a block cipher is a pseudorandom permutation, i.e., every plaintext maps to a unique ciphertext. That means if you were using counter mode, you’d never see a ciphertext block repeat until the counter rolled over. AES has a 128bit block, meaning you wouldn’t see a block repeat until 2^128 blocks were generated. A truly random sequence, however, does not have this property. Each block is randomly generated, and thus has a small probability of matching a previously generated block. So how many blocks would it take before you’d expect to see a repeat (i.e., where the probability was >=1/2)?
Hash Functions
Used for producing a “fingerprint” or “digest” of a message. Hashing accepts a message and produces a hash (doesn’t use a key in its basic form). Used for checking file integrity, storing passwords, and for making certain publickey operations more efficient.
 Ideal functionality:
 Random oracle
 Input: arbitrary length message
 Output: fixed length string, l bits long where l is the output length
 Given an input message m, the oracle flips l coins and associates m with the result in a giant lookup table
 Every time you input m again, it gives you back the same lbit result
 Cannot exist in practice (requires infinite memory)
 Differs from a real hash function in that the hashes of two messages m and m’ are chosen by completely independent coin tosses, whereas a real hash function generates the hashes using a deterministic (though highly nonlinear) function.
 Random oracle
 Terms:
 Preimage: input to hash function (message)
 Image: output of hash function (sometimes called message digest, fingerprint or simply “hash”)
 Hash length: the length l (in bits) of the hash
 Collision: two preimages hash to the same value. Since preimage space is unbounded, but there are 2^l images, collisions must exist due to the pigeon hole principle
 Necessary properties:
 Preimage resistance
 Given a hash y, it should be hard to find an x s.t. h(x)=y
 If hash function is indistinguishable from a random oracle, difficulty is 2^(l), i.e., lbits of security
 Second preimage resistance
 Given a preimage x, it should be hard to find a second preimage y such that h(x)=h(y)
 If hash function is indistinguishable from a random oracle, difficulty is 2^(l)
 Collision resistance
 It should be hard to find any pair x,y such that h(x)=h(y)
 If hash function is indistinguishable from a random oracle, difficulty should be 2^(l/2), i.e., (l/2)bits of security, due to birthday paradox
 Preimage resistance
 Hash functions examined:
 MD5
 l = 128
 Is no longer indistinguishable from random oracle
 NOT collision resistant
 Collisions can be generated in much less than 2^(128/2)
 Still technically preimage resistant, but the best practice is to not use it at all anymore
 SHA1
 l = 160
 Collision resistant to 2^80
 No longer meets NIST’s minimum 112bit security level for collision resistance
 Can still be used for preimage resistance, but SHA256 is the current minimum overall best practice
 SHA256
 l = 256
 128 bits of collision resistant
 MD5
Sample Problems
 Suppose there are two files f1 and f2 and suppose SHA1(f1) = SHA1(f2). Are these files identical?
 A: Possibly. If they were the same, they would definitely have the same hash value since hash functions are deterministic. If they are different, they still could possibly have the same hash (called a collision)
 Suppose two files f1 and f2 and suppose SHA1(f1) != SHA1(f2). Are these files identical?
 A: No. As before, SHA1 is deterministic, meaning the same input always gives the same output. So if the outputs are different, its because the inputs are different
Cipher padding
 Plaintext must be a multiple of the block length
 Idea: Add padding so that plaintext is a multiple of block length
 AES is a 16 byte block. For this cipher, plaintext must be a multiple of 16
 Padding must be unambiguous: cannot be confused with plaintext and vice versa
 PKCS7 is a standardized padding scheme: add
n
bytes of0xn
.  When using padding you always need to pad—even when the block length is a multiple of 16. In this case, you add an entire block of padding!
Sample Problems
 Is
0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00
a valid AES plaintext block? A: No. It is 14 bytes, and AES requires a block length that is a multiple of 16.
 What is the PKCS7 padded version of the previous plaintext?
 A: Message is 14 bytes, so add 1614=2 bytes of padding, in this case, two bytes of
0x02
:0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x02,0x02
 A: Message is 14 bytes, so add 1614=2 bytes of padding, in this case, two bytes of
Message Authentication Codes (MACs)
Used for verifying the integrity of data by associating a fixedlength value called a ‘tag’ with a given message. The tag is derived from a message and a secret key.
 Tag creation
 Input: message m, secret key k
 Tag t = MAC_k(m)
 Output t
 Tag verification
 Input: message m’, secret key k, tag t
 Tag t’ = MAC_k(m’)
 Output t’ == t
 Ideal functionality:
 Like a keyed hash function
 Variable length input maps to fixed length output
 MACs examined:
 HMAC, a MAC built from a hash function
 GHASH, the MAC portion of AESGCM
 Essentially a polynomial evaluated over a Galois field
Authenticated Encryption (AE)
A means of securely packaging a cipher with a MAC under one common interface. Simplifies (i.e., protects developers from themselves) by preventing the plaintext from being returned if the MAC tag was invalid. Uses the encryptthenmac strategy.
 Authenticated Encryption:
 Accepts: message m, encryption key ke, mac key km
 c = Enc_ke(m)
 t = MAC_km(c)
 Output: (c,t)
 Authenticated Decryption:
 Accepts: ciphertext c, tag t, decryption key ke, mac key km
 Check tag: t’ = MAC_km(c)
 If t’ != t
 Output tag error
 Else:
 Return Dec_ke(c)
Authenticated encryption function accepts a message, an encryption key, and a MAC key and produces a ciphertext and authenticator tag. Authenticated decryption accepts a ciphertext, an authenticator tag, a decryption key, and a MAC key. If the authenticator tag is valid, the function returns the plaintext, otherwise it returns an error condition.
 Modes:
 EncryptthenMAC
 MAC on ciphertext
 Preferred choice
 Used in AES/GCM
 MACthenEncrypt
 Append MAC to plaintext before encryption
 Used in TLS
 Not optimal (subtle attacks possible in some cases)
 EncryptandMAC
 MAC plaintext, append to ciphertext
 Used in SSH
 EncryptthenMAC

Encryption, MAC keys, and IVs must be independently generated
 Provides INDCCA2 security
Using encryption without a MAC is exploitable
 In the CCA2 game, the attacker can use the decryption oracle to indirectly determine m
 In practice a padding oracle attack can be used to bootstrap a decryption oracle to determine m
 A MAC scheme prevents an attacker from being successful, because they cannot produce another valid ciphertext without knowledge of the MAC key.
 AE’s examined:
 AES/GCM
 EncryptthenMAC
 Encryption = AES in CTR mode. MAC = GHASH function, i.e., a MAC function that iteratively multiplies each ciphertext block by the MAC key in GF(2^128)
 AES/GCM
Sample Problems
 Prove AESCTR is not INDCCA2 secure. Argue why AESCTR becomes INDCCA2 secure when the ciphertext is MAC’d.