parazyd.org/1/git/tomb/file/doc/LinuxHDEncSettings.txt.gph

  URI:

tLinuxHDEncSettings.txt - tomb - the crypto undertaker
HTML git clone git://parazyd.org/tomb.git
DIR Log
DIR Files
DIR Refs
DIR README
DIR LICENSE
---
tLinuxHDEncSettings.txt (20410B)
---
1
2 Linux hard disk encryption settings
3
4 This page intends to educate the reader about the existing weaknesses
5 of the public-IV on-disk format commonly used with cryptoloop and
6 dm-crypt (used in IV-plain mode). This page aims to facilitate risk
7 calculation when utilising Linux hard disk encryption. The attacks
8 presented on this page may pose a thread to you, but at the same time
9 may be totally irrelevant for others. At the end of this document, the
10 reader should be able to make a good choice according to his security
11 needs.
12
13 A good quote with respect to this topic is ''All security involves
14 trade-offs'' from Beyond Fear (Bruce Schneier). You should keep in mind
15 that perfect security is unachievable and by all means shouldn't be
16 your goal. For instance, when using pass phrase based cryptography, you
17 have to trust in that the underlying system is secure, the computer
18 system has not been tampered with, and nobody is watching you. The most
19 obvious weakness is the last one, but even if you make sure nobody nor
20 any camera is around, how about the keyboard you're typing on? Has it
21 been manipulated while you have been getting your lunch?
22
23 So security comes for a price, and the price when designing
24 cryptography security algorithms is performance. You will be introduced
25 to the fastest of all setups available, the "public-IV", which
26 sacrifices security properties for speed. After that we will talk about
27 ESSIV, the newest of IV modes implemented. It comes for a small price,
28 but it can deal with watermarking for a relatively small price. Then
29 you'll be introduced to the draft specifications of the Security in
30 Storage Working Group ([18]SISWG). Currently SISWG is considering EME
31 and LRW for standardisation. EME along with it's cousin CMC seems to
32 provide the best security level, but imposes additional encryption
33 steps. Plumb-IV is discussed only for reference, because it has the
34 same performance penalty as CMC, but in constrast suffers from
35 weaknesses of CBC encryption.
36
37 As convention, this document will use the term "blocks", when it
38 referes to a single block of plain or cipher text (usually 16 byte),
39 and will use the term "sectors", when it refers to a 512-byte wide hard
40 disk block.
41
42 CBC Mode: The basic level
43
44 Most hard disk encryption systems utilise CBC to encrypt bulk data.
45 Good descriptions on CBC and other common cipher modes are available at
46 * [19]Wikipedia
47 * [20]Connected: An Internet Encyclopedia
48 * [21]NIST: Recommendation for Block Cipher Modes of Operation (CBC
49 is at PDF Page 17)
50
51 Please make sure you're familiar with CBC before proceeding.
52
53 Since CBC encryption is a recursive algorithm, the encryption of the
54 n-th block requires the encryption of all preceding blocks, 0 till n-1.
55 Thus, if we would run the whole hard disk encryption in CBC mode, one
56 would have to re-encrypt the whole hard disk, if the first computation
57 step changed, this is, when the first plain text block changed. Of
58 course, this is an undesired property, therefore the CBC chaining is
59 cut every sector and restarted with a new initialisation vector (IV),
60 so we can encrypt sectors individually. The choice of the sector as
61 smallest unit matches with the smallest unit of hard disks, where a
62 sector is also atomic in terms of access.
63
64 For reference, I will give a formal definition of CBC encryption and
65 decryption. Note, that decryption is not recursive, in contrast to
66 encryption, since it's a function only of C[n-1] and C[n].
67 Encryption:
68 C[-1] = IV
69 C[n] = E(P[n] ⊕ C[n-1])
70 Decryption:
71 C[-1] = IV
72 P[n] = C[n-1] ⊕ D(C[n])
73 The next sections will deal with how this IV is chosen.
74
75 The IV Modes
76
77 The "public-IV"
78
79 The IV for sector n is simply the 32-bit version of the number n
80 encoded in little-endian padded with zeros to the block-size of the
81 cipher used, if necessary. This is the most simple IV mode, but at the
82 same the most vulnerable.
83
84 ESSIV
85
86 E(Sector|Salt) IV, short ESSIV, derives the IV from key material via
87 encryption of the sector number with a hashed version of the key
88 material, the salt. ESSIV does not specify a particular hash algorithm,
89 but the digest size of the hash must be an accepted key size for the
90 block cipher in use. As the IV depends on a none public piece of
91 information, the key, the sequence of IV is not known, and the attacks
92 based on this can't be launched.
93
94 plumb IV
95
96 The IV is computed by hashing (or MAC-ing) the plain text from the
97 second block till the last. Additionally, the sector number and the key
98 are used as input as well. If a byte changes in the plain text of the
99 blocks 2 till n, the first block is influenced by the change of the IV.
100 As the first encryption effects all subsequent encryption steps due to
101 the nature of CBC, the whole sector is changed.
102
103 Decryption is possible because CBC is not recursive for decryption. The
104 prerequisites for a successful CBC decryption are two subsequent cipher
105 blocks. The former one is decrypted and the first one is XOR-ed into
106 the decryption result yielding the original plain text. Therefore
107 independent of the IV scheme, decryption is possible from the 2nd to
108 the last block. After the recovery of these plain text blocks, the IV
109 can be computed, and finally the first block can be decrypted as well.
110
111 The only weakness of this scheme is it's performance. It has to process
112 data twice: first for obtaining the IV, and then to produce the CBC
113 encryption with this IV. With the same performance penalty CMC is able
114 to achieve better security properties (CMC is discussed later), thus
115 plumb-IV will remain unimplemented.
116
117 The attack arsenal
118
119 Content leaks
120
121 This attack can be mounted against any system operating in CBC Mode. It
122 rests on the property, that in CBC decryption, the preceding cipher
123 block's influence is simple, that is, it's XORed into the plain text.
124 The preceding cipher block, C[n-1], is readily available on disk (for n
125 > 0) or may be deduced from the IV (for n = 0). If an attacker finds
126 two blocks with identical cipher text, he knows that both cipher texts
127 have been formed according to:
128 C[m] = E(P[m] ⊕ C[m-1] )
129 C[n] = E(P[n] ⊕ C[n-1] )
130 Since he found that C[m] = C[n], it holds
131 P[m] ⊕ C[m-1] = P[n] ⊕ C[n-1]
132 which can be rewritten as
133 C[m-1] ⊕ C[n-1] = P[n] ⊕ P[m]
134 The left hand side is known to the attacker by reading the preceding
135 cipher text from disk. If one of the blocks is the first block of a
136 sector, the IV must be examined instead (when it's available as it is
137 in public-IV). The attacker is now able to deduce the difference
138 between the plain texts by examining the difference of C[m-1] and
139 C[n-1]. If one of the plain text blocks happens to be zero, the
140 difference yields the original content of the other related plain text
141 block.
142
143 Another information is available to the attacker. Any succeeding
144 identical pair of cipher text, that follows the initial identical
145 cipher pair, is equal. No information about the content of those pairs
146 can be extracted, since the information is extracted from the
147 respective preceding cipher blocks, but those are all required to be
148 equal.
149
150 Let's have a look at the chance of succeeding with this attack.
151 Assuming the output of a cipher forms an uniform distribution, the
152 chance, p, of finding an identical block is 2^-blocksize. For instance,
153 p = 1/2^128 for a 128-bit cipher. Because the number of possible pairs
154 develops as an arithmetic series in n, the number of sectors, the
155 chance of not finding two identical blocks is given by
156 (1-p)^n(n-1)/2
157 As p is very small, but in contrast the power is very big, we apply the
158 logarithm to get meaningful answers, that is
159 n(n-1)/2 ln (1-p)
160 An example: The number of cipher blocks available on 200GB disk with
161 known C[n-1] is 200GB × 1024^2 KB/GB × 64/1KB ^1. Or in other words, a
162 128-bit block is 16 bytes, so the number of 16-byte blocks in a 200GB
163 hard disk is 13.4 billion. Therefore, n = 1.342e10. For a 128-bit
164 cipher, p = 2^-128. Hence,
165 ln(1-p) = -2.939e-39
166 n(n-1)/2 = 9.007e19
167
168 n(n-1)/2 ln (1-p) = -2.647e-19
169 1-e^-2.776e-13 = 2.647e-19
170 The last term is the chance of finding at least one pair of identical
171 cipher blocks. But how does this number grow in n? Obviously
172 exponentially. Plotting a few a decimal powers shows that the chance
173 for finding at least on identical cipher pair flips to 1 around n =
174 10^20 (n = 10^40 for a 256-bit cipher). This inflexion point is reached
175 for a 146 million TB storage (or a hundered thousand trillion trillions
176 TB storage for a 256-bit cipher).
177
178 ^1The blocks with available preceding cipher blocks is 62/1KB for all
179 non-public IV schemes, i.e. ESSIV/plumb IV
180
181 Data existence leak: The Watermark
182
183 No IV format discussed on this page allows the user to deny the
184 existence of encrypted data. Neither cryptoloop nor dm-crypt is an
185 implementation of a deniable cryptography system. But the problem is
186 more serious with public-IV.
187
188 With public IV and the predicable difference it introduces in the first
189 blocks of a sequence of plain text, data can be watermarked, which
190 means, the watermarked data is detectable even when the key has not
191 been recovered. As shown in the paragraph above, the existence of two
192 blocks with identical cipher text is very unlikely and coincidence can
193 be excluded, which is relevant when somebody tries to demonstrate
194 before the law that certain data is in an encrypted partition.
195
196 As the IV progresses with a foreseeable pattern and is guaranteed to
197 change the least significant bit ever step, we can build identical pair
198 of cipher text by writing three consecutive sectors each with a flipped
199 LSB relative to the previous. (The reason it's three instead of two is,
200 that the second least significant bit might change as well.) This
201 "public-IV"-driven CBC encryption will output exactly the same cipher
202 text for two consecutive sectors. An attacker can search the disk for
203 identical consecutive blocks to find the watermark. This can be done in
204 a single pass, and is much more feasible than finding to identical
205 blocks, that are scattered on the disk, as in the previous attack. A
206 few bits of information can be encoded into the watermarks, which might
207 serve as tag to prove the existence copyright infringing material.
208
209 A complete description of watermarking can be found in [22]Encrypted
210 Watermarks and Linux Laptop Security. The attack can be defeated by
211 using ESSIV.
212
213 Data modification leak
214
215 CBC encryption is recursive, so the n-th block depends on all previous
216 blocks. But the other way round would also be nice. Why? The weakness
217 becomes visible, if storage on a remote computer is used, or more
218 likely, the hard disk exhibits good forensic properties. The point is,
219 the attacker has to have access to preceding (in time) cipher text of a
220 sector, either by recording it from the network, or by using forensic
221 methods.
222
223 An attacker can now guess data modification patterns by examining the
224 historic data. If a sector is overwritten with a partial changed plain
225 text, there is an amount of bytes at the beginning, which are
226 unchanged. This point of change^2 is directly reflected in the cipher
227 text. So an attacker can deduce the point of the change in plain text
228 by finding the point where the cipher text starts to differ.
229
230 This weakness is present in public-IV and ESSIV.
231
232 ^2aligned to the cipher block size boundaries
233
234 Malleable plain text
235
236 The decryption structure of CBC is the source of this weakness.
237 Malleability (with respect to cryptography) is defined as a
238 modification of the cipher text that will resulting in a predictable
239 change in plain text. To put it formally, there is a function f(C),
240 that, if applied to the cipher text, C' = f(C), will result in a known
241 function f', which will predict the resulting plain text, P' = D(C'),
242 correctly assuming P is known, that is P' = f'(P).
243
244 As we can see in it's definition, CBC decryption depends on C[n-1]. An
245 attacker can flip arbitrary bits in the plain text by flipping bit in
246 C[n-1]. More formally^3, if
247 P = P[1] || P[2] || ... || P[i] || ... || P[n]
248 C = E[CBC](P)
249 C = C[1] || C[2] || ... || C[i-1] || ... || C[n]
250 the function
251 f(C[1] || ... || C[n]) = C[1] || ... || C[i-1] XOR M || ... || C[n]
252 follows the function f', which predicts the resulting plain text
253 correctly as,
254 f'(P[1] || ... || P[n]) = P[1] || ... || P[i] XOR M || ... || P[n]
255 The first block of the CBC cipher text stream is not malleable, because
256 it depends on the IV, which is not modifiable for an attacker.
257
258 ^3The IV parameter for E[CBC] has been intentionally omitted.
259
260 Movable
261
262 On the expense of one block decrypting to garbage, an attacker can move
263 around plain text as he likes. CBC decryption depends on two variables,
264 C[n-1] and C[n]. Both can be modified at free will. To make meaningful
265 modifications, an attacker has to replace the pair C[n-1] and C[n] with
266 other cipher text pair from disk. The first block C[n-1] will decrypt
267 to garbage, but the second block C[n] will yield a copy of the plain
268 text of the copied cipher block. This attack is also known as
269 copy&paste attack. This attack is mountable against any CBC setup. The
270 only limitation is, the first block, C[0], can't be replaced with
271 something meaningful, as C[-1] can't be modified, because it's the IV.
272
273 CMC and EME: Tweakable wide block cipher modes
274
275 CMC is a new chaining mode. It stands for ''CBC-Mask-CBC''. It works by
276 processing the data in three steps, first CBC, then masking the cipher
277 text, and then another CBC step, but this time backwards. The last step
278 introduces a dependency from the last block to the first block. The
279 authors of the CMC paper provide a prove for the security of this mode,
280 making a secure 128-bit cipher a secure 4096-bit cipher (sector size).
281 As in normal CBC, this scheme also takes an IV, but the authors call it
282 tweak.
283
284 EME is CMC's cousin. EME has also been authored by Haveli and Rogaway
285 as well been authored for the same purpose. The difference to CMC is,
286 that EME is parallelizable, that is, all operations of the underlying
287 cipher can be evaluated in parallel. To introduce an interdependency
288 among the resulting cipher blocks, the encryption happens in two
289 stages. Between these stages a mask is computed from all intermediate
290 blocks and applied to each intermediate block. This step causes an
291 interdependency among the cipher blocks. After applying the mask,
292 another encryption step diffuses the mask.
293
294 The interdependency among the resulting blocks allow CMC and EME to be
295 nonmovable, nonmalleable, to prevent content leaks and in-sector data
296 modification patterns. The tweaks are encrypted by both cipher modes,
297 thus both are nonwatermarkable.
298
299 For simplicity, the EME description above omitted the pre- and post-
300 whitening steps as well as the multiplications in GF(2^128). An
301 in-depth specification can be found at the [23]Cryptology ePrint
302 Archive. An applicable draft specification for EME-32-AES can be found
303 at [24]SISWG. I have written an EME-32-AES test implementation for
304 Linux 2.6. It's available [25]here. The CMC paper is available from the
305 [26]Cryptology ePrint Archive as well.
306
307 LRW: A tweakable narrow block cipher mode
308
309 EME as well as CMC are comparatively secure cipher modes, but heavy in
310 terms of performance. LRW tries to cope with most of security
311 requirements, and at the same time provide a good performance. LRW is a
312 narrow block cipher mode, that is, it operates only on one block,
313 instead of a whole sector. To make a cipher block tied to a location on
314 disk (to make it unmovable), a logical index is included in the
315 computation. For LRW you have to provide two keys, one for the cipher
316 and one for the cipher mode. The second key is multiplied with a
317 logical index under GF(2^128) and used as pre- and post- whitening for
318 encryption. With those whitening steps the block is effectively tied to
319 a logical index. The logical index is usually the absolute position on
320 disk measured with the block size of the cipher algorithm. The
321 different choice of the measuring unit is the only different between
322 the logical index and the public-IV.
323
324 The LRW draft is available from the [27]SISWG mailing list archive.
325
326 Summarising
327
328 The following table shows a comparison between the security properties
329 of different encryption setups and their computational costs. The
330 number of cipher calls, XOR operations and additional operations are
331 stated in terms of encryption blocks, n.
332 IV mode cipher mode content leaks watermarkable malleable movable
333 modification detection^5 cipher calls XOR ops additional op.
334 public-IV CBC Yes Yes Yes Yes Yes n n None
335 ESSIV CBC Yes No Yes Yes Yes n+1 n None
336 Plumb-IV1^4 CBC Yes No Yes Yes No 2n-1 2n None
337 public-IV CMC No No No No No 2n+1 2n+1 1 LW GF ⊗
338 public-IV EME No No No No No 2n+1 5n 3n-1 LW GF ⊗
339 public-IV LRW No No No No Yes n 2n n HW GF ⊗
340
341 Legend:
342 * LW GF ⊗: light-weight Galois field multiplication, that is, a
343 multiplication with a constant x^2^i, which can be computed in
344 θ(1).
345 * HW GF ⊗: heavy-weight Galois field multiplication, that is, a
346 multiplication with an arbitrary constant, which can be computed in
347 θ(bits).
348
349 ^4plumb-IV1 uses CBC-MAC instead of hashing, so we can make a good
350 comparison with other ciphers in terms of cipher/XOR calls.
351 ^5detectable partial in-sector modification
352 __________________________________________________________________
353
354 Clemens Fruhwirth, , also author of LUKS and ESSIV, porter of
355 cryptoloop, aes-i586 for 2.6., twofish-i586, and implementor of
356 EME-32-AES. This text is an excerpt of my diploma thesis.
357
358 This page has been reviewed by
359
360 Dr. Ernst Molitor
361 Arno Wagner
362 James Hughes , "Security in Storage Working Group" chair
363
364 Additional thanks to Pascal Brisset, for pointing out an error in the
365 Bernoulli estimation in an earlier version of this document, further
366 Adam J. Richter for pointing out an error in the KB/GB ratio.
367
368 Content and design, Copyright © 2004-2008 Clemens Fruhwirth, unless
369 stated otherwise
370 Original design by [28]haran | Additional art by [29]LinuxArt | | Blog
371 by [30]NanoBlogger
372
373 References
374
375 1. http://clemens.endorphin.org/
376 2. http://clemens.endorphin.org/credits
377 3. http://clemens.endorphin.org/aboutme
378 4. http://clemens.endorphin.org/cryptography
379 5. http://blog.clemens.endorphin.org/
380 6. http://clemens.endorphin.org/patches
381 7. http://clemens.endorphin.org/archive
382 8. http://clemens.endorphin.org/Cryptoloop_Migration_Guide
383 9. http://clemens.endorphin.org/LUKS
384 10. http://clemens.endorphin.org/AFsplitter
385 11. http://clemens.endorphin.org/lo-tracker
386 12. http://blog.clemens.endorphin.org/2008/12/luks-on-disk-format-revision-111.html
387 13. http://blog.clemens.endorphin.org/2008/11/xmonad-gridselect.html
388 14. http://blog.clemens.endorphin.org/2008/11/workaround-for-bittorrent-traffic.html
389 15. http://blog.clemens.endorphin.org/2008/09/i-love-lolcat-meme.html
390 16. http://blog.clemens.endorphin.org/2008/09/counter-steganography-research.html
391 17. http://clemens.endorphin.org/cryptography
392 18. http://www.siswg.org/
393 19. http://en.wikipedia.org/wiki/Block_cipher_modes_of_operation
394 20. http://www.freesoft.org/CIE/Topics/143.htm
395 21. http://csrc.nist.gov/publications/nistpubs/800-38a/sp800-38a.pdf
396 22. http://www.tcs.hut.fi/~mjos/doc/wisa2004.pdf
397 23. http://eprint.iacr.org/2003/147/
398 24. http://grouper.ieee.org/groups/1619/email/pdf00011.pdf
399 25. http://article.gmane.org/gmane.linux.kernel.device-mapper.dm-crypt/544
400 26. http://eprint.iacr.org/2003/148/
401 27. http://grouper.ieee.org/groups/1619/email/msg00160.html
402 28. http://www.oswd.org/user/profile/id/3013
403 29. http://www.linuxart.com/
404 30. http://nanoblogger.sourceforge.net/