/ writeup.org
writeup.org
  1  #+TITLE: Description of the NSA manipulated SBT algorithm
  2  #+AUTHOR: stf
  3  #+EMAIL: s@ctrlc.hu
  4  #+OPTIONS:   H:2 num:t \n:nil @:t ::t |:t ^:t -:t f:t *:t <:t
  5  #+OPTIONS:   TeX:t LaTeX:t skip:nil d:nil todo:t pri:nil tags:not-in-toc
  6  #+DESCRIPTION:
  7  #+EXCLUDE_TAGS: noexport
  8  #+KEYWORDS:
  9  #+LANGUAGE: en
 10  #+SELECT_TAGS: export
 11  #+OPTIONS: texht:t
 12  #+LaTex_HEADER: \usepackage{lmodern}
 13  #+LaTex_HEADER: \usepackage[T1]{fontenc}
 14  #+LaTex_HEADER: \usepackage{textcomp}
 15  #+LaTex_HEADER: \usepackage{graphicx}
 16  #+LaTex_HEADER: \usepackage{minted}
 17  #+LaTex_HEADER: \usepackage{listings}
 18  #+LaTex_HEADER: \usepackage{msc}
 19  #+LaTex_HEADER: \usepackage{xcolor}
 20  #+LaTex_HEADER: \usepackage{tikz}
 21  #+LaTeX_HEADER: \usemintedstyle{gruvbox-light}
 22  #+LaTex_HEADER: \def\shrug{\texttt{\raisebox{0.75em}{\char`\_}\char`\\\char`\_\kern-0.5ex(\kern-0.25ex\raisebox{0.25ex}{\rotatebox{45}{\raisebox{-.75ex}"\kern-1.5ex\rotatebox{-90})}}\kern-0.5ex)\kern-0.5ex\char`\_/\raisebox{0.75em}{\char`\_}}}
 23  #+LaTex_HEADER: \definecolor{AlmostWhite}{rgb}{0.97,0.97,0.97}
 24  
 25  * Description of the NSA manipulated SBT algorithm
 26  ** Abstract
 27  
 28  We reverse-engineered and analyzed the ROM of a cryptographic device
 29  manipulated by the NSA. The following paper summarizes the findings
 30  and gives a preliminary analysis. For more information about this
 31  device see the page in the cryptomuseum:
 32  
 33  https://www.cryptomuseum.com/crypto/philips/ua8295/sbt.htm
 34  
 35  This is a draft of a work in progress project.
 36  
 37  ** Diffing the original and the manipulated ROM
 38  *** The original ROM contained a DES implementation.
 39  *** The original DES function has been replaced with what we refer to az nsa() function below.
 40  *** A diff with the manipulated ROM shows the region 0x1c5d-0x2181 to be changed.
 41  *** The code between 0x1aed - 0x2181 has been entirely decompiled by hand.
 42  
 43  A lot of other code has also been hand decompiled, but is irrelevant
 44  for this topic and discussion of this has been omitted from this document.
 45  
 46  
 47  *** One change is a modification of a constant, this seems to be a checksum function.
 48  
 49  #+LATEX: \hfill
 50  #+LATEX: \begin{minipage}{0.95\textwidth}
 51  #+ATTR_LATEX: :options bgcolor=AlmostWhite
 52  #+begin_SRC c
 53    char maybe_checksum(char *param_1,char param_2) {
 54      char cVar1 = 0x12; // changed from 0xab
 55      if ((char)((ushort)param_1 >> 8) == 0x40) {
 56        return 0x40;
 57      }
 58      do {
 59        cVar1 += *param_1++;
 60      } while (param_2 != (char)((ushort)param_1 >> 8));
 61      return cVar1 + -1;
 62    }
 63  #+END_SRC
 64  #+LATEX: \end{minipage}
 65  
 66  This explains the claim by cryptomuseum that the checksum of the original ROM is equal to the checksum of the NSA manipulated ROM.
 67  *** Another change is a one byte change in an unused data area.
 68  
 69  Speculation: this might be to force the colliding checksum with the original ROM.
 70  
 71  *** The diff also shows a few small changes outside the main changed area, all these are call-sites into the main changed area.
 72  
 73  The targets of these call sites are:
 74   - 5 times: 1c6f ($nsa\_init()$)
 75   - 7 times: 1cf5 ($crypt\_byte()$)
 76   - once 1d13     ($nsa\_short()$)
 77  
 78  ** Generic algorithm overview
 79  *** The algorithm is a stream-cipher construction.
 80  *** The Key is a 15 character string of values for characters A-Z, 0-9 and ',', '.', '-', space.
 81  *** The key has a total entropy of \tilde80 bits.
 82  
 83  log_2(40^15) = 79.82892142331043
 84  
 85  *** The plaintext contains only characters A-Z,0-9 and ',', '.', '-', space and '$\_$' used only for padding.
 86  
 87  It is possible that also newline is an allowed character. This must be
 88  verified, but doesn't influence anything significantly.
 89  
 90  The charset consists of 41 valid characters, which are represented by 6 bits. This means:
 91   - 27 characters can be represented by only 5 bits (a-z and padding char '$\_$')
 92   - the 6th bit is always 0 for letters-only, for all allowed
 93     characters there is only a 35% chance that bit 6 is set (and thus
 94     one of the numbers or punctuation.)
 95  
 96  *** The algorithms I/O
 97  **** The nsa() function takes 2 input parameters: 8B state, 7B message key.
 98  **** The nsa() function outputs 2 results: an updated 8B state, and an 8B cipherblock.
 99  **** The nsa() function updates the state by applying an LFSR for 64 steps: (1 + x^32 + x^63).
100  *** Initialization
101  **** A 15 bit (with 12 bits of entropy) nonce is passed as a parameter.
102  This means that a birthday collision of the key stream will have a 50%
103  chance after just 64 messages with the same key.
104  
105  15 bits from 12 bits of entropy are the result of taking 3 4bit
106  values and incremented by 1, which thus can be stored on 5 bits, but
107  those 5 bits can only take 16 distinct values (1-16)
108  
109  **** To initialize the cipher, the key is split into 2 parts.
110  **** The first 8 characters of the key are used for state.
111  **** The 3x5 bits of the nonce are xored with the low bits of the first 3 bytes of the state.
112  **** The last 7 characters of the key are used for the message key.
113  **** The core nsa() function is called with the state and the message key.
114  **** The first 7 bytes of the output of the nsa() function is used as the derived message key.
115  **** The state is set to a fixed value of { 0xf5, 0xc0, 0x7a, 0x10, 0x8a, 0xaf, 0x17, 0xcf }
116  **** nsa() is called a second time generating the first 8 bytes of the cipherstream.
117  **** The state last 3 bytes are overwritten with 3x5 bits of the nonce with the 3 highest bits of the overwritten bytes set to 0.
118  *** Setting a key
119  **** The initialization is called with a nonce with all bits set to zero
120  **** The verification check group is calculated based on the value of the 8B cipherblock
121  
122  The code calculating and converting the output to ASCII looks as following:
123  
124  #+LATEX: \hfill
125  #+LATEX: \begin{minipage}{0.95\textwidth}
126  #+ATTR_LATEX: :options bgcolor=AlmostWhite
127  #+BEGIN_SRC c
128    static void key_id(const uint8_t *cipherblock) {
129      printf("key_id: ");
130      uint8_t *cb=cipherblock; // alias
131      for(int i=7;i>0;i-=2) {
132        putchar((((cb[i]>>4 ^ cb[i-1]) & 0xf) | 0x40) + 1);
133      }
134      putchar('\n');
135    }
136  #+END_SRC
137  #+LATEX: \end{minipage}
138  
139  It is known, that a key consisting of 15 times the letter A - using the
140  manipulated algorithm -, generates a verification code of JJDI.
141  *** A hard-coded default key exists, it is 2V58RGKLXN49TKD
142  
143  This key was also present in the original non-manipulated ROM.
144  *** Encrypting a string
145  **** Generate nonce by reading the 8bit timer TL0 5 times and xoring this with the previous encryption first five cipherstream bytes, keeping only the low nibbles.
146  
147  The following code shows the important details:
148  
149  #+LATEX: \hfill
150  #+LATEX: \begin{minipage}{0.95\textwidth}
151  #+ATTR_LATEX: :options bgcolor=AlmostWhite
152  #+BEGIN_SRC c
153    volatile uint8_t TL0; // 8b Timer 0 at SFR:0x89
154    // nonce is uninitialized assumed to be 0 after reset
155    // nonce is overwritten after init with cipherblock[:5]
156    static uint8_t nonce[5]={0};
157  
158    for(int i = 0; i<5; i++) {
159      uint8_t tmp = (nonce[i] ^ TL0) & 0xf;
160      nonce[i] = tmp + 1;
161    }
162  #+END_SRC
163  #+LATEX: \end{minipage}
164  
165  Note, that this is simplified. In the ROM each generated "character" is
166  also displayed on the device display and optionally also sent to the
167  printer, which takes more cycles.
168  
169  Hypothesis: according to nonces generated by a real device, it seems that if the
170  printer is not enabled, and no interrupt is handled, then TL0
171  increases by 15 in each iteration.
172  
173  Question, if the printer is enabled, how does TL0 increase in each iteration?
174  
175  All in all it can be said that TL0 is a very bad random source.
176  
177  **** Initialize the derived message key using the "random" nonce generated in the previous step.
178  **** Copy the first 5B of the initial cipherblock to the nonce.
179  
180  This step can have two explanations:
181  
182   - it is to improve the entropy of the next nonce
183   - it is to leak the first 5 lower nibbles of the initial cipherblock
184  
185  The second explanation only makes sense if it is possible to reliably
186  guess the delta of TL0 when generating successive values for the
187  nonce.
188  **** Iteratively xor 3 bytes of plaintext with the cipherblock.
189  **** Expand the lower 6b of each ciphertext triple into 5 x 6b creating 5 letter groups.
190  
191  This step discards the top 2b of each ciphertext byte, since the original plaintext only used the lower 6b anyway.
192  
193  These 5 letter groups are also displayed on the device and optionally also sent to the printer.
194  **** Whenever a cipherblock is consumed, genereate 8 more cipher stream bytes by calling nsa() with the state and the derived message key.
195  *** Block operation
196  
197  #+LATEX: \hfill
198  #+LATEX: \begin{minipage}{0.95\textwidth}
199  #+ATTR_LATEX: :options bgcolor=AlmostWhite
200  #+BEGIN_SRC python
201      for i in 0..4:
202          nonce[i] = ((nonce[i] ^ TL0) & 0xf) + 1
203      state, cipherblock = nsa(key[0:8] ^ nonce[0:3],key[8:15])
204      message_key = cipherblock[0:7]
205      state0 = [0xf5, 0xc0, 0x7a, 0x10, 0x8a, 0xaf, 0x17, 0xcf]
206      state, cipherblock = nsa(state0, message_key)
207      # note state[0:5] is fully known, it is state0
208      # run through the LFSR 64 times
209      state = state[0:5] + nonce[0:3]
210      # since nonce is also fully known
211      # the full state from here on is known
212      nonce = cipherblock[0:5]
213      for each plaintextblock:
214          # state is only modified using
215          # 64 steps of the (1 + x^32 + x^63) LFSR
216          state, cipherblock = nsa(state, message_key)
217          ciphertextblock = cipherblock ^ plaintextblock
218  #+END_SRC
219  #+LATEX: \end{minipage}
220  
221  The following gives a quick overview of all the steps involved when
222  running the nsa() function:
223  
224  Input: state, message key
225  
226   1.  state = lfsr(64), output state
227   2.  cipherblock = $fix\_bit\_permutation(state)$
228   3.  iterate 8 times
229     1. rotate right both 28b halves of the message key by [5, 2, 2, 5, 5, 5, 2, 2] bits
230     2. $ctrl\_words[i:0..3] = state[3-i] \oplus msgkey\_bits(bits...)$ (for the concrete bits, see details below)
231     3. rotate state right by one byte
232     4. for each nibble of the cipherblock, modify it based on $ctrl\_words$ & cipherblock
233     5. permute cipherblock bytes, new order: 3 4 6 7 2 1 5
234     6. nibble swap cipherblock bytes if bits 26, 54, 1, 29, 4, 32, 7, 35 of message key are set
235     7. SBOX each cipherblock nibble
236  
237  Output: cipherblock, state from step 1
238  
239  For details about each of these steps, see the following section.
240  
241  ** Core algorithm - $nsa(state,message\_key)$
242  The core algorithm is well described on the original cryptomuseum page at: https://www.cryptomuseum.com/crypto/philips/ua8295/sbt.htm
243  *** The state is updated by applying an LFSR for 64 steps: (1+x^{32}+x^{63})
244  
245  For the first cipherblock the initial state is 0xf5c07a108aaf17cf
246  which is shifted to be 0xbfb7b6ef1a8c5090.
247  
248  *** Cipherblock is initialized to the updated state
249  *** Cipherblock bits are permuted
250  
251  New order of bits:
252  
253  #+LATEX: \hfill
254  #+LATEX: \begin{minipage}{0.95\textwidth}
255  #+ATTR_LATEX: :options bgcolor=AlmostWhite
256  #+BEGIN_SRC python
257  35, 10, 30, 57, 2, 55, 40, 20
258  29, 0, 62, 15, 52, 34, 43, 23
259  58, 28, 12, 46, 4, 19, 53, 38
260  42, 31, 6, 36, 9, 48, 21, 60
261  18, 51, 32, 1, 41, 56, 26, 13
262  50, 37, 25, 45, 8, 17, 59, 5
263  47, 61, 11, 27, 33, 54, 7, 22
264  49, 3, 14, 63, 24, 16, 39, 44
265  #+END_SRC
266  #+LATEX: \end{minipage}
267  
268  Bits are indexed from left to right, from 0 .. 63.
269  
270  For the first cipherblock, with the known fixed state of
271  0xf5c07a108aaf17cf, the result of this permutation is:
272  0xead04d72c73b23ed.
273  
274  For the basis of this analysis, see the file checkbitperm.py.
275  
276  *** The following steps are iterated 8 times
277  **** Rotation of message key.
278  
279  The message key is split into two 28 bit chunks, and both of the
280  chunks are rotated right by either 5 or 2 bits each round. The number
281  of bits rotated for each of the eight rounds is: 5, 2, 2, 5, 5, 5, 2, 2.
282  
283  See the following illustration for the bits of the 7 bytes of the
284  message key, each character is a bit, the number at the start of the
285  line shows by how many bits the halves are rotated to the right, in
286  each of the rounds. The first line is the original key that is rotated
287  in the first round into the second line, thus the original order of
288  the message key is only used in the last round.
289  
290  \footnotesize
291  #+BEGIN_EXAMPLE
292    abcdefgh ijklmnop qrstuvwx yzAB | CDEF GHIJKLMN OPQRSTUV WXYZ0123
293  5 xyzABabc defghijk lmnopqrs tuvw | Z012 3CDEFGHI JKLMNOPQ RSTUVWXY
294  2 vwxyzABa bcdefghi jklmnopq rstu | XYZ0 123CDEFG HIJKLMNO PQRSTUVW
295  2 tuvwxyzA Babcdefg hijklmno pqrs | VWXY Z0123CDE FGHIJKLM NOPQRSTU
296  5 opqrstuv wxyzABab cdefghij klmn | QRST UVWXYZ01 23CDEFGH IJKLMNOP
297  5 jklmnopq rstuvwxy zABabcde fghi | LMNO PQRSTUVW XYZ0123C DEFGHIJK
298  5 efghijkl mnopqrst uvwxyzAB abcd | GHIJ KLMNOPQR STUVWXYZ 0123CDEF
299  2 cdefghij klmnopqr stuvwxyz ABab | EFGH IJKLMNOP QRSTUVWX YZ0123CD
300  2 abcdefgh ijklmnop qrstuvwx yzAB | CDEF GHIJKLMN OPQRSTUV WXYZ0123
301  #+END_EXAMPLE
302  \normalsize
303  
304  After 8 rounds the message key is rotated into its original state.
305  
306  For the basis of this analysis, see the file ctrlwords.py and
307  updctrlvar.py.
308  
309  **** Create 4 box permutation control bytes by xoring the first 4 bytes of the state in reverse order with certain bits of the message key.
310  
311  \footnotesize
312  #+LATEX: \hfill
313  #+LATEX: \begin{minipage}{0.95\textwidth}
314  #+ATTR_LATEX: :options bgcolor=AlmostWhite
315  #+BEGIN_SRC python
316       ctrl_words[0] = state[3] ^ msgkey_bits(10, 38, 13, 41, 16, 44, 19, 47)
317       ctrl_words[1] = state[2] ^ msgkey_bits(22, 50, 25, 53, 0, 28, 3, 31)
318       ctrl_words[2] = state[1] ^ msgkey_bits(6, 34, 9, 37, 12, 40, 15, 43)
319       ctrl_words[3] = state[0] ^ msgkey_bits(18, 46, 21, 49, 24, 52, 27, 55)
320  #+END_SRC
321  #+LATEX: \end{minipage}
322  \normalsize
323  
324  Bits are indexed from left to right, 0..55.
325  
326  **** Rotate the state by 1 byte right.
327  **** Box permutation of cipherblock - defined by the 4 control bytes created previously in step (b).
328  
329  This part is manipulating each nibble of the cipherblock in order,
330  from left to right, it operates on nibbles (4 bit) and quads (2 bit)
331  values.
332  
333  Let's denote C the vector of 16 nibbles of the cipherblock. Then the
334  current nibble is $n = C_i, i \in \{0..16\}$. Furthermore let's
335  denote the high quad of a nibble by $n_{hi}$ and the low quad
336  $n_{lo}$.
337  
338  Depending on
339  
340   - the current control quad (generated in step (b) above), and
341   - the value (either 0 or 3) of one of the two quads of the current nibble,
342  
343  the current nibble is either
344  
345   - changed by a certain value (one of \pm 1 or \pm 4) or
346   - a second nibble is selected from $C$ by xoring one bit in the index
347     i: $n^{\prime} = C_{i \oplus s}, s \in {8,2,4,1}$. The high and low
348     quad of this second nibble are added to the quad of the current
349     nibble that is not used for deciding whether to do this. The quad
350     used for deciding to do this is inverted.
351  
352  | cbtq | quad      | 2nd nibble                | n_{hi}                           | n_{lo}                           | else |
353  |------+-----------+---------------------------+----------------------------------+----------------------------------+------|
354  |    0 | n_{hi}==0 | n^{\prime}=C_{i \oplus 8} | =3                               | +n^{\prime}_{lo}+n^{\prime}_{hi} |   -4 |
355  |    1 | n_{lo}==0 | n^{\prime}=C_{i \oplus 2} | +n^{\prime}_{lo}+n^{\prime}_{hi} | =3                               |   -1 |
356  |    2 | n_{hi}==3 | n^{\prime}=C_{i \oplus 4} | =0                               | +n^{\prime}_{lo}+n^{\prime}_{hi} |   +4 |
357  |    3 | n_{lo}==3 | n^{\prime}=C_{i \oplus 1} | +n^{\prime}_{lo}+n^{\prime}_{hi} | =0                               |   +1 |
358  
359  Let's see an example: the control byte top quad (cbtq) is 0, the index
360  (i) of the current nibble is 5 and the value of the current nibble
361  is 1 (C_5 = 1). Then cbqt selects the first row in the table above. Since the
362  current nibble is 1, the high quad is all zeros, and thus we have to
363  get the 2nd nibble, whose index is 13 (5 \oplus 8), let's say it's value
364  is 9 (C_13 = 9). This means the high quad of the output nibble will be 3, and the
365  bottom quad will be $1 (n_{lo}) + 2 (n^{\prime}_{hi}) + 1 (n^{\prime}_{lo}) = 4 mod 4 =
366  0$. Combining all this, the nibble C_{5} will change from 1 to 12.
367  
368  For the basis of this analysis, see the file boxperm-visu.py
369  **** Cipherblock fix byte permutation, new order of bytes: 3, 4, 6, 7, 2, 1, 5, 0
370  **** Cipherblock Nibble switch - controlled by derived message key
371  
372  Depending on the bits 26, 54, 1, 29, 4, 32, 7, 35 message key, the
373  according byte of the cipherblock has its nibbles swapped.
374  
375  **** Each nibble of the cipherblock is mapped with their own SBox
376  There is 16 4x4 SBoxes in total.
377  
378  ** Test Vectors
379  
380  The fine people of the cryptomuseum generously generated a few test
381  vectors:
382  
383  #+BEGIN_EXAMPLE
384  HELLO WORLD
385  (truncated) BAMLK GAFFJ EDBCP
386  (recovered) BAMLK GAFFJ EOBCP EEIMP CDAFP
387  NAVEL PLUIS
388  (truncated) HCAMM MBKJO DGCNC
389  (truncated) EGLLA IKMHG PHHJL
390  LEEIC BITKY DBCFE DPKLB ECILE
391  JDMNJ REJHP CPBLO JPJDD BEDNG
392  #+END_EXAMPLE
393  
394  Unfortunately some of them have been truncated during transcription,
395  but knowing the plaintext and the nonce, and having a working
396  emulation of the original rom and our reverse-engineered
397  re-implementation we were able to recover the missing 5 letter groups.
398  
399  ** Analysis
400  *** Key-strength
401  
402  Although the "master" key is 15 characters from an alphabet of 40
403  signs and thus ~80 bits of entropy, after the initialization only 56
404  bits of entropy are used as a message key for the generation of the
405  first cipherblock. Thus on the surface it looks like the strength is
406  80 bits, in reality it is reduced to 56 bits immediately.
407  
408  This means an attacker might want to recover the message key which
409  will be useful to decrypt other messages using the same nonce and
410  key.
411  
412  *** Randomness
413  
414  The first message after rebooting uses only 12 bits of the the 8bit
415  timer TL0 for generating a nonce, this is very predictable and very
416  low quality. All subsequent messages have nonces that are using the
417  8bit timer xored with the low nibbles of the first 5 bytes of the
418  previous key stream.
419  
420  *** Nonce Reuse
421  
422  Since the nonce is effectively only 12b, collisions can happen very
423  often due to the birthday paradox. With 64 messages the chances for a
424  collision are 50%.
425  
426  *** Information leakage
427  
428  After reboot, the very first message leaks the delta of TL0 between
429  the 5 iterations generating the nonce. This delta is depending on
430  whether the nonce is also sent to the printer, or only the display,
431  another factor that might affect this is any interrupts occuring
432  during the generation of the nonce.
433  
434  Starting with the second message the first 5 lower nibbles of the
435  initial cipherblock of the previous message leaks - althought this is
436  xored with the 8b timer TL0, which seems to have a very predictable
437  delta, which can be learned by looking at the first message after a
438  reboot.
439  
440  *** Algebraic attack
441  
442  An algebraic attack without any extra information is not efficient, in
443  our tests a bruteforce attack was significantly faster. It might be
444  possible that an algebraic attack taking using the leakage of the
445  initial cipherblock in the following messages nonce might improve,
446  this needs to be tested.
447  
448  *** Bruteforce attack
449  
450  A naive bruteforce attack against 4 out of 7 bytes of the derived
451  message key completes in about 70 minutes and finds 16328869 candidate
452  plaintexts for a the encrypted string "HELLO WORLD_". Interestingly
453  some candidate plaintexts occur multiple times for example
454  "WBMJN6W6V2QR" is generated using the derived message keys:
455   - 51914e1b5951f5
456   - 51914e1f5951f5
457   - 11914e1f5951f5
458  Hypothesis: This is due to the top 2 bits of each cipherblock byte
459  being discarded in this character encoding scheme. And thus it seems
460  plausible that each candidate plaintext can be generated using 4
461  different derived message keys.
462  
463  *** Known and limited state
464  
465  After the generation of the derived message key the state is
466  initialized to a known value of
467  
468  #+BEGIN_EXAMPLE
469  [0xf5, 0xc0, 0x7a, 0x10, 0x8a, 0xaf, 0x17, 0xcf]
470  #+END_EXAMPLE
471  
472  It is curious that this fixed state is being "LFSRed" and used for the
473  generation of the first cipherblock, and only aftewards are the last 3
474  bytes of this "LFSRed" state overwritten by the first three bytes of
475  the nonce. Why not use the nonce also for the first cipherblock? Could
476  it be that there is some pre-computation look-up table that can
477  somehow be used for this fixed state value?
478  
479  All later rounds modified the state only by running it 64 times per
480  block through the LFSR. Since the 3 bytes of the nonce each have their
481  top nibble set to all-zero, this means that for the second cipherblock
482  there is only 4096 different starting points in the LFSR.
483  
484  It is also notable, that since the inputs to the nsa() function are
485  only the fixed message key and the trivially computable state, this
486  function can be very efficiently parallelized, and any arbitrary
487  cipherblock can be independently computed without computing all
488  preceeding cipherblocks.
489  
490  *** Message key analysis
491  
492  The message key is derived from the 15 character master key by running
493  it through the nsa() algorithm once. It is 56 bits long. The message
494  key:
495   - is split in half and each half is rotated by 5 or 2 bits to the right.
496   - then it is used in the during the 8 rounds of the nsa() algorithm
497     to create the 32 bit control word for the box permutation.
498   - and finally 8 bits of it are also used for the nibble switch
499     operation.
500  
501  We run a symbolic simulation of these 8 rounds to see which bits of
502  the message key are (un)used, and how often.
503  
504  The following shows the 56 bit message key, with each bit assigned its own
505  unique symbol (a-zA-Z0-3):
506  
507  \footnotesize
508  \ttfamily
509  abcdefgh ijklmnop qrstuvwx yzABCDEF GHIJKLMN OPQRSTUV WXYZ0123
510  \normalsize
511  \normalfont
512  
513  In the following figure the first four lines show which bits of the
514  message key were used for the four box perm control words in exactly
515  this order, and the last line in this figure shows which message key
516  bits were used for the nibble swap operation.
517  
518  The first column in each row shows the bits that are unused for the
519  specific value. All bits that are used twice are highlighted with red.
520  
521  Box permutation control words (rounds in row, words in column):
522  
523  \footnotesize
524  \ttfamily
525  2A \textcolor{red}{fH}iKlN\textcolor{red}{oQ} dFgIjL\textcolor{red}{mO} bDeGhJ\textcolor{red}{kM} y0B3cE\textcolor{red}{fH tV}wYz1aC \textcolor{red}{oQ}rTuWxZ \textcolor{red}{mO}pRsUvX \textcolor{red}{kM}nPqS\textcolor{red}{tV}
526  
527  Mk \textcolor{red}{rT}uWxZ\textcolor{red}{A2} pRsUvX\textcolor{red}{y0} nPqStV\textcolor{red}{wY} iKlNoQ\textcolor{red}{rT dF}gIjLmO \textcolor{red}{A2}bDeGhJ \textcolor{red}{y0}B3cEfH \textcolor{red}{wY}z1aC\textcolor{red}{dF}
528  
529  Yw \textcolor{red}{bD}eGhJ\textcolor{red}{kM} B3cEfH\textcolor{red}{iK} z1aCdF\textcolor{red}{gI} uWxZA2\textcolor{red}{bD pR}sUvXy0 \textcolor{red}{kM}nPqStV \textcolor{red}{iK}lNoQrT \textcolor{red}{gI}jLmO\textcolor{red}{pR}
530  
531  Ig \textcolor{red}{nP}qStV\textcolor{red}{wY} lNoQrT\textcolor{red}{uW} jLmOpR\textcolor{red}{sU} eGhJkM\textcolor{red}{nP B3}cEfHiK \textcolor{red}{wY}z1aCdF \textcolor{red}{uW}xZA2bD \textcolor{red}{sU}vXy0\textcolor{red}{B3}
532  \normalsize
533  \normalfont
534  
535  Nibble swap control words for each round:
536  
537  \footnotesize
538  \ttfamily
539  Qo \textcolor{red}{vX}y0B3\textcolor{red}{cE} tVwYz1\textcolor{red}{aC} rTuWxZ\textcolor{red}{A2} mOpRsU\textcolor{red}{vX hJ}kMnPqS \textcolor{red}{cE}fHiKlN \textcolor{red}{aC}dFgIjL \textcolor{red}{A2}bDeG\textcolor{red}{hJ}
540  \normalsize
541  \normalfont
542  
543  It is interesting to see, how the repeat bits occur always in the
544  same position, and also how the same bits occur in exactly the same
545  order in each line.
546  
547  It is also notable, that the repeats happen always in pairs, or rather
548  quads, which is very relevant for the box permutation. Although these
549  control words are all xored with the state, this state is known.
550  
551  The quads that are missing from one line, are double in the line above
552  it.
553  
554  For the basis of this analysis, see the file $msgkey\_ana.py$.
555  
556  *** Scrambling pads ciphertext to multiple of 3
557  
558  Due to the expansion from 3 chars to the 5 letter group, the very last
559  two characters in a message might be the "$\_$" character which is
560  used for padding. Introducing a known plaintext crib.