Poem Codes - WWII Crypto Techniques
Introduction
A few years back, after I won my first crypto contest, the contest author, G. Mark Hardy, suggested I read Between Silk and Cyanide. Written by Leo Marks, it’s a first-person account of the difficulties of managing cryptographic communications with field agents in Europe during World War II.
Much of the story centered on the “poem codes” used by the agents, but the technical details were kind of obscure and not clearly explained. So I thought I’d do my best to document how I think it worked. This probably isn’t the exact method they used, but hopefully it’ll be close enough that you can get the general idea, and understand some of the difficulties these agents faced.
Background
The British Special Operations Executive, or SOE, was tasked with “running” agents in Europe during World War II. These agents primarily operated in occupied or enemy territory, such as France and Germany, so any communication with England presented exceptional risks. Their message traffic was encrypted to protect it. But, unlike today, they couldn’t simply install S/MIME on a smartphone. Instead, they had to encrypt and decrypt each message by hand, then pass the ciphertext to a radio operator, who’d send it out over shortwave in Morse code.
The encryption system they used had to meet several very important criteria:
- Requires only pencil and paper
- Gives each message its own key
- Is reasonably secure against cryptanalysis
- Most importantly, leaves no lasting evidence of code use (no codebooks, etc.)
The poem code system met these requirements. It uses a reasonably straightforward procedure that can be executed on paper. It supports unique keys for each message, while the “master key” used to derive message keys is a poem, committed to memory by the agent. Finally, the actual encryption used is double columnar transposition, providing (for the time) very good security.
However, it had some drawbacks. Though the system is simple in its mechanics and easily learned, it’s cumbersome, time-consuming, and prone to errors: even a small error can render an entire message indecipherable. If a message is re-transmitted because of errors, sending the exact same plaintext twice under different keys can leak valuable information to an attacker. A similar risk occurs if the same key is used twice for different messages of the same length. Finally, if the poem used by an agent is ever discovered (or perhaps revealed through torture), then all communications to and from that agent could be easily deciphered.
Improvements
When Leo Marks joined SOE, he quickly recognized some of these limitations and set about to mitigate the risks they posed. Under his guidance, the SOE personnel responsible for decoding agents' messages from the field mounted a huge effort to decipher the “indecipherables.” Using cryptanalysis, and knowing the typical errors agents made (as well as any individual agent’s weaknesses), they significantly reduced the number of messages that couldn’t be deciphered. This lowered the number of retransmissions, which also reduced the time the radio operators needed to broadcast messages. This, coupled with better training for the agents, resulted in greatly improved reliability and decreased exposure for everyone.
Another major contribution was the use of custom poems. The practice up to that point had favored poems that were easy to memorize because, in many cases, they already were memorized by the agents. Popular poems, famous poems, favorite rhymes from childhood, etc., were all likely sources of an agent’s “personal” code poem. But their very familiarity presented a risk, in that the enemy could simply try the 100 most popular poems and greatly increase their chances of finding a match.
So Marks instructed each agent to create their own poem, known only to themselves and the SOE staff in London. For agents who didn’t feel up to the task, Marks composed many poems himself, kept them locked away, and could issue one to anyone who needed it. One of these poems later became famous in its own right. “The Life That I Have” was issued to Violette Szabo, who was eventually captured and executed. The poem gained prominence when included in a movie about Szabo, and again later when it was read at Chelsea Clinton’s wedding.
Encryption
So how did this system work? There were two distinct phases: key generation and encryption.
First, the agent would randomly select five words from their poem. They could do this by flipping coins, rolling dice, or any other similar method. Once selected, the agent needed to indicate to the recipient which words were chosen. To do this, they would send an “indicator group” of five letters, where each letter gave the position of one key word in the poem: choosing the first word of the poem put an “A” in the indicator group, the eighth word an “H”, and so forth.
Let’s work an example as we go. Agent X has “The Jabberwocky” as his poem:
'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe.
All mimsy were the borogoves,
And the mome raths outgrabe.
For our key we’ll pick “the” (the fourth word), “all,” “mome,” “gyre,” and “’twas.” The indicator group, then, is DNUHA.
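The indicator step can be sketched in a few lines of Python. This is only an illustration of the method as described here: the function names are made up, and stripping punctuation (so that “'Twas” becomes “twas”) is my own assumption.

```python
import re
import string

POEM = """'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe.
All mimsy were the borogoves,
And the mome raths outgrabe."""


def poem_words(poem):
    """Split the poem on whitespace and keep only the letters of each word."""
    words = (re.sub(r"[^a-z]", "", w) for w in poem.lower().split())
    return [w for w in words if w]


def indicator_group(positions):
    """Turn 1-based word positions into letters: 1 -> A, 8 -> H, and so on."""
    if any(not 1 <= p <= 26 for p in positions):
        raise ValueError("one letter can only point at words 1 through 26")
    return "".join(string.ascii_uppercase[p - 1] for p in positions)


def key_words(poem, indicator):
    """Recover the chosen words from an indicator group (the receiver's step)."""
    words = poem_words(poem)
    return [words[string.ascii_uppercase.index(c)] for c in indicator.upper()]


print(indicator_group([4, 14, 21, 8, 1]))  # DNUHA
print(key_words(POEM, "DNUHA"))  # ['the', 'all', 'mome', 'gyre', 'twas']
```

Note that a single letter can only reach the first 26 words, which is why the sketch rejects positions above 26.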
After selecting the key words, the agent would write them all together as one word and number its letters in alphabetical order, starting with “A”. So the first A would be number 1, then if there was a second A (further to the right) that would be number 2. Then Bs would be numbered, then Cs, etc. Once the entire key word is numbered, those numbers themselves become the encryption key.
Our key word is therefore "THEALLMOMEGYRETWAS". Numbering first the As, we get:
T  H  E  A  L  L  M  O  M  E  G  Y  R  E  T  W  A  S
         1                                      2
There are no Bs, Cs, or Ds, so the next letter up is E:
T  H  E  A  L  L  M  O  M  E  G  Y  R  E  T  W  A  S
      3  1                 4           5        2
Continue until everything is numbered:
T  H  E  A  L  L  M  O  M  E  G  Y  R  E  T  W  A  S
15 7  3  1  8  9  10 12 11 4  6  18 13 5  16 17 2  14
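The numbering rule is easy to get wrong by hand, so here is a small Python sketch of it. The function name is hypothetical; the only assumption beyond the text is that duplicate letters are numbered left to right, which is what the example above does.

```python
def transposition_key(keyword):
    """Number the letters alphabetically; equal letters are numbered left to right."""
    letters = [c for c in keyword.upper() if c.isalpha()]
    # sorted() is stable, so tied letters keep their left-to-right order.
    order = sorted(range(len(letters)), key=lambda i: letters[i])
    key = [0] * len(letters)
    for number, position in enumerate(order, start=1):
        key[position] = number
    return key


print(transposition_key("THEALLMOMEGYRETWAS"))
# [15, 7, 3, 1, 8, 9, 10, 12, 11, 4, 6, 18, 13, 5, 16, 17, 2, 14]
```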
After generating the key, and communicating its elements with the indicator group, the agent must begin the actual encryption of the message. First, write the key across the top of a grid, one key letter (and its number) per column. Then write the plaintext message left-to-right underneath the key, one letter per column, starting a new row whenever the end of the key is reached. Once the message is written out, the letters are read back by going down each column, taking the columns in the order of their key numbers.
Continuing the example from above, we write the key out and then copy the message plaintext below the key, one letter per column:
T  H  E  A  L  L  M  O  M  E  G  Y  R  E  T  W  A  S
15 7  3  1  8  9  10 12 11 4  6  18 13 5  16 17 2  14
I  h  a  v  e  d  e  p  o  s  i  t  e  d  i  n  t  h
e  c  o  u  n  t  y  o  f  B  e  d  f  o  r  d  a  b
o  u  t  f  o  u  r  m  i  l  e  s  f  r  o  m  B  u
f  o  r  d  s  i  n  a  n  e  x  c  a  v  a  t  i  o
n  o  r  v  a  u  l  t  s  i  x  f  e  e  t  b  e  l
o  w  t  h  e  s  u  r  f  a  c  e  o  f  t  h  e  g
r  o  u  n  d  t  h  e  f  o  l  l  o  w  i  n  g  X
NOTE: I’ve added a null character (“X”) to the end of the message so that the plaintext fills every column in every row. In practice a message will rarely fit the grid this neatly, but it makes things very easy for this example. See other, much better written explanations of columnar transposition for how to deal with incomplete rows.
Then, to encrypt, first read down each column, starting with number 1:
(col 1): vufdvhn
(col 2): taBieeg
(col 3): aotrrtu ....
Or, for the entire message:
vufdvhn taBieeg aotrrtu sBleiao
dorvefw ieexxcl hcuoowo enosaed
dtuiust eyrnluh ofinsff pomatre
effaeoo hbuolgX Ieofnor iroatti
ndmtbhn tdscfel
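A single transposition pass like the one above can be sketched in Python as follows. The function names are mine, not SOE’s, and the key is hard-coded from the worked example. The slice `letters[col::width]` picks out one column of the grid from top to bottom.

```python
# Numeric key derived from THEALLMOMEGYRETWAS in the worked example.
KEY = [15, 7, 3, 1, 8, 9, 10, 12, 11, 4, 6, 18, 13, 5, 16, 17, 2, 14]

PLAINTEXT = (
    "I have deposited in the county of Bedford about four miles from "
    "Bufords in an excavation or vault six feet below the surface of "
    "the ground the following X"
)


def read_columns(text, key):
    """Write text row by row under the key, then return the columns in key order."""
    letters = [c for c in text if c.isalpha()]  # spaces and punctuation are dropped
    width = len(key)
    # Position of key number 1 first, then key number 2, and so on.
    read_order = sorted(range(width), key=lambda i: key[i])
    return ["".join(letters[col::width]) for col in read_order]


def columnar_encrypt(text, key):
    """One pass of columnar transposition: the columns joined in key order."""
    return "".join(read_columns(text, key))


columns = read_columns(PLAINTEXT, KEY)
print(" ".join(columns[:3]))  # vufdvhn taBieeg aotrrtu
```

Because the example message fills the grid exactly, every column has seven letters; with a ragged last row the slices would simply come out in two different lengths.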
But we’re not done! Do it a second time, writing the intermediate text row by row under the same key:
T  H  E  A  L  L  M  O  M  E  G  Y  R  E  T  W  A  S
15 7  3  1  8  9  10 12 11 4  6  18 13 5  16 17 2  14
v  u  f  d  v  h  n  t  a  B  i  e  e  g  a  o  t  r
r  t  u  s  B  l  e  i  a  o  d  o  r  v  e  f  w  i
e  e  x  x  c  l  h  c  u  o  o  w  o  e  n  o  s  a
e  d  d  t  u  i  u  s  t  e  y  r  n  l  u  h  o  f
i  n  s  f  f  p  o  m  a  t  r  e  e  f  f  a  e  o
o  h  b  u  o  l  g  X  I  e  o  f  n  o  r  i  r  o
a  t  t  i  n  d  m  t  b  h  n  t  d  s  c  f  e  l
Then assemble the new ciphertext as before, reading down column 1, then 2, then 3, and so on:
dsxtfui twsoere fuxdsbt Booeteh
gvelfos idoyron utednht vBcufon
hllipld nehuogm aautaIb ticsmXt
eronend riafool vreeioa aenufrc
ofohaif eowreft
Finally, it’s typical to break the message up into five-character groups. Don’t forget to add the 5-character indicator group at the beginning of the message, or the recipient won’t be able to regenerate the message key (unless, of course, you’ve already made arrangements to transmit this information via a different channel). This, then, is the final message:
dnuha dsxtf uitws oeref uxdsb tBooe tehgv elfos idoyr
onute dnhtv Bcufo nhlli pldne huogm aauta Ibtic smXte
ronen driaf oolvr eeioa aenuf rcofo haife owref t
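Putting the pieces together, the whole encryption can be sketched in Python as below. It follows the procedure as I’ve described it in this post, including my assumption that the same key is used for both passes; all function names are hypothetical.

```python
def transposition_key(keyword):
    """Number the letters alphabetically; equal letters are numbered left to right."""
    letters = [c for c in keyword.upper() if c.isalpha()]
    order = sorted(range(len(letters)), key=lambda i: letters[i])  # stable sort
    key = [0] * len(letters)
    for number, position in enumerate(order, start=1):
        key[position] = number
    return key


def columnar_encrypt(text, key):
    """Write text in rows under the key, read the columns out in key order."""
    letters = [c for c in text if c.isalpha()]
    width = len(key)
    read_order = sorted(range(width), key=lambda i: key[i])
    return "".join("".join(letters[col::width]) for col in read_order)


def five_letter_groups(text):
    """Break a string into groups of five (the last group may be shorter)."""
    return [text[i:i + 5] for i in range(0, len(text), 5)]


def poem_code_encrypt(plaintext, words, indicator):
    """Double columnar transposition, prefixed with the indicator group."""
    key = transposition_key("".join(words))
    first_pass = columnar_encrypt(plaintext, key)
    second_pass = columnar_encrypt(first_pass, key)  # same key twice: an assumption
    return " ".join([indicator.lower()] + five_letter_groups(second_pass))


PLAINTEXT = (
    "I have deposited in the county of Bedford about four miles from "
    "Bufords in an excavation or vault six feet below the surface of "
    "the ground the following X"
)
message = poem_code_encrypt(PLAINTEXT, ["the", "all", "mome", "gyre", "twas"], "DNUHA")
print(message)  # starts with: dnuha dsxtf uitws oeref
```

Run on the example plaintext and key words, this reproduces the final message shown above, capitals included.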
That’s what you’d then hand off to the radio operator, who’d broadcast it to London at the scheduled times and frequencies. (Of course, since it’s going out in Morse code, capitalization doesn’t survive. I just left the capitals in here because they make it easier to see how the letters get scrambled.)
Decryption
Decrypting works similarly. First, generate the key from the sender’s poem and the indicator group at the start of the message. In our example, the agent’s poem “The Jabberwocky” and the indicator group DNUHA give the key words “the all mome gyre twas.” Put those together, number the columns as before, and you’ve got the numeric key.
Write the key out across the columns, then fill the message in by columns, starting with column 1. Each column holds the message length divided by the key length, here 126 / 18 = 7 letters. That is, in the example, write "dsxtfui" under column 1, "twsoere" under column 2, etc. This will eventually give the 2nd table above. Read the intermediate text out by rows, starting at the top (vufdvhnt….). Write that, downwards by columns and again in key-number order, into a new grid with the same key row. Now you should have regenerated the first table above, and can simply read the plaintext back out row-by-row.
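The decryption steps above can be sketched in Python like this. As with the rest of this post it’s my reconstruction rather than a documented SOE procedure: the names are invented, and the sketch assumes a completely filled grid (as in the example) and the same key for both passes.

```python
import re

POEM = """'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe.
All mimsy were the borogoves,
And the mome raths outgrabe."""

MESSAGE = (
    "dnuha dsxtf uitws oeref uxdsb tBooe tehgv elfos idoyr "
    "onute dnhtv Bcufo nhlli pldne huogm aauta Ibtic smXte "
    "ronen driaf oolvr eeioa aenuf rcofo haife owref t"
)


def poem_words(poem):
    """Lowercase words of the poem with punctuation stripped."""
    return [re.sub(r"[^a-z]", "", w) for w in poem.lower().split()]


def transposition_key(keyword):
    """Number the letters alphabetically; equal letters are numbered left to right."""
    order = sorted(range(len(keyword)), key=lambda i: keyword[i])  # stable sort
    key = [0] * len(keyword)
    for number, position in enumerate(order, start=1):
        key[position] = number
    return key


def columnar_decrypt(ciphertext, key):
    """Undo one transposition pass: fill columns in key order, read by rows."""
    width = len(key)
    rows, leftover = divmod(len(ciphertext), width)
    if leftover:
        raise ValueError("this sketch only handles a completely filled grid")
    fill_order = sorted(range(width), key=lambda i: key[i])
    columns = [""] * width
    for n, col in enumerate(fill_order):
        columns[col] = ciphertext[n * rows:(n + 1) * rows]
    return "".join(columns[c][r] for r in range(rows) for c in range(width))


def poem_code_decrypt(message, poem):
    """Rebuild the key from the indicator group, then undo both passes."""
    groups = message.split()
    indicator, body = groups[0].lower(), "".join(groups[1:])
    words = poem_words(poem)
    keyword = "".join(words[ord(c) - ord("a")] for c in indicator)
    key = transposition_key(keyword)
    return columnar_decrypt(columnar_decrypt(body, key), key)


plaintext = poem_code_decrypt(MESSAGE, POEM)
print(plaintext)  # the original message, run together without spaces
```

The recipient still has to put the word breaks back in by eye, and to recognize and drop the trailing null.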
Alternatives
So, is this exactly how the SOE agents used poem codes? I don’t know for certain, but if anyone can point me to a very solid reference that’d be greatly appreciated. In particular, I’m uncertain whether they used the same key for both encryption steps, or if they removed duplicate words from their poems. I think it’s probably pretty close, from what I read in Marks’ book. Really, the hardest part is documenting how the key generation phase worked.
The key generation could use any of a number of different approaches. In fact, it’s even possible that several of these approaches were in use at once, with each agent having their own unique variation. This would certainly have added to the security of the system, at the expense of more complexity on the London end of the communications (having to track and associate each particular method with each agent).
Here are some ways they could have mixed up the key generation, just off the top of my head:
- Indicator groups point to letters, not words, in the poem
- Instead of using full words, only use the 1st letter, or 3rd letter, or maybe a different letter from each word (1st letter from 1st word, 2nd from 2nd, etc.)
- Handle duplicate letters in the final key word differently (number them in reverse order, rather than forwards)
- Eliminate duplicate words in the poem (an indicator letter can only reach the first 26 words of the poem, and if many of those are repeats, that decreases randomness somewhat)
- Instead of all indicator letters counting from the start of the poem, each could count from the previously chosen word (so “ABC” really points to the 1st, 1+2 or 3rd, and 1+2+3 or 6th, words of the poem)
- For the 2nd pass of encryption, perhaps use a second indicator group to pick a different key, or use the same indicator but count words backwards from the end of the poem, or other such complications
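To make a couple of these variations concrete, here is a rough Python sketch of two of them: indicator letters that count on from the previously chosen word, and duplicate key letters numbered right to left. Both are my own speculation, as is every name in the code; nothing here is documented SOE practice.

```python
from itertools import accumulate


def cumulative_positions(indicator):
    """Treat each indicator letter as a step forward from the previous word."""
    steps = [ord(c) - ord("A") + 1 for c in indicator.upper()]
    return list(accumulate(steps))


def transposition_key(keyword, reverse_duplicates=False):
    """Number letters alphabetically; optionally number duplicates right to left."""
    letters = keyword.upper()
    direction = -1 if reverse_duplicates else 1
    # Sort by letter first, then by position (ascending or descending).
    order = sorted(range(len(letters)), key=lambda i: (letters[i], direction * i))
    key = [0] * len(letters)
    for number, position in enumerate(order, start=1):
        key[position] = number
    return key


print(cumulative_positions("ABC"))  # [1, 3, 6]
print(transposition_key("BAA"))  # [3, 1, 2]
print(transposition_key("BAA", reverse_duplicates=True))  # [3, 2, 1]
```

Either tweak costs the agent almost nothing, but London would have to know which variant each agent was using.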
Really, there are countless ways to create the key. From what I’ve read, I think the method presented here is the simplest and most straightforward, but that doesn’t mean it’s historically accurate.
A slightly different approach to key generation is described here and also here. This differs a bit from details in Marks' book, but incorporates extra steps to improve security, including agent-specific offsets and signals to indicate duress. For example, using the poem “Mary Had A Little Lamb”:
Let’s assume that the letters chosen are PQRSTU, the odd letters furnish the first ‘key’ and the even letters the second. In our example PRT points to ‘WENT LAMB SURE’ as the first ‘key’. For the second we use QSU so it’s ‘THE WAS TO’. The indicator showing which words were used as ‘keys’ will be PRT filled with two nulls so as to form a 5-letter group (all messages were sent in 5-letter groups), so let’s say PARNT and the final step is to move all the letters forward by using the agents’ secret number. For instance if the number was 45711 then in our example PARNT will change into TFYOU, as each letter moves forward as many positions as indicated by the secret number P+4=T, A+5=F, R+7=Y, N+1=O, T+1=U.
Note that Marks doesn’t say anything about six consecutive letters. On the contrary in his book page 324 he says ‘every poem code message began with a five letter indicator group to show which five words of the poem had been used’.
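The secret-number step in that description is just a per-letter shift, which can be sketched in Python as follows. The function name is hypothetical, and wrapping around from Z back to A is my assumption, since the quoted example never runs past the end of the alphabet.

```python
import string

ALPHABET = string.ascii_uppercase


def shift_indicator(indicator, secret_number, direction=1):
    """Move each indicator letter forward (or back) by the matching secret digit."""
    digits = str(secret_number)  # pass a string if the number has a leading zero
    if len(digits) != len(indicator):
        raise ValueError("need exactly one digit per indicator letter")
    shifted = []
    for letter, digit in zip(indicator.upper(), digits):
        # Wrap-around past Z is an assumption, not something the source states.
        index = (ALPHABET.index(letter) + direction * int(digit)) % 26
        shifted.append(ALPHABET[index])
    return "".join(shifted)


print(shift_indicator("PARNT", 45711))  # TFYOU
print(shift_indicator("TFYOU", 45711, direction=-1))  # PARNT
```

The receiving end would run the same shift backwards before looking the words up in the poem.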
Ultimately, as long as there’s a repeatable, deterministic method for creating a reasonably random transposition key, and an easy way to transmit the parameters needed to regenerate that key, it doesn’t matter much which method you use. In fact, later in the war they dropped poems altogether and used one-time, pre-shared transposition keys that the agent would tear off a silk sheet. They also went all the way to one-time pads in some situations, also printed on silk.
Conclusion
Though there are countless ways that the SOE agents could have derived their encryption keys, I’m inclined to think that the simpler method was what was used. And with luck, I’ve made the mechanics clear here. Even if I’m not close enough that you could actually decode real SOE intercepts, hopefully you’ve got a good idea of the complexity of the system and some of the challenges the agents faced. I certainly recommend reading the book. Even if it sometimes becomes a bit sensational, even incredible, I’m certain that it’s as close to the truth as the general public will ever know.