Last weekend, the Associated Press published a story about a Confederate Army message that was recently decrypted. It had been written on a small sheet of paper, rolled up tightly and placed in a glass vial with a bullet (probably so it could be sunk into a river in the event of imminent capture). The vial sat in The Museum of The Confederacy for years, until it was unrolled early in 2009. The article didn’t say when the message was decoded – presumably it sat untouched for a while and they only just sent it out to the experts (one at the CIA, one at the Navy).

I’d been celebrating Christmas, so I didn’t see the story until G. Mark Hardy emailed it and challenged me to “extract the key.” The first thing I had to do was get the ciphertext, which, naturally, wasn’t included in the story. A little digging got me some low-resolution photos, and I could get most of the ciphertext out of those, but it wasn’t great. Also, it was hard to avoid seeing the plaintext (which was in all the articles I found).

However, I think I can demonstrate breaking this code without any knowledge of the plaintext. Also, keep in mind that knowing more about the context of the message (who sent it, who it was sent to, the words and phrases frequently used in such messages, etc.) would have provided an actual wartime cryptanalyst a lot more leverage than I had.

After a couple days spent ignoring the challenge, I mentioned the story to my brother. He’s also a bit of a computer geek (but more into web technology and other such things), and is also a history buff. He actually once discovered a hitherto-unknown example of Lincoln’s signature while working at the National Archives. So I figured he’d enjoy this story, and within 5 minutes, he located a high-resolution copy of the ciphertext. So now that I could actually distinguish letters from inkblots, I set to work.

If you’d like to try to solve this yourself, then STOP now, as the rest of this post is full of spoilers. If you’d like a copy of just the ciphertext (as written, plus a "cleaned up" copy, and one with no word breaks for a different challenge), click here.

The first thing I noticed was that the writer of the message preserved word breaks. That seems, to me, a huge mistake, as now I can use those breaks to help guide my attack. For example, near the end of the message, I see four singleton letters – in plain English, those would either be “I” or “A”, though in something like this there’s always the chance they’re abbreviations, initials, cardinal directions, etc. But I’d bet at least one of them is “I.”

The whole ciphertext. It’s hard to read.

Also, I notice that 3 of the 4 singletons are encrypted with different ciphertext, which makes me think that this is a polyalphabetic cipher. The Vigenère cipher was used frequently in the Civil War, so I’ll start with that. I first have to figure out what the key length is.

In the first line is a four-letter word that’s repeated. That could mean the same four-letter word is repeated in the plaintext and we have a 4-character key (possible, but unlikely), or a key with a 4-letter repeat (even more unlikely), or an astounding coincidence (with appropriate likelihood), or an error in transcription, in which case the word shouldn’t have been repeated at all (I’ll go with that for now).

A closer look at the text, the handwriting, the inkblots, etc. Note the erroneously repeated word block on the 1st line.

Dropping the extra word, I now have a ciphertext of 221 characters. The letters “SEA” appear at the very beginning of the message, and again 210 characters later. This might be a hint as to the key length: 210 is probably not the key length itself, but a multiple of it. So 3, 7, 10, 15, 21, 30... are all possible key lengths. Also, the singleton M letters are 30 characters apart, so I’ll assume for now that the key is 30 characters long.
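
The key-length reasoning above can be sketched in a few lines of Python. This is only an illustration, not the interactive tool used for the actual break; the helper names (repeated_ngrams, singleton_offsets, divisors) are made up for this example, and the ciphertext is the transcription shown in this post with the repeated word already dropped.

```python
from collections import defaultdict

# Ciphertext as transcribed in this post (repeated word on line 1 dropped).
LINES = [
    "SEAN WIEUIIUZH DTG CNP LBNXGK OZ BJQB FEQT XZBW JJOA TK FHR TP",
    "ZWK PBW RYSQ VOWPZXQQ OEPH EK WASFKIPW PLVO JKZ HMN NVAEUD XY",
    "F DWRJ BOYPA SF MLV FYYRDE LVPL MFYSIU XY FQEO NPK M OBPC FYXJF",
    "HOHT AS ETOV B OCAJOSVQU M ZTZV TPIY DAW FQTI WTTJ J DQGOAIA FL",
    "WHTXTI QMTR SEA LVLFLXFO",
]
WORDS = " ".join(LINES).split()
STREAM = "".join(WORDS)  # letters only, word breaks removed


def repeated_ngrams(stream, n=3):
    """Map each n-gram that occurs more than once to its start offsets."""
    seen = defaultdict(list)
    for i in range(len(stream) - n + 1):
        seen[stream[i:i + n]].append(i)
    return {gram: offsets for gram, offsets in seen.items() if len(offsets) > 1}


def singleton_offsets(words):
    """Map each one-letter word to the stream offsets where it appears."""
    found = defaultdict(list)
    offset = 0
    for word in words:
        if len(word) == 1:
            found[word].append(offset)
        offset += len(word)
    return dict(found)


def divisors(n):
    """Divisors of n greater than 1, i.e. the candidate key lengths."""
    return [d for d in range(2, n + 1) if n % d == 0]


sea = repeated_ngrams(STREAM)["SEA"]
m_offsets = singleton_offsets(WORDS)["M"]
print("SEA at", sea, "-> candidate key lengths", divisors(sea[1] - sea[0]))
print("singleton M at", m_offsets, "-> gap", m_offsets[1] - m_offsets[0])
```

A repeat distance only suggests a multiple of the key length, so the divisor list is a set of candidates to try, not an answer.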

The first thing I’ll do is work on my assumption that the singletons are all the letter I. Changing the last one (J) to I means the key letter for that position will be “B.” I’ll repeat that key letter backwards and forwards, at 30-character intervals, and decode the plaintext appropriately. Interestingly, one of the other singletons fell on an interval, and now it’s decoded to A. I’m pretty confident now that I’ve at least got that key letter correct. Trying the other ones (the Ms) the same way gives a key letter of “E,” and here’s what I have (the first row is the ciphertext, the next is the key stream, and the last is the plaintext):

SEAN WIEUIIUZH DTG CNP LBNXGK OZ BJQB FEQT XZBW JJOA TK FHR TP
           b            e                        b
           T            X                        I            

ZWK PBW RYSQ VOWPZXQQ OEPH EK WASFKIPW PLVO JKZ HMN NVAEUD XY
e                       b           e
V                       O           L                        

F DWRJ BOYPA SF MLV FYYRDE LVPL MFYSIU XY FQEO NPK M OBPC FYXJF
b            e                       b             e
E            O                       T             I           

HOHT AS ETOV B OCAJOSVQU M ZTZV TPIY DAW FQTI WTTJ J DQGOAIA FL
             b           e                         b
             A           I                         I           

WHTXTI QMTR SEA LVLFLXFO
e                      b
S                      N
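
The arithmetic behind that table is small enough to sketch in Python. In a Vigenère cipher the key letter is the ciphertext letter minus the plaintext letter, modulo 26. The function names here (key_letter, apply_partial_key) are hypothetical, and the 30-letter period is the working assumption from above, not an established fact.

```python
A = ord("A")

# Ciphertext as transcribed in this post (repeated word on line 1 dropped).
LINES = [
    "SEAN WIEUIIUZH DTG CNP LBNXGK OZ BJQB FEQT XZBW JJOA TK FHR TP",
    "ZWK PBW RYSQ VOWPZXQQ OEPH EK WASFKIPW PLVO JKZ HMN NVAEUD XY",
    "F DWRJ BOYPA SF MLV FYYRDE LVPL MFYSIU XY FQEO NPK M OBPC FYXJF",
    "HOHT AS ETOV B OCAJOSVQU M ZTZV TPIY DAW FQTI WTTJ J DQGOAIA FL",
    "WHTXTI QMTR SEA LVLFLXFO",
]
STREAM = "".join(LINES).replace(" ", "")


def key_letter(cipher, plain):
    """Key letter implied by a cipher letter and a guessed plaintext letter."""
    return chr((ord(cipher) - ord(plain)) % 26 + A)


def apply_partial_key(stream, known, period):
    """Decrypt only the offsets whose key letter is known; '.' elsewhere."""
    out = []
    for i, c in enumerate(stream):
        k = known.get(i % period)
        out.append(chr((ord(c) - ord(k)) % 26 + A) if k else ".")
    return "".join(out)


PERIOD = 30  # assumed key length at this stage of the attack
known = {
    190 % PERIOD: key_letter("J", "I"),  # singleton J at stream offset 190
    140 % PERIOD: key_letter("M", "I"),  # singleton M at stream offset 140
}
partial = apply_partial_key(STREAM, known, PERIOD)
print(known)
print(partial)
```

Every guess is cheap to test this way: one assumed plaintext letter fixes one key letter, which then decodes every position a whole period away.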

Not a lot to go with, but there’s a two-letter word in the 3rd line that’s half decrypted. Not too many two-letter words start with O, but the likely candidates are OF, ON, and OR. Let’s try each. First, OF:

SEAN WIEUIIUZH DTG CNP LBNXGK OZ BJQB FEQT XZBW JJOA TK FHR TP
           b            ea                       b
           T            XN                       I            

ZWK PBW RYSQ VOWPZXQQ OEPH EK WASFKIPW PLVO JKZ HMN NVAEUD XY
ea                      b           ea
VW                      O           LW                       

F DWRJ BOYPA SF MLV FYYRDE LVPL MFYSIU XY FQEO NPK M OBPC FYXJF
b            ea                      b             e a
E            OF                      T             I O         

HOHT AS ETOV B OCAJOSVQU M ZTZV TPIY DAW FQTI WTTJ J DQGOAIA FL
             b           e a                       b
             A           I Z                       I           

WHTXTI QMTR SEA LVLFLXFO
ea                     b
SH                     N

Hm. That gives me XN, VW, and LW digraphs, and a word starting with Z. Not entirely impossible, but seems harder to work with. Let’s try ON next:

SEAN WIEUIIUZH DTG CNP LBNXGK OZ BJQB FEQT XZBW JJOA TK FHR TP
           b            es                       b
           T            XV                       I            

ZWK PBW RYSQ VOWPZXQQ OEPH EK WASFKIPW PLVO JKZ HMN NVAEUD XY
es                      b           es
VE                      O           LE                       

F DWRJ BOYPA SF MLV FYYRDE LVPL MFYSIU XY FQEO NPK M OBPC FYXJF
b            es                      b             e s
E            ON                      T             I W         

HOHT AS ETOV B OCAJOSVQU M ZTZV TPIY DAW FQTI WTTJ J DQGOAIA FL
             b           e s                       b
             A           I H                       I           

WHTXTI QMTR SEA LVLFLXFO
es                     b
SP                     N

That looks much better. There’s still one pair that looks troublesome (XV in the first line), but transcription errors are not uncommon for a coded message written in the field, and one bad digraph is much better than three. So I’ll let it stand for now. But there’s still not much else to go on, as very few letters have been decoded at this point. Since I’ve got nothing else to work with, let’s try shortening the key. Trying a key length of 15 (half 30, but still fitting the intervals I’m working with), I get:

SEAN WIEUIIUZH DTG CNP LBNXGK OZ BJQB FEQT XZBW JJOA TK FHR TP
      es   b            es    b            es    b
      EM   T            XV    N            TH    I            

ZWK PBW RYSQ VOWPZXQQ OEPH EK WASFKIPW PLVO JKZ HMN NVAEUD XY
es    b           es    b           es    b            es
VE    V           TY    O           LE    N            AC    

F DWRJ BOYPA SF MLV FYYRDE LVPL MFYSIU XY FQEO NPK M OBPC FYXJF
b            es     b           es   b             e s    b
E            ON     E           IN   T             I W    E    

HOHT AS ETOV B OCAJOSVQU M ZTZV TPIY DAW FQTI WTTJ J DQGOAIA FL
      e s    b           e s    b           e s    b
      O M    A           I H    S           E E    I           

WHTXTI QMTR SEA LVLFLXFO
es   b            es   b
SP   H            HN   N

Several more plausible letters now pop out. In the second-to-last line is another two-letter word, this time ending with O. Best guess: it’s TO.

SEAN WIEUIIUZH DTG CNP LBNXGK OZ BJQB FEQT XZBW JJOA TK FHR TP
     hes   b           hes    b          h es    b           h
     PEM   T           EXV    N          M TH    I           I

ZWK PBW RYSQ VOWPZXQQ OEPH EK WASFKIPW PLVO JKZ HMN NVAEUD XY
es    b          hes    b          hes    b           hes
VE    V          STY    O          BLE    N           TAC    

F DWRJ BOYPA SF MLV FYYRDE LVPL MFYSIU XY FQEO NPK M OBPC FYXJF
b          h es     b         h es   b           h e s    b
E          T ON     E         E IN   T           D I W    E    

HOHT AS ETOV B OCAJOSVQU M ZTZV TPIY DAW FQTI WTTJ J DQGOAIA FL
     he s    b         h e s    b          he s    b          h
     TO M    A         N I H    S          ME E    I          E

WHTXTI QMTR SEA LVLFLXFO
es   b           hes   b
SP   H           OHN   N

If I knew the key players in the war, this would be all over now, as a General’s name is now popping out. But I don’t know that, so I have to keep working. That last change didn’t make anything terribly messy, so let’s keep trying. In the first line is a four-letter word starting with TH. Good candidates include THAN, THEM, THIS, THEY, and others. For brevity, let’s just look at one wrong answer (THEM):

SEAN WIEUIIUZH DTG CNP LBNXGK OZ BJQB FEQT XZBW JJOA TK FHR TP
     hesxk b           hesxk  b          h esxk  b           h
     PEMXY T           EXVAW  N          M THEM  I           I

ZWK PBW RYSQ VOWPZXQQ OEPH EK WASFKIPW PLVO JKZ HMN NVAEUD XY
esx k b          hesx k b          hes xk b           hesx k
VEN F V          STYT E O          BLE SB N           TACG N 

F DWRJ BOYPA SF MLV FYYRDE LVPL MFYSIU XY FQEO NPK M OBPC FYXJF
b          h es xk  b         h esxk b           h e sxk  b
E          T ON PB  E         E INBI T           D I WEF  E    

HOHT AS ETOV B OCAJOSVQU M ZTZV TPIY DAW FQTI WTTJ J DQGOAIA FL
     he sxk  b         h e sxk  b          he sxk  b          h
     TO MWE  A         N I HWP  S          ME EWJ  I          E

WHTXTI QMTR SEA LVLFLXFO
esxk b           hesxk b
SPWN H           OHNON N

Looks worse. Now, let’s try THIS:

SEAN WIEUIIUZH DTG CNP LBNXGK OZ BJQB FEQT XZBW JJOA TK FHR TP
     heste b           heste  b          h este  b           h
     PEMBE T           EXVEC  N          M THIS  I           I

ZWK PBW RYSQ VOWPZXQQ OEPH EK WASFKIPW PLVO JKZ HMN NVAEUD XY
est e b          hest e b          hes te b           hest e
VER L V          STYX K O          BLE WH N           TACK T 

F DWRJ BOYPA SF MLV FYYRDE LVPL MFYSIU XY FQEO NPK M OBPC FYXJF
b          h es te  b         h este b           h e ste  b
E          T ON TH  E         E INFO T           D I WIL  E    

HOHT AS ETOV B OCAJOSVQU M ZTZV TPIY DAW FQTI WTTJ J DQGOAIA FL
     he ste  b         h e ste  b          he ste  b          h
     TO MAK  A         N I HAV  S          ME EAP  I          E

WHTXTI QMTR SEA LVLFLXFO
este b           heste b
SPAT H           OHNST N

Much better. I bet that’s “RIVER” straddling the first and second lines (LIVER just doesn’t seem likely), “TACK” could be part of “ATTACK,” “TO MAK? A” is probably “TO MAKE A”, etc. I’ll try a few of those (and, in fact, fixing RIVER changed TACK to TTACK, which just strengthens my guess):

SEAN WIEUIIUZH DTG CNP LBNXGK OZ BJQB FEQT XZBW JJOA TK FHR TP
  nc hesterb        nc hester b        nch este rb        n ch
  NL PEMBERT        AN EXVECT N        ROM THIS SI        E RI

ZWK PBW RYSQ VOWPZXQQ OEPH EK WASFKIPW PLVO JKZ HMN NVAEUD XY
est erb        nchest erb        nches terb         nchest er
VER LKV        JNSTYX KNO        SIBLE WHEN         ATTACK TH

F DWRJ BOYPA SF MLV FYYRDE LVPL MFYSIU XY FQEO NPK M OBPC FYXJF
b        nch es ter b       nch esterb         nch e ster b
E        LNT ON THE E       INE INFORT         AND I WILL E    

HOHT AS ETOV B OCAJOSVQU M ZTZV TPIY DAW FQTI WTTJ J DQGOAIA FL
  nc he ster b       nch e ster b        nche ster b       n ch
  UR TO MAKE A       ION I HAVE S        SOME EAPS I       N DE

WHTXTI QMTR SEA LVLFLXFO
esterb        n chesterb
SPATCH        N JOHNSTON

Wow. Now it’s just filling in the blanks. And the key is pretty clear, too, or at least would be if I knew much about Civil War history, which I don’t. But it looks like my assumption about the XV being an error is borne out – looks like it’s supposed to be EXPECT. I’ll change the ?AN before EXPECT to CAN, N? to NO, ?ROM to FROM, ???SIBLE to POSSIBLE, and see what happens:

SEAN WIEUIIUZH DTG CNP LBNXGK OZ BJQB FEQT XZBW JJOA TK FHR TP
manc hesterbl   hm anc hester bl   hm anch este rbl   h man ch
GENL PEMBERTO   MU CAN EXVECT NO   JP FROM THIS SID   D THE RI

ZWK PBW RYSQ VOWPZXQQ OEPH EK WASFKIPW PLVO JKZ HMN NVAEUD XY
est erb l  h manchest erbl    hmanches terb l   hma nchest er
VER LKV G  J JOJNSTYX KNOW    POSSIBLE WHEN Y   AAN ATTACK TH

F DWRJ BOYPA SF MLV FYYRDE LVPL MFYSIU XY FQEO NPK M OBPC FYXJF
b l  h manch es ter bl  hm anch esterb l   hma nch e ster bl  h
E S  C POLNT ON THE EN  WS LINE INFORT M   JSO AND I WILL EN  Y

HOHT AS ETOV B OCAJOSVQU M ZTZV TPIY DAW FQTI WTTJ J DQGOAIA FL
manc he ster b l  hmanch e ster bl   hma nche ster b l  hman ch
VOUR TO MAKE A D  CCSION I HAVE SE   WOW SOME EAPS I S  HOIN DE

WHTXTI QMTR SEA LVLFLXFO
esterb l  h man chesterb
SPATCH F  K GEN JOHNSTON

Pretty much legible now, though there are several obvious errors. The only thing that I can work with is the word straddling lines 3 and 4. Might it be ENDEAVOUR?

SEAN WIEUIIUZH DTG CNP LBNXGK OZ BJQB FEQT XZBW JJOA TK FHR TP
manc hesterblu fhm anc hester bl ufhm anch este rblu fh man ch
GENL PEMBERTON YMU CAN EXVECT NO HEJP FROM THIS SIDG OD THE RI

ZWK PBW RYSQ VOWPZXQQ OEPH EK WASFKIPW PLVO JKZ HMN NVAEUD XY
est erb lufh manchest erbl uf hmanches terb luf hma nchest er
VER LKV GENJ JOJNSTYX KNOW KF POSSIBLE WHEN YQU AAN ATTACK TH

F DWRJ BOYPA SF MLV FYYRDE LVPL MFYSIU XY FQEO NPK M OBPC FYXJF
b lufh manch es ter blufhm anch esterb lu fhma nch e ster blufh
E SCMC POLNT ON THE ENEMWS LINE INFORT ME AJSO AND I WILL ENDEY

HOHT AS ETOV B OCAJOSVQU M ZTZV TPIY DAW FQTI WTTJ J DQGOAIA FL
manc he ster b lufhmanch e ster bluf hma nche ster b lufhman ch
VOUR TO MAKE A DIVCCSION I HAVE SEOT WOW SOME EAPS I SWBHOIN DE

WHTXTI QMTR SEA LVLFLXFO
esterb lufh man chesterb
SPATCH FSOK GEN JOHNSTON

That filled in all the rest, but, again, there are lots of errors. YMU, HEJP, SIDG, OD, all in the first line. Three of those have errors under the same key letter, and that key position continues to look wrong through the rest of the message. Looking at the key, I can guess what it’s supposed to have been. Changing it from “MANCHESTER BLUFH” to “MANCHESTER BLUFF”, I now have:

SEAN WIEUIIUZH DTG CNP LBNXGK OZ BJQB FEQT XZBW JJOA TK FHR TP
manc hesterblu ffm anc hester bl uffm anch este rblu ff man ch
GENL PEMBERTON YOU CAN EXVECT NO HELP FROM THIS SIDG OF THE RI

ZWK PBW RYSQ VOWPZXQQ OEPH EK WASFKIPW PLVO JKZ HMN NVAEUD XY
est erb luff manchest erbl uf fmanches terb luf fma nchest er
VER LKV GENL JOJNSTYX KNOW KF ROSSIBLE WHEN YQU CAN ATTACK TH

F DWRJ BOYPA SF MLV FYYRDE LVPL MFYSIU XY FQEO NPK M OBPC FYXJF
b luff manch es ter bluffm anch esterb lu ffma nch e ster bluff
E SCME POLNT ON THE ENEMYS LINE INFORT ME ALSO AND I WILL ENDEA

HOHT AS ETOV B OCAJOSVQU M ZTZV TPIY DAW FQTI WTTJ J DQGOAIA FL
manc he ster b luffmanch e ster bluf fma nche ster b luffman ch
VOUR TO MAKE A DIVECSION I HAVE SEOT YOW SOME EAPS I SWBJOIN DE

WHTXTI QMTR SEA LVLFLXFO
esterb luff man chesterb
SPATCH FSOM GEN JOHNSTON

And that’s about it. Of the remaining errors, several seem to be confusing U with W, which might even be a consequence of the transcriber’s writing style. Others are simple off-by-one errors in encoding. If I completely clean it up, here’s what we get:

SEAN WIEUIIUZH DTG CNP LBHXGK OZ BJQB FEQT XZBW JJOY TK FHR TP
manc hesterblu ffm anc hester bl uffm anch este rblu ff man ch
GENL PEMBERTON YOU CAN EXPECT NO HELP FROM THIS SIDE OF THE RI

ZWK PVU RYSQ VOUPZXGG OEPH CK UASFKIPW PLVO JIZ HMN NVAEUD XY
est erb luff manchest erbl uf fmanches terb luf fma nchest er
VER LET GENL JOHNSTON KNOW IF POSSIBLE WHEN YOU CAN ATTACK TH

F DURJ BOVPA SF MLV FYYRDE LVPL MFYSIN XY FQEO NPK M OBPC FYXJF
b luff manch es ter bluffm anch esterb lu ffma nch e ster bluff
E SAME POINT ON THE ENEMYS LINE INFORM ME ALSO AND I WILL ENDEA

HOHT AS ETOV B OCAJDSVQU M ZTZV TPHY DAU FQTI UTTJ J DOGOAIA FL
manc he ster b luffmanch e ster bluf fma nche ster b luffman ch
VOUR TO MAKE A DIVERSION I HAVE SENT YOU SOME CAPS I SUBJOIN DE

WHTXTI QLTR SEA LVLFLXFO
esterb luff man chesterb
SPATCH FROM GEN JOHNSTON

Or, looking just at the plaintext:

GENL PEMBERTON YOU CAN EXPECT NO HELP FROM THIS SIDE
OF THE RIVER LET GENL JOHNSTON KNOW IF POSSIBLE WHEN
YOU CAN ATTACK THE SAME POINT ON THE ENEMYS LINE
INFORM ME ALSO AND I WILL ENDEAVOUR TO MAKE A DIVERSION
I HAVE SENT YOU SOME CAPS I SUBJOIN DESPATCH FROM
GEN JOHNSTON
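
For anyone who wants to reproduce the final result, here is a minimal Python sketch of the decryption step. It assumes the cleaned-up ciphertext and the key “MANCHESTER BLUFF” from the tables above; vigenere_decrypt is a name chosen for this example. Spaces and line breaks pass through without consuming a key letter, which is how the key stream lines up under the words in the tables.

```python
from itertools import cycle

# Cleaned-up ciphertext rows from the final table above.
CLEANED = [
    "SEAN WIEUIIUZH DTG CNP LBHXGK OZ BJQB FEQT XZBW JJOY TK FHR TP",
    "ZWK PVU RYSQ VOUPZXGG OEPH CK UASFKIPW PLVO JIZ HMN NVAEUD XY",
    "F DURJ BOVPA SF MLV FYYRDE LVPL MFYSIN XY FQEO NPK M OBPC FYXJF",
    "HOHT AS ETOV B OCAJDSVQU M ZTZV TPHY DAU FQTI UTTJ J DOGOAIA FL",
    "WHTXTI QLTR SEA LVLFLXFO",
]


def vigenere_decrypt(text, key):
    """Decrypt uppercase letters with a repeating key.

    Spaces and newlines are copied through and do not use up a key letter.
    """
    shifts = cycle(ord(k) - ord("A") for k in key.upper() if k.isalpha())
    out = []
    for ch in text:
        if ch.isalpha():
            out.append(chr((ord(ch) - ord("A") - next(shifts)) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


plaintext_rows = vigenere_decrypt("\n".join(CLEANED), "MANCHESTER BLUFF").split("\n")
for row in plaintext_rows:
    print(row)
```

The rows come out split mid-word exactly as in the table, because the original sheet broke words across lines.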

Total time to break the message (using, obviously, modern tools): negligible (it took me longer to write the interactive tool I used than to actually break the code). Could a professional cryptanalyst have cracked this by hand, 147 years ago? Almost certainly.

As I said before, knowing more about the context of the message would definitely have provided quite a bit more leverage. Knowing the names of key generals would have helped with three of the longer words in the message. Knowing who usually received messages important enough to be encoded might’ve helped, too (if that would lead one to guess that the message was more likely to open with “GENL:” as opposed to “DEAR SIR:”). And, certainly, knowing that you’d cracked dozens of previous messages with the key “MANCHESTER BLUFF” would have meant this would have been broken just minutes after receipt. Three very strong strikes against the message right there.

But even without any of that knowledge, I was able to break it, and I’m just a beginner at this. I really think it was the word breaks that did it for me. If those hadn’t been there, there’d have been nothing I could do – nowhere to start, and almost all of my analysis (like the two- and four-letter word guesses) wouldn’t have been possible. I suppose I would have looked for a history of similar messages, to see what the message might have started with, and gone from there. What would that have gained me?

SEANW IEUII UZHDT GCNPL BHXGK OZBJQ BFEQT XZBWJ JOYTK FHRTP
manc              manc              manc              manc
GENL              UCAN              PFRO              THER 

ZWKPV URYSQ VOUPZ XGGOE PHCKU ASFKI PWPLV OJIZH MNNVA EUDXY
            manc              manc              manc
            JOHN              OSSI              ANAT       

FDURJ BOVPA SFMLV FYYRD ELVPL MFYSI NXYFQ EONPK MOBPC FYXJF
      manc              manc              manc
      POIN              SLIN              SOAN             

HOHTA SETOV BOCAJ DSVQU MZTZV TPHYD AUFQT IUTTJ JDOGO AIAFL
manc              manc              manc              manc
VOUR              RSIO              OUSO              OIND 

WHTXT IQLTR SEALV LFLXF O
            manc
            GENJ
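
That crib step can be sketched in Python as well. The assumptions are loud ones: the guessed opening word “GENL,” and a key period of 15 carried over from the earlier analysis. The helper names (key_fragment, decrypt_windows) are invented for this example, and only the first row of the cleaned-up ciphertext is used.

```python
A = ord("A")

# First row of the cleaned-up ciphertext, word breaks removed.
ROW1 = "SEANW IEUII UZHDT GCNPL BHXGK OZBJQ BFEQT XZBWJ JOYTK FHRTP".replace(" ", "")


def key_fragment(stream, crib, offset=0):
    """Key letters implied by assuming `crib` is the plaintext at `offset`."""
    segment = stream[offset:offset + len(crib)]
    return "".join(chr((ord(c) - ord(p)) % 26 + A) for c, p in zip(segment, crib))


def decrypt_windows(stream, fragment, period, offset=0):
    """Decrypt each stretch that `fragment` covers if the key repeats every `period`."""
    windows = []
    for start in range(offset, len(stream) - len(fragment) + 1, period):
        chunk = stream[start:start + len(fragment)]
        windows.append(
            "".join(chr((ord(c) - ord(k)) % 26 + A) for c, k in zip(chunk, fragment))
        )
    return windows


fragment = key_fragment(ROW1, "GENL")       # crib assumed at the very start
print(fragment)                             # MANC
print(decrypt_windows(ROW1, fragment, 15))  # ['GENL', 'UCAN', 'PFRO', 'THER']
```

If the crib or the period were wrong, the other windows would come out as noise, which is what makes a good crib so valuable.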

Adding the T in POINT makes OSSI into OSSIB, which isn’t too hard to read as POSSIBLE:

SEANW IEUII UZHDT GCNPL BHXGK OZBJQ BFEQT XZBWJ JOYTK FHRTP
manch es        f manch es        f manch es        f manch
GENLP EM        O UCANE XP        L PFROM TH        F THERI

ZWKPV URYSQ VOUPZ XGGOE PHCKU ASFKI PWPLV OJIZH MNNVA EUDXY
es        f manch es        f manch es        f manch es
VE        L JOHNS TO        P OSSIB LE        C ANATT AC   

FDURJ BOVPA SFMLV FYYRD ELVPL MFYSI NXYFQ EONPK MOBPC FYXJF
    f manch es        f manch es        f manch es        f
    E POINT ON        Y SLINE IN        L SOAND IW        A

HOHTA SETOV BOCAJ DSVQU MZTZV TPHYD AUFQT IUTTJ JDOGO AIAFL
manch es        f manch es        f manch es        f manch
VOURT OM        E RSION IH        Y OUSOM EC        J OINDE

WHTXT IQLTR SEALV LFLXF O
es        f manch es
SP        M GENJO HN

And now it’s all over. Finish out ATTACK, make a couple of other educated guesses, and the message is complete. So even without word breaks, it’s possible, but it’s only easy if you’ve got a good crib (the “GENL” at the beginning). Although I probably wouldn’t have been able to do it, honestly (just based on my own experience with this cipher type).

The ease with which I broke this makes me wonder if any of the Confederacy’s coded messages were safe from the North. Especially considering they used the same key over and over again. What would have helped them? I can think of three important rules right off the bat (all of which apply even today):

  • Don’t provide any context to the attacker. Remove all word breaks and present the message as short blocks of text.
  • Don’t reward the attacker for good guesses. Ensure the message doesn’t start with a predictable word.
  • Don’t use the same key day after day after day.

How could they have accomplished that last recommendation? When G. Mark first challenged me to “extract the key,” I predictably jumped to an overly complex solution. Getting “the key” is simple, if you know the plaintext (which is in the articles) and the ciphertext (which is in the pictures). So perhaps the key for this message is just a secondary key, and there’s a larger master key I need to recover, and that’s what G. Mark was asking for?

Obviously, that’s not the case here, but it did make me think about how you could at least change the key daily. Take a long phrase, say for example, the Confederate motto “With God our Vindicator.” Encrypt that phrase with the date of the message (“JULY FOUR”), and you get “FCEF LCX FDL GGSRCTJNZP”. Use that as the key for the message. If you change up the secondary key (maybe on odd days it’s “month day” and on even days it’s “day month”) and change the phrase periodically (every 6 months or so), then you’ve got a pretty good key schedule, for its time, at least. And every bit of it is easily memorized and applied, even in the field, so there are no codebooks to get lost.
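
As a rough Python sketch of that key schedule: the long phrase is Vigenère-encrypted under the date, and the result becomes the key for the day’s message. The function names (vigenere_encrypt, daily_key) are hypothetical, and the scheme is the thought experiment described above, not anything the Confederacy is known to have used.

```python
from itertools import cycle


def vigenere_encrypt(text, key):
    """Encrypt letters with a repeating key; non-letters pass through unchanged."""
    shifts = cycle(ord(k) - ord("A") for k in key.upper() if k.isalpha())
    out = []
    for ch in text.upper():
        if ch.isalpha():
            out.append(chr((ord(ch) - ord("A") + next(shifts)) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


def daily_key(phrase, date_words):
    """Derive the day's message key by encrypting the phrase under the date."""
    return vigenere_encrypt(phrase, date_words)


message_key = daily_key("WITH GOD OUR VINDICATOR", "JULY FOUR")
print(message_key)  # FCEF LCX FDL GGSRCTJNZP
```

Anyone who recovers one day’s key still has to undo the date layer to learn the phrase, but with a known date that is just another Vigenère decryption, so the phrase would still need rotating.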

On the other hand, I don’t know what the codebreaking skills of either side were like in the Civil War – it’s possible that nobody even gave these codes a second glance, and even simple ROT-13 messages would have been secure. But somehow, I doubt that. I guess it’s time to break out my copy of The Codebreakers and refresh my knowledge of crypto history….