His pattern is something that has no known pattern
On and off I have looked at this cipher and tried to crack it. Certainly not in any obsessive way, but more as a recreational exercise in genetic algorithms. But here is where the power of a good genetic algorithm is useless, since every morning I would awake and look at large portions of the cipher solved with great excitement, only to find false leads. One morning I awoke to a long, well-structured sentence in the middle of the cipher explaining how the "teens" had been killed on a farm. In a couple of other places the words "pig" and "horse" were displayed. The program had no knowledge of what it was trying to decipher, and in my mind it was solved. I skipped work and tried feverishly to solve the rest, but of course it was gibberish.
Last week I was bored and looked at it again, only this time I thought the best approach was to look for a pattern to verify it is even real before coding to find a solution. This man was a very mentally ill (note how politically correct I was there) person who loved taunting authority, so there was always a possibility in my mind that the entire cipher could just be characters he wrote randomly, hoping to tie up police resources cracking a cipher that had no meaning (hell, I would have done something like that). So last Sunday I decided that until I could find some small pattern indicating it was real, I would hold off on trying to decipher it. I began examining the most baffling character in the cipher:
The "+" sign occurs 24 times in the cipher, and in normal English this somewhat limits it to being one of the letters E, T, A, O, I, N, S, H, or R. It also appears twice in a row 3 times throughout the cipher, further limiting what it could be (repeats of, say, HH or II in English are very rare). In fact, the only letters that seem to fit the letter frequency and account for the doubles are E and S, but these have been tried for years and no one has had any luck. I figured examining the "+" character would be a good start. I started by recording the positions by number where it occurred...
20
40
64
65
72
81
... and it was about here I noticed something odd. I had gone through 81 characters of the cipher and yet not one of the first 6 had landed on a prime number. Hmmm, noteworthy but nothing more at this point, then I continued...
105
128
133
140
142
159
172
... I am over halfway through the cipher and still not one "+" character has been placed on a prime number?? There are 68 prime numbers up to 340, with a higher percentage of them lying in the first half of the cipher, yet not one "+" character has coincidentally landed on one. Now keep in mind the structure of the English language has no known ties to prime numbers (nor does any language I know of). Cutting to the chase, the "+" symbol only lands on one prime number, 211, throughout the entire cipher.
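The prime count and the positions listed so far are easy to double-check. Here is a minimal Python sketch (the helper name `primes_up_to` is only illustrative) that sieves the primes up to 340 and tests the thirteen "+" positions recorded above:

```python
def primes_up_to(n):
    """Return the set of primes <= n using a sieve of Eratosthenes."""
    flags = [True] * (n + 1)
    flags[0] = flags[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if flags[i]:
            # Cross out every multiple of i, starting at i*i.
            for j in range(i * i, n + 1, i):
                flags[j] = False
    return {i for i, flag in enumerate(flags) if flag}

PRIMES = primes_up_to(340)

# The thirteen "+" positions recorded so far in the post.
plus_positions = [20, 40, 64, 65, 72, 81, 105, 128, 133, 140, 142, 159, 172]

print(len(PRIMES))                                   # primes up to 340
print(sum(p <= 170 for p in PRIMES))                 # primes in the first half
print([p for p in plus_positions if p in PRIMES])    # "+" positions that are prime
```

The last line prints an empty list when none of the recorded positions is prime.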
When you consider that 20% of all the numbers up to 340 are prime and 24 "+" characters were placed on the grid, statistically speaking nearly 5 of them should have landed on prime numbers by pure chance, but he only ended up with 1. Of course this could happen, and it's not a super rare occurrence, but the probability of this happening by accident is about 2.5%. Very high probability of a pattern, but it still could happen by chance. Let's look at another character:
The letter "B" is the next highest occurring character, coming in with 12 occurrences. In the standard English language structure we could reasonably anticipate this being any of 19 letters in the alphabet, so it's open to interpretation, but something odd (know where I'm going with this?) is where it occurs in the cipher:
21
35
147
168
181
203
216
240
261
286
315
319
Both of these added together (36 characters) account for over 10% of the entire cipher, so statistically speaking they should land on roughly 10% of the prime numbers by accident, but the reality is they hit less than 3%. I ran one million computer simulations placing characters randomly on a 340-character grid, and only 1% of the time could I place 36 of them and land on only two prime numbers.
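These chances can also be computed exactly rather than simulated. Placing symbols on distinct cells is sampling without replacement, so the number of prime-numbered cells hit follows a hypergeometric distribution. The sketch below is one way to check the figures, not the simulation described above; the function name `prob_at_most` and its defaults are illustrative.

```python
from math import comb

def prob_at_most(hits, draws, primes=68, cells=340):
    """P(at most `hits` of `draws` distinct random cells are prime-numbered)."""
    favourable = sum(
        comb(primes, k) * comb(cells - primes, draws - k)
        for k in range(hits + 1)
    )
    return favourable / comb(cells, draws)

# "+" alone: 24 symbols, at most 1 on a prime.
print(f"24 symbols, <=1 prime:  {prob_at_most(1, 24):.4f}")
# "+" and "B" together: 36 symbols, at most 2 on primes.
print(f"36 symbols, <=2 primes: {prob_at_most(2, 36):.4f}")
```

Both values should land close to the percentages quoted above, a few percent for "+" alone and around one percent for the two symbols combined.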
Now let's look at another prime anomaly in his cipher:
The "X" character only occurs twice, accounting for about 0.6% of the overall characters, so logically it should occupy 0.6% of the prime numbers, basically meaning it shouldn't land on even one by chance. But it does, not just once but both times it occurs! There is about a 4% chance of this happening by accident.
If you poke around there are other distribution patterns in reference to primes that are off average by a significant amount, enough that it becomes apparent that if this is an accident it was a highly improbable one. But for the moment let's speculate what this means...
If we assume the prime numbered characters have a meaning, then we have to decide what meaning they have. Are they alone the code, with the rest to be ignored, or should they be ignored themselves when deciphering? My theory is the prime numbered characters are actually the cipher and the rest is filler meant to drive authorities mad. If the actual cipher is buried in just 68 characters, then you are free to fill in all the others with interesting little things to catch people's eyes. "FB backwards C" appears twice in the cipher, and two triples like that are the stuff code breakers drool over. Would a man so experienced in ciphers and letter frequency leave something that easy right out in the open? Look at all the other coincidences people have found over the years and now plot over them with the prime numbers highlighted. See anything that catches your eye?
Remember his 408 cipher was cracked almost immediately after it was published, and this had to be somewhat of an embarrassment to him. If we guess the time it took for him to carefully plan and plot out a devious cipher like the 408, then chances are it took him longer to create it than it took someone to crack it. So this one was going to be one with traps and secrets, a cipher where authorities would eventually have to beg him to provide the solution and be embarrassed they didn't think of that (and in his mind they would have to acknowledge his superior intellect).
Seems unlikely so many prime curiosities could occur in this cipher by chance. If you rotate the cipher, reverse it, or pick any other way of interpreting its layout then the prime numbers fall into a standard distribution pattern with the characters. Only when read top down from left to right does the prime pattern fall outside the probability of chance. One other thing that quickly caught my eye was the only correction on the cipher (something scratched out then replaced with a backwards K) occurred at a prime number interval.
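To illustrate what re-reading the layout does to the numbering, here is a small sketch that renumbers the twelve "B" positions listed earlier under two alternative reading orders. It assumes the cipher is laid out as 20 rows of 17 symbols; the two remappings (`reversed_order`, `column_major`) are just example orders chosen for illustration.

```python
ROWS, COLS = 20, 17  # assumed layout: 340 symbols as 20 rows of 17

def is_prime(n):
    """Trial-division primality test, adequate for n <= 340."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def as_written(pos):
    """Left to right, top to bottom (the numbering used in this post)."""
    return pos

def reversed_order(pos):
    """Position when the whole cipher is read backwards."""
    return ROWS * COLS + 1 - pos

def column_major(pos):
    """Position when read down each column, leftmost column first."""
    row, col = divmod(pos - 1, COLS)
    return col * ROWS + row + 1

B_POSITIONS = [21, 35, 147, 168, 181, 203, 216, 240, 261, 286, 315, 319]

def prime_hits(remap, positions=B_POSITIONS):
    """Renumbered positions that fall on primes under the given reading order."""
    return [q for q in map(remap, positions) if is_prime(q)]

for name, remap in [("as written", as_written),
                    ("reversed", reversed_order),
                    ("column-major", column_major)]:
    hits = prime_hits(remap)
    print(f"{name:>12}: {len(hits)} prime positions {hits}")
```

With only twelve positions this shows the bookkeeping rather than a significance test; a fuller check would renumber every symbol in the cipher and compare the result against random shuffles.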
Oh yes, there is a pattern to this cipher and only a very devious person would make the pattern something that has no known pattern, prime numbers.
Dan, this is a very interesting observation. I posted my thoughts about it here: http://www.zodiackillerciphers.com/?p=319 I also found some similar behavior in the solved 408. Why would the 408 have similar prime-phobic qualities? Is there some connection we're missing, where the cipher construction process itself skews the distribution somehow?
The "X" symbol occurring on two primes is interesting. But I think to test it fairly, we'd have to shuffle the cipher and count how many times ANY low-frequency symbol occurs on all primes, instead of just a single low-frequency symbol.
I'm still baffled about why the distribution would be skewed against primes, though...
I cannot predict what it means, but from what I've found so far it is highly improbable that this happened by chance. Look at the lower-occurring characters in the cipher: the deviations in reference to primes are out considerably, not in a slightly improbable way but in a highly improbable way.
I'm not sure what this will lead to, but I'm confident, based on the statistical deviations I see in the non-prime/prime numbered characters, that this was done intentionally. If time permits I'll create some more simulations this weekend and post the results.
I'm not saying these patterns couldn't have happened by chance, but it's improbable enough to say they shouldn't have happened by chance...
The odds of at least one of the low-frequency symbols hitting primes are actually quite high. Consider the following:
1) A symbol that occurs only once (such as the square-with-dot) has a 1/5 chance (20%) of landing on a prime number.
2) A symbol that occurs only twice (such as X) has a (1/5)^2 chance (4%) of all landing on prime numbers.
3) A symbol that occurs only three times (such as E) has a (1/5)^3 chance (0.8%) of all landing on prime numbers.
And so on, for all of the symbol frequencies.
In the 340 cipher, there is one symbol that occurs only once, and eight symbols that occur only twice. So, the probability that a shuffle will produce a low-frequency symbol (one that occurs once or twice) that falls on all primes is:
1/5 + 8 * (1/5)^2 = 52%.
So, it is very easy to produce a random cipher text that has at least one low-frequency symbol that always occurs on primes.
Carrying out the calculation to include ALL symbol frequencies brings the chances all the way up to 61%. In other words, when we randomly shuffle the 340 cipher, there is a 61% chance of producing a shuffle that has a symbol that always occurs on primes.
I ran an experiment to test this. Out of 1,000,000 shuffles, 474,058 (47%) of them produced at least one symbol that always occurred on primes. A bit lower than I expected, maybe because once a prime is selected, fewer are available to choose from, possibly affecting the assumptions in the calculation above.
So, I think the argument in favor of non-primes being statistical outliers is much stronger.
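One likely reason the additive estimate comes out higher than the experiment is that summing the individual chances counts shuffles where two or more symbols qualify more than once, so the sum is an upper bound rather than the probability itself. The sketch below compares the additive figure with a complement-rule estimate and a small simulation. It models only the nine low-frequency symbols mentioned above (one seen once, eight seen twice), because the full symbol frequency table is not given here, so it is not expected to reproduce the 47% figure. All names are illustrative.

```python
import random
from math import comb

CELLS, PRIME_COUNT = 340, 68

def is_prime(n):
    """Trial-division primality test, adequate for n <= 340."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

PRIME_CELLS = {n for n in range(1, CELLS + 1) if is_prime(n)}

def p_all_prime(k):
    """Exact chance that k distinct random cells are all prime-numbered."""
    return comb(PRIME_COUNT, k) / comb(CELLS, k)

# Counts stated above: one symbol seen once, eight symbols seen twice.
LOW_FREQ = [1] + [2] * 8

# Adding the individual chances, as in the 52% figure.
additive = sum((1 / 5) ** k for k in LOW_FREQ)

# Complement rule, treating the nine symbols as independent (an approximation).
miss_all = 1.0
for k in LOW_FREQ:
    miss_all *= 1 - p_all_prime(k)
complement = 1 - miss_all

def simulate(trials=20_000, seed=1):
    """Fraction of random placements where some low-frequency symbol is all-prime."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        cells = rng.sample(range(1, CELLS + 1), sum(LOW_FREQ))
        start = 0
        for k in LOW_FREQ:
            group = cells[start:start + k]
            start += k
            if all(c in PRIME_CELLS for c in group):
                hits += 1
                break
    return hits / trials

print(f"additive estimate:   {additive:.3f}")
print(f"complement estimate: {complement:.3f}")
print(f"simulated:           {simulate():.3f}")
```

The complement estimate and the simulation should agree closely with each other and sit below the additive figure.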
By the way, a random side note: There are 68 prime numbers between 1 and 340. 68 happens to be a factor of 340: 68 * 5 = 340. And 68 breaks down cleanly into 4*17.
I agree the strong point of the argument is the very high occurrence characters tending to land on composite numbers. But I think there may be a strong statistical deviation in the overall distribution in reference to prime numbers for symbols with occurrence counts all the way down to 5. I ran a simulation tonight to get the standard prime numbered distribution based on the 340 structure, but I won't be able to compare it against his distribution just yet to see if there is enough difference to warrant calling it a true pattern. It could resolve itself as nothing, but I have to admit the + and B prime number aversion is peculiar.
Good read. I am currently running a test on the 340 with the symbols rearranged in ascending order based on their lowest prime factor (excluding one, of course). My next test will hopefully be based on the highest prime factor.
Also, there is another error in row 19. The B at dead center was originally written as a P and then changed to a B. You can tell this by the size of the loops. Z also hesitated on the last +. The horizontal line is jagged and elongated.
Finally, I once mentioned to David that the cipher symbols get larger as the cipher goes on. Row 20 is markedly larger than Row 1, which indicates the cipher was written from top to bottom, and that Z was tiring.
Doc
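A rearrangement by lowest prime factor, as described in the comment above, could be sketched along these lines. The exact rules used are not spelled out, so this version makes two assumptions: position 1 (which has no prime factor) stays first, and ties are broken by original position. The function names are illustrative.

```python
def lowest_prime_factor(n):
    """Smallest prime that divides n, for n >= 2."""
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f
        f += 1
    return n  # n itself is prime

def reorder_by_lowest_prime_factor(symbols):
    """Reorder a sequence by the lowest prime factor of each 1-based position.

    Assumptions: position 1 sorts first; ties keep their original order.
    """
    def key(pos):
        return (0 if pos == 1 else lowest_prime_factor(pos), pos)

    order = sorted(range(1, len(symbols) + 1), key=key)
    return [symbols[pos - 1] for pos in order]

# Demo on the position numbers 1..12 instead of actual cipher symbols.
print(reorder_by_lowest_prime_factor(list(range(1, 13))))
# -> [1, 2, 4, 6, 8, 10, 12, 3, 9, 5, 7, 11]
```

Applied to the 340 symbols in reading order, this puts position 1 first, then all even positions, then the remaining multiples of 3, and so on.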
In case you're interested, here is the cipher ascii rearranged based on lowest prime factor...
H E > Ì V Ë ² T ± N + ¢ O D y < K
£ Ÿ Ã + Z W £ · H S Ð ^ ¾ V Ð + R
± ¼ + Ô Ä µ P ˆ Ë Ð R F O » C F ±
¢ µ K º ± Ã G • L ¢ ± Æ · + N ¤ ¹
¼ < + + R F Ã A ³ - Ì V ^ + Ð < B
- + / Ô I y Ð T K ± Ã R I µ ³ • ˆ
F ° S · N µ B ¢ ¾ Ì F ^ µ ³ • V Ô
+ B ² „ ¼ E V Z - I • ¤ K O ^ Æ Ñ
± Ã + ² C + Ì B £ + £ C W P S T ¢
Ð F Ä < Ô ¸ O » C > D N Ë ¤ O A K
+ R P G B W Æ M ¢ J Ì O ¸ Ê + ¾ -
> + u ¤ J ¸ L Â B K + ½ u E Â » +
¼ I ƒ N Â + » > + B • R ° W + Ã /
Ë B - H Z ƒ Ð º B ½ + / Ä V Ê ± ¤
µ M ¤ F ³ ³ Ð T ¤ £ M ¾ ^ ¤ I Ì I
+ J B Ã ´ I H Ð I G Z O Ÿ µ B L Ÿ
R C O Ä L Ã Ð · • » „ u G L Ð » ´
K M ¤ F ³ ^ Ë D · Ñ ƒ X O y Ä M Ÿ
° u O F R D B < Ì J T M + G Æ ± Ã
Ÿ X u à ¢ M G L < F W I W ½ y S I
There are some interesting patterns in there.