Create Data Files

CryptoCrack uses a number of language data files containing the frequencies of single letters, digraphs, trigraphs, tetragraphs and pentagraphs. These files are used to determine how well the resulting plaintext fits the selected language. The data files have been derived from text files freely available on the Internet and, where possible, from source files larger than 30 MB per language to maximize their accuracy.
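As an illustration of what such frequency data contains, the following Python sketch counts overlapping n-grams in a piece of text. It is not CryptoCrack's own code: the function name ngram_counts is hypothetical, and the choice to drop spaces and punctuation so that n-grams run across word boundaries is an assumption.

```python
from collections import Counter


def ngram_counts(text: str, n: int) -> Counter:
    """Count overlapping n-grams over the letters A-Z of `text`.

    Assumption: everything that is not A-Z (spaces, digits, punctuation)
    is dropped before counting, so n-grams span word boundaries.
    """
    letters = "".join(c for c in text.upper() if "A" <= c <= "Z")
    return Counter(letters[i:i + n] for i in range(len(letters) - n + 1))


sample = "The cat sat on the mat."
digraphs = ngram_counts(sample, 2)

# Show the three most frequent digraphs in the sample.
for gram, count in digraphs.most_common(3):
    print(gram, count)
```

Calling the same function with n set to 1, 3, 4 or 5 gives single letter, trigraph, tetragraph and pentagraph counts respectively.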

Source file - The name and location of the text file from which the frequencies are derived. The default file path of the source file is obtained from the Options settings.

Target folder - The location of the language folder in which the individual data files will be saved. The default target folder is obtained from the Options settings.

It is recommended that the Target folder is set in the Options window before running this tool. This avoids having to enter the location again each time data files are created.

Data files - The data files to be created. The digraph, trigraph and pentagraph files are used by the n-Gram search function and may be deselected if not required. The tetragraph files are used when solving a cipher to determine the best match and are required to solve ciphers.
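A common way to use tetragraph frequencies for judging a candidate plaintext is to sum the log-probabilities of its tetragraphs, so that text resembling the language scores higher. The Python sketch below shows that general technique only. Whether CryptoCrack scores text in exactly this way is not stated here, and the function names and the floor value used for unseen tetragraphs are assumptions.

```python
import math
from collections import Counter


def only_letters(text: str) -> str:
    """Upper-case the text and keep only A-Z."""
    return "".join(c for c in text.upper() if "A" <= c <= "Z")


def tetragraph_log_probs(corpus: str) -> tuple[dict[str, float], float]:
    """Build log10 probabilities of tetragraphs from a sample corpus.

    Returns the table and a floor value for tetragraphs never seen
    (assumption: an unseen tetragraph is treated as 0.01 occurrences).
    """
    letters = only_letters(corpus)
    counts = Counter(letters[i:i + 4] for i in range(len(letters) - 3))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("corpus must contain at least four letters")
    log_probs = {g: math.log10(c / total) for g, c in counts.items()}
    floor = math.log10(0.01 / total)
    return log_probs, floor


def fitness(text: str, log_probs: dict[str, float], floor: float) -> float:
    """Sum the log-probabilities of every tetragraph in `text`."""
    letters = only_letters(text)
    return sum(log_probs.get(letters[i:i + 4], floor)
               for i in range(len(letters) - 3))


# A tiny stand-in corpus; real data files are built from far larger texts.
corpus = "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG " * 3
table, floor = tetragraph_log_probs(corpus)

likely = fitness("THE QUICK", table, floor)
unlikely = fitness("XZQ JVKWP", table, floor)
print(likely > unlikely)
```

A solver built on this idea would keep the candidate decryption with the highest score.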

Data file language - The name of the language for which the data files will be created. The data files will be created in a folder of this name using the location given by the path in the Target folder.
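The relationship between the Target folder and the language name can be sketched in Python as follows. The function name language_folder and the example path are hypothetical and are used only to show how the two values combine.

```python
from pathlib import Path


def language_folder(target_folder: str, language: str) -> Path:
    """Return the folder named after the language inside the target folder."""
    return Path(target_folder) / language


# Example values only; the real target folder comes from the tool's settings.
print(language_folder("Data", "German"))
```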

Use 26 letter alphabet (A - Z) - Some languages use characters outside the range A to Z. The languages affected are Danish, German, Norwegian and Swedish. This option is enabled when any of these languages is selected.

If this option is selected the characters outside of the A-Z range will be replaced by their nearest matching character. So, for example:

Å and Ä are replaced by A

Ö and Ø are replaced by O

Ü is replaced by U

Un-checking this option will create data files including these additional characters.

This option allows ciphers in these languages to be solved whether they were enciphered with a 26 letter alphabet or with an extended alphabet.
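The replacement described above can be sketched in Python with a translation table. Only the five replacements listed above are included, and the function name fold_to_az is hypothetical. How other characters outside A to Z are handled is not specified here, so this sketch leaves them unmapped.

```python
# Translation table covering only the replacements listed in the text above.
FOLD_MAP = str.maketrans({
    "Å": "A",
    "Ä": "A",
    "Ö": "O",
    "Ø": "O",
    "Ü": "U",
})


def fold_to_az(text: str) -> str:
    """Upper-case the text and apply the listed replacements.

    Characters that are not in FOLD_MAP are not changed by the table.
    """
    return text.upper().translate(FOLD_MAP)


print(fold_to_az("Smörgåsbord"))
print(fold_to_az("für"))
```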

Create files - Creates the data files from the source text file and saves them in the folder set in the Target folder field. With very large source files this may take a few minutes to complete.