PostScript and the bad IDEA

The strength of cryptography doesn’t only depend on the algorithm selected. It is mostly how cryptography is used that is important. Here I examine an application that made use of a cryptographically strong algorithm in its license verification routine, but due to mistakes made during the implementation, it was possible to circumvent the cryptography.

The International Data Encryption Algorithm (IDEA) is a symmetric block cipher. It uses a 128-bit key to encrypt a 64-bit block of data. The same key is then used for decryption. When used properly the algorithm is secure, and no practical attacks are known. Messages encrypted using the IDEA can’t be decrypted in a reasonable amount of time without the key. Even when the attacker has a sample of decrypted messages.

The application

A few years ago I encountered the IDEA while reverse engineering an unusual application. It had a typical shareware licensing model with a name/password pair required to unlock the full functionality. What was surprising is the language used for its implementation.

Never before or after have I seen an application written mostly in PostScript. This is the language on which PDF is based. It is primarily used to produce documents suitable for printing. Nevertheless, the language itself is Turing-complete and usable for general purpose programming.

PostScript is interpreted, and as such any application using it needs an interpreter. The application I was reverse engineering packaged its own interpreter¹. Not only that, the interpreter was extended with a few functions not present in standard PostScript. One of those functions was idecrypt which, as the name suggests, implemented the IDEA. Apparently the function had been added exclusively for the license verification routine.

Its signature can be approximated like this

void idecrypt(const uint8_t* key, uint16_t* message, int len);

The reason cryptography is used in license verification is to prevent “clean” keygens. Meaning the keygens that don’t require patching the executable (in order to replace a hardcoded key, hash, etc).

Sometimes it is enough to have a single valid name/password pair to overcome the encryption, but usually a patch is the only way forward.

In my case the verification was implemented in such a away that having a valid pair would have been enough to produce a keygen. It is an unfortunate scheme, but at least one can’t write a keygen without purchasing at least a single license. I think this fact actually deters a lot of people that do reverse-engineering for fun.

At one of its stages, the verification routine extracted an IDEA key from the input password and tried to decipher a hardcoded string using that key. It then compared the result with another hardcoded string to see if the key is correct. As the key is kept secret by the application developer, only the developer is able to produce a valid name/password pair. Or so the theory went.

The normal way to deal with this scheme is to patch the hardcoded buffer in the executable and write a keygen that embeds the right key into passwords. That’s boring.

Working around the cryptography

The first mistake on the developer’s part was to provide a trial name/password pair. This pair only unlocked some features. The verification of the trial pair triggered an alternative execution path that contained a different hardcoded buffer ². As such the recovery of the key embedded in the trial password only allowed generating more trial passwords.

Still, discovering the key embedded in the trial password proved useful. The key was VoYaGeUr. Remember that the size of an IDEA key is 16 bytes, but VoYaGeUr is only 8 bytes long.

The analysis of the idecrypt implementation revealed another curious detail. Instead of using 16-byte keys, the function used 8-byte keys by first duplicating every byte of an 8-byte key. So the VoYaGeUr became VVooYYaaGGeeUUrr. Clearly that’s a major vulnerability. Doing so reduced the key space in half. That meant that I only needed to bruteforce 2^64 keys instead of 2^128 keys. That’s a good start, but it’s still infeasible.

Another problem with the VoYaGeUr key was its apparent simplicity. Is was limited to the ASCII range, once again considerably reducing the potential key space. What if the real non-trial key was also limited to ASCII? Brute-forcing the complete ASCII range would’ve taken me around a month. That’s way too much.

Now could it be that the key was not just ASCII, but an actual English word? I tested this hypothesis using a word dictionary, limiting the search to 8 character words. The trial key contained characters in both upper and lower cases, so I tried all possible combinations of letter case³.

The search took less than a minute. The key was KaterinA.

Conclusion

These are mistakes the developer shouldn’t have made in this unfortunate use of the IDEA⁴.

Artificially reducing the key length.
Limiting the key to printable ASCII characters.
Providing a trial pair that revealed the above details about the real key.

Even though the scheme was mathematically sound, these mistakes had reduced the verification routine to a fancy pile of useless shifts and XORs.

It also used a binary format for its programs. Each program was minified and binary encoded. This provided (purposefully or not) a level of protection on its own. PostScript, being a stack-machine, is not particularly readable without the right indentation and spacing. Compare this

/t1 c B1 sub b add def /t2 d B2 sub a add def /m100 t2 t1 sub def m100 100 mod 0 eq and /m m100 100 idiv def 1 m le and m 12 le and

to the same code with indentation and annotations (not exactly relevant without the context)

/t1 c B1 sub b add def   % t1 = c - 7000(B1) + b
/t2 d B2 sub a add def   % t2 = d - 16000(B2) - a
/m100 t2 t1 sub def      % m100 = t2 - t1	 
m100 100 mod 0 eq and    % true && (m100 % 100 == 0)
/m m100 100 idiv def     % m = m100 / 100
1 m le and m 12 le and   % .. && 1 <= m && m <= 12

↩

Quite fittingly the string to be decrypted was Paranoid for the trial pair. ↩
256 combinations per an 8-character word: from all lower 00000000 to all upper 11111111, just like a byte. ↩
It’s tempting to say that security through obscurity is usually not the brightest idea, but in this particular case, given the niche market of the program and the sheer amount of the obscurity, it actually resulted in a pretty decent level of protection. ↩