Whitepixel v2: configurable charset, higher performance (33.1 billion passwords/sec!)

Barely a week after I released whitepixel v1, version 2 is already available. It delivers basic configurable charsets, plus a solid +15% performance increase, reaching 33.1 billion passwords/sec on 4 x HD 5970 graphics cards (see detailed hardware description)! Single MD5 hash brute forcing speeds like these are, to this day, simply unheard of on one computer. Browse the whitepixel project page to get your hands on this open source (GPLv3) GPU-accelerated password hash auditing tool.

(On a side note, after seeing whitepixel demonstrate that high ALU utilization ratios for MD5 on AMD GPUs were possible, Ivan Golubev, the developer of ighashgpu, reworked his tool to increase its ALU utilization. He released ighashgpu 0.91.17.1, which raised its single MD5 cracking speed by 12%. This is one of the reasons why I like open source: anyone can learn from anyone else's source code, and the knowledge of one developer can then spread out and indirectly benefit a much larger user base. Even I, a user of ighashgpu, benefit from it :-) )

Crazy BFI_INT hacking

Whitepixel v2 is 15% faster than v1 thanks to the BFI_INT instruction (integer bitfield insert) present in AMD Radeon HD 5000 series GPUs. This instruction is similar to the Cell B.E. SPU selb, or the PowerPC AltiVec vsel instructions: it selects bits from 2 registers depending on a mask. It allows implementing each of the F() and G() functions of MD5 in a single clock cycle.
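
To make the bit-select idea concrete, here is a minimal Python sketch (not whitepixel code). The helper name bitselect() and its argument order are my own choices for illustration; it only models the "pick bits from 2 registers depending on a mask" behavior described above, on 32-bit words, and checks it against the textbook definitions of F() and G() from the MD5 specification.

```python
MASK32 = 0xFFFFFFFF  # MD5 operates on 32-bit words

def bitselect(mask, a, b):
    """For each bit: take it from a where mask is 1, from b where mask is 0.
    Hypothetical helper modelling a selb/vsel-style bit select."""
    return ((a & mask) | (b & ~mask)) & MASK32

def md5_f_reference(x, y, z):
    # F(x, y, z) = (x AND y) OR (NOT x AND z), as in RFC 1321
    return ((x & y) | (~x & z)) & MASK32

def md5_g_reference(x, y, z):
    # G(x, y, z) = (x AND z) OR (y AND NOT z), as in RFC 1321
    return ((x & z) | (y & ~z)) & MASK32

def md5_f(x, y, z):
    # x is the mask: select y where x is 1, z where x is 0
    return bitselect(x, y, z)

def md5_g(x, y, z):
    # z is the mask: select x where z is 1, y where z is 0
    return bitselect(z, x, y)
```

Written this way, each function is a single bit-select operation instead of three or four separate AND/OR/NOT operations, which is where the speedup would come from.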

However, this native instruction is not exposed at all, not even to low-level CAL IL code. Well, there is one IL instruction, ubit_insert, whose purpose is to implement some DirectX 11 operation, which is compiled to a BFM_INT+BFI_INT native pair. But there is no instruction to expose the powerful functionality of BFI_INT alone. The solution? After compiling the IL code to native instructions, whitepixel dynamically patches the binary CAL object in memory by scanning its opcodes to replace some of them with BFI_INT. This works well, but is ugly and makes a lot of assumptions. I would not be surprised if it caused crashes, so I made this feature optional via the -e option.
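
The scan-and-replace step can be pictured with the following toy Python sketch. Everything about the encoding here is an assumption made for illustration: the 8-byte word size, the opcode position, and both opcode values are made up and do NOT correspond to the real AMD instruction encoding or to whitepixel's actual patching code. It only shows the general shape of walking a compiled binary in memory and swapping one opcode for another while leaving the operands alone.

```python
import struct

# Hypothetical toy encoding: 8-byte little-endian instruction words with the
# opcode in the top 8 bits. These are NOT real AMD ISA values.
WORD_SIZE = 8
OPCODE_SHIFT = 56
OPCODE_MASK = 0xFF << OPCODE_SHIFT
WORD_MASK = 0xFFFFFFFFFFFFFFFF
OP_PLACEHOLDER = 0x0C  # made-up opcode emitted by the compiler
OP_BFI_INT = 0x06      # made-up opcode standing in for BFI_INT

def patch_opcodes(binary, old_op, new_op):
    """Rewrite every instruction word whose opcode is old_op to use new_op,
    in place, keeping all other bits. Returns the number of words patched."""
    patched = 0
    for offset in range(0, len(binary) - WORD_SIZE + 1, WORD_SIZE):
        (word,) = struct.unpack_from("<Q", binary, offset)
        if (word & OPCODE_MASK) >> OPCODE_SHIFT == old_op:
            word = (word & ~OPCODE_MASK & WORD_MASK) | (new_op << OPCODE_SHIFT)
            struct.pack_into("<Q", binary, offset, word)
            patched += 1
    return patched

def make_word(opcode, operands):
    """Build one toy instruction word (hypothetical helper for the example)."""
    return struct.pack("<Q", (opcode << OPCODE_SHIFT) | operands)

# Usage: three toy instructions, two of which use the placeholder opcode.
code = bytearray(
    make_word(OP_PLACEHOLDER, 0x123456)
    + make_word(0x01, 0xABCDEF)
    + make_word(OP_PLACEHOLDER, 0x654321)
)
count = patch_opcodes(code, OP_PLACEHOLDER, OP_BFI_INT)
```

Even in this toy form the fragility is visible: the patcher has to assume where the opcode field lives and that a matching bit pattern really is the instruction it is looking for.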

Configurable charsets

Currently whitepixel v2 gives a very basic choice: selecting between lower, upper, digit, printable ASCII (0x20-0x7e), or all bytes (0x00-0xff). Combining these charsets is not allowed yet, but v3 will support it. I like to stick to the "release often, release early" strategy.
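
As a rough illustration of what these five choices mean for the search space, here is a small Python sketch. The dictionary keys and function names are hypothetical labels of mine, not whitepixel's actual option values, and the enumeration is a plain CPU-side loop rather than anything resembling the GPU code.

```python
import itertools
import string

# Hypothetical names for the five charsets described above.
CHARSETS = {
    "lower": string.ascii_lowercase.encode(),
    "upper": string.ascii_uppercase.encode(),
    "digit": string.digits.encode(),
    "printable": bytes(range(0x20, 0x7F)),  # 0x20-0x7e inclusive
    "all": bytes(range(0x100)),             # 0x00-0xff inclusive
}

def keyspace(charset_name, length):
    """Number of candidate passwords of exactly this length."""
    return len(CHARSETS[charset_name]) ** length

def candidates(charset_name, length):
    """Yield every candidate password of this length as a bytes object."""
    charset = CHARSETS[charset_name]
    for combo in itertools.product(charset, repeat=length):
        yield bytes(combo)
```

For example, keyspace("printable", 8) is 95 to the power of 8, so the choice of charset has a large effect on how long an exhaustive search takes.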