mrb's blog

Whitepixel v2: configurable charset, higher performance (33.1 billion passwords/sec!)

Keywords: amd attack bruteforcing gpu performance

Barely a week after I released whitepixel v1, version 2 is already available. It delivers basic configurable charsets, plus a solid 15% performance increase, reaching 33.1 billion passwords/sec on 4 x HD 5970 graphics cards (see detailed hardware description)! Single MD5 hash brute forcing speeds like these are, to this day, unheard of on one computer. Browse the whitepixel project page to get your hands on this open source (GPLv3) GPU-accelerated password hash auditing tool.

(On a side note, after seeing whitepixel demonstrate that high ALU utilization ratios for MD5 on AMD GPUs were possible, Ivan Golubev, the developer of ighashgpu, reworked his tool to increase its ALU utilization. He released ighashgpu 0.91.17.1, which raised its single MD5 cracking speed by 12%. This is one of the reasons why I like open source: anyone can learn from anyone else's source code, and the knowledge of one developer can then spread out and indirectly benefit a much larger user base. Even I, a user of ighashgpu, benefit from it :-) )

Crazy BFI_INT hacking

Whitepixel v2 is 15% faster than v1 thanks to the BFI_INT instruction (integer bitfield insert) present in AMD Radeon HD 5000 series GPUs. This instruction is similar to the Cell B.E. SPU selb or the PowerPC AltiVec vsel instructions: it selects bits from 2 registers depending on a mask. It allows implementing each of the F() and G() functions of MD5 in a single clock cycle.
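To make the bit-select idea concrete, here is a minimal Python sketch. It assumes a generic select operation, named `bitselect` here purely for illustration; the actual operand order and encoding of BFI_INT are not reproduced. It checks that the textbook definitions of MD5's F() and G() are each equivalent to a single select.

```python
MASK32 = 0xFFFFFFFF

def bitselect(mask, a, b):
    """Hypothetical helper: take bits from a where mask is 1, from b where mask is 0."""
    return ((a & mask) | (b & ~mask)) & MASK32

def md5_f(x, y, z):
    # Textbook MD5 F: (x AND y) OR ((NOT x) AND z)
    return ((x & y) | (~x & z)) & MASK32

def md5_g(x, y, z):
    # Textbook MD5 G: (x AND z) OR (y AND (NOT z))
    return ((x & z) | (y & ~z)) & MASK32

def md5_f_select(x, y, z):
    # F is a select with x as the mask: y where x is set, z elsewhere.
    return bitselect(x, y, z)

def md5_g_select(x, y, z):
    # G is a select with z as the mask: x where z is set, y elsewhere.
    return bitselect(z, x, y)
```

Written with plain AND/OR/NOT, each function costs several operations, which is why a single select instruction helps in the rounds that use F() and G().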

However, this native instruction is not exposed at all, not even to low-level CAL IL code. Well, there is one IL instruction, ubit_insert, whose purpose is to implement some DirectX 11 operation, which is compiled to a BFM_INT+BFI_INT native pair. But there is no instruction to expose the powerful functionality of BFI_INT alone. The solution? After compiling the IL code to native instructions, whitepixel dynamically patches the binary CAL object in memory by scanning its opcodes to replace some of them with BFI_INT. This works well, but is ugly and makes a lot of assumptions. I would not be surprised if it caused crashes, so I made this feature optional via the -e option.

Configurable charsets

Currently whitepixel v2 gives a very basic choice: selecting between lower, upper, digit, printable ASCII (0x20-0x7e), or all bytes (0x00-0xff). Combinations of these charsets are not supported yet, but v3 will allow them. I like to stick to the "release often, release early" strategy.
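As a rough illustration of what these charsets mean for the search space, the sketch below builds the five byte sets and computes the keyspace size and the time to exhaust it at the 33.1 billion/sec figure quoted above. The dictionary keys and function names are made up for this example and are not whitepixel's command-line syntax.

```python
import string

# Hypothetical names for the five charsets described above.
CHARSETS = {
    "lower": string.ascii_lowercase.encode(),
    "upper": string.ascii_uppercase.encode(),
    "digit": string.digits.encode(),
    "printable": bytes(range(0x20, 0x7F)),  # 0x20-0x7e inclusive: 95 bytes
    "all": bytes(range(0x100)),             # 0x00-0xff: 256 bytes
}

RATE = 33.1e9  # candidate passwords per second, the figure quoted in this post

def keyspace(charset_name, length):
    """Number of candidate passwords of exactly `length` characters."""
    return len(CHARSETS[charset_name]) ** length

def seconds_to_exhaust(charset_name, length, rate=RATE):
    """Worst-case time to try every candidate of the given length."""
    return keyspace(charset_name, length) / rate
```

For instance, `seconds_to_exhaust("lower", 8)` comes out at roughly 6 seconds, since 26**8 is about 2.1e11 candidates.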

Comments

Ouroborus wrote: Pretty nice, but it seems you're still looking at around 120 years to crack a difficult 10 character password. But an eight character, case sensitive, alpha-numeric password should take less than two hours. 01 Jan 2011 16:26 UTC

mrb wrote: It is 57 years to crack all 10-char printable ASCII passwords with 4x5970 (95**10/33.1e9/3600/24/365.25 = 57.3 years).

Actually a real attack would take less than 9 years, because one must take into account GPU performance doubling every 1.5 years.
01 Jan 2011 18:19 UTC
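
The two estimates in the reply above can be reproduced with a short sketch. The second function is only one possible model: it assumes the attacker's throughput grows continuously, doubling every 1.5 years from the start of the attack, which is an assumption of this example rather than something spelled out in the comment.

```python
import math

def years_at_constant_rate(keyspace, rate_per_sec):
    """Years needed to exhaust the keyspace at a fixed rate."""
    return keyspace / rate_per_sec / 3600 / 24 / 365.25

def years_with_doubling(base_years, doubling_period_years=1.5):
    """Years needed when throughput doubles every `doubling_period_years`.

    Work done by time T (in units of years at the initial rate) is the
    integral of 2**(t/d) dt from 0 to T, i.e. (d / ln 2) * (2**(T/d) - 1).
    Setting this equal to base_years and solving for T gives the result.
    """
    d = doubling_period_years
    return d * math.log2(1 + base_years * math.log(2) / d)

base = years_at_constant_rate(95**10, 33.1e9)  # about 57.3 years
with_growth = years_with_doubling(base)        # about 7.2 years under this model
```

Under this continuous-growth assumption the result is a little over 7 years, consistent with the "less than 9 years" stated above.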

mr b wrote: how very exciting! i'll be sure and tell everyone who asks that you aren't the same mr b as I who can be found on twitter @veritaz 06 Jun 2011 01:59 UTC

Dennison Uy wrote: Why does it always have to be MD5? How does this perform on other standards like SHA-256? 06 Jun 2011 09:25 UTC

mrb wrote: Many applications with substandard password hashing algorithms (esp. web applications) hash passwords with MD5. So it became the de facto benchmark for password bruteforcers...

SHA-256 requires roughly 7 times more instructions (per call to the compression function). So a 4 x HD 5970 machine would bruteforce straight SHA-256 hashes at about 5 billion/sec.
06 Jun 2011 10:05 UTC

Anon wrote: In real world SHA-256 performance the HD 5970 can do about 530-800MHash/second, depending on clock speed and other variables.
That means 2.1-3.2BHash/s, not 5BHash/s
06 Jun 2011 14:32 UTC

mrb wrote: No, I have (unreleased) SHA-256 code (for Bitcoin) that runs at 1.1 Ghash/s on the 5970. 06 Jun 2011 18:01 UTC

ramon wrote: No comment. Just some questions. So you are using brute force and very fast GPUs to guess passwords. How does this work in the real world? Say I try to use this to guess a website access but the site will shut me out after the first three wrong guesses. What good is all that speed? What are the odds of guessing right in the first three tries? 07 Jun 2011 12:16 UTC

Jon L. wrote: Ramon, the point isn't to blindly guess at passwords, the point is to take a known MD5 hash and determine what the plaintext password is that created that hash. 07 Jun 2011 15:22 UTC

RayB wrote: So - the attacker must first have a list of hashes from the site - by hacking it or otherwise to know a hash whose password would work. Therein lies the rub.

It would be interesting to know whether this GPU approach is scalable across multiple units, like the Access Data DNA: a manager and multiple workers to split up the task, using 1000 or more devices.
07 Jun 2011 16:34 UTC

klmdb wrote: try running the program on one of these:

http://www.tigerdirect.com/applications/SearchTools/item-details.asp?EdpNo=7194549&CatId=4044

XD
09 Jun 2011 14:03 UTC

mrb wrote: klmdb: I don't support Nvidia because they are much slower. The Tesla C2070 only does 1.4 Ghash/s with MD5. See http://www.golubev.com/gpuest.htm 10 Jun 2011 06:15 UTC

Greg B wrote: Ramon and Ray B: This isn't designed for hacking website passwords, such as to bust into yahoo mail accounts, pr0n sites, etc. This would be very effective at busting into password-protected data stores, such as USB keys or hard drives, where one physically had access to the medium. 14 Jun 2011 21:18 UTC

JodiTheTigger wrote: Mrb, care to share your unreleased bitcoin mining code? From what I can tell it's 50% faster than what is available. 20 Jun 2011 21:40 UTC

mrb wrote: It is not 50% faster. It is just a few percent faster than the best public code. Don't forget that 1 Bitcoin hash is defined as 2 SHA-256 hashes. My code does 569M Bitcoin hashes/s, or ~1.1G SHA-256 hashes/s on the HD 5970. 21 Jun 2011 05:53 UTC

Aaron wrote: Very, very cool. I'll definitely add this to my 'toolbox' when I build my desktop. Going to crossfire 2 Radeon 6790's (after flashing the bios to 6850, if I can), so I should be doing decently well with that. 03 Jul 2011 00:59 UTC

mywebs wrote: Would it make any difference if the MD5 hash also had a salt? When I use MD5 to hash a password I always use a sentence length salt that includes numbers and special chars to make sure a rainbow table can't be used.

I actually created some code to encrypt cookies, could also be used for passwords, that first uses MD5, then a 128 bit one way hashing function then a 256 bit one in succession. Then I compress it all back down to about 13 characters in such a way it would be impossible for anything but possibly a supercomputer to reverse due to integer length limitations.
28 Jul 2011 19:21 UTC

mrb wrote: As you point out, salts are only useful to prevent pre-computed attacks. For the purpose of bruteforcing, a salt does not typically significantly slow down attacks.

Bruteforcing attacks can be slowed down by iterating a hash function. Look at the standard MD5-based or SHA2-based UNIX crypt() function for how it is done.

You talk about "encrypting" cookies with one way hash functions, but it does not sound like encryption (as it would imply you can decrypt them, which is precisely what one-way functions prevent).
29 Jul 2011 04:59 UTC
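
The difference between salting and iterating can be sketched with Python's standard `hashlib`. The function names below are hypothetical, the naive loop is not the actual crypt() construction, and PBKDF2 is shown as a readily available standard alternative rather than as the scheme mentioned in the reply.

```python
import hashlib

def salted_md5(password: bytes, salt: bytes) -> str:
    """One MD5 call: the salt defeats precomputed tables, but each guess stays cheap."""
    return hashlib.md5(salt + password).hexdigest()

def iterated_md5(password: bytes, salt: bytes, rounds: int) -> str:
    """Naive iteration for illustration only: each guess costs `rounds` MD5 calls."""
    digest = hashlib.md5(salt + password).digest()
    for _ in range(rounds - 1):
        digest = hashlib.md5(digest + salt + password).digest()
    return digest.hex()

def stretched(password: bytes, salt: bytes, rounds: int = 100_000) -> str:
    """PBKDF2-HMAC-SHA256 from the standard library, a vetted way to iterate a hash."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, rounds).hex()
```

With `rounds` set to N, an attacker who could test R salted single-MD5 guesses per second is reduced to roughly R/N guesses per second, which is the slowdown the reply refers to.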

db wrote: your program is much, much slower than oclHashcat. i have 4x 5970s and single hash raw MD5 performance with oclHashcat-lite is ~ 49.2 G/s. step it up! 16 Aug 2011 12:27 UTC

mrb wrote: I know. One of my goals when releasing whitepixel was to entice competition between different tools. At first, oclhashcat users in the IRC channels could not believe my results, said I cheated by using a simplistic charset, etc. Now look at how fast oclhashcat is. I succeeded :-) 17 Aug 2011 05:24 UTC

klo wrote: Are you going to release a new faster version?? 22 Aug 2011 13:42 UTC

Peter wrote: Nice and impressive. But all my login code has a preset time interval for login attempts. In other words, you can try a password only every 'n' seconds.
Ah yes, and get a couple wrong in a row and you are locked out for even longer so you can clear your mind and think about what to do next :-)
01 Sep 2011 00:03 UTC

mrb wrote: Peter, my tool runs an offline attack. It is not subject to settings controlling the maximum number of login attempts. 02 Sep 2011 03:04 UTC

m3g9tr0n wrote: Hi Marc!
I have to admit that whitepixel is the fastest raw MD5 cracker!
Could you please suggest some tutorials on GPGPU and the CAL APIs for ATI GPUs? I am interested in that topic!
Thanks in advance and keep up with the good work!!!
16 Nov 2011 09:04 UTC

Ahmad wrote: Can the code of whitepixel be modified to work with SHA-256? 17 Nov 2011 14:50 UTC

mrb wrote: The official AMD doc on CAL is what I used to learn.

Yes the code could be modified to support SHA-256. (I wrote a private Bitcoin miner, which uses SHA-256, based on a fork of whitepixel.)
18 Nov 2011 01:40 UTC

m3g9tr0n wrote: Thanks Marc for your reply!
Have you ever tried this library?
http://code.google.com/p/calseum/
24 Nov 2011 17:59 UTC

mrb wrote: No. It seems calseum would be rendered obsolete by OpenCL... 24 Nov 2011 18:11 UTC

Mr T, who is much more smart than you wrote: well MY website has rayguns and mutant porno lizards in jackboots that throw monkey poop at anyone who enters two wrong passwords in a row, so your code is crap. useless crap.
And and um On his lunchbreak my older brother like soldered together three minitels he got off that smelly french dude down the street and it is capable of doing like, um, a bazillion KADRILLION of those thingies every lightsecond. or something.
Oh yeah, and I uh, I like figured out a magic three-way encryptocoder function that obtuses not just cookies but SCONES, then recodercrypts them back and can't be broken except by bashing me on the head until I surrender.
And and and since my TRS-80 doesn't even HAVE a etcetera password shadow that I can find and doesn't have a modem anyhow so WHY DO YOU EVEN BOTHER GETTING UP IN THE MORNING, let alone turning on your computer?

Why does every third response on this kind of blog involve people saying
A: You shouldn't bother because it can't crack MY website
B: Someone they know has done it faster
C: they've created some super code that renders even the NSA helpless.
or
D:they know more about linux so you suck.

I don't know crap about this stuff, and even I know they're silly. wow.

Thanks for your work on this stuff. This is how some of us learn a bit.
22 Jan 2012 08:57 UTC

TV wrote: Hi Marc,

Crazy question, but will your software run on an ARM platform with an AMD/ATI card?
09 Mar 2012 04:06 UTC

mrb wrote: Not a chance. ARM platforms do not even support graphics cards made for x86 PCs. 09 Mar 2012 09:32 UTC

Simon Zerafa wrote: Hi,

Have you been able to repeat your benchmarks using more modern ATI GPUs? Say the 6990 or better?

If so what results did you get with Whitepixel or oclHashcat-plus? :-)

Kind Regards

Simon
07 Apr 2012 12:17 UTC

mrb wrote: Simon, no, however 6990 performance numbers with oclhashcat-plus can be found here: http://hashcat.net/oclhashcat-plus/ 21 Apr 2012 08:49 UTC

Dawn White wrote: I'm just trying to get my brain around calculating how long it would take a rig using 4 x HD 5970 units to crack WPA/WPA2 router passwords. For example, many Thomson and VirginMedia routers use 8 lowercase characters, and older SKY routers use 8 uppercase. So I am talking about grabbing the 4-way handshake and then running, say, pyrit or aircrack-ng with dictionaries, or using crunch to brute force. Any clarification would be much appreciated, as I am trying to either build a rig myself or buy one off the shelf. I know the hashcat team won some benchmark contests about a year ago. Does anyone have an idea of the best price/performance rigs out there that would crack 8-symbol alphanumerics, and ideally 10-symbol too, as many newer routers use 10 alphanumeric symbols? Any links or advice are much appreciated.

TIA.
17 Mar 2014 23:47 UTC

Dawn White wrote: Addendum to earlier post, re: WPA/WPA2 password cracking. Just to be more precise, I mean cracking 8-symbol alphanumeric passwords in a reasonable time, let's say a few hours or a few days. 18 Mar 2014 00:01 UTC

mrb wrote: Dawn White: I have no benchmark numbers for the HD 5970, but I know that with an 8 x R9 290X rig, oclhashcat bruteforces WPA2 at 1.34M c/s. This means 8-character alphabetic passwords can be attacked in 43 hours, 8-character alphanumeric in 24 days, and 10-character alphabetic in 3.3 years. 22 Mar 2014 06:12 UTC
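
The three figures in the reply above follow from dividing the keyspace by the quoted rate. The sketch below assumes "alphabetic" means 26 single-case letters and "alphanumeric" means 36 symbols (one letter case plus digits), which is what reproduces the quoted numbers. The function name is made up for this example.

```python
WPA2_RATE = 1.34e6  # candidates per second, the 8 x R9 290X figure quoted above

def exhaust_seconds(charset_size, length, rate=WPA2_RATE):
    """Worst-case seconds to try every password of exactly `length` symbols."""
    return charset_size ** length / rate

hours_alpha_8 = exhaust_seconds(26, 8) / 3600                 # about 43 hours
days_alnum_8 = exhaust_seconds(36, 8) / 3600 / 24             # about 24 days
years_alpha_10 = exhaust_seconds(26, 10) / 3600 / 24 / 365.25 # about 3.3 years
```

These are worst-case times for exhausting the whole keyspace; on average a password is found after about half of that.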

Jim wrote: An EXTREMELY simple and effective password system is available to chess players (of sufficient skill). The first 5 move pairs of a common or favorite chess opening, written in algebraic notation, have too many possibilities for brute force hacking. Just adding a simple word to the end of the string defeats an attacker who loads an opening encyclopedia to search. Example:

e4e5Nf3Nc6Bb5a6Ba4JIM is an opening most chess players would recognize and easily remember.
30 Jan 2016 14:37 UTC

hi-300 wrote: 33.1 billion 27 Sep 2016 10:53 UTC

burak wrote: 3b22bcf1dd25a1f8cd61fb2d0ac61027 09 Nov 2016 13:24 UTC