Cracking Efficiency Measurements & Common Substring Attack

Posted by Dominic White on 19 April 2018

Categories: Cracking

This was an epic week for password cracking: we had lots of new hashes and plenty of competition to see who could crack the most, the fastest.

BLUF: I put together a cracking technique and tested it against other techniques, generating some insight into which techniques perform best. Rockyou with hob064 rules won, but my technique came a close second and had a faster crack speed. Get the script here.

You can use the technique with a list of common substrings from your own lists (sorry, we can’t share ours), or target it specifically at a dump you’ve been going at to mine more cracks out of it.

Common Substrings

As my eyes blurred over some boring work, I had a thought: “what if we used the most common substrings found in already cracked passwords to crack more?” For example, if users regularly use “companyname” or “!!” in their passwords, this would pull them out.

To this end, I wrote some dirty python. It took 38 minutes to run across one list. Before optimising I thought I should try awk, which is famously good at this sort of processing.

That led me to the kernel of an idea taken from these forums. awk is magic, if hard to understand; I’ll leave understanding it as an exercise for the reader. Needless to say, this is *much* faster than my pythonic attempts.

To use this, dump all the clears you’ve cracked so far to a file, then run the script over that file. It’ll output some stats, such as the percentage and the number of times each substring was seen, sorted by percentage. Just cut on tabs to get the substrings only. Make sure you don’t unique anything: if a dump has lots of the same password repeated, you *want* that to show up as “more common”. If you unique either the hashes or the clears, you’ll lose that.
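To make the idea concrete, here is a minimal Python sketch of substring counting. It is not the awk script referred to above. The function name, the length limits, the column order and the choice to count a substring once per password are all assumptions made for illustration.

```python
from collections import Counter


def common_substrings(passwords, min_len=2, max_len=8):
    """Return (percent, count, substring) rows, most common first.

    Hypothetical helper: min_len/max_len and the row layout are assumptions.
    The input is deliberately NOT de-duplicated, so a password that appears
    many times in the dump pushes its substrings up the ranking.
    """
    counts = Counter()
    total = 0
    for pw in passwords:
        pw = pw.rstrip("\n")
        if not pw:
            continue
        total += 1
        # Collect the distinct substrings of this one password, so that
        # "percent" means "percent of passwords containing the substring".
        seen = set()
        for start in range(len(pw)):
            last_end = min(len(pw), start + max_len)
            for end in range(start + min_len, last_end + 1):
                seen.add(pw[start:end])
        counts.update(seen)
    rows = [(100.0 * n / total, n, sub) for sub, n in counts.items()]
    # Sort by percentage (descending), then alphabetically for stable output.
    rows.sort(key=lambda row: (-row[0], row[2]))
    return rows


# Made-up clears for illustration; in practice read them from your cracked file.
clears = ["acme2018", "acme2018", "Summer!!"]
for pct, count, sub in common_substrings(clears)[:5]:
    print(f"{pct:.1f}\t{count}\t{sub}")
```

The output is tab-separated so the substring column can be cut out, in the same spirit as the script described above. Expect this to be much slower than awk on large lists.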

I then used this to generate a list of common substrings specific to various password dumps, and managed to crack a whole lot more that I hadn’t cracked before. I used hashcat’s -a1 combinator attack mode with the substrings as the rightmost list and other password lists as the left. I’d run it twice: once with -jc (i.e. capitalise the first letter of the left word) and then again without.
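As a rough illustration of what those two runs generate, the sketch below builds the same kind of candidates in Python. It does not invoke hashcat and is only a model of the candidate space. The function name and sample words are hypothetical, and the capitalisation step approximates the c rule (upper-case the first character, lower-case the rest).

```python
def combinator(left_words, right_words, capitalise_left=False):
    """Yield left+right candidates, as a combinator-style attack would try.

    Hypothetical helper for illustration only; hashcat does this on the GPU.
    """
    for left in left_words:
        if capitalise_left:
            # Approximation of the 'c' rule applied to the left-hand word.
            left = left[:1].upper() + left[1:].lower()
        for right in right_words:
            yield left + right


# Made-up sample data: base words on the left, common substrings on the right.
names = ["alice", "bob"]
substrings = ["!!", "2018"]

with_cap = list(combinator(names, substrings, capitalise_left=True))
without_cap = list(combinator(names, substrings))
print(with_cap)
print(without_cap)
```

The candidate count is the product of the two list sizes, which is why a short substring list on the right keeps the attack fast.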

I then took the most common substrings (everything >= 1%) by percentage from various dumps, and combined those to form a short super list of common substrings.
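A minimal sketch of that merge step, assuming each dump’s stats are available as (percent, count, substring) rows. The row layout, function name and sample values are assumptions for illustration, not the format of the actual script’s output.

```python
def super_list(per_dump_rows, threshold=1.0):
    """Merge substrings at or above `threshold` percent from several dumps.

    Hypothetical helper. De-duplicating here is fine: this is the final
    wordlist, not the cracked input that the counts were computed from.
    """
    merged = []
    seen = set()
    for rows in per_dump_rows:
        for pct, _count, sub in rows:
            if pct >= threshold and sub not in seen:
                seen.add(sub)
                merged.append(sub)
    return merged


# Made-up stats for two dumps.
dump_a = [(5.0, 50, "!!"), (0.5, 5, "xyz")]
dump_b = [(2.0, 10, "2018"), (1.0, 5, "!!")]
print("\n".join(super_list([dump_a, dump_b])))
```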

It looked like it was working well, but I wanted to see how it compared to other techniques.

Efficiency Measurements

It’s fine to say “something worked well”, but what does that actually mean? Well, stand back, I’m going to try science!

I ran 88 different tests on my laptop (kept constant), trying different techniques against different sets of hashes to see what worked best. For each one I’d clear the potfile, run the test, then make a note of the time it took, the H/s, the number of hashes cracked and the percentage of the total that constituted.
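If you want to repeat this kind of bookkeeping without a spreadsheet, a small record type is enough. The sketch below is one possible layout. The class, its field names and the numbers in the example are hypothetical and are not taken from the actual results.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Run:
    """One test run; field names are assumptions for illustration."""
    label: str
    total_hashes: int
    cracked: int
    seconds: float
    hashes_per_second: float

    @property
    def percent_cracked(self):
        # Percentage of the hash list recovered in this run.
        return 100.0 * self.cracked / self.total_hashes


def average_percent(runs):
    """Mean percentage cracked across runs of the same technique."""
    return mean(run.percent_cracked for run in runs)


# Made-up numbers, purely to show the calculation.
runs = [
    Run("technique-1 vs set a", 1000, 120, 5.0, 2.0e8),
    Run("technique-1 vs set b", 2000, 100, 6.0, 1.8e8),
]
print(round(average_percent(runs), 1))
```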

The experiments combined several things:

Four different sets of hashes from projects. Two were part of the substring creation list (a & b) and two weren’t (x & y).
Two base password lists: rockyou and facebook-firstnames (and, in one case, our private lists as a comparison).
Three rule sets: best64, hob064 and InsidePro-PasswordsPro. I chose them as they were fast, and this experiment needed speed to scale the tests.
Where appropriate, I tried with and without -jc (i.e. uppercase first char). This was mostly for the substring tests.

If you want the raw results, my Excel calcs are here.

The overall result was that a rules-based approach with hob064 and rockyou was the most efficient. It featured in the top 4 for each password list, cracking on average 9.4% (ranging between 4.5% and 18.2%) of the respective hash lists in 4-6s (your speed may vary). The second most effective was facebook-firstnames with my substring list and uppercasing the first letter (i.e. -jc). This cracked on average 9.8% (ranging between 6.2% and 12.2%) of the passwords in the respective dumps in 7-8s. The next best technique (facebook-firstnames with best64) only averaged 3%, and it only did well against one password list, which skewed its results. However, the substring attack had a significantly higher H/s on average than the rules-based attack, which may give it an edge. To put this in a table:

Approach	Average % Cracked	Average time (s)	Average MH/s
rockyou rules hob064	9.4%	5s	190.25MH/s
fb-firstnames substrings -jc	9.8%	7.3s	914.35MH/s

I did a brief test of our private wordlists against one set of hashes. Those lists outperformed both rockyou and facebook-firstnames in effectiveness. So it makes sense to develop your own for your specific use cases. The first list with hob064 rules did 15% of the hashes in 2s, and the second list with my substrings and -jc did 13% in 2s.

I also did a quick check of a mask attack out of interest. Using facebook-firstnames and -jc, it took 37s to get 6% of the passwords.

Finally, I checked what the overlap between the rules-based approach and the substring approaches was (i.e. are they finding the same passwords or different ones). This was less good: on average there was only a 4.5% non-overlap between the rules and substring approaches. I suspect this has a lot to do with the wordlists.
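One way to run that kind of check yourself is to compare the sets of hashes each approach cracked. The sketch below assumes “non-overlap” means the share of all cracked hashes found by only one of the two approaches. That definition is an assumption and may differ from how the 4.5% figure above was calculated, and the function name and sample data are hypothetical.

```python
def non_overlap_percent(cracked_a, cracked_b):
    """Percent of all cracked hashes that only one approach found.

    Assumed definition: |A xor B| / |A or B| * 100.
    """
    a, b = set(cracked_a), set(cracked_b)
    union = a | b
    if not union:
        return 0.0
    return 100.0 * len(a ^ b) / len(union)


# Made-up hash identifiers standing in for two potfiles.
rules_cracked = {"h1", "h2", "h3"}
substring_cracked = {"h2", "h3", "h4"}
print(non_overlap_percent(rules_cracked, substring_cracked))
```

A low value means the two approaches mostly recover the same passwords, so running both adds little. A high value means they complement each other.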
