Phrases Generator

13th Jan 2017

A.k.a. The Supercalifragilisti script

Cracking 50% of a password list is easy.
Reaching 60% is nice.
Achieving 70% requires some work (or patience).
Getting beyond that needs some creative thinking.

The remaining passwords are harder to guess, most likely because they are very long or have a crazy amount of entropy. In the latter case there's nothing we can do: we have to rely on bruteforcing, or get a good wordlist and start generating random rules.
However, we know that users tend to create passphrases: we can scrape Twitter to find interesting word combinations, but sometimes people simply use quotes from their favorite book.

This is why I created the Phrases script.

The script

Phrases does a very simple thing: given a text, it slides a "window" of several words over it, writing each window as a line of a new file.
For example, given the first few lines of the Bible, you get the following result:

In the beginning God created the heaven and the earth.
And the earth was without form, and void; and darkness was upon the face of the deep. And the Spirit of God moved upon the face of the waters.

In the beginning God
the beginning God created
beginning God created the
God created the heaven
created the heaven and
the heaven and the
heaven and the earth
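
The windowing idea can be sketched in a few lines of Python. This is only a minimal illustration and not the actual code of phrases.py: the function name sliding_phrases is made up for the example, and I'm assuming that words are split on whitespace and that surrounding punctuation is dropped (which is what the sample output above suggests).

```python
import string


def sliding_phrases(text, words=4):
    """Yield every run of `words` consecutive words found in `text`."""
    # Assumption: split on whitespace, strip punctuation around each word
    tokens = [token.strip(string.punctuation) for token in text.split()]
    # Discard tokens that were made of punctuation only
    tokens = [token for token in tokens if token]
    # Move the window one word at a time
    for start in range(len(tokens) - words + 1):
        yield " ".join(tokens[start:start + words])


sample = "In the beginning God created the heaven and the earth."
phrases = list(sliding_phrases(sample))
for phrase in phrases:
    print(phrase)
```

With the default window of 4 words, the first verse gives the 7 rows listed above.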

Usage

This is a very light script, with no external library requirements.
You can simply invoke the script with:

python phrases.py [OPTIONS] original_file.txt

Options

Phrases Generator has some options for fine tuning:

-o OUTFILE --outfile OUTFILE  Output file (defaults to phrases.txt)
-w WORDS   --words   WORDS    Number of words in each row (defaults to 4)
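
For reference, a command line like this one could be declared with argparse from the standard library. This is a sketch built only from the options listed above, not the script's real source: the positional name infile, the help strings and the bible.txt file name are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (hypothetical sketch)."""
    parser = argparse.ArgumentParser(prog="phrases.py")
    # Positional argument: the text to read (the name `infile` is an assumption)
    parser.add_argument("infile", help="Original text file")
    parser.add_argument("-o", "--outfile", default="phrases.txt",
                        help="Output file")
    parser.add_argument("-w", "--words", type=int, default=4,
                        help="Number of words in each row")
    return parser


# Same as running: python phrases.py -w 5 bible.txt
args = build_parser().parse_args(["-w", "5", "bible.txt"])
print(args.infile, args.outfile, args.words)
```

Since -o is not passed, the output file falls back to phrases.txt.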

Let's see it in action

Honestly, I wrote this script just for fun, to check if my hunch was correct. I created 4 different wordlists based on the Bible, using windows of 4 to 7 words; then I ran Hashcat against a very large password collection (thank you LinkedIn).
I was baffled when I checked the recovered list (first column is the length):

25 withhisstripeswearehealed  
23 amnotashamedofthegospel  
22 rejoiceinthelordalways  
21 wonderswithoutnumbers  
20 theearthisthelords11  
20 lordofheavenandearth  
20 fearnotiwillhelpthee  
20 andthewordwaswithgod  
19 onehourwiththebeast  
19 hardennotyourhearts

This is just a small snippet of the final output, but as you can see I was able to recover passwords longer than 20 chars!

Tips and Tricks

Phrases are written without any modification, keeping the original spaces between words. This is useful if you want to apply some Hashcat rules.
Here are some basic rules that I used in my tests:

# Lowercase everything and remove spaces
l@ 
# Uppercase everything and remove spaces
u@ 
# Capitalize the first letter of every word and remove all spaces
E@ 

Please note: all the rule lines end with a space. Sadly, this is hard to spot unless you copy/paste the rules.
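
To preview what these three rules do to a phrase without firing up Hashcat, here is a rough Python emulation. It is just an approximation for illustration, and the function names are made up. It assumes that l and u change the case of the whole line, that E lowercases the line and then uppercases the first letter and every letter following a space (as described in the Hashcat rule documentation), and that "@ " purges every space.

```python
def rule_lower(phrase):
    """Emulate 'l@ ': lowercase everything, then remove every space."""
    return phrase.lower().replace(" ", "")


def rule_upper(phrase):
    """Emulate 'u@ ': uppercase everything, then remove every space."""
    return phrase.upper().replace(" ", "")


def rule_title(phrase):
    """Emulate 'E@ ': title-case using spaces as separators, then remove them."""
    chars = list(phrase.lower())
    for index in range(len(chars)):
        # Uppercase the first character and any character right after a space
        if index == 0 or chars[index - 1] == " ":
            chars[index] = chars[index].upper()
    return "".join(chars).replace(" ", "")


sample = "with his stripes we are healed"
print(rule_lower(sample))  # withhisstripeswearehealed
print(rule_upper(sample))  # WITHHISSTRIPESWEAREHEALED
print(rule_title(sample))  # WithHisStripesWeAreHealed
```

The first result is the 25 chars password at the top of the recovered list.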

Conclusions

Obviously this shouldn't be your first option when trying to crack a password list: this is something you should use when you're pretty close to throwing in the towel and giving up (or you're really willing to start bruteforcing 16+ chars...).
As usual, you should choose the right text for the right scenario: running the Bible wordlists against the Ashley Madison dump produced no results ;)
