How to crawl the web

26th Jul 2017

I usually work with CMSs and, from time to time, the same old question comes up: what is their market share? Joomla claims about 2%, WordPress something around 27%. Is there a way to get some solid data and settle this issue once and for all?
Well, the answer is simple: let's crawl the web and count how many sites are using a specific technology.
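
As a rough illustration of the counting step, here is a minimal sketch in Python. It assumes the pages have already been downloaded and that a site reveals its CMS through a `<meta name="generator">` tag, which many sites strip, so real numbers would need more fingerprints than this. The names `detect_cms` and `count_cms` and the list of CMS names are hypothetical, chosen only for this example.

```python
from collections import Counter
from html.parser import HTMLParser


class GeneratorParser(HTMLParser):
    """Remembers the content of the first <meta name="generator"> tag."""

    def __init__(self):
        super().__init__()
        self.generator = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.generator is not None:
            return
        attributes = dict(attrs)
        # Attribute values can be None when the attribute has no value.
        if (attributes.get("name") or "").lower() == "generator":
            self.generator = attributes.get("content") or ""


def detect_cms(html):
    """Return a CMS label for one page (assumed fingerprint: generator tag)."""
    parser = GeneratorParser()
    parser.feed(html)
    if parser.generator is None:
        return "unknown"
    generator = parser.generator.lower()
    for cms in ("wordpress", "joomla", "drupal"):  # illustrative list only
        if cms in generator:
            return cms
    return "other"


def count_cms(pages):
    """Count how many of the given HTML documents use each detected CMS."""
    return Counter(detect_cms(html) for html in pages)
```

Feeding `count_cms` the home page of every crawled site would give the raw counts; dividing by the number of pages turns them into shares. Pages without a generator tag end up under `"unknown"`, so that bucket shows how much of the sample this single fingerprint cannot classify.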

Critical information disclosure on

While performing an online assessment, a critical information disclosure was found on The vulnerability has been fixed; this is the full disclosure of the issue.

MongoDB Scraper

Keep door closed at all times

MongoDB is a NoSQL database and it's very handy when you don't want the constraints of a fixed schema.
Sadly, it comes with very insecure default settings: if left untouched, MongoDB will allow connections without any username or password.
According to Shodan, there are more than 60k MongoDB instances freely accessible over the Internet. What if we start to crawl them all?

Hashtag scraper

27th Dec 2016

A better way of scraping

Common wordlists and mask attacks can crack a large number of passwords, but to crack the remaining ones we have to get creative. Passwords are slowly turning into passphrases: several words packed together, as the famous XKCD comic illustrated some time ago.
This means that we have to find a way to guess what people are actually thinking and how they usually combine words.
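
To make the idea of observing how people combine words more concrete, here is a small Python sketch. It assumes the input is a list of already collected text posts and that multi-word hashtags written in CamelCase (or mixing letters and digits) are a usable signal; the function names and the splitting rule are hypothetical and not taken from the scraper itself.

```python
import re
from collections import Counter

# A hashtag is "#" followed by letters, digits or underscores.
HASHTAG_RE = re.compile(r"#(\w+)")
# Word boundaries inside a hashtag: acronyms, capitalized or lowercase
# words, and runs of digits.
WORD_RE = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


def extract_hashtags(text):
    """Return every hashtag in the text, without the leading '#'."""
    return HASHTAG_RE.findall(text)


def split_words(hashtag):
    """Split a hashtag such as 'ThrowbackThursday' into lowercase words."""
    return [word.lower() for word in WORD_RE.findall(hashtag)]


def count_phrases(texts):
    """Count how often each multi-word combination appears as a hashtag."""
    counts = Counter()
    for text in texts:
        for tag in extract_hashtags(text):
            words = split_words(tag)
            if len(words) > 1:  # single words are already in wordlists
                counts[" ".join(words)] += 1
    return counts
```

Calling `count_phrases(posts).most_common(20)` would list the word combinations that show up most often in the sample. All-lowercase hashtags are left as a single word here, since splitting them would need a dictionary-based approach.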