Recently, Netflix’s ‘The Great Hack’ has sparked a lot of interest around data privacy. The documentary covers Cambridge Analytica and how it abused users’ data so that it could push an “agenda” for the highest bidder. If you haven’t seen it, the level of manipulation is insane: https://www.youtube.com/watch?v=omc-5zj70M0

All this was possible because they had immense data mining capabilities. With access to all this information, they could profile users and target them based on that profile. The film focuses a lot on the company’s ties with Facebook. What it doesn’t highlight is that it’s not just big companies such as Cambridge Analytica that can do this level of data mining.

Nowadays, anyone can download tools that scan the web and pull data from open data sources: things like social media sites and publicly available documentation and records. These tools and services fall under what’s known as OSINT.

OSINT means open-source intelligence and is data collected from publicly available sources.

OSINT tools and services can help you collect information about anybody with an internet presence. Now this might not be the private data that Cambridge Analytica used, but you have to remember, not everything is private. Even if it is, there are ways to gather it. One way is to simply ask for it. I’m guessing that you have definitely seen a login screen like this before:

Now the method on the left is simple. You set up an account with an email address and a password. This account is only relevant to that site. The method on the right, however, is “so much easier” as it’s all done in one place and you never have to remember your credentials. The problem is, you are most likely granting these services access to your social networking profiles. Some go even further and can write posts on your behalf.

If you grant these services access, they can legally pull your data and do as they please. Remember, not every application developer is a public-facing company or shares your ethics. It might just be a developer wanting to harvest data to make easy money. It doesn’t stop with social media networks though. This also occurs with any application you install on your laptop, tablet or phone. The application will ask you for permissions before it installs. The question is, do you check?

Don’t give yourself too much of a hard time though, these companies are good at what they do. They aim to be the most popular app, so the ‘exclusion effect’ sinks in. After all, you don’t want to be left out. Especially if celebrities are using it! If they use it, it can’t be harmful.

Once you have the most popular app, people will fall in line and download it. If all your friends have it, peer pressure starts to kick in. This is what they try to create in order to reach the top.

If you read the terms and conditions for FaceApp, you see what you are agreeing to:

You are pretty much giving them the right to use your photos and name for commercial purposes. They are not the first though. There are literally thousands of services and applications doing this. They can then use or distribute this data however they see fit.

This loops us back to OSINT. Most often they will distribute the data privately, as this makes the most money. Sometimes though, it lands on a public data source. This can then be harvested by you and me.

Let’s go back to OSINT. The OSINT Framework site below gives you an interactive diagram which you can use to identify OSINT services and tools. These tools and services can help you gather specific data.

Interactive OSINT Framework: https://osintframework.com/

Similar to this, Jigsaw Security has created the “Awesome-OSINT” list which, again, highlights tools that can be used for OSINT: https://github.com/jigsawsecurity/awesome-osint

So, where do we start?

Well, it depends on whether you just want to play around or seriously investigate someone. If it’s the latter, you must get into the right frame of mind. You will need to know a bit about reconnaissance, enumeration and how data is connected. What you are trying to create is sort of like a data web. Think of it like an investigation board with the red string.

Let’s start with a simple username you find on Facebook. You use Sherlock, and this gives you 9 more social media sites. On one of these sites, you find an email address. This email address links the user to a few services you didn’t know the target used. On one of these sites there is a phone number. You use this phone number to identify other usernames that the target uses, and so on. Each speck of data brings you closer to another. This process can take some time but if you follow each thread, it will pay off.
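If you like to see ideas as code, the “investigation board” can be sketched as a small graph. This is only an illustration of the mindset, not something any of the tools below ship with. The DataWeb class and every identifier in it are made up for the example.

```python
from collections import defaultdict, deque


class DataWeb:
    """Undirected graph of identifiers (usernames, emails, phone numbers)."""

    def __init__(self):
        self.links = defaultdict(set)

    def link(self, a, b):
        """Record that identifier a led you to identifier b (the red string)."""
        self.links[a].add(b)
        self.links[b].add(a)

    def trail(self, start, goal):
        """Return the shortest chain of identifiers from start to goal, or None."""
        queue = deque([[start]])
        seen = {start}
        while queue:
            path = queue.popleft()
            if path[-1] == goal:
                return path
            for nxt in sorted(self.links[path[-1]]):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(path + [nxt])
        return None


web = DataWeb()
# All identifiers below are invented for illustration.
web.link("username:test123", "site:example-forum")
web.link("site:example-forum", "email:test123@example.com")
web.link("email:test123@example.com", "site:example-shop")
web.link("site:example-shop", "phone:+00 0000 000000")
web.link("phone:+00 0000 000000", "username:test123_alt")

# Show how the first username connects to the second one.
print(" -> ".join(web.trail("username:test123", "username:test123_alt")))
```

Recording every link as you go means you can always explain how you got from one piece of data to another, which is the whole point of the red string.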

Now we know the mindset, let’s look at a few tools.

ThreatCrowd

ThreatCrowd is a web-based service that works along the same lines as Maltego.

If you haven’t heard of Maltego, it’s probably one of the most well-known OSINT tools out there. If you are running Parrot or Kali, Maltego comes preinstalled. What ThreatCrowd can do is allow you to start creating your web. If you have an email address, you can use it to identify links. If your target is a company, you can search based on IP, domain or organisation name.

Sherlock

Sherlock is a tool which allows you to query the internet, searching for matching usernames. Say for example a friend’s username is test123, you could use Sherlock to see if they have created other profiles elsewhere.

To install:

git clone https://github.com/sherlock-project/sherlock.git

cd sherlock

pip3 install -r requirements.txt

If you want to run a quick search based on a username, run the following:

python3 sherlock.py [username]

As you can see below, it’s found several sites which have the username test123:

Some of these may belong to a different user and, unfortunately, you will have to work that out for yourself.
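To give a rough idea of what a username search does, here is a minimal sketch. It is not Sherlock’s actual code. The site names and URL patterns are hypothetical, and the HTTP request is replaced with a fake function so the example runs offline.

```python
# Hypothetical URL patterns; real tools ship with a much longer list.
SITES = {
    "ExampleForum": "https://forum.example.com/users/{}",
    "ExamplePhotos": "https://photos.example.com/{}",
    "ExampleCode": "https://code.example.com/{}",
}


def find_profiles(username, get_status, sites=SITES):
    """Return {site: url} for every site whose profile URL answers with 200."""
    found = {}
    for site, pattern in sites.items():
        url = pattern.format(username)
        if get_status(url) == 200:
            found[site] = url
    return found


def fake_get_status(url):
    """Stand-in for a real HTTP request (assumption: 200 means the page exists)."""
    existing = {
        "https://forum.example.com/users/test123",
        "https://code.example.com/test123",
    }
    return 200 if url in existing else 404


print(find_profiles("test123", fake_get_status))
```

A hit only tells you that somebody has registered that name on the site. It says nothing about whether it is the same person, which is why the manual check still matters.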

Userrecon-py

Userrecon-py is similar to Sherlock but uses different data sources. Say your target has their Instagram username as johnroe34. Tools like userrecon-py can be used to see if johnroe34 is used elsewhere. As humans, we often stick to what we know. If you look at your own usernames, I guarantee you have used at least one of them elsewhere.

To install:

git clone https://github.com/decoxviii/userrecon-py.git ; cd userrecon-py

sudo -H pip3 install -r requirements.txt

python3 setup.py build

sudo python3 setup.py install

Say our target is Batman and we have identified that he uses the public facing profile of “brucewayne”. We can query social networking sites by using the following:

userrecon-py --target brucewayne

Now the process can be lengthy as you will need to go through each one, but the plus side is that you may identify a hidden profile. This will help you collect more information. You are also trying to identify any other usernames they may use. Maybe on his Instagram account he mentions that he’s moved to ‘Brucewayneisnotbatman’. Once you have information like this, the process starts again. You may come across a barrier if your target has set their Facebook, Instagram and LinkedIn profiles to private. This gives you little to no information. What they may have forgotten about is their Myspace profile, which contains a whole bunch of personal information. This is why you run tools like this.

Sublist3r

Sublist3r is another great tool which can be used for OSINT. Sublist3r helps identify subdomains, which can help to identify hidden assets. DNS recon is key if your target is an organisation. On the face of it, you may see only a few domains. What you might not see is a bunch of URLs that are externally accessible and have DNS records. Tools like Sublist3r can help you identify these.

To install:

git clone https://github.com/aboul3la/Sublist3r.git

cd Sublist3r

sudo pip install -r requirements.txt
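To show why DNS records give hidden assets away, here is a simple wordlist sketch. It is not how Sublist3r itself is implemented, just the most basic form of the idea: guess a name and see if it resolves. The word list and the fake zone used for the offline demonstration are invented.

```python
import socket


def resolve(hostname):
    """Return the IPv4 address for hostname, or None if it does not resolve."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        return None


def find_subdomains(domain, words, resolver=resolve):
    """Try each word as a subdomain and keep the ones that resolve."""
    found = {}
    for word in words:
        host = f"{word}.{domain}"
        address = resolver(host)
        if address is not None:
            found[host] = address
    return found


# Offline demonstration with a made-up zone (addresses from a documentation range).
FAKE_ZONE = {"www.example.com": "192.0.2.10", "vpn.example.com": "192.0.2.20"}
print(find_subdomains("example.com", ["www", "mail", "vpn", "dev"], FAKE_ZONE.get))
```

Leaving out the third argument makes find_subdomains use real DNS lookups instead of the fake zone. Only do that against domains you are allowed to test.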

WeebDNS

DNS enumeration is very important. WeebDNS is another tool which can help you identify DNS records that may link to a service or site.

To install:

git clone https://github.com/WeebSec/weebdns.git

cd weebdns

sudo pip3 install -r requirements.txt

python3 weebdns.py

As you can see, MX records can identify services used. This can be very helpful if you are pen-testing the company.
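As a rough illustration of how an MX record points to a service, the sketch below matches MX hostnames against a few well-known suffixes. The list is tiny and only a starting point, the function name is my own, and the sample records are invented.

```python
# A few well-known MX hostname suffixes and the mail service they usually indicate.
MX_HINTS = {
    ".google.com": "Google Workspace / Gmail",
    ".googlemail.com": "Google Workspace / Gmail",
    ".mail.protection.outlook.com": "Microsoft 365 (Exchange Online)",
    ".pphosted.com": "Proofpoint",
    ".mimecast.com": "Mimecast",
}


def guess_mail_service(mx_host):
    """Guess the mail provider from an MX hostname, based on its suffix."""
    host = mx_host.lower().rstrip(".")  # DNS tools often print a trailing dot
    for suffix, service in MX_HINTS.items():
        if host.endswith(suffix):
            return service
    return "unknown / self-hosted"


# Invented sample records, as a DNS lookup tool might print them.
for record in [
    "aspmx.l.google.com.",
    "example-com.mail.protection.outlook.com",
    "mail.example.com",
]:
    print(record, "->", guess_mail_service(record))
```

Knowing which mail or filtering service a company uses tells you a lot about what its staff see in their inbox, which is why this matters on a pen-test.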

Google Dorks

Your friendly neighbourhood search engine can also be used. I’ve already covered this, so here is a handy link: https://ctrlaltdel.blog/2019/05/02/how-to-hack-with-google-dorks/

Osmedeus

Osmedeus is an “all in one” service. It’s perfect if your target is an organisation as it includes a lot of directory and domain enumeration tools, including the ones mentioned above. Here you can start with a domain and expand on your web.

To install:

git clone https://github.com/j3ssie/Osmedeus

cd Osmedeus

./install.sh

Here you can find hidden or public domains:

It will also scan the sites it finds for vulnerabilities, so be careful when running a scan.

Osmedeus will spit out heaps of information, so it might be better to view the results in your browser. Osmedeus will spin up a web server on port 5000, so if you navigate to https://127.0.0.1:5000, you will be able to log in.

To find the credentials, you can check the config.conf file for the password. It is often in core/config.conf. If you can’t find them, run Osmedeus with the -c option and set the location of the config file.

Once in, it will look like this.

SpiderFoot

If your target is a user and you want to stick with an ‘all in one’ tool, SpiderFoot is for you.

To install:

git clone https://github.com/smicallef/spiderfoot.git

cd spiderfoot

pip install -r requirements.txt

More options: https://www.spiderfoot.net/download/

Once installed, launch it and SpiderFoot will present a link. SpiderFoot works in the browser and can be reached using the port below:

Once you load this up, you will see the scanning page:

Here you can search for a whole bunch of values. It also gives you the option to modify the scanning method. Some sites may alert their users to the search, so you can choose to use ‘Passive mode’. You might not get all the information though, so you have to weigh up the pros and cons.

SpiderFoot works by utilizing APIs for a bunch of OSINT data sources. Not all sources allow you to access them freely, so you may need to create an account. This is definitely worth doing as it will strengthen your “investigation”.

Every service with a padlock under ‘Settings’ requires an API key. To obtain one, you will need an account with the service. Some are free, and some are pay to use.

For example, Hunter.io (Free).

Once you have an account and API key, you can simply copy and paste it into SpiderFoot. Once you are happy with your modules, you can begin your scan.

SpiderFoot will then query its data sources and return matching results. This can then be viewed under the ‘Browse’ section.

Things like social media profiles, usernames and email addresses can all be found:

It will also search the dark web and HIBP (Have I Been Pwned) to see if they are mentioned or part of a leak.

It will do some of the matching and similarity checks for you, but it is worth following the threads yourself.

If you’ve read this and you’re worried, don’t be. You can still use these trending applications and services, you just need to be careful, that’s all. If you are using “Free” editing software, don’t use anything personal or sensitive. You are most likely giving them rights to use the photo that you upload. Just be cautious about what you are sharing and who you are sharing it with.

If you are using your social media accounts to log in elsewhere, check what permissions you are granting.

Also, make your profiles private. Most of your social media profiles don’t need to be publicly available to all. Remember, if they are, people can export your information and photos.

LinkedIn is a tricky one as it’s essentially for selling yourself to gain a job. Again, you have to weigh up the pros and cons and act accordingly.

You also have to be careful who you accept on these sites. Because of the new data privacy controls and laws, data miners will have to add you as a “Friend” or “Connection” in order to harvest data from private profiles.

Once you have things like this locked down, you will be in a better position. Data is accessible to everyone and data mining has been going on for years. You can’t take back what they already have, but you can enable the security controls above to stop them from accessing any more.