Category Archives: osint

Ep. 114 – Finding Love with Whitney Merrill

What do you get when you mix a lawyer, crypto junkie and a romantic together? Well, none other than our guest for this month, Whitney Merrill. – Feb 11, 2019

Get Involved

Got a great idea for an upcoming podcast? Send us a quick message on the contact form! Enjoy the outro music? Thanks to Clutch for allowing us to use Son of Virginia as our new SEPodcast theme music. Check out the schedule for all our training at Social-Engineer.Com, and check out the Innocent Lives Foundation to help unmask online child predators.

The post Ep. 114 – Finding Love with Whitney Merrill appeared first on Security Through Education.

Historical OSINT – A Portfolio of Fake Tech Support Scam Domains – An Analysis

The Rise of Tech Support Scams? You wish. The general availability of Tech Support Scams can be attributed to the increasing standardization of social-engineering-driven fraudulent and rogue scams, which in turn can largely be attributed to the availability of fraudulent affiliate-network revenue-sharing schemes. Keep reading. What can be best described as today's modern

Exposing Iran’s Most Wanted Cybercriminals – FBI Most Wanted Checklist – OSINT Analysis

Remember my most recently published "Assessing The Computer Network Operation (CNO) Capabilities of the Islamic Republic of Iran - Report"? The report details and discusses in-depth the most prolific Iran-based government-sponsored and tolerated hacking groups including the following groups: - Ashiyane Digital Security Team - Iranhack Security Team - Iranian Datacoders Security Team - Iran

Who’s Behind BakaSoftware? – OSINT Analysis

Remember BakaSoftware? The ubiquitous scareware-serving and distributing money laundering scareware affiliate-based network circa 2008? It appears that the time has come to expose the actual individuals behind the campaign and the actual network. In this analysis I'll discuss in depth the BakaSoftware franchise circa 2008 including in-depth and personally identifiable information on the

Cyber Security Project Investment Proposal – DIA Needipedia – Fight Cybercrime and Cyber Jihad With Sensors – Grab Your Copy Today!

Dear blog readers, I decided to share with everyone a currently pending project investment proposal regarding the upcoming launch of a proprietary Technical Collection analysis platform. The project proposal draft is available on request as part of DIA's Needipedia Project Proposal Investment draft, or eventually through the Smith Richardson Foundation. In case you're interested in working with me

How To Locate Domains Spoofing Campaigns (Using Google Dorks) #Midterms2018

The government accounts of US Senator Claire McCaskill (and her staff) were targeted in 2017 by APT28 A.K.A. “Fancy Bear” according to an article published by The Daily Beast on July 26th. Senator McCaskill has since confirmed the details.

And many of the subsequent (non-technical) articles that have been published have focused almost exclusively on the fact that McCaskill is running for re-election in 2018. But is it really conclusive that this hacking attempt was about the 2018 midterms? After all, Senator McCaskill is the top-ranking Democrat on the Homeland Security & Governmental Affairs Committee and also sits on the Armed Services Committee. Perhaps she and her staffers were instead targeted for insights into ongoing Senate investigations?

Senator Claire McCaskill's Committee Assignments

Because if you want to target an election campaign, you should target the candidate’s campaign server, not their government accounts. (Elected officials cannot use government accounts/resources for their personal campaigns.) In the case of Senator McCaskill, the campaign server is: clairemccaskill.com.

Which appears to be a WordPress site.

clairemccaskill.com/robots.txt

Running on an Apache server.

clairemccaskill.com Apache error log

And it has various e-mail addresses associated with it.

clairemccaskill.com email addresses

That looks interesting, right? So… let’s do some Google dorking!

Searching for “clairemccaskill.com” in URLs while discarding the actual site yielded a few pages of results.

Google dork: inurl:clairemccaskill.com -site:clairemccaskill.com
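If you want to repeat this search for a whole list of campaign domains, the dork can be generated rather than typed. This is only a minimal sketch: the helper names build_dork and dork_url are made up for illustration, and the search URL assumes Google's usual endpoint with a q parameter.

```python
from urllib.parse import urlencode


def build_dork(domain):
    # Match the domain anywhere in a URL, but exclude results hosted on the real site
    return "inurl:{0} -site:{0}".format(domain)


def dork_url(domain):
    # Assumption: Google's standard search endpoint; open the result in a browser
    return "https://www.google.com/search?" + urlencode({"q": build_dork(domain)})


print(build_dork("clairemccaskill.com"))
print(dork_url("clairemccaskill.com"))
```

The sketch only builds the query; paging through the results is still a manual job in the browser.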

And on page two of those results, this…

clairemccaskill.com.de

Definitely suspicious.

What is com.de? It’s a domain on the .de TLD (not a TLD itself).

.com.de

Okay, so… what other interesting domains associated with com.de are there to discover?

How about additional US Senators up for re-election such as Florida Senator Bill Nelson? Yep.

nelsonforsenate.com.de

Senator Bob Casey? Yep.

bobcasey.com.de

And Senator Sheldon Whitehouse? Yep.

whitehouseforsenate.com.de

But that’s not all. Democrats aren’t the only ones being spoofed.

Iowa Senate Republicans.

iowasenaterepublicans.com.de

And “Senate Conservatives”.

senateconservatives.com.de
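All of the hostnames above share one trait: a real-looking domain with extra labels appended. A rough way to flag that pattern in a list of candidate hostnames is sketched below. The function names are hypothetical, and the list of legitimate domains is an illustrative assumption that you would need to supply yourself.

```python
def is_suffix_spoof(candidate, legit):
    # True when the candidate is the legitimate domain with extra labels appended
    candidate = candidate.lower().rstrip(".")
    legit = legit.lower().rstrip(".")
    return candidate != legit and candidate.startswith(legit + ".")


def flag_spoofs(candidates, legit_domains):
    # Pair every suspicious hostname with the real domain it imitates
    return [
        (candidate, legit)
        for candidate in candidates
        for legit in legit_domains
        if is_suffix_spoof(candidate, legit)
    ]


# Illustrative inputs only
hits = flag_spoofs(
    ["clairemccaskill.com.de", "bobcasey.com.de", "example.org"],
    ["clairemccaskill.com", "bobcasey.com"],
)
print(hits)
```

This is a string comparison only; it says nothing about who registered a domain or what it is used for.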

Hmm. Well, while we are no closer to knowing whether or not Senator McCaskill’s government accounts were actually targeted because of the midterm elections, the domains shown above are definitely shady AF. And that is enough to give cause for concern that the 2018 midterms are indeed being targeted, by somebody.

(Our research continues.)

Meanwhile, the FBI might want to get in touch with the owners of com.de.

How To Get Twitter Follower Data Using Python And Tweepy

In January 2018, I wrote a couple of blog posts outlining some analysis I’d performed on followers of popular Finnish Twitter profiles. A few people asked that I share the tools used to perform that research. Today, I’ll share a tool similar to the one I used to conduct that research, and at the same time, illustrate how to obtain data about a Twitter account’s followers.

This tool uses Tweepy to connect to the Twitter API. In order to enumerate a target account’s followers, I like to start by using Tweepy’s followers_ids() function to get a list of Twitter ids of accounts that are following the target account. This call completes in a single query, and gives us a list of Twitter ids that can be saved for later use (since both screen_name and name can be changed, but the account’s id never changes). Once I’ve obtained a list of Twitter ids, I can use Tweepy’s lookup_users(user_ids=batch) to obtain Twitter User objects for each Twitter id. As far as I know, this isn’t exactly the documented way of obtaining this data, but it suits my needs. /shrug

Once a full set of Twitter User objects has been obtained, we can perform analysis on it. In the following tool, I chose to look at the account age and friends_count of each account returned, print a summary, and save a summarized form of each account’s details as json, for potential further processing. Here’s the full code:

from tweepy import OAuthHandler
from tweepy import API
from collections import Counter
from datetime import datetime
import sys
import json
import os
import io
import re
import time

# Helper functions to load and save intermediate steps
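def save_json(variable, filename):
    with io.open(filename, "w", encoding="utf-8") as f:
        data = json.dumps(variable, indent=4, ensure_ascii=False)
        # On Python 2, json.dumps may return a byte string; io.open needs text
        if isinstance(data, bytes):
            data = data.decode("utf-8")
        f.write(data)

def load_json(filename):
    ret = None
    if os.path.exists(filename):
        try:
            with io.open(filename, "r", encoding="utf-8") as f:
                ret = json.load(f)
        except (IOError, ValueError):
            # Unreadable or malformed file: fall through and return None
            pass
    return ret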

def try_load_or_process(filename, processor_fn, function_arg):
    # Only JSON caches are supported, since save_json/load_json are the only helpers defined
    if not filename.endswith("json"):
        raise ValueError("Unsupported cache file type: " + filename)
    if os.path.exists(filename):
        print("Loading " + filename)
        return load_json(filename)
    else:
        ret = processor_fn(function_arg)
        print("Saving " + filename)
        save_json(ret, filename)
        return ret

# Some helper functions to convert between different time formats and perform date calculations
def twitter_time_to_object(time_string):
    twitter_format = "%a %b %d %H:%M:%S %Y"
    match_expression = r"^(.+)\s(\+[0-9][0-9][0-9][0-9])\s([0-9][0-9][0-9][0-9])$"
    match = re.search(match_expression, time_string)
    if match is not None:
        first_bit = match.group(1)
        second_bit = match.group(2)
        last_bit = match.group(3)
        new_string = first_bit + " " + last_bit
        date_object = datetime.strptime(new_string, twitter_format)
        return date_object

def time_object_to_unix(time_object):
    # Twitter timestamps are UTC, so treat the naive datetime as UTC seconds since the epoch
    return int((time_object - datetime(1970, 1, 1)).total_seconds())

def twitter_time_to_unix(time_string):
    return time_object_to_unix(twitter_time_to_object(time_string))

def seconds_since_twitter_time(time_string):
    input_time_unix = int(twitter_time_to_unix(time_string))
    current_time_unix = int(get_utc_unix_time())
    return current_time_unix - input_time_unix

def get_utc_unix_time():
    # time.time() is already expressed in UTC seconds since the epoch
    return time.time()

# Get a list of follower ids for the target account
def get_follower_ids(target):
    return auth_api.followers_ids(target)

# Twitter API allows us to batch query 100 accounts at a time
# So we'll create batches of 100 follower ids and gather Twitter User objects for each batch
def get_user_objects(follower_ids):
    batch_len = 100
    # Round up so a final partial batch is counted
    num_batches = (len(follower_ids) + batch_len - 1) // batch_len
    batches = (follower_ids[i:i+batch_len] for i in range(0, len(follower_ids), batch_len))
    all_data = []
    for batch_count, batch in enumerate(batches, 1):
        sys.stdout.write("\r")
        sys.stdout.flush()
        sys.stdout.write("Fetching batch: " + str(batch_count) + "/" + str(num_batches))
        sys.stdout.flush()
        users_list = auth_api.lookup_users(user_ids=batch)
        # Keep the raw JSON dictionary behind each Tweepy User object
        users_json = [user._json for user in users_list]
        all_data += users_json
    return all_data

# Creates one week length ranges and finds items that fit into those range boundaries
def make_ranges(user_data, num_ranges=20):
    range_max = 604800 * num_ranges  # 604800 seconds = one week
    range_step = range_max // num_ranges

    # We create ranges and labels first and then iterate these when going through the whole list
    # of user data, to speed things up
    ranges = {}
    labels = {}
    for x in range(num_ranges):
        start_range = x * range_step
        end_range = x * range_step + range_step
        label = "%02d" % x + " - " + "%02d" % (x+1) + " weeks"
        labels[label] = []
        ranges[label] = {}
        ranges[label]["start"] = start_range
        ranges[label]["end"] = end_range
    for user in user_data:
        if "created_at" in user:
            account_age = seconds_since_twitter_time(user["created_at"])
            for label, timestamps in ranges.items():
                if timestamps["start"] <= account_age < timestamps["end"]:
                    entry = {} 
                    id_str = user["id_str"] 
                    entry[id_str] = {} 
                    fields = ["screen_name", "name", "created_at", "friends_count", "followers_count", "favourites_count", "statuses_count"] 
                    for f in fields: 
                        if f in user: 
                            entry[id_str][f] = user[f] 
                    labels[label].append(entry) 
    return labels

if __name__ == "__main__": 
    account_list = [] 
    if (len(sys.argv) > 1):
        account_list = sys.argv[1:]

    if len(account_list) < 1:
        print("No parameters supplied. Exiting.")
        sys.exit(0)

    consumer_key=""
    consumer_secret=""
    access_token=""
    access_token_secret=""

    auth = OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    auth_api = API(auth)

    for target in account_list:
        print("Processing target: " + target)

        # Get a list of Twitter ids for followers of target account and save it
        filename = target + "_follower_ids.json"
        follower_ids = try_load_or_process(filename, get_follower_ids, target)

        # Fetch Twitter User objects from each Twitter id found and save the data
        filename = target + "_followers.json"
        user_objects = try_load_or_process(filename, get_user_objects, follower_ids)
        total_objects = len(user_objects)

        # Record a few details about each account that falls between specified age ranges
        ranges = make_ranges(user_objects)
        filename = target + "_ranges.json"
        save_json(ranges, filename)

        # Print a few summaries
        print("")
        print("\t\tFollower age ranges")
        print("\t\t===================")
        total = 0
        following_counter = Counter()
        for label, entries in sorted(ranges.items()):
            print("\t\t" + str(len(entries)) + " accounts were created within " + label)
            total += len(entries)
            for entry in entries:
                for id_str, values in entry.items():
                    if "friends_count" in values:
                        following_counter[values["friends_count"]] += 1
        print("\t\tTotal: " + str(total) + "/" + str(total_objects))
        print("")
        print("\t\tMost common friends counts")
        print("\t\t==========================")
        total = 0
        for num, count in following_counter.most_common(20):
            total += count
            print("\t\t" + str(count) + " accounts are following " + str(num) + " accounts")
        print("\t\tTotal: " + str(total) + "/" + str(total_objects))
        print("")
        print("")

Let’s run this tool against a few accounts and see what results we get. First up: @realDonaldTrump

Age ranges of new accounts following @realDonaldTrump

As we can see, over 80% of @realDonaldTrump’s last 5000 followers are very new accounts (less than 20 weeks old), with a majority of those being under a week old. Here are the top friends_count values of those accounts:

Most common friends_count values seen amongst the new accounts following @realDonaldTrump

No obvious pattern is present in this data.

Next up, an account I looked at in a previous blog post – @niinisto (the president of Finland).

Age ranges of new accounts following @niinisto

Many of @niinisto’s last 5000 followers are new Twitter accounts, although not in as large a proportion as in the @realDonaldTrump case. In both of the above cases, this is to be expected, since both accounts are recommended to new users of Twitter. Let’s look at the friends_count values for the above set.

Most common friends_count values seen amongst the new accounts following @niinisto

In some cases, clicking through the creation of a new Twitter account (next, next, next, finish) will create an account that follows 21 Twitter profiles. This can explain the high proportion of accounts in this list with a friends_count value of 21. However, we might expect to see the same (or an even stronger) pattern with the @realDonaldTrump account, and we don’t. I’m not sure why this is the case, but it could be that Twitter has some automation in place to auto-delete programmatically created accounts. If you look at the output of my script you’ll see that between fetching the list of Twitter ids for the last 5000 followers of @realDonaldTrump, and fetching the full Twitter User objects for those ids, 3 accounts “went missing” (and hence the tool only collected data for 4997 accounts).
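To put a number on the "follows exactly 21" pattern, you could compute the share of collected accounts with that friends_count. The following is just a sketch: friends_count_share is a hypothetical helper, and the sample records are invented stand-ins for the dictionaries stored in the saved json files.

```python
from collections import Counter


def friends_count_share(users, value=21):
    # Fraction of accounts whose friends_count equals `value`
    counts = Counter(u["friends_count"] for u in users if "friends_count" in u)
    total = sum(counts.values())
    return counts[value] / total if total else 0.0


# Made-up sample data, not real results
sample = [
    {"friends_count": 21},
    {"friends_count": 21},
    {"friends_count": 340},
    {"friends_count": 5},
]
print(friends_count_share(sample))
```

Comparing this share between two targets gives a quick way to see whether the pattern is stronger for one of them.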

Finally, just for good measure, I ran the tool against my own account (@r0zetta).

Age ranges of new accounts following @r0zetta

Here you see a distribution that’s probably common for non-celebrity Twitter accounts. Not many of my followers have new accounts. What’s more, there’s absolutely no pattern in the friends_count values of these accounts:

Most common friends_count values seen amongst the new accounts following @r0zetta

Of course, there are plenty of other interesting analyses that can be performed on the data collected by this tool. Once the script has been run, all data is saved on disk as json files, so you can process it to your heart’s content without having to run additional queries against Twitter’s servers. As usual, have fun extending this tool to your own needs, and if you’re interested in reading some of my other guides or analyses, here’s a full list of those articles.

Further Analysis Of The Finnish Themed Twitter Botnet

In a blog post I published yesterday, I detailed the methodology I have been using to discover “Finnish themed” Twitter accounts that are most likely being programmatically created. In my previous post, I called them “bots”, but for the sake of clarity, let’s refer to them as “suspicious accounts”.

These suspicious accounts all follow a subset of recommended profiles presented to new Twitter users. In many cases, these automatically created Twitter accounts follow exactly 21 users. The reason I pursued this line of research was because it was similar to a phenomenon I’d seen happening in the US earlier last year. Check this post for more details about that case.

In an attempt to estimate the number of accounts created by the automated process described in my previous post, I ran the same analysis tool against a list of 114 Twitter profiles recommended to new Finnish users. Here is the list.

juhasipila
TuomasEnbuske
alexstubb
hsfi
mikko
rikurantala
yleuutiset
jatkoaika
smliiga
Valavuori
SarasvuoJari
niinisto
iltasanomat
Tami2605
KauppalehtiFi
talouselama
TeemuSel8nne
nokia
HeikelaJussi
hjallisharkimo
Linnanahde
tapio_suominen
vrantanen
meteorologit
tikitalk10
yleurheilu
JaajoLinnonmaa
hirviniemi
pvesterbacka
taloussanomat
TuomasKyr
MTVUutiset
Haavisto
SuomenKuvalehti
MikaelJungner
paavoarhinmaki
KajKunnas
SamiHedberg
VilleNiinisto
HenkkaHypponen
SaskaSaarikoski
jhiitela
Finnair
TarjaHalonen
leijonat
JollaHQ
filsdeproust
makinenantti
lottabacklund
jyrkikasvi
JethroRostedt
Ulkoministerio
valtioneuvosto
Yleisradio
annaperho
liandersson
pekkasauri
neiltyson
villetolvanen
akiriihilahti
TampereenPoika
madventures
Vapaavuori
jkekalainen
AppelsinUlla
pakalupapito
rakelliekki
kyleturris
tanelitikka
SlushHQ
arcticstartup
lindaliukas
goodnewsfinland
docventures
jasondemers5
Retee27
H_Kovalainen
ipaananen
FrenzziiiBull
ylenews
digitoday
jraitamaa
marmai
MikaVayrynen
LKomarov
ovi8
paulavesala
OsmoSoininvaara
juuuso
JaanaPelkonen
saaraaalto
yletiede
TimoHaapala
Huuhkajat
ErvastiPekka
JussiPullinen
rsiilasmaa
moia
Palloliitto
teroterotero
ARaanta31
kirsipiha
JPohjanpalo
startupsauna
aaltoes
Villebla
MariaVeitola
merjaya
MikiKuusi
MTVSportfi
EHaula
svuorikoski
andrewickstroem
kokoomus

For each account, my script saved a list of accounts suspected of being automatically created. After completing the analysis of these 114 accounts, I iterated through all collected lists in order to identify all unique account names across those lists.
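The de-duplication step amounts to a set union over the per-profile lists. A minimal sketch, assuming the lists have been loaded into a dictionary keyed by the analyzed profile (the function name and the sample screen names are invented for illustration):

```python
def unique_accounts(lists_by_target):
    # lists_by_target maps each analyzed profile to the screen names flagged for it
    unique = set()
    for names in lists_by_target.values():
        unique.update(names)
    return unique


# Made-up sample data
flagged = {
    "Haavisto": ["aaa111", "bbb222"],
    "niinisto": ["bbb222", "ccc333"],
}
print(len(unique_accounts(flagged)))
```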

Across the 114 recommended Twitter profiles, my analysis identified 5631 unique accounts. Here are the (first twenty) age ranges of the most recently created accounts:

All age ranges

Age ranges of all suspicious Twitter accounts identified by my script

It has been suggested (link in Finnish) that these accounts appeared when a popular game, Growtopia, asked its players to follow their Twitter account after a game outage, and those new accounts started following recommended Twitter profiles (including those of Haavisto and Niinistö). In order to check if this was the case, I collected a list of accounts following @growtopiagame, and checked for accounts that appear on both that list, and the list of suspicious accounts collected in my previous step. That number was 3. This likely indicates that the accounts my analysis identified aren’t players of Growtopia.
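The overlap check is a set intersection. Here is a rough sketch; the overlap helper and the sample names are hypothetical, and it assumes both lists hold screen names (which Twitter treats as case-insensitive). Comparing numeric ids would work the same way without the lowercasing.

```python
def overlap(suspicious, followers):
    # Accounts present in both collections, compared case-insensitively
    return {s.lower() for s in suspicious} & {f.lower() for f in followers}


# Made-up sample data
suspicious_accounts = ["aaa111", "BBB222", "ccc333"]
growtopia_followers = ["bbb222", "ddd444"]
print(len(overlap(suspicious_accounts, growtopia_followers)))
```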

Someone Is Building A Finnish-Themed Twitter Botnet

Finland will hold a presidential election on 28th January 2018. Campaigning has just started, and candidates are being regularly interviewed by the press and on TV. In a recent interview, one of the presidential candidates, Pekka Haavisto, mentioned that both his Twitter account and the account of the current Finnish president, Sauli Niinistö, had recently been followed by a number of bot accounts. I couldn’t resist investigating this myself.

I wrote a tool to analyze a Twitter account’s followers. The Twitter API only gives me access to the last 5000 accounts that have followed a queried account. However, this was enough for me to find some interesting data.

As I previously wrote, newly created bulk bot accounts often look very similar. I implemented some logic in my follower analysis tool that attempts to identify bots by looking for a combination of the following:

  • Is the account still an “egg” (default profile settings, default picture, etc.)?
  • Does the account follow exactly 21 other accounts?
  • Does the account follow very few accounts (fewer than 22)?
  • Does the account have a bot-like name (a string of random characters)?
  • Does the account have zero followers?
  • Has the account tweeted zero times?

Each of the above conditions contributes to a score. If the total of all scores exceeds an arbitrary value, I record the name of the account.
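The scoring idea can be sketched roughly as follows. This is not the exact logic of my tool: the weights, the threshold, and the looks_random() name heuristic are all placeholder assumptions, chosen only to show the shape of the approach. The field names are those found in Twitter User objects.

```python
import re

# Hypothetical weights and threshold, for illustration only
WEIGHTS = {
    "egg": 1,
    "follows_21": 2,
    "follows_few": 1,
    "random_name": 1,
    "no_followers": 1,
    "no_tweets": 1,
}
THRESHOLD = 4


def looks_random(screen_name):
    # Crude stand-in for "bot-like name": long, with several digits mixed in
    return len(screen_name) >= 10 and len(re.findall(r"[0-9]", screen_name)) >= 4


def score_account(user):
    # Each condition that holds adds its weight to the total
    checks = {
        "egg": user.get("default_profile_image", False),
        "follows_21": user.get("friends_count") == 21,
        "follows_few": user.get("friends_count", 0) < 22,
        "random_name": looks_random(user.get("screen_name", "")),
        "no_followers": user.get("followers_count") == 0,
        "no_tweets": user.get("statuses_count") == 0,
    }
    return sum(WEIGHTS[name] for name, hit in checks.items() if hit)


def is_suspicious(user):
    return score_account(user) > THRESHOLD


# Made-up sample records
bot = {"screen_name": "xk29dj48fj1", "default_profile_image": True,
       "friends_count": 21, "followers_count": 0, "statuses_count": 0}
human = {"screen_name": "mikko", "default_profile_image": False,
         "friends_count": 800, "followers_count": 1000, "statuses_count": 5000}
print(is_suspicious(bot), is_suspicious(human))
```

Weighting "follows exactly 21" more heavily than the other checks is a design choice in this sketch, not something the data dictates; tune the weights and threshold against accounts you have checked by hand.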

I ran this tool against @Haavisto and @niinisto Twitter accounts and found the following:
Matches for @Haavisto account: 399
Matches for @niinisto account: 330

In both cases, the accounts in question were by-and-large under 2 months old.

Haavisto bot account age ranges

Account age ranges for bots following @Haavisto

 

Niinisto account bot follower age ranges

Account age ranges for bots following @niinisto

I also checked the intersection between these two groups of bots. Interestingly, only 49 of these accounts followed both @Haavisto and @niinisto.

Checking a handful of the flagged accounts manually using the Twitter web client, I quickly noticed that they all follow a similar selection of high-profile Finnish twitter accounts, including accounts such as:

Tuomas Enbuske (@TuomasEnbuske) – a Finnish celebrity
Riku Rantala (@rikurantala) – host of Madventures
Sauli Niinistö (@niinisto) – Finland’s current president
Juha Sipilä (@juhasipila) – Finland’s prime minister
Alexander Stubb (@alexstubb) – Former prime minister of Finland
Pekka Haavisto (@Haavisto) – presidential candidate
YLE (@yleuutiset) – Finland’s equivalent of the BBC
Kauppalehti (@KauppalehtiFi) – a popular Finnish newspaper
Ilta Sanomat (@iltasanomat) – a popular Finnish newspaper
Talous Sanomat (@taloussanomat) – a prominent financial news source
Helsingin Sanomat (@hsfi) – Helsinki’s local newspaper
Ilmatieteen laitos (@meteorologit) – Finnish weather reporting source

What the bots are following

All the bots were following similar popular Finnish Twitter accounts, such as these.

Running the same analysis tool against Riku Rantala’s account yielded similar results. In fact, Riku has been the recipient of 660 new bot followers (although some of them were added in previous waves, judging by the account ages).

Account age ranges for bots following @rikurantala

I have no doubt that the other accounts listed above (and a few more) have recently been followed by several hundred of these bots.

By the way, running the same analysis against the @realDonaldTrump account only found 220 new bots. To verify, I also ran the tool against @mikko yielding a count of 103 bots, and against @rsiilasmaa I found only 38.

It seems someone is busy building a Finnish-themed Twitter botnet. We don’t yet know what it will be used for.

toolsmith #130 – OSINT with Buscador

First off, Happy New Year! I hope you have a productive and successful 2018. I thought I'd kick off the new year with another exploration of OSINT. In addition to my work as an information security leader and practitioner at Microsoft, I am privileged to serve in Washington's military as a J-2, which means I'm part of the intelligence directorate of a joint staff. Intelligence duties in a guard unit context are commonly focused on situational awareness for mission readiness. Additionally, in my unit we combine part of J-6 (command, control, communications, and computer systems directorate of a joint staff) with J-2, making Cyber Network Operations a J-2/6 function. Open source intelligence (OSINT) gathering is quite useful in developing indicators specific to adversaries as well as identifying targets of opportunity for red team and vulnerability assessments.
We've discussed numerous OSINT offerings as part of toolsmiths past, and there's no better time than our 130th edition to discuss an OSINT platform inclusive of previous topics such as Recon-ng, Spiderfoot, Maltego, and Datasploit. Buscador is just such a platform and comes from genuine OSINT experts Michael Bazzell and David Wescott. Buscador is "a Linux Virtual Machine that is pre-configured for online investigators." Michael is the author of Open Source Intelligence Techniques (5th edition) and Hiding from the Internet (3rd edition). I had a quick conversation with him and learned that they will have a new release in January (1.2), which will address many issues and add new features. Additionally, it will revamp Firefox since the release of version 57.
You can download Buscador as an OVA bundle for a variety of virtualization options, or as an ISO for USB boot devices or host operating systems. I had Buscador 1.1 up and running on Hyper-V in a matter of minutes after pulling the VMDK out of the OVA and converting it with QEMU.
Buscador 1.1 includes numerous tools. In addition to the above-mentioned standard bearers, you can expect the following and others:
  • Creepy
  • Metagoofil
  • MediaInfo
  • ExifTool
  • EmailHarvester
  • theHarvester
  • Wayback Exporter
  • HTTrack Cloner
  • Web Snapper
  • Knock Pages
  • SubBrute
  • Twitter Exporter
  • Tinfoleak 
  • InstaLooter 
  • BleachBit 
Tools are conveniently offered via the menu bar on the UI's left, or can easily be found via Show Applications.
To put Buscador through its paces, using myself as a target of opportunity, I tested a few of the tools I had not previously used. Starting with Creepy, the geolocation OSINT tool, I configured the Twitter plugin, one of the four available in Creepy (Flickr, Google+, Instagram, Twitter), and searched holisticinfosec, as seen in Figure 1.
Figure 1:  Creepy configuration




The results, as seen in Figure 2, include some good details, but no immediate location data.

Figure 2: Creepy results
Had I configured the other plugins, or were I even a user of Flickr or Google+, better results would have been likely. I have location turned off for my Tweets, but my profile does include Seattle. Creepy is quite good for assessing targets who utilize social media heavily, but if you wish to dig more deeply into Twitter usage, check out Tinfoleak, which also uses geo information available in Tweets and uploaded images. The report for holisticinfosec is seen in Figure 3.

Figure 3: Tinfoleak
If you're looking for domain enumeration options, you can start with Knock. It's as easy as handing it a domain. I did so with holisticinfosec.org as seen in Figure 4; results are in Figure 5.
Figure 4: Knock run
Figure 5: Knock results
Other classics include HTTrack for web site cloning, and ExifTool for pulling all available metadata from images. HTTrack worked instantly as expected for holisticinfosec.org. I used Instalooter, "a program that can download any picture or video associated from an Instagram profile, without any API access", to grab sample images, then ran pyExifToolGui against them. As a simple experiment, I ran Instalooter against the infosec.memes Instagram account, followed by pyExifToolGui against all the downloaded images, then exported Exif metadata to HTML. If I were analyzing images for associated hashtags the export capability might be useful for an artifacts list.
Finally, one of my absolute favorites is Metagoofil, "an information gathering tool designed for extracting metadata of public documents." I did a quick run against my domain, with the doc retrieval parameter set at 50, then reviewed full.txt results (Figure 6), included in the output directory (home/Metagoofil) along with authors.csv, companies.csv, and modified.csv.

Figure 6: Metagoofil results

Metagoofil is extremely useful for gathering target data; I consider it a red team recon requirement. It's a faster, currently maintained offering that has some shared capabilities with Foca. It should also serve as a reminder of just how much information is available in public-facing documents. Consider stripping the metadata before publishing.

It's fantastic having all these capabilities ready and functional on one distribution; it keeps the OSINT discipline close at hand for those who need regular performance. I'm really looking forward to the Buscador 1.2 release, and better still, I have it on good authority that there is another book on the horizon from Michael. This is a simple platform with which to explore OSINT. Remember to be a good citizen though; there is an awful lot that can be learned via these passive means.
Cheers...until next time.

Toolsmith #127: OSINT with Datasploit

I was reading an interesting Motherboard article, Legal Hacking Tools Can Be Useful for Journalists, Too, that includes reference to one of my all-time OSINT favorites, Maltego. Joseph Cox's article also mentions Datasploit, a 2016 favorite of fellow tools aficionado Toolswatch.org (see 2016 Top Security Tools as Voted by ToolsWatch.org Readers). Having not yet explored Datasploit myself, this proved to be a grand case of "no time like the present."
Datasploit is "an #OSINT Framework to perform various recon techniques, aggregate all the raw data, and give data in multiple formats." More specifically, as stated on the Datasploit documentation page under Why Datasploit, it utilizes various Open Source Intelligence (OSINT) tools and techniques found to be effective, and brings them together to correlate the raw data captured, providing the user relevant information about domains, email addresses, phone numbers, person data, etc. Datasploit is useful for collecting relevant information about a target in order to expand your attack and defense surface very quickly.
The feature list includes:
  • Automated OSINT on domain / email / username / phone for relevant information from different sources
  • Useful for penetration testers, cyber investigators, defensive security professionals, etc.
  • Correlates and collaborate results, shows them in a consolidated manner
  • Tries to find out credentials,  API keys, tokens, sub-domains, domain history, legacy portals, and more as related to the target
  • Available as single consolidating tool as well as standalone scripts
  • Performs Active Scans on collected data
  • Generates HTML, JSON reports along with text files
Resources
Github: https://github.com/datasploit/datasploit
Documentation: http://datasploit.readthedocs.io/en/latest/
YouTube: Quick guide to installation and use

Pointers
A few pointers to keep you from losing your mind. This project is very much a work in progress, with lots of very frustrated users filing bugs and wondering where the support is. The team is doing their best, so be patient with them, but read through the Github issues to be sure any bugs you run into haven't already been addressed.
1) Datasploit does not error gracefully; it just crashes. This can be the result of unmet dependencies or even a missing API key. Do not despair, take note, I'll talk you through it.
2) For ease, and for the best match to the documentation, I suggest running Datasploit from a Debian/Ubuntu variant. Your best bet is to grab Kali, as a VM or a dedicated install, and load it up there, as I did.
3) My installation guidance and recommendations should hopefully get you running trouble-free; follow them explicitly.
4) Acquire as many API keys as possible, see further detail below.

Installation and preparation
From Kali bash prompt, in this order:

  1. git clone https://github.com/datasploit/datasploit /etc/datasploit
  2. apt-get install libxml2-dev libxslt-dev python-dev lib32z1-dev zlib1g-dev
  3. cd /etc/datasploit
  4. pip install -r requirements.txt
  5. mv config_sample.py config.py
  6. With your preferred editor, open config.py and add API keys for the following at a minimum. They are, for all intents and purposes, required; detailed instructions to acquire each are here:
    1. Shodan API
    2. Censysio ID and Secret
    3. Clearbit API
    4. Emailhunter API
    5. Fullcontact API
    6. Google Custom Search Engine API key and CX ID
    7. Zoomeye Username and Password
If, and only if, you've done all of this correctly, you might end up with a running instance of Datasploit. :-) Seriously, this is some of the glitchiest software I've tussled with in quite a while, but the results paid handsomely. Run python datasploit.py domain.com, where domain.com is your target. Obviously, I ran python datasploit.py holisticinfosec.org to acquire results pertinent to your author. 
Datasploit rapidly pulled results as follows:
211 domain references from Github:
Github results
Luckily, no results from Shodan. :-)
Four results from Paste(s): 
Pastebin and Pastie results
Datasploit pulled russ at holisticinfosec dot org as expected, per email harvesting.
Accurate HolisticInfoSec host location data from Zoomeye:

Details regarding HolisticInfoSec sub-domains and page links:
Sub-domains and page links
Finally, a good return on DNS records for holisticinfosec.org and, thankfully, no vulns found via PunkSpider.

DataSploit can also be integrated into other code and called as individual scripts for unique functions. I did a quick run with python emailOsint.py russ@holisticinfosec.org and the results were impressive:
Email OSINT
I love that the first query is of Troy Hunt's Have I Been Pwned. Not sure if you have been? Better check it out. Reminder here: you'll really want to be sure to have as many API keys as possible or you may find these buggy scripts crashing. You'll definitely find yourself compromising between frustration and the rapid, detailed results. I put this offering squarely in the "shows much promise" category if the devs keep focus on it, assess for quality, and handle errors better.
Give Datasploit a try for sure.
Cheers, until next time...