Deciphering the dark web: What it is and why it’s not so scary


12 Sep 2019

Image: © Pebo/Stock.adobe.com

What you don’t know about the dark web might be exploited by a ‘dark web intelligence’ vendor. Forrester’s Josh Zelonis offers a simple explanation and some helpful pointers.

The dark web is nothing fancy. It’s really just a different series of protocols.

When you surf the web, transport layer security (TLS) is usually the cryptographic protocol that provides confidentiality for your communication with the server. The green lock on your URL bar is an assurance, but not a guarantee, that you’re communicating confidentially with who you think you are.

While TLS is designed to provide confidentiality and identity, dark web protocols are designed to provide confidentiality and anonymity. There are many of these dark net protocols, but Tor is by far the most common, likely because of its use of exit nodes to allow a user to obtain anonymity on the public internet by routing traffic across the Tor network.

Don’t trust anything

The quality of your collection strategy dictates how confident you can be in your analysis – garbage in, garbage out. This is an often-ignored part of dark web marketing.

Anonymous networks help segment your actual identity from the persona (or avatar) you develop on these dark nets. Because of this, the reputation of your developed persona is the only currency you truly have. On anonymous networks, reputation is everything.

Also, remember that there’s no guarantee the person behind the persona you are interacting with isn’t a criminal, a threat intelligence company or possibly even law enforcement. The story of the Besa Mafia is a great example of criminals scamming criminals, getting hacked themselves, and then law enforcement arresting people who were trying to hire these fake hitmen. It’s also not uncommon for law enforcement to take control of a hidden site and continue hosting it in the hope of de-anonymising users.

Basically, trust nothing on the dark web.

Developing personas to obtain and, more importantly, maintain access is time-consuming, and it accounts for most of the work involved in good tradecraft on the dark web. Be wary: some ‘dark web intelligence’ offerings skip the hard part and just use technical collection to scrape information from essentially public markets and forums.

To say this is a commodity capability would be a major understatement as the ability to automate the scraping of websites is as old as the internet and, as we’ve established, dark networks really just reflect a difference in protocol selection.

The iceberg metaphor is a clever bit of psychological warfare – I mean, ‘marketing’ – to remind you that vendors have access to all this stuff under the surface that you don’t. As someone who evaluates these vendors, I can tell you that many of them don’t either. You might find yourself saying, ‘I registered for access and all I got was this low-confidence assessment’.

Intelligence v collection

Any company selling you on dark web intelligence is only talking about its collection strategy, and there are big problems with that.

After collection, the next challenge would be processing and exploitation. Processing is frequently discussed as stripping out things such as HTML tags from the raw data that has been collected. If you think that is a big deal, I have a regular expression (regex) to sell you.
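
To show how little is involved, here is a minimal Python sketch of that kind of processing. It is an illustration only, not any vendor’s pipeline: the `strip_tags` helper and the sample string are made up for this example, and a regex like this is deliberately crude (it will mangle script blocks and unusual markup, where a real HTML parser would do better).

```python
import html
import re

# Anything between '<' and the next '>' is treated as a tag.
TAG_RE = re.compile(r"<[^>]+>")
WHITESPACE_RE = re.compile(r"\s+")


def strip_tags(raw: str) -> str:
    """Remove tag-like spans, decode HTML entities and tidy whitespace."""
    without_tags = TAG_RE.sub(" ", raw)
    unescaped = html.unescape(without_tags)
    return WHITESPACE_RE.sub(" ", unescaped).strip()


# Hypothetical scraped forum post.
sample = "<p>Selling <b>access</b> &amp; logs</p>"
print(strip_tags(sample))  # Selling access & logs
```

A handful of lines gets you from raw markup to plain text, which is why processing alone is not much of a differentiator.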

Where things get interesting is trying to exploit this data to get something useful on an analyst’s desk. For example, very few, if any, public sector vendors have swathes of analysts translating everything on the dark web on a daily basis from languages such as Arabic, Farsi, Spanish, Russian and Mandarin. How is this being done at the same scale as collection?

Furthermore, how does your translation software handle slang? Without specific knowledge of a particular group, you would have no idea if they are using the code name ‘Iowa’ when describing a target in Iran.

Then there’s something I call ‘the Target problem’. Target is a retail chain with stores in the US, Canada and India – many of you may be familiar with the brand. Now, imagine the data problem created in attempting to parse out relevant chatter about the Target brand from the rest of the noise on the internet. Incidentally, the string ‘target’ appears five times in this article and only three times in the context of the retailer.
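
The brand-matching problem above is easy to reproduce. The following is a rough Python sketch under stated assumptions, not a description of how any vendor solves it. It uses a different ambiguous keyword, ‘shell’, which can mean an energy brand or a piece of attacker tooling. The sample posts and the `BRAND_CONTEXT` word list are invented for illustration.

```python
import re

# Hypothetical forum posts; only two of the four concern the brand.
posts = [
    "Selling Shell fuel cards, 50% of balance",
    "Dropped a web shell on the payroll server",
    "Need a reverse shell payload for Windows",
    "Anyone got a Shell station loyalty account dump?",
]

KEYWORD_RE = re.compile(r"\bshell\b", re.IGNORECASE)

# Assumed context words that hint at the brand rather than the tooling.
BRAND_CONTEXT = {"fuel", "station", "loyalty", "petrol"}


def naive_hits(texts):
    """Return every post that merely contains the keyword."""
    return [t for t in texts if KEYWORD_RE.search(t)]


def context_hits(texts):
    """Return keyword matches that also contain a brand context word."""
    hits = []
    for text in texts:
        if not KEYWORD_RE.search(text):
            continue
        words = set(re.findall(r"[a-z]+", text.lower()))
        if words & BRAND_CONTEXT:
            hits.append(text)
    return hits


print(len(naive_hits(posts)), "keyword hits")
print(len(context_hits(posts)), "hits with brand context")
```

Keyword matching flags all four posts, while the context filter keeps only the two about the brand. Even so, the filter is only as good as its hand-picked word list, and it says nothing about slang, other languages or posts that mention the brand without any of those words, which is the real exploitation problem.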

A vendor cannot have an appreciation of these problems and not talk about their solution to them. If they are just trying to sell you on their ability to collect data from the dark web and then show you their platform, you don’t need to see the platform.

The bright side of the dark web

There is some really bad stuff on dark nets, but they are also a critical resource. Anonymous networks are vital to journalists, whistleblowers, survivors of domestic abuse, people with sensitive medical conditions, the politically oppressed and more.

I’m going to wrap up this piece with a bit of a personal appeal. Please consider supporting projects such as the Tor Project or Tails. And, if you’re in a decision-making position at an organisation where people might assemble or seek to obtain information, please ensure that your site is usable when coming from a Tor exit node with JavaScript turned off.

Unlike so much that we do in the cyberdomain, this can actually save lives.

By Josh Zelonis

Josh Zelonis is a principal analyst at Forrester serving security and risk professionals by helping them continuously adapt their architecture, policies and processes to evolving threats. His research focuses on threat intelligence, vulnerability assessment and management, malware analysis and incident response.

A version of this article originally appeared on the Forrester blog.