Question

1. Why do I keep hearing about the Dark Web?

The term "Dark Web" is often shorthand for a scary place where nothing good happens and only the bad guys lurk. This colorful caricature of a mysterious place somewhere in cyberspace is tantalizing for 21st-century audiences, and press reports about shady marketplaces like the Silk Road have obscured our view of the technological landscape.

Televisions around the world have been bombarded with advertisements for software that relies upon this conception. Before Rudy Giuliani was completely absorbed by Trump's scandals, he became Experian's spokesman against the Dark Web:

With sixteen years in cybersecurity I can tell you that criminals are scouring the Dark Web trying to steal your identity and your personal information.

Giuliani isn't a cybersecurity expert and doesn't know how the Dark Web works any more than he knows how Twitter does. You may also want to take Experian's cybersecurity advice with a grain of salt.

There is much more to the Dark Web than shady villains. In fact, it may be one of the last bastions of Internet privacy if the clear Web becomes completely surveilled, policed, and throttled. Institutional fears of the Dark Web are rooted in anxiety that networks will "go dark" and Internet communication will be beyond the all-seeing eyes of intelligence agencies, police, advertisers, and data brokers.

Of course, there is crime on the Dark Web and it is utilized by malicious hackers, aspiring gangsters, and child predators. In this way, however, it's not much different from the clear, public Web and social networks such as Facebook. Since the Dark Web is, well, dark, it is also difficult to know with any precision just how much "legitimate" versus "illegitimate" usage there is, even when those terms are clearly defined.

Question

2. What is the Dark Web?

The Dark Web refers to content that can only be accessed via "darknets", networks that are largely hidden from Web users and require specific software or credentials to access. You may hear the term used synonymously with "deep web" but they are not the same — any content that is restricted from easy access or not readily available in search engines can be referred to as "deep", but Dark Web connections require overlay or alternative networks such as Tor, I2P, and Freenet.

Hidden networks and clandestine servers are as old as the Internet, but gained traction in the late 1990s as people began looking for ways to share content without being under a microscope. As software such as Napster and LimeWire brought peer-to-peer (P2P) networks to homes around the globe, people discovered there were alternative ways to connect and share that didn't require a Web browser or programs shipped by an Internet Service Provider (ISP).

In 2002, Microsoft researchers recognized the potential of darknets, referring to them in the context of the wars raging over copyright infringement:

The last few years have seen vast increases in the darknet’s aggregate bandwidth, reliability, usability, size of shared library, and availability of search engines. In this paper we categorize and analyze existing and future darknets, from both the technical and legal perspectives. We speculate that there will be short-term impediments to the effectiveness of the darknet as a distribution mechanism, but ultimately the darknet-genie will not be put back into the bottle.

It was obvious that users could not be expected to stay on the clear Web and communicate only via public channels. People would not ask for approval to use alternative or overlay networks any more than they would ask for permission to talk to their friends, family, and colleagues. As taboos about sharing copyright-restricted content began to break down, the emerging Dark Web became synonymous with "piracy" and infringement. As we've seen, this association is still dominant in the minds of pundits and TV personalities, but it's only a small part of the truth and ignores the potential of the Dark Web to bolster human freedom.

Cybersecurity expert Alec Muffett invites us to question the terminology. When the moniker is used to describe Tor, he says, "It's just a pejorative label, and doesn't actually mean much... It equally could be described as the 'more secure Web'."

Question

3. What is Tor?

The Tor network enables anonymous communication, and was originally called "The Onion Router". Tor is what people often mean when they talk about the Dark Web. Tor protects your identity via a process called onion routing, which wraps your data in encrypted layers like an onion. Your traffic is encrypted once per hop and relayed through a circuit of three nodes, chosen at random from thousands of volunteer-run servers known as Tor relays. Though Tor circuits can provide a connection to "onion services", which are generally considered the largest component of the Dark Web, the majority of users only use Tor to access clear Internet services and websites anonymously.
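The layering idea can be illustrated with a toy sketch. This is not Tor's actual cryptography: the XOR "cipher" below is insecure and exists only to show how a client adds one layer per relay and how each relay removes exactly one. The names keystream_xor, wrap_onion, and peel_layer are made up for this example, and the three random keys stand in for keys that a real client would negotiate with each relay.

```python
import hashlib
import secrets


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with a SHA-256-derived keystream. NOT secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    # zip() stops at len(data), so any surplus keystream bytes are ignored.
    return bytes(b ^ k for b, k in zip(data, stream))


def wrap_onion(message: bytes, hop_keys: list) -> bytes:
    """Client side: add one layer per hop; the last hop's layer is innermost."""
    cell = message
    for key in reversed(hop_keys):
        cell = keystream_xor(key, cell)
    return cell


def peel_layer(cell: bytes, key: bytes) -> bytes:
    """Relay side: remove exactly one layer using this relay's key."""
    return keystream_xor(key, cell)


# Hypothetical per-hop keys for the first, middle, and last relay of a circuit.
hop_keys = [secrets.token_bytes(32) for _ in range(3)]
message = b"hello from a Tor client"

cell = wrap_onion(message, hop_keys)
for key in hop_keys:  # each relay in turn peels its own layer
    cell = peel_layer(cell, key)

print(cell)
```

In this sketch only the party holding all three keys can build the cell, and no single relay can read the message until every outer layer has been removed by the hops before it.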

If a person is utilizing good Operational Security (OPSEC), we know Tor provides reliable anonymity. Tor is Free and Open-Source Software (FOSS) and therefore its source code is transparent and open for inspection by security experts around the world. When there are security or privacy issues, or even potential threats, Tor and clients such as Tor Browser are patched often and quickly. For this reason, it is important to keep Tor updated and validate the software by learning how to verify signatures.

When Edward Snowden revealed the NSA mass surveillance programs to the world in 2013, he relied upon Tor to stay off the U.S. intelligence radar and collaborate with The Washington Post and The Guardian. One of the documents published from the Snowden trove is an internal NSA presentation called "Tor Stinks", which reveals the NSA's frustration with Tor's effectiveness:

We will never be able to de-anonymize all Tor users all the time... With manual analysis we can de-anonymize a very small fraction of Tor users, however, no success de-anonymizing a user in response to a [Target Office of Primary Interest] request/on demand.

The presentation also underscores the importance of good Operational Security (OPSEC) on the part of Tor users. Successful de-anonymization of Tor users relied upon cookies and scripts running in the browser, tracking the user silently without having to break the technology underlying Tor. For this reason, it is recommended that users only browse via Tor Browser on desktop or Android devices and use Onion Browser on iOS. Tor Browser has built-in settings that make it more difficult for you to be tracked via browser fingerprinting, malicious scripts, and other clandestine tracking methods that probably follow you around the Web in the other browsers you use.

Tor is not limited to Web browsing. You can send any network traffic through the Tor network and, when set up correctly, this can provide anonymity for all kinds of activities besides browsing websites. It's possible that the privacy-respecting technologies you rely upon are already using Tor without your knowing it. For example, the private chat app Briar utilizes the Tor network.
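Applications typically hand their traffic to a local Tor client through a SOCKS5 proxy. As a rough sketch, the snippet below only builds the bytes of a SOCKS5 CONNECT request as defined in RFC 1928 and does not open any connection. The address 127.0.0.1:9050 is an assumption (the usual default for a standalone tor daemon; Tor Browser normally listens on 9150), and the function names are illustrative.

```python
import struct

# Assumption: a local tor daemon listening on its usual default SOCKS port.
TOR_SOCKS_ADDR = ("127.0.0.1", 9050)


def socks5_greeting() -> bytes:
    """SOCKS version 5, offering one auth method: 0x00 (no authentication)."""
    return bytes([0x05, 0x01, 0x00])


def socks5_connect_request(hostname: str, port: int) -> bytes:
    """Build a CONNECT request that passes the hostname to the proxy.

    Address type 0x03 (domain name) lets the proxy resolve the name itself,
    so the application does not perform its own DNS lookup.
    """
    host = hostname.encode("ascii")
    if not 0 < len(host) <= 255:
        raise ValueError("hostname must be 1 to 255 bytes long")
    header = bytes([0x05, 0x01, 0x00, 0x03, len(host)])
    return header + host + struct.pack(">H", port)


request = socks5_connect_request("example.com", 80)
print(request.hex())
```

Passing the name rather than an IP address matters in practice: a program that resolves hostnames on its own before using the proxy can leak the sites it visits through ordinary DNS.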

Question

4. What is an "onion service"?

People can host websites, chat protocols, and other services that are only accessible over the Tor network. Tor users connect to these onion services, formerly known as "hidden services", anonymously. In this way, Tor not only protects the identity of users on the client side, but also offers anonymity for a person or organization operating a server.

These onion services have the .onion top-level domain (TLD) and are only accessible when you are connected to Tor. So, for example, if you try to access a .onion URL in your normal Web browser it won't load. In Tor Browser, however, you will be able to access the onion service. This is the Dark Web that you hear about most often, the onion services that are only accessible via Tor.

Addresses for onion services are designed to be random and easy to generate without conflicting with other addresses. Onion services may change .onion addresses often, and may appear and re-appear on the Tor network. For this reason, they are difficult to index and study. The fact that new .onion addresses may be generated at any time is a security strength, allowing users to publish or share files via websites that cycle through addresses over time or even self-destruct. OnionShare is a good example of the latter, spinning up onion services that disappear at the host's discretion.
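For readers curious why these addresses look random, here is a minimal sketch of how a current (version 3) .onion name is encoded from a service's public key, following the encoding described in the Tor rendezvous specification: base32 of the key, a two-byte checksum, and a version byte. The random 32 bytes below are only a stand-in for a real Ed25519 public key (which the Python standard library cannot generate), and the function names are illustrative.

```python
import base64
import hashlib
import secrets

VERSION = b"\x03"  # version byte used by v3 onion addresses


def onion_address(pubkey: bytes) -> str:
    """Encode a 32-byte public key as a 56-character v3 .onion hostname."""
    if len(pubkey) != 32:
        raise ValueError("public key must be 32 bytes")
    checksum = hashlib.sha3_256(b".onion checksum" + pubkey + VERSION).digest()[:2]
    label = base64.b32encode(pubkey + checksum + VERSION).decode("ascii").lower()
    return label + ".onion"


def is_valid_onion_address(address: str) -> bool:
    """Check the length, version byte, and checksum of a v3 .onion hostname."""
    label, _, tld = address.rpartition(".")
    if tld != "onion" or len(label) != 56:
        return False
    try:
        raw = base64.b32decode(label.upper())
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False
    pubkey, checksum, version = raw[:32], raw[32:34], raw[34:]
    expected = hashlib.sha3_256(b".onion checksum" + pubkey + version).digest()[:2]
    return version == VERSION and checksum == expected


# Stand-in for a real Ed25519 public key (assumption for illustration only).
fake_pubkey = secrets.token_bytes(32)
address = onion_address(fake_pubkey)
print(address, is_valid_onion_address(address))
```

Because the name is derived from the key itself, generating a fresh address is just a matter of generating a fresh key pair, with no registry to consult.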

Question

5. Who uses Tor?

Tor is used by librarians, activists, human rights workers, and people in all walks of life around the world. Strong privacy and anonymity are invaluable to vulnerable Internet users, who may fear not only surveillance but also physical violence and detention. Survivors of domestic violence utilize Tor to escape abusers and stalkers, and Tor allows unrestricted access to the Internet for undocumented workers who fear targeting by ICE/CBP in the United States. The censorship circumvention provided by Tor empowers free speech and journalism in even the most repressive places in the world, and Tor is often on the front lines of any nation-state Internet shutdown. The Tor Project funds and maintains the Open Observatory of Network Interference (OONI) to monitor censorship around the globe.

Tor is a vital tool for whistleblowers and journalists to understand, as it allows for anonymity while traversing the clear Web. Onion services also facilitate the sending, receiving, and publishing of documents, a capability that WikiLeaks, other leaks websites, and mainstream journalists now rely upon. SecureDrop has become a standard component at major news organizations, using Tor to maintain the privacy and security of sources and encouraging tips and submissions.

Tor, like any technology, is a tool that can be used for good or evil. It is utilized by criminals, shady characters, and child predators as well as their victims. Studies on Tor usage are difficult to conduct, however, given the effectiveness of Tor's implementation. Onion services are well-hidden unless announced somewhere on the clear Web and users who access these services also have their anonymity strongly protected.

Question

6. Is the Dark Web a seedy place for "bad guys"?

On the clear Web, forums and social media can be terrible places to visit with awful and illegal content, and they don't require an understanding of Tor or other Dark Web technologies to use them. For people who are targeted by repressive regimes, however, usage of these clear Web outlets could lead to death. For this reason, it may be true that "good guys" need the Dark Web much more than "bad guys" do.

That is not to say that there isn't crime on the Dark Web. Tor has long been a tool for criminals and malicious hackers. Silk Road was infamous as a marketplace for all kinds of shady and illegal activity, and "Dread Pirate Roberts" Ross Ulbricht was sentenced to life in prison for running it. When others launched a new version called Silk Road 2.0, its operators were arrested and the website was eventually seized by the FBI and Europol. These marketplace operators were identified because of informants, bad Operational Security (OPSEC), and sting operations, proving that "old-fashioned" law enforcement techniques can be successful on the Dark Web, just as they are on the rest of the Internet.

Similar marketplaces live on, not just via Tor but utilizing other Dark Web technologies such as I2P. Technologies with strong privacy protection continue to be used by malicious hackers, as part of botnets, and to exfiltrate data obtained during a breach. With worries about data breaches and cyber attacks at a fever-pitch, it's clear why many enterprise networks attempt to ban or block Tor usage altogether, seeing it as malicious-by-default.

The "cat and mouse" game of detecting and blocking Tor users will continue as long as the technology exists, and so will policing efforts by intelligence agencies. That may result in questionable investigations or overreach. In 2013, the FBI seized control of Freedom Hosting, which served websites over onion services, and used this seizure as an opportunity to exploit Tor users via a security hole in Firefox. Not only were websites taken down that had nothing to do with illegal activity, but users of Tor were attacked by malware in an attempt to identify them. This pattern continues, with the FBI using spyware to catch child predators and temporarily hosting child porn websites in sting operations. These stings are clearly effective, resulting in many convictions, whatever one thinks about the methods involved.

We are bombarded by stories of cybercrime and hyperbole about the Dark Web and it's worth taking a deep breath and trying to put the issues in perspective. Laurin Weissinger, Lecturer in Law and Cyber Fellow at the Center for Global Legal Challenges at Yale Law School, reminds us that the utility of Tor for criminals is not much different than that of any efficient and privacy-preserving technology:

In the [Cybersecurity class at Yale Law School] I did speak about the criminal aspects: why would criminals move to the Tor network, and so on. We underline the fact that this is just a rational move by criminals. If you want to, for example, host a forum where illegal stuff can be bought and sold, you would use the most privacy-enhancing technology available, which is Tor. At the same time, there are also a ton of illegitimate websites on what I’ll call the clear web. We know that any technology that exists will be used for illegitimate, criminal, morally problematic reasons. If criminals are using things like the Tor network, email encryption, secure messaging, etc, it means that these technologies appear to be offering some level of protection.

Question

7. How much crime is on the Dark Web?

This question may be impossible to answer, but there have been a handful of studies of Tor onion services, since Tor is the largest and best-known anonymity network.

A 2016 study concluded that 57% of Tor onion services contained illicit or illegal content. The reliability of that study is questionable, however, because the majority of hidden services that the researchers tried to access did not seem to serve any content, and only the sample of websites that they could access was considered.

Since they were using an automated website crawler to detect and classify content served up by onion services, there may be an interesting paradox at play here: onion services configured to serve "legitimate" content secretly between small groups and individuals would have been dismissed, especially if those sites actively block crawlers or have other access controls. Marketplaces of illicit and illegal content, on the other hand, would be designed for easy access via the .onion address and would not be hard to index and scrape by the website crawler.

The numbers can also be seen in a more positive light: of the 2,723 onion services declared active by the researchers, 423 were put in the largest illicit category: "drugs". What someone thinks of drug sales on the Dark Web, including drugs that are not narcotics such as Viagra, is going to vary by country, society, and culture.
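To put those two counts in proportion, the share can be worked out directly from the figures quoted above. The function name is illustrative and no other numbers from the study are assumed.

```python
def share(part: int, whole: int) -> float:
    """Return part / whole as a fraction between 0 and 1."""
    return part / whole


ACTIVE_SERVICES = 2723  # onion services declared active by the researchers
DRUG_SERVICES = 423     # services placed in the "drugs" category

drug_share = share(DRUG_SERVICES, ACTIVE_SERVICES)
print(f"{drug_share:.1%}")  # prints 15.5%
```

In other words, the largest illicit category accounts for roughly one in six or seven of the services the researchers counted as active.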

A 2015 study found similar results, with drugs making up the majority of illicit activity. This study was more comprehensive, however, and the researchers were more successful at accessing services and classifying content into more diverse categories. They observed 45,000 active onion services on average and 80,000 total throughout the study (with many of those only appearing for a short period of time before disappearing). Of these, the majority of content was non-illicit or legal, in categories like "news", "blog", "search", "forum", "wiki", and "mail". What someone thinks of categories like "whistleblower", "gambling", "market", and "drugs" is largely going to be decided by their politics, religion, and ideology, and any categorization of website content requires further digging to be robust.

2018 and 2019 maps of onion services compiled by security firm Hyperion Gray reveal screenshots of illicit and disturbing content alongside more innocent-looking websites. The researchers are very careful in their conclusions, acknowledging limitations similar to those present in older studies:

[T]opic modeling is a noisy process with significant limitations. Furthermore, the dataset we have here is limited by the fact that we only look at the home page of each site. Indeed, topic #2 appears to relate just to login screens, suggesting that there may be a lot of content on the dark web that we can't even see without acquiring a login.

Question

8. What is I2P?

The Invisible Internet Project (I2P) is an anonymity network that focuses on its role as a darknet, not on sending traffic from the clear Internet through it. In this way it is distinct from Tor, but that's not the only difference.

I2P might be considered to have a more advanced design due to its dynamic routing of traffic. I2P uses a distributed network database and peer selection to choose the computers that your data travels through, in contrast to the directory-based process that Tor uses to build a circuit. This allows I2P to route around congestion and outages automatically, with routes updated constantly and dynamically.

Tor, in contrast, keeps using the circuit it built for a connection until that circuit is replaced, for example when you ask to generate a new one. This single, two-way circuit is more vulnerable if a spy manages to break in, as both directions of a conversation would be intercepted. I2P instead uses separate one-way tunnels for outbound and inbound traffic, so an attacker who intercepts a single tunnel sees only one direction of the exchange, making it much harder to assemble entire conversations.

On I2P, multiple messages are encrypted together to make it extremely difficult to distinguish between them. This is called "garlic routing", in contrast to the onion routing of the Tor network.
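The bundling idea can be sketched in a few lines. This is a toy model, not I2P's real message format or cryptography: messages ("cloves") are length-prefixed, concatenated, and then encrypted as one blob with an insecure XOR cipher used purely for illustration. All names and the shared key are hypothetical.

```python
import hashlib
import struct


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Insecure XOR cipher for illustration; applying it twice decrypts."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(b ^ k for b, k in zip(data, stream))


def pack_cloves(messages: list) -> bytes:
    """Concatenate messages, each prefixed with its 2-byte length."""
    return b"".join(struct.pack(">H", len(m)) + m for m in messages)


def unpack_cloves(blob: bytes) -> list:
    """Split a packed blob back into the original messages."""
    messages, offset = [], 0
    while offset < len(blob):
        (length,) = struct.unpack_from(">H", blob, offset)
        offset += 2
        messages.append(blob[offset:offset + length])
        offset += length
    return messages


key = hashlib.sha256(b"hypothetical shared key").digest()
cloves = [b"chat message", b"delivery status", b"file chunk"]

garlic = toy_cipher(key, pack_cloves(cloves))      # one opaque blob on the wire
recovered = unpack_cloves(toy_cipher(key, garlic))  # recipient decrypts and splits
print(recovered)
```

An observer who sees only the single encrypted blob cannot tell how many messages it carries or where one ends and the next begins.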

Question

9. What is a "mix network"?

Mix networks are designed to be private and anonymous darknets, a version of the Dark Web with a robust and advanced topology. Mix networks utilize a chain of proxy servers called "mixes" which gather messages from multiple senders and then shuffle them, creating a random order before sending the data to the next node on the network. Depending on the design of the mix network, this next connection might also be a mix node, which would shuffle the information again.
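The batching and shuffling described above can be sketched as a small class. This is a simplified model under stated assumptions: a real mix also decrypts a layer of each message and often adds delays or cover traffic, none of which is shown here. MixNode and its batch_size threshold are names invented for this example.

```python
import random


class MixNode:
    """Toy mix: holds messages until a batch is full, then releases them shuffled."""

    def __init__(self, batch_size: int):
        self.batch_size = batch_size
        self.pool = []
        self.rng = random.SystemRandom()  # OS-backed randomness for the shuffle

    def receive(self, message: bytes) -> list:
        """Queue a message; return a shuffled batch once enough have arrived, else []."""
        self.pool.append(message)
        if len(self.pool) < self.batch_size:
            return []
        batch, self.pool = self.pool, []
        self.rng.shuffle(batch)
        return batch


# Chain two mixes: whatever the first releases is fed into the second.
first, second = MixNode(3), MixNode(3)
sent = [b"from alice", b"from bob", b"from carol"]
delivered = []
for message in sent:
    for forwarded in first.receive(message):
        delivered.extend(second.receive(forwarded))

print(delivered)
```

Because nothing leaves a mix until a full batch has arrived, an eavesdropper watching arrivals and departures cannot simply match messages by timing or order.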

Parts of the messages are encrypted to make it near-impossible for an eavesdropper to decipher the communication. The encryption process differs between mix networks. Both the onion routing of Tor and the garlic routing of I2P are examples of mix networks, as are MIT's Riffle implementation, Freenet, and anonymous remailers such as Mixmaster.

Potential weaknesses in the low-latency mix networks of Tor and I2P and their outproxy design mean that you may hear "mixnets" being contrasted with onion routing and garlic routing, or find that users with a solid understanding of mix networks use Tor in combination with I2P.

Question

10. Is there a Dark Web beyond Tor and I2P?

There are many ways to send traffic via computer networks that might be considered part of the Dark Web. The Dark Web doesn't have a fixed definition that requires any specific technology, and is instead defined in contrast to the "clear" Internet.

As such, networks such as Freenet which utilize distributed data stores might also be considered parts of the Dark Web. Emerging mix networks such as Riffle are usually included as well. Sometimes any peer-to-peer (P2P) or friend-to-friend (F2F) network is included in the conversation, and both GNUnet and Retroshare are good examples given their strong privacy protection.

As more and more Internet and Web-based services "go dark", the Dark Web distinction starts to break down. It may be useful terminology to encompass a handful of technologies now, but is already starting to lose favor in hacker and geek circles and is instead becoming relegated to the marketing of cybersecurity software.

Since Tor is by far the largest network that is commonly called the Dark Web, and only a tiny percentage of Tor users actually use onion services that are hidden from the clear Web, maybe the concept of "dark" versus "clear" is overblown. As Roger Dingledine stated at DEFCON in 2017, "There is basically no Dark Web. It doesn’t exist... It’s only a very few webpages."

With this in mind, it's important not to fall into the trap of fearmongering that surrounds technologies that are lumped together as the Dark Web. If the term "Dark Web" really is inaccurate or misleading, maybe the hype about its existence and scale is unwarranted. Attempts to identify Dark Web users and content could be a waste of energy and resources.

There may only be small corners of the Internet where people can go "off the radar", and those areas may grow or shrink in the future, depending on your outlook. However, if a handful of small and anonymous or secret networks can cause such a fuss, the technologies that underpin their existence must be very disruptive and powerful indeed.