Web analytics have existed almost as long as the internet. Even in the early 90s, anyone who cared could look up their website's server logs and get the IP addresses of anyone who visited.
The ability to read those jumbled numbers as people has come far even in the last few years. Some website visitor identification software, such as Warmly, now claims to identify up to 65% of anonymous website traffic.
What is the point of website visitor de-anonymization?
Website traffic is an incredibly rich source of information for B2B companies. Knowing who is coming to your site or looking at the pricing page lets sales teams know who is already interested in their product and streamline the process to find qualified leads.
For marketers, knowing what website visitors are reading and where they came from can tell the team if campaigns are driving visitors from the right ICP.
If prospects are already on your website, it’s also a naturally good time to engage with them, which can further increase conversion rates across your funnel.
Tools and Techniques for Identifying Anonymous Website Visitors
When a company is on your site, there are a couple of different ways of getting information about them. IP location is the most common way of identifying broad stats such as the probable company. Visitor intelligence companies can then build a more complete profile of contact-level information through forms, outbound emails, logins and chat.
When anonymous users fill out a form with their contact information, the site can attach a cookie that associates that email address (a unique identifier) with the browser. The website will then be able to tell when that same browser returns. To be compliant with GDPR, the user must give opt-in consent to be tracked by cookies (e.g. by checking a box).
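As a rough illustration of the form-and-cookie flow described above, here is a minimal Python sketch. The `visitor_id` cookie name, the in-memory `VISITOR_EMAILS` store, and both function names are assumptions made for the example, not any vendor's actual implementation.

```python
import uuid
from http.cookies import SimpleCookie
from typing import Dict, Optional

# Hypothetical in-memory store mapping a random visitor ID to an email address.
VISITOR_EMAILS: Dict[str, str] = {}


def handle_form_submit(email: str, consented: bool) -> Optional[str]:
    """Return a Set-Cookie header value, or None if the visitor did not opt in."""
    if not consented:
        return None  # no opt-in consent, so no tracking cookie is set
    visitor_id = uuid.uuid4().hex
    VISITOR_EMAILS[visitor_id] = email
    cookie = SimpleCookie()
    cookie["visitor_id"] = visitor_id
    cookie["visitor_id"]["max-age"] = 60 * 60 * 24 * 365  # one year, an arbitrary choice
    cookie["visitor_id"]["path"] = "/"
    cookie["visitor_id"]["secure"] = True
    cookie["visitor_id"]["httponly"] = True
    return cookie["visitor_id"].OutputString()


def identify_returning_visitor(cookie_header: str) -> Optional[str]:
    """Look up the email tied to the visitor_id in a request's Cookie header."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("visitor_id")
    return VISITOR_EMAILS.get(morsel.value) if morsel else None
```

The cookie itself holds only a random ID. The email stays on the server, which is one common way to avoid storing contact details in the browser.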
When a sales team sends an outbound email with links, they can add a query parameter to each link that carries a hashed unique identifier tied to the prospect's email address. When the prospect clicks, the query parameter passes that identifier through to the website, which can then set or read a cookie to link the person to their activity.
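The tracked-link idea can be sketched in a few lines of Python. The `vid` parameter name, the `SECRET_KEY` value, and the helper names below are hypothetical. The sketch also assumes the base URL has no existing query string.

```python
import hashlib
import hmac
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical signing key


def prospect_token(email: str) -> str:
    """Derive a short keyed hash from a normalized email address."""
    normalized = email.strip().lower().encode("utf-8")
    return hmac.new(SECRET_KEY, normalized, hashlib.sha256).hexdigest()[:16]


def tracked_link(base_url: str, email: str) -> str:
    """Append the prospect's token as a 'vid' query parameter (assumed name)."""
    return f"{base_url}?{urlencode({'vid': prospect_token(email)})}"


def token_from_url(url: str) -> Optional[str]:
    """Extract the token on the website side, if the visitor arrived via a tracked link."""
    values = parse_qs(urlparse(url).query).get("vid")
    return values[0] if values else None
```

A keyed hash (HMAC) is used here instead of a plain hash so that someone who sees the link cannot confirm a guessed email address without the key. The website would keep its own token-to-prospect table to resolve the value.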
When web visitors sign in with a third party like Google, Microsoft, or Facebook, the verification service passes the person's email on to the website. Both the website and the third party can also attach cookies to track activity.
When anonymous users make a request on a website, such as logging in, the site also gets the user's IP address, which, combined with email and cookie information, helps with visitor identification.
When unique visitors use the live chat on a company's website, the person or machine behind it may ask for emails or contact information. This data can then be attached to cookie, browser and other information to form a more complete profile of unique visitors.
Before remote work, it was easy for visitor identification software to use firmographic data and IP location to guess what company is visiting. For example, the algorithm might know where Amazon is headquartered and where the servers are located, and safely assume what IP addresses belong to Amazon. Now with so many users working remotely, IP location is most useful in combination with other data points and methodologies.
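A simplified version of that IP-to-company guess might look like the Python sketch below. The company names and address ranges are invented for illustration (the ranges are reserved documentation blocks). Real tools rely on much larger firmographic datasets.

```python
import ipaddress
from typing import Dict, List, Optional

# Hypothetical firmographic table: company name -> known office or server ranges.
COMPANY_RANGES: Dict[str, List[str]] = {
    "Example Corp": ["192.0.2.0/24"],
    "Sample Inc": ["198.51.100.0/25", "203.0.113.0/24"],
}


def company_for_ip(ip: str) -> Optional[str]:
    """Return the first company whose known ranges contain the address, if any."""
    address = ipaddress.ip_address(ip)
    for company, cidrs in COMPANY_RANGES.items():
        if any(address in ipaddress.ip_network(cidr) for cidr in cidrs):
            return company
    return None  # e.g. a remote worker on a home ISP matches no known range
```

The `None` case is exactly the remote-work gap: an employee on a home connection falls outside every known range, so the IP signal has to be combined with other data.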
How Visitor Identification Tools Work to Identify Anonymous Visitors
Unless the user has given their explicit permission to be identified, contact-level information will be the result of a probabilistic waterfall. Most website visitor identification software is based on running data points through this waterfall to reconcile them and spit out the algorithm's best guess of who the visitor may be.
For example, a customer logging into Facebook from a certain IP address and then visiting a different website from the same IP address strongly suggests that both sessions belong to the same person.
There are also data aggregator companies that integrate with different platforms, which tell them in real time when someone uses one of those services. They can then pair that login information with other engagement data to figure out the IP address most closely associated with a user. This can be incredibly accurate for finding employees working remotely. The aggregator can also strip away the PII and compliantly sell that data to other companies.
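One way to picture the waterfall is as an ordered list of data sources, where the first source that produces a confident enough match wins. The Python sketch below is an assumption-heavy simplification: the source names, their ordering, the confidence scores, and the 0.5 threshold are all illustrative. Real products reconcile signals in more sophisticated ways.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Signal:
    source: str        # e.g. "login", "cookie", "email_link", "ip"
    identity: str      # the person or company this signal points to
    confidence: float  # assumed score between 0.0 and 1.0

# Sources ordered from most to least trusted (an illustrative assumption).
WATERFALL = ["login", "cookie", "email_link", "ip"]


def best_guess(signals: List[Signal], threshold: float = 0.5) -> Optional[Signal]:
    """Walk the waterfall and return the strongest signal from the first usable source."""
    for source in WATERFALL:
        candidates = [
            s for s in signals if s.source == source and s.confidence >= threshold
        ]
        if candidates:
            return max(candidates, key=lambda s: s.confidence)
    return None  # nothing cleared the threshold, so the visitor stays anonymous
```

For instance, given an IP match for a company at 0.6 and a cookie match for a contact at 0.9, this sketch would return the cookie match, because cookies sit higher in the assumed ordering.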
Ethical Considerations for Anonymous Website Visitor Identification
Laws like Europe's GDPR prevent companies from getting personally identifiable information (PII) unless the person has explicitly opted in to sharing that information.
Visitor intelligence companies can collect contact-level data on anonymous visitors, but they are only able to share company-level information. Some companies are skirting the laws by revealing more detailed information, such as the division where the visitor works or their job title. However, that can be considered PII in some jurisdictions.
Identifying Contact vs. Company (Protecting PII)
Personally identifying information is not uniformly defined in the United States. The Privacy Act, which regulates how PII can be used by federal agencies, notes that PII "is not anchored to any single category of information or technology. Rather, it requires a case-by-case assessment of the specific risk that an individual can be identified."
For European users, the GDPR protects any information that relates to a living person "who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person."
Since the GDPR applies only to natural persons, not companies, most anonymous visitor identification companies provide company-level information by collecting data from some combination of outbound emails, logins, and chats and then stripping away the PII.
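Stripping a contact-level profile down to company-level data can be as simple as dropping the personal fields, as in this Python sketch. The field names and the `PII_FIELDS` set are assumptions for illustration. What actually counts as PII depends on the jurisdiction, so the set errs on the side of including job title.

```python
from typing import Any, Dict

# Hypothetical list of fields treated as PII; a real list needs legal review.
PII_FIELDS = {"name", "email", "phone", "ip_address", "job_title"}


def to_company_level(profile: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the profile with every assumed-PII field removed."""
    return {key: value for key, value in profile.items() if key not in PII_FIELDS}
```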
How can companies access this data?
The most common way of getting the data for website visitor identification is buying it from vendors. Some service agreements include clauses that allow the provider to collect information through third-party cookies and sell that data to other companies.
CRMs like HubSpot allow users to turn on link tracking for forms and outbound emails. Similar to the visitor "hit counters" of the 1990s, companies can also put a discreet pixel on the website that reports when somebody comes through from a tracked link and cookies them.
Interestingly, some visitor intelligence companies are able to read HubSpot cookies natively. For example, if you use Warmly and HubSpot together, your Warmly dashboard can use the cookies HubSpot already added to prospects' browsers to identify them without sending additional outbound emails to get that information.
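For a sense of how a tracking pixel works under the hood, here is a minimal Python WSGI sketch. It is not how HubSpot or any particular vendor implements it. The `src` query parameter, the in-memory `HITS` log, and the `pixel_app` name are assumptions for the example.

```python
import base64
from urllib.parse import parse_qs

# A 1x1 transparent GIF, decoded from a widely used base64 string.
PIXEL = base64.b64decode("R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7")

HITS = []  # hypothetical in-memory log of pixel requests


def pixel_app(environ, start_response):
    """WSGI app: record who requested the pixel, then return the image."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    HITS.append({
        "source": query.get("src", ["unknown"])[0],  # assumed campaign parameter
        "referer": environ.get("HTTP_REFERER", ""),
        "ip": environ.get("REMOTE_ADDR", ""),
    })
    start_response("200 OK", [
        ("Content-Type", "image/gif"),
        ("Content-Length", str(len(PIXEL))),
        ("Cache-Control", "no-store"),  # discourage caching so each view is logged
    ])
    return [PIXEL]
```

A page would embed the pixel as an ordinary image tag pointing at wherever this app is served, for example through `wsgiref.simple_server.make_server` during local testing. A production version would also set or read a cookie at this point, subject to consent.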
What is the difference between IP vs. Cookie vs. Browser Fingerprinting?
There are a few broad categories for gathering data that feeds into the algorithms and waterfall processes that can then spit out a probable guess for anonymous users. The three main types are IP addresses, cookies, and browser fingerprinting.
Internet Protocol (IP) Addresses
Internet Protocol addresses are unique strings of numbers that identify the device you use to connect to the internet and the general area where it is connecting from. They can be used with other firmographic information, such as where a company is headquartered, to triangulate who might be visiting your website.
Internet Cookies
Internet cookies (also known as HTTP cookies) are small pieces of data saved by your browser that identify you to websites. The technology has been around since 1994. Cookies enable many of the features we expect from websites. Session cookies keep track of what you put in your online shopping cart and are generally considered "necessary" cookies. A "persistent" cookie might help websites remember user settings such as language preference.
When you log in to a website and check the box to "Remember me on this computer," you consent to the website using a first-party cookie to keep you logged in. While first-party cookies are limited to one website, third-party cookies can track your browsing history around the web.
To comply with the GDPR and California's CCPA, websites have to give people a choice about non-essential cookies: opt-in consent under the GDPR, and a way to opt out under the CCPA. Google also began testing restrictions on third-party cookies for a subset (1%) of Chrome users in January 2024 and announced plans to phase them out by default completely by the middle of the year.
Contact-level cookie tracking will always be more accurate than IP address because a single IP can be used by an entire company, or all the employees sharing the public Wi-Fi network at a WeWork.
Browser Fingerprinting
Browser fingerprinting is a method of tracking that involves collecting basic information about your browser and device and remembering that unique combination of features as a "fingerprint" of an individual.
Device fingerprinting is incredibly accurate. A study by the Electronic Frontier Foundation found that 83.6% of browsers had a unique fingerprint. In browsers that enabled Java or Adobe Flash, it was 94.2%. In a recent study, researchers trying to de-anonymize cybercriminals who use the anti-tracking browser Tor found their "proposed Tor anonymous traffic recognition method achieves 94.37% accuracy."
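Conceptually, a fingerprint is just a stable hash over a set of browser and device attributes. The Python sketch below shows the idea. The attribute names are illustrative assumptions, and real fingerprinting scripts collect far more signals (fonts, canvas rendering, plugins and so on) in the browser itself.

```python
import hashlib
import json
from typing import Dict


def fingerprint(attributes: Dict[str, str]) -> str:
    """Hash a set of browser attributes into a stable identifier."""
    # Sorting the keys makes the hash independent of the order attributes arrive in.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Example attributes a script might report (hypothetical values).
visitor = {
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64)",
    "screen": "1920x1080",
    "timezone": "Europe/Berlin",
    "language": "en-US",
}
visitor_fingerprint = fingerprint(visitor)
```

The more attributes that go into the hash, the more likely the combination is unique. No cookie is stored, which is why clearing cookies does not reset a fingerprint.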
What is the future of visitor de-anonymization?
There's been a big push in the last two decades towards data privacy and ownership. The GDPR and ePrivacy Directive regulate how websites in the European Union are allowed to track users, and require them to disclose that activity and ask for consent. However, most regulation surrounds the use of internet cookies. Browser fingerprinting remains largely unregulated, especially since it uses publicly available information. The industry shows signs of moving in this direction in a post-cookie world, which seems imminent.
Google Chrome accounts for 63.56% of internet browsers worldwide and 52.31% of browsers in the United States, followed by competitors Safari, Microsoft Edge and Firefox, which already block third-party cookies by default. Google's plan to phase out third-party cookies by mid-2024 will take significant air out of the market for cookie tracking.
In addition to browser fingerprinting, marketers are experimenting with grouping people into cohorts of similar interests or demographics in order to gather information free of PII. Google proposed Federated Learning of Cohorts (FLoC) as an alternative tracking mechanism and has since replaced it with the Topics API.
The last few decades have been a dance between B2B intelligence companies and privacy protections, both from governments (GDPR, CCPA) and from the market (e.g. browsers like Tor, Apple's App Tracking Transparency (ATT) framework on iPhones).
As the market moves away from third-party cookies, B2B go-to-market teams have to find new ways of identifying anonymous website visitors for boosting conversion rates, supported by a growing industry of website visitor identification software and methodologies.