As interest in and news coverage of the U.S. presidential election reach a fever pitch, scams and deceptions of various kinds have targeted the news-hungry populace. At NortonLifeLock, we have been helping to detect malicious web content and to protect our customers from it for many years. Of late, we have been tracking election-related threats that pose a wide variety of risks. In doing so, we leverage Certificate Transparency logs, an openly available source of information by which we can identify potentially malicious websites well before they are visited by our customers.
In studying more than 3,689 sites manually and with automated tools, we estimate that there are at least 2,200 functioning website domains pertaining to the U.S. presidential election that have obtained valid TLS certificates since January of 2019. With so many relevant domains in existence, distinguishing official sites from lookalikes is a challenge.
Of particular concern are the dozens of unofficial sites that request financial contributions. Some of the sites we observed have been found to host malware. The content hosted on many of these sites is misleading, and we found more than a few that promote disinformation. A significant portion of the political opinion and merchandise sites we observed are hosted in foreign countries.
In this article, we provide additional details about how we collected website data from Certificate Transparency logs, our findings, and our recommendations for staying safe online while staying informed about the 2020 US presidential election.
Most internet users are familiar with the little padlock icon next to their browser’s address bar. This icon indicates whether the website being visited is accessed via a secure encrypted connection or not. Nowadays, most websites (and especially those with sensitive content) employ encryption in order to ensure the security and integrity of the connection to and from their users.
To that end, every website that wishes to use the encrypted HTTPS protocol and avoid prominently displayed browser warnings (by ensuring that the browser recognizes the connection as secured and belonging to the site) must obtain a TLS certificate from one of a small group of Certificate Authorities. Website providers have strong incentives to do so, given that the Google Chrome browser (which has 65% market share) pioneered a push to provide a prominent “Not Secure” warning for unencrypted web HTTP traffic. The HTTPS protocol ensures that a web browser can validate that the connection to the webserver is not tampered with and that it is encrypted with a certificate issued by a trusted Certificate Authority that matches the domain that is displayed in the browser.
Certificate Authorities (CAs) abide by the framework of Certificate Transparency, which means that they publish the details of all certificates they grant on public Certificate Transparency (CT) logs. This means that any and all website providers who wish to grant their users the expected assurance of a valid HTTPS connection will have their certificate and website’s domain name displayed in a publicly accessible CT log. This enables any interested parties to examine the site shortly after its creation, to examine the registration of its domain name by querying the publicly accessible WHOIS database, and to generally investigate further.
Ideally, certificates also give some guarantees as to the identity of the website owner, but most digital certificates are issued without meaningful identity verification. This can easily lead to incorrect assumptions and overconfidence on the part of website visitors with respect to the entities that published the website.
This shortcoming of digital certificates was one of the primary motivations for the creation of CT logs. By monitoring these logs, owners of trademarks and established web services can check to see if their brands are being used in domain names by other parties. For example, at NortonLifeLock we use CT logs to detect and react to websites that attempt to deceive customers by offering fake Norton and LifeLock technical support services, and also warn our customers about sites that impersonate other entities.
Monitoring the US Presidential Election
Figure 1: On the left, we plot the cumulative growth of Certificate Transparency log entries that include the word “trump” or “biden” as a substring from January 1, 2019 through October 8, 2020. When we require that the domain contain both “Donald” and “Trump” as substrings, or both “Biden” and either “Joseph” or “Joe”, we see that the trend is similar, though the number of domains is substantially smaller. Note that these numbers capture any entry that involves the candidates’ names, whether supporting or opposing the respective candidate.
The enthusiasm around the U.S. presidential election has resulted in a tremendous number of websites that advocate for or against the candidacy of Donald Trump and Joe Biden, as seen in Figure 1. For clarity, we should note that these numbers reflect new certificates issued from January 1, 2019 through October 8, 2020 for domains that contain references to the candidates’ names. They do not reflect all websites on the internet relating to the candidates. Furthermore, note that the numbers for each candidate refer to all entries involving that candidate’s name, whether in favor, opposing, or neutral. Donald Trump’s status as a sitting president and a celebrity is a potential reason why the number of websites devoted to him is far higher than the number of sites dedicated to Joe Biden.
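The two filters described in the Figure 1 caption, and the cumulative count they feed, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the (issue date, domain) sample pairs are hypothetical, and the sketch does not reproduce the tooling we actually used to read the CT logs.

```python
from collections import Counter
from datetime import date
from itertools import accumulate


def loose_match(domain, keyword):
    """True if the keyword appears anywhere in the domain (case-insensitive)."""
    return keyword in domain.lower()


def strict_match(domain, candidate):
    """Stricter filter: the surname and a first name must both appear."""
    d = domain.lower()
    if candidate == "trump":
        return "trump" in d and "donald" in d
    if candidate == "biden":
        # "joe" is not a substring of "joseph", so both spellings are checked.
        return "biden" in d and ("joseph" in d or "joe" in d)
    raise ValueError(f"unknown candidate: {candidate}")


def cumulative_counts(entries, predicate):
    """Return [(day, running_total)] for entries whose domain satisfies predicate."""
    per_day = Counter(day for day, domain in entries if predicate(domain))
    days = sorted(per_day)
    return list(zip(days, accumulate(per_day[d] for d in days)))


# Hypothetical (issue date, domain) pairs for illustration; not real CT log data.
SAMPLE_ENTRIES = [
    (date(2019, 1, 5), "trumpetsthechicken.com"),
    (date(2019, 3, 2), "donaldtrump-fans.example"),
    (date(2019, 3, 2), "joebiden-news.example"),
    (date(2020, 6, 1), "biden2020.example"),
    (date(2020, 6, 1), "trumpstore.example"),
]

trump_loose = cumulative_counts(SAMPLE_ENTRIES, lambda d: loose_match(d, "trump"))
trump_strict = cumulative_counts(SAMPLE_ENTRIES, lambda d: strict_match(d, "trump"))
biden_loose = cumulative_counts(SAMPLE_ENTRIES, lambda d: loose_match(d, "biden"))
biden_strict = cumulative_counts(SAMPLE_ENTRIES, lambda d: strict_match(d, "biden"))
```

Note that the loose filter also counts unrelated domains such as “trumpetsthechicken.com”, which is why a further filtering step is needed.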
Table 1: Breakdown of domains appearing in the CT logs that contain the string “trump” or “biden”. We first filtered out irrelevant sites using word segmentation, and then a team of 14 NortonLifeLock employees manually examined a random sample of 3,424 “trump” sites and 284 “biden” sites. Fewer than half of these were initialized sites that contained non-template content, and nearly two-thirds of the initialized sites were relevant to the presidential election. We extrapolated from these results to estimate that there are 2,040 functioning domains with “trump” as a substring and 151 with “biden”.
To get a better sense of the nature of these websites, we identified all websites in the CT logs that contain the candidates’ names as substrings, and further filtered them down by using word segmentation to remove irrelevant sites. When applied to website domains, word segmentation attempts to split the domain name into individual words. For example, when applied to “trumpetsthechicken.com”, it yields the individual words “trumpets”, “the”, and “chicken”, and when applied to “innabidentalcare.com”, it yields “innabi”, “dental”, and “care,” showing that neither of these sites is relevant to the U.S. presidential election despite containing the substrings “trump” and “biden”, respectively. As shown in Table 1, word segmentation filters out many irrelevant domains.
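As a rough illustration of the idea, word segmentation can be sketched with a small dynamic program over a word list. The sketch below is a simplification and not the segmenter we used: the tiny vocabulary, the cost weights, and the function names are all assumptions chosen for the example, whereas production segmenters typically rely on word-frequency statistics.

```python
def segment(name, vocabulary):
    """Split `name` into tokens, preferring few tokens and few unknown characters.

    best[i] holds the cheapest (cost, tokens) split of name[:i]. Each token costs 1,
    and a token that is not in the vocabulary costs an extra 10 per character.
    """
    n = len(name)
    best = [(0, [])] + [None] * n
    for end in range(1, n + 1):
        candidates = []
        for start in range(end):
            piece = name[start:end]
            prev_cost, prev_tokens = best[start]
            cost = prev_cost + 1
            if piece not in vocabulary:
                cost += 10 * len(piece)
            candidates.append((cost, prev_tokens + [piece]))
        best[end] = min(candidates, key=lambda c: c[0])
    return best[n][1]


def is_election_related(domain, vocabulary, names=("trump", "biden")):
    """A domain is kept only if a candidate's name appears as a whole token."""
    label = domain.lower().split(".")[0]  # drop the TLD and any other suffix
    return any(token in names for token in segment(label, vocabulary))


# Hypothetical toy vocabulary; a real word list would be far larger.
VOCAB = {"trump", "trumpets", "the", "chicken", "biden",
         "dental", "care", "joe", "for", "president"}

chicken_tokens = segment("trumpetsthechicken", VOCAB)
dental_tokens = segment("innabidentalcare", VOCAB)
```

With this toy vocabulary, the two domains from the text split into tokens that do not include a candidate’s name, so `is_election_related` rejects them, while a hypothetical domain such as “joebidenforpresident.example” is kept.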
To gain further insights into these websites, we recruited a team of 14 NortonLifeLock employees who jointly categorized 3,689 websites, 3,424 of which contain “trump” as a substring and 284 of which contain “biden” (19 sites contain both “trump” and “biden”). A majority of these sites are not initialized, meaning they have no content at all (accessing the root domain page returns a 404 error), are listed as for sale, or display a generic website template with no content relevant to the domain name. The percentage of initialized “biden” sites was slightly higher (38.0%) than the percentage of initialized “trump” sites (34.1%), though the raw number of “trump” sites is much higher (3,424 sampled sites vs 284). Among the initialized sites, a nearly identical fraction is relevant to the presidential election for both candidates (64.4% for “trump” and 64.8% for “biden”).
Having manually determined that 752 of the 3,424 “trump” sites are both initialized and relevant to the presidential election, and that 70 of the 284 “biden” sites are both initialized and relevant, we use these ratios to estimate that there are 2,040 working sites with “trump” in their domain name, and 151 working sites with “biden” in the domain name. While it is tempting to read into these statistics and to treat them as an informal polling mechanism, we caution that a majority of the sites focused on the political candidates may be characterized as attack or negative sites. Therefore, the fact that President Trump has more than 10 times as many domains focused on him is not necessarily an indication that he has more support than Joe Biden. More importantly, what we did learn from our manual inspection of these sites, and by running them through a variety of blacklists and measurement tools, is that many of these sites raise a variety of security concerns.
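The extrapolation is simple proportional scaling: the share of sampled sites that are both initialized and relevant is applied to the full set of keyword domains. The sketch below assumes a hypothetical helper name, and because the size of the full set is not restated in this paragraph, the population of 1,000 domains is purely illustrative.

```python
def extrapolate(sample_hits, sample_size, population_size):
    """Scale the hit rate observed in a random sample up to the whole population."""
    return round(population_size * sample_hits / sample_size)


# Sample counts taken from the text above.
trump_rate = 752 / 3424   # about 22.0% of sampled "trump" sites
biden_rate = 70 / 284     # about 24.6% of sampled "biden" sites

# Illustrative population size only; not a figure from our data.
example_estimate = extrapolate(752, 3424, 1000)
```

The estimate is only as good as the randomness of the sample, which is why the manually examined sites were drawn at random from the filtered list.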
Under the Hood: What’s on Some of These Websites and Why You Need to Be Careful
Having gained a hands-on sense of thousands of election-related sites through direct experience, we set out to determine what security concerns might be lurking among websites of this nature. They include the following:
- Outright malicious sites that can compromise your computer with malware.
- Phishing websites and official-looking sites that are not official.
- Questionable requests for financial contributions for which the actual use of the funds seems difficult or impossible to verify.
- Deliberate disinformation and dissemination of debunked news reports, including some registered outside of the United States, possibly attempting to influence American voters.
- Misleading bait-and-switch domain registrations.
- Sites containing strong, NSFW language and imagery that are not child-safe.
Perhaps the most traditional form of malicious site is one that disseminates malware, potentially causing a visit to the website to result in one's device being infected by a computer virus or other threat. These websites may be owned by bad actors and permanently host malicious content but are more likely to be vulnerable websites that have been compromised and used to disseminate malware. In addition, a website may inadvertently host malicious advertisements that are able to compromise or deceive visitors into downloading malicious content.
To give a sense of this, an inspection of websites that feature Trump or Biden uncovered 33 websites that have been flagged as hosting malicious content. These websites host a mix of political videos, news, merchandise, and more.
Phishing and Unofficial Sites
Phishing typically involves the practice of creating a false website that appears to be official at a casual glance. Such websites that appear in CT logs have typically obtained a valid TLS certificate for their sites, by which scammers ensure that their website features a lock icon next to their domain name, which makes their site seem more official.
Typically, phishing is designed to fool visitors into thinking that they are visiting an official site and therefore prompt them to provide account login credentials, personal information, or payment information. While phishing websites are most notorious for targeting financial websites such as banking sites, the financial and other damages that can result from political phishing sites are very concerning.
Making matters worse is the vast realm of domain names that prominently feature Trump or Biden as keywords and that can make a domain appear to be official. We observed 54 websites categorized as phishing sites, such as “trumpglory.com” and “trumpphonecase.com”, both of which have been identified by security companies as malicious and labeled by Google Safe Browsing as unsafe sites on major browsers, as shown in Figure 2.
Donations to Questionable Entities
Our analysis also revealed several sites asking visitors to make donations. Upon visiting the page, the user is directly greeted with a form asking for personal information, including credit card details (see Figure 3 for an example).
It is usually unclear how trusted the entity behind the site is, in terms of distributing the collected donations to the intended political recipients. For example, we observed instances where a donation button on a page redirected the user to the site owner's personal PayPal page. In total, we identified 21 pro-Trump and 10 anti-Trump sites that request donations and lack credible information about the collecting entity.
Bait and Switch Domains
Bait-and-switch domains typically redirect the user to a site with content that is very different in nature to what the user intended to see. For instance, “trumppence.republican” redirects to the policy page on ex-presidential candidate Andrew Yang’s campaign website, while “joebidenwebsite.com” redirects to “donaldjtrump.com”, which is President Trump’s official website. In most cases, a bait-and-switch domain gives the impression it supports one candidate or sells merchandise pertaining to that candidate, while actually focusing on and supporting the other, though some redirect to content that does not pertain to the presidential election at all.
Most bait-and-switch instances involve website redirections, but others give the impression that they support a candidate and instead host content that appears to be designed to persuade visitors of the candidate’s defects. One of the earliest instances of this was “joebiden.info”, which was ranking above Biden’s official campaign page on Google’s search results in early May of 2020. While none of the bait-and-switch sites we observed are inherently dangerous to their visitors, they represent a substantial class of deceptive sites. We manually identified 53 bait-and-switch sites with “trump” in their title and nine with “biden” in the title.
Potentially Misleading & Disinformation
Our volunteers manually visited 3,689 sites and found that nine of them contained misinformation or potentially misleading information. Eight of these sites had “trump” in the title while one had “biden” in the title. Of the ones with “trump” in the title, six were pro-Trump and two were anti-Trump. The site with “biden” in the title was anti-Biden. We identified a site as trafficking in misinformation if it contained a claim that we found had been debunked.
Foreign Election Sites
It is noteworthy that many U.S. election sites that attempt to influence voters and that peddle political merchandise are registered outside of the U.S. It is perhaps natural that U.S. presidential politics should attract the interest of much of the rest of the world, though the vast majority of these foreign-registered sites do not disclose that they are based outside the U.S.
To identify such sites, we looked at two sources of information. First, when a domain is registered, the website owner must provide the domain registrar with personal information. This registration information can be retrieved from the publicly accessible WHOIS database.
Traditionally, WHOIS data, including the home address and name of the site’s owner, was entirely public, but domain privacy initiatives give website owners some degree of anonymity by protecting the personally identifying content of domain registrations. Still, the city and country used by the website owner in registering the domain often survive and give us some insight.
The second source of information is the IP address returned by the webserver. IP addresses can be geo-located to a certain extent, though nearly all large websites, and many small ones, use intermediary web-hosting services that optimize user browsing experience by serving up the site’s contents from a location in the same geographic region as the client. Website owners can also use these services to obfuscate their site’s physical location, and website owners may also use other techniques to hide their location.
Despite all these caveats, many political websites are either registered to addresses outside of the U.S. or served up from servers outside of the U.S. We believe that the vast majority of these sites pertain to individuals living outside of the U.S., because there is little plausible incentive for the owner of a political opinion website focused on the U.S. presidential election to feign a foreign-owned IP address.
Here, we examined 9,900 domains. We were able to get WHOIS data for 7,735 domains. Of these, 1,998 domains have information about the country of registration that is not protected by WHOIS privacy, and 449 of them, representing 22.5% of the domains with visible country information, were registered outside the U.S. The most common foreign countries of registration were India (74), Canada (57), and China (41).
Of the 7,735 domains, we were able to get IP information for 7,057, and we were able to read country information from 6,938. The servers in this dataset are located primarily in the United States, but we see significant numbers in Canada and Germany as well, as shown in Figure 4. One possible explanation for the high number of Canada-based servers is that our IP lookup scan was performed from Canada, which meant that sites with multiple servers around the world might return a Canadian IP.
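The WHOIS tally can be sketched as follows. This is a minimal illustration, assuming raw WHOIS responses that use the common “Registrant Country:” field with two-letter country codes; the privacy markers, function names, and sample records are hypothetical, and real WHOIS output varies considerably between registrars.

```python
import re
from collections import Counter

# Assumed field label; many (not all) registrars format the country this way.
COUNTRY_RE = re.compile(r"^[ \t]*Registrant Country:(.*)$",
                        re.IGNORECASE | re.MULTILINE)
# Hypothetical markers indicating that the value is hidden by a privacy service.
PRIVACY_MARKERS = ("redacted", "privacy", "not disclosed")


def registrant_country(whois_text):
    """Return the registrant country code, or None if missing or privacy-protected."""
    match = COUNTRY_RE.search(whois_text)
    if match is None:
        return None
    value = match.group(1).strip()
    if not value or any(marker in value.lower() for marker in PRIVACY_MARKERS):
        return None
    return value.upper()


def foreign_share(countries, home="US"):
    """Share of domains with a visible country that are registered outside `home`."""
    known = [c for c in countries if c is not None]
    if not known:
        return 0.0, Counter()
    foreign = Counter(c for c in known if c != home)
    return sum(foreign.values()) / len(known), foreign


# Hypothetical WHOIS snippets for illustration only.
SAMPLE_RECORDS = [
    "Domain Name: a.example\nRegistrant Country: US\n",
    "Domain Name: b.example\nRegistrant Country: IN\n",
    "Domain Name: c.example\nRegistrant Country: REDACTED FOR PRIVACY\n",
    "Domain Name: d.example\n",
]

countries = [registrant_country(r) for r in SAMPLE_RECORDS]
share, by_country = foreign_share(countries)
```

Applied to the counts reported above, the same calculation gives 449 / 1,998, or roughly 22.5%.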
We plotted the server locations on a map, which can be seen in Figure 5 below.
Table 2: Domains using the “trump” or “biden” keyword whose WHOIS data city and country data indicates that they are registered inside or outside of the US. Many domains hide their true country of registration by using the WHOIS privacy feature. Those domains are excluded from this table.
Table 3: Domains that are served up from IP addresses that are in vs not in the US. The website requests were made from a Canadian IP address.
Strong Language and Other NSFW Content
The 2020 presidential election has been highly polarizing and has elicited strong sentiments, both in favor of and against the two principal presidential candidates. This has manifested most plainly in the use of strong language. In addition, many political-humor sites contain sexually explicit commentary and even pornography and pornographic references that many parents would consider to be inappropriate for young children.
We use a Deep Neural Network capable of detecting pernicious content by recognizing language nuances embedded in text (for example, slang, or implicit meaning from combined words and phrases). We use it to detect potentially toxic content on U.S. election websites. We note that our neural network was not trained on political content.
We find 50 websites with such content, 27 (54%) of which we can associate with a specific candidate. Figure 6 depicts the percentage of websites with strongly phrased content associated with President Trump or Joe Biden, as well as with each political party. In total, we observe 22 (44%) websites related to Trump and one (2%) related to Biden. The larger number of websites associated with President Trump is possibly due to the considerably larger number of websites associated with him in general. When we group by political party, we see 24 websites for Republicans and five for Democrats. The remaining four Democratic websites are associated with Kamala Harris’s candidacy.
Figure 6: On the left, the percentage of websites with strong (potentially toxic) language associated with Trump and Biden, out of all election websites with strong language. On the right, the percentage of websites with strong language associated with candidates of each party.
Summary and Safety Recommendations
Table 4: A summary of our statistical findings pertaining to the US Presidential election, which highlights the number of domains we discovered that pertain to the election and that feature the word “trump” or “biden” (20 domains feature both keywords). The denominators in the “Both” column reflect what was being measured: we observed 9,880 sites, of which we manually categorized 3,689.
Among the sites pertaining to the 2020 presidential election are many that contain problematic content. Table 4 summarizes our findings, which highlight websites that feature either deceptive content or outright malicious content designed to infect computing devices or to cause financial loss.
With all these threats and deceptive content out there, how do you stay safe online while satisfying cravings for news about the presidential election? Below are general safety tips that also apply to other situations on the web:
- Be wary of sites that ask for personal information and your financial information. When seeking to donate to a political cause, the best way to ensure that your funds are used as you intend is to donate to your preferred candidate or political party directly, where clear legal guidelines are in place.
- Dangerous sites may be reached in a variety of ways. Even users that do not speculatively browse the web are likely to find their way onto malicious and deceptive political sites in one or more of the following ways. First, most of these sites perform careful search engine optimization to cause them to rank high on search engines. Second, hackers and promoters of disinformation are often able to cause stories to trend on such social media platforms as Twitter, Facebook, and Instagram. Third, malicious actors may spam email inboxes with links to their websites. In general, we advise users to be cautious about the sites they visit and about following links from social media or email out onto the wild-wild web.
- Do not download software from any but the most trusted websites. Several of the most malicious sites claim to provide videos pertaining to a political candidate, and as in the case of sites that offer free streaming of videos that would normally cost money, are unlikely to be truly offering something for nothing. Many if not most sites of this nature encourage visitors to download malicious software under a pretense that it will enable the user to view the video or stream.
- Finally, there are many fact-checking websites that exist solely to combat disinformation and to provide easy access to the facts. Many of these sites, such as Politifact.com and Snopes.com, are politically neutral and have searchable mobile apps.
Editorial note: Our articles provide educational information for you. NortonLifeLock offerings may not cover or protect against every type of crime, fraud, or threat we write about. Our goal is to increase awareness about cyber safety. Please review complete Terms during enrollment or setup. Remember that no one can prevent all identity theft or cybercrime, and that LifeLock does not monitor all transactions at all businesses.
Copyright © 2020 NortonLifeLock Inc. All rights reserved. NortonLifeLock, the NortonLifeLock Logo, the Checkmark Logo, Norton, LifeLock, and the LockMan Logo are trademarks or registered trademarks of NortonLifeLock Inc. or its affiliates in the United States and other countries. Other names may be trademarks of their respective owners.