How DNS Prefetching and Preloading Can Lead to Incorrect Conclusions

Coding used by web developers to improve the user experience (UX) of web browsing can cause data to be stored within a user’s device without the knowledge or interaction of the user. An untrained digital forensic analyst or a person reviewing the results of a forensic analysis that lacks proper context may make incorrect assumptions about a user’s activities.

HIGHLIGHTS

  • The Domain Name System (DNS) serves as the phonebook for the Internet, translating the domain names of Internet resources into Internet Protocol (IP) addresses.
  • DNS prefetch is a tool used by web developers to improve the UX while browsing a website.
  • DNS preloading is another tool used by web developers to anticipate resources a user will need and download those to the user’s system before actually being requested in order to speed the browsing experience.
  • DNS prefetching and DNS preloading can create Internet artifacts on a user’s system that were not searched for, requested, or knowingly placed on that system by the user.
  • An untrained person may misinterpret DNS prefetching and preloading as user-initiated activity and make incorrect assumptions.

RECOMMENDATIONS

Digital forensic analysts and those reviewing digital forensic reports should:

  • Scrutinize Internet artifacts before reaching any charging, disciplinary, or finding of fault decisions.
  • Understand the difference between cache, cookies, searches, typed uniform resource locators (URLs), and other forms of Internet evidence.
  • If reporting on Internet history for non-technical audiences, contextualize the forensic findings instead of simply providing a data dump of information and leaving the analysis to untrained individuals.
  • Be cautious of relying strictly on Internet artifacts such as the presence of cache, DNS entries, or cookies for decision making without other corroborating evidence.

TECHNICAL DETAILS

DNS prefetching causes browsers to resolve IP addresses before a user requests the information, and DNS preloading causes a browser to connect to Internet resources and download information without any knowledge or interaction of a user. Prefetching and preloading create entries on a system that can be mistaken for user-initiated web activity and lead to incorrect conclusions during a digital forensic examination. This post describes how this can happen and the technology behind DNS prefetching and preloading.

I was asked to consult on a criminal defense case by another digital forensic analyst who had completed an independent forensic analysis of a defendant’s computer and had questions about Internet history. The law enforcement forensic analysis revealed multiple Internet browsing artifacts for websites that appeared to be related to illegal activity, and these artifacts were used in part to support a criminal indictment. The defendant adamantly denied visiting any websites with names the same as, or even similar to, those highlighted in the law enforcement report.

Law enforcement had essentially created a “data dump” of browsing artifacts and provided that to prosecutors with no contextualization of the data, leading an untrained prosecutor to the conclusion that the defendant was involved in criminal activity.

 

The original forensic analyst asked me to help determine why those artifacts existed on the defendant’s computer. The defense analyst also did not see other artifacts that would normally be found such as search terms, downloads, visited pages, typed URLs, and others to support the prosecutor’s theory.

This case highlights a challenge with forensic analysts performing what is often called “triage forensics”, which essentially means that a cursory exam is done on the digital device with the intent to locate enough evidence to support the allegation(s). A full forensic analysis (sometimes referred to as “trial forensics”) often isn’t completed until a defendant disputes the allegations, and by then the wheels of the justice system are already well in motion against the defendant, oftentimes at immense reputational and financial expense.

There are undoubtedly issues with what has been described so far, but for this post I’m going to focus on the digital evidence. I should also say that while I’ve highlighted some shortcomings of a law enforcement process, my intent is not to insinuate that this represents all law enforcement analysts, because I know many who are exceptional.

Upon my examination, I did find entries in the computer for websites identified by law enforcement. There was no question that the forensic artifacts existed, but the problem was the lack of context or true analysis to explain why the artifacts existed. After some testing and further evaluation, I was able to determine that the cause of these artifacts was DNS prefetching and preloading.

To demonstrate how this technology works, I’ve created some videos and screenshots. I used Google Chrome as the browser for this example; however, all browsers I have tested work the same for this particular artifact.

I began with a browser session that had no Internet history associated with it and validated that no artifacts existed. The below screenshot shows that no cookies were present in Chrome. I did the same to ensure there were no downloads, browser history, or other artifacts from any previous sessions.

screenshot showing no cookies

Next I opened Chrome and navigated to the website www.msn.com. Using Google Chrome’s developer tools, I captured all of the content that is loaded in order to present the website to the user. This includes images, JavaScript, CSS, and other resources. This is all done without user interaction except for the navigation to the single URL of msn.com.

In the video below, you will see the resources being loaded as the page loads. This process is transparent to the user (unless using a tool like Developer Tools). Just to load www.msn.com there were over 400 requests for resources.

Side note: websites often use content delivery networks (CDNs) to increase loading speed. In a very basic explanation, CDNs distribute commonly requested assets for websites across geographically dispersed servers. For example, a website may have static content like JavaScript, images, and CSS files hosted in Amazon Web Services (AWS) or use a CDN provider like Cloudflare to offload the work of a web server and have faster load times of the content.

In the next video, I drill down into some of the content that was downloaded to my computer when pulling up www.msn.com. You will see the JavaScript, CSS, and image files as I click through them. All of the images that are clicked on and shown in preview mode would also be downloaded to my computer’s hard drive (private browsing can affect this, but for purposes of this blog, private browsing was not used). On the left side of the page, you will see sources of content such as bing.com, cdn.taboola.com, and others.
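To give a rough sense of how those third-party sources can be tallied outside the browser, here is a minimal sketch. It assumes the network activity was exported from Developer Tools as a HAR (HTTP Archive) file, which is not something shown in this post, and the sample data below is hand-written and hypothetical rather than a capture of msn.com.

```python
from collections import Counter
from urllib.parse import urlsplit


def requests_per_host(har: dict) -> Counter:
    """Count requests per hostname in a parsed HAR structure."""
    hosts = Counter()
    for entry in har["log"]["entries"]:
        host = urlsplit(entry["request"]["url"]).hostname
        if host:
            hosts[host] += 1
    return hosts


# Hypothetical, hand-written sample; a real export would be loaded with json.load().
SAMPLE_HAR = {"log": {"entries": [
    {"request": {"url": "https://www.msn.com/"}},
    {"request": {"url": "https://www.msn.com/static/app.js"}},
    {"request": {"url": "https://www.bing.com/pixel.gif"}},
    {"request": {"url": "https://cdn.taboola.com/loader.js"}},
]}}

counts = requests_per_host(SAMPLE_HAR)
for host, n in counts.most_common():
    print(f"{host}: {n}")
```

Only the first entry corresponds to something the user typed; every other host appears because the page asked for it.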

The below screenshot shows the DNS prefetch that occurs with this site. Similar to the concept of Windows Prefetch in the Microsoft Windows operating system (OS), DNS prefetch tells the browser to resolve the domain names of other web resources early in the page loading process to speed things up. A practical example of this is when a web developer places a simple contact form at the bottom of a webpage. Part of the contact form might be Google reCAPTCHA, used to reduce the likelihood of spam submissions to the form. Instead of waiting until a user scrolls to the bottom of the page to look up the server that hosts the reCAPTCHA JavaScript, the web developer places a prefetch hint at the top of the page so the lookup is already done and there is less delay when the user gets to the bottom. Imperva has a nice writeup on DNS prefetching.

Prefetching can be done for any domain, and it is simply a line of code entered into the site. A screenshot below shows the DNS prefetching done on msn.com. A developer could hard code any prefetch they wanted into a website and cause a browser that is navigating the site to reach out and resolve the domain names listed in the code.

MSN.com prefetch code
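For analysts who want to check a saved copy of a page for these hints, the following is a minimal sketch that lists the `<link>` elements acting as resource hints. The HTML sample and its example.* domains are hypothetical stand-ins, not the actual msn.com markup, and the set of `rel` values is my own selection of common hint types.

```python
from html.parser import HTMLParser

# rel values treated as resource hints in this sketch (an assumption, not an exhaustive list).
HINT_RELS = {"dns-prefetch", "preconnect", "prefetch", "preload", "prerender"}


class ResourceHintParser(HTMLParser):
    """Collects (rel, href) pairs from <link> tags that act as resource hints."""

    def __init__(self):
        super().__init__()
        self.hints = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        href = attr.get("href")
        for rel in (attr.get("rel") or "").lower().split():
            if rel in HINT_RELS and href:
                self.hints.append((rel, href))


def find_resource_hints(html: str):
    parser = ResourceHintParser()
    parser.feed(html)
    return parser.hints


# Hypothetical page source for illustration only.
SAMPLE_HTML = """
<html><head>
<link rel="dns-prefetch" href="//static.example.net">
<link rel="preconnect" href="https://fonts.example.org" crossorigin>
<link rel="preload" href="https://cdn.example.com/app.js" as="script">
<link rel="stylesheet" href="/site.css">
</head><body></body></html>
"""

hints = find_resource_hints(SAMPLE_HTML)
for rel, href in hints:
    print(rel, href)
```

The ordinary stylesheet link is ignored; only the three hint elements are reported, and each one names a domain the browser may contact or resolve without any action by the user.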

Now just imagine if a website you visited was coded to prefetch malicious or criminal domain names. These prefetches would be done without your knowledge and would leave artifacts behind on your computer from which an untrained forensic analyst (or one who didn’t take the time to do a true forensic examination) could draw incorrect conclusions.

Going back to the Google Chrome history and artifacts on my system, below is a screenshot of the same view shown earlier of the cookies but after I navigated only to msn.com:

Chrome cookies view

After just going to the single website www.msn.com on my system, you can see there are 169 cookies present on my hard drive. From the screenshot above, you see multiple domains that I never intentionally or knowingly visited – but my computer did automatically because of how the website was coded.
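A per-domain breakdown like this can also be pulled with a short query. The sketch below is an assumption-heavy illustration: it builds a tiny in-memory SQLite table as a stand-in for Chrome’s cookie store, using the column names `host_key`, `name`, and `creation_utc` as I understand them to appear in that database. The real schema has more columns and can change between Chrome versions, so verify it against the evidence file, and always query a working copy rather than the original. The rows are invented.

```python
import sqlite3


def cookies_per_host(conn: sqlite3.Connection):
    """Return (host_key, cookie_count) rows, most cookies first."""
    query = (
        "SELECT host_key, COUNT(*) FROM cookies "
        "GROUP BY host_key ORDER BY COUNT(*) DESC, host_key"
    )
    return conn.execute(query).fetchall()


# Simplified, hypothetical stand-in for the browser's cookie database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cookies (host_key TEXT, name TEXT, creation_utc INTEGER)")
conn.executemany(
    "INSERT INTO cookies VALUES (?, ?, ?)",
    [
        (".msn.com", "session", 13300000000000000),
        (".msn.com", "prefs", 13300000000000001),
        (".bing.com", "id", 13300000000000002),
        (".taboola.com", "t", 13300000000000003),
    ],
)

summary = cookies_per_host(conn)
for host, count in summary:
    print(host, count)
```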

The date/time stamps shown above are in WebKit format, so a simple conversion will show them in UTC or local time.
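As a minimal sketch of that conversion: WebKit timestamps count microseconds from January 1, 1601 (UTC), so adding the value to that epoch yields a UTC time. The function name is my own, and the sample value is simply the Unix epoch expressed in WebKit format rather than a value taken from the evidence.

```python
from datetime import datetime, timedelta, timezone

# WebKit/Chrome timestamps are microseconds since 1601-01-01 00:00:00 UTC.
WEBKIT_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def webkit_to_datetime(webkit_us: int) -> datetime:
    """Convert a WebKit timestamp (microseconds) to a timezone-aware UTC datetime."""
    return WEBKIT_EPOCH + timedelta(microseconds=webkit_us)


# 11644473600 seconds separate 1601-01-01 from 1970-01-01 (the Unix epoch).
sample = 11644473600000000
as_utc = webkit_to_datetime(sample)
print(as_utc.isoformat())                # UTC
print(as_utc.astimezone().isoformat())   # local time of the examiner's machine
```

When reporting local time, state which time zone was applied, since `astimezone()` uses the settings of the machine running the script and not those of the device under examination.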

time conversion screenshot

Also now present on my computer are additional files from some of these websites, such as Facebook, Twitter, Google, etc. Remember, these sites were never intentionally navigated to.

directories created on system after visiting msn.com

Although all of these files are now present on the computer, Google Chrome’s own history view still shows only that msn.com was visited:

Google Chrome history
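The gap between the cookie list and the history view can be expressed as a simple cross-check. This is a sketch of the reasoning only, with hypothetical inputs and a function name of my own: it flags cookie domains that have no matching entry in browsing history. A flagged domain is not proof of anything by itself; it only means the cookie alone does not show that the user navigated there.

```python
from urllib.parse import urlsplit


def uncorroborated_hosts(cookie_hosts, history_urls):
    """Return cookie hosts with no visited URL on the same domain or a subdomain."""
    visited = {urlsplit(url).hostname for url in history_urls}
    visited.discard(None)
    flagged = []
    for host in sorted(set(cookie_hosts)):
        domain = host.lstrip(".")  # cookie hosts are often stored with a leading dot
        matched = any(v == domain or v.endswith("." + domain) for v in visited)
        if not matched:
            flagged.append(host)
    return flagged


# Hypothetical inputs modelled on the test described in this post.
cookie_hosts = [".msn.com", ".bing.com", ".taboola.com", "www.facebook.com"]
history_urls = ["https://www.msn.com/"]

flagged = uncorroborated_hosts(cookie_hosts, history_urls)
print(flagged)
```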

The forensic tool Hindsight reports over 620 entries for a single visit to www.msn.com. The entries include cookies, cache, preferences, and URLs.

Screenshot from Hindsight forensic tool

Another forensic tool produces similar results. The screenshot below shows Autopsy’s analysis of the Chrome history. According to Autopsy, there were 155 items of Internet cache, 197 cookies downloaded just from visiting msn.com, and then the single item of web history.

Screenshot from Autopsy forensic tool

The same test was done with Wireshark running to capture the network traffic from my workstation. As expected, Wireshark showed the same as Chrome developer tools, with all of the DNS queries and responses being shown. Below is a screenshot from Wireshark showing some of the queries:

Screenshot of Wireshark

An untrained incident responder or forensic analyst looking at network logs may also come to an incorrect conclusion that a user searched for or navigated to these websites because of the DNS queries present on the network.
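It helps to see how little a DNS query actually carries. The sketch below builds and parses a bare DNS question using the standard wire layout (a 12-byte header followed by the name, record type, and class). It is a simplified illustration of my own that ignores name compression and other details a real parser would handle, and the domain is just one of the names mentioned earlier.

```python
import struct


def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet (qtype 1 = A record, class 1 = IN)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in name.split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def parse_question(packet: bytes) -> dict:
    """Extract the first question from a DNS query (no name compression support)."""
    query_id = struct.unpack("!H", packet[:2])[0]
    labels, pos = [], 12  # the question section starts after the 12-byte header
    while packet[pos] != 0:
        length = packet[pos]
        labels.append(packet[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    qtype, qclass = struct.unpack("!HH", packet[pos + 1:pos + 5])
    return {"id": query_id, "name": ".".join(labels), "qtype": qtype, "qclass": qclass}


question = parse_question(build_query("cdn.taboola.com"))
print(question)
```

The parsed question contains an identifier, a name, a record type, and a class. Nothing in it indicates whether the lookup was triggered by a typed URL, a clicked link, or a prefetch hint in a page.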

CONCLUSION

Based on the testing and analysis, we were able to show that the websites in question were not visited by the defendant, nor did the defendant search for those websites.

Performing a digital forensic analysis is much more than simply pressing the “find evidence” button and then handing over a few hundred pages of results to someone. Forensics should include a thorough analysis of the digital evidence by a trained analyst, along with properly contextualized results and explanations for stakeholders.

 

Digital forensic analysts should look at the totality of the circumstances with a device, including not only Internet cache and cookies but also typed URLs, viewed pages, timelines of activity, downloads, and the user’s normal pattern of behavior, among other things. They should also act as subject matter expert consultants to those consuming their forensic reports and provide explanations, context, and opinions when necessary.
