
Consider the following look-back scenarios:
- In a browser you open a 6-month-old preserved instance of a web page that displays a stock price. The stock price is supposed to update every 15 minutes. How do you know if you’re looking at the stock price that was displayed at the time of content collection?
- You view a preserved web page that contains an embedded YouTube video. Are you looking at the current version of that video on YouTube or the version that was available when the web page was preserved?
Defensible eDiscovery technology for precise web content preservation resolves the question marks left by traditional data capture: unknowns that could negatively affect the outcome of a court case or compliance review.
The following are three critical features you need in the defensible eDiscovery technology that creates your web content collection:
Feature #1: Native Format Captures
It’s fairly easy to fake web content. That’s why native-format data captures aren’t just for viewing web content – they’re also necessary to demonstrate that captured web content hasn’t been altered from exactly the way it was delivered from the web server.
A web page isn’t a pre-set object but a collection of elements (HTML code, JavaScript code, style sheets, images, etc.) that your browser assembles and displays as a “page.” Each of these elements must be collected in a manner that enables you to trace it back to a precise moment in time.
For each item, your defensible eDiscovery technology must record the request made to the server, including any submitted cookies and additional HTTP headers. It must then record the exact response from the server. For a reliable eDiscovery solution, each request and response must be time-stamped and hashed independently.
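As a rough illustration of the per-element record described above, here is a minimal Python sketch. The record layout, the function names (`capture_record`, `verify`), and the choice of SHA-256 are assumptions made for this example, not the format of any particular eDiscovery product.

```python
import hashlib
from datetime import datetime, timezone


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of raw bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def capture_record(url, request_bytes, response_bytes, requested_at, received_at):
    """Build one entry for a single page element (hypothetical schema).

    The request and the response each get their own timestamp and hash,
    so either can be checked without relying on the other.
    """
    return {
        "url": url,
        "request": {
            "timestamp": requested_at.isoformat(),
            "sha256": sha256_hex(request_bytes),
        },
        "response": {
            "timestamp": received_at.isoformat(),
            "sha256": sha256_hex(response_bytes),
        },
    }


def verify(record, request_bytes, response_bytes):
    """Check that the stored bytes still match the hashes recorded at capture."""
    return (
        record["request"]["sha256"] == sha256_hex(request_bytes)
        and record["response"]["sha256"] == sha256_hex(response_bytes)
    )


# Illustrative raw bytes; a real capture would store what went over the wire.
request = b"GET /quote HTTP/1.1\r\nHost: example.com\r\nCookie: session=abc\r\n\r\n"
response = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<p>42.10</p>"
now = datetime.now(timezone.utc)
record = capture_record("https://example.com/quote", request, response, now, now)
```

Calling `verify(record, request, response)` later confirms that the bytes match the hashes recorded at capture time. Changing a single byte of either message makes the check fail.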
Feature #2: A Comprehensive Audit Trail
The web content you preserve must be authentic and unaltered. There should be no question of accuracy when you present data captures in court or to a compliance auditor.
Beware of anyone who suggests a single date/time stamp is enough to authenticate the data capture of a web page. You need to record both a request to a Network Time Protocol server and its response, each of which should be hashed and time-stamped separately.
Additionally, information influencing the delivery of web content – such as cookies submitted, geo-location detail, browser and device settings, and other key configurations – must be recorded.
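One way to picture such an audit entry is sketched below in Python. The field names, the `audit_entry` helper, and the canonical-JSON hashing are assumptions for illustration only. The two 48-byte packets are placeholders shaped like NTP messages, not a real exchange with a time server.

```python
import hashlib
import json


def canonical_hash(obj) -> str:
    """Hash a JSON-serializable object using a stable key order."""
    encoded = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def audit_entry(context, ntp_request, ntp_response, sent_at, received_at):
    """Combine the capture context with a separately hashed time-source exchange."""
    entry = {
        "context": context,
        "time_source": {
            "request": {
                "timestamp": sent_at,
                "sha256": hashlib.sha256(ntp_request).hexdigest(),
            },
            "response": {
                "timestamp": received_at,
                "sha256": hashlib.sha256(ntp_response).hexdigest(),
            },
        },
    }
    # Seal the entry: the hash covers everything recorded above.
    entry["entry_sha256"] = canonical_hash(entry)
    return entry


# Hypothetical capture context; the values are made up for the example.
context = {
    "cookies": {"session": "abc"},
    "geo_location": "US-CA",
    "user_agent": "ExampleBrowser/1.0",
    "viewport": "1280x800",
}

# Placeholder packets: 48 bytes each, first byte set as in an NTP v3
# client request (0x1B) and server reply (0x1C). Not real server data.
ntp_request = bytes([0x1B]) + bytes(47)
ntp_response = bytes([0x1C]) + bytes(47)

entry = audit_entry(
    context,
    ntp_request,
    ntp_response,
    "2024-01-01T00:00:00+00:00",
    "2024-01-01T00:00:00.040000+00:00",
)
```

Because the context is hashed along with the time-source exchange, a later change to any recorded setting (a cookie value, for instance) would no longer match `entry_sha256`.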
Feature #3: Proper Web Content Preservation
Web content isn’t like other electronically stored information (ESI): if it isn’t preserved properly, calls to the live web can alter the content. Also, the technology enabling the display of certain web content can quickly become obsolete (e.g., Flash). Non-preservation formats like .mht expose captured web content to the live web, and APIs do not preserve the content as it was presented in the ordinary course of business.
You need defensible eDiscovery technology that keeps your web content collection isolated from the live web to prevent spoliation and to maintain the data elements as they were originally presented.
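To make the isolation requirement concrete, the Python sketch below checks a captured HTML document for resources a browser would fetch automatically but that are missing from the capture set, meaning they would be pulled from the live web on replay. The class and function names are hypothetical. The check covers only `src` attributes and `<link href>`, so it is a simplified illustration rather than a complete audit.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ReferenceCollector(HTMLParser):
    """Collect URLs a browser would load automatically when rendering a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.references = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if not value:
                continue
            # src covers images, scripts, iframes, and media; <link href>
            # covers style sheets. Plain <a href> links are skipped because
            # they load only when a user clicks them.
            if name == "src" or (name == "href" and tag == "link"):
                self.references.append(urljoin(self.base_url, value))


def uncaptured_references(html, base_url, captured_urls):
    """Return referenced URLs that are not part of the preserved collection."""
    collector = ReferenceCollector(base_url)
    collector.feed(html)
    return [url for url in collector.references if url not in captured_urls]


# Example page with an embedded video that was never captured.
page = (
    '<html><head><link rel="stylesheet" href="/site.css"></head>'
    '<body><img src="logo.png">'
    '<iframe src="https://www.youtube.com/embed/VIDEO_ID"></iframe>'
    '<a href="https://example.com/other">Other page</a></body></html>'
)
captured = {"https://example.com/site.css", "https://example.com/logo.png"}
missing = uncaptured_references(page, "https://example.com/page", captured)
```

In this example `missing` contains only the embedded video URL, which is the kind of element that would otherwise show today's live content instead of what was delivered at collection time.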
Modern web content preservation is based on having the right crawling technology to capture accurate content and management tools to protect your collection from being altered. With the above three core features, your eDiscovery solution effectively supports compliance audits and litigation proceedings.
Ready to achieve a defensible web content collection?
