Don’t Spoliate Your Online Content: Why Ediscovery Professionals Should Care About ISO 28500

January 16, 2019

We’ve all been there: something happens that causes your organization to reasonably anticipate litigation, whether it’s the receipt of a preservation letter, a breach of a contract, or even service of a filed complaint. Immediately, a whole chain of actions and events snaps into place. Once your duty to preserve relevant evidence attaches, you start industriously issuing legal hold notices, interviewing custodians, targeting your preservation and collection, and adjusting the scope of your inquiry as issues and types of evidence arise, are investigated, and are deemed worthy or unworthy of further consideration.

Suppose, now, that you become aware of an allegation involving content on your organization’s website or on a social media channel. Are you prepared to preserve and collect that online content without spoliating any part of it? The answer may be more complicated than you realize.

The Duty to Preserve and Collect Online Content for Ediscovery

In an increasing number of cases, relevant, discoverable information may be found online. Perhaps your organization is accused of using someone else’s intellectual property on an online product description. Perhaps you’re being sued for workplace discrimination or harassment based on something an employee posted on one of the organization’s social media pages. Even if the online content isn’t the “smoking gun” in the case, it could easily be relevant and discoverable.

Courts have long held that websites should be treated the same way as any other discoverable electronically stored information (ESI). Over ten years ago, in Arteria Property Pty Ltd. v. Universal Funding V.T.O., Inc., No. 05-4896 (PGS), 2008 WL 4513696 (D.N.J. Oct. 1, 2008), the court imposed sanctions for the defendants’ spoliation of online evidence.

In that case, the plaintiff had requested “electronic snapshots or paper copies of the [defendants’] website” from the time of their earlier negotiations. Specifically, it sought to prove from statements on the defendants’ website that the defendants had held themselves out as “one of the leading lenders serving the real estate market,” with over “50 years of Commercial Mortgage Banking experience.” The defendants never produced this evidence.

The court did not hesitate to find that the defendants were required to preserve the online statements, just as they were required to preserve all relevant information, once they were aware of the plaintiff’s claim. It noted that it could see “no reason to treat websites differently than other electronic files.” The defendants “had the ultimate authority, and thus control, to add, delete, or modify the website’s content.” Therefore, the court granted the plaintiff an adverse inference jury instruction for the defendants’ failure to preserve their online statements.

Okay, easy enough. You know you might be obligated to preserve your website information. Snap some screenshots and off you go, right?

Hold that thought. When’s the last time you really looked critically at your website, or any website, to see what you’d need to do to preserve it? If you’re essentially producing a “paper copy” of your website through your current collection methods, you’re running a big risk.


Relying on Screenshots and PDFs Can Cause Spoliation of Evidence

Today’s websites are rich with complex, dynamic, and interactive elements. They may have videos that a screenshot cannot capture. Many websites use image or text carousels to present testimonials, highlight recent blog posts, or provide links to other information. A screenshot at one moment could capture entirely different information than a screenshot three seconds later.

Then there’s all the interactive content used on modern websites, from dropdown “hamburger” menus to mouse-over text. Your website may use scrolling features to reveal new information as the visitor proceeds down the page, or scrolling may trigger deployment of popup questions or chatbots. You may have fillable calculators, interactive charts or graphs, or questionnaires on your site.  

And that’s before we even mention linked content, which may include links to both internal and external sites, or social media “reactions” that require clicking through to see who reacted in what way or how a conversation unfolded.

When it comes time to preserve and collect information in response to a trigger event, all that complex, dynamic, interlinked, interactive content can cause you a problem if you’re still relying on screenshots or PDFs.

This is exactly what happened in Leidig v. BuzzFeed, No. 16 Civ. 542 (VM) (GWG) (S.D.N.Y. Dec. 19, 2017). The plaintiff in that case was sanctioned for failing to preserve the websites where he had published the stories at the heart of his own defamation claim. In discovery, he produced only “documents bearing no metadata, including manually manipulated PDFs … and screenshots.” The court determined that he had allowed potentially relevant website information to be spoliated after he was aware of his own claims, and it found his “amateurish collection” to be entirely insufficient. Nor could the plaintiff rely on the Internet Archive’s Wayback Machine to satisfy his preservation obligation; the court noted that he offered nothing to prove that the Wayback Machine’s archives were “reliable, complete, and admissible in court.”

Simply put, screenshots and PDFs—and even the Internet Archive—are no way to satisfy your preservation duties when it comes to your complex, dynamic, interactive online content. And the flaws of these capture methods extend beyond losing a tremendous amount of data and the attendant spoliation claims (as if those weren’t bad enough). In addition, a website captured through screenshots cannot be smoothly navigated to demonstrate where statements were made or how those statements were accessible. Rather than a working replica of the site, you’re left with a clunky process of flipping through pages in a quasi-paper production. And, as the court noted in Leidig, how can you establish the authenticity of your collection method with screenshots?

Fortunately, there’s a better way.

Complete Website Preservation Using ISO 28500-Compliant WARC Files

You’ve probably heard of the ISO, the International Organization for Standardization, which operates worldwide to create standards for a variety of products and technologies. Since 1947, the ISO has created over 20,000 standards to “aid[] in the creation of products and services that are safe, reliable and of good quality.”

What does that have to do with ediscovery or ESI? Plenty: ISO 28500 establishes a standardized file format that allows the collection of entire navigable websites with no loss of information—meaning no spoliation.

As the ISO notes, “Storing and managing the billions of saved web page objects itself presents a challenge.” That’s why the ISO standardized the WARC (Web ARChive) file format, which stores the results of a website crawl in a widely recognized, standardized file structure. WARC files offer “a standard way to structure, manage and store billions of resources collected from the web and elsewhere.”
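For readers who want to see what that structure looks like in practice, here is a minimal Python sketch of a single WARC "response" record: a version line, a handful of named header fields, a blank line, and then the captured HTTP response exactly as the server sent it. The function names, the example URL, and the sample HTML are hypothetical and chosen only for illustration. A real crawler writes more fields and more record types (such as request and metadata records), so treat this as a simplified picture rather than a complete implementation of the standard.

```python
import uuid
from datetime import datetime, timezone

CRLF = b"\r\n"


def build_warc_response_record(target_uri: str, http_response: bytes,
                               captured_at: datetime) -> bytes:
    """Wrap one captured HTTP response in a simplified WARC/1.0 record."""
    # Pass a timezone-aware datetime; it is converted to UTC for WARC-Date.
    stamp = captured_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    headers = [
        ("WARC-Type", "response"),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("WARC-Date", stamp),
        ("WARC-Target-URI", target_uri),
        ("Content-Type", "application/http; msgtype=response"),
        ("Content-Length", str(len(http_response))),
    ]
    head = b"WARC/1.0" + CRLF
    for name, value in headers:
        head += f"{name}: {value}".encode("utf-8") + CRLF
    # Blank line ends the header; two CRLFs end the record.
    return head + CRLF + http_response + CRLF + CRLF


def parse_warc_record(record: bytes):
    """Split a single record into (version, header fields, content block)."""
    head, _, rest = record.partition(CRLF + CRLF)
    lines = head.decode("utf-8").split("\r\n")
    fields = dict(line.split(": ", 1) for line in lines[1:])
    length = int(fields["Content-Length"])
    return lines[0], fields, rest[:length]


if __name__ == "__main__":
    # Hypothetical capture of a product page.
    response = (b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n"
                b"<p>One of the leading lenders</p>")
    when = datetime(2019, 1, 16, 12, 0, 0, tzinfo=timezone.utc)
    record = build_warc_response_record("https://example.com/", response, when)
    version, fields, block = parse_warc_record(record)
    print(version, fields["WARC-Target-URI"], fields["WARC-Date"])
    print(block == response)
```

The point to notice is that the record keeps the server's response byte for byte, alongside when and from where it was captured. That is information a screenshot or PDF simply does not carry.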

And that’s not all they do. WARC files collect and preserve exactly what happens when a web crawler accesses a website. That information can be reproduced in a fully functioning replica website. In other words, not only do ISO 28500 WARC files collect all of the content on a website in its original format, but they also allow it to be played back anywhere, anytime, exactly as if the viewer were on the live website—while being isolated from the internet to prevent any modification of the preserved site. WARC files also gather metadata, producing a digital chain of custody. That makes it straightforward to create and maintain an audit trail that can establish the file’s authenticity.
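One piece of that audit trail can be illustrated with a short Python sketch. The WARC format defines optional digest fields (for example, WARC-Block-Digest) that record a hash of the captured content at collection time. The helper names below are hypothetical, and the choice of SHA-256 with hex encoding is an assumption made for this example, since the standard does not require a particular algorithm. The idea is simply that a digest recorded at capture can be recomputed later to show the content has not changed.

```python
import hashlib


def block_digest(block: bytes) -> str:
    """Return a labelled digest ("algorithm:value") for captured content."""
    return "sha256:" + hashlib.sha256(block).hexdigest()


def verify_block(block: bytes, recorded_digest: str) -> bool:
    """Recompute the digest and compare it with the one recorded at capture."""
    return block_digest(block) == recorded_digest


if __name__ == "__main__":
    # Hypothetical captured content and the digest stored alongside it.
    captured = b"<p>50 years of Commercial Mortgage Banking experience</p>"
    recorded = block_digest(captured)

    print(verify_block(captured, recorded))                # unchanged content
    print(verify_block(captured + b" edited", recorded))   # altered content
```

A matching digest does not settle admissibility on its own, but it gives you a concrete, repeatable check to point to when someone asks how you know the archive is the same one you collected.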

In short, with website archives based on ISO 28500 WARC files, spoliation of online content is a thing of the past.

Are you fed up with the limits of screenshots and PDFs? Tired of running the risk of spoliation with every case? Contact us today; we specialize in archiving complex, dynamic, interactive online content for ediscovery and compliance using ISO 28500 WARC files. We’d love to show you how you can capture confidence with your website preservation and collection methods.
