What Is Hanzo Preserve?

What is Hanzo Preserve?

Hanzo Preserve is the industry’s leading web content collection and preservation platform, considered the Gold Standard in legal defensibility and trusted by Fortune 500 and Am Law 100 companies alike. Hanzo Preserve, the foundation of the Hanzo Platform, enables organizations to collect, preserve, and analyze web content for litigation, internal audits and forensic investigations.

Is Hanzo Preserve a new product?

Hanzo Preserve is based on the core technology that Hanzo has built and deployed as the Gold Standard for eDiscovery and litigation archiving. If you have come to know and love Hanzo technology, you will see additional innovation released with Hanzo Preserve, including a new Dashboard and Viewer, with a lot more on the way.

If I own Hanzo technology, do I need to upgrade to Hanzo Preserve?

If you have an active support and maintenance agreement, you will automatically be upgraded to Hanzo Preserve and all new features on the 2017 roadmap.

What other products form part of the Hanzo Platform?

Hanzo Preserve is the core offering in the Hanzo Platform. Over time, we will release new product categories, including solutions to proactively control the escalating costs and risks associated with internal and customer-facing web content, and solutions for organizations looking to harness, analyze and act on intelligence from the web, the world’s largest source of unstructured data.

Can I get an on-premise version of Hanzo Preserve?

Absolutely. Enterprise customers can purchase an on-premise version of Hanzo Preserve through our sales organization.

Where can I find out more about Hanzo Preserve?

You’ve got a number of ways to access information for Hanzo Preserve. We recently announced it in the news. You can also visit our website and watch the Hanzo Preserve video, or you can join one of our Thought Leadership Webinars, featuring keynotes from progressive industry visionaries.

How does Hanzo work?

Essentially, Hanzo’s software visits a website and collects what a person would see, and we store the content exactly as it was delivered from the target site. We navigate through the site, so we get all of the web pages and related content you need, including text and metadata from each page. We also make a PDF of each page and store that alongside the native web content.

Once the collection is complete, we make a working replica (links work, videos play, etc.) of the site available so you can see how the site performed when it was live. We also create exports using the PDFs of each page and the native content.

What web content do we capture?

Any website, message, forum, or blog, plus:

  • Confluence
  • Slack
  • HipChat
  • JIRA
  • SharePoint
  • Jive
  • Chatter
  • Yammer
  • LinkedIn
  • Instagram
  • Flickr
  • Facebook
  • Twitter
  • Pinterest
  • Google+
  • YouTube
  • Google Drive and Docs
  • Google Sites
  • Tumblr
  • Evernote
  • GitHub
  • Scribd
  • Foursquare
  • E-books
  • SlideShare

What is ISO 28500?

ISO 28500 is the standard for web content collection and preservation. It was designed by the IIPC (International Internet Preservation Consortium), an international body of experts in digital preservation that includes national archives and libraries such as The National Archives of the UK and the Library of Congress. It specifies a storage format called a WARC file, which defines how collected web content and its metadata are recorded.

Why are the ISO 28500 standard and WARC files relevant to my organization?

No matter what your reason for capturing web content, there are two things you don’t want:

  • You don’t want to be trapped in a proprietary format that works with only one vendor
  • You don’t want your captured web content to be un-viewable in the future

ISO 28500 WARCs make sure you avoid both of those issues. Virtually all other web capture methods are susceptible to those problems, and that’s what Hanzo wants to avoid for our clients.

What is a WARC?

A WARC file is an industry-standard format for storing collected web content and associated data. A WARC file is a container that provides structure to the data for processing, indexing and access. More importantly, a WARC file will preserve original web content exactly as it was delivered from the target site. It contains all of the metadata that allows a forensic examiner to verify the integrity of captured web content.
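To make the container idea concrete, here is a minimal sketch of how a single WARC "response" record could be assembled by hand. It is an illustration of the record layout only, not Hanzo's implementation: the function name `build_warc_record` is hypothetical, and the SHA-1/Base32 digest label is one common convention rather than a requirement.

```python
import base64
import hashlib
import uuid
from datetime import datetime, timezone


def build_warc_record(target_uri: str, http_response: bytes, captured_at: datetime) -> bytes:
    """Wrap raw HTTP response bytes in one WARC 'response' record (illustrative)."""
    # Digest of the stored block, so a later reader can check it was not altered.
    digest = base64.b32encode(hashlib.sha1(http_response).digest()).decode("ascii")
    # captured_at is assumed to be timezone-aware; WARC dates are written in UTC.
    warc_date = captured_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    headers = [
        "WARC/1.0",
        "WARC-Type: response",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",
        f"WARC-Date: {warc_date}",
        f"WARC-Target-URI: {target_uri}",
        "Content-Type: application/http; msgtype=response",
        f"WARC-Block-Digest: sha1:{digest}",
        f"Content-Length: {len(http_response)}",
    ]
    header_block = "\r\n".join(headers).encode("utf-8") + b"\r\n\r\n"
    # The content block is stored byte for byte, followed by two CRLF pairs.
    return header_block + http_response + b"\r\n\r\n"
```

The HTTP response bytes go into the record unchanged, which is what "exactly as it was delivered" means in practice; the headers around them carry the capture time, the source URL and a digest.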

What is native format web content?

Native format web content is the unaltered format in which the web content was originally delivered to a browser. It includes all of the components that make up a web page: HTML, CSS, JavaScript, images, text, etc. It is critical for authentication and forensics purposes when collecting web content.

What constitutes difficult-to-capture content (or why can't I use just any web collection tool)?

There is a huge amount of web technology that makes it easy for people to use websites, but also makes it very difficult for many capture tools (except Hanzo, of course) to capture. Essentially, it is content that requires interaction with a web page; think drop-down list selections, mouse-overs, pop-ups, multimedia, etc.

A recent study found that nearly 70% of all pages use JavaScript, so if you’re not using a tool like Hanzo that can handle it, you may be missing content on nearly 70% of your web pages.

How do I avoid spoliation?

Proper preservation methods are critical in avoiding spoliation of web evidence. It is vital that preserved web content be sealed off from any live web content so that the risk of alterations or changes to the original content is eliminated. Be sure to check with your provider to make sure they’re following proper preservation methods for web content.
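One common way to demonstrate that preserved content has not changed since capture is to record a cryptographic fingerprint at collection time and recompute it later. The sketch below shows the idea in general terms; the function names are hypothetical and this is not a description of Hanzo's internal process.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 hex digest recorded at capture time."""
    return hashlib.sha256(content).hexdigest()


def is_unaltered(content: bytes, recorded_digest: str) -> bool:
    """Recompute the digest later and compare it with the recorded one."""
    return fingerprint(content) == recorded_digest


# Usage: record the digest when the page is captured, verify before production.
captured_page = b"<html><body>Disclosure text</body></html>"
digest_at_capture = fingerprint(captured_page)
```

Any change to the stored bytes, even a single character, produces a different digest, so a mismatch signals that the content is no longer the original capture.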

On Demand

Is Hanzo a software or a service?

Both. Hanzo provides options to use the software as a service (SaaS) and under a license (on-premise).

For Hanzo's SaaS offering, what service provider does Hanzo use?

Hanzo uses Amazon Web Services, which gives us unmatched reliability and scalability.

Many customers ask, "We have sophisticated content on our website, like video, pop-ups for disclosures, logins and other interactive content - can Hanzo capture it?"

The short answer is yes, Hanzo can capture virtually anything you can see in a browser. Want to give us a test? Show us the hardest, most complex content on your site. We’ll show you how Hanzo’s technology can give you the most complete, accurate and defensible captures available.

How do I view it if I don't have Relativity™?

Hanzo provides a number of viewing options, including native format, where you can view the site just as it appeared when it was live online, plus a variety of other export formats, including offline working replicas of the captured sites.

How do I produce captured web content to regulators, investigators or opposing counsel?

You have a number of options, ranging from exported PDFs, which are always instantly available with Hanzo, to e-discovery industry-standard load files and a variety of native format production options.

How big is the website to capture?

Not sure how big the site is that you want to capture? No problem. Hanzo uses a number of tools to provide our clients accurate page counts, and our experience across thousands of web capture projects helps clients make sure they’re getting the correct capture scope in place.

How many users can I have?

As many as you want. Hanzo doesn’t charge for users.

How long will Hanzo retain data?

Hanzo stores content as long as you need it. Our clients set the retention schedule to meet regulatory requirements or litigation needs. For many clients in the financial services industry, the retention period is seven years.


When Hanzo captures my website, will it impact the performance of the website?

In general, no. Hanzo looks like a user on your website, so we impact the performance of the site like any other user would. Hanzo’s professional services team works with our clients to make sure we have the smallest footprint.

Can I schedule captures for a particular time of day?

Yes. Many of our clients opt to have Hanzo run in overnight hours when usage of the website is lowest.

We have analytics packages on our website. Will Hanzo impact site analytics?

No. Hanzo’s professional services team uses a variety of techniques to make sure Hanzo isn’t impacting our clients’ site analytics at all. When you’ve done as many website captures as we have, this kind of attention to detail comes naturally.

Can you capture web content behind a login?

Yes. It takes some serious sophistication to perform accurate, defensible captures behind a login, and the good news is that Hanzo does logged-in captures all the time.

Do you have access control levels - can I control what users see?

Yes. You can control the content that’s available to each user.

Can I manage my users?

Yes. You can manage users through Hanzo’s admin features in the app. You don’t need a support call to Hanzo to add users or manage permissions. Plus, ask Hanzo about LDAP and SAML integration.


My company uses Active Directory. Can Hanzo integrate with AD so we can manage users from a single source?

Yes. Hanzo provides LDAP and SAML integration to support easier user management for many customers.

Can I set retention schedules and legal holds in Hanzo?

Yes. Hanzo supports retention and records management, including exceptions for legal holds, plus reporting and notifications, such as upcoming records due for disposition.
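As a rough illustration of how a retention schedule with a legal hold exception behaves, consider the sketch below. The `CaptureRecord` class and its field names are hypothetical and do not reflect Hanzo's data model; the logic simply shows that a hold suspends disposition regardless of the retention period.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CaptureRecord:
    captured_on: date
    retention_years: int
    on_legal_hold: bool = False


def disposition_date(record: CaptureRecord) -> date:
    """Date on which the record reaches the end of its retention period."""
    target_year = record.captured_on.year + record.retention_years
    try:
        return record.captured_on.replace(year=target_year)
    except ValueError:
        # Captured on Feb 29 and the target year is not a leap year.
        return record.captured_on.replace(year=target_year, month=2, day=28)


def is_due_for_disposition(record: CaptureRecord, today: date) -> bool:
    """A legal hold overrides the schedule: held records are never due."""
    if record.on_legal_hold:
        return False
    return today >= disposition_date(record)
```

For example, a capture made on 1 March 2017 with a seven-year retention period becomes due on 1 March 2024, unless it is placed on legal hold in the meantime.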


Will the site look like it did originally?

Yes. Interactive elements, like mouse-overs, image carousels, drop-down lists and pop-ups, will play back like the original, as will all the links on the site, including video and other multimedia content.

Do I have to install any software?

No, not if you’re using Hanzo as a service. You can view content using a browser (Chrome, Firefox or Safari – we don’t recommend Internet Explorer). You can also download our viewer app, which many customers find easier.

If you’re using the on-premise version of Hanzo, the software runs on your organization’s network.

How do I add new sites or change the scope?

It’s easy. You can add new sites through the Hanzo app, or our support team can add the sites for you. We do the heavy lifting for you.

What kind of reporting metrics can you send me?

Hanzo provides a variety of analytics and reports to help customers create a detailed picture of their web portfolios. You can monitor changes to sites, including text and image changes, plus quickly pinpoint critical items within your web content, like external links or forms. Additionally, you’ll receive a full suite of reports for compliance support.

Do you provide training?

Yes, although you don’t need much training at all to use Hanzo. We do the heavy lifting behind the scenes so you get a clean, easy-to-use app.

How much training does it take?

For most users, Hanzo requires little to no training. Of course, admin and engineering users will get much more training.


Is Hanzo compliant with SEC, FINRA, FCA and other regulatory requirements for WORM storage?

Yes. Hanzo stores all captured web content in WORM storage.

Will Hanzo provide letters of attestation as a books and records custodian?

Yes. These letters are standard parts of the Hanzo agreement.

Will Hanzo provide affidavits/declarations, and can Hanzo serve as an expert witness, if required?

Yes. Hanzo has provided dozens of affidavits and declarations, and been called on to testify as an expert on numerous occasions.

The website I need to capture personalizes content based on the location of the user. Can Hanzo capture the site as if it were in a particular location?

Yes. Hanzo can trigger geolocation content so you can see how the site appeared to someone in a specific location.

What about other types of personalization or A/B site versions?

Hanzo uses a variety of methods to trigger all kinds of personalization characteristics of websites, including things like browser history triggers, preferences and A/B site direction.

What if the site has links to third-party content, like a link inside a tweet or post?

Hanzo generally captures sites including what we call +1 hop, meaning we will follow all links outside the target site to one hop away, and then stop. You control the number of hops, and, therefore, how much content you want to include in the capture.

How many hops? How many levels deep in a site can you go?

You control the number of hops and how many levels deep you want to go. Hanzo generally recommends +1 hop. For example, if a Facebook profile has links in posts or comments to sites outside of Facebook, Hanzo will follow each link and capture the resulting web page, but no links from that web page.
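The +1 hop rule can be sketched as a traversal that counts how many link steps a page sits beyond the target site. This is a simplified model under stated assumptions: the function `plan_capture` and the in-memory `links` map are hypothetical stand-ins for a real crawler, and "on site" is approximated by comparing hostnames.

```python
from collections import deque
from urllib.parse import urlsplit


def plan_capture(start_url: str, links: dict, max_hops: int = 1) -> dict:
    """Return {url: hops} for every page that falls inside the capture scope.

    Pages on the target site have 0 hops. Following a link off the site,
    or any link from an off-site page, adds one hop.
    """
    target_host = urlsplit(start_url).hostname
    hops = {start_url: 0}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        for link in links.get(url, []):
            on_site = urlsplit(link).hostname == target_host
            next_hops = 0 if (on_site and hops[url] == 0) else hops[url] + 1
            # Skip links beyond the limit, or already reached at least as cheaply.
            if next_hops > max_hops or hops.get(link, max_hops + 1) <= next_hops:
                continue
            hops[link] = next_hops
            queue.append(link)
    return hops
```

With `max_hops=1`, a page linked from a profile post is captured, but links found on that outside page are not followed, which matches the Facebook example above.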

Can I use this with Relativity™?

Yes. Hanzo provides a Relativity .DAT file as a standard part of every capture. You can load content into virtually any eDiscovery review platform.

Is it just a screenshot?

No, it is much more than a screenshot. With Hanzo, any captured web content looks and works like it did when it was live on the web. Screenshots give you no interactivity, and they miss critical content on modern web pages.

Is it possible to customize captures, and if so, to what extent?

Yes. Captures can be customized in a variety of ways. The methods fall into these general categories:

  1. Frequency: You can control how often the site is captured (for instance, daily, weekly or monthly). Additionally, you can trigger captures based on events, so that a capture launches when a change to the website or a new page is detected.
  2. Scope: Customization around scope typically involves inclusion or exclusion of:
    1. File types: Some customers opt to exclude video or PDFs because they have other systems of record for those content types.
    2. Third-party links: You can direct Hanzo to follow links that go outside the target domain. For instance, https://www.blackrock.com/investing/resources/tax-information has a link to an IRS.gov page for estate and gift taxes. You can include or exclude linked content.
    3. URLs within the target site: You can use pattern matching to exclude URLs within a site from a capture. For instance, some customers want to exclude anything with “jobs” in the URL, or exclude hashtag (#) links from a Twitter capture.
  3. Interactions: Hanzo can customize the capture to interact with the target site just like a person would. This could mean a login or a more sophisticated interaction such as entering data on calculators like the retirement calculator at http://www.blackrock.com/cori-retirement-income-planning?cid=vanity:cori:coritool.
  4. Personalization: Hanzo customizes settings to trigger the behavior of a site for things like device (for example, if you want the mobile version of a site), location (if you want to collect a site as it appears to someone in a particular state or country) or other website personalization features. Hanzo uses user agent settings, cookies, custom headers or IP addresses to trigger these personalization features.
  5. Crawl speed: This enables you to determine the time window within which you would like the crawl to complete. For example, many of our customers want captures to run in, say, a 1 a.m. to 4 a.m. window, when customer usage of the site is low. Other customers want to limit capture times and workload because of agreements with site hosting providers.
  6. Analytics: We also customize the crawl so that it does not trigger reporting or site analytics, which keeps our captures from causing over-reporting of site utilization.
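
Two of the customization categories above, scope and crawl window, lend themselves to a short sketch. The functions `in_scope` and `within_window` below are hypothetical helpers written for illustration; they are not part of the Hanzo app, and the patterns and extensions are example values only.

```python
import re
from datetime import time
from urllib.parse import urlsplit


def in_scope(url: str, exclude_patterns: list, excluded_extensions: list) -> bool:
    """Return False if the URL matches an excluded file type or URL pattern."""
    path = urlsplit(url).path.lower()
    if any(path.endswith(ext) for ext in excluded_extensions):
        return False
    return not any(re.search(pattern, url) for pattern in exclude_patterns)


def within_window(now: time, start: time, end: time) -> bool:
    """Return True if 'now' falls inside the allowed crawl window."""
    if start <= end:
        return start <= now < end
    # The window wraps past midnight, for example 22:00 to 02:00.
    return now >= start or now < end


# Example settings: skip job pages and PDFs, crawl between 1 a.m. and 4 a.m.
EXCLUDE_PATTERNS = [r"jobs"]
EXCLUDED_EXTENSIONS = [".pdf"]
WINDOW_START, WINDOW_END = time(1, 0), time(4, 0)
```

A crawler built this way would check `in_scope` before queuing each URL and `within_window` before fetching it, pausing until the next window opens when the check fails.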