Picture this: you’re in a busy open-concept office space. There are clusters of people everywhere, hard at work, but they’re all chattering nonstop. They’re brainstorming with their teams, throwing ideas around within project groups, soliciting feedback from mentors in other departments, and probably—let’s be honest—gossiping a little with their workplace friends. People are moving from group to group and participating simultaneously in multiple conversations. There’s plenty of work getting done, but there’s also a constant buzz generated by the interconnecting discussions happening all around the space.
Now imagine that every word of every conversation, from the work-related to the not-so-much, is being captured in a running transcript.
Bad news: it is. It’s just that all those conversations are happening on Slack, not in person.
Slack has only been in use for about five years, but it has exploded in that time. Earlier this year, it hit 8 million daily users, with 3 million of those paying for higher service levels. And it’s not like people are getting on Slack for a minute or two out of their busy day or sending a few isolated messages. A year ago (when Slack only had 6 million daily users), estimates indicated that paying customers were actively engaged in Slack conversations for more than two hours per day. Those users had the application open for a stunning 10 hours a day.
Here’s the worse news: any part of those hundreds of conversations may be discoverable in litigation, depending on the issues at stake. Whether it’s an allegation that your company infringed on intellectual property or that William in Marketing has been sexually harassing all the pretty new hires, any line of text may be relevant to a disputed issue, obligating you to preserve, collect, and, eventually, produce it to an opponent or present it to a jury.
This has, to put it mildly, created a problem for ediscovery professionals.
Collecting From Slack Is Like Drinking From a Firehose
Needless to say, Slack is pumping out data at an inconceivable rate. Trying to collect potentially relevant electronically stored information (ESI) from Slack is like trying to drink from a firehose. There’s too much data coming in and too much of it is entirely irrelevant. Plus, it can be harder than you’d expect to siphon off the trivial side conversations so that you can focus on the important ones.
If data volume were the only challenge to Slack collections, it would be manageable. Of course, it isn’t; there are a few additional wrinkles.
For one thing, Slack messages tend to be short and individually incomplete, pinging back and forth in a staccato rhythm as busy employees speak in a conversational shorthand. Any single message standing alone is likely meaningless. That means that you can’t capture just the individual messages—you need to collect them in their original dynamic context. (This is part of what makes separating the wheat from the chaff so difficult; they’re almost inextricably intertwined.)
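To make the "context" point concrete, here is a minimal Python sketch that returns each keyword hit together with its neighboring messages instead of the lone matching line. It assumes messages are dictionaries with a "ts" timestamp string and a "text" field, which is how Slack's standard export represents them; the function name, the window size, and the sample data are hypothetical, and a real collection would also need to follow threads.

```python
from typing import Dict, List


def hits_with_context(messages: List[Dict], term: str, window: int = 2) -> List[List[Dict]]:
    """Return every message containing `term`, plus `window` neighbors on each side."""
    # Sort chronologically so "neighbors" means the surrounding conversation.
    ordered = sorted(messages, key=lambda m: float(m["ts"]))
    needle = term.lower()
    results = []
    for i, msg in enumerate(ordered):
        if needle in msg.get("text", "").lower():
            start = max(0, i - window)
            results.append(ordered[start:i + window + 1])
    return results


# Hypothetical sample conversation (assumed field names: ts, user, text).
sample = [
    {"ts": "1.0", "user": "U1", "text": "did you see the draft?"},
    {"ts": "2.0", "user": "U2", "text": "yep"},
    {"ts": "3.0", "user": "U1", "text": "the Morris contract looks off"},
    {"ts": "4.0", "user": "U2", "text": "agreed, call me"},
]

for group in hits_with_context(sample, "morris", window=1):
    print([m["text"] for m in group])
```

Overlapping windows are returned as separate groups here; merging them is a reasonable refinement once you know how your review tool wants conversations grouped.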
Additionally, Slack’s popularity boomed in part because of its integrations with other applications and its ability to incorporate non-Slack content. Users may upload files, share images, or link to online content. Your capture methodology must include not only the messages themselves but also any linked or associated content regardless of what app (or website) it’s from.
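As a rough illustration of what "associated content" means in practice, the sketch below pulls linked URLs and attached file names out of a single message. It assumes Slack's usual text markup, where links appear as <https://...> or <https://...|label>, and assumes attachments are listed under a "files" key with a "name" field; the function name is hypothetical, and actually fetching the linked or uploaded content is a separate step this sketch does not attempt.

```python
import re
from typing import Dict, List

# Matches <https://example.com> and <https://example.com|label> (assumed markup).
URL_PATTERN = re.compile(r"<(https?://[^|>\s]+)(?:\|[^>]*)?>")


def associated_content(message: Dict) -> Dict[str, List[str]]:
    """List the links and attached file names that one message refers to."""
    links = URL_PATTERN.findall(message.get("text", ""))
    files = [f.get("name", "") for f in message.get("files", [])]
    return {"links": links, "files": files}


# Hypothetical message for illustration.
example = {
    "text": "see <https://example.com/spec|the spec> and <https://example.org>",
    "files": [{"name": "budget.xlsx"}],
}
print(associated_content(example))
```

An inventory like this tells you what else has to be captured alongside the message text for the conversation to make sense later.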
Add all that together, and Slack collections can feel altogether “unpossible.”
Transforming the Firehose to a Manageable Stream
Fortunately, there are ways to limit the flood of Slack data and make reasonable collections possible. Here are three best practices to get you started.
1. Create clear policies for Slack use, backed by useful channel and group designations.
One of the challenges with collecting Slack is its casualness. Instant messaging encourages abbreviations—which aren’t always consistent—and rapid typing can lead to misspellings or mistakes. These can make searches for key terms hit-or-miss. While your collection and review tools should enable you to filter and search Slack text in a sophisticated way (see below), you can start things off on the right foot by crafting clear policies for Slack use. For example, you might limit the use of unusual abbreviations or establish rules for adding tags to work-related conversations. There’s a nifty side effect to having a policy about Slack communications: it reminds employees that their Slack conversations could be monitored or searched, encouraging them to keep it on-point and professional. (Good luck with preventing typos, though.)
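To see why exact keyword searches are hit-or-miss, and one way around it, here is a small sketch using Python's standard difflib module to catch near-miss spellings. The glossary of abbreviations is entirely hypothetical (in practice it would come from your own custodian interviews), and the 0.8 similarity threshold is an assumption you would need to tune against real data.

```python
import re
from difflib import SequenceMatcher
from typing import List

# Hypothetical abbreviation glossary: shorthand -> canonical search term.
GLOSSARY = {"mrs": "morris"}


def fuzzy_hits(text: str, term: str, threshold: float = 0.8) -> List[str]:
    """Return the words in `text` that match `term` exactly, via the
    glossary, or closely enough to be a likely typo."""
    target = term.lower()
    hits = []
    for word in re.findall(r"[a-z0-9']+", text.lower()):
        candidate = GLOSSARY.get(word, word)  # expand known shorthand
        if SequenceMatcher(None, candidate, target).ratio() >= threshold:
            hits.append(word)
    return hits


print(fuzzy_hits("the infringment claim is bogus", "infringement"))
print(fuzzy_hits("mrs deadline moved", "morris"))
```

A lower threshold catches more typos but also more false positives, so it is worth testing on a sample of real messages before relying on it.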
Additionally, consider generating channel and group designations that will pre-sort relevant information. If everyone on the Morris project is in the @Morris group and all discussions about that project occur within the group, then any collections for litigation related to Morris should (ideally) be easy to locate.
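Here is a sketch of how that pre-sorting pays off at collection time. It assumes a standard Slack workspace export that has been unzipped to a folder containing one subfolder per channel, each holding one JSON file per day; the function name and the "morris" channel are hypothetical, and an export from your own environment may be laid out differently.

```python
import json
from pathlib import Path
from typing import Dict, Iterator


def channel_messages(export_root: Path, channel: str) -> Iterator[Dict]:
    """Yield every message from one channel's daily export files, in date order."""
    channel_dir = export_root / channel
    for day_file in sorted(channel_dir.glob("*.json")):
        for message in json.loads(day_file.read_text(encoding="utf-8")):
            # Tag each message with where it came from so the source is not lost.
            yield {"channel": channel, "day": day_file.stem, **message}
```

Calling list(channel_messages(Path("slack_export"), "morris")) would then gather only the Morris project's messages, leaving the rest of the export untouched.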
2. Incorporate Slack into your ediscovery preservation and collection pipeline.
In many workplaces, employees initially started using Slack on the sly, without running it through their official IT pipeline. Even after the enterprise officially adopted and sanctioned the use of Slack, its “below the radar” vibe may persist. This can translate to a gap in the legal hold or broader ediscovery pipeline.
If this has happened in your organization, go back to the beginning and start over. First, hold meetings or training sessions to make sure that everyone understands that Slack messages are potentially discoverable and subject to legal holds. You can also introduce your new Slack use policies at these meetings.
Revisit your legal hold notification process and make sure that Slack data is included in your hold notice. (We’ve got some exciting announcements coming about legal holds in Slack; stay tuned!) Incorporate Slack into both your initial custodian questionnaires and your custodian interviews to be sure you’re gathering information about what Slack messages may be discoverable. Spend some time inventorying your Slack environment and really familiarizing yourself with the different data streams within your enterprise. Your questionnaire and especially your interviews should drill down to find out which channels, groups, direct messages, tags, names, abbreviations, and words your employees are using to discuss relevant issues.
3. Use tools that enable sophisticated searching and filtering of exported data.
Remember that the end goal of any ediscovery effort isn’t merely to find the most relevant and persuasive ESI, but to actually do something with it. In other words, you need to be able to extract potentially important data into a usable form. “Usable,” for our purposes, means that you can view that information in a fully functional native format, clicking on links and accessing associated files or other content. You also need to be able to search and filter your exported data so that you can continue to separate the wheat—the really good stuff you’re after—from the chaff surrounding it.
Don’t forget, too, that you’re going to have to produce discoverable ESI to your litigation opponent and perhaps, ultimately, present it as evidence in court. If your capture tools don’t allow that, you’re merely gathering background information, not generating courtroom-ready content. Slack Enterprise Grid enables users to extract data from any Slack channel for compliance or ediscovery purposes using its API (application programming interface). Bear in mind, though, that the exported data won’t necessarily be compatible with your ediscovery review platform.
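To illustrate the compatibility gap, the sketch below flattens already-exported messages into a simple CSV that a review tool could ingest. The column names are hypothetical rather than any particular platform's load-file specification, and the "ts", "user", and "text" field names are assumptions based on Slack's standard export. Note how much is lost along the way (threads, reactions, attached files), which is exactly the gap described above.

```python
import csv
import io
from datetime import datetime, timezone
from typing import Dict, Iterable

# Hypothetical column layout; a real review platform defines its own.
FIELDS = ["channel", "sent_utc", "user", "text"]


def to_load_file(messages: Iterable[Dict], channel: str) -> str:
    """Flatten exported messages into CSV text with readable UTC timestamps."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for m in messages:
        # Slack timestamps are seconds since the Unix epoch, stored as strings.
        sent = datetime.fromtimestamp(float(m["ts"]), tz=timezone.utc)
        writer.writerow({
            "channel": channel,
            "sent_utc": sent.isoformat(),
            "user": m.get("user", ""),
            "text": m.get("text", ""),
        })
    return buffer.getvalue()


print(to_load_file([{"ts": "1538352000", "user": "U1", "text": "hello, world"}], "morris"))
```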
Still feel like you’re going to drown in the flood of Slack data? Don’t panic! Hanzo specializes in locating and capturing dynamic, complex, unstructured data from the web and from apps like Slack. We work extensively with ediscovery and compliance professionals and our capture methods are fully authenticated for use in court. Our sophisticated search capabilities and native-format review tools can tackle the most overwhelming ediscovery challenges, allowing you to navigate collected data in its full dynamic context as if it were live.