[Authors: Jason Covey* and Sergio Garcia]
Editor’s Note: On November 16, 2022, HaystackID shared an educational webcast on the proliferation of enterprise messaging content in the COVID-era, a proliferation accelerated by the pre-existing industry trend of legal teams’ grappling with the increasingly complex nature of enterprise messaging vs. traditional email. With this acceleration and complexity, many organizations now face new messaging management and discovery challenges that can negatively impact investigations and litigation if not properly handled.
This session was developed and shared by a team of industry forensics, legal discovery, and Microsoft 365 specialists, and highlights a collection of enterprise messaging topics and provides expert considerations on the emerging issue of enterprise messaging platform data in eDiscovery.
While the entire recorded presentation is available for on-demand viewing, a complete transcript of the presentation is provided for your convenience.
[Webcast Transcript] Are You a Team Player? Managing Enterprise Messaging Data in eDiscovery
+ John Wilson
ACE, AME, CBE, Chief Information Security Officer and President of Forensics, HaystackID
+ Jason Covey
M365 eDiscovery Consultant
+ Sergio Garcia
Vice President of Forensics, HaystackID
+ Rene Novoa
Director of Forensics, HaystackID
Hello, and welcome to today’s webinar. We have a great presentation lined up for you today, but before we get started, there are just a few general admin points to cover. First and foremost, please use the online question tool to post any questions that you have, and we will share them with our speakers. Second, if you experience any technical difficulties today, please let us know in that same question tool, and we will do our best to resolve them. And finally, just to note, a recording of this session will be shared via email in the coming days.
So, without further ado, I’d like to hand it over to our speakers to get us started.
Hello, welcome from HaystackID. I hope you’re having a great week. My name is John Wilson, and on behalf of the entire team at HaystackID, I would like to thank you for attending today’s presentation and discussion titled “Are You a Team Player? Managing Enterprise Messaging Data in eDiscovery”. Today’s webcast is part of HaystackID’s regular series of educational presentations developed to ensure listeners are proactively prepared to achieve their cybersecurity, information governance, and eDiscovery objectives.
This webcast is being recorded for future on-demand viewing. We expect the recording and complete presentation transcript to be available on the HaystackID website soon after we complete today’s live presentation. Our presenters for today’s webcast include experts with a deep understanding of digital forensics, legal discovery, and Microsoft 365, with particular insight into enterprise messaging platform data and eDiscovery.
We do have a quick disclaimer. Microsoft 365 is an evergreen technology platform that introduces updates on a continual basis, which measure in at least the hundreds per month. This is true of both Microsoft Teams as well as the Microsoft Purview eDiscovery tools that will also be a part of today’s discussion. Although HaystackID’s made every effort to ensure that the content presented is accurate as of the time of the presentation, future viewers should exercise appropriate due diligence in performing tasks in live tenant environments and in real-world project scenarios.
Next slide. Today’s agenda will address the following topics in enterprise messaging with a specific focus on Microsoft Teams, as well as Slack. We will also address issues that are common to both platforms across the entirety of the presentation. These topics will certainly ebb and flow as we have the discussions to get through all of it.
Next slide. So, I am John Wilson. I’m the CISO (Chief Information Security Officer) and President of Forensics here at Haystack. I’m an industry veteran that’s been doing this work since the late ‘90s. I have done cases from, you know, single one-offs to extremely large cases throughout the industry. Sergio Garcia was supposed to be with us today. Unfortunately, he cannot be with us, and then we have Jason Covey, our M365 eDiscovery consultant, and Rene Novoa, our Director of Forensics and R&D.
So, let us now begin the presentation by moving into greater detail and talking about Microsoft Teams, and I will now turn the presentation over to Rene.
Thanks, John. We’ll go to the next slide, please. Wonderful. Thank you, everybody, for joining us today, and we’ll get started with it, as we have a lot of material to cover between me, Jason, and John, that’ll be stepping up.
So, according to Microsoft, Teams is a workgroup application for consumers, enterprises, government, and education customers. It provides organizations with workspaces that were designed to enhance employee collaboration through chat, file sharing, meetings, and associated applications. Microsoft also describes it as high velocity, which means that it is constantly changing. It’s constantly interacting with other applications inside the environment. When we talk about Microsoft Teams, we’re just talking about one application with so many communicative pieces to it. As you can see by the chart on the left, there are so many other applications within Microsoft’s ecosystem, and how they all work together is what makes it such a high-velocity part of the Microsoft family. So, this represents a very significant departure from our long-held understanding of email in the context of eDiscovery, and really is at the heart of some of the unique challenges posed by enterprise messaging content. As we remember, back in the day, email was how we all communicated, and Microsoft Teams has really added to that with an additional platform to chat on, to communicate, and to collaborate with other team members.
Next slide. So, we’ve seen a large explosion of Microsoft Teams over the last few years, unfortunately, driven by COVID. There’s been almost an 893% growth in metrics in the use of Teams, because people had to find ways to communicate, how to collaborate while not being in the office, and have the ability to have so many more tools in comparison to Zoom, Slack, and some of these other platforms that were one-offs of doing visual communication. Still, really, this is an all-in-one where you’re able to bring so many other functions and features from inside of the Microsoft family, specifically Microsoft Teams.
So, I think we have an estimate of almost 300 million users. We’re closing in on 270 on this chart, but I think we’re pushing 300 million active users, which, as the slide indicates, not only just government, but educational use, and corporations alike, which is making it even more important for us as discovery practitioners, on the eDiscovery side, of how to collect, what to do with the data, and how to filter through all this information, because over this time, so many features, so many different intricacies of how we communicate were added. Well, a lot of IT and back staff were not ready for all these changes for security, for collaborations, and how we’re transforming the information, attachments, modern attachments, group chats, and secret chats. All these things are becoming very important, and it happened really fast, where we didn’t realize whether we should do it, were we ready for it, and we’re trying to play catch up. So, it’s been a great run, and we’re learning a lot, and like John had in the disclaimer, it is constantly changing over and over again, hundreds of times within a week, to a month, to a year that we have to be prepared for.
Go ahead, and we’ll go to the next slide, and the slide was to demonstrate the growth of it, and how, as practitioners, we’re seeing it more and more, and we want to start with the history to realize that if you’re just getting into eDiscovery, that you’re going to see Microsoft Teams as a specialty, more than just email and some of the other features of Microsoft.
So, one of the important things, as we start to understand anything about Microsoft Teams, is the licensure. So, over the next three slides, we’re going to spend a few moments addressing the effects of the licensing for both Teams as well as the Microsoft toolset, and what will facilitate the handling of Teams data in eDiscovery and investigation scenarios. So, the Teams application itself is standard in the M365 Enterprise (E3 or E5) subscription tiers. For those who are not familiar with these classifications, we broadly generalize E3 licensure as being for fewer than 300 users, while E5 is for organizations with more than 300. That’s not always applicable, but it’s a generalization of what we see as far as general organization licensures. The E actually stands for Enterprise, and the tiers are similarly designated G for Government, A for Academic institutions, and F for Frontline workers. We will actually talk about that in later slides as having the premium service in some of these advanced licensures with these designations. So, it may not always be E3. There may be a G3 or an A3, but we’ll get into it later on in the slides. But it is important to know that the licensure does affect how we collect, import, and export information related to Microsoft Teams.
Isn’t it true that the licensure would also affect the interaction between all of the platforms within M365? As you add higher levels, more features become available, and more interaction between applications becomes available.
Absolutely, and we’re definitely going to cover that as we move through that. This chart also gives us more analytical tools to be able to export data and to be able to filter that information out. We also are going to be talking about how we don’t have to have the entire organization on an E5 license to have those great features. We can actually do some audit add-ons. So, that will be absolutely covered as part of it.
Next slide, please. As we were talking about, some of the important takeaways with the subscription tiers are the technical capabilities of the Microsoft toolset, as John had mentioned, for handling Teams data. Having the E5 gives you the premium eDiscovery feature, which allows you to export Teams conversations in the form of a Teams HTML transcript, as well as the automatic inclusion, via association, of modern attachment content. So, it does give us the ability to do more with the E5 license. What is important, like I mentioned before, is that we can upgrade specific individuals, specific custodians, however you want to put it, so that we have those additional features and can export in a more threaded, more viewable way as far as HTML. So, by understanding the licensure of a corporation or a client when we walk in, we’ll understand what will be needed, so that we can advise or consult on the upgrade and then be able to perform the task that’s necessary.
We can go on to the next slide, please, and this looks wild. I know this screen looks wild, but it’s to demonstrate the different licensures of Microsoft 365 and where eDiscovery premium is available. eDiscovery premium is available both if you have an E3 license or an E5 license between Office 365 and Microsoft 365. Even though one has cannibalized the other and is mainly now Office 365. Did I get that right, Jason? It is now Office 365, right? It got cannibalized by Microsoft, correct?
That is correct.
I wanted to make sure I didn’t get them backward, but they are one and the same. It is important to note that this is where it is available, or it is available by having an audit add-on. So, it is a much cheaper, much easier upgrade if you have 300 or 400 licenses or employees, but only a handful of individuals are needed for discovery or for compliance or need to be on legal hold, and we need to upgrade those licenses to do those specific collections, those advanced features, those premium features that give us the ability to thread Microsoft Teams and allow us to do the additional legal holds, some of the additional things that we can do not only with email, but with Teams and OneDrive. It all comes as part of the eDiscovery premium package. We can then upgrade per month at a relatively low cost as opposed to upgrading the entire tenant, which can be very costly. So, it is very important to know that you can pick and choose the licensure and move it back and forth, but in order to have this eDiscovery premium, which we’re going to get into in quite a bit of depth in future slides, we will have to upgrade those individuals. But the examiner, the person that’s doing the work, does not have to have an E5 license. They can work within the standard license. They do have to have certain privileges. They’ll have to be an eDiscovery manager. There’ll be certain criteria for the access they’ll have to be given. But their licensure does not have to be E5, whereas the individuals whose data is being collected will definitely need to be upgraded to an E5 license, or the environment needs to be under an E5 licensure.
Jason, anything to add to that part before we turn this over?
I think not. If you don’t mind, I’ll go over a little more detail of this confusing graphic that has a lot going on.
Not a problem.
So, we pulled this graphic from Microsoft’s larger nine-page document, called their Modern Work Plan Comparison document, and I tried to annotate it further to help emphasize the details that are relevant to Microsoft Teams. So, the red rectangle emphasizes the row for premium eDiscovery, and you have the bullets appearing in any columns where that option is available. Those columns are intersected by the yellow rectangles, which indicate the applicable licensing within which premium eDiscovery is available. Then we have that dashed green line that emphasizes the columns or licenses in which eDiscovery practitioners are most likely to encounter premium eDiscovery in the M365 tenants of organizations with more than 300 employees. So, we did this to draw the distinction from the less common frontline worker plans, which are on the right side of the chart. As if that wasn’t enough information to digest, and I’ll tell you, looking at the full nine-page version of this document is really something, Microsoft has two additional details related to licensing that often come into play in the context of eDiscovery.
So, although it’s not uncommon to encounter organizations that only have E3 licensing, which we noted earlier still includes the Teams application, Microsoft provides two additional options via which to obtain access to premium eDiscovery. As specified in the text of the slide, that can be achieved via an E3 license that has either a compliance add-on or an eDiscovery and audit add-on license, so E3 plus compliance, or E3 plus the eDiscovery and audit add-on. The reason we emphasize this latter option, as Rene alluded to, is that it provides a less expensive scenario to achieve the premium eDiscovery functionality that we sometimes need when upgrading those individual custodian accounts to provide the greatest flexibility in handling Teams data, which we will address in greater detail later in the presentation.
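Editor’s Note: The licensing paths described above can be sketched as a simple check. This is a minimal illustration only. The license labels below (`E5`, `E3`, `compliance_addon`, `ediscovery_audit_addon`) are hypothetical names chosen for this sketch, not Microsoft SKU identifiers, and the sketch ignores the G, A, and F designations mentioned earlier.

```python
# Hypothetical license labels for illustration only; not Microsoft SKU identifiers.
PREMIUM_ADD_ONS = {"compliance_addon", "ediscovery_audit_addon"}


def has_premium_ediscovery(licenses):
    """Return True if the license set matches a path described in the session:
    E5, or E3 plus either of the two add-ons."""
    licenses = set(licenses)
    if "E5" in licenses:
        return True
    return "E3" in licenses and bool(licenses & PREMIUM_ADD_ONS)


def custodians_needing_upgrade(custodian_licenses):
    """List custodians whose current licenses do not provide premium eDiscovery."""
    return sorted(
        name
        for name, licenses in custodian_licenses.items()
        if not has_premium_ediscovery(licenses)
    )
```

Run against a custodian list at the start of a matter, a check like this would flag only the handful of accounts that need the add-on, rather than the whole tenant.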
Next slide, please. So, at this point, we’ll move on to the subject of Teams content types, and again, the most sought-after of all, which are Teams messages available under the Chat bubble icon in the Teams application.
Teams messages can be broken down into two primary categories. You have direct messages or one-to-one chats, and group messages or one-to-many chats. The 1:N reads as one-to-many. These two categories typically make up the most voluminous and valuable content for eDiscovery and investigation matters because, as with all chat messaging content, we tend to see participants speaking in a less reserved, off-the-cuff manner that can be very revealing under scrutiny. As a result, some organizations opt to severely limit the duration of Teams and other messaging content by way of ultra-short duration retention policies.
As such, detailed knowledge of these policies, along with other nuances of information governance, is a necessary component in the proper handling of both Teams and Slack, as well as other enterprise messaging data. It should be noted that these issues serve as major points of distinction versus other types of personal or mobile device messaging technologies.
So, there are both similarities in that it is still chat content, but also these significant differences in the context of an enterprise environment.
The other category of Teams communication content revolves around Teams channels, which are available under the Teams icon in the Teams application. Teams channel messages, or posts, which are available under the tab of the same name, are made up of chat messages, posts, replies, and attachments shared in a standard Teams channel. With the exception of Teams private channel content, which we will be addressing separately, Teams public channel content tends to be less direct, more measured, and more similar to traditional email on average, in light of its broader accessibility within a larger team, but organization-specific exceptions to that generalization can always exist. Next slide, please.
So, a quick question around that is, where do distribution groups and that kind of thing fit into that breakdown?
So, Teams groups, Exchange groups, and Microsoft groups share the same paradigm. There are some underlying commonalities in how that’s handled in Exchange versus Teams content. The connection between Teams and Exchange is pretty deep, as we’re going to get into later in the presentation. The communication content that underlies Teams is supported by Exchange. And distribution groups are a part of that. So, there are ways that those functionalities interact. I think this is something we alluded to earlier as well. There’s interaction between these different layers, that is, the applications that function around the Microsoft substrate that underpins everything. So, there’s absolutely interaction between those different components of Microsoft technology.
Got it. And then I’m going to roll us back for just a second. We had a question from one of the attendees. “When you upgrade a custodian to E5, how long does it take for the license to apply and the index to update?”
So, my answer to that is going to be it depends. It’s generally something – I think the general rule of thumb is to allow 24 hours or so, but we don’t have a whole lot of direct insight into how that happens. I think the general guidance is be patient because whenever we make a change like that within the environment, that command has to get in line with all the other commands that are going on tenant-wide within the different M365 workloads.
It depends on factors such as the size of the organization, the number of users, and if it’s a multi-geo tenant arrangement. Some of those factors can come into play, so it’s hard to give a precise answer because there isn’t necessarily granular visibility on that specific issue. But those are, what I would say, are the general rules of thumb. Giving it more time rather than less.
The other thing I could say is that when functionality is available, there are some clues that it’s available as new things will pop up in the UI. So, that’s a broad answer to that question.
Thank you very much, and proceed on.
All right, so we now arrive at an aspect of Teams eDiscovery that’s at the source of some significant heartburn for eDiscovery professionals. I am referring to the Teams private channel.
So, first off, the creation of Teams private channels can be disabled by Teams administrators. And from an IG perspective, that’s something that should be given careful consideration, as they lack organizational visibility in their current state when they’re made available to users. And we’ll talk a little more about that in the subsequent slide.
A standard Teams channel is a regular Teams channel with a parent team, but it is sometimes referred to as a public channel in order to distinguish it from private channels.
Anyone can join a public channel without any approval needed. In contrast, a Teams private channel is accessible by invitation only, and a channel owner’s approval is required to obtain access. We’ll get into some additional issues that come into play with private channels in just a moment. Next slide, please.
On the broader subject of non-messaging data types, Teams includes Outlook-synced calendar information, call history information, and files that were shared either within a team or in the context of a chat. Collection of the latter two types falls into the problematic category, particularly so for files, which we will be exploring in great detail. Next slide.
So, this slide provides some additional visibility into the location of Teams content, which continues to be a source of much confusion, and which makes today’s discussion ever more relevant. With regard to Teams one-to-one or one-to-many messaging content, the key points to understand are that the messages are contained in Microsoft Exchange, in the individual mailboxes of each participant, and that the files shared in this context reside only in the OneDrive location of the participant who shared the file. With regard to Teams public channel conversation content, messages are actually stored in a dedicated Exchange group mailbox that is automatically created at the same time as the Teams channel. And files are shared in a SharePoint location that is also automatically created and associated with the team.
Note that the title on the right graphic should actually read Teams Public Conversations to further underscore that distinction.
So, understanding these concepts and proper execution are critical to achieving accurate, complete collections of Teams data. And these points we’re going to be reiterating in subsequent slides. So, this theme will be revisited. Next slide, please.
If you’re paying extra close attention, you’ll notice that I glossed over private channels in the prior slide. And the reason is to contrast the distinctions in this slide. As opposed to public channels, where we just specified that Teams messages are stored in the automatically created Exchange group mailbox, for private channels, in order to ensure the security parameters, messages revert to being stored in the individual mailboxes of the private channel members. You’ll note that this is the exact same behavior we have with Teams direct or one-to-many chats. However, unlike Teams chats, files shared in private channels are stored in a dedicated SharePoint site that is also automatically created and associated with the private channel. It should be noted that even if a non-member were to somehow obtain the direct URL of a private channel SharePoint site, the security permissions would still prevent that user from accessing the content, and it would actually display a dialogue that they have insufficient permission to proceed. Next slide, please.
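Editor’s Note: The storage locations described across these two slides can be summarized as a small lookup table. This is a sketch for orientation only. The content-type keys and the `collection_targets` helper are hypothetical names introduced here, and the location descriptions simply restate what the speakers said at the time of the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageLocations:
    messages: str  # where the message content is stored
    files: str     # where shared files are stored


# Restates the locations described in the session; keys are hypothetical labels.
TEAMS_STORAGE = {
    "chat": StorageLocations(
        messages="Exchange mailbox of each chat participant",
        files="OneDrive of the participant who shared the file",
    ),
    "standard_channel": StorageLocations(
        messages="dedicated Exchange group mailbox (created automatically)",
        files="SharePoint site associated with the team",
    ),
    "private_channel": StorageLocations(
        messages="Exchange mailbox of each private channel member",
        files="dedicated SharePoint site for the private channel",
    ),
}


def collection_targets(content_type):
    """Look up where messages and files live for a given Teams content type."""
    try:
        return TEAMS_STORAGE[content_type]
    except KeyError:
        raise ValueError(f"unknown Teams content type: {content_type!r}") from None
```

For example, `collection_targets("private_channel").messages` points at member mailboxes rather than a group mailbox, which is the distinction this slide is drawing.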
So, as we have now seen the special nature of Teams private channels – along with their higher-than-average potential to contain sensitive information that could be highly relevant to a legal matter – definitely pose some unique challenges. First of all, there’s no direct method to even identify the existence of a Teams private channel via Microsoft’s eDiscovery tools. In fact, in the last few years, there have actually been situations where the existence of Teams private channels only came to light far later, during the actual review process, when messages were identified that included custodian participants whose data was otherwise not included in the collection. So, it’s only with investigation and remedial collection effort that the existence of the private channel was made known, and then addressed via supplemental collection efforts. Obviously, we try to avoid those types of things in eDiscovery wherever possible.
Given the extent to which a scenario like that could negatively impact review, discovery deadlines, expensive motions practice, and just generally cast that party in a bad light, it’s obviously best to avoid a scenario like that entirely. In our onscreen graphic, we’ve shown just a single piece of what’s a multi-step process that’s currently required to attempt to piece together the information that’s required to complete private channel data collection. However, as the technical processes to fully and properly identify these data sources in their entirety require administrative access that’s beyond the reach of most eDiscovery practitioners, some practical alternative approaches are definitely in order.
The first line of defense in avoiding the types of situations we’re talking about involves thorough custodian interviews that include detailed, specific questions about the existence and use of any private channels, those channel names, known channel members, and any specific SharePoint URLs. Although this is additional work that can be tedious in larger matters on the front end, we recommend undertaking it as a non-technical counterbalance to what’s otherwise a pretty extreme lift required to piece this information together without that benefit, when the same information is available to the end-user custodian in just a few clicks.
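Editor’s Note: One way to put those interview answers to work is to compare the reported private channel members against the current custodian list. The sketch below is a hypothetical illustration. The function name and the shape of the interview notes are assumptions made for this example, and whether a flagged member actually belongs in scope remains a decision for the legal team.

```python
def members_outside_custodian_list(interview_notes, custodians):
    """Map each reported private channel to members not on the custodian list.

    interview_notes: dict of channel name -> iterable of member addresses,
                     as gathered in custodian interviews (hypothetical shape).
    custodians:      iterable of custodian addresses already in scope.
    """
    in_scope = {address.lower() for address in custodians}
    gaps = {}
    for channel, members in interview_notes.items():
        missing = sorted({member.lower() for member in members} - in_scope)
        if missing:
            gaps[channel] = missing
    return gaps
```

Surfacing those names before collection is the point of the exercise: it is the same gap that, in the scenario described earlier, only came to light during review.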
With all that scary information divulged about Teams private channels, let me say something else that will hopefully ease your mind a little bit. Given the fairly extreme and long-standing difficulty that was introduced with Teams private channels and their segregation of the relevant M365 administrative responsibilities (this is particularly so in larger organizations), there has been intense pressure on Microsoft to provide a solution for eDiscovery managers independent of these constraints. That solution is taking the form of a forthcoming feature on the M365 roadmap. It’s number 88815 for Premium eDiscovery, which purports to provide, and I quote, “Simplified picking of relevant Teams channels standard, private and shared to intelligently identify associated custodian mailboxes, group mailboxes, OneDrive for business accounts, and SharePoint sites that may contain Teams data that’s relevant to your case”.
Translation, Premium eDiscovery will soon identify any Teams private channels that are accessible to your custodians.
So, that’s going to be a significant change. It’s been delayed a couple of times, it is long-awaited, and it’s Microsoft’s response to the percolating, rapidly intensifying need to address these challenges as these different areas of IT administration with M365 are often spread across different departments that aren’t able to communicate with each other, et cetera. So, that’s the underlying need to address that.
As I said, that’s the biggest issue at the very largest organizations. Next slide, please.
And while we’re loading the next slide, we do have another question about the public channels, and the public groups. What retention rules get applied to those, and how are those controlled when they’re not going directly to a user’s mailbox?
Teams has its own retention schedule that’s separate from Exchange. So, that’s a source of a lot of confusion because although Teams makes use of Exchange as the mechanism for transmission of the messages, its retention policies are handled completely separately in the Teams administrative console. And, like I said, that’s an aspect of the M365 administration that’s often separated across different users and possibly even departments within an IT organization.
So, the answer, I’m afraid, is it depends. And that’s part of one of the themes we try to emphasize in talking about M365 bigger picture is that interaction between principles of information governance and eDiscovery needs have never been more closely interrelated than they are in the paradigm of M365.
So, completely different settings can be applied, and retention settings can be applied to these different content types. And those can disagree with each other. So, that’s why this is a fantastic question because it gets at the heart of an area that is probably lacking understanding. And we’ll get into modern attachments here in a little bit.
But the components of the Teams data are in different locations. We talked about SharePoint – or shared messages, or shared files in the form of modern attachments, those are contained in either OneDrive or SharePoint, which are essentially the same thing. But the messages are contained in Exchange.
So, for example, if your retention policy was radically different for Teams content versus your OneDrive content, say your Teams messaging content is kept only for a week but your OneDrive is kept for five years, there can be scenarios where you have an attachment, but you don’t have the parent message, or vice versa. So, that’s a great question. And that’s at the core of what needs to be thought about in implementing these and ensuring that all the right personnel are at the table in those discussions. And that’s definitely an area in which we spend a lot of time counseling clients, and love doing it.
You’re definitely thinking about the right thing to answer that question, is my opinion on that.
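Editor’s Note: The one-week versus five-year example can be worked through in a few lines. This is a deliberately simplified sketch. It assumes each retention period is counted in days from the date the item was sent or shared and that content is removed as soon as the period lapses. Actual M365 retention behavior depends on how the policies are configured, and the function names here are hypothetical.

```python
from datetime import date


def surviving_parts(sent_on, today, message_retention_days, file_retention_days):
    """Report whether the message and its shared file are still retained.

    Simplifying assumption: both clocks start on the date the item was sent.
    """
    age_in_days = (today - sent_on).days
    return {
        "message": age_in_days <= message_retention_days,
        "file": age_in_days <= file_retention_days,
    }


def is_orphaned(parts):
    """True when exactly one half of the message/attachment pair survives."""
    return parts["message"] != parts["file"]


# Teams messages kept for a week, OneDrive files kept for roughly five years.
parts = surviving_parts(date(2022, 10, 1), date(2022, 10, 31), 7, 5 * 365)
print(parts, is_orphaned(parts))
```

Under these assumptions, a file shared 30 days ago is still in OneDrive while its parent message is gone, which is the attachment-without-parent scenario described above.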
Moving on. Let’s make sure we’re on the right slide. So, as it specifically relates to Teams data, and as we alluded to earlier in the context of licensing, some significant differences exist between the capabilities that are available in Standard versus Premium. And there are actually some pros and cons between them.
Generally speaking, Standard eDiscovery provides Teams messages as individual items i.e., as individual documents when viewed in Relativity or other review platforms with no context to the larger conversation thread, either within PST containers or as foldered MSG files, but does not include any attached files that may have been exchanged. The only exception to this is any content that can be displayed inline which typically includes most graphic file types, as well as HTML formatted elements.
In general, Standard eDiscovery equates to more difficult review unless it’s subjected to some complicated post-export processes that reassociate conversation threads, and it also comes without any attachments that may have been exchanged. However, it is also possible to correct that, but with even greater additional effort.
So, in the new paradigm, where enterprise messaging is effectively replacing email as the primary method of communication in many organizations, this is obviously problematic in many respects. In contrast, Premium eDiscovery provides an exclusive native format for Teams message content, which automatically preserves both conversation threads and modern attachments, while collecting and maintaining a family relationship for any modern attachment content that exists in the conversation thread. It does this in the form of the Teams HTML transcript files, which are generated at the time of the export.
We’re going to be showing you an example of this in a later slide so you can see exactly what it looks like. But this format provides a pretty significant benefit in terms of review, as the chronological context of the conversation is preserved and then presented in a manner that very closely resembles the appearance within the native Teams application.
Further, any files that are exchanged are included and accessible for review within the context of their original appearance in the conversation.
So, in other words, they are presented as attachments in a manner that is effectively identical to the long-established display of email families in document review platforms that Rene alluded to earlier. However, the not insignificant downside of this approach is that there is a loss of metadata and granularity in analyzing specific messages via date filtering, absent additional post-collection processing. And there’s also a potential need for what can be some significant manual redaction to remove adjacent conversation content that could be irrelevant, off-topic, or offensive, but is reflective of the spontaneous nature of instant messaging we referenced earlier.
So, in summary, although Microsoft solved one issue, at the present time it’s an open question as to whether that solution created a new problem of equal or greater significance. Next slide.
So, this slide provides an overview of the features available in Standard versus Premium eDiscovery. We’re not going to delve into many of the specifics but just wanted everyone to make note of the much longer list of capabilities offered in the Premium column. Do note, however, the two highlighted Teams-specific items we just mentioned. The first is the automatic inclusion of cloud or modern attachment content. Cloud attachments, cloud links, and modern attachments are Microsoft-specific terminology at the moment, but they all refer to the same thing. It’s the same concept if you have Gmail with linked Google Docs. The second is the compilation of Teams messages into more easily reviewable conversation threads.
One capability that deserves quick mention is the legal hold notification feature that is included in Premium eDiscovery, which is of particular interest to corporate law departments as its features are maturing to a point where they can begin to challenge some of the major players in that market segment.
I wanted to mention that for the benefit of any litigators today that might be in attendance, because you may start hearing about the entirety of legal hold being handled in M365, and that is the reason for that. Next slide, please.
To provide a bit of additional perspective on what we’ve been describing with regard to M365’s eDiscovery capabilities, this slide depicts Microsoft’s visualization of the EDRM, which we have annotated to overlay the stages that Standard and Premium eDiscovery support.
As you can see, Premium’s eDiscovery functionality extends farther into the right-side stages and supports some significant eDiscovery processing capabilities. It also provides a basic review experience that includes document coding and redaction. And probably most significant of all are the analytic capabilities that include email and conversation threading, deduplication, near-duplicate analysis, themes detection, and even a predictive coding capability, which is currently slated for a significant revision sometime in 2023. Next slide, please.
In this slide, we’ll focus on Standard eDiscovery for the purpose of highlighting its impact on the handling of Teams data, which is obviously at the heart of what we’re discussing today. Standard eDiscovery is available to collect from any end user with at least an E3 license. However, Standard also remains available with higher licensing, which can be helpful in certain situations.
So, in other words, if you add E5, the E3 capabilities of Standard eDiscovery do not go away. In Standard eDiscovery exports, with Teams messages specified for collection – which we’ve shown in the left-hand graphic with the red arrow – Teams content can be exported either in PST format, of which there are a few different sub-variations, or as foldered MSG files. The industry best practice at this point is to export each custodian as a separate PST, without including any other content types.
As an aside, you want to note that the concept of the custodian is unique to Premium eDiscovery, whereas in the context of Standard eDiscovery we’re pretty much referring to a user. So, that’s a minor terminology difference that exists, and you’ll find more than one of those in the Microsoft eDiscovery realm. In addition, the entirety of each custodian’s OneDrive must either be exported from the outset, or individual files would need to be collected in a subsequent collection. Returning to the fact that we want to drill home: modern attachment content is not included in Standard eDiscovery exports of Teams content.
And that’s really the problem with Teams content addressed via Standard eDiscovery: the modern attachment content, or the files that are shared with other users via Teams, is not collected at all as part of the Teams data export process. Instead, the individual Teams message MSG files will simply display a link to the SharePoint URL of the cloud attachment, which is identifiable via the cloud-shaped icon that’s overlaid onto the file type icon. So, if you see a Teams message where a Word document was attached, you’ll see the familiar blue icon for a Word document, but it’s going to have a little outline of a cloud shape on top of it. That helps you make that visual distinction.
The only exception to that is inline files that can be displayed within an email. Those are generally common graphics file formats, as well as certain HTML-formatted content like the Teams poll example that we’ve shown in the right-hand screenshot.
The other important issue to note is that message conversations will not be threaded in a Standard eDiscovery export. And that can only be addressed via post-export processing that we mentioned earlier. Absent that processing, review is complicated because there’s no context provided, and that causes difficulty for reviewers tasked with identifying relevant content and understanding conversations that are only appearing in a broken one-sided manner.
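The post-export processing referred to here can be pictured as a grouping step. The following is only a minimal sketch in Python, assuming each exported message has already been parsed into a record; the field names (`conversation_id`, `sender`, `sent_utc`, `body`) are hypothetical and are not Microsoft’s export schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ChatMessage:
    # Hypothetical fields; a real export needs its own parser and field mapping.
    conversation_id: str
    sender: str
    sent_utc: datetime
    body: str


def rebuild_threads(messages):
    """Group stand-alone messages by conversation and sort each group by time."""
    threads = defaultdict(list)
    for msg in messages:
        threads[msg.conversation_id].append(msg)
    for thread in threads.values():
        thread.sort(key=lambda m: m.sent_utc)
    return dict(threads)


def utc(hour, minute):
    """Helper that builds a UTC timestamp on the sample date."""
    return datetime(2022, 11, 16, hour, minute, tzinfo=timezone.utc)


# Messages arrive in no particular order, one per exported item.
sample = [
    ChatMessage("chat-1", "Rene", utc(9, 5), "Works for me."),
    ChatMessage("chat-2", "John", utc(9, 1), "Slides are uploaded."),
    ChatMessage("chat-1", "Jason", utc(9, 0), "Can we move the weekly meeting?"),
]

threads = rebuild_threads(sample)
for conversation_id, thread in threads.items():
    print(conversation_id, [m.body for m in thread])
```

In practice the hard part is obtaining a reliable conversation identifier from the exported items, which is one reason this step is typically left to a vendor’s processing workflow.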
Lastly, any content collected via keyword criteria will only identify the message that contains the direct search hit as opposed to the contextual search results that are provided in Premium eDiscovery. In other words, you will have the message that contains your actual search hit but none of the surrounding conversations to provide that context. So, if I am sending Rene a message, and it includes the term “weekly meeting”, and that’s a single sentence, we are going to get that direct search hit only. It’s going to have the search hit for “weekly meeting” and absolutely no context. So, it’s going to be an MSG file that contains a single sentence that is representative of the nature of Teams chat content. Next slide, please.
So, in this slide, we’ve prepared an example to better convey the appearance of the Teams data export from Standard eDiscovery, and it shows what I was just talking about. So, in this example, we’ve got a single PST export for a single custodian, and you will see that the messages are contained in a folder called TeamsMessageData as viewed in Outlook.
In the middle screenshot, you can see what was just described with each individual Teams message appearing as a separate MSG file in Outlook.
We also want to emphasize that unlike email, with the exception of inline content, modern attachments are not present in this export format. That would include all common file types that you typically exchange like Word, Excel, PowerPoint, PDFs, et cetera. And obviously, that’s a huge consideration to understand at the outset of a collection, because that problem could be compounded many times in a scenario with, say, 50 custodians where heavy Teams usage is in play.
Again, that data resides in the OneDrive location of the user that shared those files in the context of direct or group chats, and in the associated SharePoint location for Teams channel conversations. As an aside, in scenarios where Teams content is exported from Premium eDiscovery due to its greater usability, collection teams should be sure to exclude Teams content from any data that is collected from Standard eDiscovery, so that they avoid duplication of data in their exports.
So, to describe that in a little bit more detail, if you’re doing an export, and you want the advantages that are obtained via Premium eDiscovery export of Teams content, you want to make sure you export that Teams content only in Premium eDiscovery. And then if you go back to Standard for the remainder of your content, you want to make sure you’re actively excluding Teams from that, or else that data is going to exist in both places and cause some confusing deduplication that will have to be sorted out after the fact. Next slide, please.
So, we have a question here in regards to what happens to the versioning of files when you export. Do you only get the current version, or can there be older versions?
Okay, so that is a loaded question, but another good one. So, the answer to that depends on – and here’s the interaction of information governance and eDiscovery yet again – it depends on whether SharePoint versioning is enabled in your tenant, and how that’s going to happen. Within Premium eDiscovery, we have an additional capability that allows all available versions to be included in the export. That’s up to 500 versions that it can hold.
However, that’s a loaded question because you might get what you ask for and see 500 versions of the document. You have to think about the ways that that could complicate review.
That is a great question to discuss with your eDiscovery vendor and IT personnel to understand how that’s implemented and what it might actually look like. Because that is a loaded question and a hot-button topic currently as it relates to M365 eDiscovery.
There is a forthcoming feature that is on the public M365 roadmap – which we’re going to provide a visual link to here at the end of the presentation – that attempts to address that and provide only the version of the file that was shared at that point in time, frozen in time. However, that’s going to be implemented via a retention policy that’s separate and has to be instituted, at which point it will work from that point forward. It will not reach backward.
So, there are a number of questions and some current discussion in the industry going back and forth about whether that’s required, how that affects discovery obligations in anticipation of litigation, and whether you’re required to preemptively preserve versions of documents as they change in time. And that’s a great question, because that’s really at the heart of the difference between email and the challenges that we face with enterprise messaging content, because they are not the same. It’s unfortunate that they look so similar, particularly in the Microsoft ecosystem, but they are very different with what’s going on behind the scenes. We’ve discussed some of the things along those lines today.
I would keep an eye on that, as that’s an emerging topic that’s going to be fleshed out publicly probably in the next year or two to come, but great question. If you do not select that option in Premium eDiscovery, or you do the export in Standard eDiscovery, you will get the most recent version of that document only. So, think about, I don’t know, some kind of contract that has been modified by different people, or documents that could be repurposed. When you’re doing a collection two years later that goes back in time, you could get a document that literally has nothing to do with the version of the document that was originally shared. It could have changed. If it were a contract, they could have changed the names of the entities, or it could be describing something completely different. So, that is definitely an area where you want to do your homework with regard to collections and make sure all the appropriate folks are at the table to help make those distinctions for you.
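To make the versioning risk concrete, here is a small illustrative sketch in Python. The version history, labels, and dates are invented for the example, and the selection logic is an assumption about what “the version as shared” means; it does not represent how M365 stores or exports versions.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical version history for one shared file: (saved-at time, label).
versions = [
    (datetime(2020, 3, 1), "v1: draft contract, Entity A"),
    (datetime(2020, 6, 15), "v2: signed contract, Entity A"),
    (datetime(2022, 9, 30), "v3: repurposed template, Entity B"),
]


def latest_version(history):
    """What a most-recent-version-only export would hand to reviewers."""
    return max(history, key=lambda v: v[0])


def version_as_shared(history, shared_at):
    """The version that was current when the link was sent, or None."""
    ordered = sorted(history, key=lambda v: v[0])
    saved_times = [saved for saved, _ in ordered]
    index = bisect_right(saved_times, shared_at)
    return ordered[index - 1] if index else None


shared_at = datetime(2020, 7, 1)  # when the file was linked in a chat
print("latest:   ", latest_version(versions)[1])
print("as shared:", version_as_shared(versions, shared_at)[1])
```

The two results differ, which is exactly the review problem described: the message was sent about the signed contract, but a latest-version export would pair it with the repurposed template.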
Let’s see, I want to make sure we leave enough time here. So, I’m going to fly through these next slides. Let’s see.
We mentioned the licensing requirements with regard to E5 versus E3 plus compliance for eDiscovery add-on. That is applicable, again, in the context of Premium eDiscovery. Next is the ability to export the threaded messages. Those are generally broken into 24-hour segments as Teams HTML transcript files, which facilitate that improved review experience. We’re going to see exactly what that looks like in the next slide.
Modern attachment content is automatically collected and included alongside the corresponding Teams messages. These files are further associated with their parent messages via an automatically generated CSV load file that is included with a Premium eDiscovery export, which contains multiple group IDs for downstream data handling in review platforms.
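As a rough illustration of how a load file can tie attachments back to their parent transcript, here is a minimal Python sketch. The column names (`file_id`, `family_id`, `file_name`) and the rule that a parent row’s file ID equals its family ID are assumptions made for the example, not the actual columns of the Premium eDiscovery load file.

```python
import csv
import io
from collections import defaultdict

# Stand-in for a load file; the column names are assumptions for illustration.
LOAD_FILE = """\
file_id,family_id,file_name
100,100,conversation_2022-11-16.html
101,100,Budget.xlsx
102,100,Agenda.docx
200,200,conversation_2022-11-17.html
"""


def build_families(handle):
    """Map each family ID to its parent record and its attachment records."""
    families = defaultdict(lambda: {"parent": None, "attachments": []})
    for row in csv.DictReader(handle):
        family = families[row["family_id"]]
        # Assumed convention: the parent row carries its own ID as the family ID.
        if row["file_id"] == row["family_id"]:
            family["parent"] = row["file_name"]
        else:
            family["attachments"].append(row["file_name"])
    return dict(families)


families = build_families(io.StringIO(LOAD_FILE))
for family_id, family in families.items():
    print(family_id, family["parent"], family["attachments"])
```

A review platform performs the equivalent of this grouping at load time, so the real column mapping should be confirmed against the export’s own load file before relying on it.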
A significant difference that I want to mention exists between Standard and Premium eDiscovery when it comes to targeted collections performed via keywords, versus what I described earlier for Teams chat content. Instead of only individual messages with keyword hits being included, Premium eDiscovery includes an option at the collection stage to collect contextual Teams and Yammer messages around your search results.
So, what this means is that adjacent content is included from an approximately 12-hour period both before and after your actual search hit. So, the effect of the feature is to provide additional context around keyword search hits, which results in more meaningful search results and gives reviewers the context of the larger conversation.
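The difference between a direct hit and a hit with context can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the sample conversation is invented, the 12-hour figure is the approximate value quoted above, and Microsoft’s actual selection logic may differ.

```python
from datetime import datetime, timedelta

CONTEXT_WINDOW = timedelta(hours=12)  # approximate figure quoted in the session

# (sent time, text) pairs from one hypothetical conversation.
conversation = [
    (datetime(2022, 11, 14, 8, 0), "Morning all."),
    (datetime(2022, 11, 15, 9, 0), "Did the vendor reply?"),
    (datetime(2022, 11, 15, 14, 0), "Moving the weekly meeting to Thursday."),
    (datetime(2022, 11, 15, 14, 5), "Thursday works."),
    (datetime(2022, 11, 17, 10, 0), "Lunch?"),
]


def direct_hits(messages, term):
    """Only the messages containing the search term (Standard-style result)."""
    term = term.lower()
    return [m for m in messages if term in m[1].lower()]


def hits_with_context(messages, term, window=CONTEXT_WINDOW):
    """Hits plus every message sent within the window before or after a hit."""
    hit_times = [sent for sent, _ in direct_hits(messages, term)]
    return [
        m for m in messages
        if any(abs(m[0] - hit) <= window for hit in hit_times)
    ]


print(len(direct_hits(conversation, "weekly meeting")))
for sent, text in hits_with_context(conversation, "weekly meeting"):
    print(sent.isoformat(), text)
```

With the sample data, the direct search returns the single “weekly meeting” message, while the contextual version also returns the question before it and the reply after it.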
Teams channel content is handled slightly differently than chat as noted in our on-screen graphic, which reflects the differing nature of channel content versus chat that we have alluded to throughout. In general, the individual channel posts – this is the case for both public and private – and all replies are included in a single HTML transcript file, which also serves to facilitate the more effective review experience.
One side note to be aware of with the Teams HTML transcript format is that all of the messages are displayed in UTC format only. And there’s no timezone customization currently supported. So, that’s obviously another scenario that you want to discuss with your vendor in order to identify whatever options are best for you in your specific scenario. Next slide, please.
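One option a vendor might apply downstream is converting the UTC timestamps to the time zone that matters for the case. Here is a minimal Python sketch, assuming the timestamp has been extracted as text in a `YYYY-MM-DD HH:MM:SS` layout (a hypothetical format) and that the review team wants US Eastern Standard Time.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the reviewer wants US Eastern Standard Time (UTC-5). A fixed
# offset ignores daylight saving time; zoneinfo.ZoneInfo handles that where
# a time zone database is available.
REVIEW_TZ = timezone(timedelta(hours=-5), "EST")


def to_review_time(utc_text, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse a UTC timestamp string and convert it to the review time zone."""
    parsed = datetime.strptime(utc_text, fmt).replace(tzinfo=timezone.utc)
    return parsed.astimezone(REVIEW_TZ)


local = to_review_time("2022-11-16 02:30:00")
print(local.isoformat())  # 2022-11-15T21:30:00-05:00
```

Note that the converted message falls on the previous calendar day, which is the kind of shift that can matter when date filtering or building a chronology.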
So, in this slide, we finally see the actual output of a Teams HTML transcript from Premium eDiscovery. As you can see, the message author is displayed in the document text along with a date and timestamp for each message. You’ll also note the presence of a modern attachment link, which makes it clear to a reviewer that that content type is also present. The purple shading you see is actually contextual in nature and represents the custodian of the particular data being displayed. In this case, Lynne Robins’ Exchange mailbox is the custodial data source of the Teams conversation being displayed.
In the upper graphic, we’re showing an example of a raw export from Premium eDiscovery. The CSV load file we’ve been referring to is called Export_Loadfile_1of1 and is accompanied by an error log that is present even if there are no errors; in that case, it will simply be empty. And then there’s also the native files folder that is similar to what is found in a load file production deliverable. And there’s an optional folder for extracted text, which is not shown on screen.
The last issue we want to emphasize here is just reiterating some of the potential downsides of the Teams HTML transcript format. Those include the potential for sometimes extensive redaction of the transcript document to address irrelevant or non-responsive content. In addition, the reality remains that – absent that additional post-export processing we’ve referred to – the granularity to filter individual message content via dates and times within a review platform is generally lost. So, this is yet another area where we highly recommend communicating with your vendor to determine what the best course of action is depending on the needs of your case.
I recently had a discussion with a client that had six years of Teams data at issue, and it was a complete non-starter for them that that not be addressed. In a single plaintiff case or an investigation that has highly targeted content, it might not be an issue at all. So, that leaves a wide middle ground where you have to decide whether it’s something that needs to be addressed to facilitate your review or not. Next slide, please.
This slide just relates to the availability of Teams voice call and voicemail data collection. The main point I want to make is that the call history is not available via the Standard eDiscovery tools. It is actually part of the Teams Administration Console, and that’s often not available to in-house eDiscovery practitioners. So, I just wanted to point out that it is currently outside of the reach of Microsoft’s eDiscovery tools, and it also only goes back 90 days. So, that’s something that has to be considered. Next slide, please.
This slide just reiterates most of the key takeaways from today’s presentation as it relates to Teams. We’ve covered most of them in detail at this point, so I will not belabor that. And if we could have the next slide.
These are links to some key resources that anyone interested in learning more about Teams data and the Microsoft eDiscovery ecosystem will likely find very helpful. And these can also be a jumping-off point to wider resources. We’ve got the discussion of the Teams metadata fields that are specific to Premium eDiscovery exports, the permissions required to access Purview Premium eDiscovery, which was alluded to in Rene’s portion of the presentation, and then, perhaps most importantly, the M365 roadmap, which shows you what’s coming from Microsoft and is perpetually subject to change. With that, I’m going to turn the presentation back to Mr. John Wilson.
Perfect, thank you so much. So, now we’re going to move into the Slack discussion. And we’re going to talk generally about the Slack challenges.
Slack is a collaboration tool that was first released in 2013 – sorry, we can move forward a slide – and quickly became the unquestioned leader in messaging and collaboration until more recently when Teams and Zoom really started to take off in the post-COVID era of things. And that really started to change the landscape quite a bit.
The application continues to be used widely across many industries and is especially popular in the tech sectors. So, software development and fintech have been very strong Slack users, with very large amounts of Slack content. And Slack presents some of its own challenges. HaystackID was an early market mover and one of the first to do collections from Slack all the way back in 2015. We’ve developed our own tools for managing that, in addition to utilizing the tools that are presented through the Slack marketplace, and through the Slack licensing. And we remain at the forefront of Slack collection and discovery technology. Next slide.
So, there are various licenses for Slack. Again, similar to Microsoft Teams, the license has a significant impact on what you can or can’t do. You have the free license, you have a Plus license, the Business+, and then Enterprise Grid. It’s important to understand that what you can export depends heavily on the specific version.
So, with Plus, you can only export the public messages. When you move up to Business+, it allows you to export all messages. And then Enterprise Grid actually allows for multiple sub-tenants and allows for full exports, including attachments and direct messages and things of that nature. Next slide, please.
So, when you’re talking about Slack collections, there are multiple methods of doing it. You have the exports that you can do from the system itself. And then you have methods that use an API, either the discovery API or individual tokens, which is the technology that Haystack has developed. Within those, there are various challenges. The export collection is easily initiated by the customer. They go into their console, and they initiate an export. It provides a point-in-time snapshot of the instance. But it does require Plus or higher for the public messages, and Business+ or higher for the private messages. So, that export method isn’t available on the basic, free license.
Individual tokens allow you to authorize by custodian and collect all of the messages related to that individual custodian. It can work with all license levels because, again, it’s developed outside of the Slack stack. It’s just making the data calls to Slack after getting an authorization token created that allows us to perform that. And that also allows us to collect the public and private content visible to that custodian.
And then you have the discovery API, which requires Enterprise Grid license. It provides rolling access to all custodians and requires approval from the primary owner of the Enterprise Grid. It does show all public and private messages but provides limited undelete functionality and provides tombstones for deleted messages so you can tell where content is missing. And next slide, please, and we’ll wrap this up.
So, the Slack collection types, we talked about what works for what. So, the user-based – which is where we’re doing the individual tokens – works on all accounts. The compliance exports only work on Premium or Enterprise. And the discovery API only works on Enterprise. Next slide.
And it’s important to understand the differences in what you can get access to with each of those. So, the discovery API does get your public messages, and it can get all messages in the workspace. And the compliance export gets public messages and, depending on license level, can get all messages in the workspace. But the user-based or individual token method does not get all the messages in the workspace; it only gets the messages that the custodian has access to. And from there, next slide.
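The collection options described over the last few slides can be summarized as a simple lookup. The Python sketch below encodes only what was said in this session, using the plan names as spoken; it is not drawn from Slack’s documentation, and plan names and entitlements change, so treat it as a note-taking aid rather than a reference.

```python
# Plan names as used in the session, ordered lowest to highest.
PLANS = ["Free", "Plus", "Business+", "Enterprise Grid"]


def at_least(plan, minimum):
    """True if `plan` is the same tier as `minimum` or a higher one."""
    return PLANS.index(plan) >= PLANS.index(minimum)


def collection_options(plan):
    """Collection methods available on a plan, with the scope each one reaches."""
    # Individual tokens were described as working on all license levels.
    options = {
        "individual token": "public and private content visible to the custodian",
    }
    if at_least(plan, "Business+"):
        options["export"] = "public and private messages"
    elif at_least(plan, "Plus"):
        options["export"] = "public messages only"
    if plan == "Enterprise Grid":
        options["discovery API"] = "all public and private messages in the workspace"
    return options


for plan in PLANS:
    print(plan, "->", ", ".join(collection_options(plan)))
```

The main design point is that the token-based method is scoped by custodian rather than by license, which is why it appears on every tier but never reaches the whole workspace.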
We’re kind of out of time. But the token pull collects everything. We’ve pretty much summed all this up already.
I want to thank everybody for being here today. Thank you to the entire team for the information and the insights that they’ve shared. We also thank everyone who took the time out of their schedules to attend today’s webcast. We know your time is valuable, and appreciate your sharing it with us today.
Thank you very much.
*M365 eDiscovery Consultant