eDiscovery Tips: Stranded with Voluminous Document Collection Part 2 – Search and Review

An Overview of Tools to Make Your Task Manageable, Part 2 of 2
By Cathy Fetgatter and Lauren Allen

As electronic discovery becomes more voluminous, a growing portion of your time is devoted to handling discovery. It may feel as though you are stranded on a desert island, surrounded by documents, with no hope of rescue in sight. Rest assured – there is help on the horizon! The right tools and processes can buoy your efforts, allowing you to conduct discovery more efficiently, effectively and defensibly, even when your collections are enormous and you are under extreme time pressures.

In the first of this two-part series, we looked at the collection and de-duplication aspects of conducting e-discovery. Here, we explore tools and processes that will make searching and reviewing more manageable.

Search Terms

Before you review your first document, you want to cull down your document universe as much as possible. Creating the right search terms can help you remove non-relevant documents, identify potentially relevant ones, and highlight privileged information. You can also use search terms to organize your review.

Defining search terms is an evolving process, and search terms will often change throughout the life of the discovery period. Start your search term list with the terms you know you want, and then analyze your list. Are there iterations or synonyms that should be added? For example, proper names can appear in all kinds of combinations. So rather than having “John Smith” as a search term, you should include “John w/4 Smith.” This would detect documents with “Smith, John” and “John E. Smith,” as well as “John Smith.” Identify potentially problematic terms early on. Imagine that “general excise tax” is one of your key terms. It is commonly referred to as “GET,” which is also a “noise” word, or a frequently used word. If you analyze and test your term list early on, you can come up with a plan to address this issue. Case law also dictates that you have variations of your search terms so that you capture all instances of the term.
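To make the proximity idea concrete, here is a minimal Python sketch. It assumes “w/4” means the two words appear within four words of each other in either order; actual operator syntax and semantics vary by review platform. The function names are hypothetical. The second helper shows one possible way to handle an acronym such as “GET” that collides with a common word: match it only as a case-sensitive whole word.

```python
import re

def within_n_words(text, first, second, n=4):
    """Return True if `first` and `second` occur within `n` words of each
    other, in either order (an assumed reading of a "w/n" operator)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    first_pos = [i for i, w in enumerate(words) if w == first.lower()]
    second_pos = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(i - j) <= n for i in first_pos for j in second_pos)

def count_exact_case(text, term):
    """Count whole-word, case-sensitive matches, so "GET" does not match "get"."""
    return len(re.findall(r"\b" + re.escape(term) + r"\b", text))
```

Testing helpers like these against a sample of the collection early on shows quickly whether a term is too broad or too narrow.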

Searching for privileged documents is always a challenge in discovery, and this is an area where standardization can be particularly helpful. Take the time to create a template list of common terms that may indicate a privileged document such as “attorney” or “privilege.” Then, when you are actually conducting discovery, add the specific attorney names from the case at hand to your template list of privileged search terms. You can do the same for documents that fall under the Privacy Act of 1974 – your template list can include terms such as “Social Security Number” and other personally identifiable information.
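A reusable template list can be sketched in a few lines of Python. The two template terms come from the examples above; the attorney name, function names, and the simple case-insensitive substring matching are illustrative assumptions, not a description of any particular tool.

```python
# Template terms reused across matters (examples taken from the text above).
PRIVILEGE_TEMPLATE = ["attorney", "privilege"]

def build_privilege_terms(template, attorney_names):
    """Combine the standing template with case-specific attorney names,
    dropping duplicates while preserving order."""
    seen = set()
    terms = []
    for term in list(template) + list(attorney_names):
        key = term.lower()
        if key not in seen:
            seen.add(key)
            terms.append(term)
    return terms

def privilege_hits(text, terms):
    """Return the terms found in `text` (case-insensitive substring match,
    so "privilege" also catches "privileged")."""
    lowered = text.lower()
    return [t for t in terms if t.lower() in lowered]
```

For a new matter, only the list of attorney names changes; the template stays the same.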

By creating consistent processes and developing template lists that can be used across litigations, you avoid wasting valuable time reinventing the wheel in the rushed, stressful environment of a lawsuit. Although there are great benefits to standardizing some aspects of your discovery plan, be sure to acknowledge the specifics of the case as well. Identify issue areas early on and tailor your plan accordingly.

Once you run the search terms against the document collection, be sure to validate the terms. That way, you can determine how many hits were actually relevant, how many were false positives, and how many relevant documents the terms missed. This type of validation helps demonstrate a defensible process, and it helps your team develop metrics to evaluate the effectiveness of the search terms.
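One common way to express this validation is with precision and recall computed on a sample that reviewers have coded by hand. The sketch below assumes you have two sets of document IDs for that sample: the documents the terms hit, and the documents reviewers judged relevant. The function name and return layout are hypothetical.

```python
def search_term_metrics(hit_ids, relevant_ids):
    """Compare search hits against reviewer-coded relevant documents
    within a validation sample."""
    hits = set(hit_ids)
    relevant = set(relevant_ids)
    correct = hits & relevant            # hits that were relevant
    false_positives = hits - relevant    # hits that were not relevant
    missed = relevant - hits             # relevant documents the terms did not hit
    precision = len(correct) / len(hits) if hits else 0.0
    recall = len(correct) / len(relevant) if relevant else 0.0
    return {
        "correct": len(correct),
        "false_positives": len(false_positives),
        "missed": len(missed),
        "precision": precision,
        "recall": recall,
    }
```

Tracking these numbers each time the term list changes gives the team a record of how the terms improved.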

Various software options exist to assist with searching, and the tool you choose depends on the size of your budget, the scope of the discovery, and the importance of the litigation. Always keep reasonableness in mind: one approach is to tailor the scope of the search to the budget of the case under the proportionality principle of Federal Rule of Civil Procedure 26(b)(2)(C).

Software is only the beginning of the solution, not the entire solution. Methodology is paramount, so the focus should be on the soundness of your method, rather than on particular tools.

Review

Document review represents the most expensive part of e-discovery and can consume 60 to 80 percent of your discovery budget. This is where the right processes and tools can really pay off. Using the appropriate methods in the collection, de-duplication and search phases, you should be able to significantly cull down the amount of data that actually has to be reviewed. Your focus now should be on organizing your review for efficiency and accuracy.

Consider who is conducting the review. If you are using contract attorneys, think of ways to shorten their learning curve. You can organize your review according to key search terms or key custodians. By doing this, the most relevant documents will be considered at the forefront of the review, which will familiarize the reviewers with the key issues of the case early on. Or you can organize your review chronologically so the reviewers see a natural progression in subject matter.
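Both orderings can be sketched as simple sorts. The record fields used here (`custodian`, `key_term_hits`, `sent`) are hypothetical; a real review platform would expose its own metadata.

```python
from datetime import date

def order_by_priority(docs, key_custodians):
    """Key custodians first, then documents with more key-term hits,
    then oldest first as a tie-breaker."""
    def sort_key(doc):
        is_key = doc["custodian"] in key_custodians
        return (not is_key, -doc["key_term_hits"], doc["sent"])
    return sorted(docs, key=sort_key)

def order_chronologically(docs):
    """Oldest documents first, so reviewers see the story unfold."""
    return sorted(docs, key=lambda doc: doc["sent"])

# Hypothetical sample records for illustration only.
SAMPLE_DOCS = [
    {"id": 1, "custodian": "A. Analyst", "key_term_hits": 0, "sent": date(2009, 1, 5)},
    {"id": 2, "custodian": "B. Director", "key_term_hits": 3, "sent": date(2009, 3, 1)},
    {"id": 3, "custodian": "B. Director", "key_term_hits": 3, "sent": date(2009, 2, 1)},
    {"id": 4, "custodian": "A. Analyst", "key_term_hits": 5, "sent": date(2009, 1, 1)},
]
```

Which ordering works better depends on the case: priority ordering surfaces the key issues quickly, while chronological ordering preserves context.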

In Part 1 of this series, we discussed de-duplication technology. There are also tools available that can detect “near-duplicate” documents. With these tools, you can set a similarity threshold and the software will identify “like” groups of documents that meet the threshold. You can organize your review according to these similar groups of documents. This will not only improve your review metrics but also produce more consistent review decisions, since reviewers consider groups of similar documents all at one time.
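The thresholding idea can be illustrated with a small Python sketch. It assumes a Jaccard similarity over three-word shingles and a simple greedy grouping; commercial near-duplicate tools use their own algorithms, and all names here are hypothetical.

```python
def shingles(text, k=3):
    """Break text into overlapping k-word sequences."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Share of shingles the two sets have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def group_near_duplicates(docs, threshold=0.5):
    """Greedy grouping: each document joins the first group whose
    representative meets the similarity threshold, else starts a new group."""
    groups = []  # list of (representative shingles, [document ids])
    for doc_id, text in docs.items():
        doc_shingles = shingles(text)
        for representative, members in groups:
            if jaccard(doc_shingles, representative) >= threshold:
                members.append(doc_id)
                break
        else:
            groups.append((doc_shingles, [doc_id]))
    return [members for _, members in groups]
```

Raising the threshold yields smaller, tighter groups; lowering it yields larger groups with more variation for the reviewer to check.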

While you should be communicating with opposing counsel throughout the discovery cycle, it’s particularly important to discuss the review. If you can get opposing counsel to agree on certain parameters – such as timeframe and search terms – it can greatly limit the scope of your review.

You should also identify a “project manager” for the review. This person doesn’t need to be a certified project manager, just someone who can keep tabs on everything involved, maintain logs, push deadlines, track and test the latest technology, and keep abreast of the case law. This person should have access to the multidisciplinary team, including IT, legal and technologists. A key role this person can play involves quality checks throughout the life of the review.

No one is an island – even if discovery feels that way. Tools to keep you from feeling stranded alone include performing proper collection and deduplication, using search terms, and conducting a targeted review. Plan proactively – don’t wait for a case to start before you begin thinking about your e-discovery plan. Know what technology is available, even if you aren’t using it for the case at hand. Identify common litigation themes, so you have templates and processes ready. Talk to opposing counsel as much as you can. Try to anticipate issues before they arise. Document the steps you take so that you can review your approach, identify your pain points, learn from your mistakes, and conduct a defensible discovery.

And remember, your most important decisions should be made before you review the first document.

Lauren Allen is a Program Manager with IE Discovery, a leading provider of Discovery Management services. She has been employed at IE Discovery since 2001 and is a licensed attorney with the Commonwealth of Virginia and a certified Project Management Professional. She can be reached at lallen@iediscovery.com.

Cathy Fetgatter has been a Project Manager with IE Discovery since 2008. Cathy has a J.D. and M.B.A. from American University and is a licensed attorney with the Commonwealth of Virginia and the District of Columbia. Prior to IE Discovery, she engaged in private practice with an emphasis on complex commercial litigation. She can be reached at cfetgatter@iediscovery.com.

~ by CDLB on July 6, 2010.

One Response to “eDiscovery Tips: Stranded with Voluminous Document Collection Part 2 – Search and Review”

  1. The authors have done a commendable job homing in on the key issues: the outrageously disproportionate cost of document review, and the seemingly unbreakable bond between data volume and cost. I do think that the prescription offered is incomplete; using keywords as the basis for filtering out “non-relevant” material inherently leaves relevant material behind that no amount of sampling will remedy. In real data sets, people use pronouns and codenames and familiar substitutions. More broadly, I believe that the real solution to the problems raised here is to leverage available technology for automated review in the first instance, to model behavior by using “irrelevant” data to inform how people actually behaved, and for providers to de-link volume and cost. The “per GB” and “per document” models are under attack with a whole range of new business models. As buyers increasingly demand budgetary certainty and comprehensive analysis, a new breed of technology tools is filling the gap and eliminating the need for trade-offs in completeness and cost.
