Last week we looked at the role of software technology in e-discovery, concluding that while software is an important tool, it is not a substitute for attorney participation.
Freelance writer Jason Krause wrote an interesting piece last month for Law Technology News that delves further into the growing body of research aimed at finding the optimal balance between technology and human effort in large-scale data collection.
Krause highlighted the Text Retrieval Conference (TREC), an initiative co-sponsored by the National Institute of Standards and Technology (NIST) and the U.S. Department of Defense. One of TREC's missions is to encourage research in information retrieval based on large text collections.
According to Law Technology News, "For several years now, the Text Retrieval Conference Legal Track has tested different types of computer searches to create industry best practices for searching electronic records in litigation. In 2008, the project added a new investigation into the role of human researchers in improving the search results from computers, called the Interactive Task."
"Dan Brassil, manager of Linguistic Technology with H5
says, "Computer algorithms are getting better, but they will never get the same results as when there is a person in the loop or human intervention is part of the search process. The question is where the humans fit into the picture."
"Researchers in the TREC project are discovering there are roles that are best provided by machines and those done by human beings. "We use humans to do what they are very good at, which is to make nuanced judgments in specific cases," says Brassil. "But they are not so good at judgments across a lot of documents. People get tired, allow inferences to creep in, and you never know what a person will say in terms of consistency. That's where machines come in."
While the test groups employed fundamentally different approaches (e.g., using complex questionnaires to refine the up-front search scope vs. employing a computer-based learning tool to rank responsiveness), the TREC researchers concluded:
"Machines should do the grunt work of review, but members of a legal team need to:
• Consider the scope, timing, and nature of the request to determine what approach may work best. Think about whether there is time to gradually seek every responsive document possible, or whether a more targeted approach is needed.
• Identify the custodians who understand the documents in a collection and discover what they know about those documents.
• Capture the language from responsive documents and incorporate it into search terms that approximate the language actually used.
• Continually perform control checks. If responsive documents are not being found, reconsider and refine the search strings."
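The last two bullets describe, in effect, a human-in-the-loop feedback cycle. Purely as illustration, here is a minimal sketch of that cycle in Python, with an invented three-document collection and hypothetical helper names; it is not drawn from the TREC protocols or any vendor's tools. The machine does the bulk matching, a reviewer marks what is responsive, and the language of those responsive documents feeds the next round of search terms:

```python
import re
from collections import Counter

# Hypothetical mini-collection standing in for a litigation document set.
DOCS = {
    "doc1": "Draft term sheet for the Alpine merger, attorney eyes only.",
    "doc2": "Reminder: the garage closes early on Friday.",
    "doc3": "Alpine diligence memo covering the merger liabilities.",
}

def search(terms, docs):
    """Machine grunt work: return IDs of documents matching any term."""
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    return {doc_id for doc_id, text in docs.items() if pattern.search(text)}

def expand_terms(terms, responsive_texts, top_n=3):
    """Capture language actually used in reviewer-confirmed responsive
    documents and fold it back into the search terms."""
    counts = Counter(
        word
        for text in responsive_texts
        for word in re.findall(r"[a-z]+", text.lower())
        if len(word) > 4 and word not in terms
    )
    return list(terms) + [word for word, _ in counts.most_common(top_n)]

terms = ["merger"]                # initial terms drawn from the request
hits = search(terms, DOCS)        # machine pass: {'doc1', 'doc3'}
responsive = {"doc1", "doc3"}     # the human judgment call, stubbed here
# Control check: refine the search strings from the responsive documents.
terms = expand_terms(terms, [DOCS[d] for d in responsive])
print(terms)                      # e.g. ['merger', 'alpine', ...]
```

The point of the toy example is the shape of the loop, not the matching itself: each pass narrows the machine's work using judgments only a person can make, which is the division of labor the researchers describe.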
Unfortunately, there is no definitive answer about the division of labor between man and machine. But the TREC topic authorities noted that teams that failed to think ahead about how to define relevant documents and relied on computing power to find documents fared the worst. "It's well understood that human review and machine review have limitations," says TREC Legal Track researcher Gordon Cormack. "In the next few years we hope to find the balance between them that mitigates those natural flaws."