Unlocking Keywords to Ensure Effective E-Discovery

By Craig Ball
Law Technology News
January 29, 2007


The notion that words hold mythic power has been with us as long as language. We know we don't need to ward off evil spirits, but we still say, "Gesundheit!" when someone sneezes. Can't hurt. But misplaced confidence in the power of word searches can seriously hamper electronic data discovery. Perhaps because keyword searching works so well in the regimented realm of automated legal research, lawyers and judges embrace it in EDD with little thought given to its effectiveness as a tool for exploring less-structured information. Too bad, because the difference between keyword searches that get the goods and those that fail hinges on thoughtful preparation and precaution.

TEXT TRANSLATION
Framing effective searches starts with understanding that most of what we think of as textual information isn't stored as text. Brilliant keywords won't turn up anything if the data searched isn't properly processed. Take Microsoft Outlook e-mail. The message we see isn't a discrete document so much as a report assembled on-the-fly from a database. As with any database, the way information is stored little resembles the way we see it onscreen after our e-mail program works its magic by decompressing, decoding and decrypting messages. Lots of evidence we think of as textual isn't stored as text, including fax transmissions, .tiff or PDF documents, PowerPoint word art, CAD/CAM blueprints and Zip archives. For each, the search software must process the data to ensure content is accessible as searchable text. Be certain the search tool you or your vendor employs can access and interpret all of the data that should be seen as text.
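
To make the point concrete, here is a minimal Python sketch, not a description of any particular EDD product. It assumes a made-up message and uses Base64 and zlib as stand-ins for the encoding and compression a mail store might apply; the helper name raw_hit is hypothetical.

```python
import base64
import zlib

# Hypothetical message body; real mail stores use their own formats.
message = "Please shred the Q3 audit files before Friday."
keyword = b"audit"

# Two stand-ins for the ways content is stored as something other than text.
encoded = base64.b64encode(message.encode("utf-8"))
compressed = zlib.compress(message.encode("utf-8"))

def raw_hit(blob: bytes, term: bytes) -> bool:
    """Naive search over bytes exactly as given, with no decoding step."""
    return term in blob

print(raw_hit(encoded, keyword))                      # False: stored form hides the word
print(raw_hit(base64.b64decode(encoded), keyword))    # True: found after decoding
print(raw_hit(zlib.decompress(compressed), keyword))  # True: found after decompressing
```

The keyword never changes; only the processing applied before the search does.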

RECURSION
Reviewing a box of documents that contains envelopes within folders, you'd open everything to ensure you saw everything. Computers store data within data such that an Outlook file can hold an e-mail transmitting a Zip archive containing a PowerPoint with an embedded .tiff image. It's the electronic equivalent of Russian nesting dolls. If the text you seek is inside that .tiff, the search tool must drill down through each nested item, opening each with appropriate software to ensure all content is searched. This is called recursion, and it's an essential feature of competent search. Be sure your search tool can dig down as deep as the evidence.
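
The nesting-doll idea can be sketched in a few lines of Python. This is an illustration under simplifying assumptions: only Zip containers are opened, every leaf item is treated as plain text, and the function names search_nested and make_zip are invented for the example. A real tool would need a parser for each file type it meets.

```python
import io
import zipfile

def search_nested(data: bytes, term: str, path: str = "root") -> list[str]:
    """Return paths of items containing term, opening Zip archives
    nested inside other Zip archives as deep as they go."""
    buffer = io.BytesIO(data)
    if zipfile.is_zipfile(buffer):
        hits = []
        with zipfile.ZipFile(buffer) as archive:
            for name in archive.namelist():
                # Recurse: each member may itself be a container.
                hits += search_nested(archive.read(name), term, f"{path}/{name}")
        return hits
    # Leaf item: treated as text here (an assumption of this sketch).
    text = data.decode("utf-8", errors="ignore")
    return [path] if term.lower() in text.lower() else []

def make_zip(files: dict[str, bytes]) -> bytes:
    """Build an in-memory Zip archive for the demonstration."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()

inner = make_zip({"memo.txt": b"The valve exploded during testing."})
outer = make_zip({"attachments.zip": inner, "cover.txt": b"See attached."})
print(search_nested(outer, "exploded"))  # ['root/attachments.zip/memo.txt']
```

The returned path records every container that had to be opened to reach the hit.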

EXCEPTIONS
Even when search software opens wide and digs deep, it will encounter items it can't read: password-protected files, proprietary formats and images with poor optical character recognition. When that happens, it's important that the search software generate an exceptions log flagging failures for follow-up. Know how the search tool tracks and reports items not searched or incompletely searched.
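
A rough Python sketch of an exceptions log follows. It assumes the only failure mode is data that cannot be decoded as UTF-8 text, which stands in here for encrypted or proprietary files; the function name and sample items are hypothetical.

```python
def search_with_exceptions(items: dict[str, bytes], term: str):
    """Search each item and record any item that could not be read as text."""
    hits, exceptions = [], []
    for name, data in items.items():
        try:
            text = data.decode("utf-8")  # strict decoding: unreadable data raises
        except UnicodeDecodeError as err:
            # Log the failure instead of silently skipping the item.
            exceptions.append((name, f"could not decode: {err.reason}"))
            continue
        if term.lower() in text.lower():
            hits.append(name)
    return hits, exceptions

items = {
    "memo.txt": b"Budget approved by legal.",
    "scan.bin": b"\xff\xfe\x00\x9c unreadable",  # stand-in for an unreadable file
    "note.txt": b"Lunch on Friday?",
}
hits, exceptions = search_with_exceptions(items, "budget")
print(hits)        # ['memo.txt']
print(exceptions)  # one entry, naming scan.bin
```

Without the log, scan.bin would look the same as an item that was searched and contained no hits.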

SEARCH TERM TIPS
So far, I've talked only about search tools; but search terms matter, too. You'll get better results when you frame searches to account for computer rigidity and human frailty. Some tips:

  • Stemming: Computers are exasperatingly literal when searching. Though mechanized searches usually overlook differences in capitalization, they're easily confounded by variances in prefixes or suffixes of the sort that human reviewers easily assimilate (e.g., flammable and inflammable or exploded and exploding). You'll miss fewer variations using stemmed searches targeting common roots of keywords; e.g., using "explod" to catch both exploded and exploding. But use stemming judiciously as the more inclusive your search, the more challenging and costly the review. Be sure to include the correct stemming operator for the search tool.
  • Boolean Search: Just as with legal research, pinpoint responsive items and prioritize review using Boolean operators to find items containing both of two keywords, or keywords within a specified proximity.
  • Misspelling: It's scary how many people can't spell. Even the rare good speller may hit the wrong key or resort to the peculiar shorthand of instant messaging. Sometimes you can be confident a particular term appears just one way in the target documents -- e-mail addresses are prime examples -- but a thorough search factors in common misspellings, acronyms, abbreviations and IM-speak.
  • Synonyms: Your search for "plane" won't get off the ground if you don't also look for "jet," "bird," "aircraft," "airliner" and "crate." A comprehensive search incorporates synonyms as well as lingo peculiar to those whose data is searched.
  • Noise words: Some words occur with such regularity it's pointless to look for them. They're "noise words," the static on your ESI radio dial.
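
Several of the tips above can be approximated with Python's standard re module. This is a sketch only: commercial search tools have their own stemming and proximity operators, and the helper names stem_pattern, within and any_variant are invented for illustration.

```python
import re

def stem_pattern(root: str) -> re.Pattern:
    """Match any word beginning with the root (explod -> exploded, exploding)."""
    return re.compile(rf"\b{re.escape(root)}\w*", re.IGNORECASE)

def within(text: str, first: str, second: str, distance: int) -> bool:
    """True if the two words occur within `distance` words of each other."""
    words = re.findall(r"\w+", text.lower())
    a = [i for i, w in enumerate(words) if w == first.lower()]
    b = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(i - j) <= distance for i in a for j in b)

def any_variant(text: str, variants: list[str]) -> bool:
    """Whole-word match against a hand-built list of synonyms or misspellings."""
    pattern = r"\b(?:" + "|".join(re.escape(v) for v in variants) + r")\b"
    return re.search(pattern, text, re.IGNORECASE) is not None

doc = "The tank exploded after the jet landed; a second tank was exploding too."
print(stem_pattern("explod").findall(doc))  # ['exploded', 'exploding']
print(within(doc, "tank", "jet", 5))        # True
print(any_variant(doc, ["plane", "jet", "aircraft", "airliner"]))  # True
```

Each helper widens or narrows the net in one specific way, which makes it easier to see which choice is driving up the review volume.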

I recently encountered a situation where counsel chose terms like "law" and "legal" to cull data deemed privileged. Predictably, the results were disastrously overinclusive. I recommend testing keywords to flush out noise words. There's irrelevant text all over a computer -- in spelling dictionaries, Web cache, help pages and user license agreements. Moreover, industries have their own parlance and noise words, so it's important to assess noisiness against a representative sample of the environment you're searching.

Noise words are particularly nettlesome in computer forensic examinations, where searches extend beyond the boundaries of active files to the wilds of deleted and fragmented data. Out there, just about everything has to be treated as a potential hiding place for revealing text. Because computers use alphabetic characters to store nontextual information, billions or trillions of characters randomly form words in the same way a million typing monkeys will eventually produce a Shakespearean sonnet. The difference is that the monkeys are theoretical while there really are legions of happenstance words on every computer. Consequently, searching three- and four-letter terms in forensic examinations -- e.g., "IBM" or "Dell" -- can be a fool's errand requiring an examiner to plow through thousands of false hits. If you must use noisy terms, it's best to frame them as discrete occurrences (flanked by spaces) and in a case-specific way (IBM but not iBm).
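
Both suggestions, testing terms against a sample and framing noisy terms as discrete, case-specific occurrences, can be sketched in Python. The sample documents are invented, the function names are hypothetical, and "flanked by spaces" is read here as flanked by whitespace or the start or end of the text.

```python
import re

def noise_report(terms: list[str], sample_docs: list[str]) -> dict[str, float]:
    """Fraction of sample documents each term hits; high values suggest noise."""
    report = {}
    for term in terms:
        matched = sum(1 for doc in sample_docs if term.lower() in doc.lower())
        report[term] = matched / len(sample_docs)
    return report

def exact_hits(term: str, text: str) -> int:
    """Count case-sensitive occurrences of term flanked by whitespace or text edges."""
    pattern = rf"(?<!\S){re.escape(term)}(?!\S)"
    return len(re.findall(pattern, text))

sample = [
    "Privileged: advice from legal counsel on the merger.",
    "Legal notice: this license agreement governs use.",
    "By law, help pages must include a legal disclaimer.",
    "Lunch order for the merger team.",
]
print(noise_report(["legal", "merger"], sample))  # {'legal': 0.75, 'merger': 0.5}
print(exact_hits("IBM", "xkIBMq9 iBm IBM ibm"))   # 1
```

Note that the strict flanking rule also skips "IBM," followed by a comma, so a real search would have to decide how to treat punctuation.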

STRIKING A BALANCE
Effective keyword searching demands more than many imagine. You don't have to put every synonym and aberrant spelling on your keyword list, but you need to appreciate the limits of text search and balance the risk of missing the mark against the burden of grabbing everything and the kitchen sink. The very best results emerge from an iterative process: revisiting potentially responsive data using refined and expanded search terms.
