Internet Newsletter for Lawyers
Knowledge management is not just about vast relational databases and fast computers; somehow legal "meaning" has got to be incorporated into the system. Here is an example to illustrate the concepts.
A lawyer is working on a complex document which sets out an agreement on a financial structure. She is now dealing with a rather tricky area - let us say, the notoriously complex issues raised by sections 739 to 745 of ICTA 1988, or a group of sections from Part III of the 1985 Companies Act - it doesn't matter for the sake of the example. The issues are complex, and a great deal of money, as well as the lawyer's own reputation, is at stake if she gets this wrong. She wants to search her own know-how - her firm are specialists at this particular stuff - or indeed anything else available to her. Time is short.
Unknown to her, but within the various sources which are available to her electronically, there is a key sentence. It's a piece of commentary by a colleague. If she does not know about it, she has a very good chance of making a very expensive mistake. The sentence reads: "The effect of the new regulations on the next section is, however, much more controversial. In essence, two classes are created, and it is vital to ascertain into which class your particular.....". Conventional software is completely powerless to deal with this sentence: a search for "s. 742 of the Income and Corporation Taxes Act 1988" will obviously not pick it up, however expressed. There are two vital pieces of information which must be teased out from the text, and no search engine, indexing system, automatic linker or any other wizardry will do it on its own:
* section 742 of ICTA 1988 has been amended by a new S.I. - whenever and wherever any lawyer is working on something where s. 742 is relevant, they should know about this;
* in the particular matter being worked on, the lawyer has to ascertain into which category or class or case her current matter fits: previously, this distinction was not made; now, it is vital.
What happens in the brave new world of metadata (indexes), taxonomies (tables) and relational databases?
* before the document is admitted to the know-how collection, it has been processed for all legal references;
* some quite simple software is being used by a human operator. The software sees the words "section" and "regulations" in a sentence and stops at that point, saying to the operator "looks like a legislation reference";
* the software shows to the operator any legislation references which have already been identified by him in processing the document. They may also have been obscure, but the process that follows has already been carried out, so he knows what they are. Three Acts, four SIs and one Council Directive have been mentioned in some form or another.
* He highlights the word "section" in the sentence and from the context deduces that, since the last reference was to section 741, this must be section 742. Calling up an authority file of legislation, he discovers that there is indeed a s. 742 in ICTA 1988, and "clicks" it. Immediately, an entry is made in a database which creates a relationship between the sentence within the paragraph of the document, and that specific piece of legislation.
* Our operator does the same job on the choice of SIs and Directives, again looking back to see what is meant from the context. Another entry is made, relating the same sentence to a second specific piece of legislation.
* The software counts two pieces of legislation that have been related to a single sentence, and therefore asks: "do you wish to make a relationship between the two pieces of legislation in this sentence", offering "brings into force, repeals, affects, unknown/complex" as choices. The operator selects the third (or perhaps the fourth as well: they do not have to be mutually exclusive).
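The database entries made in the steps above can be sketched as three small tables. This is only an illustration, not a description of any actual product: the table names, column names and helper functions are all assumptions, and since the article does not name the S.I., a placeholder stands in for it.

```python
import sqlite3

# Hypothetical schema: every table and column name is an illustrative assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE legislation (            -- the authority file
    id       INTEGER PRIMARY KEY,
    citation TEXT UNIQUE NOT NULL
);
CREATE TABLE sentence_ref (           -- sentence -> piece of legislation
    doc_id         TEXT,
    para           INTEGER,
    sentence       INTEGER,
    legislation_id INTEGER REFERENCES legislation(id)
);
CREATE TABLE legislation_rel (        -- legislation -> legislation
    source_id INTEGER REFERENCES legislation(id),
    target_id INTEGER REFERENCES legislation(id),
    relation  TEXT CHECK (relation IN
        ('brings into force', 'repeals', 'affects', 'unknown/complex'))
);
""")

def add_authority(citation):
    """Add an entry to the authority file and return its id."""
    return conn.execute(
        "INSERT INTO legislation (citation) VALUES (?)", (citation,)
    ).lastrowid

def link_sentence(doc_id, para, sentence, legislation_id):
    """Record the operator's click: this sentence refers to this legislation."""
    conn.execute(
        "INSERT INTO sentence_ref VALUES (?, ?, ?, ?)",
        (doc_id, para, sentence, legislation_id),
    )

def should_prompt(doc_id, para, sentence):
    """True when two or more pieces of legislation share one sentence."""
    (count,) = conn.execute(
        "SELECT COUNT(DISTINCT legislation_id) FROM sentence_ref "
        "WHERE doc_id = ? AND para = ? AND sentence = ?",
        (doc_id, para, sentence),
    ).fetchone()
    return count >= 2

def relate(source_id, target_id, relations):
    """Store one row per chosen relation, since the choices may overlap."""
    conn.executemany(
        "INSERT INTO legislation_rel VALUES (?, ?, ?)",
        [(source_id, target_id, r) for r in relations],
    )

# Sample data: the S.I. is unnamed in the article, so a placeholder is used.
s742 = add_authority("ICTA 1988, s. 742")
new_si = add_authority("the new regulations (placeholder)")
link_sentence("commentary-1", 3, 2, s742)
prompt_after_first = should_prompt("commentary-1", 3, 2)
link_sentence("commentary-1", 3, 2, new_si)
prompt_after_second = should_prompt("commentary-1", 3, 2)
if prompt_after_second:
    relate(new_si, s742, ["affects", "unknown/complex"])
```

Nothing here touches the text of the document itself: the sentence is identified only by its position, and everything the operator decides is held in the tables.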
That's all there is to it. But what has happened? The value of that piece of legal writing has been enormously enhanced - you might say, liberated. For example:
* we already know, from the standard "properties" of the document, who wrote it and when.
* we can therefore instantly warn anyone who sees a reference to s. 742 in any document earlier in date than this one, provided it has been processed in the same way, that there is later commentary about a change to that piece of legislation, without ever going anywhere near the text of the earlier documents. Later documents update earlier, as needed, automatically, at the point of use.
* further, we can, if sufficiently motivated, email the authors of any earlier documents to warn them of the change, and invite them to update their text. In practice, it's usually cheaper to take advantage of the fact that the new document being created, in our example, will in turn join the list of documents about s. 742; and therefore, if anyone in the future wants to see what the firm's practice now is, they just see what was done without anyone doing anything more than they would have done anyway.
* best-of-breed external information can be accessed, by the lawyer working on her matter, in the same way. Because the legislation has been fully and correctly identified, albeit behind the scenes, a link or a search on reputable online sources for any further commentary on the effect of this SI on that section can be done accurately, and above all automatically. Our lawyer has to do nothing more than click. She does not have to be trained on all those other sources, nor does she have to do any more "searches".
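The "later documents update earlier" idea in the list above can be sketched as a simple lookup over the recorded links. This is a minimal sketch under stated assumptions: the `Link` record, the `warnings_for` function, and the document ids and dates in the sample data are all invented for illustration.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Link:
    """One recorded reference from a document to a piece of legislation."""
    doc_id: str
    written: date         # from the document's standard "properties"
    legislation: str      # key in the authority file
    notes_change: bool    # True if the document comments on a change to it

def warnings_for(links, doc_id):
    """Return (legislation, later_doc_id) pairs a reader of doc_id should see."""
    found = set()
    for mine in (link for link in links if link.doc_id == doc_id):
        for other in links:
            if (other.legislation == mine.legislation
                    and other.notes_change
                    and other.written > mine.written):
                found.add((mine.legislation, other.doc_id))
    return sorted(found)

# Invented sample data: document ids and dates are placeholders.
LINKS = [
    Link("old-advice", date(1997, 5, 1), "ICTA 1988, s. 742", False),
    Link("old-advice", date(1997, 5, 1), "ICTA 1988, s. 741", False),
    Link("commentary", date(1999, 3, 1), "ICTA 1988, s. 742", True),
]

earlier_doc_warnings = warnings_for(LINKS, "old-advice")
later_doc_warnings = warnings_for(LINKS, "commentary")
```

The lookup reads only the recorded links and dates, never the text of the earlier document, which is why the warning can be produced at the point of use without any back editing.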
Something else has happened. There is no longer any such thing as the "false drop". If our lawyer searches for this particular section in her know-how, she will get all that is known about it, and nothing that is not about it. She won't get "article 742" of some totally irrelevant Treaty, or nine hundred "hits" about ICTA 1988. She'll get just what she needs. Bingo.
The typical legal taxonomies are:
* legislation and its inter-relationships;
* case law and its appellate and citation relationships;
* subject matter, normally arranged as "drill-down" hierarchies from the most general ("intellectual property") to the most specific ("passing off"), and including such relationships as synonyms, parallel concepts, etc.
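A "drill-down" subject hierarchy with synonyms can be sketched in a few lines. The fragment below is hypothetical: only "intellectual property" and "passing off" come from the text above, while the sibling term, the synonym and the function names are assumptions added for illustration.

```python
# Hypothetical fragment of a subject taxonomy, stored as child -> parent.
BROADER = {
    "passing off": "intellectual property",
    "trade marks": "intellectual property",   # invented sibling term
}

# Non-preferred terms mapped to the preferred term (invented example).
SYNONYMS = {"ip": "intellectual property"}

def preferred(term):
    """Map any spelling or synonym of a term to its preferred form."""
    key = term.strip().lower()
    return SYNONYMS.get(key, key)

def narrower_terms(term):
    """Return the term together with everything beneath it in the hierarchy."""
    found = {preferred(term)}
    changed = True
    while changed:                      # repeat until no new children turn up
        changed = False
        for child, parent in BROADER.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found

general_search = narrower_terms("IP")
specific_search = narrower_terms("passing off")
```

A search on the general term picks up material indexed under any of the more specific terms, while a search on the specific term stays narrow.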
The typical legal metadata categories are:
* administrative - the typical stuff of document profiles, like author, title, dates, location of documents, etc;
* referential - the identification of other documents, organisations, etc referred to in a document - eg citation of a case, mention of legislation, reference to a journal article;
* conceptual - typically keywords, taken from a subject taxonomy (hierarchical thesaurus), but also short summaries, catch-phrases, and notes.
Notice that these are arranged in the order of intellectual effort required to create or extract the metadata: much administrative metadata should be extractable automatically, for example, but automatic summarising software and indexing software in the legal field has proved both expensive and of very variable utility (though in other fields it may work very well indeed).
The taxonomy in our example above is the table of legislation - the "authority file" of legislation - against which the particular reference in the text to a piece of legislation has been matched. The metadata is simply the recording of the relationship: "this reference in this document is specifically this section of this Act", or "this piece of legislation amends this other piece of legislation". It's just as simple as that. It's metadata because it is recorded outside the actual text of the document (though you can put it back inside by the use of tagging - a whole other area, into which we will not now trespass).
Obviously the technique illustrated in our example above applies to other areas. Case law is particularly important in this respect: interpretation and comment, vertical (appellate) history, horizontal (case citing case) history, and where the hearing is reported, are all critical pieces of information to relate to the bland piece of text ".... but the EMI case has changed all that. Now ....." And of course, most talked about, but often least understood of all, subject matter: "where else do we have anything on the taxation of offshore gains made by private companies?" The taxonomies simply provide the standards against which metadata is created, and thereby ensure two things:
* like always matches to like, however the text of documents may be written, and in particular, as we saw above, later material automatically links to earlier without back editing;
* users can find material using look-up tables and other aids, rather than relying on pure text searching (as in the old issue of S42, s 42, s.42, s. 42, section 42, and all that), and be sure that their results are indeed comprehensive.
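The "S42, s 42, s.42" problem can be illustrated with a small normaliser that reduces each written variant to one canonical form before it is matched against an authority file. This is a sketch only: the pattern, the function name and the choice of "s. 42" as the canonical form are assumptions, and it deals with the section number alone. Deciding which Act is meant still depends on context, as in the operator's workflow described earlier.

```python
import re

# Matches "section 42", "s 42", "s.42", "s. 42", "S42" and similar,
# with an optional single-letter suffix as in "s. 42A".
SECTION_RE = re.compile(
    r"\b(?:section|s)\s*\.?\s*(\d+[A-Za-z]?)\b",
    re.IGNORECASE,
)

def normalise_section(text):
    """Return the first section reference in text as 's. N', or None."""
    match = SECTION_RE.search(text)
    if match is None:
        return None
    return "s. " + match.group(1).upper()

variants = ["S42", "s 42", "s.42", "s. 42", "section 42"]
canonical = {normalise_section(v) for v in variants}
```

All five spellings collapse to the same key, so a look-up against that key finds them all, whereas a pure text search would need to anticipate each variant separately.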
Although you might think it simple enough to understand authority files of cases and legislation, people often come rather unstuck on the subject matter stuff. Faced with a group of people within the firm who clearly cannot agree among themselves, and who, after a year or more of debate, have still not come up with one single concrete piece of data, managing partners can be forgiven for crying "a plague on all your houses" and buying useless, but promising, software which claims to do it all automatically. It won't work, but it lets the warring factions get on with something useful instead. The key problem can be summarised:
* Many people think that you use subject taxonomies to describe documents. They are quite wrong. You use them to help lawyers subsequently find what is relevant to their work. It's not by any means always the same thing. (Remember the old chestnut about the Black and Decker sales conference? "What do we sell?" "Drills." "No we don't, we sell holes").
* Many people also believe in the unique nature of their work and the need to have a complete taxonomy just for them, ignoring the unpleasant restrictions on linking to anything but internal work that this will impose, and the huge costs of reinventing 90% of the wheels required.
In fact, the answer normally lies in combining some specific terms of art that will indeed help particular lawyers find material relevant to their work, with standard structures which are intelligible to enlightened external sources, newcomers to the firm, clients, etc.
Although the underlying technology is now hugely cheaper than it used to be, and is available in most firms which have Windows NT anyway, the creation of the editorial processes and the design of the database are very substantial pieces of work. But whereas everyone has to have the underlying technology, which used to be so expensive and is now relatively so cheap, these other technical things only have to be done once, as before. This is fortunate, as the number of people who understand both the way to do this technical stuff, and the practice of commercial law, is very small, even in law firms. Often there are plenty of people who understand either one or the other, but not both, and have the gravest difficulties in communicating with each other as a result. Hinc illae lachrymae - but you can now dry those tears, and move forward under the banner of metadata and taxonomies.
Derek Sturdy is managing director of Granite & Comfrey, the firm which does the actual work of KM for law firms, providing the techniques, the thesauri and the work, as required. Email email@example.com.