Internet Newsletter for Lawyers
A document stored in electronic format can be reused, either in whole or in part. If the document is a judgment, other judges or lawyers may want to quote passages from it. If it is an article from a journal the publishers may want to publish a revised version. A biographer may want to use his subject's correspondence in his book. And in each case it should be easy to do this without the need for retyping.
But there is a problem. Most word processors store their output in a format specific to that particular word processor: the files output by Word are totally different from those produced by WordPerfect. Although most modern word processors can read in files produced by their competitors, they are not always completely successful at this and sometimes formatting information is lost.
Further, programs other than word processors can have significant difficulty in reading such files. There may, for example, be a need to convert a document into Braille, or into a version that can be used on the Web with full hypertext links to the electronic text of cases and books referred to. An academic may want to search hundreds of judgments for cases in which a particular judge has given a judgment. A Court might want to extract from a Claim Form details of the parties, to store them in a database.
All of this is made much easier if the data, that is, the contents of the document, is in a standard form. At the moment word processed documents keep all of the formatting instructions (such as the fonts to use and details of exactly where on the page particular words should appear) in the same file as the text of the document. However, if this sort of information can be kept in a separate file, the searching of the document, the copying of parts from it into other documents and its transformation into other formats all become much easier.
One way of achieving this is to format the document using a "mark-up" language: the file contains the text together with descriptive mark-up (such as "this is a heading") rather than procedural mark-up (which will tell the screen and printer drivers exactly how to format the text). In many cases the way that the document is actually formatted when it is reproduced is of no real concern to the person producing it: for example a judgment may well be laid out in completely different styles in two different series of law reports, neither of which bears any resemblance to the original transcript.
In 1971 a mark-up language called "SGML" started to be developed and some years later it became an international standard. It is widely used in industry, by Governments, by academics and by most of the major legal publishers: HMSO has stored the Statutes in SGML since about 1986. The grammar of SGML is, however, extremely complicated. In about 1990 a much simplified version was introduced as HTML and this became the universal language of the Web. However it went to the other extreme and is too simple to be used for complicated documents. An intermediate solution which started to be developed in 1996 (and is still being developed) is XML, or the eXtensible Markup Language.
The fact that this language is extensible means that it can be developed in different ways for different purposes: a different dialect can be produced for each of them. The grammar for each such dialect can be set out formally in a file called a "Document Type Definition" (DTD). For example a DTD for a variant intended to be used in connection with Pleadings would specify the markup to be used when describing a Claimant in a Claim Form. Separate markup could be used for his surname and other names and titles, his date of birth and other details. (It is not compulsory to store the grammar in a DTD, but doing so ensures that the markup can be read easily on any computer which has access to a copy of the DTD, either locally or via the internet.)
As in HTML, the markup in an XML document is enclosed in angle brackets. So the description of a claimant could appear (probably all on one line) as:
<Claimant>
<Surname> Doe </Surname>
<Other_Names> John </Other_Names>
<Date_of_Birth> 31/1/75 </Date_of_Birth>
<Address>
<Line1>1 Somewhere Gardens</Line1>
<Line2> Hounslow </Line2>
</Address>
</Claimant>
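As a rough illustration of how a program might read such a record and compare it with a grammar, here is a minimal sketch in Python using only the standard library. The DTD in it is hypothetical: it is my own guess at what a grammar for this fragment might look like, not part of any published standard. Note also that the standard-library parser reads the declarations in a DTD but does not validate against them, so the comparison of declared and used element names below is only a crude stand-in for real validation.

```python
import xml.parsers.expat
import xml.etree.ElementTree as ET

# Hypothetical DTD for the Claimant fragment, written as an internal subset.
DTD = """<!DOCTYPE Claimant [
  <!ELEMENT Claimant (Surname, Other_Names, Date_of_Birth, Address)>
  <!ELEMENT Surname (#PCDATA)>
  <!ELEMENT Other_Names (#PCDATA)>
  <!ELEMENT Date_of_Birth (#PCDATA)>
  <!ELEMENT Address (Line1, Line2)>
  <!ELEMENT Line1 (#PCDATA)>
  <!ELEMENT Line2 (#PCDATA)>
]>
"""

# The record from the text above, unchanged.
RECORD = """<Claimant>
<Surname> Doe </Surname>
<Other_Names> John </Other_Names>
<Date_of_Birth> 31/1/75 </Date_of_Birth>
<Address>
<Line1>1 Somewhere Gardens</Line1>
<Line2> Hounslow </Line2>
</Address>
</Claimant>"""


def element_names(document):
    """Return (names declared in the DTD, names actually used in the document)."""
    declared, used = set(), set()
    parser = xml.parsers.expat.ParserCreate()
    parser.ElementDeclHandler = lambda name, model: declared.add(name)
    parser.StartElementHandler = lambda name, attrs: used.add(name)
    parser.Parse(document, True)
    return declared, used


def claimant_details(document):
    """Pull the claimant's details out of the markup as plain Python values."""
    root = ET.fromstring(document)
    return {
        "surname": root.findtext("Surname").strip(),
        "other_names": root.findtext("Other_Names").strip(),
        "date_of_birth": root.findtext("Date_of_Birth").strip(),
        "address": [line.text.strip() for line in root.find("Address")],
    }


declared, used = element_names(DTD + RECORD)
undeclared = used - declared   # element names the DTD does not know about
details = claimant_details(DTD + RECORD)
print(undeclared)
print(details)
```

A production system would use a validating parser rather than this name comparison, but the sketch shows the point of descriptive markup: the program finds the surname by asking for "Surname", not by guessing from the layout of the page.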
Similarly a case being cited could be marked up to show the names of the parties and the reports in which it was to be found (including sources on the internet), so enabling the easy and automatic preparation of indexes and tables. On my own civil procedure site, www.hrothgar.co.uk/YAWS/, I mark up authorities in judgments as, for example,
"Lord Woolf MR has explained that the use of the word "real" means that the prospect of success must be realistic rather than fanciful (<case>Swain v Hillman<ref> CAT 21 October 1999, <para>para 10</para></ref></case>)."
This makes it comparatively easy for my software to add links to both the name of the case cited and paragraph 10 in it (and to add a note to the end of the latter paragraph stating that it has been referred to in paragraph 21 of Tanfern v Cameron-Macdonald).
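The following Python sketch suggests how links of this kind could be generated from the markup. It is not the software actually used on the site: the base address, the file-naming scheme and the "#para10" style of anchor are all assumptions made for the purpose of illustration, and the sentence is wrapped in a hypothetical outer element so that it forms a complete XML document. The sketch does not attempt the back-reference note.

```python
import re
import xml.etree.ElementTree as ET
from html import escape

BASE = "https://example.org/cases/"   # hypothetical location of the judgments

SENTENCE = (
    '<p>Lord Woolf MR has explained that the use of the word "real" means '
    "that the prospect of success must be realistic rather than fanciful "
    "(<case>Swain v Hillman<ref> CAT 21 October 1999, "
    "<para>para 10</para></ref></case>).</p>"
)


def slug(name):
    """Turn a case name into a file name, e.g. 'Swain v Hillman' -> 'swain-v-hillman'."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def render(node, case_url=None):
    """Return HTML for the contents of node, linking <case> and <para> elements."""
    text = node.text or ""
    if node.tag == "case":
        # The text before the first child element is taken to be the case name.
        case_url = BASE + slug(text) + ".htm"
        html = f'<a href="{case_url}">{escape(text, quote=False)}</a>'
    elif node.tag == "para" and case_url:
        number = re.search(r"\d+", text).group()
        html = f'<a href="{case_url}#para{number}">{escape(text, quote=False)}</a>'
    else:
        html = escape(text, quote=False)
    for child in node:
        # Each child is followed by its "tail": the text up to the next tag.
        html += render(child, case_url) + escape(child.tail or "", quote=False)
    return html


linked = render(ET.fromstring(SENTENCE))
print(linked)
```

Because the markup says what each piece of text is (a case name, a reference, a paragraph number), the program needs no knowledge of how citations are punctuated in order to find them.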
It must, however, be remembered that XML is not itself a wordprocessor or any other form of computer program. It is simply a way of marking up the output of such a program to make it as useful as possible. The XML would be produced by a wordprocessor or other program which would read in the DTD and produce an output file which complied with the grammar described in the DTD. In the case of a Claim Form, software which displayed such a file on screen (such as a browser) or printed it (such as the wordprocessor itself or a typesetting program) would not show the Claimant's date of birth, telephone number or other personal details: it would display the names of the parties to the case in the traditional manner, following rules set out in a separate file, or style sheet. These personal details, and other details such as the name and contact telephone numbers of the Claimant's Solicitor, would however be of use to the Court itself, which could extract them from the file and store them in a database.
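The division of labour described here, one file used in different ways by different programs, can be sketched in a few lines of Python. Everything specific in the sketch is an assumption: the element names "Claim", "Defendant" and "Telephone", the made-up telephone number, the layout of the database table and the simple function standing in for a style sheet are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# A hypothetical, much simplified Claim Form; the telephone number is invented.
CLAIM = """<Claim>
<Claimant>
<Surname>Doe</Surname><Other_Names>John</Other_Names>
<Date_of_Birth>31/1/75</Date_of_Birth><Telephone>020 7946 0000</Telephone>
</Claimant>
<Defendant>
<Surname>Roe</Surname><Other_Names>Richard</Other_Names>
</Defendant>
</Claim>"""


def full_name(party):
    return f"{party.findtext('Other_Names')} {party.findtext('Surname')}"


def title(claim):
    """Stand-in for a style sheet: show the parties' names and nothing else."""
    return f"{full_name(claim.find('Claimant'))} v {full_name(claim.find('Defendant'))}"


def store(claim, db):
    """What the Court's software might do: keep every detail of every party."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS parties "
        "(role TEXT, surname TEXT, other_names TEXT, date_of_birth TEXT, telephone TEXT)"
    )
    for party in claim:
        db.execute(
            "INSERT INTO parties VALUES (?, ?, ?, ?, ?)",
            (
                party.tag,
                party.findtext("Surname"),
                party.findtext("Other_Names"),
                party.findtext("Date_of_Birth"),   # None if not given
                party.findtext("Telephone"),       # None if not given
            ),
        )
    db.commit()


claim = ET.fromstring(CLAIM)
db = sqlite3.connect(":memory:")   # a throwaway in-memory database
store(claim, db)

heading = title(claim)
rows = db.execute(
    "SELECT role, surname, telephone FROM parties ORDER BY role"
).fetchall()
print(heading)
print(rows)
```

The display function never looks at the date of birth or telephone number, while the database function keeps them; neither needs to know how the other works, because both read the same descriptive markup.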
One of the principles behind the design of XML is that an XML document should be capable of being read by a human, although designed to be read by a computer: this has the advantage that the document will still be useable even if the program that produced it, or the operating system that ran that program, ceases to be available. Others are that it should support a wide range of applications (including word processing, spreadsheets, document management systems and databases) and that it should be easy to write programs that read it on any computer using any operating system.
The Court Service issued a consultation paper some months ago called "Modernising the Civil Courts". This suggested that XML DTDs should be developed for use in litigation and this was supported by most of the organisations that responded.
It seems plain that over the next few years "electronic filing" will be introduced in this country: all pleadings will have to be filed with the Court in electronic form, where they will be stored in an electronic court file. It may be that eventually filing with the Court will, as happens in America, replace service on the other party or parties. Requiring that all documents which are to be filed are marked up in XML would make this much simpler: there would be no need for conversion between different formats, searching would become far easier and duplication of material could be eliminated. (There are several million claim forms issued each year. In their printed form every one has a copy of the Royal Arms at the top. An XML claim form would not include this image, which occupies about 5kb of memory, but simply a hypertext link to a single file on a Government site where it could be found.)
A very simple example of an electronic method of commencing a claim can be found at www.hrothgar.co.uk/XML/ABA/aba-3.htm. If you complete the form and click on the link "Click here if the details are correct" you will receive back a copy of the completed Claim Form with, at the bottom, an example of the XML that could be sent electronically to the Court. (Nothing you enter on this form is stored anywhere.)
At the moment Judgments are produced in RTF format and then filtered into HTML so that they can be stored in large databases such as BAILII. This will eventually change and they will be marked up using XML, which will make this filtering considerably easier.
In America the Department of Justice has set up an XML technology group to work on the use of XML in criminal cases (see xml.coverpages.org/ojp-justiceStandards.html). In relation to civil cases a not-for-profit company called LegalXML is attempting to develop DTDs for many different legal purposes including contracts, court filing, child support, judgments and transcripts. It might be possible to adopt some of these for use in this country, but the American legal system and court procedures are substantially different from ours and it may be better for us to develop our own. (There are other bodies, both public and commercial, and mainly outside the UK, developing XML systems for lawyers and courts: a search for "xml" and "legal" on Google produces over 400,000 hits.)
Who should co-ordinate the work of developing standards in the UK is not clear; in its response to the MCC paper the Society for Computers and Law suggested that the SCL should do so and I consider that this would be sensible (although as a contributor to the SCL suggestion I may be biased). One reason is that standards like this need to be developed by people who are likely to use them, and, in the case of litigation, this means primarily practitioners, but also judges and court staff.
This article has set out only a very simple outline of XML and the law. Further information on XML can be found at
Roger Horne (roger@number7.demon.co.uk) is a member of the Chambers of Miss Sonia Proudman QC at 11 New Square, Lincoln's Inn. He has for a number of years been interested in the processing of text by computer and uses his skills in that field to produce the well-known CPR Web site (YAWS).