Two questions are of importance when discussing legal information retrieval:
- Where legal sources meet technology and their users and
- When legal sources meet technology and their users
In my opinion, the triad of these building blocks – legal theory, technology and users – has to be taken into consideration when one asks how legal information retrieval can be improved. Carol Kuhlthau said in 1991:
“There appears to be a gap between the system’s traditional patterns of information provision and the user’s natural process of information use.”
Unfortunately, this statement still holds true in 2010, and I will try to explain how. To show some of the inherent discrepancies within the triad of legal IR, as I will call it in the following, one could imagine a scale (of justice).
Considering that users on average use two search words and that the amount of legal information is constantly growing, all the responsibility is put on the technology at the moment. To show the possible risks of this scenario, I will pinpoint three factors for each of the triad’s building blocks and, in concluding, try to combine these factors in order to show some possible paths for the future.
Generally, legal sources have, inter alia, three characteristics:
- Legal sources are based on text, and therefore suffer from the ambiguities of language. Searching for “law” will lead you not only to legal matters but also to laws of nature, Moore’s law, etc.
- Legal sources are published as documents. Though legal professionals will rarely read through a complete act of legislation or collections of cases, it is documents that are retrieved, not smaller pieces of information.
- Legal sources are not a unified body of knowledge. Though legal sources commonly follow a certain structure – Act, Chapter, Section, Paragraph; Summary, Facts of the case, Conclusions, Legal Reasoning – this structure is not often taken advantage of. Combining references and links alone is not enough, though inbound links are definitely a start.
Like legal sources, technology also builds upon, inter alia, three premises:
- Technology relies on mathematics and statistics. Many algorithms work on the basis of mathematical calculations and statistical probabilities.
- Technology focuses on information. Most IR systems judge relevance based on the information in the system, not necessarily on the situation of the user. Yet relevance is not static but dynamic, as the user will learn more about a certain subject during the retrieval process.
- Technology likes lists. Most search results are presented in lists of decreasing relevance. Besides the previous point on the dynamics of relevance, certain information might be related in a network way and not necessarily in a hierarchical way.
If we mainly focus on legal professionals in this analysis, one can mention, inter alia, three characteristics:
- Users are lazy. Several studies – not least The principle of least effort by George Zipf – have shown that users like to do as little as possible in order to retrieve the best possible information.
- Users are confused. Humans do not think in search words, but in concepts. Squeezing all our confusion and knowledge into, on average, 2.5 search words does not come naturally to us.
- Users like pictures. Humans think in associations and memory can be improved by visual aids.
Starting with three factors for each part of the triad of legal IR, I would like to combine these nine factors into three possible paths that can be taken either together or on their own. Hopefully, all paths can contribute to an improvement of legal IR for users, technology and legal theory alike.
text + focus on information + confusion = context
Taking the user’s situation more into account (e.g. what she is working with, her area of expertise) and using this information to affect relevance would not only lessen the ambiguity of the text, but also decrease the confusion of the user.
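As a rough illustration of this first path, the following Python sketch lets the user’s context nudge relevance. Everything in it is an assumption made for the example: the `Hit` record, the topic labels, the `rerank_with_context` function and the `boost` factor are hypothetical and not taken from any existing legal IR system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    doc_id: str
    base_score: float   # relevance as judged from the system's information alone
    topics: frozenset   # hypothetical subject labels attached to the document


def rerank_with_context(hits, user_topics, boost=0.5):
    """Re-order hits so that documents sharing topics with the user's context rise.

    The multiplicative boost is an arbitrary choice for illustration only.
    """
    user_topics = frozenset(user_topics)

    def contextual_score(hit):
        overlap = len(hit.topics & user_topics)
        return hit.base_score * (1.0 + boost * overlap)

    return sorted(hits, key=contextual_score, reverse=True)


# A search for "law" returns legal and non-legal material alike.
hits = [
    Hit("moores-law", 0.9, frozenset({"computing"})),
    Hit("contracts-act", 0.8, frozenset({"contract law", "legislation"})),
    Hit("law-of-gravity", 0.7, frozenset({"physics"})),
]

# The user is known (by assumption) to be working on contract law.
ranked = rerank_with_context(hits, user_topics={"contract law"})
print([hit.doc_id for hit in ranked])
```

With an empty context the original ordering is kept, so the context only ever adds information to the system’s own judgement of relevance.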
documents + mathematics/statistics + laziness = workflow
Using smaller information units instead of whole documents as the basis for statistical and mathematical calculations of how the user might use the information would accommodate the laziness of humans and improve their workflow.
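A minimal sketch of what this second path could look like: individual sections, rather than whole acts, are scored against the query with a simple hand-written TF-IDF weighting. The section identifiers, the sample texts and the `rank_sections` function are invented for the example, and TF-IDF is only one of many possible statistical measures.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def rank_sections(sections, query):
    """Rank section ids by a simple TF-IDF score; sections with no match are dropped.

    `sections` maps a (hypothetical) section id to the text of that section.
    """
    counts = {sid: Counter(tokenize(text)) for sid, text in sections.items()}
    n = len(sections)

    # Document frequency: in how many sections does each term occur?
    df = Counter()
    for section_counts in counts.values():
        df.update(section_counts.keys())

    def score(section_counts):
        total = sum(section_counts.values()) or 1
        result = 0.0
        for term in tokenize(query):
            if term in section_counts:
                tf = section_counts[term] / total
                idf = math.log((1 + n) / (1 + df[term])) + 1.0
                result += tf * idf
        return result

    scored = [(sid, score(c)) for sid, c in counts.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [sid for sid, s in scored if s > 0]


sections = {
    "Act A, s. 1": "This Act applies to the sale of goods.",
    "Act A, s. 2": "The seller shall deliver the goods to the buyer.",
    "Act A, s. 3": "The buyer shall pay the price.",
}
print(rank_sections(sections, "buyer price"))
```

The user is pointed directly to the section on payment instead of having to open the whole act and look for it.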
structure + lists + visuality = visualisation
The inherent structure of legal information allows for information to be referenced and put into a larger knowledge base that could be visually presented and thereby serve as a visual tool for the user.
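The third path could start from something as simple as a graph of references, which a visual tool could then draw as a network rather than a list. The sketch below is only an assumption of how such a structure might be held in memory; the citation pairs and the function names `build_reference_graph` and `neighbourhood` are made up for illustration.

```python
from collections import defaultdict


def build_reference_graph(citations):
    """Build outbound and inbound link maps from (source, target) pairs."""
    outbound = defaultdict(set)
    inbound = defaultdict(set)
    for source, target in citations:
        outbound[source].add(target)
        inbound[target].add(source)
    return outbound, inbound


def neighbourhood(node, outbound, inbound):
    """Everything one step away from a node, in either direction."""
    return sorted(outbound.get(node, set()) | inbound.get(node, set()))


# Hypothetical references between cases and sections of an act.
citations = [
    ("Case 1", "Act A, s. 2"),
    ("Case 2", "Act A, s. 2"),
    ("Act A, s. 2", "Act A, s. 1"),
]
outbound, inbound = build_reference_graph(citations)

# The nodes a visualisation would draw around "Act A, s. 2".
print(neighbourhood("Act A, s. 2", outbound, inbound))
# Inbound links alone show which cases rely on the section.
print(sorted(inbound["Act A, s. 2"]))
```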
These combinations are only three possibilities out of several others; the pieces of the puzzle can be put together as one likes, with possibly different outcomes. The puzzle, however, remains. Many different factors come into play when legal sources meet technology and their users. The trick is to find a place where they can all meet.
One place like this is Visuwords, an online graphical dictionary. By combining lines, bubbles, colours and space, it presents the user with a visually appealing and easy-to-understand structure of information that could facilitate her search for legal knowledge.
- Anne Aula, Rehan M. Khan and Zhiwei Guan, How does search behavior change as search becomes more difficult?, in Proceedings of the 28th International Conference on Human Factors in Computing Systems, 2010
- Donald O Case, Looking for information: a survey of research on information seeking, needs, and behavior, 2nd edition; Academic Press; 2007
- Christine Kirchberger & Staffan Malmgren, Inbound links – picking the low hanging fruit from the semantic web, presented at Workshop on legislative XML 2008 (LXML–2008) – the Law in the Semantic Web and beyond, JURIX 2008, 21st International Conference on Legal Knowledge and Information Systems
- Jon Kleinberg, The Mathematics of Algorithm Design, in T. Gowers and J. Barrow-Green (eds), Princeton Companion to Mathematics, Princeton Univ. Press, 2008
- Carol C. Kuhlthau, Inside the search process: Information seeking from the user’s perspective, Journal of the American Society for Information Science 42 (5), 1991, pp 361–371
This blog post was inspired by a presentation at a workshop on advanced methods for legal information retrieval, held by the Trust for Legal Information on 26 April 2012. My presentation slides are available as pdf.