Human Centered AI for Indian Legal Text Analytics

Sudipto Ghosh, Devanshu Verma, Balaji Ganesan, Purnima Bindal, Vikas Kumar, Vasudha Bhatnagar

TL;DR

This paper argues that powerful LLMs alone fall short in delivering trustworthy legal AI, especially for India, and proposes a human-centered, compound AI system for Legal Text Analytics (LTA). It introduces a new Indian legal dataset and a legal knowledge graph, and presents five LTA tasks—Case Similarity, Judgment Summarization, Petition Drafting, Question Answering, and Text2SQL—each designed to leverage human input and external knowledge to reduce hallucinations and improve usefulness for practitioners and self-represented litigants. The work details an InLegalLLaMA research direction, combining domain-specific pre-training, knowledge infusion, and instruction-tuning to build a robust Indian legal model; it also outlines practical components like a PIL-guideline-driven petition drafting workflow and retrieval-augmented generation for evidence-backed queries. Overall, the approach aims to democratize legal knowledge and speed justice by aligning AI capabilities with human expertise and real-world legal processes.

Abstract

Legal research is a crucial task in the practice of law. It requires intense human effort and intellectual prudence to research a legal case and prepare arguments. The recent boom in generative AI has not translated into a proportionate rise in impactful legal applications, because of low trustworthiness and the scarcity of specialized datasets for training Large Language Models (LLMs). This position paper explores the potential of LLMs within Legal Text Analytics (LTA), highlighting specific areas where the integration of human expertise can significantly enhance their performance to match that of experts. We introduce a novel dataset and describe a human-centered, compound AI system that principally incorporates human inputs for performing LTA tasks with LLMs.

Paper Structure (11 sections, 7 figures, 3 tables)


Figures (7)

  • Figure 1: Human-computer interaction can bring down information asymmetry in the justice delivery system
  • Figure 2: Prompt template for question-answers generation
  • Figure 3: Distribution of question lengths in our Question Answering dataset
  • Figure 4: Tasks in Legal Text Analytics
  • Figure 5: Format of a writ petition to be filed in the Supreme Court of India. An LLM-based solution for assisting self-representing litigants should elicit the information needed to draft such a petition.
  • ...and 2 more figures