Aalap: AI Assistant for Legal & Paralegal Functions in India

Aman Tiwari; Prathamesh Kalamkar; Atreyo Banerjee; Saurabh Karn; Varun Hemachandran; Smita Gupta

TL;DR

This work tackles data privacy and domain adaptation challenges for legal LLMs in India by introducing Aalap, a Mistral 7B model fine-tuned on an instruction dataset tailored to Indian legal tasks. Aalap emphasizes legal reasoning over legal recall. It is evaluated in three ways: on the Aalap test data with GPT-4 as the evaluator, on Legalbench data, and on the All India Bar Exam (AIBE). It improves on gpt-3.5-turbo on several tasks but shows mixed results on cross-domain benchmarks. The model can be hosted on-premise, which addresses data-security concerns, and the authors provide the Aalap dataset and weights on HuggingFace. Future work includes enriching the training data with expert reviews, expanding multilingual capabilities, and integrating retrieval-based methods to better handle Indian legal precedents and sources.

Abstract

Using proprietary Large Language Models on legal tasks poses challenges because of data privacy issues, heterogeneous domain data, sophisticated domain knowledge, and unique domain objectives. We created Aalap, a Mistral 7B model fine-tuned on instruction data related to specific Indian legal tasks. As evaluated by GPT-4, Aalap performs better than gpt-3.5-turbo on 31% of our test data and obtains an equivalent score on 34% of the test data. Training Aalap mainly focuses on teaching legal reasoning rather than legal recall. Aalap is definitely helpful for the day-to-day activities of lawyers, judges, or anyone working in legal systems.
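The 31% and 34% figures above are shares of test examples on which Aalap's GPT-4-assigned score is higher than, or equal to, that of gpt-3.5-turbo. If each example falls into exactly one of three outcomes, the remaining 100 - 31 - 34 = 35% would be cases where gpt-3.5-turbo scores higher. The following is a minimal sketch of how such a tally could be computed from paired scores. The function name `compare_scores`, the integer scoring scale, and the sample values are all assumptions made for illustration; they are not the authors' evaluation code or data.

```python
from collections import Counter


def compare_scores(aalap_scores, baseline_scores):
    """Return the percentage of examples where the first model scores
    higher ("better"), the same ("equal"), or lower ("worse")."""
    if len(aalap_scores) != len(baseline_scores):
        raise ValueError("score lists must have the same length")
    if not aalap_scores:
        raise ValueError("score lists must not be empty")

    counts = Counter()
    for a, b in zip(aalap_scores, baseline_scores):
        if a > b:
            counts["better"] += 1
        elif a == b:
            counts["equal"] += 1
        else:
            counts["worse"] += 1

    total = len(aalap_scores)
    return {k: 100.0 * counts[k] / total for k in ("better", "equal", "worse")}


# Made-up judge scores for four test examples (hypothetical, not from the paper).
aalap = [7, 5, 8, 6]
gpt35 = [6, 5, 9, 6]
result = compare_scores(aalap, gpt35)
print(result)
```

The three percentages always sum to 100, which is why the "worse" share can be read off from the two reported figures under the one-outcome-per-example assumption.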

Paper Structure (16 sections, 2 figures, 5 tables)

Introduction
Related Work
Aalap Dataset
Common Data Sources
Data Creation
Data Limitations
Model Training
Training Procedure
Model Evaluations
Evaluation using GPT4 as an evaluator on Aalap test data
Evaluation using Legalbench data
Evaluation using All India Bar Exam
Discussion
Conclusion & Next Steps
Summary statistics of various task categories
...and 1 more sections

Figures (2)

Figure 1: Boxplot of Aalap evaluation scores by GPT4 grouped by tasks
Figure 2: Comparison of Training and Evaluation Loss During Fine-tuning: The graph illustrates the progression of training and evaluation loss over the course of fine-tuning, providing insights into the model's learning dynamics and generalization performance.
