Abstract
This paper applies Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) to privacy-preserving question answering in a legal setting. Because Germany’s voluminous legal documents, including laws and court decisions, are not represented accurately in the inherent knowledge of LLMs, the models are prone to producing unreliable or non-existent references. By augmenting this inherent knowledge with ground-truth facts retrieved from a Neo4j database, the answer-generating system can cite the facts directly. Running the LLM locally removes the need for cloud-based data processing and prevents privacy-relevant data from leaving the system. Our preliminary results on selected legal questions show that the system can provide plausible legal answers. This research lays the foundation for further studies, including the integration of more sophisticated RAG techniques and a user interface with deterministic quoting for precise citation and ease of use. Our study is a step towards deploying AI in sensitive legal settings, where LLMs could answer legal questions correctly without sacrificing data privacy.