Abstract

Retrieval-augmented generation, a technique to improve the accuracy of responses of large language models (LLMs), can expand the attack surface of the LLM and increase the potential for exfiltration of sensitive data. This disclosure describes techniques to reduce the attack surface of LLMs that use retrieval-augmented generation. In a data upload phase, access permissions of data resources are recorded and quantified, and the implicit permissions granted by the LLM to different principals are assessed. If a data classification scheme is added to the data resources, a score can measure the risk of a prompt provided to the LLM, preventing users with lower classification-level access from obtaining data at higher classification levels. Analyzing the relationships stored in a permissions database exposes hidden permissions that AI models might otherwise have inadvertently granted. This provides insight into potential security risks and enables cloud customers to govern and monitor access to the data used by LLM deployments.
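For illustration, the following is a minimal Python sketch of the prompt risk-scoring idea described above. The resource fields, numeric classification scheme, scoring formula, and threshold are hypothetical assumptions for the sketch, not details specified in this disclosure.

```python
# Minimal sketch, assuming a numeric classification scheme
# (e.g., PUBLIC=0 < INTERNAL=1 < CONFIDENTIAL=2 < RESTRICTED=3)
# recorded per resource during the data upload phase.
from dataclasses import dataclass

@dataclass
class Resource:
    uri: str
    classification: int           # classification level recorded at upload time
    allowed_principals: set[str]  # explicit access permissions recorded at upload

def prompt_risk_score(user: str, user_clearance: int,
                      retrieved: list[Resource]) -> float:
    """Score the risk that answering a prompt leaks data above the user's clearance.

    The score is the fraction of retrieved resources whose classification exceeds
    the requesting user's clearance, or that do not explicitly list the user as a
    permitted principal (an implicit permission the LLM would otherwise grant).
    """
    if not retrieved:
        return 0.0
    risky = sum(
        1 for r in retrieved
        if r.classification > user_clearance or user not in r.allowed_principals
    )
    return risky / len(retrieved)

# Example: a prompt whose retrieval set mixes a public and a restricted document.
docs = [
    Resource("gs://corpus/handbook.pdf", 0, {"alice", "bob"}),
    Resource("gs://corpus/payroll.csv", 3, {"bob"}),
]
score = prompt_risk_score("alice", user_clearance=1, retrieved=docs)
if score >= 0.5:   # threshold chosen arbitrarily for the example
    print(f"Blocking prompt: risk score {score:.2f}")   # prints 0.50 and blocks here
else:
    print(f"Allowing prompt: risk score {score:.2f}")
```

In this sketch, the score rises as more of the retrieved context lies above the requesting user's clearance or lacks an explicit grant, giving a simple signal that a deployment could use to block or redact a response.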

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
