The following are notable NIST projects and efforts related to LLMs, drawn from NIST’s publications and initiatives. They emphasize NIST’s role in advancing measurement science, standards, and guidelines for trustworthy AI systems, including LLMs. Some are specific studies, while others are broader programs that encompass LLMs.
- Evaluating LLMs for Real-World Vulnerability Repair in C/C++ Code
NIST conducted a study to evaluate the capability of advanced LLMs, such as ChatGPT-4 and Claude, in repairing memory corruption vulnerabilities in real-world C/C++ code. The project curated 223 code snippets with vulnerabilities like memory leaks and buffer errors, assessing LLMs’ proficiency in generating localized fixes. This work highlights LLMs’ potential in automated code repair and identifies limitations in handling complex vulnerabilities.
- Translating Natural Language Specifications into Access Control Policies
This project explores the use of LLMs for automated translation and information extraction of access control policies from natural language sources. By leveraging prompt engineering techniques, NIST demonstrated improved efficiency and accuracy in converting human-readable requirements into machine-interpretable policies, advancing automation in security systems.
- Assessing Risks and Impacts of AI (ARIA) Program
NIST’s ARIA program evaluates the societal risks and impacts of AI systems, including LLMs, in realistic settings. The program includes a testing, evaluation, validation, and verification (TEVV) framework to understand LLM capabilities, such as controlled access to privileged information, and their broader societal effects. This initiative aims to establish guidelines for safe AI deployment.
- AI Risk Management Framework (AI RMF)
NIST developed the AI RMF to guide the responsible use of AI, including LLMs. This voluntary framework provides a structured approach to managing risks associated with AI systems, organized around four core functions (Govern, Map, Measure, and Manage), and supports governance, risk assessment, and the operationalization of trustworthy AI across sectors. It is not specific to LLMs, but it is frequently used as a reference point in LLM-related work.
- AI Standards “Zero Drafts” Pilot Project
Launched to accelerate AI innovation, this project focuses on developing AI standards, including those relevant to LLMs, through an open and collaborative process. It aims to create flexible guidelines that evolve with LLM advancements, encouraging input from stakeholders to ensure robust standards.
- Technical Language Processing (TLP) Tutorial
NIST collaborated on a TLP tutorial at the 15th Annual Conference of the Prognostics and Health Management Society to foster awareness and education on processing large volumes of text using machine learning, including LLMs. The project explored how LLMs can assist in content analysis and topic modeling for research and engineering applications.
- Evaluation of LLM Security Against Data Extraction Attacks
NIST investigated privacy vulnerabilities in LLMs, such as training data extraction attacks, using GPT-2 (a predecessor to modern LLMs) as an example. Drawing on techniques developed by Carlini et al., the work aimed to understand and mitigate privacy risks in LLMs, contributing to safer model deployment.
- Fundamental Research on AI Measurements
As part of NIST’s AI portfolio, this project conducts fundamental research to establish scientific foundations for measuring LLM performance, risks, and interactions. It includes developing evaluation metrics, benchmarks, and standards to ensure LLMs are reliable and trustworthy in diverse applications.
- Adversarial Machine Learning (AML) Taxonomy for LLMs
NIST developed a taxonomy of adversarial machine learning attacks, including those targeting LLMs, such as evasion, data poisoning, privacy, and abuse attacks. This project standardizes terminology and provides guidance to enhance LLM security against malicious manipulations, benefiting both cybersecurity and AI communities.
- Use-Inspired AI Research for LLM Applications
NIST’s AI portfolio includes use-inspired research to advance LLM applications across government agencies and industries. This project develops guidelines and tools to operationalize LLMs responsibly, focusing on practical implementations like text summarization, translation, and question-answering systems.
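As an illustration of the access control item in the list above, the following is a minimal sketch of what "machine-interpretable policy" could mean in practice. It is not NIST's method. The `Rule` schema, the prompt template, the deny-overrides evaluation, and the hard-coded JSON string standing in for a model response are all assumptions made for this example; no real LLM API is called.

```python
import json
from dataclasses import dataclass
from typing import List


# Hypothetical rule schema; the source does not specify a policy format.
@dataclass(frozen=True)
class Rule:
    subject: str
    action: str
    resource: str
    effect: str  # "permit" or "deny"


# Hypothetical prompt template asking a model for structured output.
PROMPT_TEMPLATE = (
    "Extract access control rules from the text below. Return a JSON list of "
    "objects with keys: subject, action, resource, effect.\n"
    "Text: {text}"
)


def build_prompt(text: str) -> str:
    """Fill the template with a natural language requirement."""
    return PROMPT_TEMPLATE.format(text=text)


def parse_rules(llm_output: str) -> List[Rule]:
    """Validate JSON returned by a model and convert it to Rule objects."""
    rules = []
    for item in json.loads(llm_output):
        effect = item["effect"].lower()
        if effect not in {"permit", "deny"}:
            raise ValueError(f"unknown effect: {effect!r}")
        rules.append(Rule(item["subject"], item["action"], item["resource"], effect))
    return rules


def is_permitted(rules: List[Rule], subject: str, action: str, resource: str) -> bool:
    """Deny-overrides evaluation with default deny (an assumed combining rule)."""
    matches = [
        r for r in rules
        if (r.subject, r.action, r.resource) == (subject, action, resource)
    ]
    if any(r.effect == "deny" for r in matches):
        return False
    return any(r.effect == "permit" for r in matches)


requirement = "Nurses may read patient records but must not delete them."
prompt = build_prompt(requirement)

# Stand-in for a model response; a real system would send `prompt` to an LLM.
simulated_output = """[
  {"subject": "nurse", "action": "read", "resource": "patient_record", "effect": "permit"},
  {"subject": "nurse", "action": "delete", "resource": "patient_record", "effect": "deny"}
]"""

rules = parse_rules(simulated_output)
print(is_permitted(rules, "nurse", "read", "patient_record"))
print(is_permitted(rules, "nurse", "delete", "patient_record"))
```

Validating the model output before it becomes policy is the point of `parse_rules`: a malformed or unexpected response raises an error instead of silently granting access, and any request not covered by a rule is denied by default.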
Remarks:
- These projects reflect NIST’s focus on evaluating, standardizing, and securing LLMs rather than developing LLMs themselves. NIST’s role is to provide frameworks, guidelines, and evaluations to ensure trustworthy AI.
- Some projects, like ARIA and AI RMF, are broad programs that encompass LLMs among other AI systems, but they include specific LLM-related evaluations or applications.