CLI tool that uses the Lakera API to perform security checks in LLM inputs
-
Updated
Mar 13, 2024 - Python
CLI tool that uses the Lakera API to perform security checks in LLM inputs
Demonstration of Google Gemini refusing a prompt due to SPII when using JSON mode
Universal and Transferable Attacks on Aligned Language Models
The Security Toolkit for LLM Interactions (TS version)
Evaluation of Google's Instruction Tuned Gemma-2B, an open-source Large Language Model (LLM). Aimed at understanding the breadth of the model's knowledge, its reasoning capabilities, and adherence to ethical guardrails, this project presents a systematic assessment across a diverse array of domains.
LMpi (Language Model Prompt Injector) is a tool designed to test and analyze various language models, including both API-based models and local models like those from Hugging Face.
LLM Security Platform Docs
Example of running last_layer with FastAPI on vercel
Learn LLM/AI Security through a series of vulnerable LLM CTF challenges. No sign ups, all fees, everything on the website.
This repo focus on how to deal with prompt injection problem faced by LLMs
Papers related to Large Language Models in all top venues
User prompt attack detection system
Trained Without My Consent (TraWiC): Detecting Code Inclusion In Language Models Trained on Code
利用分类法和敏感词检测法对生成式大模型的输入和输出内容进行安全检测,尽早识别风险内容。The input and output contents of generative large model are checked by classification method and sensitive word detection method to identify content risk as early as possible.
AiShields is an open-source Artificial Intelligence Data Input and Output Sanitizer
Repository for our paper "Frustratingly Easy Jailbreak of Large Language Models via Output Prefix Attacks". https://www.researchsquare.com/article/rs-4385503/latest
Add a description, image, and links to the llm-security topic page so that developers can more easily learn about it.
To associate your repository with the llm-security topic, visit your repo's landing page and select "manage topics."