AI Red Teaming in Practice
Red Teaming AI systems is no longer optional: With the introduction of regulations such as the EU AI Act and the latest Executive Order on AI from the White House, red teaming AI systems has become a mandatory practice for foundational models and high-risk AI applications.
This workshop, first presented at Black Hat USA 2024, empowers security professionals to systematically red team AI systems. It goes beyond traditional security failures by incorporating novel adversarial machine learning and Responsible AI (RAI) failures, enabling a holistic approach to identifying potential issues before an AI system is deployed.
The curriculum, adapted from our internal upskilling programs designed for other red teams at Microsoft, is tailored to meet the demands of safeguarding AI technologies. Participants will gain practical insights into AI architecture, functionality, and real-world applications, with a focus on popular systems like GPT, Stable Diffusion, and LLaMA.
The workshop is designed to be hands-on, providing an environment where participants can actively engage with custom-built AI applications. Practical modules challenge participants to red team these applications, while subsequent sessions offer proven strategies for fortifying them. Attendees will explore various prompting strategies for LLMs, including zero-shot, few-shot, and advanced techniques such as Retrieval-Augmented Generation.
In this training, participants will learn to red team AI systems using innovative methodologies that address emerging risks, following practices pioneered by the Microsoft AI Red Team (AIRT). The course covers AI-specific vulnerabilities, Responsible AI (RAI) vulnerabilities, and effective mitigations. Participants will engage in a comprehensive learning experience through a blend of lectures, hands-on playground sessions, and coding labs.
By the end of the course, attendees will be equipped to probe any machine learning system for vulnerabilities, including prompt injection attacks, using both manual and automated methods. The coding portion will introduce participants to PyRIT, AIRT’s open-source toolkit for red teaming. Additionally, since the course is hosted by Microsoft, attendees will have the unique opportunity to examine real vulnerabilities discovered in Microsoft products and services, illustrating the “break-fix” cycle that the AI Red Team employs.
This course is led by experts from Microsoft’s AI Red Team, the first team to integrate RAI red teaming alongside traditional security red teaming. Over the past year, this team has assessed every high-risk AI system at Microsoft, including foundation models and Copilots.
Computer setup
Attendees will need to bring a computer with internet access and accounts and API keys for OpenAI and Hugging Face.
Gary Lopez is a Senior Red Teamer on Microsoft's AI Red Team. In his current role, he collaborates with a diverse group of interdisciplinary experts, all dedicated to adopting an attacker's mindset to critically probe and test AI systems. Gary Lopez is the creator of Microsoft’s PyRIT (Python Risk Identification Toolkit), the team’s main red teaming automation tool. Prior to his tenure at Microsoft, Gary worked at Booz Allen Hamilton focusing on cybersecurity, developing tools for reverse engineering and malware analysis, specially targeting, and mitigating vulnerabilities within critical infrastructure including SCADA, ICS and DCS systems. He is also a graduate student at Georgetown University in the Applied Intelligence program focusing on Cyber Intelligence.