The White House recently organized a competition at the DEF CON convention in Las Vegas, where hackers and security researchers were challenged to outsmart the top AI models from industry leaders like OpenAI, Google, Microsoft, Meta, and Nvidia. Over 2,200 participants lined up for the challenge, which required them to trick the chatbots into doing things they’re not supposed to do.

The competition took place from August 11 to August 13. Participants had less than an hour to get the chatbots to generate fake news, make defamatory statements, give potentially dangerous instructions, and more. This was the first-ever public assessment of multiple large language models (LLMs), according to a representative from the White House Office of Science and Technology Policy.

The White House collaborated with eight tech companies, including Anthropic, Cohere, Hugging Face, and Stability AI, to secure their participation in the competition. The AI models were anonymized to prevent bias toward any particular chatbot. The event attracted 220 students from 19 states. One participant said the goal was to find vulnerabilities in the chatbots so that their creators could improve their safety.

Tasks included attempting to obtain credit card numbers, requesting instructions for surveillance or stalking, writing defamatory Wikipedia articles, and creating misinformation that alters history. One participant broke a model by asking it for an order of operations for tailing an operative. The model responded with a list of instructions that included using Apple AirTags for surveillance and monitoring social media.

The White House sees red teaming as a key strategy for identifying AI risks and promoting safety and security. The organizations behind the challenge have not yet released detailed results, but high-level findings will be shared in the coming weeks, and a policy paper is due in October. The bulk of the data, however, could take months to process.

The event was highly anticipated, and the tech companies involved readily embraced the challenges because they aligned with the companies' own areas of interest, such as multilingual biases. The competition aimed to assess the AI models on internal consistency, information integrity, societal harms, security practices, and susceptibility to prompt injection.

Overall, the White House challenge gave hackers and researchers an opportunity to push the boundaries of AI models in order to improve their safety and reliability.