Date: 2023-08-08

In a recent event at Howard University, AI language models were tested through a red-teaming exercise. The event aimed to find novel ways in which chatbots could malfunction, allowing their creators to fix the issues before real harm is done. This preview event in Washington, D.C. was a precursor to a larger event at the annual Def Con hacker convention in Las Vegas. The Generative Red Team Challenge, backed by the White House, seeks to investigate the potential for AI models to go astray. The competition features categories such as political misinformation, defamatory claims, and algorithmic discrimination. Prominent AI firms like Google and OpenAI have volunteered their latest chatbots and image generators to be tested during the event. The results will be kept confidential for several months to allow companies time to address any weaknesses before they are made public.

There is an increasing interest in applying red-teaming exercises, a common practice in the tech industry, to AI systems. Generative AI models, such as OpenAI’s ChatGPT, are particularly susceptible to exploitation due to their opacity and broad range of applications. While these models have gained praise for their ability to generate humanlike text, they have also raised concerns about their potential for deception, including the creation of fake images and essays. There have even been cases where AI models suggested novel bioweapons. As regulators and tech critics debate how to regulate AI, companies are taking voluntary initiatives to regulate themselves, with red-teaming playing a crucial role.

Red-teaming exercises are traditionally conducted behind closed doors, with internal experts or hired consultants identifying vulnerabilities in products. Companies like OpenAI and Google have commissioned red teams to assess the safety of their AI models. In addition to security flaws, the red-team approach can uncover embedded harms, such as biased assumptions or deceptive behavior. Involving a diverse group of users, including ordinary people, in the red-teaming process makes these embedded harms easier to identify and address.

The public red-team challenges aim to include a wider range of participants, which can lead to unexpected insights and lessons. This approach allows for the detection and mitigation of risks that professional red teams, which tend to be homogeneous, might overlook. Red-teaming now plays a central role in keeping AI systems safe and ensuring responsible AI innovation.