Date: 2023-08-01

A recent study conducted by researchers at Cornell University suggests that while artificial intelligence (AI) can generate jokes, it lacks a deep understanding of what makes them funny. The study involved testing AI models and humans on tasks related to The New Yorker magazine’s Cartoon Caption Contest entries. These tasks included matching jokes to cartoons, identifying winning captions, and explaining their humor. In all tasks, humans outperformed the AI models, indicating that there is still room for improvement in AI’s understanding of humor.

In a multiple-choice test matching cartoons to captions, the AI models reached only 62% accuracy, compared with 94% for humans. When human-written explanations of a joke were compared with AI-generated ones, the human versions were preferred roughly 2-to-1.

Large neural networks, a form of AI, can generate thousands of jokes along the lines of the classic “Why did the chicken cross the road?” The open question is whether these models understand why such jokes are funny. The researchers used The New Yorker’s Cartoon Caption Contest as a testbed to probe the models’ comprehension of humor.

Jack Hessel, the study’s lead author, explained that researchers probe AI models’ understanding of humor with scored evaluations such as multiple-choice tests. Even if a model surpasses human performance on such a test, it remains debatable whether it truly understands humor, since one can argue that comprehension is inherently human. Whether or not the models understand, the researchers acknowledged that their performance on these tasks is impressive.

For the study, the researchers compiled more than 700 caption contests from The New Yorker, spanning 14 years. They tested two types of AI models on the three tasks: “from pixels” models, which use computer vision, and “from description” models, which analyze human-written summaries of the cartoons. Compared with other datasets, the captions in The New Yorker contest relate to their cartoons in more sophisticated and indirect ways.

The results point to a significant gap between AI and human understanding of humor, both in matching captions to cartoons and in explaining why a caption is funny. Even so, the researchers noted that AI could serve as a collaborative tool for humorists brainstorming ideas, even if it does not yet fully grasp humor.

The study, titled “Do Androids Laugh at Electric Sheep? Humor ‘Understanding’ Benchmarks from The New Yorker Caption Contest,” received a best-paper award at the annual meeting of the Association for Computational Linguistics. The research was funded by organizations such as the Defense Advanced Research Projects Agency, AI2, and Google.

Overall, the study suggests that while AI has made progress in generating jokes, it still has a long way to go in truly understanding humor.