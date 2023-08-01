While artificial intelligence (AI) can generate jokes, a recent study suggests that it lacks an understanding of what makes jokes funny. In an experiment conducted by researchers, AI models and humans were tested on tasks involving matching jokes to cartoons, identifying winning captions, and explaining their humor. Humans significantly outperformed the machines in all tasks, indicating that AI’s understanding of humor still has room to improve.

The study used the New Yorker magazine’s Cartoon Caption Contest entries as a testbed. The AI models achieved only 62% accuracy in matching cartoons to captions in a multiple-choice test, compared to 94% accuracy by humans.

Although AI has made some progress in understanding humor, it still has a long way to go. Jack Hessel, lead author of the study, explains that researchers probe a model’s understanding by building evaluations such as multiple-choice tests. If a model eventually surpasses human performance on such a test, the question becomes whether the machine truly understands humor. One can argue that understanding is a strictly human trait, but the performance of AI on these tasks is impressive either way.

The researchers compiled more than 700 caption contests from the New Yorker, spanning 14 years, for the study. They tested two types of AI models: computer vision-based models that analyze the cartoon images directly, and models that analyze human-written summaries of the cartoons. Understanding the relationships between the images and captions in the New Yorker contest required a higher level of sophistication from the models.

The experiment revealed a significant gap between AI and human understanding of humor. In the multiple-choice matching task, the models’ 62% accuracy trailed the humans’ 94% by 32 percentage points. Human-generated explanations were also preferred over AI-generated ones at a ratio of 2 to 1.
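
For readers curious how a multiple-choice score like this is computed, here is a minimal sketch in Python. It is not the study's evaluation code: the `accuracy` function, the five-cartoon quiz, and the answer letters are all hypothetical, made up purely for illustration. Only the 62% and 94% figures come from the study as reported above.

```python
def accuracy(predictions, answers):
    """Fraction of quiz items where the chosen caption is the correct one."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("need at least one quiz item")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


# Hypothetical five-cartoon quiz (illustrative data, not from the study).
# Each letter is the caption option picked for one cartoon.
answers = ["B", "A", "D", "C", "E"]
model_choices = ["B", "A", "C", "C", "A"]  # wrong on the 3rd and 5th cartoons
human_choices = ["B", "A", "D", "C", "E"]  # all correct

model_acc = accuracy(model_choices, answers)
human_acc = accuracy(human_choices, answers)
print(f"model: {model_acc:.0%}, human: {human_acc:.0%}")

# The gap reported in the article, in percentage points.
reported_gap = 94 - 62
print(f"reported gap: {reported_gap} points")
```

Accuracy here is simply the share of cartoons for which the chosen caption matches the correct one, so a score can only be compared across test-takers when both face the same set of questions.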

Despite AI’s current limitations, the authors of the study suggest that it could serve as a collaborative tool to help humorists brainstorm ideas. The study was funded in part by the Defense Advanced Research Projects Agency, AI2, and a Google Focused Research Award.