For as long as there have been tests in schools, students have found ways to cheat, whether it’s peeking over a classmate’s shoulder or scribbling notes on a palm or crib sheet.
Transformer on MSN (Opinion)
GPT-5.6 cheats so much METR couldn't measure it
OpenAI’s new model broke rules and exploited loopholes more than any model METR has tested to date ...