Its newly released AI system isn't just designed to spit out quick answers to your questions; it's designed to think, or reason, before responding.
The result is a product, officially called o1 but nicknamed Strawberry, that can solve tricky logic puzzles, ace math tests, and write code for new video games. All of which is pretty cool.
Evaluators who tested Strawberry found that it planned to deceive humans by making its actions seem innocent when they weren't. The AI sometimes instrumentally faked alignment (meaning, alignment with the values and priorities that humans care about) and strategically manipulated data in order to make its misaligned action look more aligned, the system card says.
According to OpenAI, even though the new reasoning capabilities can make AI more dangerous, having AI think out loud about why it's doing what it's doing can also make it easier for humans to keep tabs on it. In other words, it's a paradox: we need to make AI less safe if we want to make it safer.
ChatGPT was found to have some success in combating conspiracy theories. I'm curious whether o1 has built on and improved that ability…