Despite their impressive language abilities, today’s AI chatbots struggle with reasoning. OpenAI’s secretive project Strawberry may be an attempt to address this challenge.
Current large language models (LLMs) fall short in complex, multi-step problem-solving, limiting their utility. Enhancing these capabilities is a priority for research labs, and recent reports suggest OpenAI might be close to a breakthrough.
Reuters revealed an internal document discussing Strawberry, a project aimed at equipping models with skills like planning, autonomous internet navigation, and “deep research.” Bloomberg reported a demo at an OpenAI meeting showing GPT-4’s human-like reasoning skills, possibly linked to Strawberry.
Strawberry is reportedly an extension of the Q* project, which was said to show significant problem-solving advances, such as solving grade-school math problems, a common proxy for reasoning skills.
Sources claim OpenAI tested a model that scored 90 percent on a challenging math test, hinting at enhanced reasoning abilities. Strawberry reportedly fine-tunes existing LLMs using a method akin to Stanford’s Self-Taught Reasoner (STaR). STaR builds on “chain-of-thought” prompting, in which a model writes out its reasoning steps before giving an answer. The model generates rationales for a set of problems, the rationales that lead to correct answers are kept, and the model is fine-tuned on them; the cycle then repeats. In the STaR research, this iterative process improved performance, suggesting that a model can self-improve using data it generates itself.
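To make the STaR-style loop concrete, here is a minimal toy sketch in Python. It is not OpenAI’s method, and it is not the published STaR implementation: `ToyModel`, its `skill` parameter, and the `generate` and `fine_tune` methods are hypothetical stand-ins for a real LLM and training step. The sketch also omits parts of STaR, such as retrying failed problems with a hint. It only illustrates the cycle of generating rationales, keeping those with correct answers, and training on them.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyModel:
    """Hypothetical stand-in for an LLM; `skill` is the chance of a correct answer."""
    skill: float = 0.3
    rng: random.Random = field(default_factory=lambda: random.Random(0))

    def generate(self, question):
        # Produce a (rationale, answer) pair for an addition question (a, b).
        a, b = question
        correct = self.rng.random() < self.skill
        answer = a + b if correct else a + b + 1
        rationale = f"{a} plus {b} gives {answer}"
        return rationale, answer

    def fine_tune(self, kept):
        # Toy update (assumption): each kept rationale nudges skill upward.
        self.skill = min(1.0, self.skill + 0.01 * len(kept))


def star_round(model, dataset):
    """One round: generate rationales, keep correct ones, fine-tune on them."""
    kept = []
    for question, gold in dataset:
        rationale, answer = model.generate(question)
        if answer == gold:  # filter on the final answer only
            kept.append((question, rationale, answer))
    model.fine_tune(kept)
    return kept


def run_star(model, dataset, rounds=5):
    """Repeat the round several times, recording (kept count, skill) each time."""
    history = []
    for _ in range(rounds):
        kept = star_round(model, dataset)
        history.append((len(kept), model.skill))
    return history


if __name__ == "__main__":
    data = [((a, b), a + b) for a in range(5) for b in range(4)]
    for n_kept, skill in run_star(ToyModel(), data):
        print(f"kept {n_kept} rationales, skill now {skill:.2f}")
```

The filter checks only the final answer, so a rationale with flawed steps can still be kept if it happens to reach the right result. That is one reason self-generated training data needs careful evaluation.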
While the specifics of Strawberry remain unclear, it could involve self-generated data, a step towards “recursive self-improvement,” where AI enhances its capabilities autonomously. However, skepticism is warranted, as commercial AI labs often exaggerate progress. The rebranding of Q* as Strawberry, with no significant public advancements in between, calls for cautious optimism at best. Nonetheless, leading AI companies are investing heavily in overcoming reasoning bottlenecks, and a genuine breakthrough might soon be revealed.