Can LLMs Reason? Challenges, Breakthroughs, and Future Directions

by Frank Schilder | at Minnebar 19

Large Language Models (LLMs) excel at generating text but face challenges in true reasoning. This talk explores these limitations through examples such as NYC taxi route prediction and the reversal curse, where LLMs struggle to draw correct logical inferences. For example, a model trained on “A is B” (e.g., “Tom is the parent of John”) often fails to infer the reverse “B is A” (e.g., “John is the child of Tom”) at inference time, even though the two sentences express equivalent statements. Critics argue that such brittleness stems from LLMs’ reliance on statistical patterns rather than structured logic.
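To make the reversal curse concrete, here is a minimal sketch of how one might probe it with an OpenAI-style chat client. The model name, the `ask` helper, and the celebrity fact pair (the canonical example from the reversal-curse literature) are illustrative assumptions, not material from the talk.

```python
# Minimal probe for the reversal curse, assuming an OpenAI-style chat API.
# The model name and prompt wording are illustrative placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask(question: str) -> str:
    """Send a single question and return the model's answer."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # hypothetical choice; any chat model works here
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content

# Forward direction: the fact as it likely appeared in training data.
print(ask("Who is Tom Cruise's mother?"))       # typically answered correctly

# Reverse direction: logically equivalent, but rarely seen verbatim in training.
print(ask("Who is Mary Lee Pfeiffer's son?"))   # often fails: the reversal curse
```

Note that the failure concerns facts learned during training; given both sentences in the prompt itself, models usually handle the reversal fine, which is part of why the phenomenon points to statistical pattern-matching rather than structured inference.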

Recent advances, including OpenAI’s o1/o3 and Google’s Gemini 2.5, address these gaps by combining reinforcement learning with chain-of-thought (CoT) techniques, enabling models to work through intermediate reasoning steps. Although the developers of closed models reveal little about their underlying techniques, open-source models such as DeepSeek-R1 make it possible to build custom reasoning models. The talk concludes with a vision for hybrid architectures that augment rather than replace human experts, positioning LLMs as collaborative tools for brainstorming and error-checking in fields like software engineering and scientific research. Attendees will gain a nuanced understanding of the current state of LLM reasoning and its potential to transform knowledge work.
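As a rough illustration of the CoT idea, the sketch below (under the same assumptions as above: an OpenAI-style client and a hypothetical model name) contrasts a direct prompt with one that asks for step-by-step reasoning on a classic trick question. It illustrates prompted CoT only, not the RL-trained reasoning of o1/o3 or DeepSeek-R1.

```python
# Contrast a direct prompt with a chain-of-thought (CoT) prompt.
# Assumes an OpenAI-style chat client; all wording is illustrative.
from openai import OpenAI

client = OpenAI()

question = (
    "A bat and a ball cost $1.10 together. The bat costs $1.00 more "
    "than the ball. How much does the ball cost?"
)

def answer(prompt: str) -> str:
    """Return the model's reply to a single-turn prompt."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # hypothetical model choice
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Direct prompt: the model may pattern-match to the intuitive (wrong) "$0.10".
print(answer(question + " Answer with just the amount."))

# CoT prompt: eliciting intermediate steps often surfaces the correct "$0.05".
print(answer(question + " Think step by step, then state the final amount."))
```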

Advanced

Frank Schilder

Frank Schilder is a senior director at Thomson Reuters Labs, specializing in AI and machine learning. He holds a Ph.D. in Cognitive Science from the University of Edinburgh and previously taught at the University of Hamburg, Germany.

