Incubator Q&A: Do Large Language Models "reason"?

Answer by mr Tsjolder‭, posted 2024-05-24
I am afraid that the answer to your question is actually quite simple: 
 > there is **no consensus** among scientists.

This is illustrated by the fact that the "[fathers of Deep Learning](https://awards.acm.org/about/2018-turing)" do not entirely agree on the current state of AI:
 - Geoffrey Hinton strongly believes that human-level/human-like AI is possible and near. This is illustrated by the warnings he issued (e.g. in this [interview on YouTube](https://www.youtube.com/watch?v=qrvK_KuIeJk)) when he left Google.
 - Yoshua Bengio has recently shifted his research focus towards AI safety, arguing that we should prepare ourselves for when general AI arrives (e.g. in this recent [Science article](https://www.science.org/doi/10.1126/science.adn0117)).
   In a talk at an ICLR 2024 [Workshop on AGI](https://agiworkshop.github.io/), he explained that he did not expect LLMs to solve the tasks that they solve today. He also made clear that he does not necessarily believe that AGI is around the corner, but that it might come unexpectedly fast (cf. LLMs) and that we should be prepared.
 - Yann LeCun is convinced that there are still critical components missing to build human-level intelligent machines (e.g. as expressed in this [Time interview](https://time.com/6694432/yann-lecun-meta-ai-interview/)).

Although these discussions mostly focus on "Artificial General Intelligence (AGI)", which I would consider a generalisation of the question of reasoning, I think the ideas transfer reasonably well to the question at hand. For the sake of clarity, this is how I would expect each of them to answer the question of reasoning:
 - Geoffrey Hinton would argue that LLMs are able to reason.
 - Yoshua Bengio would argue that we don't know, but we should consider it a possibility.
 - Yann LeCun would argue that LLMs cannot reason.

Although these opinions are backed by decades of scientific experience, they are not actually backed by any hard science (as far as I am aware).
The main reason for this is that there are no clear, workable definitions of or tests for reasoning and/or AGI (this was also one of the main outcomes of the AGI workshop at ICLR 2024).

---

However, there have been some attempts to move in this direction.
 - The most well-known benchmark for reasoning and human-level intelligence (to me) is probably [the Abstraction and Reasoning Corpus (ARC)](https://github.com/fchollet/ARC) from Francois Chollet (a toy task in its format is sketched after this list).
   However, I have never read the paper behind it and I have not followed the discussions or critiques concerning this benchmark.
   I assume that it has some foundations, but I am a little sceptical about whether a single person would be able to solve a problem that has been studied for thousands of years.
 - There are plenty of papers that have studied the reasoning capabilities of language models at different levels.
   I stumbled over this [survey paper](https://aclanthology.org/2023.findings-acl.67/) and believe it provides a nice overview.
   It lists a few arguments in favour of reasoning capabilities (e.g. chain-of-thought prompting) and why they cannot be taken as "proof" of reasoning capabilities (e.g. hallucinations in one or more steps).
   This survey also concludes that the question of whether language models could reason remains open.
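
To make the ARC entry above a bit more concrete, here is a minimal sketch of what a task looks like in the JSON structure the ARC repository uses (as far as I understand it): a few demonstration input/output grid pairs plus held-out test pairs. The tiny task and the hard-coded rule below are invented purely for illustration; real ARC tasks are far more varied.

```python
# A toy task in the structure used by the ARC repository: a few
# demonstration pairs ("train") and one or more held-out pairs ("test").
# Grids are lists of lists of integers (colours); this particular task
# (mirror the grid horizontally) is made up here for illustration.
toy_task = {
    "train": [
        {"input": [[1, 0], [2, 0]], "output": [[0, 1], [0, 2]]},
        {"input": [[3, 4], [0, 5]], "output": [[4, 3], [5, 0]]},
    ],
    "test": [
        {"input": [[7, 0], [0, 8]], "output": [[0, 7], [8, 0]]},
    ],
}

def mirror_horizontally(grid):
    """A hand-written candidate rule; a real solver would have to infer
    it from the two demonstration pairs alone."""
    return [list(reversed(row)) for row in grid]

# The benchmark question: does the rule inferred from "train" also
# produce the correct "output" for the unseen "test" input?
for pair in toy_task["test"]:
    assert mirror_horizontally(pair["input"]) == pair["output"]
print("toy task solved")
```

Solving real ARC tasks means inferring such transformation rules from only a handful of examples, which is why the benchmark is usually presented as a test of abstraction and reasoning rather than of memorisation.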

---

Finally, for full disclosure, let me share the reasoning behind my personal bias towards a negative answer to your question.
From a technical perspective, it could be argued that language models (and other models marketed as generative AI) are just some form of [Random "Number" Generator (RNG)](https://en.wikipedia.org/wiki/Non-uniform_random_variate_generation).
The most successful language models are believed to be trained on high-quality data like books.
A significant amount of this data will be educational or technical, and will therefore contain language that is related to reasoning.
This kind of language must therefore have a high probability (density) in the distribution of the data.
As a result, a good RNG, i.e. one that produces outputs with the same probabilities as its input data, should also produce a significant amount of reasoning-like language.
At least for me, this explains most of the phenomena related to the reasoning capabilities of language models.
Of course, we do not know what data the large language models are trained on, so it is impossible to scientifically verify this kind of claim.
It could also be argued that this is exactly how human reasoning works.
However, I like to believe that there is more to reasoning than just mimicking behaviour.
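
As a minimal sketch of this "RNG" view (my own toy example, which assumes nothing about how real LLMs are implemented), consider a bigram model that simply samples the next word with the empirical probabilities found in its training text. If that text is full of reasoning-like language, so are the samples, even though nothing resembling reasoning happens inside the sampler:

```python
import random
from collections import defaultdict, Counter

# Toy "training corpus"; real models are trained on vastly more text,
# much of it educational or technical (i.e. reasoning-like language).
corpus = (
    "if a then b . a holds . therefore b holds . "
    "if b then c . b holds . therefore c holds ."
).split()

# Estimate the empirical next-word distribution (a bigram model).
counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    counts[current][nxt] += 1

def sample_next(word):
    """Draw the next word with the same probabilities as in the data:
    in this sense the model is just a (conditional) random generator."""
    candidates = counts[word]
    return random.choices(list(candidates), weights=list(candidates.values()))[0]

# Generate text: the output mimics the reasoning-like surface form of
# the corpus, although no reasoning is performed anywhere.
word, generated = "if", ["if"]
for _ in range(12):
    word = sample_next(word)
    generated.append(word)
print(" ".join(generated))
```

Real language models are of course far more sophisticated conditional samplers than this, but the argument above only relies on them reproducing the distribution of their training data.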

PS: Sorry for the long reply.