Incubator Q&A: Do Large Language Models "reason"?

Answer by mr Tsjolder‭, posted 2024-05-24
I am afraid that the answer to your question is actually quite simple: 
 > there is **no consensus** among scientists.

This is illustrated by the fact that the "[fathers of Deep Learning](https://awards.acm.org/about/2018-turing)" do not entirely agree on the current state of AI:
 - Geoffrey Hinton strongly believes that human-level/human-like AI is possible and near. This is illustrated by the warnings he issued (e.g. in this [interview on YouTube](https://www.youtube.com/watch?v=qrvK_KuIeJk)) when he left Google.
 - Yoshua Bengio has recently shifted his research focus towards AI safety, arguing that we should prepare ourselves for when general AI arrives (e.g. in this recent [Science article](https://www.science.org/doi/10.1126/science.adn0117)).
   In a talk at an ICLR 2024 [Workshop on AGI](https://agiworkshop.github.io/), he explained that he did not expect LLMs to solve the tasks that they solve today. He also made clear that he does not necessarily believe that AGI is around the corner, but that it might come unexpectedly fast (cf. LLMs) and that we should be prepared.
 - Yann LeCun is convinced that there are still critical components missing to build human-level intelligent machines (e.g. as expressed in this [Time interview](https://time.com/6694432/yann-lecun-meta-ai-interview/)).

Although these discussions mostly focus on "Artificial General Intelligence (AGI)", which I would consider a generalisation of the question of reasoning, I think the ideas transfer reasonably well to the question at hand. For the sake of clarity, this is how I would expect each of them to answer the question of reasoning:
 - Geoffrey Hinton would argue that LLMs are able to reason.
 - Yoshua Bengio would argue that we don't know, but we should consider it a possibility.
 - Yann LeCun would argue that LLMs cannot reason.

Although these opinions are backed by decades of scientific experience, they are not actually backed by any hard science (as far as I am aware).
The main reason for this is that there are no clear, workable definitions of or tests for reasoning and/or AGI (this was also one of the main outcomes of the AGI workshop at ICLR 2024).

---

However, there have been some attempts to move in this direction.
 - The most well-known benchmark for reasoning and human-level intelligence (to me) is probably [the Abstraction and Reasoning Corpus (ARC)](https://github.com/fchollet/ARC) from Francois Chollet (a toy task in its format is sketched after this list).
   However, I have never read the paper behind it and I have not followed the discussions or critiques concerning this benchmark.
   I assume that it has some foundations, but I am a little sceptical about whether a single person would be able to solve a problem that has been studied for thousands of years.
 - There are plenty of papers that have studied the reasoning capabilities of language models at different levels.
   I stumbled over this [survey paper](https://aclanthology.org/2023.findings-acl.67/) and believe it provides a nice overview.
   It lists a few arguments in favour of reasoning capabilities (e.g. chain-of-thought prompting) and why they cannot be taken as "proof" of reasoning capabilities (e.g. hallucinations in one or more steps).
   This survey also concludes that the question of whether language models could reason remains open.
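
To make the ARC entry above a bit more concrete, here is a minimal sketch of what a task looks like in the JSON structure the ARC repository uses (as far as I understand it): a few demonstration input/output grid pairs plus held-out test pairs. The tiny task and the hard-coded rule below are invented purely for illustration; real ARC tasks are far more varied.

```python
# A toy task in the structure used by the ARC repository: a few
# demonstration pairs ("train") and one or more held-out pairs ("test").
# Grids are lists of lists of integers (colours); this particular task
# (mirror the grid horizontally) is made up here for illustration.
toy_task = {
    "train": [
        {"input": [[1, 0], [2, 0]], "output": [[0, 1], [0, 2]]},
        {"input": [[3, 4], [0, 5]], "output": [[4, 3], [5, 0]]},
    ],
    "test": [
        {"input": [[7, 0], [0, 8]], "output": [[0, 7], [8, 0]]},
    ],
}

def mirror_horizontally(grid):
    """A hand-written candidate rule; a real solver would have to infer
    it from the two demonstration pairs alone."""
    return [list(reversed(row)) for row in grid]

# The benchmark question: does the rule inferred from "train" also
# produce the correct "output" for the unseen "test" input?
for pair in toy_task["test"]:
    assert mirror_horizontally(pair["input"]) == pair["output"]
print("toy task solved")
```

Solving real ARC tasks means inferring such transformation rules from only a handful of examples, which is why the benchmark is usually presented as a test of abstraction and reasoning rather than of memorisation.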

---

Finally, for full disclosure, let me share the reasoning behind my personal bias towards a negative answer to your question.
From a technical perspective, it could be argued that language models (and other models marketed as generative AI) are just some form of [Random "Number" Generator (RNG)](https://en.wikipedia.org/wiki/Non-uniform_random_variate_generation).
The most successful language models are believed to be trained on high-quality data like books.
A significant amount of this data will be educational or technical, and will therefore contain language that is related to reasoning.
This kind of language must therefore have a high probability (density) in the distribution of the data.
As a result, a good RNG, i.e. one that produces outputs with the same probabilities as its input data, should also produce a significant amount of reasoning-like language.
At least for me, this explains most of the phenomena related to the reasoning capabilities of language models.
Of course, we do not know what data the large language models are trained on, so it is impossible to scientifically verify this kind of claim.
It could also be argued that this is exactly how human reasoning works.
However, I like to believe that there is more to reasoning than just mimicking behaviour.
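
As a minimal sketch of this "RNG" view (my own toy example, which assumes nothing about how real LLMs are implemented), consider a bigram model that simply samples the next word with the empirical probabilities found in its training text. If that text is full of reasoning-like language, so are the samples, even though nothing resembling reasoning happens inside the sampler:

```python
import random
from collections import defaultdict, Counter

# Toy "training corpus"; real models are trained on vastly more text,
# much of it educational or technical (i.e. reasoning-like language).
corpus = (
    "if a then b . a holds . therefore b holds . "
    "if b then c . b holds . therefore c holds ."
).split()

# Estimate the empirical next-word distribution (a bigram model).
counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    counts[current][nxt] += 1

def sample_next(word):
    """Draw the next word with the same probabilities as in the data:
    in this sense the model is just a (conditional) random generator."""
    candidates = counts[word]
    return random.choices(list(candidates), weights=list(candidates.values()))[0]

# Generate text: the output mimics the reasoning-like surface form of
# the corpus, although no reasoning is performed anywhere.
word, generated = "if", ["if"]
for _ in range(12):
    word = sample_next(word)
    generated.append(word)
print(" ".join(generated))
```

Real language models are of course far more sophisticated conditional samplers than this, but the argument above only relies on them reproducing the distribution of their training data.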

PS: Sorry for the long reply.