Illusion or Algorithm? Investigating Memorization, Emergence, and Symbolic Processing in In-Context Learning
Jingcheng Niu, Subhabrata Dutta, Ahmed Elshabrawy, Harish Tayyar Madabushi and Iryna Gurevych.
TMLR 2025
Have you ever wondered why LLMs are able to perform in-context learning (ICL)?
Right now, there are two main hypotheses to explain ICL:
- Memorization Hypothesis: LLMs memorize a vast amount of data during pre-training, and ICL is an illusion created by this memorization.
- Mechanistic Algorithm Hypothesis: LLMs have developed internal mechanisms that follow specific algorithms to perform ICL.
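To make the phenomenon concrete, here is a minimal sketch of what ICL looks like in practice. It assumes the Hugging Face `transformers` library, and `gpt2` is only a stand-in checkpoint: a model this small rarely gets such tasks right, while much larger LLMs often do.

```python
# Minimal few-shot ICL sketch (assumes the `transformers` library;
# "gpt2" is a stand-in for any pretrained causal LM).
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# No weights are updated: the "learning" happens entirely in the prompt,
# via a few input->label demonstrations followed by an unlabelled query.
prompt = (
    "Review: The plot was dull. Sentiment: negative\n"
    "Review: A delightful surprise. Sentiment: positive\n"
    "Review: I want my money back. Sentiment: negative\n"
    "Review: Best film of the year. Sentiment:"
)

output = generator(prompt, max_new_tokens=2, do_sample=False)
print(output[0]["generated_text"])  # a capable LLM continues with " positive"
```

The puzzle is why this works at all: the model is never explicitly trained to map demonstrations to labels, yet it completes the pattern.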
There’s also a debate regarding how this ICL ability emerges in LLMs during pre-training.
Finding 1: ICL is neither an illusion of memorization nor the development of an internal symbolic algorithm; it is still built on token statistics.
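As a purely illustrative toy (not the paper's analysis), "prediction from token statistics" can be as simple as counting which token tends to follow which in the pre-training text. The point of the finding is that ICL behaviour remains grounded in this kind of co-occurrence signal rather than in a symbolic rule.

```python
# Toy bigram predictor: a caricature of "prediction from token statistics".
# This illustrates the concept only; it is not the paper's method.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat . the cat ate the fish .".split()

# Count how often each token follows each other token in the "pre-training" text.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(token: str) -> str:
    """Return the most frequent continuation observed after `token`."""
    return bigrams[token].most_common(1)[0][0]

print(predict_next("the"))  # -> "cat", chosen by raw co-occurrence counts
```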
How to Cite
@misc{niu2025illusionalgorithminvestigatingmemorization,
title={Illusion or Algorithm? Investigating Memorization, Emergence, and Symbolic Processing in In-Context Learning},
author={Jingcheng Niu and
Subhabrata Dutta and
Ahmed Elshabrawy and
Harish Tayyar Madabushi and
Iryna Gurevych},
year={2025},
eprint={2505.11004},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2505.11004},
}