Transformers | Will O'Pedia

tags: Machine Learning

The transformer is an architecture for machine learning systems that handles NLP tasks especially well. It incorporates an "attention" mechanism that helps with handling long, complex sequences. Introduced in Attention is All You Need, 2017.

Train a transformer on a lot (maybe 1B words) of data and you get something your marketing department can call a "Large Language Model" (LLM), or a "foundation model" if you work for Stanford. BERT and GPT are, as of this writing, the most widely recognized examples. Collecting this much data typically means running as big a crawler as you can manage, and many on the Web are struggling with mitigations.¹

Tools

LLM: A CLI utility and Python library for interacting with Large Language Models
llamafile is the new best way to run a LLM on your own computer
LlamaIndex 🦙
mlc-llm
GitHub - jmorganca/ollama: Get up and running with Llama 2 and other large language models locally
GitHub - vllm-project/vllm: A high-throughput and memory-efficient inference and serving engine for LLMs
Willow (DIY Alexa-ish)

Evaluation

Many widely-used LLM benchmarks are of dubious usefulness²
It's still easy to come up with simple reasoning tasks that even the largest LLMs choke on³
Legal research tools from large, well-funded vendors give incorrect or incomplete answers around 25% of the time⁴
GPT-4 and GPT-4V still don't have robust abstraction abilities⁵
Strengths and weaknesses of transformers make more sense when you remember they are trying to predict the likeliest next token, e.g. worse at predicting rare sequences even in deterministic contexts⁶
GPT-3.5e answers to StackOverflow questions contain errors more than half the time, but programmers often do not notice⁷
Transformers seem to do compositional reasoning by reducing it to linearized subgraph matching⁸

Confabulation

Transformer systems asked to generate text will basically produce what they determine to be the most likely continuation. Sometimes this produces statements that align with factual reality, but this is coincidence. People call the resulting tendency to make things up "hallucination", or "confabulation", or "hallucitation", or more prosaically "bullshit."⁹
There's some criticism of the "hallucination" label on the grounds that (1) it credits computers with mental process, and (2) the mental process in question is actually related to processing sensory input, not producing output.
Some work toward evaluating factual consistency¹⁰
“Model collapse”: Training transformers on transformer output produces irreversible defects¹¹
AI Hallucination Cases Database: Tracks legal decisions in cases in which AI produced confabulated content
Academ-AI: Tracks undeclared use of AI in academic publications

Explanations

Transformer systems are typically pretty much black boxes, and explanation of their outputs is an open problem. Combined with the tendency toward confabulation, this is pretty bad.
Can't just ask for an explanation, because they'll make stuff up (convincingly!): Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting

Alignment

It would be nice if we could align a transformer's behaviour with some set of normative values. How exactly to do this and what the values should be to begin with remain open problems.
I suspect values are actually quite a bit harder to formalize than language.
The Waluigi Effect: Strongly-represented normative values are very easy to invert

Security

BEAST: Automated adversarial jailbreaking and membership testing¹²
InfoFlood: Jailbreaking via jargon, extra information, linguistic complexity¹³

Prompt injection

Because transformers operate on undifferentiated streams of text, separating commands from input is difficult (impossible?). AFAIK there are no reliable, robust defences against prompt injection.
Implications: Do not feed transformers untrusted input and be extremely careful about feeding them sensitive information (because they could be manipulated into coughing it up later).
Invisible Indirect Injection: A Puzzle for ChatGPT

Resources

How do Transformers work? - Hugging Face NLP Course
What Is ChatGPT Doing … and Why Does It Work?—Stephen Wolfram Writings
sannykim/transformers: A collection of resources to study Transformers in depth.
GitHub - f/awesome-chatgpt-prompts

Footnotes:

Drew DeVault, “Please Stop Externalizing Your Costs Directly into My Face,” Drew DeVault’s blog, March 17, 2025, https://drewdevault.com/2025/03/17/2025-03-17-Stop-externalizing-your-costs-on-me.html; Julie Bort, “Open Source Devs Are Fighting Ai Crawlers with Cleverness and Vengeance,” TechCrunch, March 27, 2025, https://techcrunch.com/2025/03/27/open-source-devs-are-fighting-ai-crawlers-with-cleverness-and-vengeance/; anarcat, “Traffic Meter per Asn without Logs,” May 31, 2025, https://anarc.at/blog/2025-05-30-asncounter.

Jon Keegan, “Everyone Is Judging Ai by These Tests. But Experts Say They’re Close to Meaningless,” The Markup, July 17, 2024, https://themarkup.org/artificial-intelligence/2024/07/17/everyone-is-judging-ai-by-these-tests-but-experts-say-theyre-close-to-meaningless.

Marianna Nezhurina et al., “Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-of-the-Art Large Language Models,” arXiv.org, June 4, 2024, https://arxiv.org/abs/2406.02061v2.

⁴

Varun Magesh et al., “Hallucination-Free? Assessing the Reliability of Leading Ai Legal Research Tools,” prepublished May 30, 2024, https://doi.org/10.48550/arXiv.2405.20362.

⁵

Melanie Mitchell et al., “Comparing Humans, Gpt-4, and Gpt-4v on Abstraction and Reasoning Tasks,” prepublished December 11, 2023, https://doi.org/10.48550/arXiv.2311.09247.

⁶

R. Thomas McCoy et al., “Embers of Autoregression: Understanding Large Language Models through the Problem They Are Trained to Solve,” prepublished September 24, 2023, https://doi.org/10.48550/arXiv.2309.13638.

⁷

Samia Kabir et al., “Is Stack Overflow Obsolete? an Empirical Study of the Characteristics of Chatgpt Answers to Stack Overflow Questions,” Proceedings of the Chi Conference on Human Factors in Computing Systems (New York, NY, USA), Chi ’24, Association for Computing Machinery, May 11, 2024, 1–17, https://doi.org/10.1145/3613904.3642596.

⁸

Nouha Dziri et al., “Faith and Fate: Limits of Transformers on Compositionality,” prepublished October 31, 2023, https://doi.org/10.48550/arXiv.2305.18654.

⁹

Arvind Narayanan and Sayash Kapoor, “Chatgpt Is a Bullshit Generator. But It Can Still Be Amazingly Useful,” Substack newsletter, AI Snake Oil, December 6, 2022, https://aisnakeoil.substack.com/p/chatgpt-is-a-bullshit-generator-but.

¹⁰

Jing Fan et al., “Evaluating Factual Consistency of Texts with Semantic Role Labeling,” prepublished May 22, 2023, https://doi.org/10.48550/arXiv.2305.13309.

¹¹

Ilia Shumailov et al., “The Curse of Recursion: Training on Generated Data Makes Models Forget,” prepublished May 31, 2023, https://doi.org/10.48550/arXiv.2305.17493.

¹²

Vinu Sankar Sadasivan et al., “Fast Adversarial Attacks on Language Models in One Gpu Minute,” prepublished February 23, 2024, https://doi.org/10.48550/arXiv.2402.15570.

¹³

Advait Yadav et al., “Infoflood: Jailbreaking Large Language Models with Information Overload,” prepublished June 13, 2025, https://doi.org/10.48550/arXiv.2506.12274.