There’s no such thing as auto-formalization

This is an explanation of why “auto-formalization”, as a general concept, is not real, but probably still helpful some of the time.

Some people talk about “auto-formalization”: the act of convincing a computer to automatically formalize informal material (such as mathematical statements). They might be referring to the ability of GPT-style models to make conversions like

for every natural number, there is a larger prime number

into

theorem exists_larger_prime (n : ℕ) : ∃ (p : ℕ), prime p ∧ p > n

The tl;dr here is this: if it were possible to auto-formalize some range of problems, you would already have had to formalize that range of problems.

Or, as Perlis’s epigram 102 has it: “One can’t proceed from the informal to the formal by formal means”.

This is not meant to devalue the work folks are doing on machine learning and theorem proving, which I would like to see succeed. I’m not part of any LLM research community, which I assume hosts lively debate about what “auto-formalization” means and the pros and cons of various approaches. But I do sometimes sense confusion in discussions around verification: an implicit, unchallenged belief that “verification” is an objective property that an object either does or does not possess.

an example

To be a little more concrete, suppose you want me to automatically formalize a particular textbook (say, Algebra by Artin). You want a program that takes in the “natural” language and spits out formal theorems and formal proofs of those theorems. Here are several things I might do, each resulting in a particular program that constructs a formal object given some input:

  1. A program that always returns the proposition “true” with accompanying proof.
  2. I hire a grad student to formalize the book by hand, check their work, then give you a program that prints out the work.
  3. I import a highly sophisticated library containing formalized algebra at a much higher level of abstraction. The program computes formalizations of the theorems that pass external validation (we call this subroutine auto_theorems), then it does a brute force search to find instantiations of the abstract proofs from the library that prove the theorems.
  4. auto_theorems, but then the natural language proofs are parsed and an attempt is made to structurally translate them into formal proof with the help of a library of prior math knowledge. Whenever gaps are found, the program asks a user (in some mix of formal and natural language) how to fill them.
  5. The same as the previous, but all gaps are filled automatically.
  6. The same as the previous, but proofs are constructed without the help of any domain knowledge (just basic logical axioms and induction).
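The first of these is trivially implementable. A minimal sketch in Lean 4 (the function name is hypothetical; the point is that it ignores its input entirely):

```lean
-- A “formalizer” in the spirit of strategy (1): whatever informal text
-- it receives, it returns the proposition `True` together with its
-- canonical proof.
def auto_formalize (_input : String) : { p : Prop // p } :=
  ⟨True, trivial⟩
```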

The first point is that there are no formal grounds on which any of these responses can be refuted as incorrect (even the first one). That first response is representative of a general danger: the formal theorem statement chosen to represent an informal one might simply not be relevant or useful. Such errors can be buried arbitrarily deep in a formalization, although end-to-end proofs (with vetted hypotheses and conclusions) are a way to guard against that. Still, the vetting must take place.
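To make the danger concrete, here is a hypothetical mis-formalization in the spirit of the earlier example (Lean 4 syntax): the statement has a plausible quantified shape, but a misplaced implication makes it vacuously provable and therefore useless.

```lean
-- Looks like a statement about large numbers, but the implication sits
-- inside the existential: the witness p = 0 satisfies it vacuously,
-- because 0 > n is false for every natural number n.
theorem looks_plausible (n : Nat) : ∃ p : Nat, p > n → p > n + 1 :=
  ⟨0, fun h => absurd h (Nat.not_lt_zero n)⟩
```

The proof checker happily accepts this; only an informal reading reveals that it says nothing of interest.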

The second point is that auto-formalization should (probably) be general in an interesting way. The second strategy is not general (it cannot formalize any other text) and the third is dubious (its prior knowledge is highly domain-specific).

Another point about (3): once you choose a formal system and formal theorem expressions, proofs can be checked for validity without further human intervention. However, determining whether or not a particular formal proof represents a given natural language proof is highly problematic. Normally, people learn by reading proofs; if formal proofs are to be helpful for learning, something more is required.
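A toy illustration of this gap, in Lean: both proofs below are accepted by the kernel, but nothing formal distinguishes a proof that “represents” a given prose argument from one that does not.

```lean
-- Two kernel-accepted proofs of the same statement. Validity is
-- mechanically checkable; correspondence to a particular
-- natural-language argument is not.
example : 2 + 2 = 4 := rfl
example : 2 + 2 = 4 := by decide
```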

The fourth strategy is not automatic, but I would argue it is highly useful. People out there are already deeply invested in building formal math; routines that handle more of the busywork for them are great. They expect tools to be interactive: it is simply not the norm for one-shot methods to succeed. And when an automated method does fail, it should fail in an informative way: some hint as to what to try next is necessary.
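In Lean, strategy (4) maps naturally onto emitting a proof skeleton whose unfilled gaps are surfaced as `sorry` placeholders for the user. A sketch of what such output might look like for the running example, assuming Mathlib’s `Nat.Prime` and `Nat.factorial`:

```lean
import Mathlib

-- Hypothetical output of strategy (4): the structure of the textbook
-- proof is translated, and each gap the program could not fill is
-- handed back to the user as a `sorry`.
theorem exists_larger_prime (n : ℕ) : ∃ p : ℕ, Nat.Prime p ∧ p > n := by
  -- "Let p be a prime factor of n! + 1."
  have h : ∃ p : ℕ, Nat.Prime p ∧ p ∣ Nat.factorial n + 1 := sorry
  obtain ⟨p, hp, hdvd⟩ := h
  refine ⟨p, hp, ?_⟩
  -- gap: "p divides n! + 1, so p cannot be ≤ n" — left to the user
  sorry
```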

The final option is the most objective, but it still relies on human vetting that auto_theorems is sound, and it may fail to produce proofs that are useful in some desired sense.


Replacing all programming activity with “auto”-translated natural language is not a holy grail either. Programming languages do tend to introduce vast amounts of incidental complexity, but there are still real reasons to prefer formal languages for describing complex processes: a well-designed language or notation supports thinking.

As far as mixing natural language and theorem proving goes, my take is: