184

I apologize for writing a lengthy answer, but I get the feeling the discussions about foundations for formalized mathematics are often hindered by lack of information.

I have used proof assistants for a while now, and also worked on their design and implementation. While I will be quick to tell jokes about set theory, I am bitterly aware of the shortcomings of type theory, very likely more so than the typical set theorist. (Ha, ha, "typical set theorist"!) If anyone can show me how to improve proof assistants with set theory, I will be absolutely deligthed! But it is not enough to just have good ideas – you need to test them in practice on large projects, as many phenomena related to formalized mathematics only appear once we reach a certain level of complexity.

The components of a proof assistant

The architecture of modern proof assistants is the result of several decades of experimentation, development and practical experience. A proof assistant incorporates not one, but several formal systems.

The central component of a proof assistant is the kernel, which validates every inference step and makes sure that proofs are correct. It does so by implementing a formal system $F$ (the foundation) which is expressive enough to allow formalization of a large amount of mathematics, but also simple enough to allow an efficient and correct implementation.

The foundational system implemented in the kernel is too rudimentary to be directly usable for sophisticated mathematics. Instead, the user writes their input in a more expressive formal language $V$ (the vernacular) that is designed to be practical and useful. Typically $V$ is quite complex so that it can accommodate various notational conventions and other accepted forms of mathematical expression. A second component of the proof assistant, the elaborator, translates $V$ to $F$ and passes the translations to the kernel for verification.

A proof assistant may incorporate a third formal language $M$ (the meta-language), which is used to implement proof search, decision procedures, and other automation techniques. Because the purpose of $M$ is to implement algorithms, it typically resembles a programming language. The distinction between $M$ and $V$ may not be very sharp, and sometimes they are combined into a single formalism. From mathematical point of view, $M$ is less interesting than $F$ and $V$, so we shall ignore it.

Suitability of foundation $F$

The correctness of the entire system depends on the correctness of the kernel. A bug in the kernel allows invalid proofs to be accepted, whereas a bug in any other component is just an annoyance. Therefore, the foundation $F$ should be simple so that we can implement it reliably. It should not be so exotic that logicians cannot tell how it relates to the accepted foundations of mathematics. Computers are fast, so it does not matter (too much) if the translation from $V$ to $F$ creates verbose statements. Also, $F$ need not be directly usable by humans.

A suitable variant of set theory or type theory fits these criteria. Indeed Mizar is based on set theory, while HOL, Lean, Coq, and Agda use type theory in the kernel. Since both set theory and type theory are mathematically very well understood, and more or less equally expressive, the choice will hinge on technical criteria, such as availability and efficiency of proof-checking algorithms.

Suitability of vernacular $V$

A much more interesting question is what makes the vernacular $V$ suitable.

For the vernacular to be useful, it has to reflect mathematical practice as much as possible. It should allow expression of mathematical ideas and concepts directly in familiar terms, and without unnecessary formalistic hassle. On the other hand, $V$ should be a formal language so that the elaborator can translate it to the foundation $F$.

To learn more about what makes $V$ good, we need to carefully observe how mathematicians actually write mathematics. They produce complex webs of definitions, theorems, and constructions, therefore $V$ should support management of large collections of formalized mathematics. In this regards we can learn a great deal by looking at how programmers organize software. For instance, saying that a body of mathematics is "just a series of definitions, theorems and proofs" is a naive idealization that works in certain contexts, but certainly not in practical formalization of mathematics.

Mathematicians omit a great deal of information in their writings, and are quite willing to sacrifice formal correctness for succinctness. The reader is expected to fill in the missing details, and to rectify the imprecisions. The proof assistant is expected to do the same. To illustrate this point, consider the following snippet of mathematical text:

Let $U$ and $V$ be vector spaces and $f : U \to V$ a linear map. Then $f(2 \cdot x + y) = 2 \cdot f(x) + f(y)$ for all $x$ and $y$.

Did you understand it? Of course. But you might be quite surprised to learn how much guesswork and correction your brain carried out:

The vernacular $V$ must support these techniques, and many more, so that they can be implemented in the elaborator. It cannot be anything as simple as ZFC with first-order logic and definitional extensions, or bare Martin-Löf type theory. You may consider the development of $V$ to be outside of scope of mathematics and logic, but then do not complain when computer scientist fashion it after their technology.

I have never seen any serious proposals for a vernacular based on set theory. Or to put it another way, as soon as we start expanding and transforming set theory to fit the requirements for $V$, we end up with a theoretical framework that looks a lot like type theory. (You may entertain yourself by thinking how set theory could be used to detect that $f : U \to V$ above does not make sense unless we insert coercions – for if everthying is a set then so are $U$ and $V$, in which case $f : U \to V$ does make sense.)

Detecting mistakes

An important aspect of suitability of foundation is its ability to detect mistakes. Of course, its purpose is to prevent logical errors, but there is more to mistakes than just violation of logic. There are formally meaningful statements which, with very high probability, are mistakes. Consider the following snippet, and read it carefully:

Definition: A set $X$ is jaberwocky when for every $x \in X$ there exists a bryllyg $U \subseteq X$ and an uffish $K \subseteq X$ such that $x \in U$ and $U \in K$.

Even if you have never read Lewis Carroll's works, you should wonder about "$U \in K$". It looks like "$U \subseteq K$" would make more sense, since $U$ and $K$ are both subsets of $X$. Nevertheless, a proof assistant whose foundation $F$ is based on ZFC will accept the above definition as valid, even though it is very unlikely that the human intended it.

A proof assistant based on type theory would reject the definition by stating that "$U \in K$" is a type error.

So suppose we use a set-theoretic foundation $F$ that accepts any syntactically valid formula as meaningful. In such a system writing "$U \in K$" is meaningful and therefore the above definition will be accepted by the kernel. If we want the proof assistant to actually assist the human, it has to contain an additional mechanism that will flag "$U \in K$" as suspect, despite the kernel being happy with it. But what is this additional mechanism, if not just a second kernel based on type theory?

I am not saying that it is impossible to design a proof assistant based on set theory. After all, Mizar, the most venerable of them all, is designed precisely in this way – set theory with a layer of type-theoretic mechanisms on top. But I cannot help to wonder: why bother with the set-theoretic kernel that requires a type-theoretic fence to insulate the user from the unintended permissiveness of set theory?

30

EDIT: Since this question has gotten so much interest, I have decided to substantially rewrite my answer, stating explicitly here on MO some of the more important points rather than forcing the reader to follow links and chase down references.

  1. To begin with, it is important to distinguish between what currently existing proof assistants can do versus what they could do if we put in the necessary development work. There is no doubt that existing type-theoretic proof assistants outperform existing set-theoretic proof assistants according to various important metrics, such as convenience, pre-existing libraries, etc. Someone who favors type-theoretic proof assistants therefore always has a trump card to play in these discussions—“What you say is nice in theory, but show me the money. How does your set-theoretic proof assistant perform in practice on real problems?” In an earlier version of this answer, I mentioned a talk by John Harrison entitled, “Let’s make set theory great again!” (part 1 part 2 slides), and Andrej Bauer asked the reasonable question (in the comments below) whether Harrison had implemented his ideas. As Jeremy Avigad has said, even though he thinks that the “ideal proof assistant would be based on ZFC, with enough practical infrastructure to support all the things we need to do mathematics,” “it is easy to underestimate the difficulties involved in designing a useful and workable system.” At the same time, if we take the long view, we should be careful not to mistake what might be an artifact of our current implementations for a fundamental truth. Larry Paulson has in effect said “show me the money” in a more literal sense:

I would guess that the amount of effort and funding going into type theory exceeds the amount that went into set theory by an order of magnitude if not two. It's not unusual to encounter open hostility to set theory and classical logic combined with an air of moral superiority: “Oh, you aren’t constructive? And you don't store proof objects? Really?” And I have seen "proof assistant" actually DEFINED as "a software system for doing mathematics in a constructive type theory".

The academic interest simply isn't there. Consider the huge achievements of the Mizar group and the minimal attention they have received. Also, I think that my 2002 paper on proving the reflection theorem (and presented at CADE, a high-profile conference) was really interesting, but it had been cited only six times, and two of those are by myself.

I am certain that we would now have highly usable and flexible proof assistants based on some form of axiomatic set theory if this objective had enjoyed half the effort that has gone into type theory-based systems in the past 25 years.

  1. A second point is that everyone acknowledges that having a system where the computer can help you catch silly mistakes is a huge benefit, if not an absolute necessity. For this, some kind of type-theory-like mechanism is very useful. However, this is not as decisive an argument in favor of type theory and against set theory as it might seem at first glance. The “working mathematician” is often tempted to regard the absurdity of a statement such as $2\in 3$ as a strong argument against set theory, but the working mathematician also tends to balk at giving $0/0$ a concrete value (instead of declaring it to be “undefined”), which is the sort of thing that many proof assistants do. In both cases, there are known ways to block “fake theorems.” It is standard engineering practice to develop systems that contain multiple layers (the distinction between vernacular and foundation in Andrej Bauer’s excellent answer is an example), and the fact that $2\in 3$ might be a theorem at some low layer does not automatically mean that this is something a user will be able to enter from the keyboard and not get caught by the system. In principle, therefore—to return to the actual question being asked—set theory does not seem to pose any intrinsic barriers to automation. Indeed, other answers and comments have made this point, and explained how powerful automation tactics can and have been written in set-theoretic systems such as Metamath. Another example is the work of Bohua Zhan on auto2, which has shown that many of the alleged difficulties with untyped set theory can be overcome.

  2. There remains the question, even if a set-theoretic proof assistant with the power and usability of Coq/Lean/Isabelle could be developed, what would be the point? Aren't the already existing type-theoretic assistants good enough? This is a much more “subjective and argumentative” point, but I would propose a couple of arguments in favor of set theory. The first is that set theory has a great deal of flexibility, and it is not an accident that historically, the first convincing demonstration that all of mathematics could be put on a single, common foundation was accomplished using set theory rather than type theory. With a relatively small amount of training, mathematicians can see how to formulate any concepts and proofs in their field of expertise in set-theoretic terms. In the language of Penelope Maddy’s paper, What do we want a foundation to do? set theory provides a Generous Arena and a Shared Standard for all of mathematics with minimal fuss. Of course, there is a price to be paid if we give someone enough rope—they might decide to hang themselves. But if we want to see widespread adoption of proof assistants by the mathematical community, then we should take seriously any opportunities we have to leverage mathematicians’ existing habits of thought. I do not think that it is an accident that set-theoretic proof assistants tend to produce more human-readable proofs than type-theoretic proof assistants do (though I will admit that this could be an artifact of existing systems, rather than a fundamental truth).

    Another argument is that if we are interested in reverse mathematics—which axioms are needed to prove which theorems—then there has been a lot more work done to calibrate mathematics against set-theoretic and arithmetical systems than against type-theoretic systems. In Maddy’s language, we might hope for a proof assistant to help us with Risk Assessment and Metamathematical Corral. This does not seem to be a priority with too many people at the present time, but again I am trying to take the long view here. The mathematical community already has a good grasp of how the mathematical universe can be built from the ground up using set theory, and exactly what ingredients are needed to achieve which results. It would be desirable for our proof assistants to be able to capture this global picture.

    One could point out that someone who is really interested in set theory can use something like Isabelle/ZF, which builds set theory on top of type theory. That is true. I am not trying to argue here that a set-theoretic foundation with some kind of type theory layered on top is necessarily better than a type-theoretic foundation with some kind of set theory layered on top. I am only trying to argue that set theory does enjoy some advantages over type theory, depending on what you are trying to achieve, and that the claim that “automation is very difficult with set theory,” or that there is nothing to be gained by using set theory as the basis for a proof assistant, should be taken with a grain of salt.

Let me conclude with a remark about Lean specifically. A couple of years ago, Tom Hales provided a review of the Lean theorem prover that spells out the pros and cons as he saw them at the time. Some of what he said may no longer be true today, but one thing that is true is that even Lean enthusiasts agree that there are flaws that they promise will be fixed in Lean Version 4 (which unfortunately is going to be incompatible with Lean 3, or so I hear).

27

I still find it very surprising that this random talk I gave attracts so much attention, especially as not everything I said was very well thought out. I am more than happy to engage with people in discussions about what I said and whether or not some things I said were ill-informed.

But onto my answer to your question: whilst I am not an expert in proof assistants in general (I have become knowledgeable at precisely one proof assistant and have limited experience with others), it is my empirical observation that high-level tactics like Lean's ring tactic, which will prove results like $(x+2y)^3=x^3+6x^2y+12xy^2+8y^3$ immediately -- and there are similar tactics in Coq and Isabelle/HOL, two more type theory systems -- do not seem to exist in the two main set theory formal proof systems, namely Metamath and Mizar. I don't really know why, but those are the facts. Note that the proof of this from the axioms of a ring is extremely long and uncomfortable, because you need to apply associativity and commutativity of addition and multiplication many times -- things mathematicians do almost without thinking.

6

I'll answer just the automation question since the other answers gave nice broad overview, but didn't seem focus on that narrow question. My own direct automation experience is wrt to ACL2, Lean and SMT-based solvers.

Strictly speaking, I don't know if there's any foundational argument for why set theory would be better or worse than the type theory-based approach in Lean.

The strengths that Lean have from my perspective are: an expressive explicit type system, a relatively simple core language for representing terms, and a attention to how terms are represented for efficient manipulation.

With regards to typed core logics, most automation in theorem provers is tailored to specific common theories that are widely used in mathematics. When writing such automation, it is important to know the types and operations involved. For example, in writing a decision procedure for linear arithmetic in an untyped language, one needs to carefully check that any transformations still make sense even if the expressions do not denote numbers. By having a typed and typechecked expression language, one gets from the theorem prover itself and does not have to pay the additional runtime and complexity costs.

A second strength of Lean is ensuring that the core language is simple, but expressive so that one can represent proofs compactly. When using automation such as SMT solvers, the "proof terms" generated as evidence can be very large and the core proof language needs to be designed to compactly represent proofs while still be amenably to efficient checking. I'm not sure if Lean has an advantage to Coq or other solvers here per se but it is a factor in Lean's design.

A third strength of Lean is that the language for writing tactics and creating definitions and theorems are one and the same. There is a bit of syntactic sugar for the tactic sequences and a tactic-specific library, but by having the same language one does not have to learn multiple languages just to get started writing tactics. Lean is also not unique here -- ACL2 is similar, but it is a strength of Lean still. It will also become even more relevant with Lean 4 thanks to the efficient compiler being developed.

answered Nov 20 '20 at 20:31
-2

Modern programming languages unlike older programming languages, or unlike assembly and machine code are typed. For example c++ with or without templates compared to c, or Java compared to Basic.

With typing, a compiler needs to check that all expressions have the correct type. Generally, this is done by telling the compiler what types are being used. However, we can attempt to automate this process and allow the compiler to work out what the types are by itself, given the base types under consideration. This is known as type inference.

Depending upon the language, type checking and inference can range from trivial to undecidable.

We also say it is sound iff it accepts correctly typed programs.

Now, type checking and inference in the simply typed Lambda Calculus with parametric polymorphism is decidable and one implementation of this is the Hindley-Milner Type Sysyem where complexity is linear in the size of the program; however, the problem is PSPACE hard and EXPTIME complete, meaning that in the worst case scenario it requires a polynomial worth of extra space and an exponential amount of extra time. Luckily, it happens to be linear when the nesting depth of polymorphic variables is bounded.

For example,

  1. C++ with templates is Turing complete by an informal proof by Veldhuizen '03. This means type checking and inference is undecidable.
  1. In C#, both type checking and inference is undecidable. And it is unsound.
  1. Similarly Java because Java Generics are Turing Complete. Amin and Tate '15 showed that Java 5 or later is unsound.
  1. Haskell has decidable type checking and inference. But with enough extensions, that is to at least System F, then it becomes undecidable as System F is.

Dependent types allow types to depend upon their values and so give a finer control over types. For example, with ordinary types you cannot have a type for even numbers; with dependent types you can. Unfortunately, adding this capability generally renders type inference undecidable.

Lean is a programming language which is based upon Calculus of Constructions with Inductive Tupes. This calculus is a higher order typed Lambda Calculus and lies at the top node of Barendregt's Lambda Cube. This means it has polymorphism and dependent types.

It's main rival, is Martin Lof's Type Theory which is also dependent and inductive types. It comes in two main varieties, intensional and extensional. The intensional variety can be alternatively thought of as Homotopy Type Theory. In this variety, type inference is decidable whilst in the extensional variety it is not.

Set Theory like machine code, like assembly and like early programming languages are not typed. So type checking does not apply. As types allow us to program more safely, these languages are not as safe as typeful languages and hence, automation is more difficult here as we cannot prove they will act as we expect them to do so.

Typed languages does not mean that untyped languages will go away. After all, the languages at all in fact nested. Haskell, at bottom, is still executed by machine code and this is untyped. Hence, development of both is required for automation in all it's varieties.

After all, if I had to program the fastest loop to add as many consecutive integers in a second as I could on a particular machine I would choose to write it in machine code, I typed as it is - every time. More importantly, in my opinion it's makes one realise how the whole stack of languages from the editor through the compiler and then through the interface and interrupts down to the metal works. In this way, you learn more about how every works 'under the hood' - so to speak.

edit

@LSpice: It's written this way because the question itself is kind of vague. Why not ask the OP to tighten up the question? The proposed question is answered in the last paragraph. And the preceding paragraphs simply explains my terms and my thinking in a step-by-step pedagogical manner rather than a deus ex-machina manner ... or do you think explanatory text is a little too old-fashioned for the hyped up and glib virtual world?