ARGONNE NATIONAL LABORATORY
9700 South Cass Avenue
Argonne, IL 60439
ANL/MCS-TM-191
The QED Workshop
held at Argonne National Laboratory
May 18--20, 1994
Gail W. Pieper, synthesizer
Mathematics and Computer Science Division
Technical Memorandum No. 191
July 1994
The workshop was supported by special funding from the Office of Naval Research under ONR Order No. N00014-96-F-0088.
Abstract
On May 18--20, 1994, Argonne National Laboratory hosted the QED Workshop. The workshop was supported by special funding from the Office of Naval Research. The purpose of the workshop was to assemble a group of researchers to consider whether it is desirable and feasible to build a proof-checked encyclopedia of mathematics, with an associated facility for theorem proving and proof checking. Among the projects represented were Coq, Eves, HOL, ILF, Imps, MathPert, Mizar, NQTHM, NuPrl, OTTER, Proof Pad, Qu-Prolog, and RRL.
Although the content of the QED project is highly technical---rigorously proof-checked mathematics of all sorts---the discussions at the workshop were rarely technical. No prepared talks or papers were given. Instead, the discussions focused primarily on political, sociological, practical, and aesthetic questions such as Why do it? Who are the customers? How can we get mathematicians interested? What sort of interfaces are desirable?
The most important conclusion of the workshop was that QED is an idea worthy pursuing, a statement with which virtually all the participants agreed.
In this document, we capture some of the discussions and outline suggestions for the start of a QED scientific community.
The workshop gathered approximately thirty researchers, representing ongoing worldwide efforts in theorem proving and mathematics. Among the projects represented were the Coq, Eves, HOL, ILF, Imps, MathPert, Mizar, NQTHM, NuPrl, OTTER, Proof Pad, Qu-Prolog, and RRL.
Many of those attending had previously contributed to a document known as the QED Manifesto---an anonymously authored document that discusses the desirability and feasibility of organizing a public-domain database of a substantial part of mathematical knowledge, including proofs suitable for machine checking.
The focus of the workshop was on mathematics and automated deduction. The objective was to consider the desirability and feasibility of building a proof-checked encyclopedia of mathematics, with an associated facility for theorem proving and proof checking. Such a project could be used in university mathematics education, graduate research, mathematics research, and K-12 school education.
The structure of the workshop was intentionally kept informal; no formal presentations were given. Moreover, the discussions themselves were focused on nontechnical issues---potential customers, philosophy, linkages with other symbolic and numerical systems. The result was lively discussion, often sharp disagreement, about a variety of political, sociological, and aesthetic questions involved in organizing such a major undertaking as the QED project.
In the remainder of this report, we summarize the activities of the QED Workshop. First, we review the background to the QED project. We then list the participants and present summaries of the pertinent issues raised. We conclude with an evaluation of the workshop and suggestions for future work.
In particular, professional research mathematicians are only beginning to show interest in using theorem proving to attack hard problems. A recent article in the New York Times illustrates the level of skepticism manifest by many mathematicians about the feasibility of automated theorem provers [2]. Compounding the problem is the difficulty of encoding mathematical results and their proofs in a single database that can be easily used by students not only at the graduate level but also at the K-12 level. A principal objective of the QED Workshop, then, was to discuss the feasibility of constructing an ``intellectual infrastructure''---an organized framework encapsulating theorem provers and checkers, proofs, and the basic ideas of mathematics.
The groundwork for discussion at the QED Workshop was a document known as the QED Manifesto [3]. Many of the participants had contributed to this document, directly or indirectly, by comments via electronic mail. Moreover, several of the participants are already working on mathematics-related projects in theorem proving. One notable example is the Mizar system [4], which has a substantial body of formally checked mathematics. The objective at the QED Workshop, however, was not to identify specific programs but to understand how and whether such programs might be organized, together with a library of proofs and results, into a multilayered QED system.
Michael Beeson San Jose State University Robert Boyer University of Texas at Austin Bernd Dahn Humboldt-University at Berlin Masami Hagiya University of Tokyo John Harrison University of Cambridge Joan Hart University of Wisconsin M. Randall Holmes Boise State University Paul Jackson Cornell University Thomas Jech Pennsylvania State University Deepak Kapur State University of New York at Albany Kenneth Kunen University of Wisconsin Ewing Lusk Argonne National Laboratory William McCune Argonne National Laboratory Chet Murthy INRIA-Rocquencourt Ross Overbeek Argonne National Laboratory William Pase Odyssey Research Associates, Inc. Piotr Rudnicki University of Alberta John Staples The University of Queensland Rick Stevens Argonne National Laboratory F. Javier Thayer MITRE Corp. Andrzej Trybulec Warsaw University Tomas Uribe Stanford University Ralph Wachter ONR Richard Waldinger SRI International Toby Walsh IRST Larry Wos Argonne National Laboratory
As one might expect among groups who have worked independently (in most cases, for decades), the discussion was lively. The participants did agree that QED must necessarily differ from earlier projects in both scope and strategy. In contrast to the Automath project of the 1960s, for example, QED must be a large project, must be widely available (via Internet), and must have vast numbers of people working on it.
In the remainder of this report, we present some of the discussions related to these and other, more controversial issues.
Bourbaki was cited as the best example so far of mathematics organized into a coherent framework. According to Andre Weil, ``Perhaps the most important contribution of Bourbaki was to carry out a famous proposal made by the great German mathematician David Hilbert in 1900 that mathematics be placed on a more secure foundation.'' He noted: ``Hilbert just said so, and Bourbaki did it" [1]. Yet several participants believe that Bourbaki is what QED should not be, in particular because Bourbaki is highly praised, but rarely used.
Wide use by research communities was deemed vital. Three research communities appear the most obvious users: (1) computer scientists (e.g., for automated deduction and developing verified systems), (2) logicians, and (3) mathematicians (e.g., for proof-checking, teaching, and publication). Most participants agreed that the idea of merely archiving proofs would be boring to many mathematicians and that if this is the key objective of the QED project, mathematicians probably will not contribute. Mathematicians will be interested only to the extent that we can help them do new mathematics.
Clarifying the benefits of QED to the wider community appears to be the best approach to attracting potential users (as well as the support of sponsors). Several such benefits were identified:
Some participants view interface matters as unimportant so long as deep technical issues remain to be solved. Yet, interface issues determine 50 Three areas of interfacing were identified as significant:
Some participants felt that a standard object language was needed (although it should allow for the evolution of dialects of the standard object language and coexistence with alien object languages such as constructive systems).
Others noted that one should distinguish between the object language as a set of well-formed formulas and the basic theory formulated in the object language. For example, first-order logic can be used to reason in different basic theories (e.g., Peano arithmetic, MacLane set theory). The question thus becomes, Should one declare a single theory to be basic?
It was argued that this might create obstacles to QED research: for example, restricting the quantifiers by types could force users to verify these constraints throughout a proof, even in field where the types have nothing to do with the mathematical contents. The alternative would be to implement mechanisms to do the type checking automatically or to prove formally metatheorems that show that type checking is not necessary. Both are ambitious tasks.
Alternatively, one might demand that each accepted proof in QED state explicitly the basic theory and the calculus it employs. Then, the connection between the systems could be made by (1) proving the axioms of the basic theory of the first system in the second, and (2) proving that provability in the first calculus implies provability in the second calculus.
The group was split on the importance of having a common theory, say, T, and translations from the object theories of the individual systems into T that could provably transform proofs of the individual systems into proofs in T. At first T was taken to be some variant of set theory (Mizar's brand). Subsequently, a different proposal emerged, in which the proof system of each system was formalized in primitive recursive arithmetic, or PRA (Boyer-Moore like). The meta-logic equivalent to PRA would be a predicative type theory, with a theory of syntactical objects at the base (theory of syntactical objects roughly equivalent to a theory of finite tree structures). It would, in effect, be a programming language (perhaps with polymorphism in the type system) as well as a logical theory; this strategy would facilitate the ``bootstrapping" of results about proofs in PRA to procedures allowing one to skip many steps in later proofs in PRA (verification of derived inference rules by reflection). One could write a version of the proof checker in the algorithmic part of the theory.
The workshop participants also disagreed about the amount of ``blow-up" that might occur in translating proofs between existing prover projects. Some participants felt that while an approach working mechanically through the meta-logic (or root logic, as it is misleadingly called in the manifesto) might be bad, a heuristic approach using known common features of particular pairs of theories might make proof translations feasible. Further research in this area is clearly warranted, particularly to distinguish between the theoretical problems of exponential growth and the practical problems of translating huge quantities of proofs.
To achieve this objective will require several important changes. First, common terminology is needed. The numerous projects represented at the workshop have been developed in relative isolation so far. In these isolated projects, people tended to name same things differently and to give the same name to different functions. Second, there is also a need to agree on the name for the area into which QED-like activities fall. This will be the starting point of a QED scientific community. Third, the experience collected by separate teams, no matter how impressive, is still too small to extrapolate into the fully fledged QED. More of such small-scale experience is required.
The QED Workshop also drew the following, related conclusions:
2. Kolata, G. ``Computers Still Can't Do Beautiful Mathematics,'' New York Times, Week in Review Section E, July 14, 1991.
3. The ``QED Manifesto.'' Available by anonymous ftp from Internet site info.mcs.anl.gov in directory pub/qed.
4. Rudnicki, Piotr. ``An Overview of the MIZAR Project,'' preprint, 1994.