Skip to Content.
Sympa Menu

coq-club - Re: [Coq-Club] Proof automation class ideas

coq-club AT inria.fr

Subject: The Coq mailing list

List archive

Re: [Coq-Club] Proof automation class ideas


Chronological Thread 
  • From: Talia Ringer <tringer AT cs.washington.edu>
  • To: coq-club AT inria.fr
  • Subject: Re: [Coq-Club] Proof automation class ideas
  • Date: Sat, 21 Aug 2021 18:18:33 -0500
  • Authentication-results: mail2-smtp-roc.national.inria.fr; spf=None smtp.pra=tringer AT cs.washington.edu; spf=Pass smtp.mailfrom=tringer AT cs.washington.edu; spf=None smtp.helo=postmaster AT mx5.cs.washington.edu
  • Ironport-hdrordr: A9a23:3jeVIaulstqbeOLr5icshYS07skDA9V00zEX/kB9WHVpm6Oj+/xG8M536faLslgssRgb8LjqBEDqexzhHPBOi7UsAQ==
  • Ironport-phdr: A9a23:DT7e0xOzG7vtTUhaykcl6nalDRdPi9zP1u491JMrhvp0f7i5+Ny6ZQqDv60r3QaCANqTwskHotKei7rnV20E7MTJm1E5W7sIaSU4j94LlRcrGs+PBB6zBvfraysnAJYKDwc9rDm0PkdPBcnxeUDZrGGs4j4OABX/Mhd+KvjoFoLIgMm7yfy+94fXbglVizawYbF/JwiqoAvMscUbnZFsIbsrxBvTpXtIdeVWxWd2Kl+Wgh3x+MS+8oN9/ipJo/4u+NJOXqv8f6QjULxXFy8mPHwv5M3qrhbMUw2C7WYBX2oMkxpIBw/F7AzmXpr0ryD3uPZx1DWcMMbrS70/RDas4LpxSBLwhygHOTw2/mHZhMJzkaxVvg6uqgdlzILIeoyYLuZycr/fcN4cWGFPXtxRVytEAo6kcYUAFe4BPedCoILgu1UOrB2+BQayC+P11zRFgWP23bc70+QnHgHJwhYgEskSv3TPttn0Or0dUfu1zKnJyzXDYO9W2TD76ITSbB8uvOyMUKt2fMHMxkYhCxnLgU+MqYz5ITyVzOINvnCH4+RgV++iim0qph1vrjWsx8oglIrEi4wJx13F9yh0zok4KMOlRUN7fNKoDpVeuS+UOoV2Tc4uX25ltikmx7Ebv5OwYSYEyJMixxHFavyHdZCF7Q7jVOaVIDd4nGhqeLaliBqo90iv1PH8W8+p21hJtipIisTAum4O2hDJ9MSLVvhw8l2/1TqV2A3e5flILV4omaffMZIswb49moAOvUnCHyL6gkr7gLGQe0454Oan8f7nba/jppKEN497lAX+MqM2l8y6DuU1MhICUHSA+eim2rzs51b5QLVLjv0wnanZrJfaJcIBqqGnHgBVz54v6xe5Dzi4zNQVhWQLIExGdR6dkYTlJ1LDLOrmAfuljFmhny9nx/XcMb3gBpXNIGLDkLDkfbtl905c0gszzdZQ551KELEMO+78WlTruN3WFBA5KAy0w+fhCNVyyoMeXmSPDrWHP6zPrF+E/vgvLPWUZI8JpDb9LOAo6OLpjX8ggFMSYa2p3YYMZ32jBfRnI0CZYWL2jdsbEGcKuBA+TO3wh1GYXz5TfSX6Y6Vp7TYiTYmiEI3rR4a3gbXH0j3oMIdRYzV6A1SNGD/SdoOLVu1EPD6II8lunyYsXqPnVIY61RCouxP9zfxqIveCqX5Qjo7qyNUgv76brho17zEhV6x1PEmGVCdrl3gIRjk5wKd550Fx1wXauUCdq/dDU8Na/PNIVAgmMpiawuBnWYiasuPpe8fUDl29BMqvGjEwSN0txNlIbkpgSY3Ksw==

I think the story changes a lot when you expand beyond solely Coq. For example, Lean has a "neural" tactic from what I gather that I've seen people use before.

I think it's worth noting that OpenAI (the company that made Codex, the model behind Copilot) has an ongoing project on neural proof synthesis for Lean. I think we will see some use of this in the near future. For that week, I'm likely to invite someone at OpenAI to give a talk and demo the model if he's allowed. (Coq will get left behind from that effort, by the way, if nobody adds Coq proofs for their cross-language benchmarks; if interested in that effort, let me know.)

Language models do really well with enough context, and this is often neglected when people try them out, so they often abandon them early. I recently tried Copilot with Joe Cutler in the middle of CompCert to see what it does, and I think even a naive large language model like Codex with no semantic understanding at all does really well at synthesizing proofs inside of a large project specifically when you are in a repetitive file with many similar, small lemmas and slight variations between them. The syntax often captures many of those variations, and the LLM picks up on them easily. We will post a YouTube video on this Monday.

What excites me most about LLMs for proofs is that, after just a few examples, they learn to replicate *your style*. That makes perfect sense, but it's also incredible---this is something I've failed to do with any other kind of proof automation. Combining this with some semantic understanding, I can see a wonderful world in the future where we can eventually synthesize tactics that both work and capture your style.

Another thing we will see in the future is tools for automatic translation between proof assistants. This will be one of the most exciting things, but again, Coq will be left out unless we contribute to the relevant benchmark suite with Coq versions of all of the proofs.

For Coq, most of the existing neural proof synthesis tools erase too much information from terms and types to be useful. I'm working on improving this with some students right now.

Likewise when it comes to ATP for ITP, I'd think a bit broader than just Z3. Lean for example implements e-graphs, the data structures beneath SMT solvers, with a notion of equality compatible with dependent types (assuming UIP), and from this implements congruence closure. One can view this as a solver for a more interesting logic than some FOL variant. Leo was very convinced it makes more sense to have a different solver for a dependently typed notion of equality, and I agree. The same could, in theory, be done for path equality in cubical rather gracefully, thereby automating transport as a type-directed search, though efficiency is a hard problem to solve. Sledgehammer in Isabelle/HOL is very widely used, and under the hood relies on a lot of ATP techniques.

Fun discussion, and thanks everyone so far!

Talia

On Sat, Aug 21, 2021, 2:16 PM Adam Chlipala <adamc AT csail.mit.edu> wrote:
On 8/18/21 11:24 PM, Talia Ringer wrote:
I'm thinking a bit ahead to the grad seminar I'll teach in the spring (starting in January) which I think I'll have complete freedom over. I really want to do a proof automation course, where one day every week is a deep dive on a single paper, and the other is hacking on the paper artifact. I was thinking this outline:

[...]
8, 9: ATP for ITP
10, 11: neural networks for proofs

I'm really curious on the community's thoughts regarding stable/usable artifacts in these categories.  My collaborators and I actually haven't found any!

There have been research papers on SMT bridges for Coq (heck, I think I built the first one myself as a student), but we find that they all fall over when used for significant program verification.  That is, we hit blocking issues (performance scaling or otherwise) that make the maintainers say "yeah, that would be really hard to fix."

The neural-networks angle has been a popular point of discussion since deep learning became hot, but I don't think I've yet come across someone using such a tool who isn't also one of its primary authors.

You can carefully curate experiences using rickety tools in a course, but it would be nice if students could expect stepping outside the lines to go well.



Archive powered by MHonArc 2.6.19+.

Top of Page