Syzygies 7 days ago

I love Lean 4 as a general purpose programming language, but I can't use it because no current AI coding assistant can reliably work with me in Lean. I can be trying six languages at once like a chess master playing an exhibition simultaneous match, and they'll all flow in Cursor using Claude 3.5 Sonnet, or even Windsurf. Then add Lean 4 and everyone chokes.

On the other hand, Claude 3.5 Sonnet understands my iPad math drawings better than my colleagues. I am admittedly weird, but AI as an association engine of incomprehensible scope likes trippy. Trippy helps its reach. Philosophy never really worked until now when you can run it on a computer. AI really rewards recursive meta; it picks up speed like ice sailing. We're would-be shamans sitting around the fire a million years ago, imagining we can control the dance of the flames. With AI, we can.

Claude also understands, say, computational group theory or strategies for combinatorial enumeration, remarkably well. So it can assist in math research.

Formal proofs in Lean 4 may be hard to read, but the elephant in the room is how hard conventional math papers are to read. We're terrible at expressing ourselves, so reading a paper is the equivalent of reverse-engineering bad computer code to reconstruct thoughts the author failed to express. And it's embarrassing in 2025 that conventional papers can't be automatically checked.

A new genre will emerge in between: In the "Centaur" model Gary Kasparov popularized after he recovered from his chess defeat by IBM's Big Blue, we want papers to actually be very careful, battle-tested prompts for AI. They will pass as conventional papers for mathematicians still trapped in amber, but they're really just prompts for both us and AI.

The documentation for Lean 4 is aimed at humans. If it were designed to be a hybrid "Centaur" prompt doc, perhaps my tools would be better able to help me code in Lean 4.

  • Syzygies 6 days ago

    Deepseek r1 is getting a lot of buzz in the last day or so. As a Chat agent in Cursor, it gets further than I've seen other AI get at writing Lean 4 code.

    The process is acutely painful. It can't stick to directions, and keeps trying to add proof features that it can't get to work, despite my instructions that we're using Lean 4 as a general purpose programming language.

    To my surprise, the code ran faster than either Haskell or Rust on a small case. There could be so many artifacts involved in such a conclusion, but larger cases blew the stack. That's just inexperienced code.

fastneutron 7 days ago

I’m no expert in formal methods, but of all the tools I’ve used, Athena [0] has been one of my favorites to work with. The proofs read much more like what you’re used to seeing in math literature.

0. https://athena-lang.org/

  • zozbot234 6 days ago

    > The proofs read much more like what you’re used to seeing in math literature.

    Yup, that's mostly a factor of using declarative proofs rather than lists of proof "tactics" based on an entirely opaque proof state (that can only be reconstructed and understood by replaying them in the proof system). You'd find these proofs in systems like Mizar or Isar (a declarative framework that's part of Isabelle). Systems like Lean and Coq/Rocq do support structured proofs that can act as a minimal step towards declarativity but are not nearly as readable as actual declarative proofs.

sylware 6 days ago

That makes me think I am forgetting about the relationship between "logic" and "set theory" (starting point of all maths).

  • staunton 6 days ago

    Set theory is the start of most maths right now but it only has been for about a century and this might change. Lean is based on type theory, not set theory. On the other hand, (some sort of) logic seems indispensable.

    • sylware 6 days ago

      I was talking about "formal logic".

      I don't recall how it is built already.

    • pfdietz 6 days ago

      Really? Category theorists would dispute that.

robinzfc 7 days ago

The title is quite misleading. This is a tutorial on reading a Lean verification script so the title should be like "Anatomy of a Lean verification script". As it is it suggests that all formal proofs look like this which they typically don't.

  • seanhunter 6 days ago

    The title seems quite apt in my opinion. The article describes a proof as does the title. It doesn’t claim that all proofs are like this. The title isn’t “The anatome of formal proofs” or “all formal proofs” or anything similar.