Proof-Producing Synthesis of ML from HOL

Paper

This page contains supporting material for the following paper.
Magnus O. Myreen and Scott Owens.
Proof-Producing Synthesis of ML from Higher-Order Logic.
In International Conference on Functional Programming (ICFP), 2012.

Abstract

The higher-order logic found in proof assistants such as Coq and various HOL systems provides a convenient setting for the development and verification of pure functional programs. However, to efficiently run these programs, they must be converted (or "extracted") to functional programs in a programming language such as ML or Haskell. With current techniques, this step, which must be trusted, relates similar looking objects that have very different semantic definitions, such as the set-theoretic model of a logic and the operational semantics of a programming language.

In this paper, we show how to increase the trustworthiness of this step with an automated technique. Given a functional program expressed in higher-order logic, our technique provides the corresponding program for a functional language defined with an operational semantics, and it provides a mechanically checked theorem relating the two. This theorem can then be used to transfer verified properties of the logical function to the program.

We have implemented our technique in the HOL4 theorem prover, translating functions to a core subset of Standard ML, and have applied it to examples including functional data structures, a parser generator, cryptographic algorithms, and a garbage collector.

Implementation and source code

The above paper reports on a method for proof-producing translation from functions in higher-order logic to a pure ML language, which we call MiniML. We have implemented our method in the HOL4 theorem prover. The source code and examples are available as part of the sources for HOL4 on github. In particular, there you can find the MiniML operational semantics, its mechanised meta-theory (including type soundness and big-step/small-step semantics equivalence proofs) and the proof producing translator.

Changes since paper submission

The translation method described in the paper above produces certificate theorems with assumptions listing what closures need to be present in the environment. Since the paper submission, we have revised our implementation in such a way that all these assumptions are collected into a single DeclAssum. A DeclAssum lists what declarations it assumes have been made in the environment.

This DeclAssum-feature allowed us to make many of the examples share preludes that define standard datatypes (e.g. lists, pairs) and functions (e.g. append, fst) that are not primitive in MiniML. We use the following preludes in the case studies listed below.

mini_prelude — defines pair and list datatypes and a few functions for lists
definition in HOL (5 line), generated AST, SML, OCaml, certificate theorems

std_prelude — a collection of library functions from HOL4 sources
definition in HOL (144 lines), generated AST, SML, OCaml, certificate theorems

word_prelude — functions for operations on fixed-width words (used by crypto examples)
definition in HOL (48 lines), generated AST, SML, OCaml, certificate theorems

Case studies

To illustrate how our implementation works, we have applied it to a number of different examples listed below. Our tool synthesises abstract syntax trees for MiniML, but to aid readablity, we also provide concrete syntax for both SML and OCaml (using the simple printer formalised in print_ast.lem). We currently make no formal statements regarding the concrete syntax representation, and some of the OCaml versions do not build due to variables that begin with capital letters.

We applied our tool to both new definitions and definitions that already existed in various examples directories. The following is a list of the existing case studies that we applied our tool to. The fact that we can apply our tool to definitions that were developed before this project started shows that users of our implementation do not need to tailor their definitions to our tool.

McCarthy's 91 function — originally defined in HOL by Slind
definition in HOL (1 line), generated AST, SML, OCaml, certificate theorems

quicksort — defined and verified by Slind
definition in HOL (11 lines), generated AST, SML, OCaml, certificate theorems

a regular expression matching function — defined and verified by Owens
definition in HOL (13 lines), generated AST, SML, OCaml, certificate theorems

Miller-Rabin primality test — defined and verified by Hurd
definition in HOL (59 lines), generated AST, SML, OCaml, certificate theorems

an SLR parser generator — defined and verified by Barthwal
definition in HOL (134 lines), generated AST, SML, OCaml, certificate theorems

AES, RC6 and TEA private key encryption/decryption algorithms — defined and verified by Duan et al.
AES: definition in HOL (144 lines), AST, SML, OCaml, certificate theorems
RC6: definition in HOL (54 lines), AST, SML, OCaml, certificate theorems
TEA: definition in HOL (25 lines), AST, SML, OCaml, certificate theorems

a copying Cheney garbage collector — defined and verified by Myreen
definition in HOL (48 lines), generated AST, SML, OCaml, certificate theorems

We have also applied our implementation to new definitions. Here is a list of the examples from Okasaki's book on purely functional data structures that we have modelled as functions in HOL, verified correctness of and applied our translator to. (At present, the Binary random-access list and the Hood-Melvile queue are not yet verified.)

Batched queue (9 lines) HOL, AST, SML, OCaml, certificate theorems

Bankers queue (15 lines) HOL, AST, SML, OCaml, certificate theorems

Physicists queue (19 lines) HOL, AST, SML, OCaml, certificate theorems

Real-time queue (14 lines) HOL, AST, SML, OCaml, certificate theorems

Implicit queue (23 lines) HOL, AST, SML, OCaml, certificate theorems

Hood-Melville queue (52 lines) HOL, AST, SML, OCaml, certificate theorems

Leftist heap (26 lines) HOL, AST, SML, OCaml, certificate theorems

Pairing heap (22 lines) HOL, AST, SML, OCaml, certificate theorems

Lazy pairing heap (27 lines) HOL, AST, SML, OCaml, certificate theorems

Splay heap (46 lines) HOL, AST, SML, OCaml, certificate theorems

Binomial heap (47 lines) HOL, AST, SML, OCaml, certificate theorems

Unbalanced set (19 lines) HOL, AST, SML, OCaml, certificate theorems

Red-black set (68 lines) HOL, AST, SML, OCaml, certificate theorems

Merge sort (18 lines) HOL, AST, SML, OCaml, certificate theorems

Binary random-access list (45 lines) HOL, AST, SML, OCaml, certificate theorems