Theory Time_Manual

(* Author: Tobias Nipkow *)

theory Time_Manual
imports "HOL-Library.Time_Commands"
begin

section ‹Introduction›

text ‹This manual describes the framework for the automatic definition of step-counting
`running-time' functions from HOL functions. The principles of the translation are described
in Section 1.5, Running Time, of the book
  Functional Data Structures and Algorithms. A Proof Assistant Approach.
  🌐‹https://fdsa-book.net›
To load the framework import \<^theory>‹HOL-Library.Time_Commands›
The framework was implemented by Jonas Stahl.

As a first simple example consider ‹len›, which we define here returning an ‹int›
(to distinguish it from the time functions returning ‹nat›):›

fun len :: "'a list ⇒ int" where
"len [] = 0" |
"len (x#xs) = 1 + len xs"

time_fun len

text‹
Command ‹time_fun› defines a new function ‹T_len› of type ‹'a list ⇒ nat›, the time function
for ‹len› that counts the number of computation steps. The definition is printed by ‹time_fun›:
‹fun T_len :: "'a list ⇒ nat" where
 T_len [] = 1 |
 T_len (x # xs) = T_len xs + 1›
The details of this translation are described in the book referenced above.
This manual is about the use of the time framework.

Command ‹time_fun f› retrieves the definition of ‹f› and defines a corresponding step-counting
running-time function ‹T_f›. For all auxiliary functions used by ‹f›
(excluding constructors and predefined functions (see below)),
running time functions must already have been defined. Example:›

fun aux :: "'a ⇒ 'a" where
"aux x = x"

time_fun aux

fun main :: "bool ⇒ bool" where
"main x = aux x"

time_fun main

text ‹For functions defined by ‹definition›, there is a corresponding ‹time_definition› command.
Example:›

definition gdef :: "'a ⇒ 'a" where "gdef x = x"

time_definition gdef

thm T_gdef.simps

text ‹Note that ‹T_gdef› is defined via ‹fun›, which means that the defining equation is not named
‹T_gdef_def› but ‹T_gdef.simps› and is a simp-rule.

The time functions for many standard functions (in particular on lists) are already
defined in theory ‹HOL-Library.Time_Functions› and basic upper bounds are proved.›


section ‹Termination›

text ‹If the definition of a recursive function requires a manual termination proof,
use ‹time_function› accompanied by a ‹termination› command.›

function sum_to :: "int ⇒ int ⇒ int" where
  "sum_to i j = (if j ≤ i then 0 else i + sum_to (i+1) j)"
by pat_completeness auto
termination
  by (relation "measure (λ(i,j). nat(j - i))") auto

time_function sum_to
termination
  by (relation "measure (λ(i,j). nat(j - i))") auto


section ‹Partial Functions›

text ‹Partial functions can also be `timed'.›

(* Partial functions defined with ‹function› can currently not be timed:
function pos1 :: "int ⇒ bool" where
"pos1 i = (if i = 1 then True else pos1 (i-1))"
by auto

time_function pos1

function T_pos1 :: "int ⇒ nat" where
"T_pos1 i = (if i = 1 then 1 else T_pos1 (i-1) + 1)"
by auto
*)

partial_function (tailrec) positive :: "int ⇒ bool" where
"positive i = (if i = 1 then True else positive (i-1))"

time_partial_function positive

text ‹The difference is that ‹T_positive› has return type ‹nat option› because ‹positive›
may not terminate.

Timing a function defined with ‹partial_function (option)› is trickier and we do not go into it here.›


section ‹Higher-Order Functions›

text ‹A large subclass of higher-order functions are supported, covering ‹map›, ‹filter› and other
standard functions. For example,›

time_fun map

text ‹defines a time function ‹T_map :: ('a ⇒ nat) ⇒ 'a list ⇒ nat›.
The first argument (called ‹T_f› below) is the time function for the first argument ‹f› of ‹map›.
We ignore the definition of ‹T_map› because the output of ‹time_fun map›
suggests that you should add these lemmas›

lemma T_map_simps [simp,code]:
  "T_map T_f [] = 1"
  "T_map T_f (x # xs) = T_f x + T_map T_f xs + 1"
by (simp_all add: T_map_def)

text‹which are what you would expect as defining equations.
You can click on the suggestion to have it copied into your theory.
Afterwards, you can work with ‹T_map› as if it were defined via those equations.

In general, things are a bit more complicated, which is why ‹T_map› is defined the way it is.
Consider›

fun foldl :: "('b ⇒ 'a ⇒ 'b) ⇒ 'b ⇒ 'a list ⇒ 'b" where
"foldl f a [] = a" |
"foldl f a (x # xs) = foldl f (f a x) xs"

time_fun foldl

text‹This definition is generated:

‹fun T_foldl :: ('b ⇒ 'a ⇒ 'b) × ('b ⇒ 'a ⇒ nat) ⇒ 'b ⇒ 'a list ⇒ nat where
 T_foldl (f,T_f) a [] = 1 |
 T_foldl (f,T_f) a (x # xs) = T_f a x + T_foldl f (f a x) xs + 1›

The meaning of the pair ‹(f, T_f)› is obvious. The difference to ‹T_map› is
that ‹T_foldl› needs not just ‹T_f› (like ‹T_map›) but also ‹f›. Function ‹T_map› does not need ‹f›:
in the recursion equation ‹map f (x # xs) = f x # map f xs› the result of subterm ‹f x›
is irrelevant for the computation of ‹T_map› because the running time of ‹(#)› is constant.
This is in contrast to ‹foldl›, whose running time may depend on its second argument.

All higher-order functions are translated like ‹foldl›, but if the first element in ‹(f,T_f)›
is unused, a simplified definition is derived. This is the case for ‹T_map›.

In case you wonder how it is ensured that ‹T_foldl› is always passed a corresponding pair
of a function and its timing function: this is the responsibility of the time framework
when translating functions that use ‹foldl›. Example:›

definition inc :: "int ⇒ 'a ⇒ int" where "inc i x = i+1"
definition "len2 xs = foldl inc 0 xs"
time_definition inc
time_definition len2

text ‹In the defining equation ‹T_len2 xs = T_foldl (inc, T_inc) 0 xs›
we find the correct pair ‹(inc, T_inc)›.›


subsection ‹Limitations›

text‹Partial application and lambda-abstraction are currently not supported.
They need to be replaced by additional function definitions, if possible. For example,›

definition fHO :: "bool list ⇒ bool list" where ‹fHO = map (λx. x ∧ x)›

text ‹is not acceptable (i.e. ‹time_definition fHO› fails), but can be replaced with›

definition double :: "int ⇒ int" where ‹double i = 2 * i›
definition fHO' :: "int list ⇒ int list" where ‹fHO' xs = map double xs›

time_definition double
time_definition fHO'

text ‹That is why in the definition of ‹len2› above we could not just write
‹foldl (λi x. i+1) 0 xs›.›


section ‹Predefined Functions›

text‹The time framework requires executable functions. However,
many basic types and functions are not defined via ‹datatype› and ‹fun›
but in an abstract mathematical fashion and are not executable,
i.e. the time framework does not apply (or gives the `wrong' result).

In order to model actual hardware that executes these predefined functions in constant time,
there is a command for axiomatically declaring that some function takes 0 time.
(This is how we model constant time, to simplify the resulting time expressions.
This does not change the asymptotic running time of user-defined functions using the
predefined functions because 1 is added for every user-defined function call.)
Theory \<^theory>‹HOL-Library.Time_Commands› declares a number of predefined functions
as 0-time functions. This includes
‹(+)›, ‹-›, ‹(*)›, ‹/›, ‹div›, ‹min›, ‹max›, ‹<›, ‹≤›, ‹¬›, ‹∧›, ‹∨› and ‹=›
and can be extended with the command ‹time_fun_0›.
This feature has to be used with care:

▪ Many of these functions are polymorphic and reside in type classes.
The constant-time assumption is justified only for those types where the hardware offers
suitable support, e.g. numeric types. The argument size is implicitly bounded, too.

▪ The constant-time assumption for ‹(=)› is justified for recursive data types such as lists and trees
as long as the comparison is of the form ‹t = c› where ‹c› is a constant term, for example ‹xs = []›.

Users of the time framework need to ensure that 0-time functions are used only within these bounds.›


section ‹Locales›

text ‹If we want to apply the time framework to a function ‹g› defined within a locale,
we need to add additional locale parameters ‹T_f :: τ ⇒ nat› for every locale
parameter ‹f :: τ ⇒ τ'› used in the definition of ‹g›.

In the following example we do not only parameterize the locale with ‹T_f›
but also assume a property of ‹T_f›. As a result we can prove a property of ‹T_g›
inside the locale:›

lemma T_map_sum: "T_map T_f xs = sum_list (map T_f xs) + length xs + 1"
by(induction xs) (auto)

locale LT =
fixes f :: "'a ⇒ 'a"
and T_f :: "'a ⇒ nat"
assumes T_f: "T_f x ≤ 1"
begin

definition g where "g xs = map f xs"

time_definition g

lemma sum_list_map_T_f_ub: "sum_list (map T_f xs) ≤ length xs"
proof(induction xs)
  case (Cons a xs) thus ?case using T_f[of a] by auto
qed simp

lemma T_g_ub: "T_g xs ≤ 2 * length xs + 1"
unfolding T_g.simps T_map_sum using sum_list_map_T_f_ub[of xs]
by linarith

end

text ‹Of course now you need to prove ‹T_f x ≤ 1› for every interpretation of the locale.
A more flexible approach is not to constrain ‹T_f› inside the locale. It may then be difficult
to derive a generic time bound for ‹T_g› inside the locale (in the above example it would not be
difficult). If that is the case, one may also derive a bound for ‹T_g› conditional
on some specific bound for ‹T_f›. Or one can derive the bound for ‹T_g›
after a specific interpretation with a specific ‹T_f›.
For a larger realistic example of the latter approach see theory
‹HOL-Data_Structures.Time_Locale_Example›.›


section ‹Fine Points›

text ‹Time functions for mutually recursive functions ‹f, g, …›: ‹time_fun f g …›.›

text ‹If you want to generate time functions not from the defining equations of a function
but from lemmas proved as equations, you can provide those lemmas explicitly. Example:›

fun f0 :: "nat ⇒ nat" where
"f0 0 = 0" |
"f0 (Suc n) = f0 n"

lemma f0_eq: "f0 n = 0"
by (induction n) auto

time_fun f0 equations f0_eq

text ‹The ‹T_› prefix can be changed by modifying the ‹time_prefix› attribute. Example:›

declare [[time_prefix = "t_"]]

text ‹The time framework is not verified
(which is why the framework always prints out what it defines).
There is no underlying formal model. This remains future work.›

end