Minutes of Meetings of the Algebra of Programming Research Group in 2000
-
Michaelmas Term 2000 (all Fridays)
-
-
Summer Break 2000 (all Fridays)
-
-
Trinity Term 2000 (all Fridays)
-
-
Hilary Term 2000 (all Fridays)
-
-
Friday, 1st December 2000
-
Ross proposed a syntactic sugar for writing arrows, a generalisation
of monads, in Haskell. He extended the grammar of expressions with a
proc-statement containing a pattern and a command, to be
interpreted as an arrow. He then gave a do-notation for commands,
and used it to write a deterministic parser. Finally, he mentioned two
other applications of arrows: synchronous circuits and homogeneous
functions. His slides are available
here.
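The notation can be approximated in plain Haskell with the standard Arrow combinators. A minimal sketch, using the classic addA example rather than anything from Ross's slides:

```haskell
import Control.Arrow

-- the command  proc x -> do { y <- f -< x; z <- g -< x; returnA -< y + z }
-- desugars into combinators roughly like this:
addA :: Arrow arr => arr b Int -> arr b Int -> arr b Int
addA f g = (f &&& g) >>> arr (uncurry (+))

-- ordinary functions form an arrow, so the sketch can be run directly
example :: Int -> Int
example = addA (arr (* 2)) (arr (+ 1))
```

Here `example 3` splits the input into the pair (6, 4) and sums it, giving 10.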
-
Friday, 24th November 2000
-
Shin talked once again about how to use the converse function theorem
to derive a program that builds a tree from a list of leaf
depths. Oege thought of a different function for him to invert this
time; it used an accumulating parameter. Like last time, Richard
thought Shin had to consider an optimisation problem to build a best
forest. Then Shin introduced a predicate to filter out certain kinds
of forest and Jeremy suggested that the problem fitted a scheme of
generate and test.
Silvija mentioned that Google used inverted files rather than the
expected suffix trees; Jeremy speculated how this might work, and Richard
explained why he was interested.
-
Friday, 17th November 2000
-
Jeremy extended the theorem he had proven last week to unfolds that
create infinite data structures. Ian talked briefly about how to
implement fans, which are non-deterministic, in a functional language,
and mentioned Chris Okasaki's random bush generator. Then Jeremy
introduced Varmo Vene's thesis. He defined and gave example uses of
paramorphisms (namely, dropWhile) and histomorphisms (tabulation),
both of which are variants of catamorphisms. Then he
did the same for their duals, apomorphisms (append)
and futumorphisms (turning a character string into Morse code).
He finished by using monads to generalise these new combinators.
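Specialised to lists, the combinators Jeremy described can be sketched as follows (a hedged reconstruction of Vene's definitions, not Jeremy's board work):

```haskell
-- paramorphism: the algebra sees the tail as well as the recursive result
para :: (a -> ([a], b) -> b) -> b -> [a] -> b
para f e []       = e
para f e (x : xs) = f x (xs, para f e xs)

-- Vene's example: dropWhile as a paramorphism
dropWhile' :: (a -> Bool) -> [a] -> [a]
dropWhile' p = para (\x (xs, ys) -> if p x then ys else x : xs) []

-- apomorphism (the dual): the coalgebra may stop, continue,
-- or finish early by emitting a whole remaining list
apo :: (b -> Maybe (a, Either [a] b)) -> b -> [a]
apo g b = case g b of
  Nothing            -> []
  Just (a, Left xs)  -> a : xs
  Just (a, Right b') -> a : apo g b'

-- Vene's example: append as an apomorphism
append :: [a] -> [a] -> [a]
append xs ys = apo step xs
  where
    step []        = case ys of
                       []       -> Nothing
                       (z : zs) -> Just (z, Left zs)
    step (x : xs') = Just (x, Right xs')
```

The paramorphism keeps the unconsumed tail available, which is exactly what dropWhile needs; the apomorphism's `Left` case lets append stop recursing and splice in the second list wholesale.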
-
Friday, 10th November 2000
-
Richard gave a datatype for tries and described a function mktrie that
given a collection of strings, returned a trie storing exactly those
strings. He asked how mktrie should be specified, commenting that
if the collection was a set rather than a list then he could not
apply the converse function theorem.
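A minimal version of such a trie, with mktrie built from repeated insertion, might look like this (a sketch; Richard's actual datatype and specification may differ):

```haskell
-- a trie storing a set of strings: a flag marking end-of-word
-- plus character-labelled branches
data Trie = Trie Bool [(Char, Trie)] deriving Show

empty :: Trie
empty = Trie False []

insert :: String -> Trie -> Trie
insert []       (Trie _ ts) = Trie True ts
insert (c : cs) (Trie e ts) = Trie e (go ts)
  where
    go [] = [(c, insert cs empty)]
    go ((c', t) : rest)
      | c == c'   = (c', insert cs t) : rest
      | otherwise = (c', t) : go rest

member :: String -> Trie -> Bool
member []       (Trie e _)  = e
member (c : cs) (Trie _ ts) = maybe False (member cs) (lookup c ts)

mktrie :: [String] -> Trie
mktrie = foldr insert empty
```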
Oege wrote up a program to find the length of the longest common
subsequence of two lists. Then he described his attempts to discover
and improve the program's space complexity in Haskell under lazy
evaluation.
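The standard quadratic-time, linear-space version of that program (a sketch of the textbook algorithm, not necessarily Oege's code) keeps only one row of the dynamic-programming table:

```haskell
-- length of the longest common subsequence: each list element
-- produces a new table row from the previous one
lcs :: Eq a => [a] -> [a] -> Int
lcs xs ys = last (foldl row (replicate (length ys + 1) 0) xs)
  where
    row prev x = scanl step 0 (zip3 prev (tail prev) ys)
      where
        -- nw = entry diagonally above-left, n = entry above,
        -- cur = entry to the left in the row being built
        step cur (nw, n, y)
          | x == y    = nw + 1
          | otherwise = max n cur
```

Under lazy evaluation the space behaviour of even this version is subtle, which was the point of Oege's investigation.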
Then, prompted by Richard, Jeremy tried to prove a theorem about folds
and unfolds.
Finally, Shin talked again about how to derive a shift-reduce parser
from the grammar's print function using the converse function
theorem. In particular, he wanted to know how to put in a lookahead.
-
Friday, 3rd November 2000
-
Oege proposed a notion of minimum edit distance on parse trees. It could be
used as part of a version control system to allow several people to
edit the same program file at once. The algorithm normally used
represents programs as strings instead.
Then Richard talked about the search engine of Google, which appears to
work by consulting a trie. He noted that this structure uses space
very efficiently in Haskell, because lazy evaluation
enforces optimal node sharing.
Finally, Ian talked once more about the category of lax natural
transformations between endorelators. Oege said it was bound to have
initial algebras, subject to a quick check by Clare. He also suggested
that nested relators should be defined not by generalising nested
functors, but independently so that the corresponding functors can
then be shown to be specialisations. This should be done by comparing the
folds formed in the functor and relator categories.
-
Friday, 27th October 2000
-
Ian explained the first-order semantics for regular datatypes and the
higher-order semantics for nested datatypes. In the former, it is easy
to show that functors for inductive datatypes commute with converse,
and therefore extend to relators, because maps can be written as
folds. This is not true in the higher-order semantics however,
where the only definition of map is the naturality condition
of the initial algebra. Oege said that although initial algebras
existed in the functor category, it wasn't clear that this result
carried over to the relator category.
Shin had previously used the converse function theorem to derive a
program to unflatten a list into a tree. Now, he suggested that
LR-parsing for an arbitrary CFG could be derived using the print
function for the grammar's parse tree. Oege suggested a special
case to look at first.
-
Friday, 20th October 2000
-
Yorck explained that a function is dependently typed if its type
depends on the values of its inputs. He wrote in Cayenne, a language
which supports such functions, an operation without tags that either
flipped a boolean or negated an integer, depending on the value of a
bivalent input. Its type signature featured a function that examined this
input and accordingly returned either the type of booleans or the type
of integers. Then he
defined polytypic summation using the same technique. Finally, he wrote a
zipWith of arbitrary arity using a function that given a number,
returned a type of lists of that length. The group argued about
whether zipWith was in fact transposition, and Richard mentioned a current
paper that used zipWith as an example to argue that one can always do
without dependent types.
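In today's Haskell the length-indexed part of that example can be approximated with GADTs; the following sketch conveys the idea but is not the Cayenne code:

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Nat = Z | S Nat

-- vectors indexed by their length, standing in for Yorck's
-- function from a number to a type of lists of that length
data Vec (n :: Nat) a where
  Nil  :: Vec 'Z a
  Cons :: a -> Vec n a -> Vec ('S n) a

-- zipWith on equal-length vectors: the lengths agree by type,
-- so no case for mismatched lengths is needed
vzip :: (a -> b -> c) -> Vec n a -> Vec n b -> Vec n c
vzip _ Nil         Nil         = Nil
vzip f (Cons x xs) (Cons y ys) = Cons (f x y) (vzip f xs ys)

toList :: Vec n a -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs
```

This captures the fixed-arity, fixed-length case; the arbitrary-arity zipWith of the talk needs the full dependent typing that Cayenne provides.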
-
Friday, 6th October 2000
-
Ian defined generalised, simple (a.k.a. ungeneralised) and efficient folds
on linear nested datatypes. Jeremy and Clare noted that it was only the
type system of Haskell that made it impossible to write the identity
function as an efficient fold.
Ian wanted to know when an efficient fold could be written as a simple
fold. The problem was expected to, but did not, meet the conditions for
map-fusion on generalised folds so Jeremy suggested using the
universal property of folds instead. The law thus derived was
used to rewrite both embed and flatten as simple folds.
-
Friday, 29th September 2000
-
Roland combined the shortest path and tallest lorry problems, both of
which can be solved with Warshall's algorithm, into a third which could
not be solved without adapting it. Since the principle of optimality no longer held, he
needed to maintain extra information in the nodes so as to restore
it. Jeremy compared the problem with that of data mining bank balances
and Roland finished by abstracting it to recursive sets of equations.
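The relationship between the two original problems can be seen by abstracting Warshall's algorithm over the two operations involved: shortest paths use (min, +) and the tallest lorry (max, min). A sketch (on adjacency matrices, not Roland's equational formulation):

```haskell
-- Warshall-style closure parameterised by "plus" (combine alternative
-- paths) and "times" (extend a path through an intermediate node)
warshall :: (c -> c -> c) -> (c -> c -> c) -> [[c]] -> [[c]]
warshall plus times d0 = foldl step d0 [0 .. n - 1]
  where
    n = length d0
    step d k =
      [ [ (d !! i !! j) `plus` ((d !! i !! k) `times` (d !! k !! j))
        | j <- [0 .. n - 1] ]
      | i <- [0 .. n - 1] ]
```

For the combined problem the principle of optimality fails because the two cost components interact, which is why Roland needed the extra per-node information.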
Motivated by Shin, Jeremy had considered the question of
when a function can be written
as an unfold. He replaced the sufficient condition of surjectivity
with an exact equivalence (iff) condition and proved the
result. Roland tried to reprove a needed lemma relationally, but this
and the dualisation to unfolds were hard.
Shin wanted to use the converse function theorem to improve on Okasaki's
rather informal derivation of an algorithm to label a tree breadth-first with
the elements of a list. The underlying definition of breadth-first
traversal could be queue-based or of the form "concat after
levels". Richard said he should certainly use the former instead of
the latter, and concluded by mentioning two similar applications
of the theorem.
-
Friday, 15th September 2000
-
Ian wanted a better name for the generic operation he had been calling
jfold. It is equal to a generalised fold, defined using maps, after
another map, but contains no maps itself and has a simple Rank-1 type
signature. Jeremy wrote a function which folded power trees in a
jfold-like manner, but Clare said it was too ad-hoc and needed
two more parameters to reflect the generic form. It then became clear
that Jeremy's function was only iterative because it consumed a
tail-recursive datatype, and that his alternative name of iterative
folds was misleading. He therefore suggested that jfolds be called
efficient folds, or efolds for short, because they had no maps, and
the group agreed.
-
Friday, 8th September 2000
-
The group agreed to keep its present membership (for RAE purposes),
name, and meeting time (11 a.m. on Fridays until lunch but only rarely
after).
Jeremy talked about how universal quantifiers can simulate
existential ones in Haskell datatype declarations. Richard
mentioned that he could simplify his own derivation of a thinning
algorithm in the maximal marking problems paper by not considering the
general case. Then he criticised a proof of a lower bound for the
running time of Dijkstra's shortest path algorithm.
Finally, Richard presented the approximate string matching problem,
which considers three different kinds of mismatch (insertion,
deletion and exchange). He built up a mismatch table for an example
and then introduced the simpler problem where deletion is the only
kind of mismatch.
-
Friday, 11th August 2000
-
Jeremy summarised Roland's method of programming at the type level
using the Church encoding of natural numbers, booleans, pairs and so
on. The implemented datatype of even-length lists could be changed
to generate prime-length lists instead by replacing the evenness test
with a primality test. However, this test would have to use a
subtraction function for Church numerals and the group could not get
such a function past the type checker, despite over an hour of
trying. They wished to avoid Shin's solution of putting a forall
within the definition of the Church numerals type because polymorphism
on the type level is not allowed.
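At the value level, where higher-rank polymorphism is available, the missing subtraction goes through by iterating the classic pairing-trick predecessor; the type-level analogue of this is exactly what defeated the type checker. A sketch:

```haskell
{-# LANGUAGE RankNTypes #-}

newtype Church = Church (forall a. (a -> a) -> a -> a)

zero :: Church
zero = Church (\_ z -> z)

suc :: Church -> Church
suc (Church n) = Church (\s z -> s (n s z))

one, two, three :: Church
one   = suc zero
two   = suc one
three = suc two

toInt :: Church -> Int
toInt (Church n) = n (+ 1) 0

-- predecessor via the pairing trick (pred zero = zero)
pre :: Church -> Church
pre (Church n) = Church (\s z -> n (\g h -> h (g s)) (const z) id)

-- truncated subtraction: iterate the predecessor
sub :: Church -> Church -> Church
sub m (Church n) = n pre m
```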
-
Friday, 28th July 2000
-
Shin described the problem of building a tree from a list of leaf
depths. He wished to derive Geraint's stack-based algorithm, which he
wrote up as a fold. (He also briefly mentioned Tarjan's alternative).
The specification was the converse of flatten after depths, so the
group tried to apply the converse function theorem, after first
reproving it for familiarity's sake. Then, the group considered
building smallest possible (under some ordering) forests; this
generalisation was an optimisation problem. However, Richard
believed that this was all too much work to arrive at such a
simple algorithm.
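The stack-based idea can be sketched directly; this is an illustration of the kind of algorithm discussed, not Geraint's fold or its derivation:

```haskell
data Tree = Leaf | Node Tree Tree deriving (Eq, Show)

-- left-to-right depths of the leaves (the root is at depth 0)
depths :: Tree -> [Int]
depths = go 0
  where
    go d Leaf       = [d]
    go d (Node l r) = go (d + 1) l ++ go (d + 1) r

-- rebuild the tree: push each leaf with its depth, and whenever the
-- two topmost stack entries have equal depth, merge them into a node
build :: [Int] -> Tree
build ds = case foldl step [] ds of
             [(0, t)] -> t
             _        -> error "not the depth list of any tree"
  where
    step stack d = reduce ((d, Leaf) : stack)
    reduce ((d2, t2) : (d1, t1) : rest)
      | d1 == d2 = reduce ((d1 - 1, Node t1 t2) : rest)
    reduce s = s
```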
Finally, Shin asked a question about proving polytypic properties,
Jeremy mentioned that he couldn't get subtraction to work for Church
numerals in Haskell, and Richard posed a problem considered by Paige for
Geraint to solve.
-
Friday, 21st July 2000
-
Roland presented some research that he and Kevin had just done into the
expressivity of nested datatypes. A datatype can be seen as a
non-deterministic selection from an inductively defined collection
of Church numerals, each representing a possible size for an instance
of the datatype. Addition, multiplication, products, coproducts and
predicates could all then be encoded, according to standard computability
theory. Roland developed datatypes for matrices, square matrices (by
two different methods), and triangles, but he didn't know if he could
do streams. The group constructed a predicate to sieve out lists of
even length, in the hope that they could adapt it to composite lengths
in order to construct a datatype of lists with prime length. Roland's
technique appeared to be more widely applicable than a comparable one
proposed by Ralf Hinze.
-
Friday, 14th July 2000
-
Roland explained Dijkstra's shortest path algorithm and Knuth's
generalisation of the problem. It is to minimise any suitably
restricted function on sentences of an arbitrary grammar. This
seemed to be an optimisation problem, since the function can
always be written as a fold on a datatype representing the grammar.
The obvious datatype to use is mutually recursive (and therefore
not covered by standard theory), but Richard suggested that Roland
should use a rose tree instead. Finally, Richard presented Lambert
Meertens' distributed intruder problem, and an edge-labelling graph
problem promoted by Ralf Hinze.
-
Friday, 30th June 2000
-
The group debated a few questions raised by Ian in a recent email:
how to write height generically, how to write reg for alternating
lists given that Swap doesn't have membership, and how to informally
justify Pedro's embedding. Pedro intended to write the constraints
that nested datatypes can capture as a general recursion scheme
but Richard thought it too general and Ian believed it to be
equivalent to his unsuccessful work on recurrence relations.
-
Friday, 23rd June 2000
-
Pedro presented an alternative to Okasaki's embedding which has the
convenient property that iterative and non-iterative nests are mapped
to the same regular datatype of internally-labelled binary trees.
He explained how to construct as a fold the function reg which
takes nests to trees. However, his method only worked with linear
datatypes. Most constraints captured by nested datatypes are on
height, but De Bruijn terms and Braun trees are exceptions.
There is no polytypic definition of height in the literature.
Lastly, Richard presented his proof that every recursive regular
datatype is isomorphic to an iterative datatype.
-
Friday, 16th June 2000
-
Richard showed the group how to turn any regular datatype into an
isomorphic iterative datatype. However, there wasn't any clear benefit
in doing so, and he required that part of the base functor be linear.
Ian described how to label and scan nests and bushes, but he couldn't
write the scan for bushes as a fold. He had used a higher-order datatype
as his labelled variant, in contrast to the approach taken by Richard
in his pattern-matching slides. After lunch, Ian repeated his
explanation for Richard. Then Richard led a discussion on the terms
used to classify nested datatypes.
-
Friday, 9th June 2000
-
Shin described the cultural differences between functional and logic
programming. In the latter, efficiency concerns and program
transformation are considered relatively unimportant. He wrote an
Algebra of Programming dynamic programming rule in the language
of Prolog. The most obvious meaning of refinement was logical
implication. However, Silvija wanted an alternative meaning which
would allow programs and their refinements to return different but
equally good optimal solutions. The group decided though that this
alternative relationship was impracticable.
-
Friday, 2nd June 2000
-
We can express most non-polynomial datatypes as least fixed points of
polynomial functors. We can't do this for rose trees and rectangles as
lists of lists, but we can for their isomorphic equivalents; Richard
demonstrated how to find them. If we naively solve the problem of
contiguously marking rose trees, we need to form exponentially-sized
cross products; we can get round this by using (isomorphically
equivalent) binary trees instead. Shin said that the thinning
algorithm for solving the problem can be replaced by several greedy
algorithms.
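For rose trees, the isomorphic equivalent is the familiar first-child/next-sibling encoding of a forest as a binary tree, a fixed point of a polynomial functor. A sketch (not necessarily the isomorphism Richard presented):

```haskell
data Rose a = Rose a [Rose a]           deriving (Eq, Show)
data Bin a  = Tip | Bin a (Bin a) (Bin a) deriving (Eq, Show)

-- a forest of rose trees as a binary tree:
-- left subtree = first child, right subtree = next sibling
toBin :: [Rose a] -> Bin a
toBin []                 = Tip
toBin (Rose a cs : sibs) = Bin a (toBin cs) (toBin sibs)

fromBin :: Bin a -> [Rose a]
fromBin Tip         = []
fromBin (Bin a l r) = Rose a (fromBin l) : fromBin r
```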
-
Friday, 26th May 2000
-
Clare wrote up a nested datatype for leftist left-perfect trees. She
wanted to write insertion for it as a generalised fold. Jeremy said
that one would expect to use a generalised unfold instead, by analogy
with regular trees. Ian suggested she use Okasaki's insertion for
random-access lists as a template. The group then debated in what sense
those two datatypes were isomorphic.
-
Friday, 19th May 2000
-
Ian had solved last week's problem of turning any nested function
written as a jfold into its regular counterpart. However, his
reasoning was circular because his reg was defined in terms of
jfold; an independent characterisation of reg was needed. He could specify
reg in terms of shape and contents properties but he could not prove that
his definition obeyed them. The group disagreed about how to define
jfold, with Richard saying that the presentation would still be
circular even if reg's definition was rephrased in terms of gfold.
-
Friday, 12th May 2000
-
Richard wrote up the standard function for reversing a list using an
accumulating parameter. He derived from it a pointer algorithm for
in-place list reversal; his specification used an unfold to
generate lists of addresses. Ian wanted to turn nested programs
written as jfolds into regular programs written as folds. He thought
he could apply jfold fusion to the refinement relation linking the
two programs and thereby get simple conditions relating the fold's
and the jfold's arguments. The conditions obtained were instead very
complex. The group pointed out that he needed an independent definition
of the embedding function reg.
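The standard accumulating-parameter reversal Richard started from is:

```haskell
-- reverse with an accumulating parameter: each cons cell is moved
-- onto the accumulator, so the whole list is reversed in linear time
reverse' :: [a] -> [a]
reverse' xs = rev xs []
  where
    rev []       acc = acc
    rev (y : ys) acc = rev ys (y : acc)
```

The derived pointer algorithm performs the same move in place, redirecting each cell's link to the previously visited cell.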
-
Friday, 5th May 2000
-
The group discussed its strategy for the next few years so that Jeremy
could produce a written statement about it. Ian introduced the problem
of turning an arbitrary nested function f_N into its regular
equivalent f_R. If f_N is an iterative fold with regular target then
f_N and f_R are linked by a simple law. Ian suggested that the only
other case worth considering was where f_N is a constructor function
since all other functions could be built from it and the regular
functions. Jeremy showed that both the constructor function and the
identity function could be written as an iterative fold.
-
Friday, 10th March 2000
-
Jeremy stated and proved the fusion law for his bottom-up fold on
Ian's nested type from last week. He also proved the scan lemma for
the scan associated with the fold. This had been assumed correct in the
derivation of last week's pattern-matching algorithm. Both proofs
were inductive and therefore messy because no formal semantics has yet
been developed for the fold. The group then tried and failed to
work out the naturality condition (equivalent to Wadler's "theorem for
free") for list concatenation.
-
Friday, 3rd March 2000
-
Jeremy explained how he had used an iterative datatype designed by Ian to
rewrite Richard's development of a pattern-matching algorithm for
nests. He focussed on the encoding of suffixes, which he memoised.
His functions used a previously unseen fold operator which captured
bottom-up evaluation on the underlying regular datatype. The group
debated whether the new fold was related to the canonical one. Richard
then talked briefly about how to express monotonicity in Oege's
pointwise relational language.
-
Friday, 25th February 2000
-
Ian had invented a nested type that was preferable to Nest for pattern
matching on perfect trees. However, he couldn't write suffixes for
it. The group thought this was because he was using the wrong labelled
variant. They discovered many alternatives by drawing pictures and
manipulating functor equations, but no one candidate seemed the best.
-
Friday, 11th February 2000
-
Richard had written a pattern matching function for perfect internally
labelled trees, using a nested type to enforce the constraint of
perfection. Ian suggested changing the nested type to make the function
more efficient.
The week before, Oege had introduced a pointwise relational calculus. This
week he wanted to define a refinement ordering for it; subset inclusion is
not good enough. He used some desirable properties of the ordering to
derive it. The case of function spaces was particularly tricky.
-
Friday, 4th February 2000
-
Richard suggested that Jens should extend his pattern matching
algorithm from strings to trees. Richard then talked briefly about
pattern matching on perfect trees. Shin wrote up a program that chose
n elements from a list in all possible ways and filtered the
solutions. He then used an accumulating parameter to derive an
efficient bottom-up version of the same program. Jeremy wondered how
to write the iterative version of the gcd algorithm using folds and
unfolds. Next week's meeting is at 11 a.m.
-
Friday, 28th January 2000
-
Oege wanted to calculate relational programs pointwise; pointfree is
the norm. His motivating example used relational converse to specify a
function that builds forests. Oege explained the semantics of a choice
operator he wanted to use and then he talked about the equivalents in
category Rel of curry and uncurry in category Fun. The group did not
meet after lunch.
-
Friday, 21st January 2000
-
Before lunch, Sharon presented a greedy solution to the marbles
problem: given n marbles, collect as many sets of k
differently-coloured marbles as possible. The program was obscure, so Jeremy
started to write a clearer version, which he later e-mailed to the
group.
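One plausible greedy strategy for the problem, offered purely as an illustration (Sharon's algorithm and Jeremy's clearer version may well differ), is to take one marble from each of the k most plentiful colours repeatedly:

```haskell
import Data.List (sortBy)
import Data.Ord (Down (..), comparing)

-- counts of marbles per colour; count the sets of k distinct
-- colours that a greedy strategy can collect (assumes k >= 1)
sets :: Int -> [Int] -> Int
sets k counts
  | length (filter (> 0) counts) < k = 0
  | otherwise = 1 + sets k (dec (sortBy (comparing Down) counts))
  where
    dec cs = map (subtract 1) (take k cs) ++ drop k cs
```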
Jens explained that finding common arithmetic subexpressions was
equivalent to calculating the dominator tree for a flow graph. Since
the latter problem is well documented, it is a pity that the
equivalence fails to hold when conditions (which may be non-strict)
are introduced.
After lunch, Shin tried to prove that perm . nodup = nodup . perm,
where perm permutes a list and nodup is the predicate (modelled as a
coreflexive) that returns true when the input list has no repetitions.
Sadly, this couldn't be done without inventing axioms that were no
more obviously true than the result itself.
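Although the proof eluded the group, the equation can at least be checked pointwise on small lists by modelling the coreflexive in the list monad (a testing sketch, not the relational proof that was sought):

```haskell
import Data.List (nub, permutations, sort)

-- nodup as a coreflexive: the partial identity on duplicate-free lists
nodupC :: Eq a => [a] -> [[a]]
nodupC xs = [xs | nub xs == xs]

-- both sides of  perm . nodup = nodup . perm,  composed in the list monad
lhs, rhs :: Eq a => [a] -> [[a]]
lhs xs = nodupC xs >>= permutations
rhs xs = permutations xs >>= nodupC

-- compare the two result sets, ignoring order
agree :: Ord a => [a] -> Bool
agree xs = sort (lhs xs) == sort (rhs xs)
```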
-
Friday, 14th January 2000
-
Before lunch, Richard talked about the thinning theorem for
optimisation problems of the form "min after fold". If the fold had
non-deterministic (i.e. relational) arguments, he showed that they
could be made functional, yielding an implementation.
This result was applied to the problem of building minimum cost trees.
Richard finished by giving an ordering predicate on which to thin.
Oege presented his write-up (and expansion) of last week's talk about
how to turn d.a.g.s into let-expressions.
After lunch, Richard discovered that his ordering didn't satisfy the
thinning theorem's monotonicity condition.
-
Friday, 7th January 2000
-
The group discussed conference deadlines, confirmed the regular
meeting time as 11.30 a.m., and made a list of topics for
discussion in Hilary term:
- building if-then-else expressions (Oege)
- topics relating to Ian's thesis
- building trees of minimum height (Richard)
- semantics of higher-order nested datatypes (Clare)
- pattern matching
- data mining (Jeremy)
- pointwise relational programming (Jeremy, Oege, and Richard)
Oege explained that d.a.g.s could be used to represent if-then-else
expressions. He wanted to transform the d.a.g.s into programs
in such a way that boolean tests are never performed needlessly.
The algorithm he has designed to do this didn't work when implemented.
Richard talked about building trees of minimum height again.
Jeremy described the real-life problem of mining data held on
customer accounts by banks, who wanted to predict from the account
balance of their customers whether they would want a loan.
After lunch, Ian tried to formally prove something that was assumed to
be true by construction, namely that generalised folds are natural
transformations. It would follow immediately that his candidate zips
for nested types were too.
Ian Bayley
(Ian.Bayley@comlab.ox.ac.uk)