OXFORD UNIVERSITY COMPUTING LABORATORY

Minutes of Meetings of the Algebra of Programming Research Group in 2000

Michaelmas Term 2000 (all Fridays)

Summer Break 2000 (all Fridays)

Trinity Term 2000 (all Fridays)

Hilary Term 2000 (all Fridays)


Friday, 1st December 2000
Ross proposed a syntactic sugar for writing arrows, a generalisation of monads, in Haskell. He extended the grammar of expressions with a proc-statement containing a pattern and a command, to be interpreted as an arrow. He then gave a do-notation for commands, and used it to write a deterministic parser. Finally, he mentioned two other applications of arrows: synchronous circuits and homogeneous functions. His slides are available here.
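The proposed notation desugars into ordinary arrow combinators. A minimal sketch of that combinator style, using the function instance of Arrow from Control.Arrow (the `mean` example is ours, not Ross's):

```haskell
import Control.Arrow ((&&&), (>>>))

-- A small pipeline in combinator style: split the input into a sum
-- and a length in parallel, then divide.  proc/do-notation for arrows
-- desugars into chains of combinators like these.
mean :: [Double] -> Double
mean = (sum &&& (fromIntegral . length)) >>> uncurry (/)
```

For instance, `mean [1,2,3,4]` yields `2.5`.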

Friday, 24th November 2000
Shin talked once again about how to use the converse function theorem to derive a program that builds a tree from a list of leaf depths. Oege thought of a different function for him to invert this time; it used an accumulating parameter. Like last time, Richard thought Shin had to consider an optimisation problem to build a best forest. Then Shin introduced a predicate to filter out certain kinds of forest and Jeremy suggested that the problem fitted a scheme of generate and test.

Silvija mentioned that Google used inverted files rather than the expected suffix trees; Jeremy speculated about how this might work, and Richard explained why he was interested.

Friday, 17th November 2000
Jeremy extended the theorem he had proven last week to unfolds that create infinite data structures. Ian talked briefly about how to implement fans, which are non-deterministic, in a functional language, and mentioned Chris Okasaki's random bush generator. Then Jeremy introduced Varmo Vene's thesis. He defined and gave example uses of paramorphisms (namely, dropWhile) and histomorphisms (tabulation), both of which are variants of catamorphisms. Then he did the same for their duals, apomorphisms (append) and futumorphisms (turning a character string into Morse code). He finished by using monads to generalise these new combinators.
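On lists, a paramorphism lets the algebra see each element together with both the unconsumed tail and the recursive result — exactly what dropWhile needs. A small sketch (the names `para` and `dropWhile'` are ours, not Vene's):

```haskell
-- Paramorphism on lists: f receives the element, the remaining tail,
-- and the result of the recursive call on that tail.
para :: (a -> ([a], b) -> b) -> b -> [a] -> b
para f e []       = e
para f e (x : xs) = f x (xs, para f e xs)

-- dropWhile as a paramorphism: once the predicate fails, return the
-- untouched tail rather than the recursively processed one.
dropWhile' :: (a -> Bool) -> [a] -> [a]
dropWhile' p = para (\x (xs, ys) -> if p x then ys else x : xs) []
```

Here `dropWhile' (< 3) [1,2,3,1]` gives `[3,1]`, keeping the second 1 because the tail is returned unprocessed.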

Friday, 10th November 2000
Richard gave a datatype for tries and described a function mktrie that, given a collection of strings, returned a trie storing exactly those strings. He asked how mktrie should be specified, commenting that if the collection was a set rather than a list then he could not apply the converse function theorem. Oege wrote up a program to find the length of the longest common subsequence of two lists. Then he described his attempts to discover and improve the program's space complexity in Haskell under lazy evaluation. Then, prompted by Richard, Jeremy tried to prove a theorem about folds and unfolds. Finally, Shin talked again about how to derive a shift-reduce parser from the grammar's print function using the converse function theorem. In particular, he wanted to know how to put in a lookahead.
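A standard row-by-row rendering of the longest-common-subsequence length (a sketch of the usual dynamic programme, not necessarily the program Oege wrote) illustrates the lazy-evaluation subtleties: each new row of the table refers lazily to itself.

```haskell
-- Length of the longest common subsequence, computed one row of the
-- standard dynamic-programming table at a time.  Each new row is
-- defined in terms of itself (the w argument), relying on laziness.
llcs :: Eq a => [a] -> [a] -> Int
llcs xs ys = last (foldl step row0 xs)
  where
    row0 = replicate (length ys + 1) 0
    step prev x = next
      where
        next = 0 : zipWith3 f ys (zip prev (tail prev)) next
        f y (nw, n) w
          | x == y    = nw + 1     -- extend a match diagonally
          | otherwise = max n w    -- best of the cells above and left
```

On the classic example, `llcs "abcbdab" "bdcaba"` is 4. Only two rows are live at once, which is where the space-complexity questions under lazy evaluation begin.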

Friday, 3rd November 2000
Oege proposed a notion of minimum edit distance on parse trees. It could be used as part of a version control system to allow several people to edit the same program file at once. The algorithm normally used represents programs as strings instead. Then Richard talked about the search engine of Google, which appears to work by consulting a trie. He noted that this structure uses space very efficiently in Haskell, because lazy evaluation enforces optimal node sharing.

Finally, Ian talked once more about the category of lax natural transformations between endorelators. Oege said it was bound to have initial algebras, subject to a quick check by Clare. He also suggested that nested relators should be defined not by generalising nested functors, but independently so that the corresponding functors can then be shown to be specialisations. This should be done by comparing the folds formed in the functor and relator categories.

Friday, 27th October 2000
Ian explained the first-order semantics for regular datatypes and the higher-order semantics for nested datatypes. In the former, it is easy to show that functors for inductive datatypes commute with converse, and therefore extend to relators, because maps can be written as folds. This is not true in the higher-order semantics however, where the only definition of map is the naturality condition of the initial algebra. Oege said that although initial algebras existed in the functor category, it wasn't clear that this result carried over to the relator category.

Shin had previously used the converse function theorem to derive a program to unflatten a list into a tree. Now, he suggested that LR-parsing for an arbitrary CFG could be derived using the print function for the grammar's parse tree. Oege suggested a special case to look at first.

Friday, 20th October 2000
Yorck explained that a function is dependently typed if its type depends on the values of its inputs. He wrote in Cayenne, a language which supports such functions, an operation without tags that either flipped a boolean or negated an integer, depending on the value of a bivalent input. Its type signature featured a function that examined this input and accordingly returned either the type of booleans or the type of integers. Then he defined polytypic summation using the same technique. Finally, he wrote a zipWith of arbitrary arity using a function that given a number, returned a type of lists of that length. The group argued about whether zipWith was in fact transposition, and Richard mentioned a current paper that used zipWith as an example to argue that one can always do without dependent types.

Friday, 6th October 2000
Ian defined generalised, simple (a.k.a. ungeneralised) and efficient folds on linear nested datatypes. Jeremy and Clare noted that it was only the type system of Haskell that made it impossible to write the identity function as an efficient fold. Ian wanted to know when an efficient fold could be written as a simple fold. The problem was expected to, but did not, meet the conditions for map-fusion on generalised folds, so Jeremy suggested using the universal property of folds instead. The law thus derived was used to rewrite both embed and flatten as simple folds.

Friday, 29th September 2000
Roland combined the shortest path and tallest lorry problems, both of which could be solved with Warshall's algorithm, into a third problem which could not be solved without adaptation. Since the principle of optimality no longer held, he needed to maintain extra information in the nodes so as to restore it. Jeremy compared the problem with that of data mining bank balances, and Roland finished by abstracting it to recursive sets of equations.

Motivated by Shin, Jeremy had considered the question of when a function can be written as an unfold. He replaced the sufficient condition of surjectivity with an exact equivalence (iff) condition and proved the result. Roland tried to reprove relationally a needed lemma, but this, and the dualisation to unfolds, proved hard.
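The notion in play can be illustrated with Data.List's unfoldr (our example, chosen to show why surjectivity matters):

```haskell
import Data.List (unfoldr)

-- An unfold dualises a fold: the coalgebra either stops (Nothing) or
-- emits one element together with a new seed.  Here: the little-endian
-- binary digits of a natural number.
toBits :: Int -> [Int]
toBits = unfoldr step
  where
    step 0 = Nothing
    step n = Just (n `mod` 2, n `div` 2)
```

For example, `toBits 6` is `[0,1,1]`. Note that toBits is not surjective onto all bit lists — no output ends in 0 — which is the kind of side condition the theorem must account for.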

Shin wanted to use the converse function theorem to improve on Okasaki's rather informal derivation of an algorithm to label a tree breadth-first with the elements of a list. The underlying definition of breadth-first traversal could be queue-based or of the form "concat after levels". Richard said he should certainly use the former instead of the latter, and concluded by mentioning two similar applications of the theorem.

Friday, 15th September 2000
Ian wanted a better name for the generic operation he had been calling jfold. It is equal to a generalised fold, defined using maps, after another map, but contains no maps itself and has a simple Rank-1 type signature. Jeremy wrote a function which folded power trees in a jfold-like manner, but Clare said it was too ad-hoc and needed two more parameters to reflect the generic form. It then became clear that Jeremy's function was only iterative because it consumed a tail-recursive datatype, and that his alternative name of iterative folds was misleading. He therefore suggested that jfolds be called efficient folds, or efolds for short, because they had no maps, and the group agreed.

Friday, 8th September 2000
The group agreed to keep its present membership (for RAE purposes), name, and meeting time (11 a.m. on Fridays until lunch but only rarely after).

Jeremy talked about how universal quantifiers can simulate existential ones in Haskell datatype declarations. Richard mentioned that he could simplify his own derivation of a thinning algorithm in the maximal marking problems paper by not considering the general case. Then he criticised a proof of a lower bound for the running time of Dijkstra's shortest path algorithm.
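The encoding Jeremy described can be sketched with a rank-2 type: an "existential" package is a value of some showable type, reachable only through a universally quantified continuation. (The names `Showable`, `pack` and `render` are ours.)

```haskell
{-# LANGUAGE RankNTypes #-}

-- An existential simulated by a universal: the payload's type a is
-- hidden, and clients may use only the Show dictionary packed with it.
newtype Showable = Showable (forall r. (forall a. Show a => a -> r) -> r)

pack :: Show a => a -> Showable
pack x = Showable (\k -> k x)

render :: Showable -> String
render (Showable unpack) = unpack show
```

A heterogeneous list then type-checks: `map render [pack (3 :: Int), pack True]` gives `["3","True"]`.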

Finally, Richard presented the approximate string matching problem, which considers three different kinds of mismatch (insertion, deletion and exchange). He built up a mismatch table for an example and then introduced the simpler problem where deletion is the only kind of mismatch.

Friday, 11th August 2000
Jeremy summarised Roland's method of programming at the type level using the Church encoding of natural numbers, booleans, pairs and so on. The implemented datatype of even-length lists could be changed to generate prime-length lists instead by replacing the evenness test with a primality test. However, this test would have to use a subtraction function for Church numerals and the group could not get such a function past the type checker, despite over an hour of trying. They wished to avoid Shin's solution of putting a forall within the definition of the Church numerals type because polymorphism on the type level is not allowed.
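At the value level the analogous construction is routine, and it hints at the obstacle: predecessor (the key ingredient of subtraction) must instantiate the numeral at a pair type, exactly the polymorphism that the type-level encoding was not allowed. A value-level sketch (our names):

```haskell
{-# LANGUAGE RankNTypes #-}

-- Value-level Church numerals.
newtype Church = Church (forall a. (a -> a) -> a -> a)

zeroC :: Church
zeroC = Church (\_ z -> z)

succC :: Church -> Church
succC (Church n) = Church (\s z -> s (n s z))

-- Predecessor via the pairing trick: iterate (a, b) |-> (b, s b)
-- starting from (z, z), then project the first component.  Note the
-- numeral n is instantiated at a pair type here.
predC :: Church -> Church
predC (Church n) = Church (\s z -> fst (n (\(_, b) -> (b, s b)) (z, z)))

toInt :: Church -> Int
toInt (Church n) = n (+ 1) 0
```

Thus `toInt (predC (succC (succC zeroC)))` is 1, and predecessor of zero stays at zero.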

Friday, 28th July 2000
Shin described the problem of building a tree from a list of leaf depths. He wished to derive Geraint's stack-based algorithm, which he wrote up as a fold. (He also briefly mentioned Tarjan's alternative.) The specification was the converse of flatten after depths, so the group tried to apply the converse function theorem, after first reproving it for familiarity's sake. Then the group considered building the smallest possible forests (under some ordering); this generalisation was an optimisation problem. However, Richard believed that this was all too much work to arrive at such a simple algorithm.
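A stack-based construction in the spirit of Geraint's algorithm (our own rendering, not his code) folds over the depth list, merging the two topmost subtrees whenever their depths coincide:

```haskell
data Tree = Leaf | Fork Tree Tree deriving (Eq, Show)

-- Leaf depths in left-to-right order.
depths :: Tree -> [Int]
depths = go 0
  where
    go d Leaf       = [d]
    go d (Fork l r) = go (d + 1) l ++ go (d + 1) r

-- Fold over the depths, keeping a stack of (depth, subtree) pairs.
-- Equal-depth neighbours on the stack are merged into a Fork one
-- level up; a valid input leaves a single tree at depth 0.
build :: [Int] -> Maybe Tree
build ds = case foldl step [] ds of
             [(0, t)] -> Just t
             _        -> Nothing
  where
    step stack d = collapse ((d, Leaf) : stack)
    collapse ((d1, t1) : (d2, t2) : rest)
      | d1 == d2 = collapse ((d1 - 1, Fork t2 t1) : rest)
    collapse s = s
```

For instance, `build [2,2,1]` yields `Just (Fork (Fork Leaf Leaf) Leaf)`, and `build (depths t) == Just t` for any tree t, matching the converse-of-flatten-after-depths specification.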

Finally, Shin asked a question about proving polytypic properties, Jeremy mentioned that he couldn't get subtraction to work for Church numerals in Haskell, and Richard posed a problem considered by Paige for Geraint to solve.

Friday, 21st July 2000
Roland presented some research that he and Kevin had just done into the expressivity of nested datatypes. A datatype can be seen as a non-deterministic selection from an inductively defined collection of Church numerals, each representing a possible size for an instance of the datatype. Addition, multiplication, products, coproducts and predicates could all then be encoded, according to standard computability theory. Roland developed datatypes for matrices, square matrices (by two different methods), and triangles, but he didn't know if he could do streams. The group constructed a predicate to sieve out lists of even length, in the hope that they could adapt it to composite lengths in order to construct a datatype of lists with prime length. Roland's technique appeared to be more widely applicable than a comparable one proposed by Ralf Hinze.

Friday, 14th July 2000
Roland explained Dijkstra's shortest path algorithm and Knuth's generalisation of the problem. The generalised problem is to minimise any suitably restricted function on the sentences of an arbitrary grammar. This seemed to be an optimisation problem, since the function can always be written as a fold on a datatype representing the grammar. The obvious datatype to use is mutually recursive (and therefore not covered by standard theory), but Richard suggested that Roland should use a rose tree instead. Finally, Richard presented Lambert Meertens' intruder distributed problem, and an edge-labelling graph problem promoted by Ralf Hinze.

Friday, 30th June 2000
The group debated a few questions raised by Ian in a recent email: how to write height generically, how to write reg for alternating lists given that Swap doesn't have membership, and how to informally justify Pedro's embedding. Pedro intended to write the constraints that nested datatypes can capture as a general recursion scheme but Richard thought it too general and Ian believed it to be equivalent to his unsuccessful work on recurrence relations.

Friday, 23rd June 2000
Pedro presented an alternative to Okasaki's embedding which has the convenient property that iterative and non-iterative nests are mapped to the same regular datatype of internally-labelled binary trees. He explained how to construct as a fold the function reg which takes nests to trees. However, his method only worked with linear datatypes. Most constraints captured by nested datatypes are on height, but De Bruijn terms and Braun trees are exceptions. There is no polytypic definition of height in the literature. Lastly, Richard presented his proof that every recursive regular datatype is isomorphic to an iterative datatype.

Friday, 16th June 2000
Richard showed the group how to turn any regular datatype into an isomorphic iterative datatype. However, there wasn't any clear benefit in doing so, and he required that part of the base functor be linear. Ian described how to label and scan nests and bushes, but he couldn't write the scan for bushes as a fold. He had used a higher-order datatype as his labelled variant, in contrast to the approach taken by Richard in his pattern-matching slides. After lunch, Ian repeated his explanation for Richard. Then Richard led a discussion on the terms used to classify nested datatypes.

Friday, 9th June 2000
Shin described the cultural differences between functional and logic programming. In the latter, efficiency concerns and program transformation are considered relatively unimportant. He wrote an Algebra of Programming dynamic programming rule in the language of Prolog. The most obvious meaning of refinement was logical implication. However, Silvija wanted an alternative meaning which would allow programs and their refinements to return different but equally good optimal solutions. The group decided though that this alternative relationship was impracticable.

Friday, 2nd June 2000
We can express most non-polynomial datatypes as least fixed points of polynomial functors. We can't do this for rose trees or for rectangles as lists of lists, but we can for their isomorphic equivalents; Richard demonstrated how to find them. If we solve naively the problem of contiguously marking rose trees, we need to form exponentially-sized cross products; we can get round this by using (isomorphically equivalent) binary trees instead. Shin said that the thinning algorithm for solving the problem can be replaced by several greedy algorithms.
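One such isomorphic equivalent is the familiar first-child/next-sibling encoding of a forest of rose trees as a binary tree (a sketch; the correspondence Richard demonstrated may differ in detail):

```haskell
data Rose a = Rose a [Rose a]              deriving (Eq, Show)
data Bin a  = Tip | Node a (Bin a) (Bin a) deriving (Eq, Show)

-- First-child / next-sibling: the left subtree holds a node's
-- children, the right subtree its remaining siblings.
toBin :: [Rose a] -> Bin a
toBin []               = Tip
toBin (Rose a ts : rs) = Node a (toBin ts) (toBin rs)

fromBin :: Bin a -> [Rose a]
fromBin Tip          = []
fromBin (Node a l r) = Rose a (fromBin l) : fromBin r
```

The two functions are mutually inverse, so a problem on forests of rose trees can be transported to binary trees and back.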

Friday, 26th May 2000
Clare wrote up a nested datatype for leftist left-perfect trees. She wanted to write insertion for it as a generalised fold. Jeremy said that one would expect to use a generalised unfold instead, by analogy with regular trees. Ian suggested she use Okasaki's insertion for random-access lists as a template. The group then debated in what sense those two datatypes were isomorphic.

Friday, 19th May 2000
Ian had solved last week's problem of turning any nested function written as a jfold into its regular counterpart. However, his reasoning was circular because his reg was defined in terms of jfold; an independent characterisation of reg was needed. He could specify reg in terms of shape and contents properties but he could not prove that his definition obeyed them. The group disagreed about how to define jfold, with Richard saying that the presentation would still be circular even if reg's definition was rephrased in terms of gfold.

Friday, 12th May 2000
Richard wrote up the standard function for reversing a list using an accumulating parameter. He derived from it a pointer algorithm for in-place list reversal; his specification used an unfold to generate lists of addresses. Ian wanted to turn nested programs written as jfolds into regular programs written as folds. He thought he could apply jfold fusion to the refinement relation linking the two programs and thereby get simple conditions relating the fold's and the jfold's arguments. The conditions obtained were instead very complex. The group pointed out that he needed an independent definition of the embedding function reg.

Friday, 5th May 2000
The group discussed its strategy for the next few years so that Jeremy could produce a written statement about it. Ian introduced the problem of turning an arbitrary nested function f_N into its regular equivalent f_R. If f_N is an iterative fold with regular target then f_N and f_R are linked by a simple law. Ian suggested that the only other case worth considering was where f_N is a constructor function since all other functions could be built from it and the regular functions. Jeremy showed that both the constructor function and the identity function could be written as an iterative fold.

Friday, 10th March 2000
Jeremy stated and proved the fusion law for his bottom-up fold on Ian's nested type from last week. He also proved the scan lemma for the scan associated with the fold. This had been assumed correct in the derivation of last week's pattern-matching algorithm. Both proofs were inductive and therefore messy because no formal semantics has yet been developed for the fold. The group then tried and failed to work out the naturality condition (equivalent to Wadler's "theorem for free") for list concatenation.
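For comparison, the textbook naturality condition for concatenation — which may or may not be the formulation the group was after — states that concat is a natural transformation from List ∘ List to List:

```haskell
-- Naturality of concat (its "theorem for free"):
--   map f . concat  =  concat . map (map f)
-- The two sides, written out so they can be compared pointwise.
lhs, rhs :: (a -> b) -> [[a]] -> [b]
lhs f = map f . concat
rhs f = concat . map (map f)
```

For example, both `lhs (+ 1) [[1,2],[3]]` and `rhs (+ 1) [[1,2],[3]]` give `[2,3,4]`.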

Friday, 3rd March 2000
Jeremy explained how he had used an iterative datatype designed by Ian to rewrite Richard's development of a pattern-matching algorithm for nests. He focussed on the encoding of suffixes, which he memoised. His functions used a previously unseen fold operator which captured bottom-up evaluation on the underlying regular datatype. The group debated whether the new fold was related to the canonical one. Richard then talked briefly about how to express monotonicity in Oege's pointwise relational language.

Friday, 25th February 2000
Ian had invented a nested type that was preferable to Nest for pattern matching on perfect trees. However, he couldn't write suffixes for it. The group thought this was because he was using the wrong labelled variant. They discovered many alternatives by drawing pictures and manipulating functor equations, but no one candidate seemed the best.

Friday, 11th February 2000
Richard had written a pattern matching function for perfect internally labelled trees, using a nested type to enforce the constraint of perfection. Ian suggested changing the nested type to make the function more efficient.

The week before, Oege had introduced a pointwise relational calculus. This week he wanted to define a refinement ordering for it; subset inclusion is not good enough. He used some desirable properties of the ordering to derive it. The case of function spaces was particularly tricky.

Friday, 4th February 2000
Richard suggested that Jens should extend his pattern matching algorithm from strings to trees. Richard then talked briefly about pattern matching on perfect trees. Shin wrote up a program that chose n elements from a list in all possible ways and filtered the solutions. He then used an accumulating parameter to derive an efficient bottom-up version of the same program. Jeremy wondered how to write the iterative version of the gcd algorithm using folds and unfolds. Next week's meeting is at 11 a.m.
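One way to phrase the iterative gcd that Jeremy asked about — a sketch under our own formulation, not his answer — is an unfold that generates the sequence of Euclidean remainders, followed by a fold that keeps the last one:

```haskell
import Data.List (unfoldr)

-- Euclid's algorithm as an unfold (the iteration on pairs) followed
-- by a fold (taking the last value produced).
gcdU :: Int -> Int -> Int
gcdU a b = last (a : unfoldr step (a, b))
  where
    step (_, 0) = Nothing
    step (x, y) = Just (y, (y, x `mod` y))
```

For instance, `gcdU 12 8` produces the intermediate values 12, 8, 4 and returns 4.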

Friday, 28th January 2000
Oege wanted to calculate relational programs pointwise; pointfree is the norm. His motivating example used relational converse to specify a function that builds forests. Oege explained the semantics of a choice operator he wanted to use and then he talked about the equivalents in category Rel of curry and uncurry in category Fun. The group did not meet after lunch.

Friday, 21st January 2000
Before lunch, Sharon presented a greedy solution to the marbles problem: given n marbles, collect as many sets of k differently-coloured marbles as possible. The program was obscure so Jeremy started to write a clearer version, which he later e-mailed to the group.

Jens explained that finding common arithmetic subexpressions was equivalent to calculating the dominator tree for a flow graph. Since the latter problem is well documented, it is a pity that the equivalence fails to hold when conditions (which may be non-strict) are introduced.

After lunch, Shin tried to prove that perm . nodup = nodup . perm, where perm permutes a list and nodup is the predicate (modelled as a coreflexive) that returns true when the input list has no repetitions. Sadly, this couldn't be done without inventing axioms that were no more obviously true than the result itself.

Friday, 14th January 2000
Before lunch, Richard talked about the thinning theorem for optimisation problems of the form "min after fold". If the fold had non-deterministic (i.e. relational) arguments, he showed that they could be made functional, yielding an implementation. This result was applied to the problem of building minimum cost trees. Richard finished by giving an ordering predicate on which to thin. Oege presented his write-up (and expansion) of last week's talk about how to turn d.a.g.s into let-expressions.

After lunch, Richard discovered that his ordering didn't satisfy the thinning theorem's monotonicity condition.

Friday, 7th January 2000
The group discussed conference deadlines, confirmed the regular meeting time as 11.30 a.m., and made a list of topics for discussion in Hilary term:

  • building if-then-else expressions (Oege)
  • topics relating to Ian's thesis
  • building trees of minimum height (Richard)
  • semantics of higher-order nested datatypes (Clare)
  • pattern matching
  • data mining (Jeremy)
  • pointwise relational programming (Jeremy, Oege, and Richard)

Oege explained that d.a.g.s could be used to represent if-then-else expressions. He wanted to transform the d.a.g.s into programs in such a way that boolean tests are never performed needlessly. The algorithm he had designed to do this didn't work when implemented. Richard talked about building trees of minimum height again. Jeremy described the real-life problem of mining data held on customer accounts by banks, who wanted to predict from a customer's account balance whether that customer would want a loan.

After lunch, Ian tried to formally prove something that was assumed to be true by construction, namely that generalised folds are natural transformations. It would follow immediately that his candidate zips for nested types were too.


Ian Bayley (Ian.Bayley@comlab.ox.ac.uk)



