OXFORD UNIVERSITY COMPUTING LABORATORY

Minutes of 2004 Meetings of the Algebra of Programming Research Group

Michaelmas Term 2004 (all Fridays)
Trinity Term 2004 (all Fridays)
Hilary Term 2004 (all Fridays)

Friday, 10th December 2004
(Minutes by JG)

Michael Abbott talked today, about what he called "omega-sets". An omega-set is a pair ( X, |= ) where |= is a relation between the naturals and X such that

  for all x in X, there exists n in N with n |= x
n is said to "witness" x. He made this into a category, in which the objects are omega-sets, and the arrows are functions between the underlying sets which are themselves witnessed by numbers. In particular, he chose to witness each (computable, I guess) function by its Goedel number.

Michael is interested in this structure, because it gives a rare example of a setting with considerable expressive power ("a model of Martin-Löf type theory"), but nevertheless without the usual existence of infinite limits and colimits.

Apologies if that's a bit garbled; I came in after Michael started and left before he finished!

Friday, 3rd December 2004
Richard came with a problem:

(1) Let @ be an associative operator with an irreducible unit e. (That is, not only is x@e=x=e@x, but also x@y=e => x=y=e.)

(2) Define <= by

  x <= y  <=>  exists z . x @ z = y

(3) Richard claimed that <= is then a partial order, with least element e.

(4) Define + by

  (x + y) <= z  <=>  (x <= z) && (y <= z)

(5) Richard claimed that + is then associative, commutative, and has identity e.

(6) Richard wondered whether @ then distributes over +.

---------------
Examples include:
  • the naturals, with @ being addition. Then the ordering is the natural ordering, and + is max. Indeed, addition distributes over max.
  • the naturals, with @ being multiplication. Then the ordering is by division, and + is least common multiple. Indeed, multiplication distributes over lcm.
  • strings, with @ being concatenation. Then the ordering is the prefix ordering, and + is longest common prefix. Concatenation distributes (from the left) over lcp.
  • left-closed, right-open subintervals of the unit interval, with @ being the "narrow" operation from arithmetic coding. Then the ordering is interval containment, and + is interval intersection. (I think that must be partial: what is the intersection of disjoint intervals?) Does narrowing distribute over intersection? Too late in the afternoon for me to verify...
---------------
After some discussion, we were impressed by the examples, but weren't sure of some of the theorems. In particular, the least upper bound operator + is not always completely defined by (4), and we thought the ordering introduced in (2) is not always a partial order.
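A quick sanity check of the first two distributivity claims, as a throwaway Haskell sketch (mine, not from the meeting; both properties hold for non-negative arguments):

  -- @ = (+): the order is the usual one and the lub is max
  distribOverMax :: Integer -> Integer -> Integer -> Bool
  distribOverMax x y z = x + max y z == max (x + y) (x + z)

  -- @ = (*): the order is divisibility and the lub is lcm
  distribOverLcm :: Integer -> Integer -> Integer -> Bool
  distribOverLcm x y z = x * lcm y z == lcm (x * y) (x * z)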

Friday, 26th November 2004
Present: SC, CEM, GJ, JG, BS

JG had been talking to Graham Hutton (GH) about the stable marriage problem. The stable marriage problem is where you have a number of men, the same number of women, and you want to marry them all off. Stability means that there is no man and woman, not married to each other, who would elope because each prefers the other to their own spouse.

For example, say we have men 1 2 3 4, women a b c d

The rankings <m for the men could be

            ___________
            1 | b c a d
            2 | c b a d
            3 | b a c d
            4 | b c a d
The rankings <w for the women could be
            ___________
            a | 2 1 4 3
            b | 1 3 4 2
            c | 4 2 3 1
            d | 1 3 4 2
A pair of marriages (m,w) (m',w') is unstable iff (for w, m' the elopees)
m' <w m and w <m' w'
(i.e. w prefers m' to m, and m' prefers w to w')

The challenge is to find a stable matching. It's not obvious that there exists a stable matching, but one does exist.

It was also noted that you can get cycles of unstable matchings, where one pair of elopers gives rise to another pair of elopers, which ....

Both JG and GH had sat down and written programs to do this, but they were completely different programs.

JG asked whether we wanted a specification or a program. We chose a specification.

           filter stable societies

           societies = map (zip men) (perms women)
This is asymmetrical with respect to men/women, so you could have vice versa.
           stable = all pairStable . pairs
           pairs xs = [(x,y) | x <- xs, y <- xs, x /= y]
There was some concern over the type checking of the pairs (x,y) and how you'd know that x and y were different sexes, but it was pointed out that the (x,y) pairs were pairs of marriages not people so you didn't need to know the sex of x and y. Or, as GJ put it, "Marriages have no sex."
pairUnstable (m,w) (m',w') = (m' <w m && w <m' w') || (m <w' m' && w' <m w)
Not that we can write (x,) , but still ...
           pairs (x:xs) = map (x,) xs ++ pairs xs
           pairs [] = []
"It's a fold."

"No it isn't."

"No it isn't."

"It's a paramorphism."

"So it is."

---------------
JG said this was related to Dijkstra's Shortest Paths algorithm, and also to a coupon-collector problem, where each box contains one of n coupons, and you want to know how many boxes you should collect to get the full set of coupons.

"Can you see inside the boxes?"

"No, you can't go rummaging around opening all the boxes in the supermarket."

           pairs = concat . map tailpairs . tails

           tailpairs (x:xs) = map (x,) xs
           tailpairs [] = []
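Putting the pieces together, here is one self-contained rendering of the specification. The glue is mine, not what was on the board: the preference tables become functions prefM and prefW, and rank turns a preference list into a numeric ranking, smaller being better.

  import Data.List (permutations, elemIndex)
  import Data.Maybe (fromJust)

  men   = [1,2,3,4] :: [Int]
  women = "abcd"

  -- the example preference tables, most preferred first
  prefM m = fromJust (lookup m [(1,"bcad"), (2,"cbad"), (3,"bacd"), (4,"bcad")])
  prefW w = fromJust (lookup w [('a',[2,1,4,3]), ('b',[1,3,4,2]), ('c',[4,2,3,1]), ('d',[1,3,4,2])])

  rank ps x = fromJust (elemIndex x ps)

  societies = map (zip men) (permutations women)

  pairs xs = [ (x,y) | x <- xs, y <- xs, x /= y ]

  -- pairs yields each unordered pair in both orders, so one direction
  -- of the elopement test suffices here
  pairUnstable ((m,w), (m',w')) =
    rank (prefW w) m' < rank (prefW w) m  &&  rank (prefM m') w < rank (prefM m') w'

  stable = all (not . pairUnstable) . pairs

  stableSocieties = filter stable societies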
Moving over to the other whiteboard, JG went through the Gale-Shapley [1] algorithm for stable marriages.

Round 1. All men propose to their most favoured women. All women now have a (possibly-empty) collection of proposals. Each chooses the best on offer (if they can).

            ___________
            1 |*b c a d
            2 |*c b a d
            3 |*b a c d
            4 |*b c a d

Proposals: 1,b  2,c  3,b  4,b
Accepted:  1,b  2,c

Round 2. All unattached men now make a proposal to their second-favoured woman. Each woman with fresh proposals chooses the best of those (this may involve dumping a fiancé).

            ___________
            1 |*b c a d
            2 |*c b a d
            3 |*b*a c d
            4 |*b*c a d

Proposals: 3,a  4,c
Accepted: 3,a 4,c (c dumps 2)        1,b still affianced

Round 3. All unattached men now make a proposal to their next favoured woman that they haven't proposed to yet (this is how all rounds can be described).

            ___________
            1 |*b c a d
            2 |*c*b a d
            3 |*b*a c d
            4 |*b*c a d

Proposals: 2,b  (not accepted)

Round 4.

            ___________
            1 |*b c a d
            2 |*c*b*a d
            3 |*b*a c d
            4 |*b*c a d

Proposals: 2,a  (accepted, a dumps 3)
Now have 1,b  2,a  4,c

Round 5.

            ___________
            1 |*b c a d
            2 |*c*b*a d
            3 |*b*a*c d
            4 |*b*c a d


Proposals: 3,c (not accepted)

Round 6.

            ___________
            1 |*b c a d
            2 |*c*b*a d
            3 |*b*a*c*d
            4 |*b*c a d

Proposals: 3,d (accepted)
Marriages are: 1,b; 4,c; 2,a; 3,d.

We checked that this was stable (it was). There then followed a discussion about the results. JG said this algorithm produces the best of all stable marriages for the men, and the worst of all stable marriages for the women. GJ and SC wanted to know what he meant by "best" and whether he meant "maximal" really. JG explained that when defining "best" for a man, he meant that no man can make progress to another stable matching that improves his lot (wife).

There were wonderings about whether the stable matchings formed a lattice, but it was decided not, as you can get two stable matchings whose meet isn't.

Editorial note: Actually, the set of stable matchings for given preferences does form a lattice. The ordering is a matter of dominance: matching M dominates matching M' from the men's point of view if every man's wife in M is at least as good in his preference ordering as his wife in M'. Dually for dominance from the women's point of view. Moreover, M dominates M' from the men's pov iff M' dominates M from the women's.

The meet of two matchings M and M' gives to every man the more preferable of his partners in M and M' (and to every woman the less preferable). The join is the dual. It isn't immediately obvious to me that these yield matchings at all, let alone stable ones, but they do.

The Gale-Shapley algorithm starts off with the "best" (in some sense) for men, and then reduces to the best stable matching. Whereas the women start off in the worst state, and then they gradually improve their lot as the algorithm progresses.
Editorial note: Right. The "best" that GS starts with isn't a matching: every man is courting his most preferred woman, but there may be competition and there may be wallflowers. The men's preference lists are traversed in descending order, approaching the set of stable matchings from above from the men's pov. At the same time, all the women start off unmatched (which, in this traditional world, is the worst of all circumstances), and gradually get matched to better and better men - approaching the set of stable matchings from below from the women's pov.
JG said that his implementation implemented Gale-Shapley, but it didn't resemble the spec. GH's was different, in that it considered a single man's proposal at a time, rather than in rounds: effectively there's a "bag" of unattached men, and a random unattached man gets to make his next proposal each round. So as far as the order in which things happen goes, it does a different topological sort of the graph of men's preferences.

Essentially the two programs "approach the frontier" of the histogram *s differently.

            1 |*b
            2 |*c*b*a
            3 |*b*a*c*d
            4 |*b*c
Knuth's little book (a collection of transcripts of lectures given in Montreal [2]) gives another traversal. It does one unattached man at a time, but uses a stack, not a bag:
            ___________
            1 | b c a d
            2 | c b a d
            3 | b a c d
            4 | b c a d
The animated data changing on the white board could be transcribed as:
Stack: 1,2,3,4

Proposal: 1,b (accepted)
Arrangements now: 1,b
Stack: 2,3,4

Proposal: 2,c (accepted)
Arrangements now: 1,b   2,c
Stack: 3,4

Proposal: 3,b (not accepted)
Arrangements now: 1,b   2,c
Stack: 3,4

Proposal: 3,a (accepted)
Arrangements now: 1,b   2,c   3,a
Stack: 4

Proposal: 4,b (not accepted)
Arrangements now: 1,b   2,c   3,a
Stack: 4

Proposal: 4,c (accepted)
Arrangements now: 1,b   4,c   3,a
Stack: 2

Proposal: 2,b (not accepted)
Arrangements now: 1,b   4,c   3,a
Stack: 2

Proposal: 2,a (accepted)
Arrangements now: 1,b   4,c   2,a
Stack: 3

Proposal: 3,c (not accepted)
Arrangements now: 1,b   4,c   2,a
Stack: 3

Proposal: 3,d (accepted)
Arrangements now: 1,b   4,c   2,a   3,d
Stack:
We speculated that the general version of the algorithm was some order in which you go through that *s histogram from the left. The invariant for that would be that the people matched so far (that sub-society) are matched in a stable way. The variant is the number of remaining proposals altogether (not the number of unattached people).
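For concreteness, here is a rough Haskell sketch of that one-man-at-a-time (stack) traversal. It is my own code, not anything from the meeting, and it reuses men, prefM, prefW and rank from the specification sketch earlier; each free man carries the suffix of his preference list he has not yet proposed from, and equal numbers of men and women are assumed so that no list is exhausted.

  galeShapley :: [(Int, Char)]
  galeShapley = go [ (m, prefM m) | m <- men ] []
    where
      -- go stack engaged: engaged maps each woman to her current fiance,
      -- together with the fiance's remaining preference list
      go [] engaged = [ (m, w) | (w, (m, _)) <- engaged ]
      go ((m, w:ws) : frees) engaged =
        case lookup w engaged of
          Nothing -> go frees ((w, (m, ws)) : engaged)
          Just (m', ws')
            | rank (prefW w) m < rank (prefW w) m' ->
                -- w prefers the newcomer; the old fiance is dumped
                go ((m', ws') : frees)
                   ((w, (m, ws)) : filter ((/= w) . fst) engaged)
            | otherwise ->
                -- m is rejected; he will try his next choice
                go ((m, ws) : frees) engaged

Each recursive call consumes exactly one proposal, which is precisely the variant just noted; on the example preferences it reproduces the stack trace above and ends with 1,b; 4,c; 2,a; 3,d.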

The original Gale-Shapley was for matching students to colleges, and you could do this from the point of view of the students or the colleges. It makes sense to have the students choose.

SC was reminded of this being used in the setting of students being matched with hospitals [3], where it used to be done from the hospitals' point of view, but was then changed to be done from the students' point of view (so the students are the men and get their best stable matching).

There then followed a discussion about how many marriages you can break up in a single round of the Gale-Shapley algorithm. Can you break up n/2? Yes, you can.

Dijkstra's shortest paths algorithm was also discussed, in that the creeping histogram frontier was similar to the collection of nodes to which the shortest paths were known. However nodes don't change their minds in Dijkstra's algorithm.

JG said something about his program and co-induction, and got made to paraphrase it, which he did, saying that it was consumptive rather than productive (consuming proposals, and not necessarily producing more marriages).

GJ pointed out why the program feels far away from the specification: the algorithm consumes proposals, and the proposals don't actually appear in the specification.

      menOptimal (filter stable (societies men women))

      societies men women = map (zip men) (perms women)
We want to use proposals to build the society. We want to consume the table for <m for the men, in the order of the proposals.

Maybe we should expand the pairUnstable definition, to walk over the <m ordering, as a table, whilst keeping the <w as an ordering.

By this time, lunch was looming.

Minutes by SC.

References:

[1] D. Gale and L. S. Shapley: College Admissions and the Stability of Marriage. American Mathematical Monthly, 69(1):9-15, 1962.

[2] Donald E. Knuth: Mariages stables et leurs relations avec d'autres problèmes combinatoires. Introduction à l'analyse mathématique des algorithmes. Les Presses de l'Université de Montréal (Collection de la Chaire Aisenstadt), 106 pp.

[3] The Math of Medical Marriages from the AMS Math in the Media column, Sept 2004.

Friday, 12th November 2004
Last week (12th), Geraint was thinking back to his work of 1989-ish on deriving the Fast Fourier Transform, after presenting the algorithm again in a tutorial recently. He was hoping for a presentation with less hand-waving, and fewer indices. We didn't make a whole lot of progress.

The Discrete Fourier Transform is a matter of evaluating the polynomial

a0 x^0 + a1 x^1 + ... + a{n-1} x^{n-1}
for x = 1, w, w^2, ..., w^{n-1}, where w is a primitive complex n-th root of unity.

The crucial step in obtaining the Fast Fourier Transform is to take the right split of the problem into subproblems; it turns out to be a matter of considering the even and the odd powers of omega. With some rather involved calculations, we derived the appropriate divide-and-conquer version of the DFT. However, the calculation involved various cases of switching between curried and uncurried versions of key combinators such as zipWith and pair operations - we felt there was a cleaner, more point-less version that we were missing.
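For reference, here is one conventional Haskell rendering of that even/odd split: my own transcription of the textbook radix-2 recursion (for n a power of two), not the point-free calculation we were attempting. It already shares the two half-size subproblems, which is the further move towards the FFT mentioned in the next paragraph.

  import Data.Complex

  -- naive DFT, straight from the definition: entry k is the polynomial
  -- evaluated at w^k
  dft :: [Complex Double] -> [Complex Double]
  dft as = [ sum [ a * w ^ (j * k) | (a, j) <- zip as [0 ..] ] | k <- [0 .. n - 1] ]
    where
      n = length as
      w = cis (2 * pi / fromIntegral n)

  -- divide and conquer on the even and odd positions: outputs k and k + n/2
  -- are es!!k + w^k * os!!k and es!!k - w^k * os!!k respectively
  fft :: [Complex Double] -> [Complex Double]
  fft [a] = [a]
  fft as  = zipWith (+) es ts ++ zipWith (-) es ts
    where
      n = length as
      w = cis (2 * pi / fromIntegral n)
      (evens, odds) = unzip (pairup as)
      pairup (x:y:rest) = (x, y) : pairup rest
      pairup _          = []
      es = fft evens
      os = fft odds
      ts = zipWith (\k o -> w ^ k * o) [0 ..] os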

But we didn't even get as far as making the move from there to the FFT, which is a matter of capitalizing on the fact that some of the subproblems are repeated.

This took us well over an hour, so we remained to be convinced that this would turn into an appropriate way to explain FFT from scratch in a single lecture.

Friday, 5th November 2004
Mike Spivey talked to us about Categories for Breadth-First Search. The setting is one of non-deterministic programs
  f :: a -> m b
for some "monad of non-determinism" m. For example, m might be the list datatype, giving Wadler's "list of successes" technique. (For the sake of this note, all lists are finite.) Since we will be considering non-deterministic programs, one of the things we will want to be able to do is to run two programs and choose from their results; so we will need an operator
  mplus :: m a -> m a -> m a
For good measure, we also assume an "empty computation"
  mzero :: m a
These two should form a monoid. (In Haskell terms, we are dealing with the MonadPlus class.)

But we will also want to reason about the "cost" of different alternative solutions, so that we can enumerate them "fairly by cost". We therefore assume an extra operation

  mwrap :: m a -> m a
which "increments the cost" of every result returned, or "adds one to the cost counter". This gives us the following type class:
  class MonadPlus m => MonadBFS m where
    mwrap :: m a -> m a
With the list monad, there is no sensible implementation of this cost mechanism, so we define
  instance MonadBFS [] where
    mwrap = id
But we can pick more elaborate datatypes to represent the non-determinism, which allow us to implement the counter. One such is the datatype of forests:
  data Tree a = Tip a | Fork (Forest a)
  type Forest a = [Tree a]

  instance Functor Forest where
    fmap f = map (tmap f) where
               tmap f (Tip a) = Tip (f a)
               tmap f (Fork x) = Fork (fmap f x)
    
  instance Monad Forest where
    join = concat . map tjoin where
             tjoin (Tip x) = x
             tjoin (Fork xs) = [Fork (join xs)]
    return a = [Tip a]

  instance MonadPlus Forest where
    mzero = []
    mplus = (++)

  instance MonadBFS Forest where
    mwrap x = [Fork x]
(This is a simplification of the Haskell Monad instance, using join instead of the two binds and omitting the fail. I'm also cheating by using a type synonym for Forest to avoid the constructor.)

The join operator flattens a forest of forests into a forest, grafting each forest into place. The mplus operator concatenates two forests. Without mwrap, the only forests that can be constructed are lists of tips; the mwrap operator adds an extra level to the forest.

In fact, one needn't think hard to construct the correct definition of join, because (a standard result says) the monadic structure is a consequence of the free algebra construction, in this case for the algebra with operators mzero, mplus and mwrap and laws that mplus is associative with unit mzero. (This kind of forest is the initial object in the category of breadth-first search strategies.)

Taking the free algebra but with additional laws gives simpler representations of the non-determinism. For example, adding the law that mwrap is the identity operation collapses forests to lists.

Another implementation is (finite) lists of (finite) bags, or what Mike calls Matrices. This is obtained from forests by adding the laws that mplus is commutative (so the ordering of elements at a particular depth is irrelevant) and that mwrap distributes through mplus (so the nesting is irrelevant too). What is left is a list of bags of elements, one bag for each level.

  type Matrix a = [[a]]

  instance Functor Matrix where
    fmap f = map (map f) 
    
  instance Monad Matrix where
    join = a bit complicated!
    return a = [[a]]

  instance MonadPlus Matrix where
    mzero = []
    mplus = lzw (++)

  instance MonadBFS Matrix where
    mwrap x = [] : x
Here, lzw is "long zip with", which is like zipWith but returns a result as long as its longer argument.
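For what it's worth, here is one way the "a bit complicated" join might be filled in; this is my own reconstruction, not Mike's code. The idea is that an element at level i of the outer matrix is itself a matrix whose level-j elements belong at level i+j of the result: merge the matrices found at each level, shift the level-i matrix down by i, and merge everything together.

  lzw :: (a -> a -> a) -> [a] -> [a] -> [a]
  lzw f (x:xs) (y:ys) = f x y : lzw f xs ys
  lzw _ xs     []     = xs
  lzw _ []     ys     = ys

  joinMatrix :: [[[[a]]]] -> [[a]]   -- Matrix (Matrix a) -> Matrix a
  joinMatrix = foldr (lzw (++)) []                           -- merge everything
             . zipWith (\i m -> replicate i [] ++ m) [0..]   -- shift level i down by i
             . map (foldr (lzw (++)) [])                     -- merge within each level

With return a = [[a]] and mzero = [], the two unit laws are easy to check by hand for this join.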

In between Forests and Matrices are "Carroll Morgan Christmas Trees", in which mwrap distributes over mplus (so nesting is irrelevant, although depth still is) but mplus is not commutative (so order is still important): effectively a list of elements each at a given depth.

Depth-bounded search forms another monad:

  type DBS a = Int -> [ (a, Int) ]
This gives, for a given depth bound n, all the elements at depth at most n, each paired with the remaining depth available (ie the difference between their depth and n).
  return a =  \ n -> [(a,n)]
  g >>= f =   \ n -> [ (b,n'') | (a,n') <- g n, (b,n'') <- f a n' ]
  mzero =     \ n -> []
  mplus f g = \ n -> f n ++ g n
  mwrap f =   \ n -> if n==0 then [] else f (n-1)
Mike finished by explaining how the traditional stack/queue thing for dfs/bfs arises. Define
  search :: MonadBFS m => Forest a -> m a
  search [] = mzero
  search [Tip a] = return a
  search [Fork x] = search x
  search (x++y) = search x `mplus` search y
Instantiated for the list monad, this gives:
  dfs :: Forest a -> [a]
  dfs [] = []
  dfs [Tip a] = [a]
  dfs [Fork x] = dfs x
  dfs (x++y) = dfs x ++ dfs y
from which it follows that
  dfs (Fork x : y) = dfs (x++y)
and the forest behaves like a stack.

Instantiated for the matrix monad, this gives:

  bfs :: Forest a -> Matrix a
  bfs [] = []
  bfs [Tip a] = [[a]]
  bfs [Fork x] = bfs x
  bfs (x++y) = lzw (++) (bfs x) (bfs y)

  bfs' = concat . bfs
from which it follows that
  bfs' (Fork x : y) = bfs' (y++x)
and the forest behaves like a queue. (A similar explanation appears in Geraint's and Jeremy's paper The Underappreciated Unfold.)

For further reading, see Mike's chapters with Silvija Seres in Millennial Perspectives in Computer Science and The Fun of Programming.

Friday, 8th October 2004
Jeremy talked about "program fission", which is program fusion in reverse. Fusion is a transformation that improves efficiency at the cost of structure and readability; fission regenerates that structure. These thoughts were inspired by a chat with José Nuno Oliveira.

Consider, for example, the Unix utility wc, which counts words in a text file (and lines and characters too, which we'll ignore here). In C, it might be written:

  #include <stdio.h>
  #define IN   1  /* inside a word */
  #define OUT  0  /* outside a word */

  /* count words in input */
  main()
  {
      int c, nw, state;
      state = OUT;
      nw = 0;
      while ((c = getchar()) != EOF) {
          if (c == ' ' || c == '\n' || c == '\t')
              state = OUT;
          else if (state == OUT) {
              state = IN;
              ++nw;
          }
      }
      printf("%d\n", nw);
  }
I argue that this is "obviously equivalent" to the following Haskell program:
> import Data.Char (isSpace)
> import Data.List (genericLength, unfoldr)
> wc1 :: String -> Integer
> wc1 = fst . foldl step1 (0, False)
> step1 (n, b) c | isSpace c = (n, False)
> step1 (n, False) c = (n+1, True)
> step1 (n, True) c = (n, True)
Of course, for this problem it doesn't matter if you read left-to-right (as is natural in C) or right-to-left (natural in Haskell).
> wc2 :: String -> Integer
> wc2 = fst . foldr (flip step1) (0, False)
This is the result of applying fusion to:
> wc3 :: String -> Integer
> wc3 = genericLength . getWords3
> getWords3 :: String -> [String]
> getWords3 = fst . foldr glue3 ([], False)
> glue3 c (ws, b) | isSpace c = (ws, False)
> glue3 c (ws, False) = ([c]:ws, True)
> glue3 c (w:ws, True) = ((c:w):ws, True)
(Well, truer to say that fusing the right-hand two components in
  fst . pair genericLength id . getWords3
gives wc2. The relationship between the above and wc3 is a simple law of fst.)

Conversely, applying fission to wc2 yields the more comprehensible wc3.

The C program above maintains a count of the number of words, and a boolean variable recording the state (whether currently in a word or not). There is another program that dispenses with the boolean variable, but uses a two-character window to determine whether a certain non-space character is the start of a word:

  #include <stdio.h>
  #define IN   1  /* inside a word */
  #define OUT  0  /* outside a word */

  /* count words in input */
  int
  isSpace(int c)
  {
    return (c == ' ' || c == '\n' || c == '\t' || c == EOF);
  }

  main()
  {
    int c, d, nw;
    nw = 0;
    d = ' ';
    c = getchar();
    while (c != EOF) {
      if (!isSpace(c) && isSpace(d)) {
        ++nw;
      }
      d = c; c = getchar();
      }
    printf("%d\n", nw);
  }
Again, the natural thing to do in Haskell is count word endings rather than beginnings, so we use lookahead rather than lookback. In this case, the program is a paramorphism:
> wc4 = paral step4 0
> step4 c (n, _) | isSpace c = n
> step4 c (n, []) = n+1
> step4 c (n, x:_) | isSpace x = n+1
> step4 c (n, _:_) = n
> paral :: (a -> (b,[a]) -> b) -> b -> [a] -> b
> paral f e [] = e
> paral f e (x:xs) = f x (paral f e xs, xs)
This fissions (yes, that is the verb) into length after another paramorphism, the latter for collecting the words themselves:
> wc5 = genericLength . getWords5
> getWords5 = paral glue5 []
> glue5 c (ws, _) | isSpace c = ws
> glue5 c (ws, []) = [c]:ws
> glue5 c (ws, x:_) | isSpace x = [c]:ws
> glue5 c (w:ws, _:_) = (c:w):ws
In my mind, of course, getWords is neither a fold nor a para, it's an unfold:
> wc6 = genericLength . getWords6
> getWords6 = unfoldr getWord6
> getWord6 xs | null ys = Nothing
>             | otherwise = Just (span (not . isSpace) ys)
>   where ys = dropWhile isSpace xs
We explored these and other structures for the wordcount program. I haven't yet managed to explain how the fissioned programs may be discovered - indeed, I believe some invention is bound to be necessary for entropic reasons.

Friday, 1st October 2004
Richard talked again about loopless algorithms, harking back to his presentation on 24th October 2003. He was again trying to get a loopless algorithm for the function mixall (for the definitions, see the earlier minutes), but is dissatisfied at how complicated his solutions are.

Friday, 11th June 2004
Clare talked about a recent paper by Annabelle McIver and Carroll Morgan on their probabilistic programming calculus; we concentrated on working through an example of theirs about investing in financial futures, and on whether it would fit within Clare and Sharon's multi-relation framework.

This led to a discussion about fixpoints, and reminded me about an argument I'd had recently about coinduction and invariants. Davide Sangiorgi gave an invited talk about coinduction at POPL in January. I wasn't there, but his slides are on the web.

He claims (slide 25) that Hoare logic is a coinductive method. I'd always thought of proofs of loop correctness via invariants as an inductive method. We tried to work out where the discrepancy is.

The coinductive proof principle (as on Sangiorgi's slide 7) is that

  x [= F(x)  =>  x [= gfp(F)
for monotone F on a complete lattice (where [= is the lattice order, and => is implication). Dually, the induction principle is that
  x ]= F(x)  =>  x ]= lfp(F)
By Hoare logic, I presume he means the standard method of invariants: that
  {I and B} S {I}  =>  {I} while B do S {I and not B}
By handwaving, I managed to convince myself and (I think) Clare and Sharon at the meeting that this was indeed an instance of induction, not coinduction. The rough argument is that F corresponds to the loop body, lfp(F) the terminating state of the loop, and X the invariant; that "X contains F(X)" is the correctness of the loop body, so by induction, "X contains lfp(F)", ie the final state satisfies the invariant too.

But my subsequent steps to pin this down failed on the grounds that the F I actually constructed had the empty set as a lfp. That indeed suggests that a gfp is what is wanted. But I haven't had time since then to explore this further.

Friday, 30th April 2004
Present: RSB, SC (minute-taker), CEM, GJ, JG, BO, BS

Room 441 bore a distinct resemblance to a shingle beach today, because on arrival, RSB was busy putting pebbles in buckets.

RSB had 4 buckets, and 16 pebbles to put in them.

  |   |    |   |    |   |    |   |
  |___|    |___|    |___|    |___|
Every time he put a pebble in a bucket, he stirred the pebbles in that bucket. So given that the cost of putting a pebble in a bucket which previously contained i-1 pebbles was i, what was the lowest possible total cost of putting all 16 pebbles in the buckets? 40, we said, by the greedy algorithm of putting the next pebble into a least full bucket, resulting in 4 buckets each with 4 pebbles in.

What about a more general problem with n pebbles to put in k buckets?

Letting

  q = n div k
  r = n mod k
we see that r of the buckets will have q+1 pebbles in them, and (k-r) of the buckets will have q. Using the same greedy algorithm, the minimum cost is
    r(q+1)(q+2) +   (k-r)q(q+1)
    -----------     -----------
         2               2
"Someone simplify that in their heads!" was demanded of the audience.

RSB relented, and simplified it to

   (n - (k/2) |_ n/k _|) (|_ n/k _| + 1)
RSB then introduced another operation (besides simply putting a pebble in a bucket) - a redistribution operation. This allows you to pick up several buckets and pour the contents into one bucket.
    |    |  |    |  |    |  |    |
    |_p1_|  |_p2_|  |_p3_|  |_p4_|
"Your buckets have holes in them"
  ===>

     |   |  |   |  |   |  |             |
     |_0_|  |_0_|  |_0_|  |_p1+p2+p3+p4_|
This operation has a cost equal to the number of pebbles involved (those in the poured buckets and in the poured-into bucket), i.e. p1+p2+p3+p4 in this example.

It was conjectured that this wouldn't help. RSB pointed out that it did indeed help: consider these stages of bucket contents:

      0 0 0 0
      1 1 1 1
      0 0 0 4
The above pebble operations from 0 0 0 0 to 1 1 1 1 to 0 0 0 4 takes a cost of 8 (4 + 4), compared to the 10 it would take to put all the pebbles into the last bucket in the first place.

RSB then started talking about sorting, and the number of comparisons it takes to sort. For example, it takes 29 comparisons to sort 12 things, but it's not clear how to generalize, to get an idea of what lower bounds for sorting are for arbitrary n things to sort.

RSB wanted to get lower bounds for this buckets problem. He'd been up all night trying, and the best he'd managed to do was 37:

               Cost
     0 0 0 4     8
     1 1 2 4     5
     0 0 4 4     4
     2 2 4 4     6
     0 4 4 4     4
     4 4 4 4    10
               ---
                37
JG and SC promptly came up with solutions costing 36:
     1 1 1 2    6              0 0 0 4    8
     0 0 0 5    5              1 2 2 4    7
     1 1 2 5    5              0 0 5 4    5
     0 0 4 5    4              2 2 5 4    6
     2 2 4 5    6              0 4 5 4    4
     0 4 4 5    4              3 4 5 4    6
     3 4 4 5    6                       ---
              ---                        36
               36
RSB then explained more of the context of this problem, apparently it's related to the Ordered File Maintenance problem, and the Online Labelling problem (see e.g. [1]).

GJ mused about Fibonacci numbers, and wondered if they might crop up somewhere in this problem. He pointed out that if you have enough buckets, the cheapest way to build up large quantities of pebbles is by merging large buckets.

SC wondered whether if we knew the quantity destined to be in the end bucket, whether there was a way to find out what cost was required to fill that bucket.

RSB then started talking about one of the Hilbert problems concerning Diophantine equations, and how the technique to show that this particular Diophantine equation was unsolvable made use of Fibonacci numbers.

Define T(k) to be the cost of putting 2^k (=n) pebbles in k buckets.

Since 2^k = 2^(k-1) + 2^(k-1), one algorithm is to do the algorithm for 2^(k-1) pebbles in k-1 buckets (cost T(k-1)), then pour all those pebbles into the last bucket (cost 2^(k-1)), then do the remaining 2^(k-1) pebbles in the first k-1 buckets again (cost T(k-1)).

This gives the eqn

   T(k) = 2T(k-1) + 2^(k-1)
   T(2) = 6
to which the solution is of order k 2^k, i.e. n log n.

This gives an upper bound on T(k). This is where RSB got the 37-cost solution from, which illustrates that this can't be optimal, because we know that 36-cost solutions are possible. In any case, we suspected that this wasn't optimal because the last bucket isn't being used!

What if n isn't of the form 2^k ? What about general n and k?

A side-track discussion then focused on when n=a2^k and obtained the recurrence of

    T(k) = aT(k-1) + a^2 * 2^(k-1)
RSB wanted a good algorithm for this problem. Apparently Oege wants one too.

Overloading T such that T(k,n) is the cost of putting n pebbles into k buckets, the following recurrence eqn is arrived at:

   T(k,n) = T(k,n/2) + n/2 + T(k-1,n/2)
(Put n/2 pebbles in k buckets, merge into 1 bucket, put out remaining n/2 pebbles)

GJ improved the bound to

   T(k,n) = min { T(k,x)  + x  + T(k-1,n-x) }
             x
(same idea, but with x pebbles, not n/2)
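To see what the recurrence gives on small instances, here is a naive Haskell transcription (my own sketch, exponential without memoisation). The base cases and the option of never redistributing are my additions, and since the recurrence only considers one family of strategies it bounds the optimal cost from above:

   tt :: Int -> Int -> Int
   tt 1 n = n * (n + 1) `div` 2     -- one bucket: 1 + 2 + ... + n
   tt _ 0 = 0
   tt k n = minimum (greedy : [ tt k x + x + tt (k-1) (n-x) | x <- [1 .. n-1] ])
     where
       q = n `div` k
       r = n `mod` k
       -- cost of never redistributing at all (the formula derived above)
       greedy = (r * (q+1) * (q+2) + (k-r) * q * (q+1)) `div` 2

For instance, tt 4 16 gives an upper bound for the 16-pebble, 4-bucket instance discussed above.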

RSB then stated three versions of his problem:

Problem 1:
the optimal cost is O(f(k,n)) for what f?
Problem 2:
if the optimal cost is O(f(k,n)) and k = O(log n), then what is f?
Problem 3:
if the optimal cost is O(f(k,n)) and k = log n, then what is f?
Moving on to a fresh discourse, RSB told us that as far as lower bounds were concerned, for n pebbles in k buckets, a lower bound was
     Omega( (n log n) / (log k) )
Denote a bucket configuration by n1, n2, ..., nk, where n1 + n2 + ... + nk = n

Let C represent the "complexity" (a sort of cost measure), define

     C = sum (ni log ni)
          i
Define C_init = 0 (C_init being the complexity of the initial empty-buckets configuration)

What is C_final ? Considering the extremes where either the pebbles are evenly distributed or piled up in one bucket,

     n log(n/k)  <=  C_final  <= n log n
RSB claimed that if the cost of a move is p, then the corresponding increase in C is at most p log k

Consider adding a pebble to a bucket:

     deltaC  = (p+1) log(p+1)  - p log p

             = p log((p+1)/p)  + log(p+1)

             <=  2p

     deltaC = n log n - sum(ni log ni)
                          i
            <= n log k
Therefore each unit of cost increases C by at most about log k, so the total cost is at least about C_final / log k, giving a lower bound on the complexity of the problem.

RSB pointed out that an exact lower bound is an interesting open problem.

On another tack, JG was wondering about Fibonacci numbers and whether they have unique partitions into Fibonacci numbers. 34 can be partitioned into [8,13,13] and [5,8,21], but maybe there's a unique partition without duplicates. Maybe this helps with the buckets problem?

Discussion had dwindled off, so at this point, SC clarified reference [1] for the sake of the minutes, and then RSB started explaining the online labelling problem. You are given, one-by-one, some things to label, and the labels you give have to respect the ordering on those things.

So, for example, if you have n items and 2^n labels to play with, it's trivial, give the 1st item the middle label, then subdivide depending on whether the 2nd item is larger or smaller than the 1st, etc. etc.

Now suppose you only have kn labels. You'll need some relabelling. This can be thought of as putting the incoming items into an kn-long array, and relabelling is moving items to make space from time to time.

RSB asserted that this must be at least of order n log n and challenged us to think why.

This is the case because once you've labelled everything, you have enough information to sort without making any further comparisons between items, so you must have done at least n log n because that's a lower bound on the number of comparisons for sorting.

We then discussed the upper bound, and couldn't quite see whether we could get it lower than quadratic in n.

Having run out of ideas, we then trooped off to lunch, taking our buckets and pebbles with us.

BS volunteered to do next week's, and JG bagged the week after that.

[1] P.F.Dietz, J.I.Seiferas, J.Zhang: "A tight lower bound for on-line monotonic list labeling" Algorithm Theory -- SWAT '94 (Proceedings of 4th Scandinavian Workshop on Algorithm Theory, Lecture Notes in Computer Science 824, Springer-Verlag) (Aarhus, Denmark, July 7, 1994), pp. 131-142.

Friday, 12th March 2004
Present: RSB, JG, BS, BO

BO presented an idea for defining types for generic functions. This idea hopes to extend previous research in the area.

BO started his presentation by reviewing Generic Haskell (GH). In GH the concept of structural polymorphism brings us the ability to define a function or type by induction on the structure of types. The types of generic functions (GF) are defined as kind-indexed types in Generic Haskell. For instance, the type for equality would be:

type Eq{| * |} t = t -> t -> Bool
type Eq{| k -> v |} t = forall a. Eq{| k |} a -> Eq{| v |} (t a)
Traditional GH has some limitations due to the fact that recursive GF can only use implicit recursion. Dependency style GH (DSGH) tries to solve that problem by allowing explicit recursion. However, there are still some problems that are not addressed by DSGH.

In both GH and DSGH, the key notion behind generic definitions is the kind of a datatype. Consider for instance:

data List a = Nil | Cons a (List a)
data Tree a = Empty | Fork a (Tree a) (Tree a)
The kinds for List and Tree are the same. The consequence of this is that kind-indexed types will also be the same for those datatypes. But we know that the traditional definition of a fold does not follow this pattern.
Type  |   Kind    |   signature of fold
----------------------------------------------------------------
List  ::  * -> *  |  (a -> b -> b) -> b -> List a -> b
Tree  ::  * -> *  |  (a -> b -> b -> b) -> b -> Tree a -> b
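For concreteness, the two folds written out directly (standard Haskell, nothing GH-specific, with List and Tree exactly as declared above) show how the signatures differ even though the kinds agree:

foldList :: (a -> b -> b) -> b -> List a -> b
foldList f e Nil         = e
foldList f e (Cons x xs) = f x (foldList f e xs)

foldTree :: (a -> b -> b -> b) -> b -> Tree a -> b
foldTree f e Empty        = e
foldTree f e (Fork x l r) = f x (foldTree f e l) (foldTree f e r)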
Type-Transformers (TT) is a way to define types not only on the kinds but also on the algebraic structure of types:
TT<Type> :: Type
Consider a very simple example :
outT :: T -> F T
We could define outT signature with a TT:
out<T | T -> O> :: T -> O
O depends on the structure of T, which is expressed by "| T -> O". The body of out could be written as:
out<Data T> = out'<T, Data T | T -> O>
We would need this auxiliary function out' because we are interested in matching the recursive pattern of T; the second parameter (Data T) would correspond to that pattern. The definition of out' would be very similar to a DSGH definition. BO argued that it might even be possible to have type inference with this definition. Some of the applications for TT are: folds, the zipper, possibly design patterns, possibly parsers.

Friday, 5th March 2004
I [JG] don't have Sharon's flair for recording what went on; here is a mere shadow.

Barney talked more about his PPM work. He described an approach to selecting justifiable probabilities for symbols that have never yet been seen in a text being adaptively compressed. He uses recurrence relations for the conditional probabilities of seeing certain sequences - that is, expressions relating the probabilities of longer sequences to those of shorter ones. Specializing these to the case of relating probabilities of two-character sequences from those of one-character ones, and (if I read it right) inferring a kind of stable state, he derived a fixpoint equation for the probabilities. When the recurrences are expressed in a matrix form, the fixpoint turns out to be an eigenvector of unit eigenvalue. This is sufficient to specify (and indeed, to compute) those prior probabilities.

(Barney, my apologies if I misrepresent you. I hope you can provide a more coherent explanation!)

I talked some more about enumerating the rationals, following on from last week's presentation. Richard had observed last time that the Calkin-Wilf tree of rationals we constructed is the tree of traces of Euclid's gcd algorithm. That gives an alternative proof of the facts that every rational appears (Euclid's algorithm is total) exactly once (...and deterministic).

We had been thinking about "the traces of Euclid's algorithm on relatively co-prime pairs", since the set of rationals is isomorphic to the set of such pairs. But there is no need to restrict in this way; if you take the instrumented gcd algorithm

  igcd (m,n)
    | m==n = []
    | m<n  = False : igcd (m, n-m)
    | m>n  = True : igcd (m-n, n)
then
  m/n = m'/n'  <=>  igcd (m,n) = igcd (m',n')
so we can enumerate the rationals by enumerating all traces of igcd, whether from a coprime initial state or not. The natural arrangement of these traces is into a binary tree, of which they form the paths.

This week, David Lester observed a connection between the Calkin-Wilf tree and continued fractions.

                         ____ [1] ____
                    ____/             \____
              [0,2]                          [2]
            /       \                     /      \
      [0,3]          [1,2]        [0,1,2]          [3]
     /    \         /     \      /       \        /   \
  [0,4] [1,3] [0,1,1,2] [2,2] [0,2,2] [1,1,2] [0,1,3] [4]
(Recall that the continued fraction [a0,a1,a2,...,a{n-1}] represents the rational
  a0 + 1/(a1 + 1/(a2 + 1/(... + a{n-1})))
Each term but the first is at least 1, and the last term is not 1. These are in one-to-one correspondence with the rationals. So one can enumerate the rationals by enumerating the continued fractions. Note that the first term is the integer part, and the reciprocal is formed by either removing a leading zero, or consing a new zero.)
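A direct transcription of that reading into Haskell (my own throwaway code, assuming Data.Ratio):

  import Data.Ratio

  fromCF :: [Integer] -> Rational
  fromCF [a]    = fromInteger a
  fromCF (a:as) = fromInteger a + recip (fromCF as)

For example, fromCF [0,1,3] is 3 % 4, matching the position of [0,1,3] in the bottom row of the tree above.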

It is not hard to see how to generate this tree directly (as opposed to mapping the rational-to-cf conversion function over the CW tree, as we did in the meeting). Each node x has right child x+1 - so cf n0:ns has right child (n0+1):ns - and left child 1/(1/x + 1) - so cf 0:n1:ns has left child 0:(n1+1):ns and otherwise cf n0:n1:ns has left child 0:1:n0:n1:ns.

  data Tree a = Node a (Tree a) (Tree a)
  unfoldt :: (b->a) -> (b->b) -> (b->b) -> b -> Tree a
  unfoldt f l r b = Node (f b) (unfoldt f l r (l b)) (unfoldt f l r (r b))
  rats = unfoldt id fl fr [1]
  fl (0:n1:ns) = 0:(n1+1):ns
  fl [1]       = [0,2]          -- special case at the root, to keep cfs normalized
  fl ns        = 0:1:ns
  fr (n0:ns)   = (n0+1):ns
We did notice that the cfs on each level have the same sum, and that this sum increases from level to level. (This is easy to see from the program, as sum.fl = sum.fr = succ.sum.) We didn't twig at the time that therefore (a) Euclid's algorithm takes k steps on (m,n) where k is the sum of the cf of m/n, and (b) there are exactly 2^{k-1} normalized cfs that sum to k - two facts that I didn't know before.

We also couldn't see at the time how to generate each cf from its predecessor - and so how to deforest the tree in generating a list of rationals. However, it isn't so hard, because each of the movements in the tree (to a left or right child, and their inverses) are straightforward to represent on cfs. The program is a bit messy because of special cases for short cfs, but it takes constant time from one cf to the next:

  next [n0]                = [0, n0+1]
  next [0, 2]              = [2]
  next [0, n1]             = [1, n1-1]
  next [n0, 2]             = [0, n0, 2]
  next [n0, n1]            = [0, n0, 1, n1-1]
  next (0 : 1 : n2 : ns)   = (n2+1) : ns
  next (0 : n1 : n2 : ns)  = 1 : (n1-1) : n2 : ns
  next (n0 : 1 : n2 : ns)  = 0 : n0 : (n2+1) : ns
  next (n0 : n1 : n2 : ns) = 0 : n0 : 1 : (n1-1) : n2 : ns
This doesn't enlighten me much, though.

---------------

Since then I've thought and read a bit more. Graham, Knuth and Patashnik (S4.5) talk about a related tree, the Stern-Brocot tree:

  0/1 ____                   ____ 1/0
          \_____       _____/
                _ 1/1 _
              _/       \_
          1/2             2/1
         /   \           /   \
      1/3     2/3     3/2     3/1
     /   \   /   \   /   \   /   \
    1/4 2/5 3/5 3/4 4/3 5/3 5/2 4/1
Start with (0/1, 1/0), and between each adjacent pair (m/n, m'/n') insert a new fraction (m+m')/(n+n'). This is closely related to the CW tree we've talked about. It also relates to continued fractions, to representations of exact reals, to bit reversal permutations... in fact, most of my work over the last few years appears here in microcosm. But that's a story for another day.

Friday, 27th February 2004
Present: RSB, SC, CEM, JG, BO, BS

JG offered to show us a program for enumerating the rationals. (It later turned out that he was just concentrating on the +ve rationals.) This is from a result in [1].

JG reminded us that the usual way of enumerating the rationals is to lay them out like this

    1/1  2/1  3/1  4/1 ...
    1/2  2/2  3/2  4/2 ....
    1/3 ...
    .
    .
    .
and then to traverse them in a diagonal fashion. But there's a problem: duplicates (e.g. 2/1 and 4/2)

JG then wanted us to think about how you might do it instead, not producing duplicates.

We satisfied ourselves that we want to keep just the fractions where the numerator and denominator are coprime (gcd = 1).

We then had a brief discussion about the possibilities of merging infinite lists (as they are sorted), discarding duplicates as you go, but this was dismissed as being very awkward - we don't want to do that.

SC wanted to know whether JG had in mind generating everything and then throwing away duplicates, or just not generating them in the first place. JG didn't propose to generate duplicates in the first place.

The diagonal traversal from above can be expressed as

  filter coprime (diag [[(a,b) | a <- [1..]] | b <- [1..]])
or as
  filter coprime [(a,d-a) | d <- [2..], a <- [1..d-1]]
JG paused and looked at this for a moment. RSB accused him of thinking that this was niftier than whatever else he had in mind. JG assured us that the program he has in mind is definitely niftier, no question.

JG's hints hadn't been getting us very far, so he dropped a bigger one: trees. This is because we're trying to get a list of rationals, which is one dimensional, but the data isn't naturally arranged in 1D, it's naturally arranged in 2D. And trees are 2D....

We wondered if Pascal's triangle had anything to do with it? No, said JG, that involves sharing, and he doesn't have sharing in mind. The idea would be that we divide the (+ve) rationals into two, the left sub-tree and right sub-tree, and (breadth-first) traverse the tree to get all the rationals.

It was agreed that 1/1 was a good candidate to put at the top of the tree. What next? Random guesses from the audience put 1/2 as the left child and 2/1 as the right, which was fine, but then the guessing degenerated into values where we couldn't really see what was a good rule to use for determining the children of a node.

JG dropped another hint: we want to retain the coprime invariant. Whereupon CEM suggested a rule for children of a node, which is correct:

                          a/b
                         /   \
                        /     \
                    a/(a+b)  (b+a)/b
Thus, when a and b are coprime (gcd = 1), a+b is coprime with both a and b too, and therefore this rule for generating the tree only generates fractions in lowest terms.

RSB wondered if this had anything to do with Farey sequences [Editor: e.g. see [2] for an explanation], but did not elaborate further.

Writing out the tree that we get, leads us to

                ___ 1/1 ___
               /           \
            1/2             2/1
           /   \           /   \
        1/3     3/2     2/3     3/1
        /  \    /  \    /  \    /  \
      1/4 4/3 3/5 5/2 2/5 5/3 3/4 4/1
with the resulting generation of the rationals
  [1/1,(1/2,2/1),((1/3,3/2),(2/3,3/1)),....]
This rang a bell for some people of work with nested datatypes.

JG also pointed out a pretty pattern in the tree and drew little blue lines all over the diagram. The blue lines pointed out the equality between adjacent numbers - obvious if you look at just the last line of the tree:

     1/4 4/3 3/5 5/2 2/5 5/3 3/4 4/1

    -1 4-4 3-3 5-5 2-2 5-5 3-3 4-4 1-
Even the first and last number are the same! This means a more efficient representation - we only need to store 1 number per fraction.

But of course, as JG pointed out, there remain two big questions: 1) Is everything in the +ve rationals in the tree? 2) Are there any duplicates?

In an attempt to convince ourselves that everything was really in the tree, RSB challenged us to provide a number for him to find. SC suggested 5/7. RSB didn't like 5/7 and wanted to find 22/7 instead.

SC said that 22/7 lived at a position (going from the top of the tree "6 left then 3 right"). RSB thought it would be at "3 right then 6 left" but "6 left then 3 right" was correct. RSB then challenged SC to find 23/7, which was duly found at "1 right, 3 left, 3 right". JG pointed out that SC had been lucky with 22/7: it is a natural plus the reciprocal of a natural. Then with 25/7 we realised that we needed 4/7 for that, because going from a node to a right descendant means increasing the number by 1, but we managed to lose 4/7 and couldn't find it in the tree. Having half convinced ourselves that all the numbers were in the tree somewhere, we decided to prove it.

Doing an inductive proof that r/s is in the tree (for r,s coprime) by induction on r+s :

Base case: r+s=2 so r=s=1

Inductive case: Suppose not all the rationals are present. Then there exists a rational r/s with the least r+s out of all the ones not present. Suppose r<s, then r/(s-r) is not present either (contradiction). Suppose r>s, then (r-s)/s is not present either (contradiction).

Meanwhile, SC had thought of a constructive proof, and thus found out where 4/7 was hiding. RSB thought of it too: the path of left/rights from a fraction r/s to the root is simply the trace of the gcd algorithm for r and s

By giving a construction for where r/s is in the tree, this means that both questions are answered: not only do we know that r/s is in the tree, we also know from the uniqueness of the construction that there can't be duplicates.

Meanwhile, BO had been busy scribbling to investigate his own ideas, to do with the relationship between the sequence generated by this tree and Pascal's triangle (see note [3]).

Finally, JG wanted to get rid of the tree. Merely generating the rows of the tree wasn't quite what he had in mind, as you're still going along the rows of the tree even if the tree isn't there. Instead, if we have just generated the fraction a/b, how do we know what the next fraction is?

If a/b is a left child, this is easy, because that portion of the tree must look like this:

                       a/(b-a)
                       /   \
                      /     \
                    a/b     b/(b-a)
and thus the next element is b/(b-a).

But what if it is a right child? Then the general format is

            (a-kb)/((k+1)b-a)
             /        \
      (a-kb)/b        b/((k+1)b-a)
       .                   .
       .                   .
       .                   .
        \                 /
         a/b        b/((2k+1)b-a)
And k is just |_ a/b _|

Look! This incorporates the left child case too, because if a/b is a left child then a<b and k=0. Not only that, but it does the "end of the row" case as well, because a/1 at the end of one row is followed by 1/(a+1) at the beginning of the next, which works because b=1, k=a !!! So we end up with this program:

   rats = iterate next 1/1
   next a/b  = b/ ((2k+1)b-a)    where k = |_ a/b _|
Alternatively you can express it with a rational x, where x=a/b, to get
   rats = iterate next 1
   next x  = 1/( |_x_| + 1 - {x})
where {x} denotes the fractional part of x.
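In stock Haskell, with Rational from Data.Ratio, that last program can be transcribed directly (my wording, using properFraction for the integer and fractional parts):

   import Data.Ratio

   rats :: [Rational]
   rats = iterate next 1
     where next x = recip (fromInteger n + 1 - f)
             where (n, f) = properFraction x

As a check, take 8 rats gives [1 % 1, 1 % 2, 2 % 1, 1 % 3, 3 % 2, 2 % 3, 3 % 1, 1 % 4], the breadth-first traversal of the tree above.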

We all agreed this was very cool.

RSB got excited about the use of revlex (referring to reverse lexicographic ordering) characterizing the order in which these rationals are generated - a reverse lexicographic ordering on the trace of the gcd algorithm.

[1] Calkin, Neil, Wilf, Herbert S: Recounting the rationals, Amer. Math. Monthly 107 (2000), no. 4, 360--363.

[2] Mathworld's explanation of Farey Sequences.

[3] JG in a later email noted that:

I think you were on the right lines here, Bruno. Sequence number A064881 at the Online Encyclopedia of Integer Sequences describes the following two-dimensional sequence:

               a(n,m) = a(n-1,m/2) if m is even, else
               a(n,m) = a(n-1,(m-1)/2) + a(n-1,(m+1)/2);  a(1,0)=1, a(1,1)=2.
That is, the first row is [1,2], and each subsequent row is obtained from the previous one by interleaving that row with the list of sums of its adjacent elements.
   [ [1,2], [1,3,2], [1,4,3,5,2], [1,5,4,7,3,8,5,7,2], ... ]
   = iterate step [1,2]
where
     step xs = interleave xs (adjpairs (+) xs)
     interleave [x] [] = [x]
     interleave (x:xs) (y:ys) = x : y : interleave xs ys
     adjpairs f xs = zipWith f (init xs) (tail xs)
The rationals in the left subtree of the tree we were exploring (ie the subtree rooted at 1/2) are
   map (adjpairs (%)) (iterate step [1,2])
The right subtree is in some sense the mirror of the left subtree.

In fact,

   concat (map (adjpairs (%)) (iterate step [1,1]))
gives all the positive rationals.

Friday, 20th February 2004
Only a select few this week: JG, MA and David Lester, who is visiting the lab on sabbatical this term. I can't hope to follow Sharon's example, so at best a brief summary (and my apologies to David if I misrepresent him).

DL talked about his work on computable real arithmetic. It seems that the general consensus these days is that continued fractions, (possibly redundant) digit sequences, and compositions of homographies (aka Moebius transformations, aka linear fractional transformations) which generalize these two, are not competitive with a much more straightforward approach. This is to represent a real as a (computable) function from precision to rational approximation.

The first approach you might think of is to represent a real x by a function f of type Int -> Rational, such that

  abs (f n - x) < 2 ^ (-n)
In fact, DL has concluded that for efficient arithmetic it is better not to allow arbitrary rationals in the range of this function: better, in fact, to insist that they have 2^n as the denominator. That means you don't need to represent the denominator exactly, and instead represent x by f of type Int -> Integer, such that
  (f n - 1) / (2^n) < x < (f n + 1) / (2^n)
or equivalently
  f n - 1 < 2^n * x < f n + 1
Addition, subtraction, multiplication, maximum and minimum have fairly simple pointwise implementations. Division is a bit more complicated (you need to keep prodding the divisor until it proves itself non-zero), and transcendental operations rather trickier. Plain comparisons are uncomputable, but a three-way comparison is implementable.
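To give the flavour of such a pointwise implementation, here is my own toy sketch (not DL's code): ask each argument for two extra bits of precision, add the approximations, and round back down. The name CReal and the helper fromRationalC are mine, and only non-negative precisions are handled.

  type CReal = Int -> Integer    -- f, with  f n - 1 < 2^n * x < f n + 1

  fromRationalC :: Rational -> CReal
  fromRationalC q n = round (q * 2 ^ n)

  addC :: CReal -> CReal -> CReal
  addC f g n = (f (n + 2) + g (n + 2) + 2) `div` 4   -- sum of two within-1 approximations,
                                                     -- rescaled with rounding

Each argument is accurate to within 1 at precision n+2, so their sum is within 2, and dividing by 4 with rounding brings the error back under 1 at precision n.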

David has been occupying himself using the theorem prover PVS to prove his algorithms correct - in particular, that the functional representation always returns a value within its advertised bounds.

Friday, 13th February 2004
We discussed object-oriented models of priority queues, suitable for implementations of Dijkstra's shortest paths algorithm. No-one took minutes per se, but Michael and Jeremy sent summaries after the fact.

Michael's summary

It's intriguing how fiddly a problem this seems to be. I guess part of the problem arises from the compromises that need to be made when passing from the abstract to implementation. So perhaps an abstract model of the problem would help.
> class Ord a => PQueue q a where
>   empty :: q a
>   enqueue :: q a -> a -> q a
>   gethead :: q a -> (q a, a)
Oh, damn. We can't talk about changing the state of the entries and hence forcing a requeue without introducing state and monads and so on. I think my Haskell's too rusty for this, but maybe later on.

Ok, let's pretend we're in some kind of stateful language. I'll invent notation as I go. Let's assume all methods can be virtual (though actually only lt is required to be).

> class Orderable where
>   lt :: Orderable -> Bool
>
> class PQueue1 where
>   enqueue :: Orderable -> ()
>   gethead :: () -> Maybe Orderable
>   requeue :: Orderable -> ()
This is my offering, with the following intended semantics: a client of PQueue must implement Orderable instances which provide a mutually consistent implementation of the lt operation, and requeue must be called whenever the outcome of lt changes.

This design is a little unpleasant, as it requires a delicate interaction between the lt operation and requeue. Jeremy's suggestion provides one way to control this, but I think I'd like to offer an alternative which I think may be better as an abstract interface. In particular, note that this one treats the priority as a separate *value*, not an updatable object.

> type Priority = Orderable
>
> class PQueue2 where
>   enqueue :: (Object, Priority) -> ()
>   gethead :: () -> Maybe Object
>   requeue :: (Object, Priority) -> ()
Now here I suggest that the queue remember the priority for the object: here Object is the "ur class" with no methods. If the object also wishes to remember its priority then that is its own business. So maybe we end up with two copies of the priority, but the separation is cleaner.

With my engineering hat on I'm not at all happy about the fact that the requeue operations need to look up their arguments in a separately maintained lookup table. I have a couple of solutions to this (which I used in my work) but they involve breaking the abstraction in an important way, so I'll not develop this idea for the moment.

Now what about the user of the priority queue? We have Towns and TownDistance objects. Let's take a peek.

> class Town where ...

> class TownDistance1 inherits Orderable where
>   priority :: Real
>   town :: Town
>   lt(other :: Orderable) is
>     otherTD = cast other as TownDistance1
>     return this.priority < otherTD.priority
Now some code to change priority of towns
> td :: TownDistance1, queue :: PQueue1
> td.priority = basepri
> queue.enqueue(td)
  ...
> newpri = computation
> td.priority = newpri
> queue.requeue(td)
  ...
> td = cast queue.gethead() as TownDistance1
Let's try the same thing with PQueue2. I'll assume inheritance from Object.
> class TownDistance2 where
>   priority :: Real
>   town :: Town

> queue :: PQueue2
> td.priority = basepri
> queue.enqueue(td, basepri)
   ...
> td.priority = newpri
> queue.requeue(td, newpri)
This is still pretty abstract, but turning this into real code (modulo actually writing the PQueue class, which will actually require programming) should be straightforward.

Hmm. Not sure I've really captured the problem here. The really interesting part of the problem is how we avoid the cost of a strictly redundant lookup when we call requeue, but I think the strong consensus was that this is a fracture of abstraction too far!

I think I'm going to have to brush up my "stateful" Haskell: can anybody point me to some simple code examples of stateful programming with Haskell monads?
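(One minimal example of the kind of thing Michael asks for, not from the meeting: a priority queue kept as a sorted association list and threaded through the State monad, using Control.Monad.State from the mtl package. The names are made up.)
> import Control.Monad.State    -- from the mtl package
>
> type PQ = [(Int, String)]     -- (priority, payload), smallest priority first
>
> enqueueS :: Int -> String -> State PQ ()
> enqueueS p x = modify (\q -> let (lo, hi) = span ((<= p) . fst) q in lo ++ (p, x) : hi)
>
> dequeueS :: State PQ (Maybe (Int, String))
> dequeueS = do q <- get
>               case q of
>                 []     -> return Nothing
>                 e : es -> put es >> return (Just e)
>
> demo :: (Maybe (Int, String), PQ)
> demo = runState (enqueueS 3 "Oxford" >> enqueueS 1 "London" >> dequeueS) []
> -- demo == (Just (1,"London"), [(3,"Oxford")])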

Jeremy's summary

Town is a simple abstraction, completely independent of priority queues.
    import java.util.Vector;
    import java.util.Iterator;

    public class Town {

        protected String name;
        protected Vector roads;

        public Town (String name) { 
            this.name = name; 
            roads = new Vector();
        }

        public void addRoad(Road r) { 
            roads.add(r); 
        }

        public Iterator exits() { 
            return roads.iterator(); 
        }

        public String toString() { 
            return name; 
        }
    }
Likewise roads:
    public class Road {

        protected Town from, to;
        protected int length;

        public Road (Town from, Town to, int length) {
            this.from = from; this.to = to;
            from.addRoad(this); to.addRoad(this);
            this.length = length;
        }

        public int getLength() { 
            return length; 
        }

        public Town otherEnd(Town thisEnd) {
            if (thisEnd == from) 
                return to;
            else if (thisEnd == to)
                return from;
            else
                throw new IllegalArgumentException();
        }
    }
Here's the interface for priority queues.
    public interface PQueue {

        /* is queue empty? */
        public boolean isEmpty();

        /* whether queue contains object (a query) */
        public boolean contains(Object o);

        /* highest-priority object (a query; queue is non-empty) */
        public Object highest();

        /* priority of given object (a query; assumed to be present) */
        public int priorityOf(Object o);

        /* enqueue the object o (assumed not already present) with priority p */
        public void enqueue (Object o, int p);

        /* reorder the queue, to accommodate o's increased priority p */
        public void requeue (Object o, int p);

        /* dequeue the highest-priority object (queue is non-empty) */
        public Object dequeue ();

    }
I dispensed with PrioritizedObjects and the Comparable interface. I was wrapping Towns up as PrioritizedObjects before inserting them into some collection; then it is awkward to find the corresponding PrioritizedObject when you need to update the priority of some Town neighbouring another you have just completed. I found it easier to keep the Object and the priority separate in the PQueue abstraction. The only problem then is that dequeueing ought to return a pair, an Object and its priority. I could have defined a Pair type (ie PrioritizedObject!), but instead I have three operations in the interface: two queries (yielding the first Object, and the priority of any Object in the queue) and one removal.

Here's a simple implementation of the PQueue interface.

    import java.util.Hashtable;
    import java.util.Map;
    import java.util.Iterator;

    public class UnorderedPQueue implements PQueue {

        protected Map entries;

        public UnorderedPQueue() { 
            entries = new Hashtable(); 
        }

        /* is queue empty? */
        public boolean isEmpty() {
            return entries.isEmpty();
        }

        /* whether queue contains object (a query) */
        public boolean contains(Object o) { 
            return entries.containsKey(o); 
        }

        /* highest-priority object (a query; queue is non-empty) */
        public Object highest () {
            Object highest = null; int p = 0, q;
            for (Iterator i = entries.keySet().iterator(); i.hasNext();) {
                Object current = i.next();
                q = priorityOf(current);
                if (highest==null || p > q) {
                    highest = current; p = q;
                }
            }
            return highest;
        }

        /* priority of given object (a query; assumed to be present) */
        public int priorityOf(Object o) {
            return ((Integer)entries.get(o)).intValue();
        }

        /* enqueue the object o with priority p */
        public void enqueue (Object o, int p) { 
            entries.put(o, new Integer(p)); 
        }

        /* reorder the queue, to accommodate o's increased priority p */
        public void requeue (Object o, int p) { 
            enqueue(o, p); // overwrites old priority if present
        }

        /* dequeue the highest-priority object (queue is non-empty) */
        public Object dequeue () {
            Object highest = highest();
            entries.remove(highest);
            return highest;
        }

    }
The queue is stored as a finite mapping from entries to priorities. Finding the highest-priority entry involves a linear search. Inspecting then removing the highest-priority element involves two searches; you'd probably want to maintain a private reference to that element too.

Here's a slightly more sophisticated implementation.

    import java.util.Hashtable;
    import java.util.Map;
    import java.util.Vector;
    import java.util.Iterator;

    public class OrderedPQueue implements PQueue {

        protected Vector entries;
        protected Map priorities;

        public OrderedPQueue() { 
            entries = new Vector(); 
            priorities = new Hashtable();
        }

        /* is queue empty? */
        public boolean isEmpty() {
            return entries.isEmpty();
        }

        /* whether queue contains object (a query) */
        public boolean contains(Object o) { 
            return entries.contains(o); 
        }

        /* highest-priority object (a query; queue is non-empty) */
        public Object highest () {
            return entries.elementAt(0);
        }

        /* priority of given object (a query; assumed to be present) */
        public int priorityOf(Object o) {
            return ((Integer)priorities.get(o)).intValue();
        }

        /* enqueue the object o with priority p */
        public void enqueue (Object o, int p) { 
            priorities.put(o, new Integer(p)); 
            int i = entries.size();
            while (i>0 && priorityOf(entries.elementAt(i-1))>p) i--;
            entries.add(i,o);
        }

        /* reorder the queue, to accommodate o's new priority p */
        public void requeue (Object o, int p) { 
            priorities.put(o, new Integer(p)); // overwrites if already present
            int i = entries.indexOf(o);
            if (i>0) {
                entries.remove(o);
                while (i>0 && priorityOf(entries.elementAt(i-1))>p) i--;
                entries.add(i,o);
            } // else already highest priority
        }

        /* dequeue the highest-priority object (queue is non-empty) */
        public Object dequeue () {
            Object highest = highest();
            entries.remove(highest);
            return highest;
        }

    }
This time the queue is stored as a Vector of entries (in order of decreasing priority), and a Hashtable mapping entries to priorities. Now enqueuing involves a linear search from the end of the Vector to the correct position, and requeuing involves some search for the current position plus a linear search from the old position to the new position. If the standard implementation indexOf of the search for the current position isn't good enough, you could keep a second mapping from entries to positions in the Vector.

Here's a simple main program. (This is "structured programming in Java": all features are static.)

    import java.util.Iterator;
    import java.util.Map;
    import java.util.Hashtable;

    public class Routeplanner {

        protected static Town[] towns = { new Town("London"),      // 0
                                          new Town("Oxford"),      // 1
                                          new Town("Birmingham"),  // 2
                                          new Town("Manchester"),  // 3
                                          new Town("Nottingham"),  // 4
                                          new Town("Carlisle"),    // 5
                                          new Town("Newcastle"),   // 6
                                          new Town("Edinburgh") }; // 7
        protected static Road[] roads = { new Road(towns[0],towns[1],56),
                                          new Road(towns[1],towns[2],68),
                                          new Road(towns[2],towns[3],87),
                                          new Road(towns[0],towns[4],131),
                                          new Road(towns[2],towns[4],53),
                                          new Road(towns[3],towns[5],120),
                                          new Road(towns[4],towns[6],162),
                                          new Road(towns[5],towns[6],59),
                                          new Road(towns[5],towns[7],99),
                                          new Road(towns[6],towns[7],108) };

        public static void main(String[] args) {
            Map results = new Hashtable();
            PQueue pqueue = new OrderedPQueue();
            pqueue.enqueue(towns[0],0);
            while (!pqueue.isEmpty()) {
                Town here = (Town)pqueue.highest();
                int distance = pqueue.priorityOf(here);
                pqueue.dequeue();
                results.put(here, new Integer(distance));
                for (Iterator i = here.exits(); i.hasNext(); ) {
                    Road road = (Road)i.next();
                    Town there = road.otherEnd(here);
                    if (pqueue.contains(there)) {
                        if (pqueue.priorityOf(there) > distance+road.getLength())
                            pqueue.requeue(there, distance+road.getLength());
                    }
                    else if (! results.containsKey(there)) {
                        pqueue.enqueue(there, distance + road.getLength());
                    }
                }
            }
            for (Iterator i = results.keySet().iterator(); i.hasNext(); ) {
                Town here = (Town)i.next();
                System.out.println("Distance from " + towns[0] + 
                                   " to " + here + " is " + results.get(here));
            }
        }
    }

Friday, 6th February 2004
Present: SC, CEM, JG, BO, BS

SC has recently received a new book on cake cutting algorithms [1], and brought it along especially to show RSB, who had asked to see it. But he wasn't there, so instead she talked about two algorithms for dividing a cake between n people.

The first was called the moving knife protocol and the second was a divide and conquer algorithm.

For each of the following algorithms, it is assumed that the cake is represented by the unit interval [0,1].

In the moving knife protocol, a knife is moved from left to right across the cake until someone shouts "stop!". The knife then stops and the cake is cut at that point. The person who shouted is given the piece of cake to the left of the cut, and the process is repeated for the remaining n-1 people and the rest of the cake until there is only one person left. If several people shout simultaneously, then one of them is chosen at random to receive the piece of cake.

To illustrate, we tried this out with a whiteboard diagram instead of a real cake.

The book [1] describes this as not being a finite algorithm, because the participants need to evaluate the position of the knife continuously, but SC wasn't sure about this: you could just decide in advance where you thought a fair first cut would be, and shout out once the knife had got past that point. That results in n + (n-1) + ... + 2 evaluations made by the participants during the execution of the protocol, which is quadratic.

SC then explained the divide and conquer algorithm as [1] presents it, but the assembled throng reworded it to make it a lot nicer. This is the reworded explanation:

The divide and conquer algorithm is easiest to explain for the case when n = 2^k for some k, but it generalises to any n. So, first suppose n = 2^k. Each of the n people must mark the position on the cake that they think is halfway. The cake is then cut anywhere between the two middle marks. The 2^(k-1) people whose marks were to the left of the cut are then assigned the left half of the cake, and the rest are given the right half of the cake. The process is then repeated for each half of the cake until everyone is left with just one piece.

In the general case, each person is asked to mark the cake in a position that divides it in the ratio m : (n-m) for some m < n. The cake is then cut anywhere between the mth and (m+1)th mark. The m people whose marks were to the left of the cut are then assigned the left half of the cake, and the rest are given the right half of the cake. The process is then repeated for each half of the cake until everyone is left with just one piece.
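Here is a minimal Haskell sketch of the reworded algorithm for the n = 2^k case (my own rendering, not from the book); each participant is modelled simply by a function saying where they would mark the midpoint of a given sub-cake.

    import Data.List (sortOn)

    type Cake   = (Double, Double)
    type Person = Cake -> Double     -- where this person would mark the midpoint of a sub-cake

    -- divide a cake among 2^k people: cut midway between the two middle marks, recurse on each half
    share :: [Person] -> Cake -> [(Person, Cake)]
    share [p] cake        = [(p, cake)]
    share ps  cake@(l, r) = share (map fst lefts) (l, cut) ++ share (map fst rights) (cut, r)
      where marked          = sortOn snd [ (p, p cake) | p <- ps ]
            m               = length ps `div` 2
            cut             = (snd (marked !! (m-1)) + snd (marked !! m)) / 2
            (lefts, rights) = splitAt m marked

    -- e.g. someone who values cake uniformly marks the geometric midpoint;
    -- share (replicate 4 uniform) (0, 1) then divides the cake into quarters
    uniform :: Person
    uniform (l, r) = (l + r) / 2

For general n, each round would instead ask for marks dividing the remaining cake in the ratio m : (n-m), as described above.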

It was observed that the first protocol was a special case of the second. JG said that they have different complexity: the moving knife protocol is quadratic (the n + (n-1) + ... + 2 evaluations counted above), whereas the divide and conquer protocol is O(n log n), since each of the O(log n) rounds asks every participant for one mark.

We noticed some similarities between these protocols and sorting algorithms, in particular quicksort, which is O(n log n) if you have a good pivot point, but quadratic in the worst case. We wondered whether they could be shown to be unfolds in the same way that sorting algorithms have been.

SC then talked about the Dawson team sharing protocol, a description of which can be found in [2]. It's an example used in a paper that CEM/SC/IR have submitted to MPC [3]. These resource division protocols look like a nice source of examples for the multirelational calculus (see minutes from Jan 23rd 2004).

Minutes by CEM+SC.

[1] Jack Robertson, William Webb (1998): Cake Cutting Algorithms: Be Fair If You Can. A K Peters.

[2] Ivan Petersen: Toward a Fairer Expansion Draft.

[3] C. E. Martin, S. A. Curtis, I. Rewitzky: Modelling Nondeterminism.

Friday, 23rd January 2004
Present: RSB, SC (minute-taker), CEM, IR, BS. Welcome to our guest today, Ingrid Rewitzky (IR).

CEM took today's meeting, and started talking about set-valued relations (also known as multirelations).

These are equivalent to monotonic predicate transformers, but have the advantage that they can model programs going forwards rather than backwards, and they provide a more powerful domain than ordinary relations (they can express both angelic and demonic nondeterminism).

CEM explained that she'd originally been thinking of functional programs, and that Nigel Ward had done some work on this and developed a refinement calculus [1]. With a refinement calculus, you get specifications, and laws about them. The laws get proved in the predicate transformer world, so you can't do things from first principles.

This is where the set-valued relations/multirelations come in [2]. You can work with these and refine to what you want, rather than have to work in a separate calculus.

RSB wanted an example. CEM mentioned vending machines, and then went on to give some definitions:

A set-valued relation/multirelation is of type (writing |P for powerset)

   |P (A x |P B)
With an ordinary relation R,
   x R y
means that for input x, the output can be y. But for set-valued relations/multirelations,
   x M post
means that for input x, the output can satisfy the predicate post (predicate represented as a set).

The idea is that the angel can choose the predicate that the output is to satisfy. For example, suppose the angel has the choice of predicates

   { x-1 }
   { x-2, x+2 }
If the angel chooses {x-1}, then the demon has no choice. But if the angel chooses { x-2, x+2 }, then the demon gets to choose which of x-2 or x+2 is output.

CEM pointed out that a condition is necessary for these predicates/postconditions: if x M post and post subseteq post' (that is, post => post'), then it must also be true that x M post'. Therefore these set-valued relations/multirelations have to be up-closed. An up-closed multirelation from type A to type B is written

          ->
    M : A -> B
RSB vociferously objected at this point to things going from left to right. He wrote that we have
     y = f x
     y <- R x
     post <- M x
so it is more natural to go from right to left, but of course with a program, you'd put the postcondition on the right. SC asserted that this is why multirelations go from left to right, and RSB quietened down.

CEM then pointed out something crucial: the angel ONLY gets to choose from the STRONGEST postconditions. (This is necessity, not convenience.) This is key to understanding them.

An example: the identity multirelation is

                       ->
               in  : A -> A
                 A
Angelic choice is just |_| (union); demonic choice is just |^| (intersection).

For example,

                       ->
           const a : A -> B

           const a = A x |`{a}
(upclosure is |` )

Then

   const a |^| const b  = (A x |`{a}) |^| (A x |`{b})
                        = A x |`{a,b}
After some discussion about notation, composition (;;) for multirelations was defined as follows:
   x (M ;; N) X
   <=>
     (exists Y :: x M Y  /\  (forall y : y in Y : y N X))
RSB mentally checked that M;;id = M (it is).

CEM then gave a Nim example, describing a single round. Nim is a game where players pick up matches, and in this example, players are allowed to pick up 1 or 2 matches, and the player to pick up the last match loses.

Defining

     x sub1 X  <=>  {x-1} subseteq X
and sub2 similarly, we have (regarding our player as the angel and the opponent as the demon)
     player = sub1 |_| sub2

     opponent = sub1 |^| sub2
Thus
     round = player ;; opponent
From these definitions, you can calculate that
   x round X  <=>  {x-2,x-3} subseteq X
                   \/ {x-3,x-4} subseteq X
Alternatively... using these lifting operations for R : A -> B
   Demonic lifting       Angelic lifting

           ->                    ->
   [R] : A -> B          <R> : A -> B


   x[R]X <=> (forall y : xRy : y in X)

   x<R>X <=> (exists y : xRy : y in X)
Then alternative definitions are
    player = <move>

    opponent = [move]
RSB wanted to know whether <R;S> = <R>;;<S>? Yes. And [R;S] = [R];;[S] too.
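As a sanity check (not something done in the meeting), the calculations above can be replayed in Haskell by representing an up-closed multirelation by the list of its minimal guarantee-sets: the angel picks one of the inner lists, and the demon then picks an element of it. The representation and all the names below are my own.

    import Data.List (nub, sort)

    type Rel a b      = a -> [b]     -- ordinary nondeterministic relation
    type MultiRel a b = a -> [[b]]   -- the minimal guarantee-sets the angel may choose from

    -- angelic choice |_| and demonic choice |^|
    angelic, demonic :: MultiRel a b -> MultiRel a b -> MultiRel a b
    angelic m n x = m x ++ n x
    demonic m n x = [ ys ++ zs | ys <- m x, zs <- n x ]

    -- composition ;; : pick Y from M x, then a guarantee-set from N y for each y the demon might pick
    compose :: Eq c => MultiRel a b -> MultiRel b c -> MultiRel a c
    compose m n x = [ nub (concat zss) | ys <- m x, zss <- sequence [ n y | y <- ys ] ]

    -- the liftings <R> and [R]
    angelicLift, demonicLift :: Rel a b -> MultiRel a b
    angelicLift r x = [ [y] | y <- r x ]    -- <R>: the angel resolves R's nondeterminism
    demonicLift r x = [ r x ]               -- [R]: the demon resolves it

    -- the Nim round
    move :: Rel Int Int
    move x = [x-1, x-2]

    player, opponent, nimRound :: MultiRel Int Int
    player   = angelicLift move             -- = sub1 |_| sub2
    opponent = demonicLift move             -- = sub1 |^| sub2
    nimRound = compose player opponent

    -- map sort (nimRound 10) == [[7,8],[6,7]], i.e. {x-2,x-3} or {x-3,x-4}, as calculated above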

RSB pondered on the expressing of the lifting operators in ordinary relations, and wrote up

    <R> = R ; in

    [R] = R / ni
but he knows the latter is wrong, and needs rewriting bearing in mind the definition of the quotient operator for relations.

Going on to talk about refinement, CEM talked about Back and von Wright's [3] two ways of doing refinement:

    R sqsubseteq_A S  <=>  R subseteq S

    R sqsubseteq_D S  <=>  R supseteq S
These are used for multirelations too (although not both at once).

CEM talked about guarantees for the angel, and guarantees for the demon. An angelic correctness condition is a predicate that the angel can guarantee to fulfil, whatever the demon does. So saying that post is an angelic correctness condition for input x just means

     x M post
For the demon it's more difficult. The dual of M gives the correctness conditions for the demon, and is defined as
     x M^o X  <=>  ~(x M X^c)          (where X^c is the complement of X)
In other words the demon is guaranteed to be able to fulfil any predicates that the angel can't guarantee to prevent.

There then followed a vigorous discussion on refinement and correctness.

When refining, you stop when all of one type of choice is removed, so that your multirelation is either angelic (has only angelic nondeterminism in it) or demonic (has only demonic nondeterminism in it).

Definitions of angelic and demonic (using an indexed set of sets Z):

Demonic  x M (|^| Z)  <=>  (forall X : X in Z : x M X)
Angelic  x M (|_| Z)  <=>  (exists X : X in Z : x M X)
Then SC got up to talk about a simple cake cutting protocol, known as "I cut, you choose".

Modelling a cake as the unit interval I = [0,1] and cake slices as sub-intervals [x,y], we want to divide the cake fairly between two participants. Each participant has a different notion of value of cake slices, so let

   value1, value2 : I x I -> I
where the whole cake is valued at 1, empty slices are valued at 0, and value1 and value2 are both additive (if a cake is sliced up, the values of the slices must sum to the value of the whole cake).

Taking the cake as given, rather than an input (a "global cake"), cut : 1 -> I can be defined as

     cut = 1 x I
ie for the (null) input, any position p in the interval [0,1] can be returned as a position for the knife cut. The choosing relation can be defined as choose: I -> I x I
    p choose i   <=>  i=[0,p] \/ i=[p,1]
i is the slice received by the non-cutter. Lifting this to multirelations, the protocol can be described as
    <cut> ;; [choose]
SC then went on to talk about correctness conditions. Demonic correctness (for the non-cutter) is
    value2 i >= 0.5
because the non-cutter/demon wants at least half the cake (by his evaluation). This corresponds to the notion of "fair division" for such protocols.

Angelic correctness (for the cake cutter) is

    value1 i <= 0.5
because i is the slice that the angel doesn't get.
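The protocol and both correctness conditions can be checked with the little Haskell sketch above (reusing Rel, MultiRel, compose, angelicLift and demonicLift from it) on a coarse grid of cut positions; the grid, the length-based valuations and the names are my own assumptions.

    type Slice = (Rational, Rational)               -- a slice [x,y] of the unit-interval cake

    cut :: Rel () Rational
    cut () = [ k / 10 | k <- [0..10] ]              -- the cutter may cut at any grid position

    choose :: Rel Rational Slice
    choose p = [ (0, p), (p, 1) ]                   -- the non-cutter takes either piece

    protocol :: MultiRel () Slice
    protocol = compose (angelicLift cut) (demonicLift choose)

    -- suppose, purely for the check, that both participants value cake by length
    value1, value2 :: Slice -> Rational
    value1 (x, y) = y - x
    value2        = value1

    -- angelic correctness: the cutter can guarantee value1 i <= 1/2 for the slice i he doesn't get
    fairForCutter :: Bool
    fairForCutter = or [ all (\i -> value1 i <= 1/2) ys | ys <- protocol () ]    -- True: cut at 1/2

    -- demonic correctness: whatever the cut, the non-cutter can secure a slice with value2 i >= 1/2
    fairForChooser :: Bool
    fairForChooser = and [ any (\i -> value2 i >= 1/2) ys | ys <- protocol () ]  -- True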

RSB was a bit put out by this, because he was expecting the correctness conditions to be included as part of a specification. SC replied that in a sense, the <cut> ;; [choose] isn't a specification, it's the finished protocol (which does happen to satisfy these correctness conditions). If you wanted a specification from scratch, it would look something more like

    Spec = { (1,X) | (*) subseteq X }
where (*) = value1 i <= 0.5 /\ value2 i >= 0.5

SC also explained that, this calculus being new, the best way to manipulate specifications, protocols, games and correctness conditions is still evolving. You can further refine (demonically) this protocol to get rid of the angelic choice, to describe a procedure for the angel to follow. Alternatively you can refine (angelically) this protocol to get rid of the demonic choice, to describe a procedure for the demon to follow.

CEM/SC/IR are putting a paper into MPC (Mathematics of Program Construction) about multirelations.

RSB wanted to cut more cake, for n participants, and was interested to hear that there is a book on cake cutting algorithms [4], which SC has on order. It reminded him of the book on art gallery theorems [5].

[1] Ward, N. T. E. (1994): A Refinement Calculus for Nondeterministic Expressions

[2] Rewitzky, I. (2003): Binary Multirelations, Lecture Notes in Computer Science 2929 : 259-274

[3] Back, R. J. R. and von Wright, J. (1998): Refinement Calculus: A Systematic Introduction, Springer-Verlag.

[4] Jack Robertson, William Webb (1998): Cake Cutting Algorithms: Be Fair If You Can, A K Peters.

[5] Joseph O'Rourke (1987): Art Gallery Theorems and Algorithms (International Series of Monographs on Computer Science) OUP.

Friday, 9th January 2004
Present: MA, RSB, SC(minute-taker), JG, BO, BS

BO took today's meeting, and had prepared slides. These minutes accompany the slides.

[Slide 1/21] BO explained that the title was referring to generic programming with containers, not "generic containers".

[Slide 4/21] Generic Programming in Haskell. The assembled throng wanted to know what GFs and DTTs were. BO explained that these refer to generic functions and datatype transformers, and then proceeded to give us examples.

An example of a generic function for equality (actually just its type declaration):

   eq <T1,T2> :: T1 -> T2 -> Bool
So eq is a function parametrized by types.

There then followed a discussion about the definition of eq.

JG pointed out that there are two main ways you can look at defining equality for general (different) types: you can either traverse the structure of the type definitions and see if they are the same, or you can look at the structure and also the constructor names.

RSB wanted to know if he would be allowed to define it using the types int and char (since C does that in any case!). Yes. So you could have

   eq <Int,Int> x y = (x==y)
   eq <Int,Char> x y = (x == ord y)
This illustrates the difference between parametric and ad-hoc polymorphism.
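As a rough illustration (my own sketch; the talk's notation is language-neutral), the eq example can be approximated in Haskell by a two-parameter type class:

    {-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}
    import Data.Char (ord)

    class Eq2 a b where
      eq :: a -> b -> Bool

    instance Eq2 Int Int where
      eq x y = x == y                 -- the parametric-looking case

    instance Eq2 Int Char where
      eq x y = x == ord y             -- the ad-hoc case RSB asked about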

Going on to datatype transformers, here's an example

   phi : Type -> Type
where "Type" is written to mean that values of type "Type" are in fact types

Or more generally,

   phi : * -> Type
The idea is that you want to be able to do things with datatype transformers that you can't do with generic functions.

RSB wanted to know whether BO wanted to make types 1st class citizens. BO replied that yes, he did.

[Slide 5/21] Scrap your boilerplate. BO went on to talk about Ralf Lämmel and Simon Peyton Jones' paper Scrap Your Boilerplate: A Practical Design Pattern for Generic Programming, mentioning a quote from it: "Trying to understand gfoldl's type signature can cause brain damage."

We had a short attempt and then gave up.

[Slide 7/21] Polynomial datatypes. BO gave us an example.

  data List a  =  Nil | Cons a (List a)

  Y = Nil + Cons  = [Nil,Cons]

  Phi = 1 + a

  Theta = 1 + List a

[Slide 11/21] Polynomial Datatypes - Visitors. Continuing with the above example, BO wrote up

            out
               T
   List a ----------->  1 + a * (List a)

   value :: List a -> 1 + a
   succs :: List a -> 1 + List a
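One concrete reading of these operations (my own sketch, using Maybe for the "1 + ..." sums):

    data List a = Nil | Cons a (List a)

    out :: List a -> Maybe (a, List a)     -- out_T : List a -> 1 + a * (List a)
    out Nil         = Nothing
    out (Cons x xs) = Just (x, xs)

    value :: List a -> Maybe a             -- List a -> 1 + a
    value = fmap fst . out

    succs :: List a -> Maybe (List a)      -- List a -> 1 + List a
    succs = fmap snd . out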
MA and JG then had a discussion about type-checking in compiled versus interpreted programs.

Afterwards, BO gave the example of B-trees to explain why the c* - sometimes you have 0 parameters, sometimes 2.

BO finished by talking about what he hoped to work on and achieve in the future.


Sharon Curtis (sharoncurtis@brookes.ac.uk)
