Minutes of 2004 Meetings of the Algebra of Programming Research Group

Michaelmas Term 2004 (all Fridays)
Trinity Term 2004 (all Fridays)
Hilary Term 2004 (all Fridays)
- Friday, 10th December 2004
(Minutes by JG)
Michael Abbott talked today, about what he called "omega-sets". An
omega-set is a pair ( X, |= ) where |= is a relation between the naturals
and X such that
for all x in X, there exists n in N with n |= x;
n is said to "witness" x. He made this into a category, in which the
objects are omega-sets, and the arrows are functions between the
underlying sets which are themselves witnessed by numbers. In particular,
he chose to witness each (computable, I guess) function by its Goedel
number.
Michael is interested in this structure, because it gives a rare example
of a setting with considerable expressive power ("a model of Martin-Löf
type theory"), but nevertheless without the usual existence of infinite
limits and colimits.
Apologies if that's a bit garbled; I came in after Michael started and
left before he finished!
- Friday, 3rd December 2004
Richard came with a problem:
(1) Let @ be an associative operator with an irreducible unit e. (That is,
not only is x@e=x=e@x, but also x@y=e => x=y=e.)
(2) Define <= by
x <= y <=> exists z . x @ z = y
(3) Richard claimed that <= is then a partial order, with least element e.
(4) Define + by
(x + y) <= z <=> (x <= z) && (y <= z)
(5) Richard claimed that + is then associative, commutative, and has
identity e.
(6) Richard wondered whether @ then distributes over +.
---------------
Examples include:
- the naturals, with @ being addition. Then the ordering is the natural
ordering, and + is max. Indeed, addition distributes over max.
- the naturals, with @ being multiplication. Then the ordering is by
division, and + is least common multiple. Indeed, multiplication
distributes over lcm.
- strings, with @ being concatenation. Then the ordering is the prefix
ordering, and + is longest common prefix. Concatenation distributes (from
the left) over lcp.
- left-closed, right-open subintervals of the unit interval, with @
being the "narrow" operation from arithmetic coding. Then the ordering is
interval containment, and + is interval intersection. (I think that must
be partial: what is the intersection of disjoint intervals?) Does
narrowing distribute over intersection? Too late in the afternoon for me
to verify...
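As a sanity check, here is a small QuickCheck property (my own, not from
the meeting) for the first example: with @ as addition on the naturals,
the induced + is max, and @ should distribute over it.

import Test.QuickCheck

-- For naturals with @ = (+), the induced order is the usual one and the
-- induced + is max; distributivity says x + max y z = max (x+y) (x+z).
prop_distrib :: NonNegative Integer -> NonNegative Integer
             -> NonNegative Integer -> Bool
prop_distrib (NonNegative x) (NonNegative y) (NonNegative z) =
  x + max y z == max (x + y) (x + z)

main :: IO ()
main = quickCheck prop_distrib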
---------------
After some discussion, we were impressed by the examples, but weren't sure
of some of the theorems. In particular, the least upper bound operator +
is not always completely defined by (4), and we thought the ordering
introduced in (2) is not always a partial order.
- Friday, 26th November 2004
Present: SC, CEM, GJ, JG, BS
JG had been talking to Graham Hutton (GH) about the stable marriage
problem. In the stable marriage problem you have a number of
men, the same number of women, and you want to marry them all
off. The matching is stable when there is no man and woman, not
married to each other, who would elope together because each
prefers the other to their own spouse.
For example, say we have men 1 2 3 4, women a b c d
The rankings <m for the men could be
___________
1 | b c a d
2 | c b a d
3 | b a c d
4 | b c a d
The rankings <w for the women could be
___________
a | 2 1 4 3
b | 1 3 4 2
c | 4 2 3 1
d | 1 3 4 2
A pair of marriages (m,w) (m',w')
is unstable iff (for w, m' the elopees)
m' <w m and w <m' w'
(i.e. w prefers m' to m, and m' prefers w to w')
The challenge is to find a stable matching.
It's not obvious that there exists a stable matching, but
one does exist.
It was also noted that you can get cycles of unstable
matchings, where one pair of elopers gives rise to another
pair of elopers, which ....
Both JG and GH had sat down and written programs to do this,
but they were completely different programs.
JG asked whether we wanted a specification or a program.
We chose a specification.
filter stable societies
societies = map (zip men) (perms women)
This is asymmetrical with respect to men/women, so you could
have vice versa.
stable = all pairStable . pairs
pairs xs = [ (x,y) | x <- xs, y <- xs, x /= y ]
There was some concern over the type checking of the pairs (x,y)
and how you'd know that x and y were different sexes, but it
was pointed out that the (x,y) pairs were pairs of marriages not people
so you didn't need to know the sex of x and y. Or, as GJ put it,
"Marriages have no sex."
pairUnstable (m,w) (m',w') = (m' <w m && w <m' w')
                          || (m <w' m' && w' <m w)
Not that we can write (x,) , but still ...
pairs (x:xs) = map (x,) xs ++ pairs xs
pairs [] = []
"It's a fold."
"No it isn't."
"No it isn't."
"It's a paramorphism."
"So it is."
---------------
JG said this was related to Dijkstra's shortest paths algorithm,
and also to a coupon-collecting problem, where you have boxes each
containing 1 of n coupons, and you want to know how many boxes you
should collect to get the full set of coupons.
"Can you see inside the boxes?"
"No, you can't go rummaging around opening all the boxes in the
supermarket."
pairs = concat . map tailpairs . tails
tailpairs (x:xs) = map (x,) xs
tailpairs [] = []
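For the record, here is a runnable rendering of the specification
(the helper names and the ranking representation are my own, not from
the meeting):

import Data.List (permutations, tails)

-- mprefs m and wprefs w list potential partners in decreasing order of
-- preference; "prefers ranking x y" says x comes strictly earlier.
prefers :: Eq a => [a] -> a -> a -> Bool
prefers ranking x y =
  length (takeWhile (/= x) ranking) < length (takeWhile (/= y) ranking)

stableMatchings :: (Eq m, Eq w)
                => [m] -> [w] -> (m -> [w]) -> (w -> [m]) -> [[(m, w)]]
stableMatchings men women mprefs wprefs =
  filter stable (map (zip men) (permutations women))
  where
    stable = all pairStable . pairs
    pairs xs = [ (p, q) | p:ys <- tails xs, q <- ys ]
    pairStable (pq, pq') = not (unstable pq pq') && not (unstable pq' pq)
    unstable (m, w) (m', w') =          -- w and m' would elope
      prefers (wprefs w) m' m && prefers (mprefs m') w w'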
Moving over to the other whiteboard, JG went through the
Gale-Shapley [1] algorithm for stable marriages.
Round 1. All men propose to their most favoured women.
All women now have a (possibly-empty) collection of
proposals. Each chooses the best on offer (if they can).
___________
1 |*b c a d
2 |*c b a d
3 |*b a c d
4 |*b c a d
Proposals: 1,b 2,c 3,b 4,b
Accepted: 1,b 2,c
Round 2. All unattached men now make a proposal to their
second-favoured woman. Each woman with fresh proposals
chooses the best of those (this may involve dumping
a fiance).
___________
1 |*b c a d
2 |*c b a d
3 |*b*a c d
4 |*b*c a d
Proposals: 3,a 4,c
Accepted: 3,a 4,c (c dumps 2) 1,b still affianced
Round 3. All unattached men now make a proposal to their next
favoured woman that they haven't proposed to yet (this is
how all rounds can be described).
___________
1 |*b c a d
2 |*c*b a d
3 |*b*a c d
4 |*b*c a d
Proposals: 2,b (not accepted)
Round 4.
___________
1 |*b c a d
2 |*c*b*a d
3 |*b*a c d
4 |*b*c a d
Proposals: 2,a (accepted, a dumps 3)
Now have 1,b 2,a 4,c
Round 5.
___________
1 |*b c a d
2 |*c*b*a d
3 |*b*a*c d
4 |*b*c a d
Proposals: 3,c (not accepted)
Round 6.
___________
1 |*b c a d
2 |*c*b*a d
3 |*b*a*c*d
4 |*b*c a d
Proposals: 3,d (accepted)
Marriages are:
1,b; 4,c; 2,a; 3,d.
We checked that this was stable (it was).
There then followed a discussion about the results.
JG said this algorithm produces the best of all stable
marriages for the men, and the worst of all stable marriages
for the women. GJ and SC wanted to know what he meant by
"best" and whether he meant "maximal" really.
JG explained that when defining "best" for a man, he meant
that no man can make progress to another stable matching
that improves his lot (wife).
There were wonderings about whether the stable matchings
formed a lattice, but it was decided not, as you can get
two stable matchings whose meet isn't.
Editorial note:
Actually, the set of stable matchings for given preferences does form a
lattice.
The ordering is a matter of dominance: matching M dominates matching M'
from the men's point of view if every man's wife in M is at least as good
in his preference ordering as his wife in M'. Dually for dominance from
the women's point of view. Moreover, M dominates M' from the men's pov iff
M' dominates M from the women's.
The meet of two matchings M and M' gives to every man the more preferable
of his partners in M and M' (and to every woman the less preferable). The
join is the dual. It isn't immediately obvious to me that these yield
matchings at all, let alone stable ones, but they do.
The Gale-Shapley algorithm starts off with the "best" (in some
sense) for men, and then reduces to the best stable matching.
Whereas the women start off in the worst state, and then they
gradually improve their lot as the algorithm progresses.
Editorial note:
Right. The "best" that GS starts with isn't a matching: every man is
courting his most preferred woman, but there may be competition and there
may be wallflowers. The men's preference lists are traversed in descending
order, approaching the set of stable matchings from above from the men's
pov. At the same time, all the women start off unmatched (which, in this
traditional world, is the worst of all circumstances), and gradually get
matched to better and better men - approaching the set of stable matchings
from below from the women's pov.
JG said that his implementation implemented Gale-Shapley,
but it didn't resemble the spec. GH's was different, in that
it considered a single man's proposal at a time, rather than
in rounds: effectively there's a "bag" of unattached men, and
a random unattached man gets to make his next proposal each round.
So as far as the order in which things happen goes, it
does a different topological sort of the graph of men's preferences.
Essentially the two programs "approach the frontier" of the
histogram *s differently.
1 |*b
2 |*c*b*a
3 |*b*a*c*d
4 |*b*c
Knuth's little book (a collection of transcripts of lectures
given in Montreal [2]) gives another traversal. It does one unattached
man at a time, but uses a stack, not a bag:
___________
1 | b c a d
2 | c b a d
3 | b a c d
4 | b c a d
The animated data changing on the white board could be transcribed as:
Stack: 1,2,3,4
Proposal: 1,b (accepted)
Arrangements now: 1,b
Stack: 2,3,4
Proposal: 2,c (accepted)
Arrangements now: 1,b 2,c
Stack: 3,4
Proposal: 3,b (not accepted)
Arrangements now: 1,b 2,c
Stack: 3,4
Proposal: 3,a (accepted)
Arrangements now: 1,b 2,c 3,a
Stack: 4
Proposal: 4,b (not accepted)
Arrangements now: 1,b 2,c 3,a
Stack: 4
Proposal: 4,c (accepted)
Arrangements now: 1,b 4,c 3,a
Stack: 2
Proposal: 2,b (not accepted)
Arrangements now: 1,b 4,c 3,a
Stack: 2
Proposal: 2,a (accepted)
Arrangements now: 1,b 4,c 2,a
Stack: 3
Proposal: 3,c (not accepted)
Arrangements now: 1,b 4,c 2,a
Stack: 3
Proposal: 3,d (accepted)
Arrangements now: 1,b 4,c 2,a 3,d
Stack:
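Here is a sketch (my own, not code from the meeting) of this
one-proposal-at-a-time traversal, keeping a stack of unattached men
paired with the women they have yet to propose to:

import qualified Data.Map as Map

-- engaged maps each matched woman to her fiance and his remaining list.
-- With equal numbers of men and women, a man never runs out of women
-- to propose to, so the (m, []) case cannot arise.
galeShapley :: (Ord m, Ord w)
            => (m -> [w]) -> (w -> [m]) -> [m] -> [(m, w)]
galeShapley mprefs wprefs men = go [ (m, mprefs m) | m <- men ] Map.empty
  where
    go [] engaged = [ (m, w) | (w, (m, _)) <- Map.toList engaged ]
    go ((m, w:ws) : stack) engaged =
      case Map.lookup w engaged of
        Nothing -> go stack (Map.insert w (m, ws) engaged)   -- accepted
        Just (m', ws')
          | prefers (wprefs w) m m' ->                       -- w dumps m'
              go ((m', ws') : stack) (Map.insert w (m, ws) engaged)
          | otherwise -> go ((m, ws) : stack) engaged        -- rejected
    prefers ranking x y =
      length (takeWhile (/= x) ranking) < length (takeWhile (/= y) ranking)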
We speculated that the general version of the algorithm was
some order in which you go through that *s histogram from
the left. The invariant for that would be that the people matched
so far (that sub society) would be matched in a stable way.
The variant is the # of remaining proposals altogether (not the
number of unattached people).
The original Gale-Shapley was for matching students to colleges,
and you could do this from the point of view of the students or
the colleges. It makes sense to have the students choose.
SC was reminded of this being used in the setting of students
being matched with hospitals [3], where it used to be done from
the hospitals' point of view, but was then changed to be done from
the students' point of view (so the students are the "men" and get
their best stable matching).
There then followed a discussion about how many marriages you
can break up in a single round of the Gale-Shapley algorithm.
Can you break up n/2? Yes, you can.
Dijkstra's shortest paths algorithm was also discussed, in that
the creeping histogram frontier was similar to the collection
of nodes to which the shortest paths were known. However nodes
don't change their minds in Dijkstra's algorithm.
JG said something about his program and co-induction, and got
made to paraphrase it, which he did, saying that it was
consumptive rather than productive (consuming proposals, and
not necessarily producing more marriages).
GJ pointed out why the program feels far away from the
specification: the algorithm consumes proposals, and the proposals
don't actually appear in the specification.
maleOptimal (filter stable (societies men women))
societies men women = map (zip men) (perms women)
We want to use proposals to build the society. We want to
consume the table for <m for the men, in the order of the proposals.
Maybe we should expand the pairUnstable definition, to walk over
the <m ordering, as a table, whilst keeping the <w as an ordering.
By this time, lunch was looming.
Minutes by SC.
References:
[1] D. Gale and L.S. Shapley:
College admissions and stability of marriage
American Mathematical Monthly, 69(1):9-15, 1962
[2] Donald E. Knuth:
Mariages stables et leurs relations avec d'autres problemes combinatoires.
Introduction a l'analyse mathematique des algorithmes.
Les Presses de l'Universite de Montreal
(Collection de la Chaire Aisenstadt), 106 pp.
[3]
The Math of Medical Marriages
from the AMS Math in the Media column, Sept 2004.
- Friday, 12th November 2004
Last week (12th), Geraint was thinking back to his work of 1989-ish on
deriving the Fast Fourier Transform, after presenting the algorithm
again in a tutorial recently. He was hoping for a presentation with
less hand-waving, and fewer indices. We didn't make a whole lot of
progress.
The Discrete Fourier Transform is a matter of evaluating the polynomial
a0 + a1 x + a2 x^2 + ... + a{n-1} x^{n-1}
for x = 1, w, w^2, ..., w^{n-1}, where w is a primitive complex
n-th root of unity.
The crucial step in obtaining the Fast Fourier Transform is to take the
right split of the problem into subproblems; it turns out to be a
matter of considering the even and the odd powers of omega. With some
rather involved calculations, we derived the appropriate
divide-and-conquer version of the DFT. However, the calculation
involved various cases of switching between curried and uncurried
versions of key combinators such as zipWith and pair operations - we
felt there was a cleaner, more point-less version that we were missing.
But we didn't even get as far as making the move from there to the FFT,
which is a matter of capitalizing on the fact that some of the
subproblems are repeated.
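For reference, here is one standard way the even/odd split comes out in
Haskell (a sketch of the textbook radix-2 FFT, not the calculation we
did in the meeting):

import Data.Complex

-- Assumes the input length is a power of two (and non-empty).
fft :: [Complex Double] -> [Complex Double]
fft [a] = [a]
fft as  = zipWith (+) es ts ++ zipWith (-) es ts
  where
    n          = length as
    (evs, ods) = deinterleave as        -- even- and odd-position coefficients
    es         = fft evs
    ts         = zipWith (*) [ w ^ k | k <- [0 :: Int ..] ] (fft ods)
    w          = cis (-2 * pi / fromIntegral n)
    deinterleave []         = ([], [])
    deinterleave [x]        = ([x], [])
    deinterleave (x:y:rest) = let (xs, ys) = deinterleave rest
                              in (x:xs, y:ys)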
This took us well over an hour, so we remained to be convinced that
this would turn into an appropriate way to explain FFT from scratch in
a single lecture.
- Friday, 5th November 2004
Mike Spivey talked to us about
Categories for Breadth-First Search.
The setting is one of non-deterministic programs
f :: a -> m b
for some "monad of non-determinism" m. For example, m might
be the list datatype, giving Wadler's "list of successes"
technique. (For the sake of this note, all lists are
finite.) Since we will be considering non-deterministic
programs, one of the things we will want to be able to do is
to run two programs and choose from their results; so we
will need an operator
mplus :: m a -> m a -> m a
For good measure, we also assume an "empty computation"
mzero :: m a
These two should form a monoid. (In Haskell terms, we are
dealing with the MonadPlus class.)
But we will also want to reason about the "cost" of
different alternative solutions, so that we can enumerate
them "fairly by cost". We therefore assume an extra
operation
mwrap :: m a -> m a
which "increments the cost" of every result returned, or
"adds one to the cost counter". This gives us the following
type class:
class MonadPlus m => MonadBFS m where
  mwrap :: m a -> m a
With the list monad, there is no sensible implementation of
this cost mechanism, so we define
instance MonadBFS [] where
  mwrap = id
But we can pick more elaborate datatypes to represent the
non-determinism, which allow us to implement the
counter. One such is the datatype of forests:
data Tree a = Tip a | Fork (Forest a)
type Forest a = [Tree a]

instance Functor Forest where
  fmap f = map (tmap f) where
    tmap f (Tip a)  = Tip (f a)
    tmap f (Fork x) = Fork (fmap f x)

instance Monad Forest where
  join = concat . map tjoin where
    tjoin (Tip x)   = x                 -- graft the forest x in place
    tjoin (Fork xs) = [Fork (join xs)]
  return a = [Tip a]

instance MonadPlus Forest where
  mzero = []
  mplus = (++)

instance MonadBFS Forest where
  mwrap x = [Fork x]
(This is a simplification of the Haskell Monad instance,
using join instead of the two binds and omitting the
fail. I'm also cheating by using a type synonym for Forest
to avoid the constructor.)
The join operator flattens a forest of forests into a
forest, grafting each forest into place. The mplus operator
concatenates two forests. Without mwrap, the only forests
that can be constructed are lists of tips; the mwrap
operator adds an extra level to the forest.
In fact, one needn't think hard to construct the correct
definition of join, because (a standard result says) the
monadic structure is a consequence of the free algebra
construction, in this case for the algebra with operators
mzero, mplus and mwrap and laws that mplus is associative
with unit mzero. (This kind of forest is the initial object
in the category of breadth-first search strategies.)
Taking the free algebra but with additional laws gives
simpler representations of the non-determinism. For example,
adding the law that mwrap is the identity operation
collapses forests to lists.
Another implementation is (finite) lists of (finite) bags,
or what Mike calls Matrices. This is obtained from forests
by adding the laws that mplus is commutative (so the
ordering of elements at a particular depth is irrelevant)
and that mwrap distributes through mplus (so the nesting is
irrelevant too). What is left is a list of bags of elements,
one bag for each level.
type Matrix a = [[a]]

instance Functor Matrix where
  fmap f = map (map f)

instance Monad Matrix where
  join = ...  -- a bit complicated! (one candidate below)
  return a = [[a]]

instance MonadPlus Matrix where
  mzero = []
  mplus = lzw (++)

instance MonadBFS Matrix where
  mwrap x = [] : x
Here, lzw is "long zip with", which is like zipWith but
returns a result as long as its longer argument.
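For completeness, here is a definition of lzw, together with one
candidate for the "complicated" join (my guess, not necessarily Mike's
definition): a matrix sitting at level i of the outer matrix contributes
its level-j bag at level i+j.

-- "long zip with": like zipWith, but as long as the longer argument
lzw :: (a -> a -> a) -> [a] -> [a] -> [a]
lzw f (x:xs) (y:ys) = f x y : lzw f xs ys
lzw _ xs     []     = xs
lzw _ []     ys     = ys

joinMatrix :: [[[[a]]]] -> [[a]]
joinMatrix []           = []
joinMatrix (row : rows) =
  lzw (++) (foldr (lzw (++)) [] row)   -- merge this level's matrices
           ([] : joinMatrix rows)      -- deeper rows shift down one level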
In between Forests and Matrices are "Carroll Morgan
Christmas Trees", in which mwrap distributes over mplus (so
nesting is irrelevant, although depth still is) but mplus is
not commutative (so order is still important): effectively a
list of elements each at a given depth.
Depth-bounded search forms another monad:
type DBS a = Int -> [ (a, Int) ]
This gives, for a given depth bound n, all the elements at
depth at most n, each paired with the remaining depth
available (ie the difference between their depth and n).
return a  = \ n -> [ (a,n) ]
g >>= f   = \ n -> [ (b,n'') | (a,n') <- g n, (b,n'') <- f a n' ]
mzero     = \ n -> []
mplus f g = \ n -> f n ++ g n
mwrap f   = \ n -> if n==0 then [] else f (n-1)
Mike finished by explaining how the traditional stack/queue
thing for dfs/bfs arises. Define
search :: MonadBFS m => Forest a -> m a
search [] = mzero
search [Tip a] = return a
search [Fork x] = search x
search (x++y) = search x `mplus` search y
Instantiated for the list monad, this gives:
dfs :: Forest a -> [a]
dfs [] = []
dfs [Tip a] = [a]
dfs [Fork x] = dfs x
dfs (x++y) = dfs x ++ dfs y
from which it follows that
dfs (Fork x : y) = dfs (x++y)
and the forest behaves like a stack.
Instantiated for the matrix monad, this gives:
bfs :: Forest a -> Matrix a
bfs [] = [[]]
bfs [Tip a] = [[a]]
bfs [Fork x] = bfs x
bfs (x++y) = lzw (++) (bfs x) (bfs y)
bfs' = concat . bfs
from which it follows that
bfs' (Fork x : y) = bfs' (y++x)
and the forest behaves like a queue. (A similar explanation
appears in Geraint's and Jeremy's paper
The Underappreciated Unfold.)
For further reading, see Mike's chapters with Silvija Seres
in Millennial Perspectives in Computer Science and
The Fun of Programming.
- Friday, 8th October 2004
Jeremy talked about "program fission", which is
program fusion in reverse. Fusion is a transformation that
improves efficiency at the cost of structure and
readability; fission regenerates that structure. These
thoughts were inspired by a chat with Jose Nuno Oliveira.
Consider, for example, the Unix utility wc, which counts
words in a text file (and lines and characters too, which
we'll ignore here). In C, it might be written:
#include <stdio.h>

#define IN  1   /* inside a word */
#define OUT 0   /* outside a word */

/* count words in input */
main()
{
    int c, nw, state;

    state = OUT;
    nw = 0;
    while ((c = getchar()) != EOF) {
        if (c == ' ' || c == '\n' || c == '\t')
            state = OUT;
        else if (state == OUT) {
            state = IN;
            ++nw;
        }
    }
    printf("%d\n", nw);
}
I argue that this is "obviously equivalent" to the following
Haskell program:
> import Data.Char (isSpace)
> import Data.List (genericLength, unfoldr)

> wc1 :: String -> Integer
> wc1 = fst . foldl step1 (0, False)

> step1 (n, b) c | isSpace c = (n, False)
> step1 (n, False) c = (n+1, True)
> step1 (n, True) c = (n, True)
Of course, for this problem it doesn't matter if you read
left-to-right (as is natural in C) or right-to-left (natural
in Haskell).
> wc2 :: String -> Integer
> wc2 = fst . foldr (flip step1) (0, False)
This is the result of applying fusion to:
> wc3 :: String -> Integer
> wc3 = genericLength . getWords3
> getWords3 :: String -> [String]
> getWords3 = fst . foldr glue3 ([], False)
> glue3 c (ws, b) | isSpace c = (ws, False)
> glue3 c (ws, False) = ([c]:ws, True)
> glue3 c (w:ws, True) = ((c:w):ws, True)
(Well, truer to say that fusing the right-hand two
components in
fst . pair genericLength id . getWords3
gives wc2. The relationship between the above and wc3 is a
simple law of fst.)
Conversely, applying fission to wc2 yields the more
comprehensible wc3.
The C program above maintains a count of the number of
words, and a boolean variable recording the state (whether
currently in a word or not). There is another program that
dispenses with the boolean variable, but uses a
two-character window to determine whether a certain
non-space character is the start of a word:
#include <stdio.h>

#define IN  1   /* inside a word */
#define OUT 0   /* outside a word */

/* count words in input */
int
isSpace(int c)
{
    return (c == ' ' || c == '\n' || c == '\t' || c == EOF);
}

main()
{
    int c, d, nw;

    nw = 0;
    d = ' ';
    c = getchar();
    while (c != EOF) {
        if (!isSpace(c) && isSpace(d)) {
            ++nw;
        }
        d = c; c = getchar();
    }
    printf("%d\n", nw);
}
Again, the natural thing to do in Haskell is count word
endings rather than beginnings, so we use lookahead rather
than lookback. In this case, the program is a paramorphism:
> wc4 = paral step4 0

> -- c ends a word iff c is non-space and the next character, if any, is space
> step4 c (n, _)   | isSpace c = n
> step4 c (n, [])              = n+1
> step4 c (n, d:_) | isSpace d = n+1
> step4 c (n, _:_)             = n

> paral :: (a -> (b,[a]) -> b) -> b -> [a] -> b
> paral f e []     = e
> paral f e (x:xs) = f x (paral f e xs, xs)
This fissions (yes, that is the verb) into length after
another paramorphism, the latter for collecting the words
themselves:

> wc5 = genericLength . getWords5
> getWords5 = paral glue5 []

> glue5 c (ws, _)   | isSpace c = ws
> glue5 c (ws, [])              = [c]:ws
> glue5 c (ws, d:_) | isSpace d = [c]:ws
> glue5 c (w:ws, _:_)           = (c:w):ws
In my mind, of course, getWords is neither a fold nor a
para, it's an unfold:
> wc6 = genericLength . getWords6
> getWords6 = unfoldr getWord6

> getWord6 xs | null ys   = Nothing
>             | otherwise = Just (span (not . isSpace) ys)
>   where ys = dropWhile isSpace xs
We explored these and other structures for the wordcount
program. I haven't yet managed to explain how the fissioned
programs may be discovered - indeed, I believe some
invention is bound to be necessary for entropic reasons.
- Friday, 1st October 2004
Richard talked again about loopless algorithms, harking back to his
presentation on 24th October 2003. He
was again trying to get a loopless algorithm for the function
mixall (for the definitions, see the earlier minutes), but was
dissatisfied with how complicated his solutions are.
- Friday, 11th June 2004
Clare talked about a recent paper by Annabelle McIver and
Carroll Morgan on their probabilistic programming calculus;
we concentrated on working through an example of theirs
about investing in financial futures, and on whether it
would fit within Clare and Sharon's multi-relation
framework.
This led to a discussion about fixpoints, and reminded me about an argument
I'd had recently about coinduction and invariants. Davide Sangiorgi gave an
invited talk about coinduction at POPL in January. I wasn't there, but his
slides are on the web.
He claims (slide 25) that Hoare logic is a coinductive
method. I'd always thought of proofs of loop correctness via
invariants as an inductive method. We tried to work out
where the discrepancy is.
The coinductive proof principle (as on Sangiorgi's
slide 7) is that
x [= F(x) => x [= gfp(F)
for monotone F on a complete lattice (where [= is the
lattice order, and => is implication). Dually, the induction
principle is that
x ]= F(x) => x ]= lfp(F)
By Hoare logic, I presume he means the standard method of
invariants: that
{I and B} S {I} => {I} while B do S {I and not B}
By handwaving, I managed to convince myself and (I think)
Clare and Sharon at the meeting that this was indeed an
instance of induction, not coinduction. The rough argument
is that F corresponds to the loop body, lfp(F) the
terminating state of the loop, and X the invariant; that "X
contains F(X)" is the correctness of the loop body, so by
induction, "X contains lfp(F)", ie the final state satisfies
the invariant too.
But my subsequent steps to pin this down failed on the
grounds that the F I actually constructed had the empty set
as a lfp. That indeed suggests that a gfp is what is
wanted. But I haven't had time since then to explore this
further.
- Friday, 30th April 2004
Present: RSB, SC (minute-taker), CEM, GJ, JG, BO, BS
Room 441 bore a distinct resemblance to a shingle beach
today, because on arrival, RSB was busy putting
pebbles in buckets.
RSB had 4 buckets, and 16 pebbles to put in them.
| | | | | | | |
|___| |___| |___| |___|
Every time he put a pebble in a bucket, he stirred the
pebbles in that bucket. So given that the cost of putting
a pebble in a bucket which previously contained i-1 pebbles was i,
what was the lowest possible total cost of putting all 16 pebbles
in the buckets? 40, we said, by the greedy algorithm of
putting the next pebble into a least full bucket, resulting
in 4 buckets each with 4 pebbles in.
What about a more general problem with n pebbles to put
in k buckets?
Letting
q = n div k
r = n mod k
so that r of the buckets will have q+1 pebbles in and
(k-r) of the buckets will have q pebbles in,
using the same algorithm we get that the minimum cost is
r(q+1)(q+2)/2 + (k-r)q(q+1)/2
"Someone simplify that in their heads!" was demanded of
the audience.
RSB relented, and simplified it to
(n - (k/2)|_n/k_|)(|_n/k_| + 1)
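As a quick check of that closed form, here is a direct transcription
(mine, not from the meeting); greedyCost 4 16 gives the 40 found above.

greedyCost :: Integer -> Integer -> Integer
greedyCost k n = r * (q+1) * (q+2) `div` 2 + (k-r) * q * (q+1) `div` 2
  where (q, r) = n `divMod` k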
RSB then introduced another operation (besides simply putting
a pebble in a bucket) - a redistribution operation. This allows
you to pick up several buckets and pour the contents into one
bucket.
| | | | | | | |
|_p1_| |_p2_| |_p3_| |_p4_|
"Your buckets have holes in them"
===>
| | | | | | | |
|_0_| |_0_| |_0_| |_p1+p2+p3+p4_|
This operation has cost of the number of pebbles involved
(in the poured buckets and the poured-into bucket),
ie p1+p2+p3+p4 in this example.
It was conjectured that this wouldn't help. RSB pointed out
that it did indeed help: consider these stages of bucket contents:
0 0 0 0
1 1 1 1
0 0 0 4
The above pebble operations from 0 0 0 0 to 1 1 1 1 to 0 0 0 4
takes a cost of 8 (4 + 4), compared to the 10 it would take to
put all the pebbles into the last bucket in the first place.
RSB then started talking about sorting, and the number of
comparisons it takes to sort. For example, it takes 29 comparisons
to sort 12 things, but it's not clear how to generalize, to get
an idea of what lower bounds for sorting are for arbitrary n
things to sort.
RSB wanted to get lower bounds for this buckets problem.
He'd been up all night trying, and the best he'd managed to do was 37:
                 Cost
0 0 0 4           8
1 1 2 4           5
0 0 4 4           4
2 2 4 4           6
0 4 4 4           4
4 4 4 4          10
                ---
                 37
JG and SC promptly came up with solutions with cost 36:

            Cost                   Cost
1 1 1 2      6        0 0 0 4       8
0 0 0 5      5        1 2 2 4       7
1 1 2 5      5        0 0 5 4       5
0 0 4 5      4        2 2 5 4       6
2 2 4 5      6        0 4 5 4       4
0 4 4 5      4        3 4 5 4       6
3 4 4 5      6                    ---
           ---                     36
            36
RSB then explained more of the context of this problem:
apparently it's related to the Ordered File Maintenance problem
and the Online Labelling problem (see e.g. [1]).
GJ mused about Fibonacci numbers, and wondered if they might
crop up somewhere in this problem. He pointed out that if you
have enough buckets, the cheapest way to build up large
quantities of pebbles is by merging large buckets.
SC wondered whether if we knew the quantity destined to be
in the end bucket, whether there was a way to find out what
cost was required to fill that bucket.
RSB then started talking about one of the Hilbert problems
concerning Diophantine equations, and how the technique to
show that this particular Diophantine equation was unsolvable
made use of Fibonacci numbers.
Define T(k) to be the cost of putting 2^k (=n) pebbles
in k buckets.
As 2^k = 2 + 2^1 + 2^2 + ... + 2^(k-1),
one algorithm is to do the algorithm for 2^(k-1) pebbles in
k-1 buckets (cost T(k-1)), then put all those pebbles in
the last bucket (cost 2^(k-1)), then do the rest of the buckets
(cost T(k-1)).
This gives the recurrence
T(k) = 2T(k-1) + 2^(k-1)
T(2) = 6
to which the solution is of the order k 2^k (ie n log n).
This gives an upper bound on T(k). This is where RSB got the
37-cost solution from, which illustrates that this can't
be optimal, because we know that 36-cost solutions are possible.
In any case, we suspected that this wasn't optimal because the
last bucket isn't being used!
What if n isn't of the form 2^k ? What about general n and k?
A side-track discussion then focused on the case n = a 2^k
and obtained the recurrence
T(k) = aT(k-1) + a^2 * 2^(k-1)
RSB wanted a good algorithm for this problem. Apparently
Oege wants one too.
Overloading T such that T(k,n) is the cost of putting n pebbles
into k buckets, the following recurrence eqn is arrived at:
T(k,n) = T(k,n/2) + n/2 + T(k-1,n/2)
(Put n/2 pebbles in k buckets, merge into 1 bucket, put out
remaining n/2 pebbles)
GJ improved the bound to
T(k,n) = min_x { T(k,x) + x + T(k-1,n-x) }
(same idea, but with x pebbles, not n/2)
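This recurrence is easy to tabulate; here is a small dynamic programme
(my own sketch, not from the meeting) computing the resulting upper
bound, with the greedy no-redistribution cost as the base candidate.

import Data.Array

bound :: Int -> Int -> Int
bound kmax nmax = t ! (kmax, nmax)
  where
    t = array ((1, 0), (kmax, nmax))
          [ ((k, n), f k n) | k <- [1 .. kmax], n <- [0 .. nmax] ]
    f _ 0 = 0
    f 1 n = greedy 1 n
    f k n = minimum (greedy k n
                    : [ t ! (k, x) + x + t ! (k-1, n-x) | x <- [1 .. n-1] ])
    -- the no-redistribution cost from the closed form above
    greedy k n = (r * (q+1) * (q+2) + (k-r) * q * (q+1)) `div` 2
      where (q, r) = n `divMod` k

For instance, bound 4 16 gives a concrete upper bound for RSB's sixteen
pebbles in four buckets.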
RSB then stated three versions of his problem:
- Problem 1: the optimal cost is O(f(k,n)) for what f?
- Problem 2: if the optimal cost is O(f(k,n)) and k = O(log n), then what is f?
- Problem 3: if the optimal cost is O(f(k,n)) and k = log n, then what is f?
Moving on to a fresh discourse, RSB told us that
as far as lower bounds were concerned, for
n pebbles in k buckets, a lower bound was
Omega( n log n / log k )
Denote a bucket configuration by n1, n2, ..., nk
where n1 + n2 + ... + nk = n.
Let C represent the "complexity" (a sort of cost measure),
defined by
C = sum_i (ni log ni)
Define C_init = 0
(C_init is the value of C for the initial configuration of empty buckets.)
What is C_final ?
Considering the extremes where either the pebbles are evenly
distributed or piled up in one bucket,
n log(n/k) <= C_final <= n log n
RSB claimed that if the cost of a move is p, then the
corresponding increase in C is at most p log k
Consider adding a pebble to a bucket:
deltaC = (p+1) log(p+1) - p log p
= p log((p+1)/p) - log(p+1)
<= 2p
deltaC = n log n - sum(ni log ni)
i
<= n log k
So each move of cost p increases C by at most about p log k, and C must
climb from 0 to C_final, which gives a lower bound on the complexity of
the problem.
RSB pointed out that an exact lower bound is an interesting
open problem.
On another tack, JG was wondering about Fibonacci numbers
and whether they have unique partitions into Fibonacci numbers.
34 can be partitioned into [8,13,13] and [5,8,21], but maybe
there's a unique partition without duplicates.
Maybe this helps with the buckets problem?
Discussion had dwindled off, so at this point, SC clarified
reference [1] for the sake of the minutes, and then RSB started
explaining the online labelling problem. You are given, one-by-one,
some things to label, and the labels you give have to respect the
ordering on those things.
So, for example, if you have n items and 2^n labels to play with,
it's trivial, give the 1st item the middle label, then subdivide
depending on whether the 2nd item is larger or smaller than the
1st, etc. etc.
Now suppose you only have kn labels. You'll need some relabelling.
This can be thought of as putting the incoming items into an kn-long
array, and relabelling is moving items to make space from time to time.
RSB asserted that this must be at least of order n log n
and challenged us to think why.
This is the case because once you've labelled everything, you have
enough information to sort without making any further comparisons
between items, so you must have done at least n log n because
that's a lower bound on the number of comparisons for sorting.
We then discussed the upper bound, and couldn't quite see whether
we could get it lower than quadratic in n.
Having run out of ideas, we then trooped off to lunch, taking
our buckets and pebbles with us.
BS volunteered to do next week's,
and JG bagged the week after that.
[1] P.F.Dietz, J.I.Seiferas, J.Zhang:
"A tight lower bound for on-line monotonic list labeling"
Algorithm Theory -- SWAT '94 (Proceedings of 4th Scandinavian
Workshop on Algorithm Theory, Lecture Notes in Computer Science 824,
Springer-Verlag) (Aarhus, Denmark, July 7, 1994), pp. 131-142.
- Friday, 12th March 2004
Present: RSB, JG, BS, BO
BO presented an idea for defining types for generic functions. This idea
hopes to extend previous research in the area.
BO started his presentation by reviewing Generic Haskell (GH). In GH the
concept of structural polymorphism brings us the ability to define a
function or type by induction on the structure of types. The types of
generic functions (GF) are defined as kind-indexed types in Generic
Haskell. For instance, the type for equality would be:
type Eq{| * |} t = t -> t -> Bool
type Eq{| k -> v |} t = forall a. Eq{| k |} a -> Eq{| v |} (t a)
Traditional GH has some limitations due to the fact that recursive GF can
only use implicit recursion. Dependency style GH (DSGH) tries to solve
that problem by allowing explicit recursion. However, there are still some
problems that are not addressed by DSGH.
In both GH and DSGH, the key notion behind generic definitions are the
kinds of a datatype. Consider for instance:
data List a = Nil | Cons a (List a)
data Tree a = Empty | Fork a (Tree a) (Tree a)
The kinds for List and Tree are the same. The consequence of this is that
kind-indexed types will also be the same for those datatypes. But we know
that the traditional definition of a fold, does not follow this pattern.
Type | Kind   | signature of fold
------------------------------------------------------
List | * -> * | (a -> b -> b) -> b -> List a -> b
Tree | * -> * | (a -> b -> b -> b) -> b -> Tree a -> b
Type-Transformers (TT) is a way to define types not only on the kinds but
also on the algebraic structure of types:
TT<Type> :: Type
Consider a very simple example :
outT :: T -> F T
We could define outT signature with a TT:
out<T | T -> O> :: T -> O
O depends on the structure of T, which is expressed by "| T -> O". The
body of out could be written as:
out<Data T> = out'<T, Data T | T -> O>
We would need this auxiliary function out' because we are interested in
matching the recursive pattern of T, the second parameter (Data T) would
indeed correspond to that pattern. The definition of out' would be very
similar to a DSGH definition. BO argued that it might even be possible to
have type inference with this definition. Some of the applications for TT
are: folds, zippers, possibly design patterns, possibly parsers.
- Friday, 5th March 2004
I [JG] don't have Sharon's flair for recording what went on; here
is a mere shadow.
Barney talked more about his PPM work. He described an
approach to selecting justifiable probabilities for symbols
that have never yet been seen in a text being adaptively
compressed. He uses recurrence relations for the conditional
probabilities of seeing certain sequences - that is,
expressions relating the probabilities of longer sequences
to those of shorter ones. Specializing these to the case of
relating probabilities of two-character sequences from those
of one-character ones, and (if I read it right) inferring a
kind of stable state, he derived a fixpoint equation for the
probabilities. When the recurrences are expressed in a
matrix form, the fixpoint turns out to be an eigenvector of
unit eigenvalue. This is sufficient to specify (and indeed,
to compute) those prior probabilities.
(Barney, my apologies if I misrepresent you. I hope you can
provide a more coherent explanation!)
I talked some more about enumerating the rationals,
following on from last week's presentation. Richard had
observed last time that the Calkin-Wilf tree of rationals we
constructed is the tree of traces of Euclid's gcd
algorithm. That gives an alternative proof of the facts that
every rational appears (Euclid's algorithm is total) exactly
once (...and deterministic).
We had been thinking about "the traces of Euclid's algorithm
on relatively co-prime pairs", since the set of rationals is
isomorphic to the set of such pairs. But there is no need to
restrict in this way; if you take the instrumented gcd
algorithm
igcd (m,n)
  | m == n = []
  | m <  n = False : igcd (m, n-m)
  | m >  n = True  : igcd (m-n, n)
m/n = m'/n' <=> igcd (m,n) = igcd (m',n')
so we can enumerate the rationals by enumerating all traces
of igcd, whether from a coprime initial state or not. The
natural arrangement of these traces is into a binary tree,
of which they form the paths.
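To make that concrete, here is a small inverse to igcd (my sketch, not
from the meeting): it reconstructs, from any finite trace, the unique
coprime pair producing it.

unigcd :: [Bool] -> (Integer, Integer)
unigcd = foldr step (1, 1)
  where
    step False (m, n) = (m, n + m)   -- undo the m < n step
    step True  (m, n) = (m + n, n)   -- undo the m > n step

So unigcd . igcd is the identity on coprime pairs, and mapping unigcd
over all finite lists of booleans enumerates the positive rationals,
each exactly once.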
This week, David Lester observed a connection between the
Calkin-Wilf tree and continued fractions.
____ [1] ____
____/ \____
[0,2] [2]
/ \ / \
[0,3] [1,2] [0,1,2] [3]
/ \ / \ / \ / \
[0,4] [1,3] [0,1,1,2] [2,2] [0,2,2] [1,1,2] [0,1,3] [4]
(Recall that the continued fraction [a0,a1,a2,...,a{n-1}]
represents the rational
a0 + 1/(a1 + 1/(a2 + 1/(... + a{n-1})))
Each term but the first is at least 1, and the last term
is not 1. These are in one-to-one correspondence with the
rationals. So one can enumerate the rationals by enumerating
the continued fractions. Note that the first term is the
integer part, and the reciprocal is formed by either
removing a leading zero, or consing a new zero.)
It is not hard to see how to generate this tree directly (as
opposed to mapping the rational-to-cf conversion function
over the CW tree, as we did in the meeting). Each node x has
right child x+1 - so cf n0:ns has right child (n0+1):ns -
and left child 1/(1/x + 1) - so cf 0:n1:ns has left child
0:(n1+1):ns and otherwise cf n0:n1:ns has left child
0:1:n0:n1:ns.
data Tree a = Node a (Tree a) (Tree a)

unfoldt :: (b -> a) -> (b -> b) -> (b -> b) -> b -> Tree a
unfoldt f l r b = Node (f b) (unfoldt f l r (l b)) (unfoldt f l r (r b))

rats = unfoldt id fl fr [1]

fl [1]       = [0,2]            -- special case, keeping the cf normalized
fl (0:n1:ns) = 0:(n1+1):ns
fl ns        = 0:1:ns
fr (n0:ns)   = (n0+1):ns
We did notice that the cfs on each level have the same sum,
and that this sum increases from level to level. (This is
easy to see from the program, as sum.fl = sum.fr =
succ.sum.) We didn't twig at the time that therefore (a)
Euclid's algorithm takes k steps on (m,n) where k is the sum
of the cf of m/n, and (b) there are exactly 2^{k-1}
normalized cfs that sum to k - two facts that I didn't know
before.
We also couldn't see at the time how to generate each cf
from its predecessor - and so how to deforest the tree in
generating a list of rationals. However, it isn't so hard,
because each of the movements in the tree (to a left or
right child, and their inverses) are straightforward to
represent on cfs. The program is a bit messy because of
special cases for short cfs, but it takes constant time from
one cf to the next:
next [n0] = [0, n0+1]
next [0, 2] = [2]
next [0, n1] = [1, n1-1]
next [n0, 2] = [0, n0, 2]
next [n0, n1] = [0, n0, 1, n1-1]
next (0 : 1 : n2 : ns) = (n2+1) : ns
next (0 : n1 : n2 : ns) = 1 : (n1-1) : n2 : ns
next (n0 : 1 : n2 : ns) = 0 : n0 : (n2+1) : ns
next (n0 : n1 : n2 : ns) = 0 : n0 : 1 : (n1-1) : n2 : ns
This doesn't enlighten me much, though.
---------------
Since then I've thought and read a bit more. Graham, Knuth
and Patashnik (S4.5) talk about a related tree, the
Stern-Brocot tree:
0/1 ____ ____ 1/0
\_____ _____/
_ 1/1 _
_/ \_
1/2 2/1
/ \ / \
1/3 2/3 3/2 3/1
/ \ / \ / \ / \
1/4 2/5 3/5 3/4 4/3 5/3 5/2 4/1
Start with (0/1, 1/0), and between each adjacent pair (m/n,
m'/n') insert a new fraction (m+m')/(n+n'). This is closely
related to the CW tree we've talked about. It also relates
to continued fractions, to representations of exact reals,
to bit reversal permutations... in fact, most of my work
over the last few years appears here in microcosm. But
that's a story for another day.
- Friday, 27th February 2004
Present: RSB, SC, CEM, JG, BO, BS
JG offered to show us a program for enumerating the rationals.
(It later turned out that he was just concentrating on the +ve
rationals.) This is from a result in [1].
JG reminded us that the usual way of enumerating the rationals
is to lay them out like this
1/1 2/1 3/1 4/1 ...
1/2 2/2 3/2 4/2 ....
1/3 ...
.
.
.
and then to traverse them in a diagonal fashion.
But there's a problem: duplicates (e.g. 2/1 and 4/2)
JG then wanted us to think about how you might do it instead,
not producing duplicates.
We satisfied ourselves that we want to keep just the fractions
where the numerator and denominator are coprime (gcd = 1).
We then had a brief discussion about the possibilities
of merging infinite lists (as they are sorted), discarding
duplicates as you go, but this was dismissed as being
very awkward - we don't want to do that.
SC wanted to know whether JG had in mind generating everything
and then throwing away duplicates, or just not generating them
in the first place. JG didn't propose to generate duplicates
in the first place.
The diagonal traversal from above can be expressed as
filter coprime (diag [ [ (a,b) | a <- [1..] ] | b <- [1..] ])
or as
filter coprime [ (a, d-a) | d <- [2..], a <- [1 .. d-1] ]
JG paused and looked at this for a moment. RSB accused him
of thinking that this was niftier than whatever else he
had in mind. JG assured us that the program he has in mind
is definitely niftier, no question.
JG's hints hadn't been getting us very far, so he dropped
a bigger one: trees. This is because we're trying to get a
list of rationals, which is one dimensional, but the data
isn't naturally arranged in 1D, it's naturally arranged in 2D.
And trees are 2D....
We wondered if Pascal's triangle had anything to do with it?
No, said JG, that involves sharing, and he doesn't have
sharing in mind. The idea would be that we divide the
(+ve) rationals into two, the left sub-tree and right sub-tree,
and (breadth-first) traverse the tree to get all the rationals.
It was agreed that 1/1 was a good candidate to put at the
top of the tree. What next? Random guesses from the audience
put 1/2 as the left child and 2/1 as the right, which was fine,
but then the guessing degenerated into values where we couldn't
really see what was a good rule to use for determining the
children of a node.
JG dropped another hint: we want to retain the coprime invariant.
Whereupon CEM suggested a rule for children of a node, which
is correct:
a/b
/ \
/ \
a/(a+b) (b+a)/b
Thus, when a and b are coprime (gcd = 1), a+b is coprime
with both a and b too, and therefore this rule for generating
the tree only generates fractions in lowest terms.
RSB wondered if this had anything to do with Farey sequences
[Editor: e.g. see [2] for an explanation], but did not
elaborate further.
Writing out the tree that we get, leads us to
___ 1/1 ___
/ \
1/2 2/1
/ \ / \
1/3 3/2 2/3 3/1
/ \ / \ / \ / \
1/4 4/3 3/5 5/2 2/5 5/3 3/4 4/1
with the resulting generation of the rationals
[1/1,(1/2,2/1),((1/3,3/2),(2/3,3/1)),....]
This rang a bell for some people of work with nested datatypes.
JG also pointed out a pretty pattern in the tree and drew
little blue lines all over the diagram. The blue lines pointed
out the equality between adjacent numbers - obvious if you
look at just the last line of the tree:
1/4 4/3 3/5 5/2 2/5 5/3 3/4 4/1
-1 4-4 3-3 5-5 2-2 5-5 3-3 4-4 1-
Even the first and last number are the same!
This means a more efficient representation - we only need to
store 1 number per fraction.
But of course, as JG pointed out, there remain two big questions:
1) Is everything in the +ve rationals in the tree?
2) Are there any duplicates?
In an attempt to convince ourselves that everything was really
in the tree, RSB challenged us to provide a number for him
to find. SC suggested 5/7. RSB didn't like 5/7 and wanted to
find 22/7 instead.
SC said that 22/7 lived at a position (going from the top of
the tree "6 left then 3 right"). RSB thought it would be at
"3 right then 6 left" but "6 left then 3 right" was correct.
RSB then challenged SC to find 23/7, which was duly found
at "1 right, 3 left, 3 right".
JG pointed out that SC had been
lucky with 22/7: it is a natural plus the reciprocal of a natural.
Then with 25/7 we realised that we
needed 4/7 for that, because going from a node to a right
descendant means increasing the number by 1, but we managed
to lose 4/7 and couldn't find it in the tree. Having half
convinced ourselves that all the numbers were in the tree somewhere,
we decided to prove it.
Doing an inductive proof that r/s is in the tree (for r,s coprime)
by induction on r+s :
Base case: r+s=2 so r=s=1
Inductive case:
Suppose not all the rationals are present.
Then there exists a rational r/s with the least
r+s out of all the ones not present.
Suppose r<s, then r/(s-r) is not present either (contradiction).
Suppose r>s, then (r-s)/s is not present either (contradiction).
Meanwhile, SC had thought of a constructive proof, and thus
found out where 4/7 was hiding. RSB thought of it too:
the path of left/rights from a fraction r/s to the root
is simply the trace of the gcd algorithm for r and s
By giving a construction for where r/s is in the tree, this
means that both questions are answered: not only do we know that
r/s is in the tree, we also know from the uniqueness of the construction
that there can't be duplicates.
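That construction is a few lines of Haskell (my transcription, not from
the meeting): the left/right path from the root down to r/s is the
reversed trace of the subtractive gcd.

pathTo :: Integer -> Integer -> String
pathTo r s = reverse (go r s)
  where
    go m n
      | m == n    = []
      | m <  n    = 'L' : go m (n - m)   -- m/n is the left child of m/(n-m)
      | otherwise = 'R' : go (m - n) n   -- m/n is the right child of (m-n)/n

For example, pathTo 22 7 is "LLLLLLRRR", the "6 left then 3 right"
found above, and pathTo 4 7 reveals where 4/7 was hiding.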
Meanwhile, BO had been busy scribbling to investigate his own ideas,
to do with the relationship between the sequence generated by this tree
and Pascal's triangle (see note [3]).
Finally, JG wanted to get rid of the tree.
Merely generating the rows of the tree wasn't quite what he had
in mind, as you're still going along the rows of the tree even if
the tree isn't there. Instead, if we have just generated the
fraction a/b, how do we know what the next fraction is?
If a/b is a left child, this is easy, because that portion of
the tree must look like this:
a/(b-a)
/ \
/ \
a/b b/(b-a)
and thus the next element is b/(b-a).
But what if it is a right child? Then the general format is
(a-kb)/((k+1)b-a)
/ \
(a-kb)/b b/((k+1)b-a)
. .
. .
. .
\ /
a/b b/((2k+1)b-a)
And k is just |_ a/b _|
Look! This incorporates the left child case too, because
then if a/b is a left child then a<b and k=0
Not only that, but it does the "end of the row" case as well,
because a/1 at the end of one row is followed by 1/(a+1)
at the beginning of the next, which works because b=1,k=a !!!
So we end up with this program:
rats = iterate next 1/1
next a/b = b/ ((2k+1)b-a) where k = |_ a/b _|
Alternatively you can express it with a rational x, where x=a/b,
to get
rats = iterate next 1
next x = 1/( |_x_| + 1 - {x})
where {x} denotes the fractional part of x.
We all agreed this was very cool.
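In standard Haskell, with the built-in Rational type, the final program
comes out as follows (my transcription):

import Data.Ratio

rats :: [Rational]
rats = iterate next 1
  where
    next x = 1 / (fromInteger k + 1 - f)
      where
        k = floor x             -- |_ x _|
        f = x - fromInteger k   -- { x }

take 7 rats yields 1, 1/2, 2, 1/3, 3/2, 2/3, 3: the first three levels
of the tree.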
RSB got excited about the use of revlex (referring
to reverse lexicographic ordering) characterizing the
order in which these rationals are generated - a
reverse lexicographic ordering on the traces of the
gcd algorithm.
[1]
Calkin, Neil, Wilf, Herbert S:
Recounting the rationals,
Amer. Math. Monthly 107 (2000), no. 4, 360--363.
[2]
Mathworld's explanation of Farey Sequences.
[3]
JG in a later email noted that:
I think you were on the right lines here, Bruno.
Sequence number A064881
at the
Online
Encyclopedia of Integer Sequences
describes the following two-dimensional sequence:
a(n,m) = a(n-1,m/2) if m is even, else
a(n,m) = a(n-1,(m-1)/2) + a(n-1,(m+1)/2), with a(1,0)=1, a(1,1)=2.
That is, the first row is [1,2], and each subsequent row is
obtained from the previous one by interleaving that row with
the list of sums of its adjacent elements.
[ [1,2], [1,3,2], [1,4,3,5,2], [1,5,4,7,3,8,5,7,2], ... ]
= iterate step [1,2]
  where
    step xs = interleave xs (adjpairs (+) xs)

interleave [x] []        = [x]
interleave (x:xs) (y:ys) = x : y : interleave xs ys

adjpairs f xs = zipWith f (init xs) (tail xs)
The rationals in the left subtree of the tree we were
exploring (ie the subtree rooted at 1/2) are
map (adjpairs (%)) (iterate step [1,2])
The right subtree is in some sense the mirror of the left
subtree.
In fact,
concat (map (adjpairs (%)) (iterate step [1,1]))
gives all the positive rationals.
- Friday, 20th February 2004
Only a select few this week: JG, MA and David Lester, who is
visiting the lab on sabbatical this term. I can't hope to
follow Sharon's example, so at best a brief summary (and my
apologies to David if I misrepresent him).
DL talked about his work on computable real
arithmetic. It seems that the general consensus these days
is that continued fractions, (possibly redundant) digit
sequences, and compositions of homographies (aka Moebius
transformations, aka linear fractional transformations)
which generalize these two, are not competitive with a much
more straightforward approach. This is to represent a real
as a (computable) function from precision to rational
approximation.
The first approach you might think of is to represent a real
x by a function f of type Int -> Rational, such that
abs (f n - x) < 2 ^ (-n)
In fact, DL has concluded that for efficient arithmetic it
is better not to allow arbitrary rationals in the range of
this function: better, in fact, to insist that they have 2^n
as the denominator. That means you don't need to represent
the denominator exactly, and instead represent x by f of
type Int -> Integer, such that
(f n - 1) / (2^n) < x < (f n + 1) / (2^n)
or equivalently
f n - 1 < 2^n * x < f n + 1
Addition, subtraction, multiplication, maximum and minimum
have fairly simple pointwise implementations. Division is a
bit more complicated (you need to keep prodding the divisor
until it proves itself non-zero), and transcendental
operations rather trickier. Plain comparisons are
uncomputable, but a three-way comparison is implementable.
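To make the representation concrete, here is a toy fragment of the idea
(my sketch, certainly not DL's PVS-verified code): the embedding of
rationals, and addition, which asks its arguments for two extra bits
and rounds the sum back down.

type CReal = Int -> Integer   -- f n approximates 2^n * x to within 1 (n >= 0)

fromRationalC :: Rational -> CReal
fromRationalC q n = round (q * 2 ^ n)

addC :: CReal -> CReal -> CReal
addC f g n = (f (n+2) + g (n+2) + 2) `div` 4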
David has been occupying himself using the theorem prover
PVS to prove his algorithms correct - in particular, that
the functional representation always returns a value within
its advertised bounds.
- Friday, 13th February 2004
We discussed object-oriented models of priority queues, suitable for
implementations of Dijkstra's shortest paths algorithm. No-one took minutes
per se, but Michael and Jeremy sent summaries after the fact.
Michael's summary
It's intriguing how fiddly a problem this seems to be. I guess part of
the problem arises from the compromises that need to be made when passing
from the abstract to implementation. So perhaps an abstract model of the
problem would help.
> class Ord a => PQueue q a where
> empty :: q a
> enqueue :: q a -> a -> q a
> gethead :: q a -> (q a, a)
Oh, damn. We can't talk about changing the state of the entries and hence
forcing a requeue without introducing state and monads and so on. I think
my Haskell's too rusty for this, but maybe later on.
Ok, let's pretend we're in some kind of stateful language. I'll invent
notation as I go. Let's assume all methods can be virtual (though
actually only lt is required to be).
> class Orderable where
> lt :: Orderable -> Bool
>
> class PQueue1 where
> enqueue :: Orderable -> ()
> gethead :: () -> Maybe Orderable
> requeue :: Orderable -> ()
This is my offering, with the following intended semantics: a client of
PQueue must implement Orderable instances which should provide a mutually
consistent implementation of the lt operation and requeue must be called
if the implementation of lt changes.
This design is a little unpleasant, as it requires a delicate interaction
between the lt operation and requeue. Jeremy's suggestion provides one
way to control this, but I think I'd like to offer an alternative which I
think may be better as an abstract interface. In particular, note that
this one treats the priority as a separate *value*, not an updatable
object.
> type Priority = Orderable
>
> class PQueue2 where
> enqueue :: (Object, Priority) -> ()
> gethead :: () -> Maybe Object
> requeue :: (Object, Priority) -> ()
Now here I suggest that the queue remember the priority for the object:
here Object is the "ur class" with no methods. If the object also wishes
to remember its priority then that is its own business. So maybe we end up
with two copies of the priority, but the separation is cleaner.
With my engineering hat on I'm not at all happy about the fact that the
requeue operations need to look up their arguments in a separately
maintained lookup table. I have a couple of solutions to this (which I
used in my work) but they involve breaking the abstraction in an important
way, so I'll not develop this idea for the moment.
Now what about the user of the priority queue? We have Towns and
TownDistance objects. Let's take a peek.
> class Town where ...
> class TownDistance1 inherits Orderable where
> priority :: Real
> town :: Town
> lt(other :: Orderable) is
> otherTD = cast other as TownDistance1
> return this.priority < otherTD.priority
Now some code to change priority of towns
> td :: TownDistance1, queue :: PQueue1
> td.priority = basepri
> queue.enqueue(td)
...
> newpri = computation
> td.priority = newpri
> queue.requeue(td)
...
> td = cast queue.gethead() as TownDistance1
Let's try the same thing with PQueue2. I'll assume inheritance from
Object.
> class TownDistance2 where
> priority :: Real
> town :: Town
> queue :: PQueue2
> td.priority = basepri
> queue.enqueue(td, basepri)
...
> td.priority = newpri
> queue.requeue(td, newpri)
This is still pretty abstract, but turning this into real code (modulo
actually writing the PQueue class, which will actually require
programming) should be straightforward.
Hmm. Not sure I've really captured the problem here. The really
interesting part of the problem is how we avoid the cost of a strictly
redundant lookup when we call requeue, but I think the strong consensus
was that this is a fracture of abstraction too far!
I think I'm going to have to brush up my "stateful" Haskell: can anybody
point me to some simple code examples of stateful programming with Haskell
monads?
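(Editorial note: the smallest example of the kind Michael asks for might
be an IORef in the IO monad.)

import Data.IORef

main :: IO ()
main = do
  counter <- newIORef (0 :: Int)   -- allocate a mutable cell
  modifyIORef counter (+1)         -- update it in place
  n <- readIORef counter           -- read it back
  print n                          -- prints 1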
Jeremy's summary
Town is a simple abstraction, completely independent of
priority queues.
import java.util.Vector;
import java.util.Iterator;
public class Town {
    protected String name;
    protected Vector roads;

    public Town (String name) {
        this.name = name;
        roads = new Vector();
    }
    public void addRoad(Road r) {
        roads.add(r);
    }
    public Iterator exits() {
        return roads.iterator();
    }
    public String toString() {
        return name;
    }
}
Likewise roads:
public class Road {
    protected Town from, to;
    protected int length;

    public Road (Town from, Town to, int length) {
        this.from = from; this.to = to;
        from.addRoad(this); to.addRoad(this);
        this.length = length;
    }
    public int getLength() {
        return length;
    }
    public Town otherEnd(Town thisEnd) {
        if (thisEnd == from)
            return to;
        else if (thisEnd == to)
            return from;
        else
            throw new IllegalArgumentException();
    }
}
Here's the interface for priority queues.
public interface PQueue {
    /* is queue empty? */
    public boolean isEmpty();
    /* whether queue contains object (a query) */
    public boolean contains(Object o);
    /* highest-priority object (a query; queue is non-empty) */
    public Object highest();
    /* priority of given object (a query; assumed to be present) */
    public int priorityOf(Object o);
    /* enqueue the object o (assumed not already present) with priority p */
    public void enqueue (Object o, int p);
    /* reorder the queue, to accommodate o's increased priority p */
    public void requeue (Object o, int p);
    /* dequeue the highest-priority object (queue is non-empty) */
    public Object dequeue ();
}
I dispensed with PrioritizedObjects and the Comparable
interface. I was wrapping Towns up as PrioritizedObjects
before inserting them into some collection; then it is
awkward to find the corresponding PrioritizedObject when you
need to update the priority of some Town neighbouring
another you have just completed. I found it easier to keep
the Object and the priority separate in the PQueue
abstraction. The only problem then is that dequeueing ought
to return a pair, an Object and its priority. I could have
defined a Pair type (ie PrioritizedObject!), but instead I
have three operations in the interface: two queries
(yielding the first Object, and the priority of any Object
in the queue) and one removal.
Here's a simple implementation of the PQueue interface.
import java.util.Hashtable;
import java.util.Map;
import java.util.Iterator;
public class UnorderedPQueue implements PQueue {
    protected Map entries;

    public UnorderedPQueue() {
        entries = new Hashtable();
    }
    /* is queue empty? */
    public boolean isEmpty() {
        return entries.isEmpty();
    }
    /* whether queue contains object (a query) */
    public boolean contains(Object o) {
        return entries.containsKey(o);
    }
    /* highest-priority object (a query; queue is non-empty) */
    public Object highest () {
        Object highest = null; int p = 0, q;
        for (Iterator i = entries.keySet().iterator(); i.hasNext();) {
            Object current = i.next();
            q = priorityOf(current);
            if (highest==null || p > q) {
                highest = current; p = q;
            }
        }
        return highest;
    }
    /* priority of given object (a query; assumed to be present) */
    public int priorityOf(Object o) {
        return ((Integer)entries.get(o)).intValue();
    }
    /* enqueue the object o with priority p */
    public void enqueue (Object o, int p) {
        entries.put(o, new Integer(p));
    }
    /* reorder the queue, to accommodate o's increased priority p */
    public void requeue (Object o, int p) {
        enqueue(o, p); // overwrites old priority if present
    }
    /* dequeue the highest-priority object (queue is non-empty) */
    public Object dequeue () {
        Object highest = highest();
        entries.remove(highest);
        return highest;
    }
}
The queue is stored as a finite mapping from entries to
priorities. Finding the highest-priority entry involves a
linear search. Inspecting then removing the highest-priority
element involves two searches; you'd probably want to
maintain a private reference to that element too.
Here's a slightly more sophisticated implementation.
import java.util.Hashtable;
import java.util.Map;
import java.util.Vector;
import java.util.Iterator;
public class OrderedPQueue implements PQueue {
protected Vector entries;
protected Map priorities;
public OrderedPQueue() {
entries = new Vector();
priorities = new Hashtable();
}
/* is queue empty? */
public boolean isEmpty() {
return entries.isEmpty();
}
/* whether queue contains object (a query) */
public boolean contains(Object o) {
return entries.contains(o);
}
/* highest-priority object (a query; queue is non-empty) */
public Object highest () {
return entries.elementAt(0);
}
/* priority of given object (a query; assumed to be present) */
public int priorityOf(Object o) {
return ((Integer)priorities.get(o)).intValue();
}
/* enqueue the object o with priority p */
public void enqueue (Object o, int p) {
priorities.put(o, new Integer(p));
int i = entries.size();
while (i>0 && priorityOf(entries.elementAt(i-1))>p) i--;
entries.add(i,o);
}
/* reorder the queue, to accommodate o's new priority p */
public void requeue (Object o, int p) {
priorities.put(o, new Integer(p)); // overwrites if already present
int i = entries.indexOf(o);
if (i>0) {
entries.remove(o);
while (i>0 && priorityOf(entries.elementAt(i-1))>p) i--;
entries.add(i,o);
} // else already highest priority
}
/* dequeue the highest-priority object (queue is non-empty) */
public Object dequeue () {
Object highest = highest();
entries.remove(highest);
return highest;
}
}
This time the queue is stored as a Vector of entries (in
order of decreasing priority), and a Hashtable mapping
entries to priorities. Now enqueuing involves a linear
search from the end of the Vector to the correct position,
and requeuing involves some search for the current position
plus a linear search from the old position to the new
position. If the Vector's standard indexOf implementation
isn't fast enough for finding the current position, you could
keep a second mapping from entries to their positions in the
Vector.
Here's a simple main program. (This is "structured
programming in Java": all features are static.)
import java.util.Iterator;
import java.util.Map;
import java.util.Hashtable;
public class Routeplanner {
protected static Town[] towns = { new Town("London"), // 0
new Town("Oxford"), // 1
new Town("Birmingham"), // 2
new Town("Manchester"), // 3
new Town("Nottingham"), // 4
new Town("Carlisle"), // 5
new Town("Newcastle"), // 6
new Town("Edinburgh") }; // 7
protected static Road[] roads = { new Road(towns[0],towns[1],56),
new Road(towns[1],towns[2],68),
new Road(towns[2],towns[3],87),
new Road(towns[0],towns[4],131),
new Road(towns[2],towns[4],53),
new Road(towns[3],towns[5],120),
new Road(towns[4],towns[6],162),
new Road(towns[5],towns[6],59),
new Road(towns[5],towns[7],99),
new Road(towns[6],towns[7],108) };
public static void main(String[] args) {
Map results = new Hashtable();
PQueue pqueue = new OrderedPQueue();
pqueue.enqueue(towns[0],0);
while (!pqueue.isEmpty()) {
Town here = (Town)pqueue.highest();
int distance = pqueue.priorityOf(here);
pqueue.dequeue();
results.put(here, new Integer(distance));
for (Iterator i = here.exits(); i.hasNext(); ) {
Road road = (Road)i.next();
Town there = road.otherEnd(here);
if (pqueue.contains(there)) {
if (pqueue.priorityOf(there) > distance+road.getLength())
pqueue.requeue(there, distance+road.getLength());
}
else if (! results.containsKey(there)) {
pqueue.enqueue(there, distance + road.getLength());
}
}
}
for (Iterator i = results.keySet().iterator(); i.hasNext(); ) {
Town here = (Town)i.next();
System.out.println("Distance from " + towns[0] +
" to " + here + " is " + results.get(here));
}
}
}
- Friday, 6th February 2004
-
Present: SC, CEM, JG, BO, BS
SC has recently received a new book on cake cutting algorithms [1],
and brought it along especially to show RSB, who had asked to see it.
But he wasn't there, so instead she talked about two algorithms
for dividing a cake between n people.
The first was called the moving knife protocol and the second
was a divide and conquer algorithm.
For each of the following algorithms it is assumed that the cake is
represented by the unit interval [0,1].
In the moving knife protocol, a knife is moved from left to right
across the cake until someone shouts "stop!". The knife then stops
moving from left to right and cuts the cake. The person who shouted
is given the piece of cake to the left of the cut and the process
is repeated for the remaining n-1 people and the rest of the cake
until there is only one person left. If several people shout
simultaneously then one of them is chosen at random to receive the
piece of cake.
To illustrate, we tried this out with a whiteboard diagram instead
of a real cake.
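For concreteness, here is a small Haskell simulation (my own sketch,
under the simplifying assumption that each person can report in
advance the point at which they would shout, given the left end of
the remaining cake and the number of people still unserved):

import Data.List (minimumBy)
import Data.Ord (comparing)

-- each person reports where they would shout "stop!", given the
-- left end of the remaining cake and how many people remain
type Shout = Double -> Int -> Double

-- someone valuing cake uniformly shouts at their 1/k fair point
uniformShout :: Shout
uniformShout lo k = lo + (1 - lo) / fromIntegral k

-- simulate the protocol (names assumed distinct; ties go to
-- whichever shouter minimumBy finds first)
movingKnife :: Double -> [(String, Shout)] -> [(String, (Double, Double))]
movingKnife lo [(name, _)] = [(name, (lo, 1))]
movingKnife lo ps = (winner, (lo, cut)) : movingKnife cut rest
  where
    k = length ps
    shouts = [ (shout lo k, p) | p@(_, shout) <- ps ]
    (cut, (winner, _)) = minimumBy (comparing fst) shouts
    rest = [ q | q@(n, _) <- ps, n /= winner ]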
The book [1] describes this as not being a finite algorithm, because
the participants need to evaluate the position of the knife
continuously, but SC wasn't sure about this: you could just
decide in advance where you thought a fair first cut was, and shout
once the knife got past that point. That results in
n + (n-1) + ... + 2 = n(n+1)/2 - 1 evaluations made by the
participants during the execution of the protocol, which is quadratic.
SC then explained the divide and conquer algorithm as
[1] presents
it, but the assembled throng reworded it to make it a lot nicer.
This is the reworded explanation:
The divide and conquer algorithm is easiest to explain for the case
when n = 2^k for some k, but it generalises to any n. So, first
suppose n = 2^k. Each of the n people must mark the position on the
cake that they think is halfway. The cake is then cut anywhere
between the two middle marks. The 2^(k-1) people whose marks were to
the left of the cut are then assigned the left half of the cake,
and the rest are given the right half of the cake. The process is
then repeated for each half of the cake until everyone is left with
just one piece.
In the general case, each person is asked to mark the cake in a
position that divides it in the ratio m : (n-m), for some m < n
(taking m near n/2 keeps the recursion balanced).
The cake is then cut anywhere between the mth and (m+1)th marks.
The m people whose marks were to the left of the cut are then
assigned the left part of the cake, and the remaining n-m people
are given the right part. The process is then repeated for each part
of the cake until everyone is left with just one piece.
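A small Haskell rendering of this reworded algorithm (my own sketch;
participants are modelled as named functions returning their m:(n-m)
mark, with m = n div 2):

import Data.List (sortOn)

type Mark   = Double -> Double -> Int -> Int -> Double
type Person = (String, Mark)

-- someone who values cake uniformly marks by linear interpolation
uniform :: Mark
uniform lo hi m n = lo + (hi - lo) * fromIntegral m / fromIntegral n

-- divide [lo,hi] among ps; each person ends up with one piece
divide :: Double -> Double -> [Person] -> [(String, (Double, Double))]
divide lo hi [(name, _)] = [(name, (lo, hi))]
divide lo hi ps =
    divide lo cut (map snd lefts) ++ divide cut hi (map snd rights)
  where
    n = length ps
    m = n `div` 2
    marked = sortOn fst [ (mk lo hi m n, p) | p@(_, mk) <- ps ]
    -- cut anywhere between the mth and (m+1)th marks: take the midpoint
    cut = (fst (marked !! (m-1)) + fst (marked !! m)) / 2
    (lefts, rights) = splitAt m marked

-- e.g. divide 0 1 [("A",uniform),("B",uniform),("C",uniform)]
-- gives A the piece (0,1/3), B (1/3,2/3), C (2/3,1), up to rounding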
It was observed that the first protocol was a special case of the
second. JG said that they have different complexity: the moving
knife protocol is quadratic and the divide and conquer protocol
is O(n log n).
We noticed some similarities between these protocols and
sorting algorithms, in particular quicksort, which is O(n log n)
if you have a good pivot point, but quadratic in the worst case.
We wondered whether they could be shown to be unfolds in the
same way that sorting algorithms have been.
SC then talked about the Dawson team sharing protocol, a
description of which can be found in [2]. It's an example
used in a paper that CEM/SC/IR have submitted to MPC [3].
These resource division protocols look like providing a nice
source of examples for the multirelational calculus
(see minutes from Jan 23rd 2004).
Minutes by CEM+SC.
[1] Jack Robertson, William Webb (1998):
Cake Cutting Algorithms: Be Fair If You Can.
A K Peters.
[2] Ivan Petersen:
Toward a Fairer Expansion Draft.
[3] C. E. Martin, S. A. Curtis, I. Rewitzky:
Modelling Nondeterminism.
- Friday, 23rd January 2004
-
Present: RSB, SC(minute-taker), CEM, IR, BS
Welcome to our guest today, Ingrid Rewitzky (IR).
CEM took today's meeting, and started talking about
set-valued relations (also known as multirelations).
These are equivalent to monotonic predicate transformers,
but have the advantage that they can model programs
going forward rather than backwards, and they provide
a more powerful domain than ordinary relations
(can express both angelic and demonic nondeterminism).
CEM explained that she'd originally been thinking of
functional programs; Nigel Ward had done some work on
this, and developed a refinement calculus [1].
With a refinement calculus, you get specifications,
and laws about them; but the laws get proved in the
predicate transformer world, so you can't do things
from first principles.
This is where the set-valued relations/multirelations
come in [2].
You can work with these and refine to what
you want, rather than have to work in a separate calculus.
RSB wanted an example. CEM mentioned vending machines,
and then went on to give some definitions:
A set-valued relation/multirelation is of type
(writing |P for powerset)
|P (A x |P B)
With an ordinary relation R,
x R y
means that for input x, the output can be y.
But for set-valued relations/multirelations, then
x M post
means that for input x, the output can satisfy the
predicate post (predicate represented as a set).
The idea is that the angel can choose the predicate
that the output is to satisfy.
For example, suppose the angel has the choice of
predicates
{ x-1 }
{ x-2, x+2 }
If the angel chooses {x-1}, then the demon has no
choice. But if the angel chooses { x-2, x+2 }, then
the demon gets to choose which of x-2 or x+2 is output.
CEM pointed out that a condition is necessary for these
predicates/postconditions: if x M post and
post subseteq post' (that is, post => post' as predicates),
then it must also be true that x M post'
Therefore these set-valued relations/multirelations have
to be up-closed. An up-closed multirelation from type
A to type B is written
      ->
M : A -> B
RSB vociferously objected at this point to things going from
left to right. He wrote that we have
y = f x
y <- R x
post <- M x
so it is more natural to go from right to left, but of course
with a program, you'd put the postcondition on the right.
SC asserted that this is why multirelations go
from left to right, and RSB quietened down.
CEM then pointed out something crucial: the angel ONLY gets
to choose from the STRONGEST postconditions. (This is necessity,
not convenience.) This is key to understanding them.
An example: the identity multirelation is
         ->
in_A : A -> A
Angelic choice is just |_| (union),
Demonic choice is just |^| (intersection).
For example,
            ->
const a : A -> B
const a = A x |`{a}
(up-closure is written |` )
Then
const a |^| const b = (A x |`{a}) |^| (A x |`{b})
= A x |`{a,b}
After some discussion about notation, composition (;;) for
multirelations was defined as follows:
x (M ;; N) X
<=>
(exists Y :: x M Y /\ (forall y : y in Y : y N X))
RSB mentally checked that M;;id = M (it is).
CEM then gave a Nim example, describing a single round.
Nim is a game where players pick up matches, and in this
example, players are allowed to pick up 1 or 2 matches, and
the player to pick up the last match loses.
Defining
x sub1 X <=> {x-1} subseteq X
and sub2 similarly, we have (regarding our player as the
angel and the opponent as the demon)
player = sub1 |_| sub2
opponent = sub1 |^| sub2
Thus
round = player ;; opponent
From these definitions, you can calculate that
x round X <=> {x-2,x-3} subseteq X
\/ {x-3,x-4} subseteq X
Alternatively... using these lifting operations
for R : A -> B
Demonic lifting                 Angelic lifting
        ->                              ->
[R] : A -> B                    <R> : A -> B

x [R] X <=> (forall y : x R y : y in X)
x <R> X <=> (exists y : x R y : y in X)
Then alternative definitions are
player = <move>
opponent = [move]
RSB wanted to know whether <R;S> = <R>;;<S> ? Yes.
And [R;S] = [R];;[S] too.
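To make this concrete, here is a small Haskell model (my own finite
encoding, not from the talk), representing a multirelation by its
characteristic predicate and checking the Nim calculation above:

-- a multirelation from a to b as its characteristic predicate;
-- postconditions are represented as lists
type MRel a b = a -> [b] -> Bool

-- angelic and demonic liftings of a moves function a -> [b]
ang, dem :: Eq b => (a -> [b]) -> MRel a b
ang moves x post = any (`elem` post) (moves x)
dem moves x post = all (`elem` post) (moves x)

-- Nim: from x matches you may take 1 or 2
move :: Int -> [Int]
move x = [x - 1, x - 2]

player, opponent :: MRel Int Int
player   = ang move    -- our player: angelic choice
opponent = dem move    -- the opponent: demonic choice

-- composition with a lifted relation on the left: for <R> ;; N
-- it suffices to try each R-successor y in turn, taking Y = {y}
round1 :: MRel Int Int
round1 x post = any (\y -> opponent y post) (move x)

-- e.g. round1 10 [8,7] == True and round1 10 [8,6] == False,
-- agreeing with {x-2,x-3} subseteq X \/ {x-3,x-4} subseteq X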
RSB pondered on the expressing of the lifting operators
in ordinary relations, and wrote up
<R> = R ; in
[R] = R / ni
but he knows the latter is wrong, and needs rewriting
bearing in mind the definition of the quotient operator
for relations.
Going on to talk about refinement, CEM talked about Back
and von Wright's [3] two ways of doing refinement:
R sqsubseteq_A S <=> R subseteq S     (angelic)
R sqsubseteq_D S <=> R supseteq S     (demonic)
These are used for multirelations too (although not both
at once).
CEM talked about guarantees for the angel, and guarantees
for the demon. An angelic correctness condition is a predicate
that the angel can guarantee to fulfil, whatever the demon does.
So saying that post is an angelic correctness condition for
input x just means
x M post
For the demon it's more difficult.
The dual of M gives the correctness conditions for the demon,
and is defined as
x M^o X <=> ~(x M X^c)
(where M^o is the dual of M, and X^c the complement of X)
In other words the demon is guaranteed to be able to fulfil any
predicates that the angel can't guarantee to prevent.
There then followed a vigorous discussion on refinement and
correctness.
When refining, you stop when all of one type of choice is removed,
so that your multirelation is either angelic (has only angelic
nondeterminism in it) or demonic (has only demonic nondeterminism in it).
Definitions of angelic and demonic (using an indexed set of sets Z):
Demonic x M (|^| Z) <=> (forall X : X in Z : x M X)
Angelic x M (|_| Z) <=> (exists X : X in Z : x M X)
Then SC got up to talk about a simple cake cutting protocol,
known as "I cut, you choose".
Modelling a cake as the unit interval I = [0,1]
and cake slices as sub-intervals [x,y], we want to divide
the cake fairly between two participants.
Each participant has a different notion of value of cake slices,
so let
value1, value2 : I x I -> I
where the whole cake is valued at 1, empty slices are valued at 0,
and value1 and value2 are both additive (if a cake is sliced up,
the values of the slices must sum to the value of the whole cake).
Taking the cake as given, rather than an input (a "global cake"),
cut : 1 -> I can be defined as
cut = 1 x I
ie for the (null) input, any position p in the interval [0,1] can
be returned as a position for the knife cut. The choosing relation
can be defined as choose: I -> I x I
p choose i <=> i=[0,p] \/ i=[p,1]
i is the slice received by the non-cutter.
Lifting this to multirelations, the protocol can be described as
<cut> ;; [choose]
SC then went on to talk about correctness conditions.
Demonic correctness (for the non-cutter) is
value2 i >= 0.5
because the non-cutter/demon wants at least half the cake (by his
evaluation). This corresponds to the notion of "fair division"
for such protocols.
Angelic correctness (for the cake cutter) is
value1 i <= 0.5
because i is the slice that the angel doesn't get.
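As a sanity check, here is a small Haskell model of the protocol
(my own finite sketch: cut points restricted to a grid of tenths,
and a uniform valuation assumed for the cutter):

type Slice = (Double, Double)

-- candidate cut points
cuts :: [Double]
cuts = [ fromIntegral k / 10 | k <- [0 .. 10 :: Int] ]

-- the two slices the non-cutter may pick from after a cut at p
choose :: Double -> [Slice]
choose p = [(0, p), (p, 1)]

-- the protocol <cut> ;; [choose] as a predicate on postconditions:
-- the cutter (angel) picks p, after which any slice the
-- non-cutter (demon) picks must satisfy post
protocol :: (Slice -> Bool) -> Bool
protocol post = any (\p -> all post (choose p)) cuts

-- the cutter's uniform valuation
value1 :: Slice -> Double
value1 (x, y) = y - x

-- angelic correctness: protocol (\i -> value1 i <= 0.5) == True,
-- witnessed by cutting at p = 0.5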
RSB was a bit put out by this, because he was expecting the correctness
conditions to be included as part of a specification. SC replied
that in a sense, the <cut> ;; [choose] isn't a specification, it's
the finished protocol (which does happen to satisfy these correctness
conditions). If you wanted a specification from scratch, it would look
something more like
Spec = { (1,X) | (*) subseteq X }
where (*) = { i | value1 i <= 0.5 /\ value2 i >= 0.5 }
SC also explained that, the calculus being new, the best way to
manipulate specifications, protocols, games and correctness
conditions was still evolving. You can refine this protocol
further (demonically) to get rid of the angelic choice, describing
a procedure for the angel to follow; alternatively, you can refine
it (angelically) to get rid of the demonic choice, describing a
procedure for the demon to follow.
CEM/SC/IR are putting a paper into MPC (Mathematics
of Program Construction) about multirelations.
RSB wanted to cut more cake, for n participants, and was interested
to hear that there is a book on cake cutting algorithms [4], which
SC has on order. It reminded him of the book on art gallery
theorems [5].
[1] Ward, N. T. E. (1994):
A Refinement Calculus for Nondeterministic Expressions.
[2] Rewitzky, I. (2003):
Binary Multirelations.
Lecture Notes in Computer Science 2929: 259-274.
[3] Back, R. J. R. and von Wright, J. (1998):
Refinement Calculus: A Systematic Introduction.
Springer-Verlag.
[4] Jack Robertson, William Webb (1998):
Cake Cutting Algorithms: Be Fair If You Can.
A K Peters.
[5] Joseph O'Rourke (1987):
Art Gallery Theorems and Algorithms
(International Series of Monographs on Computer Science).
OUP.
- Friday, 9th January 2004
-
Present: MA, RSB, SC(minute-taker), JG, BO, BS
BO took today's meeting, and had
prepared slides.
These minutes accompany the slides.
[Slide 1/21]
BO explained that the title was referring to generic programming
with containers, not "generic containers".
[Slide 4/21] Generic Programming in Haskell
The assembled throng wanted to know what GFs and DTTs were.
BO explained that these refer to generic functions and
datatype transformers, and then proceeded to give us examples.
An example of a generic function for equality
(actually just its type declaration):
eq <T1,T2> :: T1 -> T2 -> Bool
So eq is a function parametrized by types.
There then followed a discussion about the definition of eq.
JG pointed out that there are two main ways you can look at
defining equality for general (different) types -
you can either traverse over the structure of the type definitions
and see if they are the same, or you can look at the structure
and also the constructor names.
RSB wanted to know if he would be allowed to define it using the types
int and char (since C does that in any case!). Yes. So you could
have
eq <Int,Int> x y = (x==y)
eq <Int,Char> x y = (x == ord y)
This illustrates the difference between parametric and ad-hoc
polymorphism.
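RSB's heterogeneous equality can be rendered in Haskell with a
multi-parameter type class (a sketch, assuming the
MultiParamTypeClasses extension): each instance may supply its own,
ad-hoc definition, whereas a parametrically polymorphic function
must behave uniformly at every type.

{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}

-- ad-hoc polymorphism: a separate definition per pair of types
class Eq2 a b where
  eq :: a -> b -> Bool

instance Eq2 Int Int where
  eq x y = x == y

instance Eq2 Int Char where
  eq x y = x == fromEnum y   -- fromEnum plays the role of ord

-- parametric polymorphism, by contrast: one uniform definition,
-- unable to inspect the element type
len :: [a] -> Int
len = foldr (\_ n -> 1 + n) 0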
Going on to datatype transformers, here's an example
phi : Type -> Type
where "Type" is written to mean that values of type "Type"
are in fact types
Or more generally,
phi : * -> Type
The idea is that you want to be able to do things with datatype
transformers that you can't do with generic functions.
RSB wanted to know whether BO wanted to make types first-class citizens.
BO replied that yes, he did.
[Slide 5/21] Scrap your boilerplate
BO went on to talk about Ralf Lämmel and Simon Peyton Jones' paper
Scrap Your Boilerplate:
A Practical Design Pattern for Generic Programming,
mentioning a quote from it:
"Trying to understand gfoldl's type signature can cause brain damage."
We had a short attempt and then gave up.
[Slide 7/21] Polynomial datatypes
BO gave us an example.
data List a = Nil | Cons a (List a)
Y = Nil + Cons = [Nil,Cons]
Phi = 1 + a
Theta = 1 + List a
[Slide 11/21] Polynomial Datatypes - Visitors
Continuing with the above example, BO wrote up
          out_T
List a -----------> 1 + a * (List a)
value :: List a -> 1 + a
succs :: List a -> 1 + List a
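In Haskell these arrows might be rendered as follows (a sketch,
reading 1 + x as Maybe x):

-- out_T exposes one layer of the list
out :: [a] -> Maybe (a, [a])
out []       = Nothing
out (x : xs) = Just (x, xs)

value :: [a] -> Maybe a
value = fmap fst . out

succs :: [a] -> Maybe [a]
succs = fmap snd . out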
MA and JG then had a discussion about type-checking in
compiled versus interpreted programs.
Afterwards, BO gave the example of B-trees to explain the c*:
sometimes you have 0 parameters, sometimes 2.
BO finished by talking about what he hoped to work on and
achieve in the future.
Sharon Curtis
(sharoncurtis@brookes.ac.uk)