GRAPH FOLDS
Jeremy Gibbons (inspired by Dave Wile)
IFIP WG2.1 Working Paper 804 CBB-8.
IFIP WG2.1#55, Cochabamba, Friday 19th January 2001


Graph nodes are given unique identifiers, and also have a label and a
list of successors.

> type Id = Int
> data Graph a = Node Id a [Graph a]

> getId :: Graph a -> Id
> getId (Node i a ns) = i


Here is an example graph. Note that it is a cyclic structure.

> g :: Graph String
> g = node1 where
>   node1 = Node 1 "one"   [ node2, node3 ]
>   node2 = Node 2 "two"   [ node1 ]
>   node3 = Node 3 "three" [ node1 ]


The following function computes the adjacency-list representation of
the graph. The auxilliary function builds up an accumulating
parameter, the representation of the subgraph already visited, and
traverses a stack of nodes.

> edges :: Graph a -> [ (Id, a, [Id]) ]
> edges x = edges' [] [x] where
>   edges' es [] = es
>   edges' es (n@(Node i a ns):ms)
>     | i `elem` map fst3 es = edges' es ms
>     | otherwise = edges' (addedge n es) (ns++ms)
>   addedge (Node i a ns) es = (i, a, map getId ns) : es
>   fst3 (a,b,c) = a


Graphs are cyclic, so evaluating a fold involves computing a fixpoint.
The first attempt at a fold for graphs will compute the fixpoint
implicitly using Haskell's recursive "let" (actually "where")
bindings. We construct a table, giving the computed value for each
identifier in the graph:

> type Table b = [(Id,b)]
> t `at` i = head [ b | (j,b) <- t, j == i ]

The fold fixes on the table.

> ifold :: (a->[b]->b) -> Graph a -> b
> ifold f x = t `at` getId x  where
>   es = edges x
>   t = map (step t f) es
>   step t f (i,a,is) = (i, f a (map (t `at`) is))

For example, we can untie a graph to get the corresponding (probably
infinite) tree:

> data Tree a = Branch a [Tree a]  deriving Show

> untie :: Graph a -> Tree a
> untie = ifold Branch

(A tree is just like a graph, but without the unique identifiers. The
intention is that trees should be acyclic, whereas graphs need not
be.) You will probably wait a long time if you try to print the result
of untying a tree, so here is a function to prune the tree to a
certain depth:

> prune :: Int -> Tree a -> Tree a
> prune 0 (Branch a ts) = Branch a []
> prune (n+1) (Branch a ts) = Branch a (map (prune n) ts)

One thing you might expect is that folding a graph gives the same as
folding the tree to which it unties. That is,

  ifold f = foldtree f . untie

> foldtree :: (a->[b]->b) -> Tree a -> b
> foldtree f (Branch a ts) = f a (map (foldtree f) ts)

Of course, for cyclic graphs this will only be interesting for
suitably non-strict folds; but for acylic graphs it works for all
folds.

> t :: Graph String
> t = Node 1 "one" [ Node 2 "two" [], Node 3 "three" [] ]

> size a ns = 1 + sum ns
> test = ifold size t == foldtree size (untie t)

That fold only works if computing the fixpoint via Haskell's recursive
"let"s does what you want, that is, if making steps from the undefined
value is sufficient. Sometimes it isn't, and you might want to compute
the fixpoint yourself. In this version we unfold an infinite list of
tables, each defined in terms of the previous (in the same way that
the table was defined in terms of itself in the first attempt). Then
we compute the "fixpoint" explicitly, by finding two adjacent elements
of the stream satisfying the test "h", and look up the element of this
table corresponding to the root of the graph. We have to provide an
explicit starting value, whereas for the first attempt the starting
value was implicitly bottom.

> efold :: b -> (a->[b]->b) -> (b->b->Bool) -> Graph a -> b
> efold e f h x = fix h (unfold (step f es) (start e es)) `at` getId x 
>  where
>   es = edges x
>   start e es = [ (i,e) | (i,a,is) <- es ]
>   step f es t = [ (i, f a (map (t `at`) is)) | (i,a,is) <- es ]
>   unfold f u = u : unfold f (f u)
>   fix h (a:b:as) = if eqUnder h a b then a else fix h (b:as)
>   eqUnder h t u = and [ h a b | ((i,a),(i',b)) <- zip t u ]

(Doaitse observed that this could be made more efficient. Rather than
computing a single infinite sequence of tables, one should compute an
infinite sequence of values, one sequence for each node, or
equivalently a single table of infinite sequences. The "abstraction
function" for the data refinement is a zip (transpose) between the
sequence of tables and the table of sequences. This zip is related to
the thing that Oege, Richard and Geraint did in their paper "More
Haste, Less Speed": it can be done more asymptotically efficiently
lazily than is possible strictly.)

For example, we can compute the set of reachable nodes of a graph. The
starting value is the empty set, the step function is just a matter of
unioning sets of nodes, and the test for completion is when the table
has equivalent sets for each identifier.

> reachable :: Eq a => Graph a -> [a]
> reachable = fold2 [] combine eq
> combine a xs = [a] `union` bigunion xs where
>   x `union` y = foldr insert y x
>   insert a x = if a `elem` x then x else a:x
>   bigunion = foldr union []
> x `eq` y = all (`elem` y) x && all (`elem` x) y