GRAPH FOLDS Jeremy Gibbons (inspired by Dave Wile) IFIP WG2.1 Working Paper 804 CBB-8. IFIP WG2.1#55, Cochabamba, Friday 19th January 2001 Graph nodes are given unique identifiers, and also have a label and a list of successors. > type Id = Int > data Graph a = Node Id a [Graph a] > getId :: Graph a -> Id > getId (Node i a ns) = i Here is an example graph. Note that it is a cyclic structure. > g :: Graph String > g = node1 where > node1 = Node 1 "one" [ node2, node3 ] > node2 = Node 2 "two" [ node1 ] > node3 = Node 3 "three" [ node1 ] The following function computes the adjacency-list representation of the graph. The auxilliary function builds up an accumulating parameter, the representation of the subgraph already visited, and traverses a stack of nodes. > edges :: Graph a -> [ (Id, a, [Id]) ] > edges x = edges' [] [x] where > edges' es [] = es > edges' es (n@(Node i a ns):ms) > | i `elem` map fst3 es = edges' es ms > | otherwise = edges' (addedge n es) (ns++ms) > addedge (Node i a ns) es = (i, a, map getId ns) : es > fst3 (a,b,c) = a Graphs are cyclic, so evaluating a fold involves computing a fixpoint. The first attempt at a fold for graphs will compute the fixpoint implicitly using Haskell's recursive "let" (actually "where") bindings. We construct a table, giving the computed value for each identifier in the graph: > type Table b = [(Id,b)] > t `at` i = head [ b | (j,b) <- t, j == i ] The fold fixes on the table. > ifold :: (a->[b]->b) -> Graph a -> b > ifold f x = t `at` getId x where > es = edges x > t = map (step t f) es > step t f (i,a,is) = (i, f a (map (t `at`) is)) For example, we can untie a graph to get the corresponding (probably infinite) tree: > data Tree a = Branch a [Tree a] deriving Show > untie :: Graph a -> Tree a > untie = ifold Branch (A tree is just like a graph, but without the unique identifiers. The intention is that trees should be acyclic, whereas graphs need not be.) You will probably wait a long time if you try to print the result of untying a tree, so here is a function to prune the tree to a certain depth: > prune :: Int -> Tree a -> Tree a > prune 0 (Branch a ts) = Branch a [] > prune (n+1) (Branch a ts) = Branch a (map (prune n) ts) One thing you might expect is that folding a graph gives the same as folding the tree to which it unties. That is, ifold f = foldtree f . untie > foldtree :: (a->[b]->b) -> Tree a -> b > foldtree f (Branch a ts) = f a (map (foldtree f) ts) Of course, for cyclic graphs this will only be interesting for suitably non-strict folds; but for acylic graphs it works for all folds. > t :: Graph String > t = Node 1 "one" [ Node 2 "two" [], Node 3 "three" [] ] > size a ns = 1 + sum ns > test = ifold size t == foldtree size (untie t) That fold only works if computing the fixpoint via Haskell's recursive "let"s does what you want, that is, if making steps from the undefined value is sufficient. Sometimes it isn't, and you might want to compute the fixpoint yourself. In this version we unfold an infinite list of tables, each defined in terms of the previous (in the same way that the table was defined in terms of itself in the first attempt). Then we compute the "fixpoint" explicitly, by finding two adjacent elements of the stream satisfying the test "h", and look up the element of this table corresponding to the root of the graph. We have to provide an explicit starting value, whereas for the first attempt the starting value was implicitly bottom. > efold :: b -> (a->[b]->b) -> (b->b->Bool) -> Graph a -> b > efold e f h x = fix h (unfold (step f es) (start e es)) `at` getId x > where > es = edges x > start e es = [ (i,e) | (i,a,is) <- es ] > step f es t = [ (i, f a (map (t `at`) is)) | (i,a,is) <- es ] > unfold f u = u : unfold f (f u) > fix h (a:b:as) = if eqUnder h a b then a else fix h (b:as) > eqUnder h t u = and [ h a b | ((i,a),(i',b)) <- zip t u ] (Doaitse observed that this could be made more efficient. Rather than computing a single infinite sequence of tables, one should compute an infinite sequence of values, one sequence for each node, or equivalently a single table of infinite sequences. The "abstraction function" for the data refinement is a zip (transpose) between the sequence of tables and the table of sequences. This zip is related to the thing that Oege, Richard and Geraint did in their paper "More Haste, Less Speed": it can be done more asymptotically efficiently lazily than is possible strictly.) For example, we can compute the set of reachable nodes of a graph. The starting value is the empty set, the step function is just a matter of unioning sets of nodes, and the test for completion is when the table has equivalent sets for each identifier. > reachable :: Eq a => Graph a -> [a] > reachable = fold2 [] combine eq > combine a xs = [a] `union` bigunion xs where > x `union` y = foldr insert y x > insert a x = if a `elem` x then x else a:x > bigunion = foldr union [] > x `eq` y = all (`elem` y) x && all (`elem` x) y