An Efficient Bottom-Up Distance between Trees

  • advantages of bottom-up distance:
    1. no particular tree edit operations together with their costs need to be defined (same to top-down distance, I think);
    2. low complexity: it can be computed in time linear in the size of the trees, on rooted, loabeled, ordered trees of unbounded degree. Worst-case O(n1+n2) time.
  • contrary to top-down distance, bottom-up distance defined that two nodes i and j match to each only when all the child nodes of i and j match to each.
    For DOM-trees, top-down distance seems the better choice. Or, can we combine these two measurement together?
    Isolated-subtree distance is a possibility, which would give better mapping result for DOM-trees.
  • bottom up distance
  • The bottom-up distance between two rooted trees T1 and T2 can be computed by the following steps:
    1. Obtain a compacted directed acyclic graph representation G of the forest F consisting of the disjoint union of T1 and T2, together with a corresponding K between the nodes of T1 and T2 and the nodes of G.
    2. Extract a mapping M from T1 to T2, according to graph G and node correspondence K.
    3. Compute the bottom-up distance between T1 and T2 according to the mapping M.
    click for large size
last update: April 3, 2005