Lowest common ancestor

Template:Two other uses Let T be a rooted tree with n nodes. The lowest common ancestor (LCA) is defined between two nodes v and w as the lowest node in T that has both v and w as descendants (where we allow a node to be a descendant of itself).

The LCA of v and w in T is the shared ancestor of v and w that is located farthest from the root. Computation of lowest common ancestors may be useful, for instance, as part of a procedure for determining the distance between pairs of nodes in a tree: the distance from v to w can be computed as the distance from the root to v, plus the distance from the root to w, minus twice the distance from the root to their lowest common ancestor.

In a tree data structure where each node points to its parent, the lowest common ancestor can be easily determined by finding the first intersection of the paths from v and w to the root. In general, the computational time required for this algorithm is O(h) where h is the height of the tree (length of longest path from a leaf to the root). However, there exist several algorithms for processing trees so that lowest common ancestors may be found more quickly:

Dov Harel and Robert Tarjan were the first to develop an efficient lowest common ancestor data structure, in 1984. Their algorithm processes any tree in linear time, so that subsequent lowest common ancestor queries may be answered in constant time per query. However, their data structure is complex and difficult to implement.

Baruch Schieber and Uzi Vishkin (1988) simplified the data structure of Harel and Tarjan, leading to an implementable structure with the same asymptotic preprocessing and query time bounds. Their simplification is based on the principle that, in two special kinds of trees, lowest common ancestors are easy to determine: if the tree is a path, then the lowest common ancestor can be computed simply from the minimum of the levels of the two queried nodes, while if the tree is a complete binary tree, the nodes may be indexed in such a way that lowest common ancestors reduce to simple binary operations on the indices. The structure of Schieber and Vishkin decomposes any tree into a collection of paths, such that the connections between the paths have the structure of a binary tree, and combines both of these two simpler indexing techniques.

Omer Berkman and Uzi Vishin (1993) discovered a completely new way to answer lowest common ancestor queries, again achieving linear preprocessing time with constant query time. Their method involves forming an Euler tour of a graph formed from the input tree by doubling every edge, and using this tour to write a sequence of level numbers of the nodes in the order the tour visits them; a lowest common ancestor query can then be transformed into a query that seeks the minimum value occuring within some subinterval of this sequence of numbers. They then handle this range minimum query problem by combining two techniques, one technique based on precomputing the answers to large intervals that have sizes that are powers of two, and the other based on table lookup for small-interval queries. This method was later rediscovered and slightly simplified by Michael Bender and Martin Farach-Colton (2000).

Further simplifications were made by Alstrup, Gavoille, Kaplan, and Rauhe (2002) and Fischer and Heun (2006).

External links

C++-implementation of Fischer and Heun's RMQ-method (2006) and a more space-conscious alternative.
Python implementation of the algorithm of Bender and Farach-Colton, by David Eppstein
Lecture notes on LCAs from a 2003 MIT Data Structures course. Course by Erik Demaine, notes written by Loizos Michael and Christos Kapoutsis.

References

Fischer, Johannes; Heun, Volker (2006). "Theoretical and Practical Improvements on the RMQ-Problem, with Applications to LCA and LCE" (PDF). Proceedings of the 17th Annual Symposium on Combinatorial Pattern Matching. Springer-Verlag, Lecture Notes in Computer Science 4009. pp. 36–48. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)CS1 maint: multiple names: authors list (link)

Alstrup, Stephen; Gavoille, Cyril; Kaplan, Haim; Rauhe, Theis (2002). "Nearest Common Ancestors: A Survey and a New Distributed Algorithm". Proceedings of the Fourteenth Annual ACM Symposium on Parallel Algorithms and Architectures. ACM Press. pp. 258–264. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)CS1 maint: multiple names: authors list (link)

Bender, Michael A.; Farach-Colton, Martin (2000). "The LCA problem revisited". Proceedings of the 4th Latin American Symposium on Theoretical Informatics. Springer-Verlag, Lecture Notes in Computer Science 1776. pp. 88–94. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)CS1 maint: multiple names: authors list (link)

Berkman, Omer; Vishkin, Uzi (1993). "Recursive Star-Tree Parallel Data Structure". SIAM J. Comput. 22 (2): 221–242.{{cite journal}}: CS1 maint: multiple names: authors list (link)

Harel, Dov; Tarjan, Robert E. (1984). "Fast algorithms for finding nearest common ancestors". SIAM J. Comput. 13 (2): 338–355.{{cite journal}}: CS1 maint: multiple names: authors list (link)

Schieber, Baruch; Vishkin, Uzi (1988). "On finding lowest common ancestors: simplification and parallelization". SIAM J. Comput. 17: 1253–1262.{{cite journal}}: CS1 maint: multiple names: authors list (link)