Question:
Suppose I have a
- set of male people, and
- a function isAncestor(person1,person2) that checks whether person1 is an ancestor of person2 in O(1) time. Eg, isAncestor(grandfather, grandson) would return true.
What are some fast algorithms for constructing a complete "family tree" by using only this information? (or "family forest" as not everyone may be related to a single common ancestor).
Comment:
This is not homework or from a book. It is derived from some issues I'm having coupling an adaptive finite element code to some blackbox mesh generation software, boiled down to the core problem and expressed in an analogy that shares the same mathematical structure. Right now I have a brute-force $O(n^3)$ method that seems horribly inefficient.
Thank you for any help.
Results:
I've implemented Gilles suggestion in Matlab and tested it. Here is the pseudocode, code, and unit test script. I release it all to the public domain.
Pseudo-code:
function buildAncestorForest roots = empty list for each person p roots = graft(p, roots) subfunction new_roots = graft(q, old_roots) descendents = empty list nondescendents = empty list for each person t in old_roots if q is an ancestor of t append t to descendents else append t to nondescendents if descendents is still empty q_is_incomparable = true; for each person t in old_roots if t is an ancestor of q new_roots = old_roots; new children of t = graft(q, old children of t) q_is_incomparable = false; break for loop if (q_is_incomparable) new_roots = append q to old_roots; else new_roots = append q to nondescendents; for each person t in descendents children of q = graft(t, children of q);
Matlab code:
https://github.com/NickAlger/MeshADMM/blob/master/buildAncestorForest.m
Unit test script:
https://github.com/NickAlger/MeshADMM/blob/master/test_buildAncestorForest.m
Asked By : Nick Alger
Answered By : Gilles
Assumptions
We have a finite set $\mathscr{D}$ of persons and a relation $\prec$: $x \prec y$ means that $x$ is an ancestor of $y$. This relation has the following properties:
- Transitivity: if $x \prec y$ and $y \prec z$ then $x \prec z$ (the ancestor of my ancestor is my ancestor).
- Antisymmetry: if $x \prec y$ then $y \nprec x$ (I am not my ancestor's ancestor).
- Chain: if $x \prec z$ and $y \prec z$ then either $x \prec y$ or $x = y$ or $y \prec x$ (my ancestors form an ordered chain).
Data structures
Let it be given a set data structure on which the following operations are defined:
- empty set
- add an element to a set
- splitting a set according to a property: iterate over the elements, compute a property on each element, and build two subsets, the set of elements for which the property is true and the set for which the property is false. This operation is assumed to be linear in the size of the set.
We will also use a tree data structure, in which each node contains a person and the set of its children. If $t$ is a tree, write $t_R$ for the person at the root node and $t_C$ for the set of children. A forest is a set of trees.
Algorithm
We start with an empty forest and add persons one by one. At each step, the parent-child relationship in the forest is the parent-child relationship for the ancestor relation. We define a recursive function graft that inserts a tree into a forest.
Input of graft: a forest $F$, and a tree $s$ such that none of the elements of the tree is related to any of the persons in $F$. Let $x = s_R$.
First, check whether $x$ is the ancestor of any of the existing persons: split the set $F$ according to the property $x \prec t_R$ for $t \in F$. Let $C$ be the subset of $F$ consisting of descendants of $x$ (if the root of a tree is a descendant of $x$ then all elements are also descendants by transitivity), and $F'$ be the remaining subset. By antisymmetry, none of the persons in $C$ is an ancestor of $x$. None of the elements of $s_C$ are related to any of the persons in $C$ by assumption. Let $u$ be the tree where the root person is $x$ and the set of children of the root is $C \cup s_C$.
By the chain assumption, $x$ can have ancestors in at most one of the trees in $F'$. Iterate over the trees $t \in F'$, and determine which one if any has an ancestor of $x$ at its root (this is a splitting operation where one of the sets is guaranteed to contain at most one element).
Output of graft:
- If no tree in $F'$ has an ancestor of $F'$ at the root, then add the new node to the remaining forest: return $F' \cup \{u\}$.
- If there is a tree $t$ in $F'$ such that $t_R \prec s$, then we need to insert the new element with its descendants into that tree. Let $G = \text{graft}(t_C, u)$. Return $(F' \setminus \{t\}) \cup \{t'\}$ where $t'$ has $t_R$ at the root and $G$ for children.
Main algorithm: Start from an empty forest, and iteratively graft the persons in $\mathscr{D}$ (grafting a person means grafting the tree with this person at the root and no children).
Correctness
From the remarks of this algorithm, it should be easy to prove that the recursive calls to graft are valid (all relatives of the nodes in the tree being inserted are inside that tree), and that $x$ is an ancestor of $y$ in the tree iff $x \prec y$.
Complexity
$\textrm{graft}(F,t)$ performs at most $2 n$ ancestor tests, where $n$ is the number of persons in $F$: one test per root of $F$, one test per root that is not a descendant of the new node, and tests performed by the recursive call if any which applies to a subset of $F$ not including any of its roots. The complexity of other operations is linear in the number of ancestor tests performed.
Therefore the forest construction algorithm performs at most $2 + 4 + 6 + \ldots + 2(n-1) = n (n-1)$ ancestor tests where $n$ is the total number of persons. In fact, it is easy to see that it never performs the same test twice. The overall complexity is therefore $O(n^2)$.
In the worst case, all possible $n (n-1)$ tests are performed. This happens when no two persons are related, and it is unavoidable in this case.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/13685
0 comments:
Post a Comment
Let us know your responses and feedback