Lecture 5 - Notes
January 18, 2016
Today
- Finish comparison based sorting
- 2-3 and Red-Black Trees
Comparison Based Bounds
Problem | Size | Lower Bound | Upper Bound | Reason |
---|---|---|---|---|
max | of $n$ | $O(n-1)$ | $O(n-1)$ | See graph theory proof |
merge | $n+n$ | $O(2n-1)$ | $O(2n-1)$ | See adversarial argument proof |
search on ordered table | $n$ | $O(\lg n)$ | $\lceil \lg n \rceil$ | Information Theory |
max & min | of $n$ | $\lceil \frac{3}{2}n -2 \rceil$ | $\lceil \frac{3}{n} -2 \rceil$ | Tricky adversarial argument |
max & next | of $n$ | $n + \lceil \lg n \rceil - 1$ | $n + \lceil \lg n \rceil - 1$ | Tricky adversarial argument |
sorting | $n$ | $O(n \ log n)$ | $\lceil \lg n! \rceil$ | Information theory |
Note: I know that $O(2n-1) = O(n)$ but Ruskey insists on "more accurate numbers". So I've added the big-o's even though they aren't strictly necessary.
Max
Theorem: $n - 1$ is the best upper bound for finding the maximum in an unordered array.
Proof: Make a graph $G = (V,E)$,
Each time a comparison is made say $a_i \lt a_j$ add the edge $(i,j)$. If the algorithm is correct, then at the end $G$ must be connected. For $G$ to be connected it must have at least $\left|E\right| \ge n - 1$ edges.
Merge
Theorem: $2n - 1$ is the best upper bound for the merge operation.
Proof: Suppose the input is ordered so that,
If some comparison, either $a_i \lt b_i$ or $b_i \lt a_{i+1}$ is not made, then the algorithm will fail. This is called an adversarial argument.
Max & Min
If we use a tree to model the comparisons, we have $\frac{n}{2}$ comparisons on the first row. Since an extended binary tree has $N - 1$ internal nodes (where $N$ is the number of leaves), and our comparison tree has $\frac{n}{2}$ leaves
Max & Next Max
If we use a tree like before, the next maxes will all be those that lost to the first max. Since we have one on each level we have another "sub" comparison set.
Balanced Search Trees
Slides
2-3 Trees
definition: A 2-3 Tree is a search tree which allows,
- 2-node: one key, two children or
- 3-node: two keys, three children
and we maintain,
- Symmetric order: Inorder traversal yields keys in ascending order
- Perfect balance: Every path from root to null link has same length
Search
Search work more-or-less as it would on a binary search tree working it's way down from the root.
Insertion
If we insert into a 2-node we can just make it a 3-node. If we insert into a 3-node,
- Add new key to 3-node to create temporary 4-node
- Move middle key in 4-node into parent
- Repeat up the tree, as necessary
- If you reach the root and it's a 4-node, split it into three 2-nodes
Because the rest of the tree isn't affected, splitting a 4-node is a local transformation and requires constant time.
Perfect Balance
Perfect balance means that every path from the root to a leaf is the same length. The height is therefore,
Tree Height | Reason | |
---|---|---|
Worst Case | $\lg N$ | All 2-nodes |
Best Case | $\log_3 N$ | All 3-nodes |
In practice this means between 12 and 20 for a million nodes and between 18 and 30 for a billion nodes. The bottom line is 2-3 trees provide guaranteed logarithmic performance.
Implementation
Directly implementing a 2-3 tree is complicated, because,
- Maintaining multiple node types is cumbersome
- Need multiple compares to move down tree
- Need to move back up the tree to split 4-nodes
- Large number of cases for splitting
for example,
public void put(Key key, Value val)
{
Node x = root;
while (x.getTheCorrectChild(key) != null)
{
x = x.getTheCorrectChildKey();
if (x.is4Node()) x.split();
}
if (x.is2Node()) x.make3Node(key, val);
else if (x.is3Node()) x.make4Node(key, val);
}