Table of Contents
- What is a Tree?
- 1.1 Tree Terminology
- Binary Trees: The Foundation
- Binary Search Trees (BSTs)
- 3.1 Properties of BSTs
- 3.2 Core Operations (Search, Insert, Delete)
- 3.3 Limitation: Unbalanced BSTs
- Self-Balancing Trees: AVL Trees
- 4.1 Balance Factor
- 4.2 Rotations (LL, RR, LR, RL)
- 4.3 Insertion and Deletion in AVL Trees
- Self-Balancing Trees: Red-Black Trees
- 5.1 Key Properties
- 5.2 Insertion: Fixing Violations
- 5.3 Why Red-Black Trees?
- AVL vs. Red-Black Trees: A Comparison
- Applications of Tree Data Structures
- Conclusion
- References
What is a Tree?
A tree is a hierarchical data structure consisting of nodes connected by edges, with no cycles. It is composed of:
- Nodes: Entities storing data.
- Edges: Connections between nodes, representing parent-child relationships.
Unlike arrays or linked lists (linear structures), trees have a root-to-leaf hierarchy, making them ideal for representing hierarchical data (e.g., folder structures, organizational charts).
Tree Terminology
To navigate trees effectively, learn these key terms:
| Term | Definition |
|---|---|
| Root | The topmost node with no parent (e.g., the “C:” drive in a file system). |
| Parent/Child | A node directly connected to a child node below it (e.g., a folder and its subfolder). |
| Leaf Node | A node with no children (e.g., a file with no subfiles). |
| Subtree | A subset of the tree rooted at a specific node. |
| Height | The length of the longest path from a node to a leaf (root height = tree height). |
| Depth | The length of the path from the root to a node (root depth = 0). |
Example Tree:
1 (Root)
/ \
2 3
/ \
4 5 (Leaves: 4, 5, 3)
- Height of root (1): 2 (path 1→2→4 or 1→2→5).
- Depth of node 5: 2 (path 1→2→5).
Binary Trees: The Foundation
A binary tree is a tree where each node has at most two children: a left child and a right child. This constraint simplifies traversal and manipulation.
Types of Binary Trees
Binary trees are classified based on node arrangement:
-
Full Binary Tree: Every node has 0 or 2 children (no node has 1 child).
Example:1 / \ 2 3 / \ 4 5 -
Complete Binary Tree: All levels are fully filled except possibly the last, which is filled left-to-right.
Example:1 / \ 2 3 / 4 -
Perfect Binary Tree: All levels (including the last) are completely filled.
Example (height 2):1 / \ 2 3 / \ / \ 4 5 6 7 -
Degenerate Binary Tree: Each node has only one child (resembles a linked list, with O(n) operations).
Example:1 → 2 → 3 → 4 (all right children)
Binary Tree Traversals
Traversal is the process of visiting all nodes in a tree systematically. Common methods for binary trees:
1. In-Order Traversal (Left → Root → Right)
Visits left subtree first, then the root, then the right subtree.
Example (tree: 1 ←2→ 3, 2 ←4→5):
In-order: 4 → 2 → 5 → 1 → 3
2. Pre-Order Traversal (Root → Left → Right)
Visits root first, then left subtree, then right subtree.
Example: 1 → 2 → 4 → 5 → 3
3. Post-Order Traversal (Left → Right → Root)
Visits left subtree, then right subtree, then root.
Example: 4 → 5 → 2 → 3 → 1
4. Level-Order Traversal (Breadth-First)
Visits nodes level by level, starting from the root.
Example: 1 → 2 → 3 → 4 → 5
Binary Search Trees (BSTs)
A Binary Search Tree (BST) is a binary tree with a key property:
- For any node, all values in its left subtree are less than the node’s value.
- All values in its right subtree are greater than the node’s value.
This property enables efficient search, insertion, and deletion (ideally O(log n) time).
Properties of BSTs
- No duplicate values (by convention; some BSTs allow duplicates with left/right rules).
- In-order traversal of a BST yields nodes in sorted order.
Core Operations
1. Search
To search for a value x:
- Start at the root.
- If
x == root.value, return the node. - If
x < root.value, search the left subtree. - If
x > root.value, search the right subtree. - If no subtree exists,
xis not in the tree.
Time Complexity: O(h), where h is the tree height.
2. Insert
To insert a value x:
- Traverse the tree as in search.
- Insert
xas a leaf node in the correct position (left ifx < current node, right ifx > current node).
Example: Insert 4 into BST [5, 3, 7, 2, 6]:
5
/ \
3 7
/ /
2 6
Insert 4 → becomes right child of 3.
3. Delete
Deletion is trickier, with three cases:
- Case 1: Node is a leaf: Simply remove it.
- Case 2: Node has one child: Replace the node with its child.
- Case 3: Node has two children: Replace the node with its in-order successor (smallest value in the right subtree) or in-order predecessor (largest value in the left subtree), then delete the successor/predecessor.
Limitation: Unbalanced BSTs
If a BST is skewed (e.g., inserting sorted data like 1, 2, 3, 4), it degenerates into a linked list. This makes operations O(n) instead of O(log n). To solve this, we need self-balancing trees.
AVL Trees
Named after inventors Adelson-Velsky and Landis, AVL trees are self-balancing BSTs where the difference between the heights of left and right subtrees (balance factor) of any node is at most 1.
Balance Factor
For any node:
Balance Factor (BF) = Height(left subtree) - Height(right subtree)
Valid BF values: -1, 0, 1. If BF is 2 or -2, the tree is unbalanced and requires rotation.
Rotations
Rotations rebalance the tree by adjusting node positions. There are four cases:
1. Left-Left (LL) Imbalance
- Cause: Insertion in the left subtree of the left child.
- Fix: Right rotation.
Example:
30 (BF=2)
/
20 (BF=1)
/
10 (new node)
After right rotation:
20
/ \
10 30
2. Right-Right (RR) Imbalance
- Cause: Insertion in the right subtree of the right child.
- Fix: Left rotation.
Example:
10 (BF=-2)
\
20 (BF=-1)
\
30 (new node)
After left rotation:
20
/ \
10 30
3. Left-Right (LR) Imbalance
- Cause: Insertion in the right subtree of the left child.
- Fix: Left rotation on the left child, then right rotation on the unbalanced node.
4. Right-Left (RL) Imbalance
- Cause: Insertion in the left subtree of the right child.
- Fix: Right rotation on the right child, then left rotation on the unbalanced node.
Insertion and Deletion
Insertion in AVL trees follows BST insertion, then checks and fixes balance factors from the inserted node up to the root. Deletion is similar but may require multiple rotations (since deleting a node can unbalance ancestors).
Red-Black Trees
Red-Black trees are another self-balancing BST, but with looser balance constraints than AVL trees. They ensure the tree height is O(log n) via color rules, making insertions/deletions faster (fewer rotations).
Key Properties
- Every node is either red or black.
- The root is black.
- All NIL leaves (dummy nodes representing empty children) are black.
- If a node is red, both its children are black (no red-red parent-child pairs).
- Every path from a node to its descendant NIL leaves has the same number of black nodes (black height).
Insertion: Fixing Violations
New nodes are inserted as red. If insertion violates properties (e.g., red parent and red child), fix via:
- Recoloring: Flip colors of parent, uncle, and grandparent.
- Rotations: If recoloring doesn’t fix the violation (e.g., uncle is black), perform rotations (similar to AVL).
Why Red-Black Trees?
Red-Black trees have a maximum height of 2 log(n+1), which is higher than AVL’s log n, but they require fewer rotations during insertion/deletion. This makes them faster for dynamic datasets with frequent updates.
Comparison: AVL vs. Red-Black Trees
| Feature | AVL Trees | Red-Black Trees |
|---|---|---|
| Balance | Strict (BF ≤ 1) | Loose (black height) |
| Height | O(log n) (tighter bound) | O(log n) (looser bound) |
| Rotations | More (during insert/delete) | Fewer (faster updates) |
| Use Case | Lookup-heavy applications | Insert/delete-heavy applications |
Applications of Tree Data Structures
- BSTs: Simple sorting, symbol tables.
- AVL Trees: Databases with frequent lookups (e.g., dictionary apps).
- Red-Black Trees: Java’s
TreeMap, C++’sstd::map, Linux kernel (task scheduling). - B-Trees/B+ Trees: Database indexing (e.g., MySQL InnoDB uses B+ trees).
- Tries: Autocomplete, spell checkers.
Conclusion
Trees are foundational for efficient data management, with binary trees serving as the base. While BSTs offer O(log n) operations in theory, they degrade without balancing. AVL and Red-Black trees solve this with self-balancing mechanisms, each optimized for different use cases (lookup vs. updates). Mastering these structures unlocks the ability to design efficient algorithms for hierarchical and sorted data.
References
- Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (2009). Introduction to Algorithms (3rd ed.). MIT Press.
- GeeksforGeeks. “Tree Data Structure.” https://www.geeksforgeeks.org/tree-data-structure/
- Wikipedia. “AVL Tree.” https://en.wikipedia.org/wiki/AVL_tree
- Wikipedia. “Red-Black Tree.” https://en.wikipedia.org/wiki/Red%E2%80%93black_tree