C binary search tree BST traversal insertion deletion guide
|

C Trees and Binary Search Trees: The Complete Guide for 2026

What Are Trees?

A tree is a hierarchical data structure made of nodes connected by edges. Unlike arrays and linked lists that are linear, trees branch out — which lets you organize data in ways that enable incredibly fast searches, insertions, and deletions.

Key terminology you need to know:

  • Root — The topmost node (no parent)
  • Parent/Child — A node that connects downward to another
  • Leaf — A node with no children
  • Height — The longest path from root to a leaf
  • Depth — The distance from the root to a specific node
  • Subtree — Any node and all its descendants form a subtree

Trees are everywhere: your file system is a tree, HTML documents are trees, compilers build abstract syntax trees, and databases use B-trees for indexing. Once you understand trees, you’ll see them in every system you touch.

Binary Trees in C

A binary tree restricts each node to at most two children: left and right. In C, we represent this with a struct containing data and two pointers. If you need to brush up on how pointers and structs interact, review C Pointers and C Structs.

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

typedef struct TreeNode {
    int data;
    struct TreeNode *left;
    struct TreeNode *right;
} TreeNode;

TreeNode *node_create(int data) {
    TreeNode *node = malloc(sizeof(TreeNode));
    if (!node) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    node->data = data;
    node->left = NULL;
    node->right = NULL;
    return node;
}

// Build a simple tree manually:
//        10
//       /  \
//      5    15
//     / \     \
//    3   7    20

int main(void) {
    TreeNode *root = node_create(10);
    root->left = node_create(5);
    root->right = node_create(15);
    root->left->left = node_create(3);
    root->left->right = node_create(7);
    root->right->right = node_create(20);

    printf("Root: %d\n", root->data);
    printf("Left child: %d\n", root->left->data);
    printf("Right child: %d\n", root->right->data);

    return 0;
}

Each node is independently allocated with malloc. The tree’s shape is defined entirely by how you wire up the left and right pointers. No arrays, no indices — just pointers linking nodes together.

Tree Traversals

Traversal means visiting every node exactly once. There are four standard ways to traverse a binary tree, and each has different applications.

Inorder (Left, Root, Right)

For a BST, inorder traversal visits nodes in sorted order. This is the most commonly used traversal.

void inorder(TreeNode *node) {
    if (!node) return;
    inorder(node->left);
    printf("%d ", node->data);
    inorder(node->right);
}
// For our tree: 3 5 7 10 15 20

Preorder (Root, Left, Right)

Preorder visits the root first. Useful for creating a copy of the tree or serializing it.

void preorder(TreeNode *node) {
    if (!node) return;
    printf("%d ", node->data);
    preorder(node->left);
    preorder(node->right);
}
// For our tree: 10 5 3 7 15 20

Postorder (Left, Right, Root)

Postorder visits the root last. Perfect for freeing a tree — you delete children before the parent.

void postorder(TreeNode *node) {
    if (!node) return;
    postorder(node->left);
    postorder(node->right);
    printf("%d ", node->data);
}
// For our tree: 3 7 5 20 15 10

Level-Order (Breadth-First)

Level-order visits nodes level by level, left to right. This requires a queue instead of recursion:

#define MAX_QUEUE 100

void level_order(TreeNode *root) {
    if (!root) return;

    TreeNode *queue[MAX_QUEUE];
    int front = 0, rear = 0;

    queue[rear++] = root;

    while (front < rear) {
        TreeNode *current = queue[front++];
        printf("%d ", current->data);

        if (current->left)  queue[rear++] = current->left;
        if (current->right) queue[rear++] = current->right;
    }
    printf("\n");
}
// For our tree: 10 5 15 3 7 20

All three recursive traversals are naturally expressed with C Recursion. The base case is always if (!node) return;, which handles both leaves and empty subtrees elegantly.

The Binary Search Tree Property

A Binary Search Tree (BST) adds one crucial rule to a binary tree: for every node, all values in the left subtree are smaller, and all values in the right subtree are larger.

This property is what makes BSTs so powerful. To find a value, you don’t scan every node — you go left or right at each step, cutting the search space in half. That’s O(log n) for a balanced tree.

// Valid BST:         Invalid BST:
//      8                  8
//     / \                / \
//    3   10              3  10
//   / \    \            / \   \
//  1   6   14          1   9  14    (9 > 8, wrong side!)
//     / \
//    4   7

BST Insertion

Insertion follows the BST property. Compare the new value with the current node: go left if smaller, right if larger. When you hit NULL, that’s where the new node goes.

TreeNode *bst_insert(TreeNode *root, int data) {
    if (!root) return node_create(data);

    if (data < root->data)
        root->left = bst_insert(root->left, data);
    else if (data > root->data)
        root->right = bst_insert(root->right, data);
    // If data == root->data, we skip (no duplicates)

    return root;
}

// Usage:
// TreeNode *root = NULL;
// root = bst_insert(root, 8);
// root = bst_insert(root, 3);
// root = bst_insert(root, 10);
// root = bst_insert(root, 1);
// root = bst_insert(root, 6);

The recursive approach is elegant. The key insight is that bst_insert returns the (possibly new) root of the subtree. When root is NULL, we create and return a new node. Otherwise, we recurse and reassign the child pointer.

Searching follows the same logic as insertion:

TreeNode *bst_search(TreeNode *root, int target) {
    if (!root) return NULL;              // Not found
    if (target == root->data) return root; // Found it
    if (target < root->data)
        return bst_search(root->left, target);
    else
        return bst_search(root->right, target);
}

// Iterative version (avoids function call overhead):
TreeNode *bst_search_iter(TreeNode *root, int target) {
    while (root && root->data != target) {
        root = (target < root->data) ? root->left : root->right;
    }
    return root;
}

BST Deletion: The Three Cases

Deletion is the hardest BST operation because you must maintain the BST property after removing a node. There are three cases:

Case 1: Leaf node (no children). Just remove it. Set the parent’s pointer to NULL and free the node.

Case 2: One child. Replace the node with its only child. The subtree stays valid.

Case 3: Two children. This is the tricky one. Find the inorder successor (the smallest node in the right subtree), copy its value into the node being deleted, then delete the successor (which has at most one child, so it falls into Case 1 or 2).

// Helper: find the minimum node in a subtree
static TreeNode *find_min(TreeNode *node) {
    while (node->left) node = node->left;
    return node;
}

TreeNode *bst_delete(TreeNode *root, int data) {
    if (!root) return NULL;

    if (data < root->data) {
        root->left = bst_delete(root->left, data);
    } else if (data > root->data) {
        root->right = bst_delete(root->right, data);
    } else {
        // Found the node to delete

        // Case 1 & 2: No left child or no right child
        if (!root->left) {
            TreeNode *right_child = root->right;
            free(root);
            return right_child;
        }
        if (!root->right) {
            TreeNode *left_child = root->left;
            free(root);
            return left_child;
        }

        // Case 3: Two children
        TreeNode *successor = find_min(root->right);
        root->data = successor->data;
        root->right = bst_delete(root->right, successor->data);
    }
    return root;
}

Complete BST Implementation

Let’s put it all together with utility functions for height, counting, validation, printing, and memory cleanup:

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <limits.h>

typedef struct TreeNode {
    int data;
    struct TreeNode *left;
    struct TreeNode *right;
} TreeNode;

TreeNode *node_create(int data) {
    TreeNode *n = malloc(sizeof(TreeNode));
    if (!n) { perror("malloc"); exit(1); }
    n->data = data;
    n->left = n->right = NULL;
    return n;
}

TreeNode *bst_insert(TreeNode *root, int data) {
    if (!root) return node_create(data);
    if (data < root->data)
        root->left = bst_insert(root->left, data);
    else if (data > root->data)
        root->right = bst_insert(root->right, data);
    return root;
}

TreeNode *bst_search(TreeNode *root, int target) {
    while (root && root->data != target)
        root = (target < root->data) ? root->left : root->right;
    return root;
}

static TreeNode *find_min(TreeNode *n) {
    while (n->left) n = n->left;
    return n;
}

TreeNode *bst_delete(TreeNode *root, int data) {
    if (!root) return NULL;
    if (data < root->data)
        root->left = bst_delete(root->left, data);
    else if (data > root->data)
        root->right = bst_delete(root->right, data);
    else {
        if (!root->left) {
            TreeNode *tmp = root->right; free(root); return tmp;
        }
        if (!root->right) {
            TreeNode *tmp = root->left; free(root); return tmp;
        }
        TreeNode *succ = find_min(root->right);
        root->data = succ->data;
        root->right = bst_delete(root->right, succ->data);
    }
    return root;
}

// --- Utility functions ---

int bst_height(TreeNode *root) {
    if (!root) return -1;  // Empty tree has height -1
    int lh = bst_height(root->left);
    int rh = bst_height(root->right);
    return 1 + (lh > rh ? lh : rh);
}

int bst_count(TreeNode *root) {
    if (!root) return 0;
    return 1 + bst_count(root->left) + bst_count(root->right);
}

// Validate BST property using min/max bounds
static bool validate_helper(TreeNode *node, long min, long max) {
    if (!node) return true;
    if (node->data <= min || node->data >= max) return false;
    return validate_helper(node->left, min, node->data)
        && validate_helper(node->right, node->data, max);
}

bool bst_is_valid(TreeNode *root) {
    return validate_helper(root, LONG_MIN, LONG_MAX);
}

// Print tree structure (sideways, right subtree on top)
void bst_print(TreeNode *root, int depth) {
    if (!root) return;
    bst_print(root->right, depth + 1);
    for (int i = 0; i < depth; i++) printf("    ");
    printf("%d\n", root->data);
    bst_print(root->left, depth + 1);
}

void inorder(TreeNode *root) {
    if (!root) return;
    inorder(root->left);
    printf("%d ", root->data);
    inorder(root->right);
}

void bst_free(TreeNode *root) {
    if (!root) return;
    bst_free(root->left);
    bst_free(root->right);
    free(root);
}

int main(void) {
    TreeNode *root = NULL;
    int values[] = {50, 30, 70, 20, 40, 60, 80, 10, 35, 65};
    int n = sizeof(values) / sizeof(values[0]);

    for (int i = 0; i < n; i++)
        root = bst_insert(root, values[i]);

    printf("Tree structure:\n");
    bst_print(root, 0);

    printf("\nInorder: ");
    inorder(root);
    printf("\n");

    printf("Height: %d\n", bst_height(root));
    printf("Count: %d\n", bst_count(root));
    printf("Valid BST: %s\n", bst_is_valid(root) ? "yes" : "no");

    // Search
    int target = 40;
    TreeNode *found = bst_search(root, target);
    printf("Search %d: %s\n", target, found ? "FOUND" : "NOT FOUND");

    // Delete node with two children
    printf("\nDeleting 30...\n");
    root = bst_delete(root, 30);
    printf("Inorder after delete: ");
    inorder(root);
    printf("\n");

    bst_free(root);
    return 0;
}

Notice how bst_free uses postorder traversal — it frees children before the parent. If you freed the parent first, you’d lose access to its children, causing memory leaks. This is a pattern you’ll see in every tree-based data structure. For more on managing dynamic memory properly, see C Dynamic Memory Allocation.

Why Balancing Matters

A BST’s performance depends on its shape. If you insert elements in sorted order (1, 2, 3, 4, 5), the tree degenerates into a linked list:

// Balanced tree (height 2):    Degenerate tree (height 4):
//        3                      1
//       / \                      \
//      1   5                      2
//       \   \                      \
//        2   7                      3
//                                    \
//                                     4
//                                      \
//                                       5

The balanced tree gives O(log n) operations. The degenerate tree gives O(n) — no better than a linked list. For 1,000,000 elements, that’s the difference between 20 comparisons and 1,000,000.

Self-balancing trees like AVL trees and Red-Black trees solve this by automatically restructuring after each insertion or deletion. They guarantee O(log n) height regardless of insertion order. Those are more advanced topics, but understanding why they exist starts here.

Real-World Applications

  • File systems — Directory structures are trees. Each folder is a node with children (files and subfolders).
  • Databases — B-trees and B+ trees power database indices (MySQL, PostgreSQL). They’re designed for disk-based storage where each node holds multiple keys.
  • Compilers — Abstract Syntax Trees (ASTs) represent parsed code. The expression 3 + 4 * 5 becomes a tree with + at the root, 3 on the left, and a * subtree on the right.
  • Networking — Routing tables use tries (prefix trees) for IP address lookup.
  • Autocomplete — Tries store dictionaries efficiently, enabling prefix-based search in real time.

Common Mistakes

1. Not Handling the Empty Tree

Every BST function must handle root == NULL. Forgetting this check leads to segfaults on the first dereference.

2. Losing the Root After Deletion

When deleting the root node, the tree’s root changes. Always reassign: root = bst_delete(root, value);. If you just call bst_delete(root, value) without capturing the return value, your root pointer may be dangling.

3. Memory Leaks in Deletion

In Case 3 (two children), you copy the successor’s data and then delete the successor. Make sure the successor’s node is actually freed. The recursive call to bst_delete handles this, but if you implement deletion iteratively, it’s easy to forget.

4. Incorrect BST Validation

A common mistake is checking only immediate children: “left child < parent < right child.” This is insufficient. A node in the left subtree must be less than all ancestors above it on the right path. The min/max bounds approach shown above is correct.

5. Freeing Nodes in the Wrong Order

Always use postorder traversal for freeing. Preorder or inorder traversal would free a parent while its children still reference it. Check C Memory Bugs for tools to catch these issues.

Practice Exercises

  1. Find the kth smallest element in a BST. Hint: use inorder traversal with a counter.
  2. Find the Lowest Common Ancestor (LCA) of two nodes in a BST. Use the BST property: if both values are smaller than the current node, go left; if both are larger, go right; otherwise, the current node is the LCA.
  3. Convert a sorted array to a balanced BST. Pick the middle element as root, recurse on left and right halves.
  4. Serialize and deserialize a BST. Convert the tree to a string (preorder traversal), then reconstruct it from that string.
  5. Check if two BSTs are identical. Write a recursive function that compares structure and values simultaneously.
  6. Implement iterative inorder traversal using an explicit stack (no recursion). This is commonly asked in interviews and forces you to understand how the call stack works.

Trees are a turning point in your C journey. Once you can build and manipulate tree structures with confidence, you’re ready for graphs, heaps, and the full range of advanced algorithms. Continue with the C Programming Roadmap to see what’s next.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *