Data Structures
For the longest time, I only used two data structures: arrays and objects. They got me surprisingly far—JavaScript's Array and {} are flexible enough to solve most problems. But when I started working on features that handled real volume—search results, dependency graphs, priority queues—I hit a wall. Not because the code was wrong, but because the wrong data structure made it slow, awkward, or both.
Learning data structures isn't about memorizing textbook definitions. It's about building intuition for the question: given how I need to access this data, what shape should it have?
Arrays
Arrays are the default. Contiguous, ordered, index-accessible. In JavaScript, arrays are objects under the hood (sparse arrays can even degrade to dictionary mode), but V8 backs dense arrays with contiguous storage, so the mental model of a numbered list holds.
const items = ["apple", "banana", "cherry"]
items[1] // O(1) — index access
items.push("date") // O(1) amortized — append
items.includes("banana") // O(n) — linear scan
items.splice(1, 1) // O(n) — shift elements after removal
Use arrays when: you need ordered data, index-based access, or iteration. They're the right default.
Watch out for: linear-time operations hiding in innocent-looking code. Array.includes(), Array.indexOf(), Array.find() are all O(n). Inside a loop, they become O(n²).
| Operation | Time |
|---|---|
| Access by index | O(1) |
| Push / pop (end) | O(1) amortized |
| Unshift / shift (start) | O(n) |
| Search by value | O(n) |
| Insert / delete at index | O(n) |
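To make that hidden-O(n²) warning concrete, here is a sketch comparing an includes()-based dedupe with a Set-based one (dedupeSlow and dedupeFast are illustrative names, not standard APIs):

```javascript
// O(n²): includes() rescans the result array for every input element
function dedupeSlow(items) {
  const result = []
  for (const item of items) {
    if (!result.includes(item)) result.push(item)
  }
  return result
}

// O(n): Set membership checks are O(1) on average
function dedupeFast(items) {
  return [...new Set(items)]
}

dedupeFast(["a", "b", "a", "c", "b"]) // ['a', 'b', 'c']
```

Both return the same result; only the asymptotics differ, which you'll notice the first time the input has a few hundred thousand elements.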
Linked Lists
A linked list is a chain of nodes, each pointing to the next. Unlike arrays, elements aren't stored contiguously in memory—each node is allocated separately on the heap.
type Node<T> = {
value: T
next: Node<T> | null
}
function createNode<T>(value: T, next: Node<T> | null = null): Node<T> {
return { value, next }
}
const list = createNode(1, createNode(2, createNode(3, null)))
Strengths: O(1) insertion and deletion at a known position (you just rewire pointers). No shifting elements like arrays.
Weaknesses: O(n) access by index (you have to walk from the head). No cache locality—nodes are scattered across memory, so CPU caches don't help.
When to use them: Rarely in JavaScript. Arrays with push/pop cover most cases. Linked lists shine in systems programming (OS schedulers, memory allocators) and when you need a queue with frequent insertion/removal at both ends.
In practice, JavaScript's Array is almost always faster than a hand-rolled linked list due to V8's optimizations and cache-friendly memory layout. But understanding them matters because they underpin many other structures: stacks, queues, hash table chaining, and graph adjacency lists.
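To see the O(1) pointer rewiring in code, here is a plain-JavaScript sketch of insertion after a known node (insertAfter is an illustrative helper, mirroring the createNode shape above):

```javascript
function createNode(value, next = null) {
  return { value, next }
}

// O(1): link the new node to the current successor, then point the
// current node at it. Two pointer writes, no element shifting.
function insertAfter(node, value) {
  node.next = createNode(value, node.next)
  return node.next
}

const list = createNode(1, createNode(3))
insertAfter(list, 2) // list is now 1 -> 2 -> 3
```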
Stacks and Queues
Both are restricted-access structures. They trade flexibility for clarity about how data flows.
Stack (LIFO — Last In, First Out): The last element added is the first removed. Think of a stack of plates.
const stack = []
stack.push("a") // ['a']
stack.push("b") // ['a', 'b']
stack.push("c") // ['a', 'b', 'c']
stack.pop() // 'c' — last in, first out
Stacks appear everywhere: the JavaScript call stack, undo/redo systems, parsing expressions, DFS traversal.
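One of those parsing uses, sketched: balanced-bracket checking with a stack (isBalanced is an illustrative name):

```javascript
// Push openers; on a closer, the matching opener must be on top.
// A leftover or mismatched opener means the string is unbalanced.
function isBalanced(str) {
  const pairs = { ")": "(", "]": "[", "}": "{" }
  const stack = []
  for (const ch of str) {
    if (ch === "(" || ch === "[" || ch === "{") {
      stack.push(ch)
    } else if (ch in pairs) {
      if (stack.pop() !== pairs[ch]) return false
    }
  }
  return stack.length === 0
}

isBalanced("({[]})") // true
isBalanced("([)]") // false
```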
Queue (FIFO — First In, First Out): The first element added is the first removed. Think of a line at a store.
const queue = []
queue.push("a") // ['a']
queue.push("b") // ['a', 'b']
queue.push("c") // ['a', 'b', 'c']
queue.shift() // 'a' — first in, first out
Queues appear in: BFS traversal, task scheduling, message queues, print spoolers, rate limiters.
Note: Array.shift() is O(n) because it re-indexes all elements. For performance-critical queues, use a linked list or a ring buffer. For most application code, the array-based approach is fine.
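One way to sidestep the O(n) shift(), sketched below: keep a head index and advance it on dequeue instead of removing the front element (SimpleQueue is a hypothetical class, not a built-in, and the compaction threshold is arbitrary):

```javascript
// Dequeue becomes O(1) amortized; consumed slots at the front are
// reclaimed by occasional compaction rather than on every dequeue.
class SimpleQueue {
  constructor() {
    this.items = []
    this.head = 0
  }
  enqueue(value) {
    this.items.push(value)
  }
  dequeue() {
    if (this.head >= this.items.length) return undefined
    const value = this.items[this.head]
    this.head += 1
    // Compact once the consumed prefix dominates the array
    if (this.head > 64 && this.head * 2 >= this.items.length) {
      this.items = this.items.slice(this.head)
      this.head = 0
    }
    return value
  }
  get size() {
    return this.items.length - this.head
  }
}

const q = new SimpleQueue()
q.enqueue("a")
q.enqueue("b")
q.dequeue() // 'a'
q.size // 1
```

A ring buffer goes one step further and reuses the freed slots in place, but this index-based version already removes the re-indexing cost.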
Hash Maps and Sets
Hash maps (called Map or plain objects in JavaScript, dict in Python, HashMap in Java) are the most important data structure after arrays. They store key-value pairs and provide O(1) average-case lookups.
const userMap = new Map()
userMap.set("u_123", { name: "Alice", role: "admin" })
userMap.set("u_456", { name: "Bob", role: "viewer" })
userMap.get("u_123") // O(1)
userMap.has("u_789") // O(1)
userMap.delete("u_456") // O(1)
How they work: A hash function converts the key into an array index. The value is stored at that index. When two keys hash to the same index (a collision), the map uses chaining (linked list at that slot) or open addressing (probing nearby slots).
Sets are hash maps without values—they just track membership.
const seen = new Set()
seen.add("apple")
seen.add("banana")
seen.has("apple") // true — O(1)
seen.has("cherry") // false — O(1)
The practical rule: If you're calling Array.includes() or Array.find() repeatedly, you probably want a Set or Map instead. Converting an array to a Set is O(n) once, then every lookup is O(1).
| Operation | Map/Set | Array |
|---|---|---|
| Lookup by key | O(1) | O(n) |
| Insert | O(1) | O(1) push, O(n) unshift |
| Delete | O(1) | O(n) |
| Check existence | O(1) | O(n) |
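A canonical Map use in practice, sketched below: counting occurrences in one pass, with an O(1) update per element (countBy is an illustrative name):

```javascript
// One pass over the input; each element costs one O(1) Map read
// and one O(1) Map write
function countBy(items) {
  const counts = new Map()
  for (const item of items) {
    counts.set(item, (counts.get(item) || 0) + 1)
  }
  return counts
}

countBy(["a", "b", "a", "a", "c"]).get("a") // 3
```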
Trees
Trees are hierarchical structures where each node has a value and zero or more children. The DOM is a tree. File systems are trees. JSON is a tree. ASTs are trees.
Binary Search Trees (BST)
A BST is a tree where each node has at most two children, and for every node: all values in the left subtree are smaller, all values in the right subtree are larger.
8
/ \
3 10
/ \ \
1 6 14
/ \ /
4 7 13
Searching for a value is like binary search—you eliminate half the tree at each step. In a balanced BST, search, insert, and delete are all O(log n).
The catch: If you insert sorted data into a BST, it degenerates into a linked list—O(n) for everything. This is why balanced trees exist.
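A sketch of the search walk, using plain { value, left, right } nodes (bstSearch is an illustrative name):

```javascript
// Walk down from the root: each comparison eliminates an entire
// subtree. O(log n) when balanced, O(n) if the tree has degenerated
// into a chain.
function bstSearch(node, target) {
  while (node !== null) {
    if (target === node.value) return node
    node = target < node.value ? node.left : node.right
  }
  return null
}

// Part of the tree from the diagram above, built by hand
const tree = {
  value: 8,
  left: {
    value: 3,
    left: { value: 1, left: null, right: null },
    right: { value: 6, left: null, right: null },
  },
  right: {
    value: 10,
    left: null,
    right: { value: 14, left: null, right: null },
  },
}

bstSearch(tree, 6).value // 6
bstSearch(tree, 5) // null
```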
Balanced Trees
AVL trees and Red-Black trees are self-balancing BSTs. They automatically restructure (rotate) after insertions and deletions to maintain O(log n) height. Red-Black trees are used internally by Java's TreeMap, C++'s std::map, and many database engines.
B-Trees
B-trees are the data structure behind database indexes. Unlike binary trees (2 children per node), B-trees have many children per node (hundreds or thousands). This minimizes disk reads—each node fits in a single disk page, so a tree of millions of entries is only 3-4 levels deep.
[17 | 35]
/ | \
[3|8|12] [20|28] [38|45|60]
When you add an index to a PostgreSQL column, you're building a B-tree. This is why indexed lookups are O(log n) while full table scans are O(n).
Heaps
A heap is a complete binary tree where each parent is smaller (min-heap) or larger (max-heap) than its children. The root always holds the extreme value.
Min-heap:
1
/ \
3 2
/ \
7 5
Heaps are stored as arrays (no pointers needed—parent of index i is at floor((i-1)/2), children at 2i+1 and 2i+2).
| Operation | Time |
|---|---|
| Find min/max | O(1) |
| Insert | O(log n) |
| Extract min/max | O(log n) |
| Build from array | O(n) |
Use heaps for: priority queues (task scheduling, event processing), finding the k-th smallest/largest element, merge k sorted lists, Dijkstra's shortest path algorithm.
// Conceptual: a priority queue using a min-heap
const pq = new MinHeap()
pq.insert({ task: "critical-alert", priority: 1 })
pq.insert({ task: "log-cleanup", priority: 10 })
pq.insert({ task: "user-email", priority: 3 })
pq.extractMin() // { task: 'critical-alert', priority: 1 }
pq.extractMin() // { task: 'user-email', priority: 3 }
JavaScript doesn't have a built-in heap, but you can implement one in ~40 lines or use a library.
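A minimal sketch of such a heap, storing plain numbers (comparator support and error handling omitted for brevity):

```javascript
class MinHeap {
  constructor() {
    this.heap = []
  }
  swap(i, j) {
    const tmp = this.heap[i]
    this.heap[i] = this.heap[j]
    this.heap[j] = tmp
  }
  // Append, then bubble up while smaller than the parent: O(log n)
  insert(value) {
    this.heap.push(value)
    let i = this.heap.length - 1
    while (i > 0) {
      const parent = Math.floor((i - 1) / 2)
      if (this.heap[parent] <= this.heap[i]) break
      this.swap(parent, i)
      i = parent
    }
  }
  // Remove the root, move the last element up, sink it down: O(log n)
  extractMin() {
    if (this.heap.length === 0) return undefined
    const min = this.heap[0]
    const last = this.heap.pop()
    if (this.heap.length > 0) {
      this.heap[0] = last
      let i = 0
      for (;;) {
        let smallest = i
        const left = 2 * i + 1
        const right = 2 * i + 2
        if (left < this.heap.length && this.heap[left] < this.heap[smallest]) smallest = left
        if (right < this.heap.length && this.heap[right] < this.heap[smallest]) smallest = right
        if (smallest === i) break
        this.swap(i, smallest)
        i = smallest
      }
    }
    return min
  }
}

const heap = new MinHeap()
heap.insert(5)
heap.insert(1)
heap.insert(3)
heap.extractMin() // 1
heap.extractMin() // 3
```

Note the index arithmetic from the table section: parent at floor((i-1)/2), children at 2i+1 and 2i+2; no pointers needed.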
Graphs
A graph is a set of nodes (vertices) connected by edges. Unlike trees, graphs can have cycles, multiple paths between nodes, and nodes with no connections.
Graphs model: social networks (users and friendships), road maps (cities and routes), dependency trees (packages and imports), the internet (pages and links), state machines (states and transitions).
Representation
Adjacency list — each node stores a list of its neighbors. Memory-efficient for sparse graphs.
const graph = {
A: ["B", "C"],
B: ["A", "D"],
C: ["A", "D"],
D: ["B", "C"],
}
Adjacency matrix — a 2D array where matrix[i][j] = 1 if there's an edge from node i to j. Better for dense graphs and when you need O(1) edge existence checks.
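The same four-node graph as a matrix might look like this sketch (node letters mapped to row/column indices by hand):

```javascript
// Rows and columns are indexed A=0, B=1, C=2, D=3;
// matrix[i][j] === 1 means there is an edge between node i and node j
const matrix = [
  // A  B  C  D
  [0, 1, 1, 0], // A
  [1, 0, 0, 1], // B
  [1, 0, 0, 1], // C
  [0, 1, 1, 0], // D
]

matrix[0][1] // 1: the A-B edge exists, checked in O(1)
matrix[0][3] // 0: no direct A-D edge
```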
Traversal
BFS (Breadth-First Search): Visits all neighbors at the current depth before moving deeper. Uses a queue. Finds the shortest path in unweighted graphs.
function bfs(graph, start) {
const visited = new Set()
const queue = [start]
visited.add(start)
while (queue.length > 0) {
const node = queue.shift()
for (const neighbor of graph[node]) {
if (!visited.has(neighbor)) {
visited.add(neighbor)
queue.push(neighbor)
}
}
}
return visited
}
DFS (Depth-First Search): Goes as deep as possible before backtracking. Uses a stack (or recursion). Useful for detecting cycles, topological sorting, and exploring all paths.
Practical example: Your bundler (webpack, Vite, turbopack) builds a dependency graph of your modules and uses topological sort (a DFS-based algorithm) to determine the correct build order.
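A sketch of that idea: a DFS post-order over a dependency map yields a valid build order (topoSort is an illustrative name, and it assumes the graph is acyclic; real bundlers also detect cycles):

```javascript
// Visit a module's dependencies first, then append the module itself,
// so dependencies always come before dependents in the result
function topoSort(graph) {
  const visited = new Set()
  const order = []
  function visit(node) {
    if (visited.has(node)) return
    visited.add(node)
    for (const dep of graph[node] || []) visit(dep)
    order.push(node)
  }
  for (const node of Object.keys(graph)) visit(node)
  return order
}

const modules = {
  "app.js": ["utils.js", "api.js"],
  "api.js": ["utils.js"],
  "utils.js": [],
}
topoSort(modules) // ['utils.js', 'api.js', 'app.js']
```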
Tries (Prefix Trees)
A trie is a tree where each node represents a character, and paths from the root spell out strings (a flag marks where a word ends, since one word can be a prefix of another). They're optimized for prefix-based lookups.
root
/ | \
c d t
| | |
a o h
| | |
t g e
This trie stores "cat", "dog", "the". Looking up a word of length k takes O(k) time, regardless of how many words are in the trie.
Use tries for: autocomplete (find all words with prefix "ca"), spell checking, IP routing tables, dictionary implementations. If you've ever wondered how search bars suggest completions as you type, tries (or compressed variants like radix trees) are the answer.
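A minimal sketch, using plain objects as nodes (Trie here is illustrative; a real implementation would add deletion and prefix enumeration for autocomplete):

```javascript
// Each node is a plain object mapping characters to child nodes;
// an `end` flag marks where a stored word terminates
class Trie {
  constructor() {
    this.root = {}
  }
  // O(k) for a word of length k: one node per character
  insert(word) {
    let node = this.root
    for (const ch of word) {
      if (!node[ch]) node[ch] = {}
      node = node[ch]
    }
    node.end = true
  }
  // O(k): walk the prefix; if the path exists, some word starts with it
  startsWith(prefix) {
    let node = this.root
    for (const ch of prefix) {
      node = node[ch]
      if (!node) return false
    }
    return true
  }
}

const trie = new Trie()
trie.insert("cat")
trie.insert("car")
trie.startsWith("ca") // true
trie.startsWith("do") // false
```

Note the lookup cost depends only on the prefix length, never on how many words are stored, which is exactly the property the section describes.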
Choosing the Right Structure
The decision tree I use:
- Need ordered, index-based access? → Array
- Need fast lookup by key? → Map or object
- Need fast membership checks? → Set
- Need LIFO access? → Stack (array with push/pop)
- Need FIFO access? → Queue (array or linked list)
- Need sorted data with fast insert/search? → Balanced BST or sorted array + binary search
- Need the min or max at all times? → Heap
- Need to model relationships/connections? → Graph
- Need prefix-based string lookups? → Trie
The wrong data structure doesn't just make code slow—it makes it awkward. When you're fighting the structure to express your logic, that's a signal to reconsider your choice. The right structure makes the algorithm obvious.
The Pragmatic Takeaway
You don't need to implement a red-black tree from scratch to be an effective engineer. But you do need to know that it exists, what it's good for, and when to reach for it (or for the library that uses it internally).
Most application code runs on arrays, maps, and sets. Master those three—especially knowing when to convert between them—and you'll solve 90% of performance problems related to data access patterns. The remaining 10% is when trees, heaps, graphs, and tries go from "textbook knowledge" to "the thing that saved my feature."
Data structures are the vocabulary of efficient code. Algorithms are the sentences. You need both, but the vocabulary comes first.