Discrete Math and Theory of Computation
I avoided math for most of my career. I figured that since I wasn't doing machine learning or graphics, I didn't need it. Then I tried to understand why regular expressions can't match balanced parentheses, why some problems are "hard" in a precise sense, and why a proof that P ≠ NP would be worth a million dollars. The answers all lived in discrete math and the theory of computation—a branch of CS that felt abstract until I realized it's the foundation that explains the limits of what software can and cannot do.
This post isn't a textbook. It's the engineer-usable subset of discrete math and computation theory: the ideas that sharpen your thinking and show up—sometimes disguised—in real systems.
Logic
Propositional logic deals with statements that are either true or false, connected by operators.
| Operator | Symbol | Meaning | Example |
|---|---|---|---|
| AND | ∧ | Both true | isAdmin ∧ isLoggedIn |
| OR | ∨ | At least one true | hasTicket ∨ isVIP |
| NOT | ¬ | Opposite | ¬isBlocked |
| IMPLIES | → | If A then B | isPremium → canAccessContent |
These map directly to code:
// Propositional logic in a permission check
const canEdit = isAdmin && isLoggedIn
const canView = hasTicket || isVIP
const isAllowed = !isPremium || canAccessContent // A → B is equivalent to ¬A ∨ B

De Morgan's Laws are the most useful identities for simplifying conditions:
- ¬(A ∧ B) = ¬A ∨ ¬B — "not (A and B)" = "not A or not B"
- ¬(A ∨ B) = ¬A ∧ ¬B — "not (A or B)" = "not A and not B"
// These are equivalent
if (!(isAdmin && isLoggedIn)) { ... }
if (!isAdmin || !isLoggedIn) { ... }

Predicate logic adds variables and quantifiers:
- ∀x (for all x): "Every user has an email" → ∀u ∈ Users: u.email ≠ null
- ∃x (there exists x): "Some order is pending" → ∃o ∈ Orders: o.status = 'pending'
These correspond to Array.every() and Array.some() in JavaScript:
const allHaveEmail = users.every((u) => u.email !== null) // ∀
const hasPending = orders.some((o) => o.status === "pending") // ∃

Sets and Functions
A set is an unordered collection of unique elements. Set theory is the language of databases, type systems, and access control.
const A = new Set([1, 2, 3, 4])
const B = new Set([3, 4, 5, 6])
// Union: A ∪ B — all elements in either set
const union = new Set([...A, ...B]) // {1,2,3,4,5,6}
// Intersection: A ∩ B — elements in both sets
const intersection = new Set([...A].filter((x) => B.has(x))) // {3,4}
// Difference: A \ B — elements in A but not B
const difference = new Set([...A].filter((x) => !B.has(x))) // {1,2}

SQL operations map to set operations: UNION, INTERSECT, EXCEPT. A JOIN is a filtered Cartesian product. Understanding sets helps you reason about database queries without running them.
A function in math maps each input to exactly one output. The concepts of injective (one-to-one), surjective (onto), and bijective (both) matter for hashing, encoding, and cryptography:
- A hash function is not injective (many inputs can produce the same hash—collisions).
- An encryption function (for a fixed key) must be injective: two plaintexts can't produce the same ciphertext, or decryption would be ambiguous. Block ciphers go further and are bijections (permutations) on their block space.
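The non-injectivity of hashing is easy to see with a toy example. The hash below (`toyHash` is a hypothetical illustration, not a real algorithm) sums character codes modulo 16, so by the pigeonhole principle any set of more than 16 strings must contain a collision:

```javascript
// A toy hash: sum of character codes modulo 16. With only 16 possible
// outputs, the pigeonhole principle guarantees collisions once more
// than 16 distinct inputs are hashed.
function toyHash(s) {
  let sum = 0
  for (const ch of s) sum += ch.charCodeAt(0)
  return sum % 16
}

// Two different inputs, same hash: the function is not injective.
console.log(toyHash("ab")) // same as toyHash("ba") — the sum ignores order
console.log(toyHash("ba"))
```

Real hash functions have vastly larger output spaces, but the same counting argument applies: more possible inputs than outputs means collisions must exist.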
Graphs in Mathematics
Graphs are a mathematical structure of nodes (vertices) and edges. They're the formal basis for everything from social networks to dependency resolution.
Key terminology:
- Directed vs undirected: Edges have direction (one-way streets) or not (two-way roads).
- Weighted vs unweighted: Edges have costs/distances or all edges are equal.
- Cycle: A path that starts and ends at the same node. Detecting cycles matters in dependency graphs (circular dependencies) and deadlock detection.
- Connected: Every node can reach every other node. A connected component is a maximal connected subgraph.
- DAG (Directed Acyclic Graph): A directed graph with no cycles. Build systems, package managers, and spreadsheet cell dependencies are all DAGs.
Euler's formula for connected planar graphs: V - E + F = 2 (vertices - edges + faces = 2). This shows up in computational geometry and explains why certain network topologies have structural constraints.
Graph coloring: Assigning colors to nodes such that no two adjacent nodes share a color. This is used in register allocation (compilers), scheduling (no two conflicting tasks at the same time), and map coloring.
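Cycle detection in a dependency graph, mentioned above, is a short depth-first search. A minimal sketch (adjacency list as a plain object, node → array of dependencies):

```javascript
// Cycle detection in a directed graph via depth-first search.
// An edge back to a node on the current DFS path means a cycle
// (a circular dependency).
function hasCycle(graph) {
  const WHITE = 0, GRAY = 1, BLACK = 2
  const color = {}
  for (const node of Object.keys(graph)) color[node] = WHITE

  function visit(node) {
    color[node] = GRAY // on the current DFS path
    for (const dep of graph[node] || []) {
      if (color[dep] === GRAY) return true // back edge: cycle found
      if (color[dep] === WHITE && visit(dep)) return true
    }
    color[node] = BLACK // fully explored, no cycle through here
    return false
  }

  return Object.keys(graph).some((n) => color[n] === WHITE && visit(n))
}

const acyclic = { app: ["lib"], lib: ["utils"], utils: [] }
const cyclic = { a: ["b"], b: ["c"], c: ["a"] }
console.log(hasCycle(acyclic)) // false
console.log(hasCycle(cyclic)) // true
```

This is essentially what package managers and build systems run before deciding an install or build order exists at all.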
Combinatorics
Combinatorics is the math of counting. How many ways can things be arranged, selected, or combined?
Permutations: Ordered arrangements. The number of ways to arrange n items is n! (n factorial).
3 items: 3! = 6 arrangements
10 items: 10! = 3,628,800 arrangements
20 items: 20! ≈ 2.4 × 10^18 arrangements
This is why brute-force solutions that try all permutations are impractical for even moderately sized inputs.
Combinations: Unordered selections. Choosing k items from n: C(n,k) = n! / (k!(n-k)!)
Choose 2 from 5: C(5,2) = 10
Choose 5 from 52 (poker hand): C(52,5) = 2,598,960
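Both counts are a few lines of code. The `choose` function below computes C(n,k) multiplicatively, which avoids the enormous intermediate factorials that n! / (k!(n-k)!) would produce:

```javascript
// Counting arrangements and selections. factorial grows explosively;
// choose(n, k) is computed multiplicatively, keeping intermediates small.
function factorial(n) {
  let result = 1
  for (let i = 2; i <= n; i++) result *= i
  return result
}

function choose(n, k) {
  let result = 1
  // Each partial product is itself a binomial coefficient, so the
  // division is always exact.
  for (let i = 1; i <= k; i++) result = (result * (n - k + i)) / i
  return result
}

console.log(factorial(10)) // 3628800
console.log(choose(5, 2)) // 10
console.log(choose(52, 5)) // 2598960
```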
The pigeonhole principle: If n items are put into m containers and n > m, at least one container has more than one item. This sounds trivial but has profound implications:
- A hash table with more items than slots must have collisions.
- Any lossless compression algorithm must make some inputs larger (you can't map all inputs to shorter outputs—there aren't enough shorter strings).
- In a room of 367 people, at least two share a birthday.
Proofs (Intuition Level)
You don't need to write proofs, but understanding proof techniques helps you reason about correctness.
Proof by contradiction: Assume the opposite of what you want to prove, and show it leads to a contradiction.
The halting problem proof uses this: Assume a program H exists that can determine whether any program halts. Construct a program P that, given a program as input, asks H whether that program halts when run on itself, and then does the opposite (loops if H says it halts, halts if H says it loops). Running P on P contradicts H's own answer, so H can't exist.
Proof by induction: Prove a base case, then prove that if the statement holds for n, it holds for n+1.
This is exactly how recursive algorithms work. A recursive function has a base case and an inductive step. If both are correct, the function is correct for all valid inputs.
Invariants: A property that remains true throughout an algorithm's execution. Loop invariants are the formal way to prove a loop is correct. In practice, you use them informally: "at the start of each iteration, the min variable holds the smallest value seen so far."
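The min example above, with its invariant spelled out as comments; a minimal sketch:

```javascript
// Loop invariant made explicit: at the start of each iteration,
// `min` holds the smallest value among values[0..i-1].
function findMin(values) {
  if (values.length === 0) throw new Error("empty input")
  let min = values[0] // invariant holds trivially for the first element
  for (let i = 1; i < values.length; i++) {
    // Invariant: min === smallest of values[0..i-1]
    if (values[i] < min) min = values[i]
    // Invariant restored: min === smallest of values[0..i]
  }
  // At loop exit, the invariant covers the whole array — that's the proof.
  return min
}

console.log(findMin([5, 3, 8, 1, 9])) // 1
```

The structure mirrors induction exactly: the initialization is the base case, and each iteration is the inductive step.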
Regular Languages and Finite Automata
A finite automaton (FA) is the simplest model of computation: a machine with a finite set of states that reads input one character at a time and transitions between states.
Picture a three-state automaton that accepts any string containing "ab" as a substring: a start state, a "just saw a" state, and an accepting state (drawn as a double circle) reached once "ab" has appeared.
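That "contains ab" machine can be written directly as code; a minimal sketch, with one branch per state:

```javascript
// A finite automaton for "contains 'ab'".
// States: 0 = start, 1 = just saw 'a', 2 = accepting ("ab" has appeared).
function containsAB(input) {
  let state = 0
  for (const ch of input) {
    if (state === 0) state = ch === "a" ? 1 : 0
    else if (state === 1) state = ch === "b" ? 2 : ch === "a" ? 1 : 0
    // state 2 is a trap: once accepting, always accepting
  }
  return state === 2
}

console.log(containsAB("xaby")) // true
console.log(containsAB("aacb")) // false
```

Note what the machine does not have: no counters, no stack, no lookback. A fixed number of states is its entire memory, and that's the source of all the limitations below.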
Regular expressions are the textual notation for finite automata: every classical regex is equivalent to some FA. (Real-world engines add features like backreferences that exceed regular languages, and JavaScript's engine uses backtracking rather than a literal FA, but the core model is the same.)
/ab/.test("xaby") // true — the FA accepts this string

What regular languages CAN do:
- Match patterns: email formats, phone numbers, URLs.
- Tokenize: split source code into tokens (lexers).
- Search: find substrings matching a pattern.
What regular languages CANNOT do:
- Match balanced parentheses: ((())) vs (()
- Match nested structures: HTML tags, JSON
- Count: "same number of a's as b's"
This is why you can't reliably parse HTML with regex. HTML has nested structure, which requires a more powerful model (a context-free grammar). This isn't a limitation of your regex engine—it's a mathematical impossibility.
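What balance-checking actually requires is a counter with no upper bound, which is precisely the memory a finite automaton lacks. A minimal sketch:

```javascript
// Checking balanced parentheses needs an unbounded counter — one thing
// no finite set of states can provide.
function isBalanced(s) {
  let depth = 0
  for (const ch of s) {
    if (ch === "(") depth++
    if (ch === ")") depth--
    if (depth < 0) return false // a ')' closed before any '(' opened
  }
  return depth === 0
}

console.log(isBalanced("((()))")) // true
console.log(isBalanced("(()")) // false
```

Any fixed number of states can only "count" up to that number; nesting depth is unbounded, so no FA suffices.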
Context-Free Grammars
Context-free grammars (CFGs) are more powerful than regular languages. They can describe nested and recursive structures.
A CFG for balanced parentheses:
S → (S)
S → SS
S → ε (ε = empty string)
This generates: (), (()), ()(), ((())), (()()), etc.
A simplified CFG for arithmetic expressions:
Expr → Expr + Term | Expr - Term | Term
Term → Term * Factor | Term / Factor | Factor
Factor → ( Expr ) | number
This grammar captures operator precedence (multiplication binds tighter than addition) and associativity through its structure.
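A grammar like this translates almost mechanically into a recursive-descent evaluator. The sketch below handles only well-formed input with no error recovery, and rewrites the left recursion in Expr and Term as loops (the standard transformation, since naive recursive descent can't handle left recursion directly):

```javascript
// A recursive-descent evaluator for the arithmetic grammar above.
// One function per nonterminal; left recursion becomes iteration.
function evaluate(src) {
  const tokens = src.match(/\d+|[+\-*\/()]/g)
  let pos = 0
  const peek = () => tokens[pos]
  const next = () => tokens[pos++]

  function expr() { // Expr → Term (('+' | '-') Term)*
    let value = term()
    while (peek() === "+" || peek() === "-") {
      value = next() === "+" ? value + term() : value - term()
    }
    return value
  }
  function term() { // Term → Factor (('*' | '/') Factor)*
    let value = factor()
    while (peek() === "*" || peek() === "/") {
      value = next() === "*" ? value * factor() : value / factor()
    }
    return value
  }
  function factor() { // Factor → '(' Expr ')' | number
    if (peek() === "(") {
      next() // consume '('
      const value = expr()
      next() // consume ')'
      return value
    }
    return Number(next())
  }

  return expr()
}

console.log(evaluate("2+3*4")) // 14 — precedence falls out of the grammar
console.log(evaluate("(2+3)*4")) // 20
```

Notice there's no precedence table: multiplication binds tighter simply because Term sits below Expr in the grammar.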
Where CFGs appear in practice:
- Programming languages: Every language's syntax is defined by a CFG (or close to it). The parser in V8, TypeScript's compiler, and Babel all implement CFG-based parsers.
- JSON: The JSON specification is a context-free grammar.
- SQL: SQL syntax is context-free.
- Markdown: Markdown parsers use grammar-like rules (though Markdown's spec is famously ambiguous).
Turing Machines
A Turing machine is a theoretical model of computation: an infinite tape, a read/write head, and a finite set of rules. It's the most powerful model of computation we know—anything that can be computed can be computed by a Turing machine.
Why it matters:
- Church-Turing thesis: Any function that is intuitively "computable" can be computed by a Turing machine. Every programming language (JavaScript, Python, C, Haskell) is Turing complete—they can all compute exactly the same set of functions. No language is fundamentally more powerful than another (they differ in convenience, not capability).
- The Halting Problem: There is no general algorithm that can determine whether an arbitrary program will halt or run forever. Alan Turing proved this in 1936. This has real consequences:
- You can't build a perfect deadlock detector.
- You can't build a perfect virus scanner.
- You can't build a linter that catches all bugs.
- Compilers can't always determine if code is reachable.
These aren't engineering limitations—they're mathematical impossibilities.
P, NP, and NP-Completeness
This is the most famous open problem in computer science.
P = problems solvable in polynomial time (O(n), O(n²), O(n³), etc.). These are "efficient" problems. Sorting, shortest path, searching—all in P.
NP = problems where a solution can be verified in polynomial time, even if finding the solution might take longer. Given a proposed solution, you can check if it's correct quickly.
NP-Complete = the hardest problems in NP. If you could solve any NP-Complete problem in polynomial time, you could solve ALL NP problems in polynomial time (P = NP).
Examples of NP-Complete problems:
- Traveling Salesman: Find the shortest route visiting all cities exactly once.
- Boolean Satisfiability (SAT): Given a logical formula, is there an assignment of variables that makes it true?
- Graph Coloring: Can you color a graph with k colors such that no adjacent nodes share a color?
- Knapsack: Given items with weights and values, what's the most valuable combination that fits in a knapsack of limited capacity?
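The "verifiable in polynomial time" part of NP is very concrete. Checking a proposed graph coloring, for instance, is a single pass over the edges, even though finding one may require exponential search:

```javascript
// NP in miniature: *verifying* a proposed coloring is O(V + E),
// even though *finding* one is NP-Complete. Graph is an adjacency
// list; coloring maps node -> color.
function isValidColoring(graph, coloring) {
  return Object.entries(graph).every(([node, neighbors]) =>
    neighbors.every((n) => coloring[node] !== coloring[n])
  )
}

const triangle = { a: ["b", "c"], b: ["a", "c"], c: ["a", "b"] }
console.log(isValidColoring(triangle, { a: 1, b: 2, c: 3 })) // true
console.log(isValidColoring(triangle, { a: 1, b: 2, c: 1 })) // false — a and c conflict
```

Every NP-Complete problem has a verifier this cheap; it's the asymmetry between checking and searching that makes the P vs NP question interesting.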
Why this matters for engineers:
When you encounter an NP-Complete problem in your work (and you will—scheduling, optimization, resource allocation), you know:
- No efficient exact solution is known (and most experts believe none exists).
- You should look for approximation algorithms (solutions that are "good enough"), heuristics (smart guessing), or constrained versions of the problem that are tractable.
Package dependency resolution is related to SAT. Task scheduling is related to graph coloring. Route optimization is related to TSP. Knowing the theoretical landscape tells you when to stop searching for a perfect algorithm and start engineering a practical one.
Decidability
A problem is decidable if an algorithm exists that always halts and gives a correct yes/no answer. It's undecidable if no such algorithm exists.
Undecidable problems:
- The halting problem: Does this program halt on this input?
- Rice's theorem: Any non-trivial property of the behavior of a program is undecidable. You can't write a program that determines if another program outputs "hello"—in general.
- Post correspondence problem: A combinatorial problem with no algorithmic solution.
Practical implications: Every static analysis tool (linters, type checkers, security scanners) works within the decidability boundary. TypeScript's type system is decidable (with some restrictions)—the compiler always terminates with an answer. But it achieves this by rejecting some valid programs (false positives). A type system that accepts all valid programs and rejects all invalid ones for a Turing-complete language is impossible.
The Pragmatic Takeaway
Discrete math and computation theory aren't prerequisites for writing code. But they're prerequisites for understanding why certain things are impossible, why certain problems are hard, and why tools work the way they do.
When someone tells you "just use a regex to parse HTML," you know why that can't work—it's not about the regex being too short, it's about regular languages being structurally incapable of matching nested constructs.
When someone asks you to write an algorithm that finds the optimal solution to a scheduling problem with 100 variables, you know this might be NP-Complete, and you should look for approximations instead of trying to be clever.
When TypeScript rejects your code with a confusing type error, you know the type checker is navigating a decidability boundary—it's being conservative because being perfect is mathematically impossible.
These aren't academic curiosities. They're the guardrails that keep you from chasing impossible solutions and help you recognize the shape of a problem before you start coding.