Graphs: Representation and Traversal
A social network where users follow each other. A city map where intersections are connected by roads. A dependency graph where packages depend on other packages. These are all graphs.
A graph is one of the most versatile data structures in computer science. At its core, it is a collection of vertices (also called nodes) and edges (the connections between them). What makes graphs so powerful is their flexibility: unlike trees or linked lists, graphs impose very few structural constraints. Any vertex can connect to any other vertex, and those connections can carry direction, weight, or both.
Directed vs Undirected Graphs
A directed graph (or digraph) is one where edges have a direction. If there is an edge from vertex A to vertex B, that does not automatically mean there is an edge from B to A. Think of Twitter’s follow model: if I follow you, that does not mean you follow me back. Each edge is a one-way street.
An undirected graph is one where edges have no direction. If vertex A is connected to vertex B, then B is also connected to A. A Facebook friendship is a good example: if we are friends, the relationship goes both ways. Under the hood, you can think of each undirected edge as two directed edges pointing in opposite directions.
Weighted vs Unweighted Graphs
In an unweighted graph, all edges are treated equally. The only thing that matters is whether a connection exists or not. In a weighted graph, each edge carries a numerical value (a weight) that might represent distance, cost, latency, capacity, or any other metric.
Consider a map application. The cities are vertices and the roads are edges. If you only care about whether two cities are directly connected, that is an unweighted graph. If you also care about the distance or travel time along each road, you need a weighted graph. Algorithms like Dijkstra’s shortest path rely on edge weights, whereas BFS naturally finds shortest paths in unweighted graphs.
Cyclic vs Acyclic Graphs
A cycle occurs when you can start at some vertex, follow a sequence of edges, and arrive back at the starting vertex. A graph that contains at least one cycle is called a cyclic graph. A graph with no cycles is called an acyclic graph.
A practical example: think of Wikipedia pages. You click a link on one page, which takes you to another page, and if you keep clicking links you might eventually end up back on the page where you started. That is a cycle.
A directed acyclic graph (DAG) is particularly important in practice. Build systems, task schedulers, and package managers all model their dependencies as DAGs. If your dependency graph had a cycle, you would have a circular dependency, and nothing could be resolved.
Connected vs Disconnected Graphs
A connected graph is one where there is a path between every pair of vertices. Every vertex is reachable from every other vertex. If that is not the case, the graph is disconnected, meaning it has at least two separate components with no edges between them.
For directed graphs, connectivity gets more nuanced. A directed graph is strongly connected if for every pair of vertices (x, y), you can reach y from x and also reach x from y. It is weakly connected if replacing all directed edges with undirected edges produces a connected graph. In other words, the underlying structure is connected even though the directions might prevent you from actually traveling between every pair of vertices.
Graph Representations
There are two primary ways to represent a graph in code: the adjacency list and the adjacency matrix. The choice depends on the density of the graph and which operations you need to perform most often.
Adjacency List
Each vertex stores a list of its neighbors. This is memory-efficient for sparse graphs (few edges relative to vertices) because it uses O(V + E) space, where V is the number of vertices and E is the number of edges.
```java
Map<Integer, List<Integer>> graph = new HashMap<>();
graph.computeIfAbsent(0, k -> new ArrayList<>()).add(1);
graph.computeIfAbsent(0, k -> new ArrayList<>()).add(2);
graph.computeIfAbsent(1, k -> new ArrayList<>()).add(3);
graph.computeIfAbsent(2, k -> new ArrayList<>()).add(3);
graph.computeIfAbsent(3, k -> new ArrayList<>());
// Graph structure:
// 0 -> [1, 2]
// 1 -> [3]
// 2 -> [3]
// 3 -> []
```
For a weighted graph, you can use a simple int[] pair or a dedicated Edge class to store both the neighbor and the weight:
```java
Map<Integer, List<int[]>> weightedGraph = new HashMap<>();
// int[] = {neighbor, weight}
weightedGraph.computeIfAbsent(0, k -> new ArrayList<>()).add(new int[]{1, 5});
weightedGraph.computeIfAbsent(0, k -> new ArrayList<>()).add(new int[]{2, 3});
weightedGraph.computeIfAbsent(1, k -> new ArrayList<>()).add(new int[]{3, 7});
weightedGraph.computeIfAbsent(2, k -> new ArrayList<>()).add(new int[]{3, 2});
// Graph structure:
// 0 -> [(1, weight=5), (2, weight=3)]
// 1 -> [(3, weight=7)]
// 2 -> [(3, weight=2)]
```
If you want something more readable, wrapping this in a small class works well:
```java
record Edge(int to, int weight) {}

Map<Integer, List<Edge>> graph = new HashMap<>();
graph.computeIfAbsent(0, k -> new ArrayList<>()).add(new Edge(1, 5));
graph.computeIfAbsent(0, k -> new ArrayList<>()).add(new Edge(2, 3));
```
Adjacency Matrix
A 2D array of size V x V where entry [i][j] indicates whether an edge exists between vertex i and vertex j. For weighted graphs, the value at [i][j] stores the edge weight instead of just 0 or 1.
```java
int n = 4;
int[][] matrix = new int[n][n];
matrix[0][1] = 1; // edge from 0 to 1
matrix[0][2] = 1; // edge from 0 to 2
matrix[1][3] = 1; // edge from 1 to 3
matrix[2][3] = 1; // edge from 2 to 3
// For an undirected graph, set both directions:
// matrix[0][1] = 1;
// matrix[1][0] = 1;
```
This uses O(V^2) space regardless of the number of edges, which makes it costly for sparse graphs but efficient for dense ones. The main advantage is O(1) edge lookup time.
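The weighted variant described above uses the same layout, with the cell storing the weight itself. A quick sketch, using 0 as the "no edge" sentinel (our choice here, and one that only works when real weights are never zero):

```java
int n = 4;
int[][] weights = new int[n][n]; // 0 means "no edge" in this sketch
weights[0][1] = 5; // edge 0 -> 1 with weight 5
weights[0][2] = 3;
weights[1][3] = 7;
weights[2][3] = 2;
int w = weights[1][3]; // O(1) lookup, w == 7
```

If zero-weight edges are possible, use a sentinel like Integer.MAX_VALUE (or a separate boolean matrix) instead.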
Comparison
| Operation | Adjacency List | Adjacency Matrix |
|---|---|---|
| Space | O(V + E) | O(V^2) |
| Check edge exists | O(degree) | O(1) |
| Find all neighbors | O(degree) | O(V) |
| Find vertex degree | O(1)* | O(V) |
| Add edge | O(1) | O(1) |
| Remove edge | O(degree) | O(1) |
*O(1) if you store the neighbor list’s size, which standard List implementations do.
In practice, adjacency lists are the default choice for most graph problems because real-world graphs tend to be sparse. If you are working with a dense graph or need constant-time edge lookups, the matrix is the better option.
Graph Traversal
The two fundamental ways to traverse a graph are Breadth-First Search (BFS) and Depth-First Search (DFS). Both visit every reachable vertex exactly once, but they differ in the order of exploration.
BFS – Breadth-First Search
BFS explores all neighbors at the current depth before moving deeper. It uses a queue to process vertices level by level. This property makes BFS the natural choice for finding the shortest path in an unweighted graph.
```java
void bfs(Map<Integer, List<Integer>> graph, int start) {
    Set<Integer> visited = new HashSet<>();
    Queue<Integer> queue = new ArrayDeque<>();
    visited.add(start);
    queue.add(start);
    while (!queue.isEmpty()) {
        int node = queue.poll();
        System.out.println("Visited: " + node);
        for (int neighbor : graph.getOrDefault(node, List.of())) {
            if (!visited.contains(neighbor)) {
                visited.add(neighbor);
                queue.add(neighbor);
            }
        }
    }
}
```
The key detail is that we mark a vertex as visited when we add it to the queue, not when we process it. This prevents the same vertex from being added to the queue multiple times.
BFS runs in O(V + E) time and uses O(V) space for the visited set and queue.
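Because BFS reaches each vertex along a minimum-length path, recording the depth at which a vertex is first enqueued yields shortest-path distances directly. A sketch along the lines of the bfs method above (bfsDistances is our name for it):

```java
import java.util.*;

// Shortest edge counts from start to every reachable vertex.
Map<Integer, Integer> bfsDistances(Map<Integer, List<Integer>> graph, int start) {
    Map<Integer, Integer> dist = new HashMap<>();
    Queue<Integer> queue = new ArrayDeque<>();
    dist.put(start, 0); // the distance map doubles as the visited set
    queue.add(start);
    while (!queue.isEmpty()) {
        int node = queue.poll();
        for (int neighbor : graph.getOrDefault(node, List.of())) {
            if (!dist.containsKey(neighbor)) {
                dist.put(neighbor, dist.get(node) + 1);
                queue.add(neighbor);
            }
        }
    }
    return dist;
}
```

On the example graph from the adjacency-list section (0 -> [1, 2], 1 -> [3], 2 -> [3]), this reports vertex 3 at distance 2 from vertex 0.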
DFS – Depth-First Search
DFS explores as far as possible along each branch before backtracking. It can be implemented recursively (using the call stack) or iteratively (using an explicit stack).
Recursive DFS:
```java
void dfs(Map<Integer, List<Integer>> graph, int node, Set<Integer> visited) {
    visited.add(node);
    System.out.println("Visited: " + node);
    for (int neighbor : graph.getOrDefault(node, List.of())) {
        if (!visited.contains(neighbor)) {
            dfs(graph, neighbor, visited);
        }
    }
}

// Usage:
// Set<Integer> visited = new HashSet<>();
// dfs(graph, startNode, visited);
```
Iterative DFS (using an explicit stack):
```java
void dfsIterative(Map<Integer, List<Integer>> graph, int start) {
    Set<Integer> visited = new HashSet<>();
    Deque<Integer> stack = new ArrayDeque<>();
    stack.push(start);
    while (!stack.isEmpty()) {
        int node = stack.pop();
        if (visited.contains(node)) {
            continue;
        }
        visited.add(node);
        System.out.println("Visited: " + node);
        for (int neighbor : graph.getOrDefault(node, List.of())) {
            if (!visited.contains(neighbor)) {
                stack.push(neighbor);
            }
        }
    }
}
```
Note the subtle difference from BFS: in the iterative DFS version, we check visited after popping from the stack rather than before pushing. This is because the same vertex may be pushed onto the stack multiple times from different neighbors. The recursive version avoids this naturally by checking before the recursive call.
DFS also runs in O(V + E) time and uses O(V) space.
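A direct application worth knowing: counting the connected components of an undirected graph by starting a fresh DFS from every vertex that has not been visited yet. A sketch reusing the recursive pattern above (countComponents and explore are our names; vertices are assumed to be numbered 0 to n-1):

```java
import java.util.*;

// Counts connected components in an undirected graph with vertices 0..n-1.
int countComponents(Map<Integer, List<Integer>> graph, int n) {
    Set<Integer> visited = new HashSet<>();
    int components = 0;
    for (int v = 0; v < n; v++) {
        if (!visited.contains(v)) {
            components++; // each unvisited vertex starts a new component
            explore(graph, v, visited);
        }
    }
    return components;
}

void explore(Map<Integer, List<Integer>> graph, int node, Set<Integer> visited) {
    visited.add(node);
    for (int neighbor : graph.getOrDefault(node, List.of())) {
        if (!visited.contains(neighbor)) {
            explore(graph, neighbor, visited);
        }
    }
}
```

Each traversal marks one entire component as visited, so the number of traversals started equals the number of components.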
Cycle Detection
DFS is particularly useful for detecting cycles in a directed graph. The idea is to maintain three states for each vertex during the traversal:
- Unvisited – the vertex has not been explored yet
- Visiting (in the current recursion stack) – the vertex is being explored, and we have not finished processing all its descendants
- Visited – the vertex and all its descendants have been fully explored
If during DFS we encounter a vertex that is currently in the Visiting state, we have found a back edge, which means a cycle exists.
```java
boolean hasCycle(Map<Integer, List<Integer>> graph, int numVertices) {
    int[] state = new int[numVertices]; // 0 = unvisited, 1 = visiting, 2 = visited
    for (int i = 0; i < numVertices; i++) {
        if (state[i] == 0 && dfsCycleCheck(graph, i, state)) {
            return true;
        }
    }
    return false;
}

boolean dfsCycleCheck(Map<Integer, List<Integer>> graph, int node, int[] state) {
    state[node] = 1; // mark as visiting
    for (int neighbor : graph.getOrDefault(node, List.of())) {
        if (state[neighbor] == 1) {
            return true; // back edge found, cycle exists
        }
        if (state[neighbor] == 0 && dfsCycleCheck(graph, neighbor, state)) {
            return true;
        }
    }
    state[node] = 2; // mark as fully visited
    return false;
}
```
This three-state approach is also the foundation for topological sorting, since a valid topological order only exists in a DAG (no cycles).
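To make that connection concrete, here is a sketch of a DFS-based topological sort built on the same three-state array; it returns an empty list when a cycle makes ordering impossible (topologicalSort and visit are our names, not from the cycle-detection code above):

```java
import java.util.*;

// Topological order of vertices 0..n-1, or an empty list if the graph is cyclic.
List<Integer> topologicalSort(Map<Integer, List<Integer>> graph, int n) {
    int[] state = new int[n]; // 0 = unvisited, 1 = visiting, 2 = visited
    Deque<Integer> order = new ArrayDeque<>();
    for (int i = 0; i < n; i++) {
        if (state[i] == 0 && visit(graph, i, state, order)) {
            return List.of(); // cycle: no valid order exists
        }
    }
    return new ArrayList<>(order);
}

// Returns true if a cycle is reachable from node; otherwise records node in post-order.
boolean visit(Map<Integer, List<Integer>> graph, int node, int[] state, Deque<Integer> order) {
    state[node] = 1;
    for (int neighbor : graph.getOrDefault(node, List.of())) {
        if (state[neighbor] == 1) return true;
        if (state[neighbor] == 0 && visit(graph, neighbor, state, order)) return true;
    }
    state[node] = 2;
    order.push(node); // pushing to the front reverses the post-order, yielding a valid topological order
    return false;
}
```

A vertex is pushed only after all of its descendants, so reading the deque front to back guarantees every edge (u, v) has u before v.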
Common Graph Algorithms
Beyond BFS and DFS, several other graph algorithms come up frequently:
- Dijkstra’s Algorithm – finds the shortest path from a source vertex to all other vertices in a weighted graph with non-negative weights. Uses a priority queue and runs in O((V + E) log V).
- Bellman-Ford Algorithm – finds shortest paths from a source vertex even when negative edge weights are present. Slower than Dijkstra at O(V * E), but handles negative weights and can detect negative cycles.
- Topological Sort – produces a linear ordering of vertices in a DAG such that for every directed edge (u, v), u comes before v. Can be implemented using DFS (post-order) or BFS (Kahn’s algorithm with in-degree tracking).
Knowing which algorithm to reach for depends on the problem constraints: BFS for unweighted shortest paths, Dijkstra for non-negative weighted graphs, Bellman-Ford when negative weights are involved, and topological sort whenever you need to process things in dependency order.
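As an illustration of the first of these, a sketch of Dijkstra's algorithm over the Map<Integer, List<int[]>> weighted adjacency list used earlier, with a priority queue ordered by tentative distance (the dijkstra name and the int[]{vertex, distance} queue encoding are our choices):

```java
import java.util.*;

// Shortest distances from start; edges are int[]{neighbor, weight}, weights non-negative.
int[] dijkstra(Map<Integer, List<int[]>> graph, int n, int start) {
    int[] dist = new int[n];
    Arrays.fill(dist, Integer.MAX_VALUE);
    dist[start] = 0;
    PriorityQueue<int[]> pq = new PriorityQueue<>(Comparator.comparingInt(a -> a[1]));
    pq.add(new int[]{start, 0});
    while (!pq.isEmpty()) {
        int[] current = pq.poll();
        int node = current[0], d = current[1];
        if (d > dist[node]) continue; // stale entry: a shorter path was already found
        for (int[] edge : graph.getOrDefault(node, List.of())) {
            int next = edge[0], weight = edge[1];
            if (dist[node] + weight < dist[next]) {
                dist[next] = dist[node] + weight;
                pq.add(new int[]{next, dist[next]});
            }
        }
    }
    return dist;
}
```

On the weighted example from the adjacency-list section (0 to 1 with weight 5, 0 to 2 with weight 3, 1 to 3 with weight 7, 2 to 3 with weight 2), the shortest path to vertex 3 costs 5, going through vertex 2.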
Related posts: Trees: Binary Trees, BST, and Balanced Trees, Depth First Search in Java