# 100 Prisoners and a Lightbulb

The mysterious man whispers, "Riddle me this, good stranger..."

There are 100 prisoners with life sentences in solitary cells. There is a central room with one light bulb that is initially off. No prisoner can see the light bulb from his or her own cell. Every day, the warden picks a prisoner uniformly at random, and that prisoner visits the living room. While there, the prisoner can toggle the bulb if he or she wishes. Also, the prisoner has the option of asserting that all 100 prisoners have been to the living room by now. If this assertion is false, then all 100 prisoners are executed. However, if it is indeed true, then all prisoners are set free. Thus, the assertion should only be made if the prisoner is 100% certain of its validity. The prisoners are allowed to convene one night in the courtyard, to discuss a plan. What plan should they agree on, so that eventually, someone will make a correct assertion?

*In this post, I present both the canonical solution to the above riddle, as well as a "dumb" solution that isn't so dumb! Then, we'll take a look at the expected time it takes for all prisoners to be set free in both cases.*

This is a somewhat well-known riddle that you might have seen before. It is a tricky one! The challenge of the problem lies in using a single light bulb as a means of communication across 100 players.

A light bulb can only relay a single bit of information: *on* or *off*, 1 or 0.

Furthermore, any prisoner brought into the room is effectively *memoryless*. Suppose prisoner $i$ was brought into the room on days 11 and 37. Prisoner $i$ has no way of determining which prisoners were brought in on the days in between! Since a prisoner is selected at random each day, a single prisoner could be brought in several days in a row... or not be brought in for a very long time! How, then, could any single prisoner gain any information beyond the number of times they alone have been brought to the room?

If all prisoners act the same way, it is probably impossible!

Then, what if not all prisoners function identically? A canonical solution to the above riddle is as follows:

### Canonical Solution

*The prisoners elect a single prisoner to be the Counter. The Counter will behave as follows:*

On the first day, the Counter starts his count at 0. On any day, if the Counter is brought to the room and the light is on, he will turn it off and increment the count by 1. If the Counter's count reaches 99, he asserts that all prisoners have visited the room.

*All other prisoners behave as follows:*

The first time a prisoner is brought to the room and the light is off, the prisoner will turn it on. If the light is already on, the prisoner will do nothing. On any subsequent visits, if the prisoner has already turned the light on once before, the prisoner will not do anything.

By restricting the prisoners' behavior so that only the Counter may turn the switch off, we allow the other prisoners to communicate with the Counter. And since each of the other prisoners flips the switch on at most once, the Counter can count the exact number of distinct prisoners that have signaled. Of course, multiple distinct prisoners may visit the room between the light being switched on and the Counter's next visit; those prisoners do nothing, and must wait for a later visit to signal.
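To make the protocol concrete, here is a minimal simulation sketch in Python (the function name, the choice of prisoner 0 as the Counter, and the parameters are my own, for illustration):

```python
import random

def simulate_counter_protocol(n=100, seed=0):
    """Simulate the Counter protocol; return the day the assertion is made."""
    rng = random.Random(seed)
    light_on = False
    count = 0                  # the Counter's tally of distinct signals
    signaled = [False] * n     # signaled[i]: prisoner i has turned the light on
    day = 0
    while True:
        day += 1
        prisoner = rng.randrange(n)    # warden's uniform pick
        if prisoner == 0:              # prisoner 0 is the Counter
            if light_on:
                light_on = False
                count += 1
                if count == n - 1:     # all other prisoners have signaled
                    return day
        elif not light_on and not signaled[prisoner]:
            light_on = True            # each non-Counter signals exactly once
            signaled[prisoner] = True
```

The assertion is made only after the Counter has collected $n - 1$ distinct signals, so it can never be wrong.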

This protocol guarantees that *eventually* all of the prisoners will get set free. However, intuitively one would expect that the amount of time the procedure takes is **very, very** long. Let's ball-park how long it might take!

#### Analyzing the Canonical Game Duration

Let's first examine how long it would take for the Counter to visit the central room $99$ times – once for each of the other prisoners' signals. The above approach must take at least this long, and this even omits the time, between consecutive Counter visits, for a new prisoner to be brought in and turn the light on.

The prison warden has $1/100$ chance of choosing the Counter each day. How long, on average, would it take for the prison warden to successfully pick the Counter?

It turns out that this situation may be modeled by the **geometric distribution**. We can think of it this way – suppose the warden is flipping a biased coin. If the coin lands heads, then the warden picks the Counter. Otherwise, he picks any other prisoner. The geometric distribution models this exact scenario! Details aside, it tells us that in expectation, it would take $\frac{1}{1/100} = 100$ days for the warden to pick the Counter.
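We can sanity-check this expectation empirically. The sketch below (the function name and parameters are mine) estimates the average wait for an event of probability $p$ by simulating the warden's daily coin flip:

```python
import random

def mean_wait(p, trials=50000, seed=0):
    """Empirical mean of the geometric waiting time for success probability p."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        days = 1
        while rng.random() >= p:   # each day is an independent biased coin flip
            days += 1
        total += days
    return total / trials
```

With $p = 1/100$, the empirical mean hovers near $100$ days, matching the theoretical $1/p$.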

Then, between each of the Counter's $99$ counting visits, we'd have to wait $100$ days on average... giving $99 \times 100 = 9900$ days! That's about $27$ years, not even counting the waiting time in between for a new prisoner to signal.

Yikes! That's a long time for all of the prisoners to wait. What if the prisoners could be 99% sure that everyone has visited the room **and** make the assertion much earlier? If a 1% chance of being wrong meant not having to wait 27 years, they'd probably take it.

It might surprise you, but it turns out that they can!

### The Dummy's Solution

All prisoners simply wait $X$ days. Whichever prisoner is brought in on the $X$'th day proclaims that all prisoners have visited the room.

You might scoff at how ridiculously simple this approach seems, but what if I told you that if $X$ were large enough, the prisoners' probability of success asymptotically approaches $1$?

This shouldn't be too surprising. We can model the problem as repeatedly rolling a 100-sided die: given enough time, one would expect to have rolled each number at least once.

What's even more incredible about this solution is that it requires **no communication** between any of the prisoners. Each prisoner only needs to keep track of the number of days that have passed!

The prisoners will obviously not live forever, so choosing $X = \infty$ is suboptimal. Let's try to find a finite value of $X$!
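By inclusion-exclusion, the probability that all $n$ prisoners have visited within $x$ days can be computed exactly, and we can then search for the smallest $x$ that hits a target confidence. A sketch (the function names and the search strategy are my own):

```python
from math import comb

def p_all_visited(n, x):
    """Exact P(all n prisoners visited within x days), by inclusion-exclusion."""
    return sum((-1) ** k * comb(n, k) * ((n - k) / n) ** x for k in range(n + 1))

def smallest_x(n, target):
    """Smallest number of days x with p_all_visited(n, x) >= target."""
    x = n                  # can't possibly finish in fewer than n days
    while p_all_visited(n, x) < target:
        x += 1
    return x
```

For $n = 100$ and a $99\%$ target, this lands in the neighborhood of $920$ days – far less than the canonical protocol's $\sim 27$ years, at the cost of a $1\%$ chance of being wrong.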

### Analyzing the Dummy's Game Duration

Let's consider the expected amount of days it takes for each prisoner to have visited the room.

#### Coupon Collectors

If you've taken a course in discrete math, you've probably seen the **coupon collector's problem** before:

Given $n$ different types of coupons, you wish to collect all $n$ coupons. At each step of the game, you select a coupon uniformly at random from a hat with replacement (i.e. you put the coupon back after checking it). Each coupon is equally likely to be drawn, and each choice is independent. What is the expected amount of time it takes for you to collect all $n$ coupons?

It is known that the coupon collector's problem admits an expected time of $O(n \log n)$. That is, the expected number of draws is at most $c n \log n$ for some positive constant $c$. I won't provide the proof here, as it is well documented elsewhere. (Note: the PDF in the link shows that the expected time is exactly $n H_n$, where $H_n$ is the $n$th harmonic number. It is well known that $H_n$ grows as $\log n$; more precisely, $H_n = \ln n + O(1)$.)

The 100 Prisoners puzzle is precisely the coupon collector's problem – the coupons are the prisoners! A quick application of coupon collector's shows that in expectation, $100 \times H_{100} \approx 519$ days (note $100 \ln 100 \approx 460.5$) must pass before all prisoners have visited the central room.
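The exact expectation $n H_n$ is a one-liner to evaluate (a quick sketch; the function name is mine):

```python
def expected_days(n):
    """Coupon collector: expected days to see all n prisoners is n * H_n."""
    return n * sum(1 / k for k in range(1, n + 1))
```

For $n = 100$ this evaluates to about $518.7$ days.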

We could stop here, but let's try looking at this problem from *another* perspective: **random walks on graphs**.

Don't worry if you are unfamiliar with graph theory/random walks! I try to deliver the key concepts succinctly in the next section. (If you are comfortable with these topics, feel free to skip ahead).

#### Graph Theory Primer

We define a **graph** to be a mathematical structure that describes relationships between entities. In graph theory, we label these entities as **vertices** and relationships as **edges**. For example, consider a social network like Facebook. Your social network forms a graph, where you and your friends' profiles are vertices – and your friendships form the edges.

More formally, we can define a graph $G$ as the pair $G = (V,E)$, where $V$ is the set of vertices $\{1, 2, \dots, n\}$ and $E$ is the set of edges $(u,v)$ with $u, v \in V$.

We say that a graph is **undirected** if the edge $(u,v)$ has no direction associated with it. In the Facebook example, the social network is undirected. If Alice is friends with Bob, then Bob is friends with Alice!

*In the above figure, Alice is friends with Bob and David, but not Charlie. In graph terms, Bob and David are neighbors of Alice, and Alice's degree is 2.*

For a particular vertex $u \in V$, we define the **neighbors** of $u$ to be the set of vertices $v$ such that there is an edge from $u$ to $v$. i.e. $neighbors(u) = \{ v | (u,v) \text{ or } (v,u) \in E \}$. We also say that the **degree** of a vertex is the number of its neighbors.
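In code, such a graph is just an adjacency list mapping each vertex to its neighbors. A minimal sketch of the four-person example (the edge list is inferred from the figure caption and the walk example later in the post):

```python
# Undirected friendship graph; each vertex maps to its list of neighbors.
graph = {
    "Alice":   ["Bob", "David"],
    "Bob":     ["Alice", "Charlie"],
    "Charlie": ["Bob", "David"],
    "David":   ["Alice", "Charlie"],
}

def neighbors(g, u):
    """The set of vertices sharing an edge with u."""
    return g[u]

def degree(g, u):
    """Number of neighbors of u."""
    return len(g[u])
```

Alice's neighbors are Bob and David, so her degree is 2 – matching the figure caption.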

#### Random Walks on Undirected Graphs

We now define a **random walk** on an undirected graph $G$. Intuitively, suppose that you are a wayfaring traveler driving aimlessly across the country. Each city in the country has direct roads to some of the other cities. At each city you visit, you randomly select one of the neighboring cities as your next destination. This is a random walk!

Why are random walks useful? They have applications in many different fields, from modeling how molecules move, to describing genetic drift, to Google's PageRank algorithm! Here, we will model the 100 Prisoners problem as a random walk and bound the number of steps needed to have selected every prisoner with high probability.

More formally, we define a random walk of length $k$ on $G$ to be a sequence of vertices $X_1, X_2, \dots, X_k$ with the following property:

$$\text{For each $i \geq 1$, $X_{i+1}$ is chosen uniformly at random from the neighbors of $X_{i}$}$$

In the above diagram, suppose we start a random walk at Alice. Then, a possible sequence could be *Alice, David, Charlie, Bob*. At Alice, we flip a coin to choose between Bob and David. At David, we flip another coin to choose between Alice and Charlie. Of course, this means that we could always go "back" in our traversal, since the graph is undirected! That is, our random walk could easily generate *Alice, David, Alice, David, Alice...* (for a walk of $k$ steps, this particular back-and-forth occurs with probability $\frac{1}{2^k}$!).

While performing the random walk, we denote that at state/step $i$ we are at vertex $X_i$.

**Memorylessness.** A key observation we'll make is that for any particular vertex $X_i$ visited in the random walk, the choice of which vertex to visit next depends **only** on the current state. In other words, it does not matter how we got to $X_i$ - our choice of how to move forward only depends on the neighbors of $X_i$. Then, our random walk is memoryless!
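A random walk is then just a loop; notice that the next step is computed from nothing but the current vertex's neighbor list – exactly the memorylessness described above. A minimal sketch (names are mine), reusing the four-person friendship graph:

```python
import random

def random_walk(g, start, k, seed=0):
    """Take k uniform random steps on adjacency-list graph g, starting at start."""
    rng = random.Random(seed)
    walk = [start]
    for _ in range(k):
        walk.append(rng.choice(g[walk[-1]]))   # depends only on the current vertex
    return walk

# The four-person friendship graph from the primer.
graph = {
    "Alice":   ["Bob", "David"],
    "Bob":     ["Alice", "Charlie"],
    "Charlie": ["Bob", "David"],
    "David":   ["Alice", "Charlie"],
}
```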

#### 100 Prisoners as a Random Walk

We now observe that we can model the above scenario as a **random walk** on an *undirected graph* $G = (V,E)$ with $100$ vertices. Note that $G$ must be **complete**: there is an edge from each vertex to *every* other vertex. Each vertex also has a self-loop (i.e. an edge to itself), since the warden may pick the same prisoner two days in a row.

Then, each prisoner corresponds to a vertex in the graph. Advancing from state $i$ to state $i+1$ in the random walk corresponds to picking the next prisoner for the $i+1$'th day. Since each edge has $1/100$ probability of being picked, the probabilities are exactly the same as in the original puzzle.

To better illustrate the correspondence, consider a variation where we only have $6$ prisoners. The graph would then look like this:

*Each vertex has 6 edges adjacent to it, and the random walk will choose an edge with $1/6$ (uniform) probability at each step!*

Then, how long does it take for a random walk on $G$ to visit each vertex at least once? Can you see why answering this also tells us how long it takes for all prisoners to visit the room at least once?
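We can simulate this walk directly. On the complete graph with self-loops, each step is a fresh uniform pick over all $n$ vertices, so the walk reduces to the warden's daily draw. A sketch (names and parameters are mine):

```python
import random

def cover_steps(n, seed=0):
    """Steps for a uniform random walk on complete K_n (with self-loops)
    to visit every vertex at least once."""
    rng = random.Random(seed)
    visited = {rng.randrange(n)}       # starting vertex
    steps = 0
    while len(visited) < n:
        visited.add(rng.randrange(n))  # every vertex, incl. the current one, has prob 1/n
        steps += 1
    return steps
```

Averaged over many runs with $n = 100$, this lands near the $n H_n \approx 519$ figure from coupon collector's.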

It turns out that for a complete graph on $n$ vertices (with self-loops), the expected number of steps is $O(n \log n)$ (simply put, $\leq c n \log n$ for some constant $c > 0$). You could carry out this analysis with coupon collector's from above – so instead, let's go through something slightly different.

**Claim.** After performing the random walk for $c n \log n$ steps for some $c > 2$, the probability of not having visited all vertices in the graph is polynomially small.

**Proof.** Suppose that we start at any vertex $X_0$. Each vertex (including $X_0$ itself, via its self-loop) is chosen as the next step with probability $\frac{1}{n}$. Now fix a vertex $v$. The probability of not visiting $v$ at a particular step is $1 - \frac{1}{n}$ – the probability of stepping to any of the other vertices. Since steps are independent, the probability of not visiting $v$ in $m$ steps is $(1 - \frac{1}{n})^m$ (i.e. the probability of never rolling $v$ on an $n$-sided die in $m$ rolls).

If we plug in $m = c n \log n$, we get:

$$\begin{align*}
P(\text{certain vertex not visited in $m$ steps}) &= \left(1 - \frac{1}{n}\right)^{c n \log n} \\
&\leq \left(\frac{1}{e}\right)^{c \log n} && \text{since } \left(1 - \tfrac{1}{n}\right)^{n} \leq \frac{1}{e} \\
&= \frac{1}{n^c}
\end{align*}$$

(Here $\log$ denotes the natural logarithm.)

Then, we can use the union bound to bound the probability that *at least one* vertex is not visited:

$$P(\text{at least one vertex not visited in $m$ steps}) \leq \sum_{i=1}^n \frac{1}{n^c} = \frac{1}{n^{c-1}}$$

For sufficiently large $n$, picking any $c > 2$ makes this probability approach $0$ very quickly (and the probability of visiting all vertices approaches $1$). So if we run our random walk for $3 n \log n$ steps, we have near-$100\%$ confidence that it has visited every vertex. In the original puzzle, $n = 100$ gives around $1382$ days, or about $3.79$ years. Not too bad for prisoners sentenced for life, and **much** better than waiting at least $27$ years.
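It's easy to check these bounds numerically for $n = 100$, $c = 3$ (using the natural log, as in the derivation; the variable names are mine):

```python
from math import log

n, c = 100, 3
m = c * n * log(n)                   # about 1381.6 steps
p_miss_one = (1 - 1 / n) ** m        # P(a fixed vertex is never visited in m steps)
p_miss_any = n * p_miss_one          # union bound over all n vertices

assert p_miss_one <= 1 / n**c        # at most 1/n^3 = 1e-6
assert p_miss_any <= 1 / n**(c - 1)  # at most 1/n^2 = 1e-4
```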

#### Cover Time

Now that you understand the concept... we denote the *expected time for a random walk starting at a vertex $v$ to visit each vertex in the graph* as the **cover time** $C_v$.

The *cover time* of a graph $G$, $C(G)$, is defined to be the maximum cover time over all possible starting vertices,

$$C(G) = \max_{v \in V}\{C_v\}$$

We then have that for the undirected, complete graph $G$ with self-loops, the cover time $C(G) = O(n \log n)$.

The above are definitions commonly used in the description of random walks on graphs.

### Conclusion

In writing this blog post, I meant to illustrate key intuition about how a seemingly unrelated puzzle has connections to both the Coupon Collector's problem as well as graph theory. In computer science, tools that may seem only useful in a theoretical context may be used to analyze or describe interesting scenarios in ways that you never could have imagined!

I delivered this puzzle to my cousins at a family Thanksgiving dinner, a few weeks after I had learned about cover times in class. I would have never drawn the connection otherwise!

There are many interesting, hidden connections that lie waiting to be discovered. Only some may prove to have profound implications for technology or society, but there is great value and enjoyment to be found in exploring any problem.

Cheers!

*As always, leave a comment if you found this helpful or have any questions/corrections. Thank you so much for reading!*

*Special thanks to Jake Hart and Lawrence Chan for proofreading and recommendations.*

*Title image by @Kirokaze*

*Other image assets by @Randall Munroe, Bill Watterson*