Dice
"Rolling a fair, six-sided die" often has the implication of a uniform distribution over the set \(\{1,2,3,4,5,6\}\). But which part of that statement induced the constraint on the die faces? Are some die... more fair than others?We begin by working through some of the mathematics used to describe dice, in pursuit of analyzing a rollercoaster of a dice tournament run by my friend.
Terminology
- A fair die is one where each face has an equal probability of being rolled.
- An \(n\)-sided die is a die with \(n\) faces.
- A standard die hereby refers to a die whose faces are the set \(\{1,2,\ldots,n\}\), where \(n\) is the number of faces.
Yes, our non-standard dice still define valid probability distributions. Cry about it. (You don't need to read that.)
Properties of Dice
Traditionally, we represent the probability distribution of \(X\) as a function \(f_X(x)\) that maps a value \(x\) to its probability \(f_X(x)\). This allows us to write the expected value of \(X\) as \[\mathbb{E}[X]=\sum_{x\in\Omega}xf_X(x),\] where \(\Omega\) is the set of all possible values of \(X\), or the sample space.
In our case, \(\Omega\) is just the set of faces of the die. Moreover, for an \(n\)-sided fair die, \(f_X(x)=\frac{1}{n}\) for every face \(x\), as all faces are equally likely.
The expected value of a fair, six-sided, standard die is thus \[\mathbb{E}[X]=\sum_{x=1}^{6} x\cdot \frac{1}{6} = \frac{1}{6}(1+2+3+4+5+6)=3.5.\]
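This sum is small enough to check mechanically. A minimal sketch in Python, using exact rational arithmetic so nothing is lost to floating point:

```python
from fractions import Fraction

# Expected value of a fair, six-sided, standard die:
# each face x in {1, ..., 6} has probability 1/6.
faces = range(1, 7)
expected = sum(Fraction(x, 6) for x in faces)

print(expected)         # 7/2
print(float(expected))  # 3.5
```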
Given \(n\) random variables \(X_1,\ldots,X_n\), the \(j\)-th order statistic \(X_{(j)}\) is the \(j\)-th smallest value among them. For example, if we roll a fair, six-sided, standard die three times, we have three random variables \(X_1,X_2,X_3\), each representing one roll. If our outcomes are realized to be \(x_1=3,x_2=1,x_3=5\), then the order statistics are realized as \(x_{(1)}=1,x_{(2)}=3,x_{(3)}=5\).
What is interesting though, is that the order statistics of random variables are themselves random variables. We may define their distributions and compute their expected values.
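Realizing the order statistics of a sample is nothing more than sorting it; the worked example above, in two lines of Python:

```python
# Realized order statistics are the sorted sample:
# x_(1) <= x_(2) <= ... <= x_(n).
rolls = [3, 1, 5]            # realized values x_1, x_2, x_3
order_stats = sorted(rolls)  # [x_(1), x_(2), x_(3)]
print(order_stats)           # [1, 3, 5]
```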
Proof. The distribution of a random variable \(X\) is defined by its cumulative distribution function (CDF) \(F_X(x)\), the probability that \(X\) is less than or equal to \(x\). So in this case, we may compute the CDF of \(X_{(j)}\) as \(F_{X_{(j)}}(x) = P(X_{(j)}\leq x).\)
But by definition, \(X_{(j)}\leq x\) if and only if at least \(j\) of the \(X_1,\ldots,X_n\) random variables are less than or equal to \(x\). We can compute this probability by viewing it as a binomial distribution, where the probability of success is \(F(x)\) and the number of trials is \(n\). Thus, we have \[F_{X_{(j)}}(x) = P(X_{(j)}\leq x) = \sum_{k=j}^{n}\binom{n}{k} F(x)^k (1-F(x))^{n-k}.\] If you are not familiar with binomial distributions, notice that \(F(x)^k\) is the probability that a particular \(k\) of the \(X_1,\ldots,X_n\) random variables are less than or equal to \(x\), and \((1-F(x))^{n-k}\) is the probability that the remaining \(n-k\) random variables are greater than \(x\). The binomial coefficient \(\binom{n}{k}\) counts how many ways we can choose which \(k\) random variables were less than or equal to \(x\), and we sum from \(k=j\) to \(n\) to cover every case in which at least \(j\) random variables are less than or equal to \(x\). \(\blacksquare\)
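The CDF translates directly into code. A short sketch (the function name is mine, not standard), with sanity checks against the two cases we can verify by hand: the maximum (\(j=n\)) has CDF \(F(x)^n\), and the minimum (\(j=1\)) has CDF \(1-(1-F(x))^n\).

```python
from math import comb

def order_stat_cdf(j, n, F):
    """P(X_(j) <= x) for n i.i.d. samples, where F = F(x) is the common
    CDF evaluated at x: the chance of >= j successes in a Binomial(n, F)."""
    return sum(comb(n, k) * F**k * (1 - F)**(n - k) for k in range(j, n + 1))

F, n = 0.5, 4
print(order_stat_cdf(n, n, F))  # 0.0625 == 0.5**4
print(order_stat_cdf(1, n, F))  # 0.9375 == 1 - 0.5**4
```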
Proof. The density of a random variable \(X\) is the function discussed in the previous section, \(f_X(x)\), which maps values \(x\) in the sample space to their probability \(f_X(x)\).
We can make a similar combinatorial argument here. We want to find the probability that \(X_{(j)}\) is equal to some value \(x\). This entails that exactly \(j-1\) of the \(X_1,\ldots,X_n\) random variables are less than \(x\), \(n-j\) of them are greater than \(x\), and the one remaining random variable is equal to \(x\). (This accounting is exact for continuous distributions, where ties occur with probability zero and \(P(X<x)=F(x)\); for a discrete die, several rolls can tie at \(x\), so the resulting density is only an approximation.)
Thus, we have that the density of \(X_{(j)}\) is \[f_{X_{(j)}}(x) = \binom{n}{j-1, 1, n-j} F(x)^{j-1} (1-F(x))^{n-j} f(x).\] The multinomial coefficient \(\binom{n}{j-1, 1, n-j}\) is an extension of the binomial coefficient in the section before, and counts the number of ways we can divide the \(n\) random variables into three groups:
- \(j-1\) random variables less than or equal to \(x\)
- 1 random variable equal to \(x\)
- \(n-j\) random variables greater than \(x\)
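In code, with the multinomial coefficient expanded via factorials (a sketch for the continuous case; the names are my own). As a sanity check, the minimum of two Uniform(0, 1) samples has density \(2(1-x)\), which at \(x=0.25\) is \(1.5\):

```python
from math import factorial

def order_stat_pdf(j, n, F, f):
    """Density of X_(j) for n i.i.d. continuous samples, where F = F(x)
    and f = f(x) are the common CDF and density at the point of interest."""
    multinomial = factorial(n) // (factorial(j - 1) * factorial(n - j))
    return multinomial * F**(j - 1) * (1 - F)**(n - j) * f

# Minimum (j = 1) of n = 2 Uniform(0, 1) samples at x = 0.25:
x = 0.25
print(order_stat_pdf(1, 2, F=x, f=1.0))  # 1.5
```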
Proof. We can describe the \(n\) rolls as \(n\) independent and identically distributed random variables \(X_1,\ldots,X_n\), where each \(X_i\) is a random variable representing the value of the \(i\)-th roll. Then, the maximum of the \(n\) rolls is the \(n\)-th order statistic \(X_{(n)}\).
We can compute the expected value of \(X_{(n)}\) by using the CDF of order statistics from the previous section. Since \(j=n\), the sum in the CDF collapses to its single \(k=n\) term: \[F_{X_{(n)}}(x) = P(X_{(n)}\leq x) = F(x)^n.\] Because the die is discrete, the probability that the maximum equals exactly \(x\) is the difference of consecutive CDF values, \[P(X_{(n)} = x) = F(x)^n - F(x-1)^n.\] For a fair, six-sided, standard die, \(F(x)=\frac{x}{6}\), so we get the workable \[\mathbb{E}[X_{(n)}] = \sum_{x=1}^{6} x\left[\left(\frac{x}{6}\right)^{n} - \left(\frac{x-1}{6}\right)^{n}\right] = \frac{1}{6^n}\sum_{x=1}^{6} x\left[x^{n}-(x-1)^{n}\right].\quad \blacksquare\]
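Since \(6^n\) outcomes is tiny for small \(n\), we can verify the identity \(P(X_{(n)}=x)=F(x)^n-F(x-1)^n\) by brute force, enumerating every equally likely sequence of rolls (helper names are mine):

```python
from fractions import Fraction
from itertools import product

def expected_max(n):
    """E[max of n fair d6 rolls] via P(max = x) = F(x)^n - F(x-1)^n."""
    F = lambda x: Fraction(x, 6)
    return sum(x * (F(x)**n - F(x - 1)**n) for x in range(1, 7))

def expected_max_brute(n):
    """Same quantity by enumerating all 6^n equally likely outcomes."""
    outcomes = list(product(range(1, 7), repeat=n))
    return Fraction(sum(max(o) for o in outcomes), len(outcomes))

for n in (1, 2, 3):
    assert expected_max(n) == expected_max_brute(n)
print(expected_max(2))  # 161/36
```

Note that \(n=1\) recovers the plain expected value \(\frac{7}{2}\), and the values stay below 6 as they must.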
Beginning our Descent into Madness
- You can put any positive rational numbers on each of the 6 sides of the dice, as long as all the numbers add up to Exactly 21.
- You may not put 3.5 on every face (to prevent a potential situation where there cannot be a winner).
You may submit up to 3 dice, which may be the same or different, it's up to you! This is to prevent someone from being very unlucky and eliminated too early."
Exactly 21 is, of course, a reference to the sum of the numbers on the faces of a fair, six-sided, standard die. Is merely sharing the same sum enough to make most dice similar? How much does the game-theoretic component of the tournament matter?
I was honestly fascinated, and decided to submit my dice.
- \(\{4/3, 7/3, 13/3, 13/3, 13/3, 13/3\}\)
- \(\{1/2, 3/2, 7/2, 7/2, 9/2, 15/2\}\)
- \(\{1/5, 1, 16/5, 16/5, 26/5, 41/5\}\)
If we write the faces of such a die as \(a_1\leq a_2\leq\cdots\leq a_6\) (each rolled with probability \(\frac{1}{6}\)) and substitute into the expected value of the maximum of \(n\) rolls, we get \[\mathbb{E}[X_{(n)}] = \frac{1}{6^n} \sum_{i=1}^{6} a_i \left[i^{n}-(i-1)^{n}\right].\quad \blacksquare\] When faces repeat, the terms for equal faces telescope into \(P(X_{(n)}=v)=F(v)^n-F(v^-)^n\) for each distinct value \(v\), so the formula holds for dice with repeated faces too.
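We can check the sorted, telescoping form of the expected maximum with exact rational arithmetic against brute-force enumeration, using one of the submitted dice (the helper names are mine):

```python
from fractions import Fraction
from itertools import product

def expected_max(faces, n):
    """E[max of n rolls] of a fair die with the given faces, via the sorted
    telescoping form: sum_i a_i * (i^n - (i-1)^n) / 6^n, 1-indexed."""
    a = sorted(faces)
    m = len(a)
    return sum(a[i] * Fraction((i + 1)**n - i**n, m**n) for i in range(m))

def expected_max_brute(faces, n):
    """Enumerate all 6^n equally likely sequences of rolls."""
    outcomes = list(product(faces, repeat=n))
    return sum(max(o) for o in outcomes) / len(outcomes)

# The second submitted die; note the repeated face 7/2. Faces sum to 21.
die = [Fraction(k, 2) for k in (1, 3, 7, 7, 9, 15)]
for n in (1, 2, 3):
    assert expected_max(die, n) == expected_max_brute(die, n)
print(expected_max(die, 1))  # 7/2
```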
Moreover, we have the constraint that the sum of the faces must be 21, so \[\mathbb{E}[X] = \frac{1}{6} \sum_{i=1}^{6} a_i = \frac{21}{6} = 3.5.\quad \blacksquare\]
But we don't care about single rolls in this tournament. Even in the simplest situation, we have 2 rolls, one for each player.
Unfortunately though, the two dice do not have a common distribution. We could define the faces of the dice as \(a_i\) and \(b_i\) and proceed with computation, but this would be a mess. Things are getting out of hand fast, and we need a more efficient representation.
- Find a robust way to look at the maximum of \(n\) rolls for arbitrary faces
- Find a robust way to compare two dice
- Find a robust way to consider the sum of dice
"A generating function is a clothesline on which we hang up a sequence of numbers for display."
— Herbert Wilf, Generatingfunctionology (1994)