Saturday, March 30, 2013

Lads and ladders



Everything posted so far has pertained to a game played with only one person, but since Chutes and Ladders is a multiplayer game, we should also consider how the data changes when there are 2 players, 3 players, or some number n of our choosing players. I want to focus this post specifically on how the expected number of moves for a game changes as the number of players vary.

Actually, that could use a bit more clarification. When dealing with one player, the number of moves was equal to the total number of times the spinner is spun. While that is a perfectly acceptable definition for games involving more than one player, the generalized definition of "number of moves" I want to use is the number of times the winner of the game spins the spinner. This removes any complications about what order the players make their moves in.

Well, why should there be any change? After all, unlike in Sorry or a similar game, players' game pieces do not interact with each other. The best way to think about it is that the more players there are in the mix, the more likely one of them is to get "lucky" and finish the game in fewer than the expected number of moves for an individual player (39.2). An analogous situation is that if one person were rolling a die, it would take her, on average, six throws to get a 6, but if she and ten other people were all rolling dice at the same time, a 6 would be expected to come up much sooner.

There are two ways to find the expected number of moves: by using probabilities of completing the game in 1 though n moves, where n is a large number, (the "approximation" method) and by doing some magic math on the canonical form ("Q matrix") of a transition matrix for a Markov Chain (the "exact" method). I have no idea where to begin with the second option, so I'll start with the first.

How might we find the probability that a game with p players finishes with n moves? This is probably easier to think about if we pretend that, instead of the game stopping as soon as one player reaches the final square, it continues until all players reach the finish. The number of "moves" in a game, however, will still be the number of moves the winner of the game makes.  Let's think about what has to happen: we need to have at least one of the p players finish the game in n moves (the "at least one" is crucial here. With the way I've defined "moves" for games with more than one player, it is possible for there to be more than one winner) and we need to have no players finish the game in fewer than n moves. So, if Pa~n is the probability that at least one player finishes the game in n moves and Pb~n is the probability that no player finishes the game in fewer than n moves, the probability that both of them happen together is Pa~n times Pb~n  since they are both independent events.

Hmm. Pa~n would probably be fairly straight forward to find if we wanted to know the probability of exactly one player finishing in n moves, but finding the "at least one" complicates it immensely, since we now have to sum the individual probabilities of one, two, three, all the way up to p players winning the game in n moves, which could be rather tedious for large p. If you ever get a complicated event like this, sometimes it's easier to consider the complement of that event, that is, the chance that event does not happen. The complement of Pa~n is the probability that no players win the game in n moves, and this is something that we can work with. The probability that one player wins in 7 moves is 0.0015, so the probability that one player does not win in 7 moves is 1 - 0.0015. The probability that all p players don't win in 7 moves is therefore (1 - 0.0015)^p since these are all independent events. This is the complement of Pa~n, so to find Pa~n, all we have to do is subtract this from 1, or 1 - (1 - 0.0015)^p. Why is this true? Remember that the sum of the probabilities of all events must total to 1. Since any event and its complement make up all events and there is no overlap between the two, their probabilities must sum to 1.

Since I'm about 300 words past the word limit already, I'm going to finish this out in the next post.

Saturday, March 23, 2013

Playground construction

            Keeping up with the spirit of tweaking the conditions of the game and observing how the length of the game is effected, this week I also wanted to mess around with the placement of chutes and ladders. Normally we think of chutes as harmful and ladders as helpful, but this, perhaps, mustn't always be the case. For instance, if a short chute puts a player a few squares away from a very long ladder that the player had already passed, that chute may actually be advantageous. Can we find a case on the Chutes and Ladders board for which this is true?



            To find out how a chute or ladder impacts the length of the game, we can remove it (which alters the transition matrix) and see how the expected number of moves changes. I started with what seemed to be the most likely candidates for these "good" chutes and "bad" ladders. The chute from square 47 to square 26 caught my eye since it puts the player in a good position to land on the behemoth ladder on square 28; however, removing this chute still decreased the expected number of moves (from 39.2 to 36.8). My second choice was the chute from square 16 to 6, but removing this also decreased the expected number of moves (from 39.2 to 38.0). After checking the remaining cases, no such chute or ladder could be found.


            So, then, how about we create one ourselves? Let's see what happens if we put a chute from square 29 to square 27. This makes it a bit more likely for the player to land on the ladder at square 28 without setting them back too far, and it decreases the expected number of moves from 39.2 to 38.0. What happens if we instead make the chute go from square 29 to square 26? The expected number of moves drops again to 37.9. Even though this set the player back further, the increased chance that this gives him or her of landing on the big ladder makes up for it. In fact, this chute has the best effect if it goes from square 29 to 23, where the number of moves set back is equal to the size of the spinner (expected number of moves is 37.2).

           I want to generalize this result a bit more and perhaps establish under what conditions a chute or ladder will be "good" or "bad", but for now, that's all folks.

You spin me right round...

    This week I was finally able to indulge my childish tendencies by breaking the rules of Chutes and Ladders.

    In a normal game of Chutes and Ladders, the spinner has a "size" of six: there are six possible moves. This week I decided to figure out what spinner size would produce the lowest expected number of moves to complete the game. While it might seem that increasing the spinner size would allow us to advance along the board faster, and hence, reduce the expected number of moves, increasing the size also makes landing on the final square more difficult. The ideal value is therefore in the middle of two extremes.

    Thankfully, two weeks ago I prepared a program to find the expected number of moves for a standard game with a spinner of size six, and finding this value for other spinner sizes only requires a small tweak in how the transition matrix is constructed. The results were as follows:

Spinner size -- Expected number of moves

1 -- Infinite (the game cannot be completed with a spinner of size one)
2 -- 60.7
3 -- 65.9
4 -- 54.5
5 -- 45.6
6 -- 39.2
7 -- 34.7
8 -- 31.8
9 -- 30.3
10 -- 28.8
11 -- 27.4
12 -- 27.0
13 -- 26.2
14 -- 26.0
15 -- 25.8
16 -- 25.9
17 -- 26.0
...

    The results continue to shoot up after this, so the ideal spinner size is 15.

    As you can see, the ladders and chutes cause some irregularities in the pattern (most noticeably when the spinner has a size of 3). This begs the question: what would be the ideal spinner size for a modified game of Chutes and Ladders with no chutes or ladders? A similar analysis shows that the ideal size is then 13. Thanks to the simplified nature of this version, however, this value can be found without using any complicated Markov chain analysis. The expected number of moves for a game of spinner size n can be approximated by:




         The first fraction is an approximation for the expected number of moves required to be within n moves of the finish. "n" is also the expected number of moves required to reach the final square when you are within n moves of that square (the probability is always 1/n, so the expected number of moves is n).

    Plugging in various values of n, again, reveals that the ideal spinner size is 13.

    Let us take a moment to breathe in and enjoy this brief respite from the pain of Markov Chains.
    

    

Saturday, March 9, 2013

Great expectations (part 2)


            Picking right up from last time, remember that Chutes and Ladders is what we call an absorbing Markov Chain, which means that there is at least one state from which it is impossible to escape. Those states are referred to as absorbing states. In the case of Chutes and Ladders, the only absorbing state is the state corresponding with the last square on the game board, since the game ends once a player reaches that square. What we want to do is rearrange the transition matrix's rows and columns so that all of the non-absorbing come first in the matrix. Then we want to create a new matrix called Q that represents a transition matrix from all the non-absorbing states to all the non-absorbing states (the first step makes this easier to do). That probably won't make much sense without a picture:


 

                Once Q is found, the next step is to subtract it from the identity matrix, I, (the identity matrix is a square matrix with 1's going diagonally from the top-left corner to the bottom-right corner. It is called the "identity" matrix because when you multiply any matrix by the identity matrix, you always get the same number back, just like how whenever you multiply a number by 1 you get the same number back. The identity matrix always has the same dimensions of the matrix that you're adding or multiplying it to.) giving you the matrix (I - Q). The next step is to take the inverse of (I - Q) (the inverse, M^-1, of some matrix M, is the matrix such that (M)*(M^-1) = I), which will give us a new matrix which we'll call N. If we take the sum of the values in the n'th row of N, we get the expected number of iterations that occur before an absorbing state is entered (in Chutes and Ladders, an "iteration" is a spin of the spinner). 

              Why does that work? Unless you're Euler or Gauss, it shouldn't be obvious at all. There is a proof of this that uses a bit of linear algebra and probability theory available here --> http://www.dartmouth.edu/~chance/teaching_aids/books_articles/probability_book/Chapter11.pdf if you want to see why, but I won't explain it here (partially because the notation would be a pain to do in Microsoft Paint).

            Getting back into Chutes and Ladders, since the only absorbing state is the one-hundredth square which is already located at the bottom-right corner of the matrix, we don't have to rearrange it at all to find Q. Q is just the matrix formed by the first hundred rows and hundred columns of the transition matrix. I quickly modified the program to produce the transition matrix for Chutes and Ladders to get Q. To get from Q to I - Q,  all you have to do is take all the values along the diagonal, subtract them from one, and take the negative of all other values.

            Now comes the tricky part. Taking the inverse of a matrix is not trivial. Rather than stumble around trying to program a "matrix inverter" on my own, I decided to search the web for some Java code that could help me out. What I found was a package called "JAMA" (http://math.nist.gov/javanumerics/jama/) that contained a ton of tools for performing operations on matrices, among them being a tool for inverting matrices. Using this program, I found (I - Q)^-1 and added together all the values in the first row (representing the expected number of moves required to reach the absorbing state from the first square).

            The value obtained, perhaps very, very slightly off due to Java floating-point inaccuracies, is:

            39.225122308234

            Ahh, now that's better.

Friday, March 8, 2013

Great expectations


            Last week I finally managed to find out the probabilities of completing a game of Chutes and Ladders in various numbers of moves, so can we finally (mathematically) answer the nagging question, what is the expected number of moves required for a player to complete a game of Chutes and Ladders?

            When we determined this value using the computer simulation, we had actual data to work with: the total number of games played and the number of moves each game lasted for. We divided the latter number by the former number and out popped the average. Can we come to the same result with just some theoretical probabilities? 

            Well, we can get pretty close. To reiterate, when we're talking about the average number of moves per game, we mean the total number of moves divided by the total number of games:


            We can then break down the "total number of moves" as follows:



            Note that the numerator continues on infinitely since there is no theoretical limit to the maximum number of moves a game can last for.

            Now separating each term in the numerator into an individual fraction and simplifying...


            This is an expression we can work with, since we found the exact probabilities of completing the game in various numbers of move using the transition matrix. There is still one problem though. As already mentioned, this expression requires us to know the probability of completing the game in every possible number of moves, which is a theoretical impossibility since there are infinitely many. However, we can still make a good approximation based on a large number of moves. As you can see from the previous weeks' graphs, the probability of finishing a game in a triple-digit number of moves or more is very slim, so we aren't hurt too much if we leave a lot of these larger values out. Using the probabilities of completing the game in 1,2,3,...,300 moves, I got an approximate average of about 39.21529. The estimated average of completing the game from the computer simulation was also around 39.2, so not bad!

            While that much precision is probably good enough for Chutes and Ladders, there might be other scenarios where more precision is needed, so it's still important to know whether there's a way to determine the exact number of expected moves. It turns out there is a way to do this using (you guessed it!) matrices.

            This will probably be the most obscure part of this project to date (and everything else hasn't exactly been a skip through the park), and since I'm already nearing the 500 word limit here, I think I'll just push off that explanation into the next entry. Stay tuned.

Saturday, March 2, 2013

Closed-form expressions: the "Ladders" of math


            In order to find the probability of completing the game in a certain number of moves, we have to make a 101 by 101 matrix, giving us a total of 101 * 101 = 10201 entries. However, thus far, we have really only cared about one entry: the entry that gives the probability of moving from the first square to the last square. This means that we have 10200 matrix entries that are essentially meaningless. A natural question, therefore, is can we somehow ignore the information that doesn't matter to us and thereby simplify the process of finding out what does matter?

            Unfortunately, we can’t just start cutting away at rows and columns of the transition matrix and expect the entry we care about to remain unchanged. It’s the intermediate steps that ultimately determine how we get from point A to point B. If we want to find a way to simplify it, we have to look for more general shortcuts in matrix multiplication. One solution is to see if there exists a “closed form” expression for the value of an entry in a matrix as that matrix is raised to various powers. What exactly does that mean?

            Consider the following example. Suppose we wanted to find the sum of the first one hundred positive integers, that is, the value of 1 + 2 + 3 + … + 100. We could go about this problem by adding the terms individually (1 plus 2 is 3, 3 plus 3 is 6, 6 plus 4 is ten…), but that would clearly take a while. However, there is a shortcut. It can be shown that the sum of the first n positive integers is equal to [(n)(n+1)]/2. The sum of the first one hundred integers is therefore equal to (100)(101)/2, or 5050. [(n)(n+1)]/2 would be considered a closed form expression for the sum of the first n integers since it allows us to perform an otherwise complicated calculation in one simple step by plugging in a value for one variable.

            So, can we do the same thing for matrices? Well, in some cases, certainly. If we have a matrix filled entirely with 0's, we can determine the value for any given entry in that matrix if it’s raised to a power: it’s always 0. Of course, that trivial case doesn't help too much. Can we do the same for a matrix whose entries change as it’s raised to various powers? Well, here’s one I found. Let’s call it M:



            Now see what happens as we raise M to a few different powers:



            Do you see the pattern? The value in the bottom-left corner is always equal to (1/2) times the power that the matrix is raised to and all the other values remain constant. Thus, we can determine the value of the entries as M is raised to some positive integral power without actually multiplying M by itself over and over again. Of course, this is still a rather simple case and doesn't provide us with some general pattern that can be applied to other, more complicated matrices. I did some extensive Google searching to see what sort of matrices have these convenient closed-form expressions, and unfortunately I couldn't find anything that would be applicable to the behemoth transition matrix for Chutes and Ladders (if Google doesn't have the solution, chances are it’s time to throw in the towel). I will keep this question in the back of my head as I keep going on with this project, but at this point it doesn't look like there is a simple solution (I could see what happens if I replace the numbers with variables in the 101 by 101 matrix, but… chances are that won’t be useful. Try it with a 2 by 2 matrix and you’ll see why!).

Friday, March 1, 2013

Data-crunching and (kinda) pretty graphs


Phew. As mentioned at the end of my last blog, this week I combined the two programs I wrote last week -- the program that generates the transition matrix for Chutes and Ladders and the program that computes the powers of an inputted square matrix -- and found the probability of a player computing a game of Chutes and Ladders in various numbers of moves. Without further adieu, here are the results:



This graph represents the cumulative probability of completing a game of Chutes and Ladders by a certain move (represented on the horizontal axis) up to the hundredth move. The probability of finishing a game tends towards 1 as the number of moves increases. This must be true since the last square is an absorbing state, but it also makes intuitive sense. A game of Chutes and Ladders must end at some time!

The next graph represents the probability of completing a game of Chutes and Ladders in exactly a certain number of moves. This can be easily obtained from the data used to make the last graph. For instance, if we looked at the value of the probability when the number of moves equals 3, we would get the probability of finishing the game in 1 moves plus the probability of finishing in 2 moves plus the probability of finishing in 3 moves. The value of the probability when the number of moves equals 2 would be finishing in 1 move plus the probability of finishing in 2 moves. If you subtract the two, you are left with the probability of completing a game in exactly 3 moves. This can be generalized: to find the probability of completing the game in exactly n moves, subtract the probability that the game will be finished by the end of the (n-1)th move from the probability that the game will be finished by the end of the nth move. Anyways, here is the graph:



Look familiar? This graph really resembles the graph generated from running the Chutes and Ladders simulation (with a million trials), which is shown below.




It's important to realize that while these graphs are similar, they are not the same. The first graph gives the "absolute" probabilities of completing a game of Chutes and Ladders in certain numbers of moves while the second one merely estimates these values. The law of large numbers discussed in a previous post tells us that the second graph should resemble the first graph more and more as additional trials are performed (for example, we would expect the graphs to be more similar for a million trial simulation than for a ten-thousand trial simulation). A million being a fairly large number gives us a pretty close estimation.

Although it might be hard to see from the graphs, the minimum number of moves required to win is seven. A person is most likely to complete the game on the twenty-second move, which has about a 2.49% chance of occurring (this represents the peaks of the last two graphs and the point where the slope is increasing most rapidly in the first graph).