Simulating a Tournament – Data, Math & Matchups

What would happen if someone tried to use ChatGPT to simulate an entire Pokémon tournament? I gave it a shot, and it actually worked.

This might be a big deal. It means that basically anybody can run simulated tournaments and figure out what the best deck is for any given tournament. It means you don’t have to guess what the play is anymore – you can just figure it out in a somewhat objective way. You could run hundreds of tournaments without playing a single game.

While it is impressive what this software can do for you, we must remember that its answers depend on the information we give it. The matchup percentages we enter need to be accurate if we want to trust the outputs, and so do the meta percentages. If we feed it bad information, we will get bad output: it will tell us that bad decks are good! But when we give it good information, it can be extremely useful.
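To make the point about input quality concrete, here is a minimal sketch of a sanity check you could run on your numbers before simulating anything. It is not the setup used in this article. The function name `check_inputs`, the dictionary layout, and every number below are assumptions made up for illustration. It only catches inconsistent inputs (meta shares that do not add up to 100%, or a matchup where both sides' win rates add up to more than 100%); it cannot tell you whether a matchup number is actually right.

```python
def check_inputs(meta_share, win_prob, tol=1e-9):
    """Return a list of problems found; an empty list means the inputs are consistent."""
    problems = []

    # Meta shares are fractions of the field, so they should add up to 1.
    total = sum(meta_share.values())
    if abs(total - 1.0) > tol:
        problems.append(f"meta shares sum to {total:.3f}, not 1")

    # For each pair of decks, both directions must exist and must not exceed 1 combined.
    # Anything left over below 1 is treated here as that matchup's tie rate.
    decks = list(meta_share)
    for i, a in enumerate(decks):
        for b in decks[i + 1:]:
            if (a, b) not in win_prob or (b, a) not in win_prob:
                problems.append(f"missing matchup entry for {a} vs {b}")
                continue
            pair_total = win_prob[(a, b)] + win_prob[(b, a)]
            if pair_total > 1.0 + tol:
                problems.append(f"{a} vs {b} win rates sum to {pair_total:.2f}, more than 1")
    return problems


# Placeholder numbers for illustration only, not real meta data.
meta_share = {"Raging Bolt": 0.6, "Terapagos": 0.4}
win_prob = {("Terapagos", "Raging Bolt"): 0.55, ("Raging Bolt", "Terapagos"): 0.45}
print(check_inputs(meta_share, win_prob))  # an empty list means no problems were found
```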

The great thing about ChatGPT is that you can also ask it loosely worded questions that normal software would not answer. For example, I asked “Which deck is the best?” This should be very hard for a non-human to answer, because the question gives no metric for what ‘best’ means. Does best refer to the deck that wins the highest number of individual games? Does it refer to which deck wins simulated tournaments the most? Does it refer to which deck makes it into Top 8 most often? I expected an AI to struggle with such underspecified questions, but ChatGPT had no problem giving me a great answer immediately. You can even ask it something like “What is the best deck for me to maximize my chances of making it to Day 2?” In other words, it will tell you what to play no matter what your goal is.
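As one way to picture the Day 2 question, the sketch below estimates how often a deck reaches a target number of wins over a fixed number of rounds. This is a simplified illustration, not the simulation from this article. The function `day2_chance`, the 9 rounds, the 6 wins needed, and all meta and matchup numbers are hypothetical placeholders. It also assumes opponents are drawn at random from the meta every round, mirror matches are 50/50, and there are no ties, which a real tournament with pairing by record would not satisfy.

```python
import random


def day2_chance(deck, meta_share, win_prob, rounds=9, wins_needed=6,
                trials=10_000, seed=0):
    """Estimate the chance that `deck` gets at least `wins_needed` wins in `rounds` rounds."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    opponents = list(meta_share)
    weights = [meta_share[o] for o in opponents]

    made_it = 0
    for _ in range(trials):
        wins = 0
        # Draw one opponent per round, weighted by how popular each deck is.
        for opp in rng.choices(opponents, weights=weights, k=rounds):
            p = 0.5 if opp == deck else win_prob[(deck, opp)]  # mirror assumed 50/50
            if rng.random() < p:
                wins += 1
        if wins >= wins_needed:
            made_it += 1
    return made_it / trials


# Placeholder numbers for illustration only, not real meta data.
meta_share = {"Raging Bolt": 0.6, "Terapagos": 0.4}
win_prob = {("Terapagos", "Raging Bolt"): 0.55, ("Raging Bolt", "Terapagos"): 0.45}
for deck in meta_share:
    print(deck, day2_chance(deck, meta_share, win_prob))
```

Changing `wins_needed` or swapping in a different success condition (for example, counting total wins) is how you would switch between the different meanings of "best" listed above.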

Now that you understand the utility of running a simulated tournament (or hundreds of them), I will explain how to do it. This article covers how I set up the simulation as well as how I came up with my data. The majority of my time was spent on getting good data and figuring out which data should be used. It can be very hard to decide what to input for a matchup at times! For example, if two average-level players play one game of Raging Bolt vs Terapagos, the result is around 55/45 in Terapagos’ favor. But if you play the same matchup in a Best-of-3 match with two perfect players, the result is more like 70/30 in Terapagos’ favor. That is a big difference! How do we deal with that? To account for discrepancies based on skill level and match format, we need to tell the program how to handle these situations. Then there are things like the tie rate for specific matchups, and the question of what to do about the large number of decks in this format. More on this later!
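To separate the effect of match format from the effect of skill in the example above, here is a small worked calculation. It assumes each game in a Best-of-3 is independent, has the same win chance, and cannot end in a tie; the function name `best_of_three` is made up for this sketch. You win the match by winning the first two games, or by splitting the first two and winning the third.

```python
def best_of_three(p):
    """Chance of winning a Best-of-3 match given single-game win probability p.

    Win in two games: p * p. Win in three: 2 * p * (1 - p) * p.
    The sum simplifies to p^2 * (3 - 2p).
    """
    return p * p * (3 - 2 * p)


print(f"{best_of_three(0.55):.3f}")  # about 0.575
```

Under these assumptions, moving to Best-of-3 only lifts a 55% game win rate to roughly 57.5%, so most of the gap to 70/30 would have to come from player skill and not from the longer match. That is one reason the skill level has to be an explicit input.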

After upgrading to Stage 2 you will see the rest of Phinn Lynch’s article and an audio recording of this article narrated by Andy Hyun:

