How strong is my opponent? Using Bayesian methods for skill assessment
Darina Goldin
If A beat B and B beat C, should A be ranked higher than C? What seems like an easy question quickly becomes complex once we ask when A last played B, what the scores were, and whether B's grandmother had died just before the match. The question of how to rank contestants correctly has been around for as long as people have played sports, and it has not lost its relevance. In online video games, rankings are used to build teams of comparable skill. In betting, a ranking translates directly into the probability of a contestant winning a match. And of course every chess player worldwide can tell you their Elo rating. In this talk we will look at the most established ranking algorithms: Elo, Glicko-2 and TrueSkill. All three are based on Bayesian updating. We will consider the theoretical foundations of the three and compare their use cases and shortcomings. All three algorithms are readily available as Python packages. Using a real-life data set, we will generate a ranking of German Bundesliga teams and compare it to the currently accepted status quo.
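To make the link between a rating and a win probability concrete, here is a minimal sketch of the textbook Elo update in plain Python. It is not code from the talk or from any particular package. The scale factor of 400 and the K-factor of 20 are assumed conventional values, and the function names are chosen only for illustration.

```python
def expected_score(rating_a, rating_b):
    """Win probability of A against B under the standard Elo logistic curve."""
    # The scale of 400 is the conventional chess value (an assumption here).
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a, rating_b, score_a, k=20.0):
    """Return the new ratings of A and B after one match.

    score_a is 1.0 if A won, 0.5 for a draw and 0.0 if A lost.
    k is the K-factor; 20 is an assumed, illustrative value.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Elo is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Two hypothetical contestants with equal ratings; A wins the match.
    print(expected_score(1500, 1500))
    print(update(1500, 1500, 1.0))
```

An upset moves the ratings more than an expected result does, because the update is proportional to the gap between the actual and the expected score.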
Darina Goldin
Affiliation: Bayes Esports Solutions
Darina came to Data Science via a Ph.D. in Control Science. For the past four years, she has been working on predicting the outcomes of esports matches. By now she has probably applied and implemented every ranking algorithm published since the 1960s.