How strong is my opponent? Using Bayesian methods for skill assessment Darina Goldin PyConDE & PyData Berlin 2019 conference

Thursday 16:05 in Saal 6

Type/Track: Talk, PyData

If A beat B and B beat C, should A be ranked higher than C? What seems like an easy question quickly gets more complex once we ask when A last played B, what the scores were, and whether B's grandmother had died just before the match. The question of how to rank contestants correctly has been around for as long as people have been playing sports, and it has not lost its relevance. In online video games, rankings are used to build evenly matched teams. In betting, a ranking translates directly into the probability of a contestant winning a match. And of course every chess player worldwide can tell you their Elo rating.

In this talk we will look at the most established ranking algorithms: Elo, Glicko-2 and TrueSkill. All three are based on Bayesian updating. We will consider their theoretical foundations and compare their use cases and shortcomings. All three algorithms are readily available as Python packages. Using a real-life data set, we will generate a ranking of German Bundesliga teams and compare it to the currently accepted status quo.
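To make the link between a rating and a win probability concrete, here is a minimal sketch of the classic Elo expected-score and update rule. It is not taken from the talk: the function names, the K-factor of 32 and the 400-point scale are assumptions based on the values conventionally used in chess, and the example ratings are hypothetical.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Elo's estimate of the probability that A beats B (logistic curve, base 10)."""
    # Assumption: the conventional 400-point scale used in chess ratings.
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return the new (rating_a, rating_b) after one match.

    score_a is 1.0 if A won, 0.5 for a draw and 0.0 if A lost.
    k is the K-factor (assumed here to be 32); it caps how far one result moves a rating.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Elo is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


# Hypothetical ratings: a 1600-rated side hosts a 1400-rated side and loses.
favourite, underdog = 1600.0, 1400.0
win_probability = expected_score(favourite, underdog)
new_favourite, new_underdog = update_elo(favourite, underdog, score_a=0.0)
```

The surprise term `score_a - expected_score(...)` is what drives the update: an upset moves both ratings a lot, an expected result barely moves them. Glicko-2 and TrueSkill additionally track how uncertain each rating is, which this sketch leaves out.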

Tags: Algorithms

Level: Domain Expertise: none, Python Skill Level: none

Darina Goldin

Affiliation: Bayes Esports Solutions

Darina came to Data Science via a Ph.D. in Control Science. For the past four years, she has been working on predicting the outcomes of esports matches. By now she has probably applied and implemented every ranking algorithm published since the 1960s.
