Another World Cup and another opportunity to predict how the tournament will play out – we all want to know which teams will hang around, which will exit the competition and, more importantly, who will win.
So, how should one go about making a prediction? You could use some high-level historical trends like this, you could ask an expert, or you could utilise the predictive powers of a psychic octopus.
Alternatively, if you are Goldman Sachs, you could use machine learning to run 200,000 probability trees, mining data on team characteristics and individual players to predict match scores. You could then use these predictions to simulate one million possible evolutions of the tournament and determine the probability of each team progressing through the rounds. Sounds like some serious legwork and fancy footwork, but hats off to Goldman Sachs for doing just this.
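To get a feel for the simulation step, here is a minimal sketch in Python. It is not Goldman Sachs' model: the four teams, their expected-goal figures, the Poisson goal draw and the coin-flip tie-break are all assumptions made up for illustration. It only shows the general idea of replaying a knockout bracket many times and counting how often each team lifts the trophy.

```python
import math
import random
from collections import Counter

# Hypothetical expected goals per match for four made-up teams (not real data).
EXPECTED_GOALS = {"Team A": 1.8, "Team B": 1.4, "Team C": 1.1, "Team D": 0.9}


def sample_goals(rng, mean):
    """Draw a Poisson-distributed goal count using Knuth's method."""
    threshold = math.exp(-mean)
    goals, product = 0, rng.random()
    while product > threshold:
        goals += 1
        product *= rng.random()
    return goals


def play_match(rng, team_a, team_b):
    """Simulate one knockout match and return the winner."""
    goals_a = sample_goals(rng, EXPECTED_GOALS[team_a])
    goals_b = sample_goals(rng, EXPECTED_GOALS[team_b])
    if goals_a == goals_b:
        # Assumption: a level score is settled by a coin flip (a stand-in for penalties).
        return rng.choice([team_a, team_b])
    return team_a if goals_a > goals_b else team_b


def simulate_knockout(rng, teams):
    """Play adjacent pairs round by round until one team remains."""
    while len(teams) > 1:
        teams = [play_match(rng, teams[i], teams[i + 1])
                 for i in range(0, len(teams), 2)]
    return teams[0]


def title_probabilities(teams, n_sims=20000, seed=0):
    """Estimate each team's chance of winning from repeated simulations."""
    rng = random.Random(seed)
    wins = Counter(simulate_knockout(rng, list(teams)) for _ in range(n_sims))
    return {team: wins[team] / n_sims for team in teams}


# Bracket order: Team A plays Team D, Team B plays Team C, winners meet in the final.
probs = title_probabilities(["Team A", "Team D", "Team B", "Team C"])
for team, prob in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{team}: {prob:.1%}")
```

Running more simulations simply tightens the estimates, which is presumably why a figure as large as one million is attractive.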
Goldman Sachs have used this methodology to predict which teams will make it to the round of 16 and how the tournament will play out from that point to the eventual winner. The model predicts the number of goals scored by each team in every match, and the un-rounded scores are compared to determine the winner. For example, according to their results (as shown in the image below), Germany will meet England in a quarter-final and will narrowly win by 1.47 goals to 1.28.
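The un-rounded comparison matters because rounding would turn many of these predictions into draws. A tiny Python sketch makes the point, using the Germany and England figures quoted above; the function name `predicted_winner` is hypothetical and only illustrates the rule as described here, not Goldman Sachs' actual code.

```python
def predicted_winner(team_a, goals_a, team_b, goals_b):
    """Return the team with the higher un-rounded predicted score (None if level)."""
    if goals_a == goals_b:
        return None
    return team_a if goals_a > goals_b else team_b


winner = predicted_winner("Germany", 1.47, "England", 1.28)

# Rounded, both predictions are 1 goal each, so the match would look like a draw.
print(winner, round(1.47), round(1.28))  # prints: Germany 1 1
```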