After the barrage of Wordle-type games that appeared in 2022, there were many attempts to find the best opening move using data analysis (Example 1, Example 2). Whilst not all of these gave the same answer, the general strategy is the same: eliminate as many answers as possible with each guess to increase your odds of victory. Here, we apply the same logic to the F1 driver game Stewardle.
How Stewardle works
Stewardle gives you six attempts to guess an F1 driver who raced (or races) during the hybrid era. This means that whilst many famous names such as Senna and Schumacher (Michael) are not included, lesser-known drivers like Rio Haryanto or Mick Schumacher are. As we shall see, attempting a household name as your first guess is generally not a good strategy.
To help players find the right answer, the game has 6 clues: Nationality, Team, Car Number, Age, Debut Year, and Number of Wins. For numerical answers, it tells you whether the driver guessed has numbers bigger, smaller or equal to the answer.
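The clue mechanics described above can be sketched in a few lines of Python. This is only an illustration of the rules as described here, not the game's actual code: the `Driver` class, the `feedback` function and the two placeholder drivers are all invented for the example, and it assumes each driver is stored with a single team.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Driver:
    name: str
    nationality: str
    team: str  # assumption: one team per driver
    number: int
    age: int
    debut: int
    wins: int


NUMERIC = ("number", "age", "debut", "wins")


def compare(guess_value, answer_value):
    """Describe the guessed value relative to the answer."""
    if guess_value > answer_value:
        return "higher"
    if guess_value < answer_value:
        return "lower"
    return "equal"


def feedback(guess, answer):
    """Return the six clues for one guess."""
    clues = {
        "nationality": guess.nationality == answer.nationality,
        "team": guess.team == answer.team,
    }
    for field in NUMERIC:
        clues[field] = compare(getattr(guess, field), getattr(answer, field))
    return clues


# Placeholder drivers with made-up values, purely for illustration.
guess = Driver("Driver A", "Country X", "Team P", 20, 31, 2014, 1)
answer = Driver("Driver B", "Country X", "Team Q", 7, 43, 2001, 21)
clues = feedback(guess, answer)
print(clues)
```

Here "higher" means the guessed driver's value is bigger than the answer's, matching the description above.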
As previously mentioned, the most efficient route to success in these types of games is typically to eliminate as many options as possible, whatever the response turns out to be. For example, if your first guess is Kimi Räikkönen, it is likely that you won’t gain much information from the response. Why? Well, Räikkönen is significantly older, more experienced and more successful than the vast majority of F1 drivers included. This means that the most likely response is that the answer is younger, debuted later and has fewer race wins, which unfortunately describes almost every driver from the hybrid era.
What we are therefore looking for is an “average” F1 driver from the database. If we look at data for every driver on the system, we find that the average driver has a Race Number of 22, an Age of 31, a Debut F1 year of 2015 and 0 race wins. The best starting point is therefore a driver as close to as many of these stats as possible. (Actually a starting driver with 1 race win would be ideal, as the response would tell us if the driver in question had 0 wins, 1 win or more.)
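One way to make “average” concrete is sketched below. It assumes that the average is a median (the text does not say which average was used) and it scores each driver by how evenly they split the pool on every numerical clue. The rows are invented placeholders rather than real Stewardle data, and the names `typical_values` and `imbalance` are made up for this example.

```python
from statistics import median

FIELDS = ("number", "age", "debut", "wins")

# Invented placeholder rows, not real Stewardle data:
# name -> (car number, age, debut year, race wins)
DRIVERS = {
    "Driver A": (4, 24, 2019, 0),
    "Driver B": (20, 31, 2014, 1),
    "Driver C": (7, 43, 2001, 21),
    "Driver D": (88, 30, 2016, 0),
    "Driver E": (22, 33, 2015, 0),
}


def typical_values(drivers):
    """Median of each numerical clue across the whole pool."""
    columns = zip(*drivers.values())
    return {field: median(column) for field, column in zip(FIELDS, columns)}


def imbalance(name, drivers):
    """Sum over clues of |drivers above - drivers below| this driver's value.

    A low score means the driver sits near the middle on every clue,
    so any response removes a good share of the pool.
    """
    total = 0
    for i, value in enumerate(drivers[name]):
        below = sum(1 for row in drivers.values() if row[i] < value)
        above = sum(1 for row in drivers.values() if row[i] > value)
        total += abs(above - below)
    return total


typical = typical_values(DRIVERS)
most_typical = min(DRIVERS, key=lambda name: imbalance(name, DRIVERS))
print(typical)
print(most_typical)
```

Counting drivers above and below, rather than measuring distance from the median, keeps the four clues on the same scale, so a car number of 88 does not swamp a difference of one race win.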
The qualitative data given is the nationality and team of the driver. Neither of these is as important as the numerical values, as even the ideal guess is unlikely to match the answer on either. However, ideally we would want a driver who matches as many other drivers as possible. For nationalities, the most common one is British.
After reviewing every possible choice, it is clear that Kevin Magnussen is the standout choice for an “average” F1 driver from the hybrid era, and therefore an ideal first guess. Several other drivers also have values that are generally close to the average (Marcus Ericsson, Felipe Nasr, Alexander Rossi and Will Stevens are also strong choices), but none of them is as consistently close as Magnussen, and none of them is as consistent at leaving a relatively small number of drivers remaining for subsequent guesses.
|Variable||Ideal Value||Magnussen’s values|
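Reviewing every possible first guess can be automated along the following lines. This is a sketch of one reasonable scoring method, not necessarily the one used for this article: each candidate guess is scored by the number of drivers you would expect to have left after seeing the response, assuming every driver is equally likely to be the answer. The pool is made of invented placeholder drivers, and `signature` and `expected_remaining` are hypothetical names.

```python
from collections import Counter

# Invented placeholder pool, not real Stewardle data:
# name -> (nationality, team, car number, age, debut year, race wins)
POOL = {
    "Driver A": ("X", "P", 4, 24, 2019, 0),
    "Driver B": ("Y", "Q", 20, 31, 2014, 1),
    "Driver C": ("X", "R", 7, 43, 2001, 21),
    "Driver D": ("Z", "P", 88, 30, 2016, 0),
    "Driver E": ("Y", "Q", 22, 33, 2015, 0),
}


def sign(a, b):
    """1 if a > b, -1 if a < b, 0 if equal."""
    return (a > b) - (a < b)


def signature(guess, answer):
    """The full response the game would give for this guess and answer."""
    g, a = POOL[guess], POOL[answer]
    matches = (g[0] == a[0], g[1] == a[1])
    return matches + tuple(sign(x, y) for x, y in zip(g[2:], a[2:]))


def expected_remaining(guess):
    """Average number of drivers still consistent with the response."""
    groups = Counter(signature(guess, answer) for answer in POOL)
    return sum(n * n for n in groups.values()) / len(POOL)


scores = {name: expected_remaining(name) for name in POOL}
worst_first_guess = max(scores, key=scores.get)
print(scores)
print(worst_first_guess)
```

In this toy pool the veteran with the most wins (Driver C) is the worst opener, because three of the other four drivers produce exactly the same response: higher number, younger, later debut, fewer wins.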
Are all variables equally known?
There are probably two main issues with the analysis presented so far. Firstly, many F1 fans will either not have heard of some of the drivers, or not be able to recall them when playing the game. In many cases this is not a major barrier, but if the answer is someone you’ve never heard of then no analysis will save you!
Secondly, some variables are much easier to know (or approximate) than others. If we take Pastor Maldonado as an example, a long-time F1 fan would probably recall that he is Venezuelan, debuted in 2011, raced for Williams and Lotus, and miraculously won the 2012 Spanish Grand Prix. It’s less likely you would know his age off the top of your head, but you could perhaps give an approximation based on the era he raced in. His racing number, meanwhile, would probably be a total unknown.
(Maldonado is 38 at the time of writing and last raced in F1 with number 13. Give yourself a pat on the back if you knew these.)
Finally, it’s important to nail the second guess. This will typically involve working out which drivers are remaining and then finding a “typical” driver within that subset. There are lots of variables, but let’s take a test case where the driver has a smaller driver number than Magnussen, is older, debuted before 2014 and has won a race. This leaves 8 drivers: Alonso, Maldonado, Massa, Pérez, Räikkönen, Ricciardo, Rosberg and Vettel. So, out of these drivers, who is typical? It turns out that Nico Rosberg is your best bet.
|Variable||Ideal Value||Rosberg’s values|
|Debut Year||2006 or 2007||2006|
|Race Wins||11 or 21||23|
Using Rosberg as a second guess in this case gives a 50/50 chance of narrowing the pool down to a unique answer after just 2 guesses. The other 50% are split into 2 pairs (Maldonado-Pérez and Räikkönen-Massa), meaning that even in the worst-case scenario you have a 50% chance of picking the right driver with your next guess. Whilst this might seem like a selective example, it is actually a worst-case scenario, as in most cases there will be 4 or fewer drivers remaining after using Magnussen as the first guess.
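The arithmetic behind these percentages can be checked with a short calculation. The group sizes below are read straight from the paragraph above (four outcomes that pin down a single driver, one of which is Rosberg himself being correct, plus two pairs). The calculation assumes that each of the 8 drivers is equally likely to be the answer and that you pick at random when left with a pair.

```python
from fractions import Fraction

# Outcomes after guessing Rosberg in the 8-driver pool, as described in the text:
# four outcomes leave one driver (including "Rosberg is correct"), two leave a pair.
group_sizes = [1, 1, 1, 1, 2, 2]
pool = sum(group_sizes)

# Chance that the response leaves exactly one candidate.
p_unique = Fraction(sum(n for n in group_sizes if n == 1), pool)

# Average number of candidates left after the response.
expected_left = Fraction(sum(n * n for n in group_sizes), pool)

# Chance of having named the driver by the third guess:
# Rosberg himself is correct on guess two, and every other group of size n
# contributes (n / pool) * (1 / n) = 1 / pool on guess three.
p_rosberg = Fraction(1, pool)
p_by_third = p_rosberg + (len(group_sizes) - 1) * Fraction(1, pool)

print(p_unique, expected_left, p_by_third)
```

Under those assumptions, the chance of naming the driver by the third guess works out at 3 in 4, and the fourth guess is guaranteed.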
Whilst analysing exactly which drivers are remaining in a pool may be too difficult for most, gaining an idea of what is typical is extremely helpful. In the above example, Rosberg is significantly more successful and older than most drivers from the hybrid era. He only registers as an “average” driver because the remaining pool is already skewed towards those qualities.
This highlights another issue when making guesses: most of the variables given are not independent. For example, if the answer is an ex-Ferrari driver, they are also much more successful than average, older than average and have an earlier debut (the last ex-Ferrari driver was Vettel, who debuted in F1 7 years before the hybrid era began). A key skill in getting the right driver as quickly as possible is considering who is in the pool of remaining drivers, and then picking the most typical driver out of that pool.
I hope you have found the above useful. The initial plan was to dismantle the entire game, providing something akin to a flow chart covering all possible answers. I eventually concluded that this would spoil the guessing game a bit, so I took an approach that offers more “general advice” from the data analysis instead.