I recently authored my first R package (and definitely have the bug, now). It would be remiss of me to continue without a quick PSA for anyone out there thinking about dabbling in the package game - below is a shout-out to the invaluable resources I found along the way that kept me nice and sane.
@ewen_) June 7, 2017
In short, I set out on my package contribution journey by focusing on the development of something that I myself would find useful (that way, I could keep at least one user happy). Each year, I play an edition of fantasy sports called Fantasy Premier League (FPL), the official fantasy football game of England’s top football (or soccer for you stateside peeps) division. If you really want to know the ins and outs of FPL, there’s a section of the site just for rules. TL;DR: for 38 rounds (called ‘gameweeks’), you pick a team of football players to do good things that score you points and help you finish as high up the leaderboard as possible.1
It just so happens that we’re in the off-season right now, and I figured I should use this down-time to explore how R could assist with FPL users’ exploits in the coming season.2
fplR - a package for the strange intersection of FPL / R users like me (I hope it’s not just me) to explore the game’s data. In the rest of this post, I want to test drive the package by first pulling out some insights on my most recent season (2016/2017), and then do some exploring of the mythical quirks of FPL as a bit of fun.3
How did I do last season?
As the game updates each user’s scores following the day’s matches, the site helpfully tells you your ranking (overall, in your country, in any mini-leagues you are part of, etc.). Still, just knowing your performance at one point in time does not begin to capture the season’s journey (read: roller-coaster of emotions and deep suffering). Time to dive into fplR to see if it can help.
# get my fpl season performance
library(fplR)
myPerformance <- userPerformance(137633)
The sharp decline in my ranking around gameweek 5 is striking, as is the fightback to something respectable in gameweeks 7-10. However, bigger variations in the earlier gameweeks are to be expected - few points have been scored at that stage, so the pack is close and a single good or bad week can move a user a long way (more on this, shortly). Nevertheless, my final rank was a personal best of 4,752nd in the world. (I hope that establishes my credentials, nice and early.) Having spent so long invested in something, it’s really nice to just be able to reflect on it with this visual accompaniment.
Next, I’ll take some time to examine some aspects of the game that the FPL community has debated since the beginning of time. I won’t necessarily dispel/confirm all of these things here, but it’s a taster of what kinds of analyses can be spun up quickly using fplR.
The early season rat race
Many users emphasise the importance of a good start, claiming that it is more difficult to do well in the latter stages of the season without a solid foundation built early on. To illustrate this, let’s look at the average shift in overall rank by gameweek in one of my mini-leagues (a smaller league amongst a peer group) as the season progresses, i.e. how much does a user’s rank change from week to week? (N.B. in the current package version, I haven’t developed a function to help make this easier, yet. Soon come.)
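For anyone who wants to try something similar by hand, here is a rough base R sketch of the "average shift in rank" calculation. The data frame and its column names (`user`, `gameweek`, `overall_rank`) are made up for illustration; they are an assumption on my part and not the structure that fplR returns.

```r
# Hypothetical toy data: one row per user per gameweek (not real FPL data)
ranks <- data.frame(
  user         = rep(c("a", "b", "c"), each = 4),
  gameweek     = rep(1:4, times = 3),
  overall_rank = c(100, 400, 250, 260,
                   900, 300, 350, 340,
                   500, 450, 700, 690)
)

# Make sure each user's rows are in gameweek order before differencing
ranks <- ranks[order(ranks$user, ranks$gameweek), ]

# Absolute change in rank from the previous gameweek, computed per user
# (the first gameweek has no previous value, so it is NA)
ranks$rank_shift <- ave(
  ranks$overall_rank, ranks$user,
  FUN = function(x) c(NA, abs(diff(x)))
)

# Average shift across users for each gameweek (NA rows are dropped)
avg_shift <- aggregate(rank_shift ~ gameweek, data = ranks, FUN = mean)
avg_shift
```

Taking the absolute value means a climb and a fall of the same size count equally, which is what we want when the question is how much ranks move rather than in which direction.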
The “elbow” in this chart comes at a similar time to when my rank began to level off, somewhat. You might have noticed the spike in the penultimate gameweek - in this instance, lots of players actually played twice (known as a “double gameweek” to insiders), meaning a greater opportunity for big swings in user rankings.
Another factor perceived to be important in gaining a healthy position for high scores is that of team value. As football players in the game perform well (or not so well), their price actually changes as a function of market forces i.e. how many users are transferring them into/out of their team? By ‘playing the market’, users can build up the money at their disposal with which to build the best team. Therefore, perhaps we would expect someone’s final team value to be a good predictor of their final rank (N.B. one outlier was removed from this plot - sometimes people chase the team value as a game in itself, incurring lots of points penalties that drastically affect their team’s points total. This was one of those cases).
To show a fairer picture, I have attempted to remove ‘dead’ users i.e. users that have given up on the game, while their team continues to register scores. To eliminate these zombie teams, I chose to use transfers (bringing a player in for an existing squad player) as a proxy. It is fairly safe to assume that users remaining active until the season end are also making (at least some) transfers up until the end of the season, too. I’ll consider users who made a transfer in one of the final three gameweeks as ‘live’.
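The ‘live’ user filter described above could look something like the sketch below. Again, the `transfers` data frame and its columns are hypothetical stand-ins, and the final gameweek is taken to be 38 as per the season length mentioned earlier.

```r
# Hypothetical toy data: one row per transfer made (not real FPL data)
transfers <- data.frame(
  user     = c("a", "a", "b", "c", "c"),
  gameweek = c(3, 37, 12, 36, 38),
  stringsAsFactors = FALSE
)

final_gameweek <- 38
live_window <- (final_gameweek - 2):final_gameweek  # gameweeks 36, 37, 38

# A user counts as 'live' if they made at least one transfer in the window
live_users <- unique(transfers$user[transfers$gameweek %in% live_window])
live_users
```

Here user "b" last made a transfer in gameweek 12, so they would be treated as a zombie team and dropped.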
There’s a moderate positive correlation, but plenty of variation as well - as was the case in our outlier, it is possible that users may chase team value at the expense of their league performance. Still, we can say that there appears to be a relationship between retaining a high team value and ranking highly.
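As a rough illustration of how such a relationship could be quantified, the sketch below computes a Spearman rank correlation on made-up numbers. The column names are assumptions, the values are invented purely to make the code runnable, and Spearman is my choice here for illustration rather than necessarily the measure behind the chart.

```r
# Hypothetical toy data: final team value (in millions) and final league rank
teams <- data.frame(
  final_value = c(100.5, 102.0, 103.5, 104.0, 105.5, 101.0),
  final_rank  = c(900, 700, 800, 300, 200, 1000)
)

# A smaller rank number is better, so negate it: a positive coefficient
# then means "higher team value goes with a better finish"
rho <- cor(teams$final_value, -teams$final_rank, method = "spearman")
rho
```

A rank-based measure seems a sensible default because overall rank is already an ordinal quantity, and it is less sensitive to the odd extreme team value than a Pearson correlation would be.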
While one free transfer is permitted each week, users can make additional transfers at a cost of 4 points each - known as “taking a hit”. This is a decision that FPL users frequently toil over, as they must judge whether making the move will return a greater number of points (considering the penalty) than sticking. Perhaps how often someone incurs hits can tell us something about performance (again, I’ve removed an outlier incurring a huge number of these).
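The arithmetic of a hit is simple enough to write down. The helper below is a hypothetical function of my own (it is not part of fplR) and follows the rule exactly as stated above: one free transfer per gameweek, 4 points for each extra one.

```r
# Hypothetical helper: points deducted in a gameweek for extra transfers
hit_cost <- function(transfers_made, free_transfers = 1, points_per_hit = 4) {
  # Only transfers beyond the free allowance are penalised
  pmax(transfers_made - free_transfers, 0) * points_per_hit
}

# Transfers made in four example gameweeks
weekly_transfers <- c(0, 1, 2, 4)
weekly_cost <- hit_cost(weekly_transfers)
weekly_cost        # 0 0 4 12
sum(weekly_cost)   # 16 points lost to hits over these four gameweeks
```

So a double transfer only pays off if the extra player outscores the one he replaces by more than 4 points.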
It appears that taking hits has very little correlation with a user’s rank. This would suggest that there are indeed good ‘hits’, as users are able to get a high rank while racking them up. At the same time, this illuminates other playing styles that can also bring success i.e. users who refrain from taking ‘hits’ are able to get as high a rank as those that do.
Picking the bench
Every user talks about the time they left that week’s hot player on the bench (each gameweek, you must choose a starting 11 players - if they play, your back-up players, or ‘bench’, do not play and do not collect points). This often leads to disgruntled folk who hold dear the belief that some kind of FPL God is consistently against them. If we look at users’ benches across the season, how much variation is there?
What we see is close to a normal distribution. So, probably no FPL God’s retribution. Most people benched somewhere in the region of 150-300 points over the course of a season, with an unlucky/lucky few coming in higher or lower respectively.
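To get from weekly bench scores to a season total per user, a small aggregation is all that is needed. The sketch below uses an invented `bench` data frame with assumed column names; three users is obviously far too few to say anything about the shape of a distribution.

```r
# Hypothetical toy data: bench points per user per gameweek (not real FPL data)
bench <- data.frame(
  user         = rep(c("a", "b", "c"), each = 3),
  gameweek     = rep(1:3, times = 3),
  bench_points = c(4, 0, 7,
                   2, 9, 1,
                   5, 5, 3)
)

# Total points left on the bench by each user over the season
season_bench <- aggregate(bench_points ~ user, data = bench, FUN = sum)
season_bench
```

With a full league, `season_bench$bench_points` could then be passed to `hist()` or `qqnorm()` to eyeball how close to normal the totals are.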
Another phenomenon, representing something of an FPL meta-game, is that of the “bandwagon”. This refers to times when certain players put in particularly explosive performances, triggering big movements in the market (i.e. lots of people bringing them in, similar to a hot stock). When were the biggest transfer booms?
Dele Alli had the biggest influx of transfers all season in gameweek 21, after a masterclass against Chelsea. Incidentally, both the 2nd (Ibrahimovic) and 3rd (Gundogan) highest net transfers followed games against West Brom.
As for the biggest net losses, these mostly belonged to players of high ownership who suffered serious injuries/suspensions. The most interesting case is that of Diego Costa, who was the center of a media frenzy claiming a (real-life) transfer to China was imminent (this never materialized).
Of course, these absolute swings are largely governed by a player’s initial ownership size and profile. We can look at percentage changes in ownership of players to consider those who had large relative shifts in selection.
Here we see huge percentage increases in the ownership of Fer and Stuani after promising displays on the opening day of the season and second gameweek, respectively. It is often in the very early stages that users scramble for players, often after just a single good performance, as there are few data points to go by otherwise. Jesus and Gabbiadini were actually both new arrivals to the game in the mid-season transfer window (in January, clubs can buy additional players) who started well - again, in the absence of any long-term trend, users rushed to scoop up these players.
Of course, the % drop in ownership is naturally capped (ownership can only fall by 100%). Interestingly, just two games after Jesus’ huge upswing in ownership, he gave us the biggest fall (almost 90% of owners dropped him)! If that’s not Murphy’s Law in full effect, I don’t know what is.
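The difference between absolute and relative swings is easy to see in a small sketch. The `ownership` data frame below is invented, and it assumes that the change in a player's owner count from one gameweek to the next equals his net transfers (i.e. it ignores brand new users joining the game).

```r
# Hypothetical toy data: number of owners per player per gameweek
ownership <- data.frame(
  player   = rep(c("p1", "p2"), each = 3),
  gameweek = rep(1:3, times = 2),
  owners   = c(1000, 5000, 600,
               200000, 230000, 180000)
)
ownership <- ownership[order(ownership$player, ownership$gameweek), ]

# Net transfers: change in owners since the previous gameweek, per player
ownership$net_transfers <- ave(
  ownership$owners, ownership$player,
  FUN = function(x) c(NA, diff(x))
)

# Percentage change relative to the previous gameweek's ownership
ownership$pct_change <- ave(
  ownership$owners, ownership$player,
  FUN = function(x) c(NA, 100 * diff(x) / head(x, -1))
)
ownership
```

The lightly owned "p1" gains only 4,000 owners but that is a 400% rise, while the heavily owned "p2" gains 30,000 for a 15% rise. On the way down, the percentage can never go below -100, since a player cannot lose more owners than he has.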
How often are the top players hot?
Come the end of the season, we check the top point-scoring players and yearn for what might’ve been if we’d snapped them up at the right time. How much love did the top (eight) overall point scorers actually get from FPL users over the season, and when?
We see very different relationships between certain players and the transfer market. De Bruyne remains exceptionally stable in ownership after a couple of early season swings, while Kane’s ownership is volatile throughout. Some of this will be determined by player availability (i.e. were they injured/suspended), but also by how much variation there is in their points-scoring. Do we see consistency in the performance level of these players?
By applying a moving average in the above chart (each data point is an average of the last five gameweeks), we can imagine performance in terms of season phases. For someone like De Bruyne, we notice that his peak performances came at the very start and end of the season, which perhaps explains the tepid market in the middle of the season. In a similar fashion, we can identify ‘purple patches’ of players, like Alli in the season middle. Much of FPL success can be determined by ensuring ownership of these players during such patches.
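A trailing five-gameweek moving average of this kind can be sketched in base R with `stats::filter()`. The points vector is made up for illustration, and this is just one way to compute it, not necessarily how the chart was produced.

```r
# Hypothetical toy data: one player's points in ten consecutive gameweeks
points <- c(2, 6, 12, 1, 9, 3, 15, 2, 8, 5)

# Trailing moving average: each value is the mean of the current gameweek
# and the four before it (sides = 1 looks backwards only)
ma5 <- as.numeric(stats::filter(points, rep(1 / 5, 5), sides = 1))
ma5
```

The first four values are `NA` because there are not yet five gameweeks to average over, which is why a chart like this only really gets going a few weeks into the season.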
I hope this illustrates some of the possibilities available with the fplR package. I had a lot of fun making it, and look forward to using it to better understand my FPL experience. Do share any interesting things you’re able to do with it (or any things that go wrong with it…), and try to enjoy the new season. It’s just a game…
1. In this post I have referred to actual footballers as ‘players’, and players of the FPL game as ‘users’.
2. At present, the FPL site only keeps detailed data for the current season. Therefore, some of the data used in this analysis will likely be unavailable once the new FPL season begins. For access to all data used in this analysis, please go here.