Using analytics to make some decisions (not all)
The first thing that comes to my mind when thinking about the use of analytics is what I would do if I were working in another branch of the entertainment industry. In mobile games there is an easy temptation to turn to data for many – if not all – creative decisions. Yet turning to data for creative decisions would feel odd to me if I were in charge of producing a TV show, a feature film or a music album. First, because not every decision is make-or-break – for some decisions there is little value in trying to optimize performance. Second, because not every decision can be validated by data. There is always an element of the unknown in every decision we make, and spending time and resources to reduce that uncertainty doesn’t always yield actionable results. Is there conclusive data to indicate how many explosions to put in your action movie, or whether the protagonist’s love interest should be a redhead? Probably not (disclaimer: I have no direct experience in other branches of the entertainment industry – these are 100% my personal thoughts and assumptions).
Netflix has mountains of data available (there are very interesting posts on the Netflix tech blog). It appears that, for the most part, analytics are involved not so much at the “internal content” level (what plots, actions, protagonists) as at the “macro level” – factors such as genres, titles, descriptions, recently watched shows or who’s in your Facebook friends list. I’m sure you could look at drop-off rates to see which aspects of the content are unappealing and should be avoided. Maybe viewers do tune out more when the protagonist’s love interest is a redhead… But even then, when Netflix leverages data to optimize the viewing experience, the data is considered at the holistic level of the product taken as a whole rather than at the granular level of the content itself (which also makes sense when you consider that historically Netflix was about distributing existing content more than actually producing it).
In your game, is there an optimal state for all of the content you produce? Maybe, maybe not – there can be multiple options that are equally valid (what a character’s defining skills should be, what the reward in your login bonus is, what the theme of your next event is, etc.). What is fairly certain is that many decisions don’t make that big of a difference to the performance of your game. And if you’re spending time analyzing at that level, the ROI of your analytics efforts is limited at best.
Data can be a double-edged sword. I’ll be the first to insist on the incredible value data can bring to every organization involved in mobile gaming. The amount of data available means there is basically no guesswork needed to observe how players are engaging with your game and what is driving performance. On the other hand, that same abundance of data can be overwhelming. It can end up eroding our confidence in common sense and in our ability to move forward with new decisions and features. The more decisions we make based on data, the less confident we become in making decisions that data cannot support. And that can negatively impact our ability to innovate and make impactful decisions. Data provides visibility into what has already happened – when you want to have an impact on the future, you can only use it to take a more confident leap of faith. If not used sensibly, the reassurance data provides can turn into a crutch and a dependency that impedes good, efficient decision-making.
Not every change in a metric is meaningful
Making games is not rocket science (or watchmaking, or any other activity that depends on a very high level of precision for every element). Not every individual part needs to be perfectly fine-tuned and calibrated. At an even more fundamental level, some things are outside the realm of optimization altogether. The key is to distinguish the things that matter from the things that matter less – and focus your efforts on what is most likely to move the needle. That’s why using OKRs to develop your features and product strategy is so useful. You don’t want to spend time analyzing things just because they can be analyzed (and theoretically optimized). You want to invest time and resources in analyzing things because they can be impactful.
Once you accept the premise that not everything matters – and you’ve identified what in fact does – you need to define what counts as a meaningful change. You need to clearly distinguish what constitutes impact from what is just noise and variation within a “normality range”. To get the best ROI from your analytics efforts, be very intentional about what counts as impact, and only go after changes that are big enough to genuinely reflect a change in game performance.
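One simple way to operationalize a “normality range” – a sketch, not a prescription, using hypothetical daily play-time values – is to treat only observations falling outside a few standard deviations of the metric’s recent history as candidates for real change:

```python
from statistics import mean, stdev

def normality_range(history, k=2.0):
    """'Normality range' as mean +/- k standard deviations
    of a metric's recent daily values."""
    m, s = mean(history), stdev(history)
    return (m - k * s, m + k * s)

def is_meaningful(value, history, k=2.0):
    """Treat a new observation as potentially meaningful only if it
    falls outside the historical normality range."""
    lo, hi = normality_range(history, k)
    return value < lo or value > hi

# Hypothetical daily play time (minutes) over the last two weeks
history = [36.1, 36.4, 35.9, 36.6, 36.2, 36.0, 36.5,
           36.3, 35.8, 36.4, 36.1, 36.7, 36.0, 36.2]

is_meaningful(36.6, history)  # within normal day-to-day variation
is_meaningful(38.5, history)  # outside the range: worth investigating
```

The threshold `k` is arbitrary here; the point is that the band is defined before you look at the post-change number, so you are not tempted to read meaning into ordinary fluctuation.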
There is an important factor to keep in mind here. A game’s performance lies at the intersection of two key factors: the game and its features on one hand, and the audience engaging with that game on the other. When a metric moves, it can be driven by either: the game itself or the player base. We’ve all experienced a drop in ARPDAU caused by a feature drawing in players from a low-monetizing country. That doesn’t reflect a change in game performance, but a change in who is playing the game. When looking at impact, make sure you are actually considering the performance of your game, not changes in its environment.
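A toy mix-shift calculation makes this concrete. The numbers below are entirely hypothetical: two country segments whose per-segment ARPDAU never changes, yet the blended ARPDAU drops purely because the audience composition shifted:

```python
def blended_arpdau(segments):
    """Overall ARPDAU as the DAU-weighted average of per-segment ARPDAU.
    Each segment is a (dau, arpdau) pair."""
    total_dau = sum(dau for dau, _ in segments)
    total_rev = sum(dau * arpdau for dau, arpdau in segments)
    return total_rev / total_dau

# Hypothetical: (DAU, ARPDAU) for a high- and a low-monetizing country
before = [(100_000, 0.50), (50_000, 0.05)]
# A feature draws 50k extra players in the low-monetizing country;
# per-country ARPDAU is unchanged.
after = [(100_000, 0.50), (100_000, 0.05)]

blended_arpdau(before)  # 0.35
blended_arpdau(after)   # 0.275 – lower, with zero change in game performance
```

Segmenting the metric before reacting to it is what separates “the game got worse” from “the audience changed”.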
When using analytics, there is a difference between observing a difference between two numbers and observing a meaningful change in product performance. Two numbers will always differ at some level of detail. Daily play time might be 36.2 minutes before the last update and 36.6 minutes after. The numbers are not identical, but that doesn’t always mean something changed. The figure is indeed higher after the update, but that doesn’t mean players are playing more. Even stating that users are playing “slightly” more would be misleading. The number is different, but the performance is not. And that distinction greatly shapes the way you understand and evaluate the change, and what you decide to do next. If you focus too literally on the number rather than what it means in terms of impact, you are missing the forest for the trees.
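To illustrate with the 36.2 vs. 36.6 example – using made-up but plausible assumptions (per-user play time varies a lot, say a standard deviation around 20 minutes, and a few thousand users per period) – a basic two-sample z-statistic shows the gap is well within noise:

```python
import math

def two_sample_z(mean_a, mean_b, std, n_a, n_b):
    """z-statistic for a difference in means, assuming a common
    per-user standard deviation (a simplification)."""
    se = math.sqrt(std**2 / n_a + std**2 / n_b)
    return (mean_b - mean_a) / se

# Hypothetical: play time per user is noisy (std ~ 20 min), 5k users per period
z = two_sample_z(36.2, 36.6, std=20.0, n_a=5_000, n_b=5_000)
# z ≈ 1.0, well below the conventional 1.96 threshold:
# the 0.4-minute gap is compatible with ordinary variation
```

The same 0.4-minute gap would become statistically detectable with a much larger population – which is exactly why “the number went up” and “performance improved” are separate claims.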
It’s rather common to run an A/B test that suggests a moderate uplift in a metric – only for those improvements to fail to materialize when the change is rolled out in your game. If you are talking about a variation in conversion of 0.1 or 0.2%, then one of two things is happening. Either the change you’re looking at doesn’t reflect a change in game performance at all, or it reflects a real but very small one. You should treat the two differently, and even though observing a small meaningful variation is better than observing pure noise, you probably want to avoid both cases. If you spend time chasing a change in product performance that is in fact just normal fluctuation, you’re dedicating time and resources to something that doesn’t exist. And even in the second case, where you’re looking at a real – but small – change in game performance, you are not spending time on high-impact things. By definition, the biggest ROI you’ll get from data comes from focusing on big changes. The standards you set for meaningful change will in large part determine the ambitions you set for the game.
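A quick sanity check, again with invented numbers, shows why a 0.1-percentage-point conversion gap at typical test sizes often falls into the first bucket. A standard two-proportion z-test on a hypothetical 2.0% vs. 2.1% result with 20k users per arm:

```python
import math

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """z-statistic for a difference in conversion rates (pooled SE)."""
    x_a, x_b = conv_a * n_a, conv_b * n_b
    pooled = (x_a + x_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (conv_b - conv_a) / se

# Hypothetical A/B test: 2.0% vs 2.1% conversion, 20k users per arm
z = two_proportion_z(0.020, 20_000, 0.021, 20_000)
# |z| ≈ 0.7, far below 1.96: this "uplift" is indistinguishable from noise
```

Even when such a test does clear significance, the second question – is a 0.1-point change worth the effort it took to find – still has to be answered separately.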
You might be at a point in your game’s lifecycle where many optimizations have already been conducted and there are no more low-hanging fruits. That can be the moment when, at the level of your portfolio strategy, you want to rethink the contribution of that game and the amount of resources you’re willing to invest in it. But as long as you are still investing time and resources in that game’s operation, you probably want to strive for the biggest impact at all times – even if that means reinventing parts of your game to meaningfully move the needle.
Be clear on what constitutes impact
Maybe (hopefully?) particle physicists are very attentive to the subtlest variations. In mobile games it’s hard to do that – and it can be counter-productive to try to track things down at that level of precision. Very few games can attribute such small changes to actual changes in product performance. Depending on the metric you’re focusing on, the population you need to reliably measure a change (from a statistical point of view) can be very large. And even if the delta reflects a real change in product performance, it’s probably not worth spending three months A/B testing your login bonus to gain a high degree of confidence in a 0.5% increase in your game’s D7 retention. You can probably find better things to test…
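The standard sample-size approximation makes the cost of chasing such a small effect explicit. A sketch, under assumed baselines (a 20.0% D7 retention lifted by 0.5% relative, i.e. to 20.1%, at ~95% confidence and ~80% power):

```python
import math

def sample_size_per_arm(p_base, p_target, z_alpha=1.96, z_beta=0.84):
    """Approximate users needed per arm to detect p_base -> p_target
    with ~95% confidence and ~80% power (normal approximation)."""
    variance = p_base * (1 - p_base) + p_target * (1 - p_target)
    return math.ceil((z_alpha + z_beta) ** 2 * variance
                     / (p_target - p_base) ** 2)

# Hypothetical: detect a 0.5% relative lift in D7 retention (20.0% -> 20.1%)
n = sample_size_per_arm(0.200, 0.201)
# n comes out above two million users per arm
```

With numbers like these, most games simply cannot run such a test in a reasonable time – which is the statistical face of the argument above: reserve your testing capacity for effects big enough to matter.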