Searching for signal in ‘small data’
Everyone’s ‘data-driven’ these days. Following the data strips out biases and uncovers the right path… right?
Developing software requires ruthless prioritisation and difficult decisions. Features and upgrades must be approached in an order that brings value for users as quickly and cheaply as possible. Going down the wrong track for too long will be devastating.
In addition, I’ve learnt about cognitive biases; availability bias and confirmation bias seem to be prevalent in software design. And I’ve been in plenty of situations where numbers have been totally disregarded in favour of the Highest Paid Person’s Opinion.
The promise that data can solve these issues and more, and bring total clarity to the decision-making process, is seductive. But does it work like that if you’re a fledgling start-up? Not exactly, in my experience.
When you’re creating a product from scratch you don’t have data. You don’t have user statistics, because you don’t have any users. Even once you’ve gathered some early beta testers, usage data must be looked at with a critical eye. It’s ‘small data’; tiny sample sizes can lead to wildly misleading conclusions.
‘Small data’ literacy
Something I’ve observed with quantitative data is that a relatively high level of numeracy is required to analyse it properly. Over-confident techy people (like me, probably) mistake the ability to compare two numbers with deep mathematical insight.*
As a basic example, say you ask ten people whether you should build feature X or feature Y. Seven people say they’d prefer X and three would rather have Y. You should go with X, right?
Maybe. But there’s roughly a 1 in 6 likelihood that a result at least that skewed could occur purely by chance. That is, if you toss 10 coins, there’s a 17% chance that 7 or more would land heads. Small samples, like those gathered by early-stage start-ups, can’t be relied upon on their own. They can only be used to help inform a bigger decision-making process.
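The coin-toss figure can be checked with a few lines of Python. This is only a sketch: the function name tail_probability is made up for illustration, and it assumes each respondent is an independent fair coin flip (no real preference between X and Y), which real survey answers may not be.

```python
from math import comb


def tail_probability(n: int, k: int, p: float = 0.5) -> float:
    """Probability of seeing k or more 'successes' in n independent trials,
    where each trial succeeds with probability p (binomial upper tail)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Ten respondents with no real preference: chance that 7 or more pick X.
    print(f"7+ of 10:    {tail_probability(10, 7):.1%}")
    # The same 70/30 split from a hundred respondents, for comparison.
    print(f"70+ of 100:  {tail_probability(100, 70):.4%}")
```

The first figure is 176/1024, or about 17.2%. The second is far smaller, which is the point: the same 70/30 split means much more when it comes from a hundred people than from ten.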
* I found the book ‘The Art of Statistics’ by David Spiegelhalter to be a great insight into why statisticians and data analysts exist!
User research woes
Without meaningful product data, we turn to qualitative information. We conduct user research through interviews and workshops. This can uncover invaluable insights that you might never have imagined. But it’s noisy feedback due to numerous factors:
People often describe an idealised version of how they work. This is particularly apparent during group interviews but can be a problem for individuals too. It’s very easy for some people to confuse how they’d like their business to operate with how it actually operates. They might be mindful of presenting themselves professionally in front of others. This can make it challenging to decipher true areas of friction in a workflow.
If you treat people with kindness and respect (like you should!) they’ll probably reciprocate. In my experience, this can lead people to say what they think you want to hear during research conversations. They might overplay perceived pain points they’re aware you’re trying to fix or be overly positive about proposed solutions. As I’ve interviewed more users, I’ve become better at coaxing out more truthful responses. But I think that some people will always be just too eager to please.
There’s also a self-selection problem: the people most willing to take part in research tend to be the ones who already like the idea. In our case, this resulted in us having conversations with users who were already sold on our concept. We actually needed to speak more with people who were unsure about our ideas, as they were likely more representative of the market as a whole and might provide more critical feedback. What didn’t they like? Is there anything we could change that’d make it more appealing for them?
“The trouble with market research is that people don’t think what they feel, they don’t say what they think, and they don’t do what they say.” (Rory Sutherland, Alchemy)
We had actually anticipated problems with erroneous user feedback from the beginning. But we knew that we needed a rich stream of feedback from real users to stand a chance of building a successful product. So to try to circumvent these issues, we pushed to get paying customers onto the platform right away.* We got users to put their money where their mouth is, so to speak. Unbeknownst to us, however, this generated more noise.
Several members of our team had strong networks and good relationships with numerous business owners who fit our picture of the ideal customer persona. Getting these people to sign up and pay for the product (albeit at a reduced rate) was easy, as trust had long been established.
I personally took these early sign-ups at face value. I assumed these people wanted the product. In reality, however, I think many bought into the people selling the product more than the product itself. £50 a month is a small price to help support somebody you know, like, respect and trust. The result was that the ‘niceness’ noise factor was amplified tenfold.
A big mistake that I made was to not conduct enough of my own research directly with these early adopters. I should have teased out my own conclusions instead of relying on third-party feedback. But there was a tricky dance going on. Because these users were now paying us money, we felt obligated to continue to ‘sell’ our solution to them, rather than push for more challenging feedback. Losing a few customers at this stage would have felt like going backwards, but might have been key to progress in the long run.
After several months, users who weren’t as interested as we had thought dropped away and we were left with people who were better suited to the product. But by this point we’d wasted a lot of time and energy in an echo chamber filled with the wrong people.
* Pushing for paying customers as soon as possible was part of our bet on taking more funding. We aligned ourselves early on with revenue metrics and, initially at least, this paid off. It became more problematic later on.
A false start
We got to something like 30 new sign-ups from people across our extended network in a matter of weeks. That might not sound like a lot, but it was big for us. And it felt easy. We thought that we’d nailed some fundamentals and that each subsequent month would see compounding customer numbers.
The upside of these early results was that raising investment was easier. But it was based on brittle foundations that we weren’t entirely conscious of. Our early successes gave us false confidence in our financial forecasts and led us to overspend. Subsequent months’ sales figures deteriorated rapidly as we ran out of low-hanging friendlies to sell to. In fairness, ‘crossing the chasm’ from a group of early adopters to the majority is a common problem for many businesses (I should really read that book).
Noisy data summary
So to recap, noisy data — both quantitative and qualitative — is prevalent everywhere. It can be especially misleading for early stage businesses trying to find product-market fit.
- Small sample sizes can create misleading signals that might require deeper statistical familiarity to accurately interpret.
- What people say and how they act will often be totally different. They may say things that are overly ‘nice’ or idealistic even when they’re trying to be constructive.
- Selling to friends and peers doesn’t give you an accurate gauge of the broader market’s wants and needs.
- Pushing for critical product feedback is more difficult once a transactional relationship has been established.
Making strategic decisions using noisy feedback
Like any early stage software team, we came up against important strategic decisions every day. How should we position our product in the market? What user persona are we really building for? Which new product feature should we build next? What aspects of the product need improving? What’s blocking growth? What should we cut?
Ultimately, I think we failed to make some of these decisions quickly enough. Picking through ambiguous feedback took time, which is not something you have a lot of if you’re a start-up that lives or dies on growth. Waiting in the hope that stronger signals will appear was not the right strategy either because that takes even more time. So what should we have done instead?
Noisy feedback is distracting and uses undue time and energy. I think we need to be quite ruthless with ourselves to avoid getting sucked down dead-ends, even if we’re conscious of the traps ahead. I think it can be beneficial to actively ignore as much information as you can (fill your ears with wax).
We should strive to passively collect as much quantitative data as possible from the start. It could be useful in the future as datasets are expanded. But when sample sizes are too small to have meaning, we should probably just ignore them.
Gathering qualitative research is generally more time-consuming and therefore more costly. At the very start of a project, it is essential to talk to as many people as possible and gather a wide range of opinions. But once you can set an initial course, I think it is beneficial to distance yourself from this approach. Listening to too many sources is unsustainable if you need to move quickly. The feedback is not just noisy, it’s cacophonous.
In our case, we eventually narrowed our user base down to 2–3 people who we felt we could trust for truly honest and valuable feedback, and built a product for them. We tried to ignore everyone else. I wish we’d done this sooner as it massively sped up decision making.
Follow your gut?
If we accept that success is predicated on chance as much as it is on making the ‘right’ choices, it becomes easier to make decisions without waiting for supporting data points. When time is important (it normally is), a wrong decision is probably preferable to no decision at all. We need to choose quickly and confidently without overthinking, but be ready to change our minds just as quickly if we need to.
I think that there’s a myth that’s propagated across the tech world that you’ll get the answers you need if you only look at the numbers. And I think that there can be a hesitance to act without sound signals for fear of looking irrational. But I think data can be every bit as fallible as the human mind.
Businesses are complex entities with hundreds of inputs and it’s not possible to forecast successful outcomes on metrics alone. So we shouldn’t be afraid to start choosing paths. Data can’t do that for you. And when it can, you’ll be out of a job anyway.