Stefan Thomke, professor at Harvard Business School, says running experiments can give companies tremendous value, but too often business leaders make decisions based on intuition. While A/B testing on large transaction volumes is common practice at Google, Booking.com, and Netflix, Thomke says even small firms can get a competitive advantage from experiments. He explains how to introduce, run, and learn from them, as well as how to cultivate an experimental mindset at your organization. Thomke is the author of the book Experimentation Works: The Surprising Power of Business Experiments and the forthcoming HBR article “Building a Culture of Experimentation.”
TRANSCRIPT
CURT NICKISCH: Welcome to the HBR IdeaCast from Harvard Business Review. I’m Curt Nickisch.
In science the need for experimentation is cut and dry. You come up with a hypothesis – whether it’s about how storm clouds move or how cells in the body die, and you set up an experiment to test it. There’s a method. It’s called the scientific method and you test it over and over again until you’re sure that it’s replicable and your answers are right. Or, at least as right as they can be until new variables come to light, or the landscape changes.
In business, there isn’t currently as much experimentation. Value has been placed on experience. On the intuition of managers and leaders. And that’s a bad thing says today’s guest. Even in the most innovative industries we can think of, more can be done to set up experiments, test the results and deliver better products and services to customers. And this goes far beyond A/B testing at tech giants.
Our guest today is Stefan Thomke. He’s a professor at Harvard Business School. He’s the author of the book, Experimentation Works: The Surprising Power of Business Experiments and he also wrote the HBR article, “Building a Culture of Experimentation.” Stefan, thanks for coming in.
STEFAN THOMKE: Thanks for having me.
CURT NICKISCH: Just to start, pretend I’m a business leader. Make the case for me. Why do we need to experiment more in business?
STEFAN THOMKE: Well first of all, it can generate a tremendous amount of value. Let me give you an example. Microsoft’s Bing, which is a search engine. An employee working sort of at Bing, came up with an idea on how to sort of display its ads. The manager didn’t think much of it. And they kind of shelved it. But the employee insisted.
At some point the employee decided just to launch an experiment to run a test, a controlled test. And when he ran the test, that little change, a few days of work generated more than $100 million of additional revenue in that year alone. And of course, more revenue going forward. It was in fact, it was the most successful experiment that was run at Bing.
So, what made the difference? Well the difference was that the employee had the power essentially or the authority to run the experiment, to launch it and to test it. It’s the test that actually told you what works and doesn’t work and —
CURT NICKISCH: And not the manager.
STEFAN THOMKE: And not the manager. The problem is in a lot of innovation, especially sort of when you’re trying to predict customer behavior, we get it wrong most of the time. And so rather than trying to follow our intuition or our opinions, why not just run the test and let the test tell us what works and doesn’t work?
CURT NICKISCH: And what’s the answer to that? Why aren’t people doing it?
STEFAN THOMKE: Well there’s lots of reasons why people are not doing it at scale, especially. So some people are sort of running simple experiments – because they refer to an experiment as something like a trial. We’re trying something. That’s not really an experiment sort of in the scientific sense. And they don’t do many of those because they either don’t have the infrastructure to run many tests. They may not have the tools sort of to do so. It may be too expensive to run it. And then they may decide that listen, we run a test and we get some results and then nobody listens to us anyway.
CURT NICKISCH: Right. Do managers overestimate the downside to experiments and underestimate the upside?
STEFAN THOMKE: I think sometimes they are too overly concerned about the risk of running the experiment. For good reasons – you have a lot of traffic. You may not want to launch something that results in a loss of customers visiting your website for example.
CURT NICKISCH: Right, if it goes down.
STEFAN THOMKE: If it goes down. And so if you don’t have good stopping rules, kill switches and things like that in place, that’s a real risk. And then maybe it’s risk aversion; it’s also stepping into the unknown. And quite honestly, it takes humility to admit that I just don’t know. Walking into a meeting and we’re launching this thing and everybody has some hypothesis about what the outcome’s going to look like. And just going to the meeting and telling everybody, listen, quite honestly I don’t know what’s going to happen, so let’s just find out.
CURT NICKISCH: Even though I get paid more and I’m in charge, I don’t know either.
STEFAN THOMKE: Exactly. And the higher up you go, the more you get paid. The more senior you get, you get paid to make tough decisions. And you want to be the decision maker. And you got to create sort of an organization that ticks a little differently so to do this sort of thing.
By the way, it’s not just the online world, it’s also the physical world where companies are running experiments and even there we have to make big decisions. Sometimes very expensive decisions and it’s the experiments that can in fact, adjudicate whether we want to do something or not.
Kohl’s – you know, big retailer and so forth. So Kohl’s hires a consulting company and the consulting company basically does a cost analysis and they go to senior management and tell them, listen, we figured out that you can save a lot of money if you open your stores an hour later. Now here you are. You’re running this company and you have to make a decision. Should we do that? Calculating the cost savings is easy. Because you can pretty quickly figure this out. But the big question is, what’s actually going to happen to our revenue? Are customers going to buy less if we open an hour later? So how do you make these kinds of decisions?
We can analyze and analyze, but we won’t know until we actually do it, until we run the test. And in this case they did. And so they ran controlled experiments in which they sort of setup these tests, opening an hour later and lo and behold, at the end the result was it didn’t make much difference.
CURT NICKISCH: Just so we’re on the same page, how do you go about setting up an experiment? Are there playbooks for this?
STEFAN THOMKE: Well, first of all there are tools. A lot of the companies that I describe in the book built their own infrastructure, built their own tools because when they got started many years ago, the tools weren’t around. So you look at an Amazon or Microsoft and Netflix, a Booking.com, I mean you go through them and there’s about a dozen or so. They decided to do it themselves.
CURT NICKISCH: So they just, they knew that they had questions they wanted to answer and they just figured out a way to do it.
STEFAN THOMKE: They figured this was going to give them a competitive advantage if they could go out and just test a lot. And they knew that they often get it wrong, and so they started investing in infrastructure. At a place like Microsoft for example, you have a very, very large group that basically runs the infrastructure. The last time I checked, it was something like 85, 90 people or so that are just doing infrastructure.
But the good thing that happened a few years ago is there are now third party tools as well that can do this, both in the online spaces and in the brick and mortar spaces. Which do sort of a lot of the heavy lifting for you. A lot of the statistical stuff and so forth. And so, so it’s gotten a lot easier than say if you wanted to start say five or 10 years ago.
CURT NICKISCH: Developing a culture for this is probably a little bit different?
STEFAN THOMKE: I think it may be potentially harder than getting the tools and building the tools because now we’re dealing with behaviors, with beliefs, with norms and all sorts of things.
CURT NICKISCH: How does this show up in companies if the culture for experimentation is not working? What do you actually see and observe?
STEFAN THOMKE: Well the classical example is they start running experiments. We have the experiment. We hand over the results to the group that asks us to run the experiment, and then nothing happens. Or, they will start to challenge the experiments. Something must have gone wrong.
I remember a story where an angry person actually called one of the tool vendors in this space and complained about the tool being wrong. The person ran an experiment, and the experiment showed that if you give customers less choice in his setting, you get better performance. And that was just counterintuitive because everything that he believed up to this point was that you should give people more choices.
And so he was really disturbed by the finding and so he called them and complained that there’s a flaw in the tool. Something in the tool must be wrong because the result doesn’t match the experience that he’s had and he’s been doing this for a long time. And so, you run into that sort of thing.
CURT NICKISCH: Which kind of underlines your point that experiments bring new insights that you just can’t develop on your own.
STEFAN THOMKE: Correct. There’s a company called Booking.com which most of us use. In fact it’s the biggest accommodations platform in the world. More than 1.5 million room nights are booked on the platform each day. It’s a two-sided platform. This is what we call it. It’s got suppliers on one side which are hotel operators for example. And of course, it’s got customers like us on the other side.
And Booking.com runs a massive number of experiments. My estimate, and they told me I’m probably on the low side, is over 30,000 experiments a year. And it’s a really fascinating company. It’s also a highly successful company. Their gross profits are in the high 90s percent. And they don’t really have any assets. They don’t really own any accommodations. So it’s a super competitive industry too.
And so how do they get away with this? And the answer to this is they run a lot of experiments. And they created an experimentation culture, where almost running experiments is like breathing. You kind of do it every single day. I mean you have to, Curt you have to think about the numbers here. Even if I’m running a low number of experiments, I mean they’re running more than 100 new experiments a day. You have to have an organization that can even come up with so many hypotheses.
CURT NICKISCH: I mean you mentioned the number of transactions that Booking.com does in a day. How key is that to being able to run experiments? Does that also work for places that just don’t have data like that?
STEFAN THOMKE: Yes, it works for places that also have a lot less traffic. The underlying math changes, sort of what you have to do algorithmically is very different. In fact, if you have very large sample sizes, a lot of traffic for example, you can really fine tune. You can sort of do very, very small changes and you can kind of pick up whether that change actually causes something to happen. As your sample size shrinks, you’re going to have to go for bigger changes. We call it the power of an experiment. You have to power an experiment. Statistical power. And so, I recommend for companies that are sort of smaller that maybe they kind of run experiments that are a little bigger.
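To make the point about statistical power concrete, here is a minimal sketch of the tradeoff between traffic and the size of change you can detect. It assumes a simple two-variant A/B test on a conversion rate and uses the standard normal-approximation formula for comparing two proportions. The function name, the 5 percent baseline rate, and the lift values are illustrative assumptions, not figures from the interview or from any company mentioned in it.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline_rate, relative_lift, alpha=0.05, power=0.8):
    """Approximate visitors needed in EACH variant of a two-sided A/B test.

    Hypothetical helper for illustration; uses the normal approximation
    for the difference between two conversion rates.
    """
    p1 = baseline_rate                        # control conversion rate
    p2 = baseline_rate * (1 + relative_lift)  # treatment rate if the change works
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# Assumed scenario: a 5% baseline conversion rate.
small_change = sample_size_per_arm(0.05, 0.01)  # detect a 1% relative lift
big_change = sample_size_per_arm(0.05, 0.20)    # detect a 20% relative lift

print(f"1% lift:  {small_change:,} visitors per variant")
print(f"20% lift: {big_change:,} visitors per variant")
```

Under these assumptions, detecting the small lift takes millions of visitors per variant, while the large lift takes only thousands. That is the arithmetic behind the advice that lower-traffic companies should test bigger changes.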
Now, what happens also and this is something that actually happened at IBM. When they started to do this they realized that they have way too many websites. So yes, they had very little traffic on some of these websites, but they didn’t need all the websites. So that actually led to a process of consolidation. They said listen, we don’t really need all these things so what we’ll do is we’ll consolidate, and we get sort of more traffic on fewer websites which then allows us to sort of run more experiments.
CURT NICKISCH: I wonder if there are companies or industries outside of consumer facing tech, or outside of scientific, or pharmaceutical companies where experimentation really feels foreign?
STEFAN THOMKE: Well, I mean, the classical companies I think are sort of in the creative industries where the assumption is that everything is driven by creatives. Look at entertainment for example. And look at what Netflix has done. So, Netflix kind of flipped it around and they operate in the creative industry, but they are completely experimentation driven. And I think it was a big wakeup call for the entertainment industry because when you go in and you run Netflix, you are part of their ecosystem, their experimentation ecosystem. They run a massive number of tests because they want to find out what works and doesn’t work. By the way, running the test and getting a result doesn’t mean that you have to blindly follow what the result is because sometimes there are good, strategic reasons why you may not want to implement what the test tells you.
CURT NICKISCH: Right. Or there are tradeoffs to whatever benefits —
STEFAN THOMKE: Or tradeoffs for example, or there may be a contractual violation or something like that. But what that test does is it actually adds transparency to the decision. So you cannot pretend that we’re doing this because it’s good for the customer, or something like that, or good for the viewer. It adds clarity to that. We understand from the tests what’s good for the viewer, but there may be other reasons why we may not want to do it. And adding that transparency to what you’re doing I think is a big value and allows a company like Netflix to operate really in the creative industry, with a testing approach.
I don’t want to diminish the value of creative talent because creative talent is really important, but that doesn’t create certainty in terms of decision making. To me the creative talent and the intuition is an important part of experimentation, because it allows us to create hypotheses. You have to ask yourself Curt, where do these hypotheses come from?
CURT NICKISCH: Yes, they’re from people. People asking questions or have some ideas, yeah.
STEFAN THOMKE: Absolutely. So what I’m saying is that running all these experiments, they all have hypotheses that came out of product groups and it’s the people who come up with these hypotheses and so where do they get the ideas? Well, it’s intuition sometimes. It’s insights, you know, customer surprises. Things that they thought were true and then they observe something that doesn’t quite fit what they know. It’s usability labs.
So there’s still, I mean these companies all run qualitative research and, but they do all the kinds of things that other companies do, but they do it for generating hypotheses which are then rigorously tested versus, other organizations that generate the hypotheses and go directly from hypotheses to launch.
CURT NICKISCH: Right. Based on whoever is the best public speaker, or makes the best case in a meeting rather than —
STEFAN THOMKE: Yeah, yeah, yeah it’s, there’s a word for that in the community called “hippos.”
CURT NICKISCH: Hippos?
STEFAN THOMKE: Yes. Highest paid person’s opinion. Hippos. And we all know that hippos are very dangerous animals.
CURT NICKISCH: I think a lot of executives are probably also not used to knowing how much experimentation to do. How do you know what to experiment on and how do you know what to let be?
STEFAN THOMKE: Yes. You have to empower people to make that decision. And the reality is right now, I think most organizations test too little. So I don’t think you should be too worried about testing too much. Yes, there is probably a point in which you test too much because you need an organization that can absorb all that knowledge, or all sort of that, all those findings that are generated by all these tests. That’s true. And we need to think about that. But I don’t think that’s a problem in most organizations right now. Right now they’re doing, not doing enough.
CURT NICKISCH: If you’re bringing this into a company do you try to do this companywide? Do you try to start with a team or a division and scale it up from there?
STEFAN THOMKE: So there are different ways to organize your experimentation teams. There are three models that I describe in the book. One model is really more a centralized approach. I basically have like a center, a group that’s responsible for experiments and they’re like a service organization, where you can come from a business unit, you can commission an experiment and they’ll run it for you and they give you the results.
CURT NICKISCH: Oh that’s interesting.
STEFAN THOMKE: That’s one model. And a lot of companies start out that way. Because they are kind of a little uncertain how this is all going to work out and they may not believe that the company’s ready to do this at large scale.
CURT NICKISCH: It probably simplifies training and it lets them dip their toe in without really having to —
STEFAN THOMKE: Exactly. And you have a few experts and they kind of make sure that people don’t do foolish things. Then another form is to have a decentralized, completely decentralized. So now we’re shifting the autonomy basically to people and allow pretty much anybody to run experiments and we don’t centralize it anymore.
And of course there you have to trust people. You have to know that they’re actually capable of doing this and it’s a way of course to rapidly scale things. But what happens there is, when you start to put all these, you spread all these sort of your experts around and they’re all the way sort of through the company, they get very busy and you kind of lose the focus on building capabilities because you need to always kind of get better and better. And so there’s no coordinated approach to this. Everybody kind of does their own thing.
So what companies have found is they go from centralized to decentralized and they want to scale things, but then they realize that they need to have a more coordinated approach and then they create something which I call a center of excellence. And the center of excellence is kind of a hybrid model then where you have sort of a core group that actually is responsible for developing capabilities, experimentation capabilities, kind of know what tools to use and push the envelope.
But at the same time you take people out of that group and then place them sort of into the different organizational units that are doing this and they are basically there to help as well. And companies found that that’s actually sort of a very good compromise because on one hand you kind of empower people to do things on their own and at the same time you actually have someone who centrally owns this capability as well.
CURT NICKISCH: How do you know when it’s really working?
STEFAN THOMKE: The way you know it’s really working, I think it’s a cultural test. And I’ll tell you, here’s the test. You sit in a meeting and you’re discussing a decision and you know it’s working either when someone asks, where’s the experiment, or when someone actually walks into the meeting and says, here is the experiment. When these kinds of discussions are happening every single day without you having to ask for these things, without you having to push for things, then you know things are kind of working.
I call it, it’s like running the numbers. When you go into a meeting you always expect people to do some financial analysis. It’s almost a given, right. So it has to be like that. It has to be like running a financial analysis. It has to be a given that you kind of do a test. You run an experiment and unless you’ve done it, you know, we’re not going to make a decision.
CURT NICKISCH: Say you’re an individual contributor. You may be a manager. You may be a frontline worker. But you buy into this. Like you see the value of experiments. You want your organization to do more. What do you do to try to bring a culture of experimentation to a place that is still relatively new to it?
STEFAN THOMKE: What you can do as an employee is first of all raise the awareness around you.
CURT NICKISCH: What does that mean?
STEFAN THOMKE: That means basically explaining to people what the value of experiments is. But then also, I think at the same time, maybe try to do some of these things in the areas that you control. You know, yes, I see the difficulties sometimes and I hear this from people saying OK, I get you. But they are two levels up. You know, I’m not sure that they do. So what can I do? So I always tell them start small. Get going, and this is often what happens. I’ve talked to organizations that actually started this way and then got bigger and bigger. They said, you know, we started out and we ran an experiment and we went to the meeting and we told people what the experiments showed us and so forth and they kind of listened to it, and they gradually started to understand the value of it. But you got to get started. Don’t wait.
CURT NICKISCH: What kind of manager is then the successful manager in a company that has a culture of experimentation? Because in the past maybe, it used to be people who had experience, people who had intuition. Now when you run experiments, what is the type of manager who excels and advances in an organization that has a culture of experimentation?
STEFAN THOMKE: So, you can ask the question, if everything is adjudicated by experiments, by tests, what’s the role of the manager anyway? I kind of break it down into sort of three different things that they should do.
The first role I think of a manager is to set a grand challenge. What we don’t want is to have an organization that just does experiments willy-nilly with no direction. So there needs to be a grand challenge. A grand challenge for example could be we want to have the best user experience in the industry. And that grand challenge then can be broken down into different pieces which then can be addressed with hypotheses which are then tested. So you give them a direction. There needs to be a program, a systematic program that aims for some bigger goal. So that’s the grand challenge.
The second thing I think that managers need to do, especially in this kind of environment, is put in place the systems, resources and organizational designs that allow for large scale experimentation to happen. You know, things like that don’t happen by themselves. You need to invest in tools. You need to make sure that you’ve got the right organizational design to start out with and maybe then change it when things don’t work. So you have to think about that as well. And you need to make sure that all the systems are in place, so someone like that employee at Microsoft can just kind of push a button and launch and run this thing. If it takes employees weeks and weeks to set up an experiment, what are the odds of them doing it at large scale? It’s not going to happen. So you got to make it easy as well and you need to empower people to do it. You need to democratize experiments.
And the third one is they need to be a role model. They need to live by the same rules. So if we ask our employees to test, to experiment before they make a decision, we need to live by the same rules ourselves. So when we go into a meeting and we propose a course of action and someone says, that’s really nice. We’ll run a test and let you know what happens, we need to then have the humility to say, OK, well thank you. Let’s do it and let’s do it quickly. So, we need to live the same way. We need to kind of do the same thing that we ask our employees to do. So that’s a different style of leading.
CURT NICKISCH: Stefan, thank you so much. Maybe we’ll try some experimentation on this show as well.
STEFAN THOMKE: Thank you. Great to be here.
CURT NICKISCH: Stefan Thomke is a professor at Harvard Business School. He’s the author of the book Experimentation Works: The Surprising Power of Business Experiments, as well as the HBR article, “Building a Culture of Experimentation.” You can find it in the March-April 2020 issue of Harvard Business Review and at HBR.org.
This episode was produced by Mary Dooe. We get technical help from Rob Eckhardt. Adam Buchholz is our audio product manager. Thanks for listening to the HBR IdeaCast. I’m Curt Nickisch.
How to Set Up — and Learn — from Experiments - Harvard Business Review