How Big Companies Lose

The optimistic title of this post would have been “How Small Companies Win,” but I’m Scottish, and so the pessimistic title it shall be.

To some, particularly those in tech, there’s a belief that the average big company “doesn’t get it.” The startup sees the world in a way that the large company does not. The startup sees a large market, a catatonic customer base ready to be awakened, and a host of large incumbents ready to be upended – the fat cats will get their comeuppance and won’t know what hit ’em.

However, as with most things, I’d argue it’s a bit more nuanced than that. Senior executives (at least the ones I’ve known) of large companies tend to be keenly aware of the macro conditions surrounding their business. They care deeply about their business. They obsess about the health of their business and they’re no less formidably educated, talented, or hungry than any startup counterpart. In fact, more often than not, their experience at large companies has afforded them a network that gives them access to the brightest minds and exposure to the latest ideas and trends in their industry. Given the available resources of most large companies, these executives have the latitude to chart a course that most startups can only fantasize about.

So, given these advantages, why is it that we see large companies repeatedly stumble and startups make deep inroads into heretofore impenetrable markets?

I’d argue there are four primary reasons why big companies lose (and some secondary ones too):

  1. Focus – Large companies are often attacking and defending on multiple fronts. In any such approach, it’s inevitable that the best resources are diluted across those efforts. With startups, they tend to be (at least initially) singularly focused and that makes them dangerous.
  2. Risk – Startups have everything to gain; publicly traded companies have everything to lose. Seemingly trivial issues can become major thorns, and the tendency is toward risk avoidance rather than risk mitigation, particularly as you traverse down from the execs into the broader org at large.
  3. Heritage – Heritage needn’t be an abstract term to politely describe those long-since-forgotten values of a bygone era. Rather, heritage can be something tangible. Something you’re famous for. Something that means, when everything is going to hell in a handbasket, you don’t panic: you know what you’re about, and you’re not about to lurch. However, heritage can also mean being encumbered with the multiplicative choices of the big company’s forebears, and nobody ever bats 1.000…
  4. Talent – Inevitably, within a large org, it’s impossible to stack it only with A-players. Or, said more accurately, I’ve yet to observe a large company with only A-players. Often with no major financial upside, a risk avoidance culture, and perhaps a difficult legacy to be inherited, up-and-comers understandably may be reticent to join the ranks of a big co, and ambitious and talented employees may feel their best chance of success lies outside the big co.

How big companies win is by addressing these issues head on. Acknowledging that every minute of the day, one or more startups are actively plotting to take their customers – to carve out a large piece of what was once “theirs” – is a necessary starting point. That feeling has to be shared not just among the most senior executives but throughout the entire org. After all, the impermanence of success is the only permanent condition.

Time To Level Up

Earlier this year, I had the good fortune to meet Marc Strigel, COO of SoundCloud. Marc’s one of those rare folks who immediately strikes you as perpetually curious, always deconstructing and reframing what he thinks he knows, stress-testing those constructs in conversations with others, and giving you a lot to think about. During one of our conversations, he described the culture at SoundCloud, and one aspect of it stuck with me throughout the summer – the notion of leveling up.

From their site, here’s how SoundCloud describes it:

We use a term to describe how we’re constantly seeking to improve: “Level Up”. It’s about being self-motivated and challenging each other at every moment.

In America at least, we lionize people who push themselves to the absolute limit. Indeed, I used to be one of those people myself. In college, I ran Division I track, and the mind games I played to convince myself that I was stronger than my opponents, willing to push myself past the point of exhaustion to prevail, were, in hindsight, the product of unadulterated obsession. Over time, this mentality comes to define who you perceive yourself to be and how you outwardly project yourself. The willingness to endure pain becomes a fact of life, and as your ability progresses, so too does your pain threshold; the addictive nature of it takes over, and the cycle repeats ad infinitum.

…But, that was 20 years ago. As I ran along the Ship Canal in Seattle last night, I ran with nothing but love and happiness. Did I push myself to the limit? Not even close. Indeed, I think now, I’m not looking to push any single part of my life to the obsessive limit. Instead, I’m trying to push all parts of my life to the point of happiness, or equilibrium, and that goes for my professional life too.

This week, I found happiness in figuring out how to write some site scraping code of all things. I found happiness in getting back the results of a formative product my team has been working on and seeing the enthusiasm it generated within the organization. I found happiness in a really interesting project I’ve been contributing to with a friend. I found happiness in writing.

At the time I spoke with Marc about Leveling Up, I mostly thought about the pressure on individuals in an organization to constantly push themselves to the limit, and the potential for that to manifest itself negatively. SoundCloud is right though – the key really is self-motivation, as opposed to external organizational pressure or an internal fear of failure. The role of a leader in such an environment is not to create a culture of negative up-or-out energy, but to create the positive conditions that allow and encourage everyone to push themselves to the point of happiness. And real happiness comes not with rote repetition, stagnation, or fear, but with enlightened growth in all the areas of your life that are important to you.

This weekend, I’ll be out there again teaching myself to fish for trout in some mountain lake, fumbling around with lures, casting bubbles, and leader lines – stuff I honestly had no clue about 3 weeks ago, but my kids will surely appreciate it, and that makes it important to me. Time to Level Up!


Business Intelligence Is Dead

The human brain constitutes about 2% of our body mass, but consumes as much as 20% of our energy. As an intermediary in decision-making, not only is the human brain inefficient, it’s also often wrong, as discussed at length in Daniel Kahneman’s book Thinking, Fast and Slow.

At the same time, computation is faster and cheaper than ever with no end to improvement on the immediate horizon. Combined with the rise of the Open Source movement, this has brought us to the point where machine learning and other artificial intelligence techniques are no longer the domain of obscure research papers but are now table stakes for most apps being built today.

In light of these advances and our inherent human fallibility, any business software that does not currently “close the loop” between analysis and action is essentially dead – a piece of zombie code used by people living in the past who rely on the output of these systems to prop up their vainglorious assertions with a convenient percentage here and a tortured average there.

The promise of Business Intelligence was always supposed to be about better decision-making, but more often than not, it provided nothing more than a prop for people to continue to decide emotionally and justify intellectually – a conversation piece, and an expensive one at that.

Consequently, the value of today’s Business Intelligence tools is essentially being discounted down to zero because of the humans who consume the output, belabor its meaning, and, at some point long after the fact, make a decision that has a not-insignificant probability of being wrong.

For enterprise SaaS companies being founded today, the basic act of reporting is a service they will give away for free, because they understand that gathering & integrating data, labeling data, creating metadata, and performing transformations are only an intermediate step towards action within a closed-loop system, which is where the real value lies. Within such services, reporting of aggregate descriptive data is a convenient by-product, one that can be given away for free at relatively little incremental cost. Just as big-box retailers have their door-busting deals on soda, SaaS companies can afford to lead with free reporting, knowing they’ll make it up and then some with much more valuable services (and ones that lend themselves better to pay-for-performance subscriptions).
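To make the contrast concrete, here is a minimal sketch of the structural difference between reporting and closing the loop. Everything in it – the metric, the thresholds, the bid adjustments – is invented for illustration; the point is the shape of the two functions, not the numbers.

```python
# Illustrative sketch only: the metric, thresholds, and actions are
# invented. The point is purely structural.

def report(conversion_rate):
    # Classic BI: emit a number for a human to belabor in a meeting.
    return "Conversion rate this week: %.1f%%" % (conversion_rate * 100)

def closed_loop(conversion_rate, current_bid):
    # Closed loop: the analysis flows straight into an action,
    # with no human intermediary in between.
    if conversion_rate < 0.02:
        return round(current_bid * 0.9, 2)  # demand is soft; lower the ad bid
    if conversion_rate > 0.05:
        return round(current_bid * 1.1, 2)  # demand is strong; bid up
    return current_bid

print(report(0.034))            # a report invites a conversation...
print(closed_loop(0.01, 1.50))  # ...a closed loop just acts
```

The first function produces a conversation piece; the second produces a decision.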

If you are in the reporting business, be it sales data, social data, web data, or any other variety of data, and you haven’t transitioned toward something further up the value chain, I would be very worried about what’s to come. BI is already dead; it’s just a matter of time before the markets acknowledge that en masse and value it accordingly.

p.s. I spent a fantastic 4.5 years of my career with MicroStrategy in the mid-2000s. During my time there, I would often hear Michael Saylor, the CEO of MicroStrategy, make sweeping declarative statements about the future, delivered with such conviction that they were taken as fact by most in attendance. With much affection, this post is partly for him. What’s up Mike?!

A Word On Segmentation

This was supposed to be a post on the rising cacophony around segmentation, but instead, the Google Trends data for “segmentation” quickly disabused me of any such notion.


For good measure, I also checked to see how “clustering” is trending, and it’s not any better…


Sadly, even a specific algorithm like k-means isn’t enjoying a halo effect from the sexiness surrounding all things data sciencey…


It’s also of little comfort to see that k-means is 5X more popular among our Asian counterparts as compared to the U.S.


So, what’s the point? Well, the point is this: when lots of people across the spectrum of technical competence in a company discuss segmentation, an inevitable gulf opens up between the primarily creative practitioners, who think of segments in terms of customer demographics and seek to create archetypes / personas to represent those customers, and those who work in a technical capacity and think primarily in terms of customer behavior.

Being able to bridge that gulf in thinking such that the creative parts of an organization can embrace a systematic, behaviorally-driven approach to segmentation has to be an important goal of any data scientist / statistician working on this problem. Sometimes, the ability to describe your work is more important than the work itself…
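For the technically inclined, a behavioral segmentation can be as simple as running k-means over a few behavioral features per customer. Here is a minimal sketch; the customer data, the features (visits per month, average order value), and the deterministic initialisation are all invented for illustration, and a real analysis would reach for scikit-learn with multiple random restarts.

```python
# Minimal k-means over two synthetic behavioral features:
# (visits per month, average order value). Data is invented.

def kmeans(points, k, iterations=10):
    # Deterministic init for reproducibility: take the first k points.
    # (Real k-means uses random initialisation with several restarts.)
    centroids = [tuple(p) for p in points[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each customer joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:
                centroids[i] = tuple(sum(dim) / float(len(cluster)) for dim in zip(*cluster))
    return centroids, clusters

# Two obvious behavioral segments: occasional low spenders
# versus frequent high spenders.
customers = [(1, 20), (2, 25), (1, 30), (12, 200), (11, 180), (13, 220)]
centroids, clusters = kmeans(customers, k=2)
print(centroids)  # one centroid per behavioral segment
```

The centroids are what you hand to the creative side of the house: each one is the “average customer” of a behaviorally defined segment, a far sturdier starting point for a persona than demographics alone.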

Per Unit Economics: Dogs Per Night

Last week, I read an article about Rover, a dog-sitting marketplace, receiving a third round of funding of $12MM, bringing their total funding to date to $25MM.

I didn’t give this much thought until later that night, when I started to think a bit more about the per unit economics of such a business. The article talked about their growth from 10,000 sitters to 25,000 sitters in 2013, it discussed their eightfold increase in revenue between 2012 and 2013, and it noted that sitters charge between $20 and $40 per night per dog, with Rover taking a 15% cut.

When you look at a marketplace like that, Rover has quite a lot of levers it can pull:

  • Growth
    • Grow the number of sitters
    • Grow the number of dog owners
    • Increase the number of active users
    • Expand into other pet categories
  • Revenue Optimization
    • Charge more than $20 – $40
    • Take more than a 15% cut

At this stage of the company, and with $25MM in funding, presumably most of their focus is on growing the service in their core category. So, what might that look like?

2012 Rover Revenue

Well, if 1% of Rover’s 10,000 sitters in 2012 were active on any given night, and each sat one dog, Rover would have earned between $110,000 and $220,000 on an annualized basis. If we swag Rover’s run rate at $7MM annually*, then with 40% – 60% of sitters active they’d have 4,000 – 6,000 dogs per night, and Rover would hit break-even, depending on whether sitters are charging closer to $20 or $40 per night.
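For the curious, the arithmetic works out like this. The sitter count, price range, and take rate come from the article; the 1% activity rate and the ~$7MM run rate are my guesses, as noted above.

```python
# Back-of-the-envelope check of the figures above.
sitters = 10000       # 2012 sitter count from the article
active_rate = 0.01    # assume 1% of sitters host one dog on a given night
take_rate = 0.15      # Rover's cut of each booking

for nightly_price in (20, 40):
    dogs_per_night = sitters * active_rate
    annual_revenue = dogs_per_night * nightly_price * take_rate * 365
    print("$%d/night -> $%d per year" % (nightly_price, annual_revenue))

# Break-even against a swagged ~$7MM annual run rate:
run_rate = 7000000.0
for nightly_price in (20, 40):
    dogs_needed = run_rate / (nightly_price * take_rate * 365)
    print("At $%d/night, break-even needs ~%d dogs per night" % (nightly_price, dogs_needed))
```

That gives roughly $110K – $220K of annualized 2012 revenue at a 1% activity rate, and a break-even of roughly 3,200 – 6,400 dogs per night against a $7MM run rate, consistent with the 4,000 – 6,000 range above.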

So, as Rover grows throughout 2014 and beyond, one metric I expect they’re paying attention to is dogs per night. At what point does the business look primed for an exit? My guess is somewhere around the 15,000 dogs per night mark. With an estimated 153,000 dogs in Seattle alone and 47% of households nationwide owning at least one dog, 15,000 dogs per night nationwide seems plausible, doesn’t it?


*So, what’s Rover’s run rate? Short answer, I don’t know. Speculative answer? Well, if we take Rover’s 43 employees and assume that, fully burdened, they cost an average of $100K per annum (though perhaps it’s more – it’s Seattle after all), then that puts Rover’s payroll at ~$4.3MM. Then we have to add in other operating costs and, of course, the cost of acquisition of sitters and pet owners. All told, I’m guessing their run rate is between $6MM – $7MM, which would mean this latest round of funding gave them another couple of years to work towards an exit…

IPython Notebooks And PuLP

Lately, we’ve begun working on various constrained optimization problems, and this was a good opportunity to use Python – more specifically, IPython Notebooks, which I’ve been learning about in recent days.

Assuming you’ve got IPython already working, try typing:

ipython notebook

If it works, great. If not, you may have some dependent python libraries to install e.g.

pip install pyzmq
pip install tornado

Once you’ve got the dependent libraries installed, typing ipython notebook should bring up something like this in your browser:

ipython notebook screenshot

Ok, with that out of the way, I’ve been using a library called PuLP to test out various optimization problems, and so far so good. There’s also a good introduction to PuLP with examples that you can follow. I recreated those examples in a notebook, and to show them in this blog post, I first had to convert the notebook to HTML format using nbconvert (you may need to pip install pygments to get it working):

ipython nbconvert tutorial.ipynb

I then had the option of hosting the HTML page somewhere (WordPress doesn’t seem to like the HTML page), or simply using nbviewer, which is what I did. In that case, you just pass in the URL of your .ipynb file and it creates a viewing-friendly version of it.

Alternatively, you can just use WordPress’ code block functionality and paste in your code:

# Whiskas optimization problem
import pulp

# initialise the model
whiskas_model = pulp.LpProblem('The Whiskas Problem', pulp.LpMinimize)

# make a list of ingredients
ingredients = ['chicken', 'beef', 'mutton', 'rice', 'wheat', 'gel']

# create a dictionary of pulp variables with keys from ingredients
# (the default lower bound is -inf, so set it to 0)
x = pulp.LpVariable.dict('x_%s', ingredients, lowBound=0)

# cost data
cost = dict(zip(ingredients, [0.013, 0.008, 0.010, 0.002, 0.005, 0.001]))

# create the objective
whiskas_model += sum([cost[i] * x[i] for i in ingredients])

# ingredient parameters
protein = dict(zip(ingredients, [0.100, 0.200, 0.150, 0.000, 0.040, 0.000]))
fat = dict(zip(ingredients, [0.080, 0.100, 0.110, 0.010, 0.010, 0.000]))
fibre = dict(zip(ingredients, [0.001, 0.005, 0.003, 0.100, 0.150, 0.000]))
salt = dict(zip(ingredients, [0.002, 0.005, 0.007, 0.002, 0.008, 0.000]))

# note these are constraints and not an objective, as there is an equality/inequality
whiskas_model += sum([protein[i] * x[i] for i in ingredients]) >= 8.0
whiskas_model += sum([fat[i] * x[i] for i in ingredients]) >= 6.0
whiskas_model += sum([fibre[i] * x[i] for i in ingredients]) <= 2.0
whiskas_model += sum([salt[i] * x[i] for i in ingredients]) <= 0.4
# each can holds 100 grams in total
whiskas_model += sum([x[i] for i in ingredients]) == 100

# the problem is then solved with the default solver
whiskas_model.solve()

# print the result
for ingredient in ingredients:
    print 'The mass of %s is %s grams per can' % (ingredient, x[ingredient].value())

The next one is called the beer distribution problem. And no, drinking them all is not the answer…

# The Beer Distribution Problem for the PuLP Modeller
# Import PuLP modeler functions
import pulp

# Creates a list of all the supply nodes
warehouses = ["A", "B"]

# Creates a dictionary for the number of units of supply for each supply node
supply = {"A": 1000,
          "B": 4000}

# Creates a list of all demand nodes
bars = ["1", "2", "3", "4", "5"]

# Creates a dictionary for the number of units of demand for each demand node
demand = {"1": 500,
          "2": 900,
          "3": 1800,
          "4": 200,
          "5": 700}

# Creates a list of costs of each transportation path
costs = [  # Bars
           # 1  2  3  4  5
           [2, 4, 5, 2, 1],   # A   Warehouses
           [3, 1, 3, 2, 3]]   # B

# The cost data is made into a dictionary
costs = pulp.makeDict([warehouses, bars], costs, 0)

# Creates the 'prob' variable to contain the problem data
prob = pulp.LpProblem("Beer Distribution Problem", pulp.LpMinimize)

# Creates a list of tuples containing all the possible routes for transport
routes = [(w, b) for w in warehouses for b in bars]

# A dictionary called x is created to contain quantity shipped on the routes
x = pulp.LpVariable.dicts("route", (warehouses, bars), lowBound=0, cat=pulp.LpInteger)

# The objective function is added to 'prob' first
prob += sum([x[w][b] * costs[w][b] for (w, b) in routes]), "Sum_of_Transporting_Costs"

# Supply maximum constraints are added to prob for each supply node (warehouse)
for w in warehouses:
    prob += sum([x[w][b] for b in bars]) <= supply[w], "Sum_of_Products_out_of_Warehouse_%s" % w

# Demand minimum constraints are added to prob for each demand node (bar)
for b in bars:
    prob += sum([x[w][b] for w in warehouses]) >= demand[b], "Sum_of_Products_into_Bar_%s" % b

# The problem data is written to an .lp file
prob.writeLP("BeerDistributionProblem.lp")

# The problem is solved using PuLP's choice of Solver
prob.solve()

# The status of the solution is printed to the screen
print "Status:", pulp.LpStatus[prob.status]

# Each of the variables is printed with its resolved optimum value
for v in prob.variables():
    print v.name, "=", v.varValue

# The optimised objective function value is printed to the screen
print "Total Cost of Transportation = ", prob.objective.value()

Running it produces:
Status: Optimal
route_A_1 = 300.0
route_A_2 = 0.0
route_A_3 = 0.0
route_A_4 = 0.0
route_A_5 = 700.0
route_B_1 = 200.0
route_B_2 = 900.0
route_B_3 = 1800.0
route_B_4 = 200.0
route_B_5 = 0.0
Total Cost of Transportation =  8600.0

RampUp 2014 Recap

Babbage Difference Engine

Yesterday, I had the opportunity to attend LiveRamp’s ad-tech conference, held at the Computer History Museum in Mountain View. Below is a recap of the sessions I attended.

Opening Keynote

Google’s $100M man, Neal Mohan, kicked off the day by talking about:

  • Multi-device users and the implications that has for ad formats, ad measurement and attribution.
  • Incorporating user choice into the ads, citing YouTube’s TrueView feature in particular, where users have choice over which ads to watch, and advertisers only pay when the ad is actually viewed.
  • Combating ad fraud, both on the inventory side and on the buy side.
  • A focus on brand measurement. This one’s funny, given that search is entirely a performance-based advertising medium. However, there’s a growing argument being made in Silicon Valley that brands should be measuring their display campaigns not through a direct-response lens but instead through a traditional TV advertising lens of things such as brand recall and awareness. It reminded me of Instagram adopting the same position last year. This is in contrast with Pat Connolly, the CMO of Williams-Sonoma, who also spoke on a panel and asserted that they’re entirely a performance marketer. Will brands actually buy into traditional media metrics for their digital spend? I think that remains an open question.

The one thing that Neal stressed time and again was the focus on trying to do what makes sense for the users. An example was when someone in the audience asked about injecting display ads into messaging apps, to which he gave a measured reply that they’d only consider doing something like that if there was a logical context for doing so.

Convergence of Offline and Online Data

This panel featured Rick Erwin of Experian, Scott Howe of Acxiom, and Dave Jakubowski of Neustar. The main theme that emerged from this conversation was what the panel referred to as entity resolution: cross-device identification, multi-source 1st party customer data such as sales and customer service, and 3rd party data appends.

Regarding advertising on a particular channel, one of the panelists made the point that brands need to be thinking about owning the experience versus just owning the moment. A good reminder to not think in terms of email, mobile, desktop, etc. but instead think about the customer’s use case. I’ve seen this a lot where there are various vendors out there who’ll help with one use case of one channel, and the result is that the customer receives a disjointed experience when interacting across a variety of channels (and a variety of vendors).

TV advertising also cropped up, and particularly around addressable TV. Back when I worked on in-store TV networks around 2009, this was something people were beginning to explore and while it’s still early, it’s almost certainly just a matter of time before digital and TV campaigns are targeted to individual users. This also bleeds into dynamic creative which was something of a recurring theme. As more ad inventory becomes programmatic, it stands to reason that TV will eventually follow suit both in terms of RTB and dynamic creative.

Measurement was also something that came up, particularly in terms of digital advertising’s effect on in-store sales. This was something we did at DS-IQ circa 2010 and it’s strange to hear companies only now starting to develop scalable solutions in this area.

How Top Brands Use Data OnBoarding Today

This panel featured Brandon Bethea from Adaptive Audience, Nikhil Raj from Walmart Labs, and Tony Zito from Rakuten MediaForge. A couple of things stood out in this talk. The first was the depth of Walmart’s planning with their CPG suppliers. Nikhil naturally didn’t offer much detail but one could make a reasonable assumption that there’s a large amount of data sharing that takes place between the retailer and the brands. This has all kinds of advantages in terms of developing an understanding of the customer, and in terms of improving marketing outcomes for both brand and retailer marketing campaigns. It’s not clear if there’s a formal data exchange platform that’s common among Walmart and the brands but that’d certainly make a lot of sense.

The other thing that was discussed was the notion of lookalike modeling and also ad suppression. Again, nothing new, but simply a reflection of what was on their mind.

Data-Driven Retail Marketing Strategies

Panelists were Benny Arbel of myThings, Ryan Bonifacino of Alex & Ani, and Jared Montblanc of Nokia. Of note were some of the interesting things Alex & Ani is doing around targeting for re-targeting campaigns and custom audience campaigns on Facebook through Kenshoo. Ryan cited Facebook as having been a good vehicle for new customer acquisition.

Jared of Nokia discussed how they evaluate their digital spend through the lens of Cost per High Quality Engagement. This makes sense in his world where Nokia is selling their devices through carrier partners. So, when they run a campaign, did a user not just click on a video but actually watch it, for example?

From the CMO: The Future of Data In Marketing

This was one of the highlights for me, where Pat Connolly of Williams-Sonoma talked with Kirthi Kalyanam of Santa Clara University.

Observation 1: Connolly is one of those self-effacing, humble execs who could easily be dismissed as old school based on appearances, and you’d be drawing the completely incorrect conclusion. The guy has been with Williams-Sonoma for 35 years, going back to when they were strictly a catalog retailer; he’s demonstrably smart and has obvious command of some pretty technical details. For example, how many CMOs have you heard comfortably discuss technologies such as Hadoop, Teradata, and Aster in one breath and then discuss hazards modeling in the context of attribution in the next? To my knowledge, the only vendor doing survival analysis at scale is DataSong, and for Connolly to be in the weeds there was impressive.

Speaking of being in the weeds, Williams-Sonoma has a monthly marketing investment meeting that the CEO attends where junior analysts are presenting the details of various marketing campaigns. Talk about alignment – between direct (.com), marketing, and merch. Impressive.

Some other nuggets:

  • They can identify 50% – 60% of all web visitors, and aim to serve up recs in under 40ms. That Connolly can recite the 40ms SLA made me smile, particularly since this is something we live and breathe in the Data Lab.
  • They do about $2B in eCommerce and believe they’re the most profitable ecommerce retailer in the country.
  • There are ~100 variables in their regression models, but just one variable has 70% of the predictive value.
  • With a simple A/B test of making their Add to Cart button bigger, they added $20MM in incremental demand.
  • They can currently identify 30% of users across devices with a goal of 60% by the end of the year.
  • They consider tablet as their core site experience with desktop being simply a bigger version of tablet.
  • They consider their competitive advantage to be org alignment between merch, marketing, and direct. I wouldn’t disagree, knowing how difficult this can be.
  • They allow ad cost for acquiring new customers to be higher. This was simply a good example of enlightened decision-making, where they aren’t trying to maximize ROAS on every single digital campaign.
  • While their paid marketing is entirely performance-driven, their owned media such as their blog is allowed to be more brand focused. Pat cited their West Elm brand which rarely features an actual product.

Measuring Digital Marketing’s Impact On In-Store Sales

Michael Feldman of Google, Gad Alon of Adometry, Kirthi Kalyanam of Santa Clara U, and Ben Whitmer of StageStores were featured panelists.

Gad mentioned that 40% of in-store demand is driven by digital media. Of that 40%, 70% could be attributed to display. I couldn’t find any data on the web to support this claim, and would be interested in hearing from others regarding this.

In-store measurement came up briefly, but I was surprised this wasn’t a bigger topic at the conference. Specifically, I’m talking about the kinds of things RetailNext, Nomi, and a host of others do.

Lastly, Peter Thiel was on deck for the closing talk. A summary of his discussion can be found here. There were a couple of salient points he made that I’ve been thinking about since. Will perhaps write more when I’m done digesting :-)

Final point – the overall quality of the sessions was by and large very good. It’s clear that there’s a ton of investment in this space and while some might argue strongly that there’s not a bubble, I’d at least say that were I starting a company right now, there’s no way it’d be in ad-tech – just too crowded and fragmented to be excited about.

Final final point – nice job by LiveRamp putting on this one-day conference. Next year, it’d be a lot more powerful if the panelists reflected the audience a bit more, i.e. I’m guessing half of the audience was female, yet throughout the day, I saw just one female panelist. This is something that’s really got me motivated to do something about.