Product-Market Crossfit

People can't agree on anything.

Have you noticed that things are rarely popular across different groups and generations? Social networks cater to different age groups. Even phone carriers are now branded for specific generations despite the fact that they basically all just carry data to your phone.

These divergences happen naturally. They have always done so. Things and people cluster into pockets based on tribal knowledge.

With products, sometimes the differences are an illusion. Some companies will sell the same product to different groups under different names. On top of allowing some tailoring of the sale experience, it makes it seem like there's more competition than there really is.

Desires of communities change over time. Great ideas are often imagined long before they succeed, and success frequently comes only after multiple failed attempts. Evolution is not always towards the better. Superior early iterations sometimes arrived in a place or time that just wasn't ready for them.

These things tell us something about markets. Like the people they're made of, markets are complex. Markets shift over time. Markets are path dependent. When you look into the details, similar markets often differ more than they seem to. This makes product-market fit an elusive concept. Some say it's about doing only a few things tailored for a well-defined user and getting them right. Others like to engineer complex, configurable solutions that can be adapted to broad and diverse users.

Last time, I discussed some aspects of product-market fit that were inspired by concepts of model-data fit found in information theory and machine learning. I mentioned these four concepts:

Low resolution fit: The product vaguely and simply fits its market.

High resolution fit: The product is complex and is tightly tailored to the market.

Overfit: The product is so tightly tailored to some users that it only fits a small part of the market.

Underfit: The product is so generic that it's missing important features and it's easy to copy.

These are types of biases that fall under a more general concept I like to call crossfit. It's inspired by the concepts of cross entropy or KL divergence.

Overfitting and underfitting are common fit shortcomings but having bad crossfit can also happen when the fit is biased in some other sub-optimal way. The product might fit a side market of a greater market, it might focus on aspects that would be more important in another market or it might use a design language that is better tailored to different people.

While overfitting and underfitting are the result of flawed use of good data, bad crossfit can happen when you use your data and knowledge optimally but that knowledge is biased to begin with.

Some of these biases can be extremely difficult to spot. Humans don't intuitively deal well with sampling bias. We have difficulty sorting out confusing paradoxes such as the Monty Hall Problem, Berkson's paradox, or Simpson's paradox.

Bad crossfit often happens when product teams with expertise in one domain apply their knowledge to a second domain without being sufficiently familiar with the differences between the two.

This sometimes happens when technically trained software developers (like myself) design user interfaces. My first UIs were very difficult to use for anyone but other software developers. It can happen to anyone, however, even to experienced designers. When entering a new field, we all tend to be blind to the relative importance of different details.

There's a cure for bad crossfit but before I get into that, let me explain why I think this is so fundamental.

Taking an information-theoretic stance may be the result of a bit of professional bias on my part (the irony isn't lost on me). However, it's a good way to get to the fundamentals.

Information theory is fundamental to everything

Let's jump way down to the fundamentals of the universe, that weird place we are barely familiar with, which is made of matter, energy and form. Form, the less commonly mentioned ingredient, is the most interesting one. It consists of the shapes and evolving arrangement of all that matter and energy.

Information theory and the related field of thermodynamics study the statistical limits of the evolution of these shapes. You've probably heard that these fields are about order and disorder. They study which patterns and correlations are possible to track, predict and reason about, and what knowledge is possible to learn and transfer without too much loss.

When these concepts were discovered, they unlocked technological revolutions. Sadi Carnot, the most steampunk of scientists, laid the foundations of thermodynamics and helped propel humanity through an industrial revolution by allowing for mathematically optimized steam engines. What Carnot did with gases and atoms, Claude Shannon, the "father of information theory", then did with bits, reusing some of the same equations and revolutionizing telecommunication and information technology. Shannon was one of a handful of mathematicians labeled cyberneticians, which maybe makes him the most cyberpunk of scientists. These two theories describe statistical forces fundamental to everything. What ties thermodynamics and information theory together is that knowledge and information are ultimately just ordered atoms: chemicals in our heads, in computers or in other materials (see also Maxwell's demon).

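Claude Shannon's book with Warren Weaver, The Mathematical Theory of Communication, includes an introductory essay by Weaver that starts like this: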

The word communication will be used here in a very broad sense to include all of the procedures by which one mind may affect another. This, of course, involves not only written and oral speech, but also music, the pictorial arts, the theater, the ballet, and in fact all human behavior.

Information theory is about how languages or "encodings" can best fit data, especially complex, uncertain data.

Having good crossfit is about speaking your users' language

Bear with me while I'm being overly technical just a bit further and try to convey how information theory applies here.

Let's talk about data compression. Compression algorithms use information about frequencies of patterns in data to create a set of symbols. The algorithms try to assign the largest and most common patterns in the data to the smallest symbols, leaving rare patterns assigned to the remaining larger symbols. The set of symbols ends up being like an optimal language for the data. The last step of compression is using these symbols to describe the data. Humans follow some of these patterns instinctively. We naturally pick shorter and simpler words for things we talk about more often. It may be an essential trait of human intelligence that we act as living compression algorithms.
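To make the idea of short symbols for common patterns concrete, here is a minimal sketch of Huffman coding, one classic way to build such a set of symbols. It is a simplification: it only looks at single characters rather than larger patterns, and the function names huffman_code and encode are just illustrative choices.

```python
import heapq
from collections import Counter


def huffman_code(text):
    """Return a dict mapping each character in text to a bit string."""
    counts = Counter(text)
    if not counts:
        return {}
    if len(counts) == 1:
        # A single distinct character still needs one bit per occurrence.
        return {next(iter(counts)): "0"}
    # Heap entries are (weight, tie_breaker, {character: code_so_far}).
    # The unique tie_breaker keeps Python from ever comparing the dicts.
    heap = [(n, i, {ch: ""}) for i, (ch, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two rarest groups; each merge adds one leading bit.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {ch: "0" + code for ch, code in left.items()}
        merged.update({ch: "1" + code for ch, code in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(text, code):
    """Describe text using the symbols in code."""
    return "".join(code[ch] for ch in text)


text = "aaaaaaaabbbc"
code = huffman_code(text)
bits = encode(text, code)
```

In this toy text, the very common "a" ends up with a one-bit symbol while the rarer "b" and "c" get two bits each, so the whole text takes 16 bits instead of the 24 that a fixed two bits per character would need.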

Information theory is about the limits of what you can do with this type of procedure, about how much non-redundant information you can squeeze into small amounts of data.

Cross entropy can loosely be stated as the number of bits you need when you use a set of symbols optimized for one dataset to compress a different dataset. The KL divergence is the extra number of bits this costs compared with using symbols optimized for the data you are actually compressing.
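As a rough numerical illustration, the sketch below computes these quantities for two made-up symbol distributions. The distributions p and q and the function names are assumptions chosen for the example, and the code assumes q gives a non-zero probability to every symbol that appears in p.

```python
import math


def entropy(p):
    """Average bits per symbol with a code optimized for p itself."""
    return -sum(pi * math.log2(pi) for pi in p.values() if pi > 0)


def cross_entropy(p, q):
    """Average bits per symbol when data follows p but the code was built for q."""
    return -sum(pi * math.log2(q[s]) for s, pi in p.items() if pi > 0)


def kl_divergence(p, q):
    """Extra bits per symbol paid for using the wrong code."""
    return cross_entropy(p, q) - entropy(p)


# Hypothetical symbol frequencies in two different "markets".
p = {"a": 0.5, "b": 0.25, "c": 0.25}   # the data we actually have
q = {"a": 0.25, "b": 0.25, "c": 0.5}   # the data the symbols were tuned for

best_case = entropy(p)           # 1.5 bits per symbol
mismatched = cross_entropy(p, q)  # 1.75 bits per symbol
penalty = kl_divergence(p, q)     # 0.25 extra bits per symbol
```

The penalty is zero only when the two distributions match, which is the information-theoretic version of speaking the same language as your users.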

Information theory ties intelligence, language and simplicity together

The technical definition of simplicity is that which can be described and understood with few bits or words.

The technical definition of intelligence is the capacity to make good predictions from limited data.

Information theory says that the more you can predict unknown data from partial known data, the fewer symbols you need to describe data.

Creating a good model and finding those few optimized symbols creates a well-tailored language that enables good compression.

An optimized language unlocks simplicity by thoroughly describing things in the fewest words or bits possible. It enables easier reasoning by avoiding having to remember and mentally manipulate overly lengthy descriptions. It makes good, actionable predictions readily achievable.

That's why it's important to have the right language based on the right model that fits the data and the domain as well as possible.

OK, I hope this wasn't too abstract. Back to product. The cure for bad crossfit is one that is oft-repeated in product circles: product teams should have lots and lots of direct contact with end users and feedback from them, while striving to put themselves in their users' shoes and learning the details of their world. It's basically about learning to speak their language, learning the relative frequency and relative importance of different concepts in their field and then using that well-tuned language to reason about and build products.

Always ask yourself, are you building with a design language that is easy to use for yourself or for your users? Usually these are not the same.

One danger when talking to users is overfitting to the users with the strongest voices. In machine learning, one way to guard against overfitting is something called cross-validation. Cross-validation just means training on different data than you test with. To maximize the use of your data, you might alternate which sub-parts of the data you use for testing and for training. With product ideas, this could mean that if you get an idea while working with user A, you implement and test this idea on user B and vice versa.

If you test a suggestion from user A on user A, you are likely to overfit on that user. If it's difficult to find a second user who would be interested, it's a clear sign you're overfitting on the first user.
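The alternating use of sub-parts of the data can be sketched as a simple k-fold split. This is only an illustration in plain Python, not a reference to any particular library; the function name k_fold_splits and the user names are hypothetical.

```python
def k_fold_splits(items, k):
    """Yield (train, test) pairs so every item is used for testing exactly once."""
    if not 2 <= k <= len(items):
        raise ValueError("k must be between 2 and the number of items")
    # Deal the items into k folds, like dealing cards.
    folds = [items[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        # Train on every fold except the one held out for testing.
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


# Hypothetical users: ideas gathered from the "train" group
# get tried out on the "test" group, then the roles swap.
users = ["user_a", "user_b", "user_c", "user_d"]
splits = list(k_fold_splits(users, 2))
for idea_sources, testers in splits:
    print("ideas from", idea_sources, "tested on", testers)
```

With two folds, each half of the user group takes a turn as the source of ideas and as the test audience, so no suggestion is only ever validated by the person who made it.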

So bad crossfit happens when you take jargon and design language optimized to talk about one domain and use it to talk and reason about a second domain. This doesn't mean that you can never introduce new concepts and new language. Sometimes you should tear down old metaphors and introduce new language or new interface patterns that reflect new technological realities. But this has to be done very carefully and deliberately. You have to think about how the new language will fit into a field with older traditions and older patterns that have evolved over long periods of time and that reflect challenges well known to insiders. You should never try to re-educate your users just because you want to use the concepts you know and are too lazy to learn how they themselves typically talk and reason about their field.

Technologists are biased towards technology

If you have a technical background and you are building your product around a UI that gets the technical details right, that reflects good data structures and that focuses on things like networks, connections, errors, data processing and other concerns of computers, but that glosses over things that reflect the daily concerns of your users in the way they are used to talking about them, you are creating bad crossfit. Technical things are second nature to technical people. We tend to think of the world in our own terms. When it comes to designing products, it's important to make a conscious effort to counter this tendency.