#004

THE YES: AI-Powered E-Commerce

Harvard Business School Case Study Analysis

THE YES tried to fix broken e-commerce with a store built around each individual user, and the cold start problem they hit is the same trap every AI-first product faces: you need users to train the algorithm, but the algorithm must be good enough to keep users. E-commerce has been a digitized print catalog for twenty years, and we've collectively agreed to pretend that's fine. You search, you scroll, you filter by price and color, and you see the same grid of products as everyone else. The platforms collect staggering amounts of data about you, yet the share they actually use to personalize your experience remains close to zero. THE YES attempted to break this cycle, and the case study is worth reading because the failure shows exactly which trap is waiting.

E-commerce has always chased the same ideal: suggest the exact item someone wants before they know they want it. Many startups have tried and fallen short. The problem runs deeper than bad recommendations. E-commerce is built on digitized print-catalog infrastructure, with endless aisles that force users to scroll through irrelevant items. Too many options produce choice paralysis, and Amazon becomes a nightmare of no curation and pay-to-play results. Brands face their own crisis: department stores are collapsing, wholesale partners are scarce, Amazon brings brand erosion and counterfeits, and DTC costs are exploding, which leaves founders with a broken system. Co-founder Julie Bornstein's key insight cut through the noise: it is amazing how much data e-commerce companies are gathering on users and how little they are using it.

The Architecture

Her solution was structurally simple and technically hard. In the digital world, there is no reason why you and I have to have the same store: technology makes it easy and inexpensive to build a million different stores for a million different customers. THE YES built on four pillars to make this a reality. First, machine learning handled brand integrations automatically, requiring zero effort from either side. Second, an adaptive platform read user signals in real time and scaled to millions. Third, a fashion algorithm with over 500 attributes per item was built like Pandora, where human taxonomists laid the foundation and ML scaled it. The fourth pillar was data: a brand dashboard with competitive insights that traditional retail never shared.
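
To make the "one store per customer" idea concrete, here is a minimal sketch of attribute-based ranking. It is not THE YES's algorithm, which the case does not describe at this level. The three-attribute catalog, the function names, and the update rule (nudge each attribute weight toward +1 on a yes and -1 on a no) are all assumptions for illustration.

```python
from collections import defaultdict

def update_profile(profile, item_attrs, liked, rate=0.2):
    """Nudge each attribute weight toward +1 on a yes and -1 on a no."""
    target = 1.0 if liked else -1.0
    for attr in item_attrs:
        profile[attr] += rate * (target - profile[attr])
    return profile

def score(profile, item_attrs):
    """Mean learned weight over the item's attributes (unseen attributes count as 0)."""
    if not item_attrs:
        return 0.0
    return sum(profile.get(a, 0.0) for a in item_attrs) / len(item_attrs)

def rank(profile, catalog):
    """Return item names sorted from most to least likely match for this user."""
    return sorted(catalog, key=lambda name: score(profile, catalog[name]), reverse=True)

# Hypothetical catalog: a real taxonomy would carry hundreds of attributes per item.
catalog = {
    "silk midi dress": {"silk", "midi", "floral"},
    "leather jacket": {"leather", "cropped", "black"},
    "denim jacket": {"denim", "cropped", "blue"},
}

profile = defaultdict(float)
update_profile(profile, catalog["leather jacket"], liked=True)
update_profile(profile, catalog["silk midi dress"], liked=False)
ranked = rank(profile, catalog)
print(ranked)
```

Two signals are enough to reorder this user's store: the denim jacket moves above the dress only because it shares the "cropped" attribute with an item she said yes to. That is also why the hand-built taxonomy matters, since shared attributes are what let a single signal generalize.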

Two-Sided Value

The value proposition ran in both directions, addressing a fragmented market from each side. For brands, algorithm-targeted acquisition reached the most relevant users, and THE YES charged a 25% commission only on completed sales, with no upfront cost. Brands preserved their integrity by using their own photos and messaging, and they gained data that traditional retailers never shared. For consumers, the platform offered over 145 brands, ranging from Balenciaga to Zara, in a feed that adapted to style, size, and budget. A best price guarantee tracked the lowest price across the web, and results were pre-sorted by likelihood to buy, so users never had to sort or filter.

The Cold Start Problem

The cold start problem hit immediately, because the flywheel only turns in a specific sequence: a better algorithm leads to a better user experience, which leads to more users, which leads to more data, which leads to a better algorithm. The product only becomes valuable with data that only comes from having users, yet users need value to stay. Bring users in too early and you risk permanent churn, a dilemma every AI-first product faces.
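
The loop can be sketched as a toy simulation. Every number here is hypothetical and none comes from the case: weekly retention stands in for algorithm quality, and each active user adds a small, fixed amount of "learning." It is meant to show the shape of the dilemma, not to forecast anything.

```python
def simulate_flywheel(initial_retention, weekly_signups, weeks=12,
                      learning_per_user=0.00002, max_retention=0.95):
    """Toy flywheel: retention (a proxy for algorithm quality) rises with active users."""
    retention = initial_retention
    active_users = 0.0
    for _ in range(weeks):
        # Survivors from last week plus this week's new signups.
        active_users = active_users * retention + weekly_signups
        # More active users means more data, which nudges quality upward.
        retention = min(max_retention, retention + learning_per_user * active_users)
    return active_users, retention

# Same signup volume, different starting quality (both values are assumptions).
weak_start = simulate_flywheel(initial_retention=0.30, weekly_signups=1000)
strong_start = simulate_flywheel(initial_retention=0.60, weekly_signups=1000)
print(f"weak start:   {weak_start[0]:.0f} active users, retention {weak_start[1]:.2f}")
print(f"strong start: {strong_start[0]:.0f} active users, retention {strong_start[1]:.2f}")
```

With identical signups, the run that starts with a weaker algorithm keeps fewer users each week, so it also collects less data to improve with. The gap compounds, which is the trap in miniature.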

Algorithm First or Users First

The cold start question split Bornstein and her CTO, Aggarwal. Bornstein argued for perfecting the product first: if you download an app and the experience isn't good, you are very unlikely to come back. Aggarwal countered that early adopters expect issues and that what brings them back is seeing real improvement, and that more users mean more data, which means a better algorithm. With $5M to deploy, the choice was between a 30% better algorithm, delivering 40% higher conversion and 30% lower churn, and paid media that would bring in users faster to accelerate the learning flywheel. Bornstein chose the algorithm.

Early Results and Personas

Early results as of August 2020 showed 30,000 downloads, a 4% conversion rate, and a $225 average order value. In personalization ratings, 50% of users loved it, 30% were neutral, and 20% were dissatisfied; the average was 4 out of 5 and improved to 4.3 over time. Two distinct customer personas emerged from this data. The Fashionista, about 20% of the user base, knows exactly what she wants, relies heavily on search, does not treat shopping as social, spends $250 to $500 or more per item, and is less satisfied when rejected items keep reappearing. The Fashion Follower, about 80% of the user base, shops socially, wants inspiration rather than search, expects the app to do the work, spends $75 to $250, and is more satisfied when pushed out of her comfort zone. Which segment should take priority? Fashionistas want better search while Followers want better recommendations, and those lead to entirely different product decisions.
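
A back-of-the-envelope reading of those figures, combined with the 25% commission from the brand side, looks like this. One loud assumption: the text does not say what the 4% conversion rate is measured against, so the sketch treats it as orders per download. If the real denominator is sessions or active users, the totals change.

```python
# Figures quoted in the case summary above.
downloads = 30_000
conversion_pct = 4          # ASSUMPTION: interpreted as orders per download
average_order_value = 225   # USD
commission_pct = 25         # charged only on completed sales

orders = downloads * conversion_pct / 100
gross_merchandise_value = orders * average_order_value
commission_revenue = gross_merchandise_value * commission_pct / 100
commission_per_order = average_order_value * commission_pct / 100

print(f"orders:               {orders:,.0f}")
print(f"GMV:                  ${gross_merchandise_value:,.0f}")
print(f"commission revenue:   ${commission_revenue:,.0f}")
print(f"commission per order: ${commission_per_order:,.2f}")
```

Under that assumption, the early numbers imply roughly 1,200 orders, $270,000 in merchandise sold, and $67,500 in commission, or $56.25 per order. That per-order figure is the ceiling on what any acquisition channel can cost before a first purchase loses money.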

Acquisition Channels

Acquisition channel data complicated the picture further. Sharing and referral had the highest second-week retention at 41.7% but the lowest quiz completion at 48%. Organic social had the best quiz completion at 91% and the highest YES list creation at 82.1%. Paid social had the lowest retention at 12%, revealing the "wrong users" problem. As O'Brien noted, Facebook optimizes for installs, but we probably want to optimize for engagement or cart conversions. The algorithm needs volume to work, and paid channels often deliver the wrong volume for the wrong reasons.
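
The channel figures are easier to compare side by side. The sketch below holds only the numbers quoted above, with `None` marking any metric the text does not report for a channel; the dictionary layout and the `rank_by` helper are illustrative assumptions.

```python
# Percentages as quoted above; None means the figure is not given in the text.
channels = {
    "sharing/referral": {"week2_retention": 41.7, "quiz_completion": 48.0, "yes_list_creation": None},
    "organic social":   {"week2_retention": None, "quiz_completion": 91.0, "yes_list_creation": 82.1},
    "paid social":      {"week2_retention": 12.0, "quiz_completion": None, "yes_list_creation": None},
}

def rank_by(metric):
    """Rank channels by one metric, skipping channels where it is unreported."""
    reported = {name: m[metric] for name, m in channels.items() if m[metric] is not None}
    return sorted(reported, key=reported.get, reverse=True)

retention_rank = rank_by("week2_retention")
quiz_rank = rank_by("quiz_completion")
retention_gap = (channels["sharing/referral"]["week2_retention"]
                 / channels["paid social"]["week2_retention"])

print("by week-2 retention:", retention_rank)
print("by quiz completion: ", quiz_rank)
print(f"referral vs paid retention: {retention_gap:.1f}x")
```

The ranking flips depending on the metric, which is the point of O'Brien's remark: a channel tuned for installs can look healthy while delivering users who are about 3.5 times less likely to still be around in week two.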

Five Monetization Options

Five monetization options existed for the platform. Pay-to-play advertising would let brands pay for preferred placement, while a brand-funded loyalty program followed a model similar to Sephora's samples. An influencer program created white-label stores with a 25% commission, and peer-to-peer social shopping tested a $25 give/get model, with 75% share completion and a 20% download rate. White-labeling THE YES's personalization technology to brands for their own DTC sites rounded out the options.

Lessons for Founders

The lessons for founders are stark and specific. Domain expertise matters because deep personalization in one vertical beats shallow personalization across all of them. Human plus machine is the architecture: human taxonomists built the foundation and ML scaled it. UX and algorithm are inseparable because you cannot optimize one without the other. Two-sided markets need both sides from day one, which is why THE YES led with luxury brands to establish credibility. Trust takes time, and as Bornstein said, trust will be the hardest behavior to shift.

The ultimate question from the case remains whether the algorithm is good enough to earn trust and retain users. Everything depends on the answer. For Nyantrace, the same question applies to observability for AI agents. Is the observability data good enough, and specific enough, to earn the trust of developers who need to stake production systems on it?

Building
Nyantrace.ai

Observability and governance for AI agent systems. If you're building with agents, I'd like to talk.

nyantrace.ai →