Crew — Patrick Lewis

Crew gives builders feedback from agentic versions of their ideal customers, so they can validate marketing and product before they ship. I'm co-founder and CEO at Tortuga AI, the company behind it. Design and product are mine, and so is a good chunk of the front-end code.

Crew hero — The Evaluation overview screen shows key takeaways, recommendations, and per agent test results

Features

You point Crew at a surface - a landing page, a pricing page, a product flow - and a panel of AI agents walks through it the way your real customer would, scores it against the things that customer actually cares about, and hands back prioritized, specific changes. You make the changes, re-run it, and watch the score move.

Crew wizard — Creating a test from Crew's web app mimics a traditional, e-commerce flow

There are two kinds of agent. ICP agents are the customer layer - personas built around a real ideal customer profile. Experts are the mentor layer - specialist agents modeled on people whose work in tech we respect, like an SEO expert, a copywriter, or a YC partner. One thing we hold to: the personas are proxies for real users, not replacements for them, and we're careful not to let the copy overstate that.

Crew personas — Crew has both Expert agents and ICP agents, trained on the user's website

Crew persona detail — Each agent has a profile with past tests, favorite products, etc.

Pivot

Crew wasn't the plan. Josh and I had built four go-to-market tools - Ovii, Flagship, Quartermaster, and Captain - and were ten days from launching them as a single bundle for AI builders. On the side, Josh had been building a small evaluator that ran a marketing site through agentic customers and told you whether it was ready. The Sunday night before launch, we ran our own bundle site through it.

It tore us apart. The verdict was that the bundle had no coherent customer. Four products, four different triggers, four different buyers, and a landing page pitching all of them at once and landing with none. Not what you want to read at 10 p.m. with a Wednesday launch on the calendar.

We killed the launch. The way I put it to Josh was that it's better to break an engagement than get a divorce. Then we spent ten days using the evaluator on our own pages. Something I didn't expect happened around day three. Josh, who'd never worked in marketing or design, started shipping landing page changes himself - sharper hero copy, restructured value props, better CTA placement. The tool gave us both the same scoreboard to push against. A designer and an engineer can argue all night about whether a page is good. They can't really argue with how a hundred versions of your customer reacted to it. We went from one person doing marketing to two.

By the end of those ten days we knew the tool was the company. That's Crew.

Underlying thesis

AI coding tools made building roughly 10x faster. Validation didn't get any faster at all. So builders ship into a void, hear nothing back, and read the silence as "no product-market fit" when it's really just no information.

Building good software is easier than it's ever been. Building the right software for a specific customer is exactly as hard as it was a decade ago, and arguably harder now, because the noise floor is so much higher. Crew is a bet that checking your work against your customer becomes a normal step in the build loop - generate, validate, iterate, then put it in front of real people - rather than something a research team does weeks later.

Strategic decisions

Subscription over usage pricing. We launched usage-based and moved to subscription quickly, because early users told us plainly that they wanted predictability over metered runs.
Sharpened the customer. "AI builders" was too broad to sell to. We narrowed to AI builders who recently launched or raised and want to convert the uptick in traffic that comes with it - a moment when validating the page actually matters and the buyer is paying attention.
Made the product its own proof. Every tool we shipped before Crew is now a live case study for it, and the founding story is the loudest one: we killed a four-month launch because the tool told us to.

Learnings

The cleanest lesson of the year is that an external scoreboard changes who can do the work. For every launch before this, I was the one final eye on copy and positioning, which meant that work moved at the speed of one person. Crew's score gave the whole team a target outside any single person's taste, and the work got faster and less precious because of it.

Crew tests

The other one is older and came from Floor: people will jump through real hoops if you're solving a real problem. That's what let me get comfortable putting rough, caveat-heavy versions of Crew in front of strangers instead of polishing in private. The feedback you get at 70 percent is feedback you can't get any other way.

You can see Crew at yourcrew.app.

Features

Pivot

Underlying thesis

Strategic decisions

Learnings

More from Tortuga AI