How We’re Scaling Our 3-Person Startup Using Big Company Engineering Processes and AI
When I joined Pond Labs as the sole engineer, I thought we’d be moving fast and breaking things. Instead, we discovered something counterintuitive: the same rigorous processes that help 100-person engineering teams scale actually work brilliantly for a 3-person team working with AI.
Here’s why: Large companies use RFCs, story breakdowns, and documentation because they can’t rely on everyone being in the same room. AI tools have the same problem — they don’t have tribal knowledge, they can’t tap you on the shoulder to ask clarifying questions, and they definitely can’t read your mind about what “make it work” means.
But there’s an even more fundamental constraint: context windows. Even the most advanced LLMs can only hold so much information at once. When your codebase grows beyond what fits in a single conversation, you need the same structured thinking that helps human teams coordinate across complex projects.
So we adapted enterprise engineering practices for our tiny team, and it’s been transformative. We’re shipping features that would normally require a full team, while maintaining code quality that would make any senior engineer proud.
🧑‍💻 Divide and Conquer
The first trick is simply to break large tasks down into stages and smaller tasks. In larger organisations, for any substantial change, there’s typically a design stage, the output of which is a Request for Comments (RFC). This document gives the rest of the team visibility into the requirements and assumptions, the design, and the tradeoffs. Crucially, it allows us to pressure-test whether this is the right thing to build at our stage in the startup journey. We have an RFC template which we fill out with the help of Claude when making major changes.
Once an RFC is accepted, the next step is to break it down into smaller tasks, and Claude helps here too. We give it access to our code in GitHub, along with the relevant docs, and it produces a detailed story breakdown for delivering the RFC, output as artefacts. This gives another opportunity to carefully inspect and refine its implementation plan. In my experience, comprehensive story artefacts are the most effective, complete with interface definitions and descriptions of the required unit tests. These can then be fed into your favourite AI coding assistant in Agent Mode, and it will generally do an excellent job of the implementation and test coverage. There’s often back-and-forth here, especially on larger changes — I ask Claude to review the code generated by the coding assistant and adjust the remainder of its story breakdown halfway through.
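To make that concrete, here’s a trimmed-down sketch of the kind of story artefact that works well: the story context, the interface it pins down, and the unit tests it requires. The names are hypothetical, invented for illustration rather than taken from our codebase.

```typescript
// Hypothetical story artefact, invented for illustration.
//
// Story 3: persist and fetch versioned briefs.
// Context: implements the storage interface agreed in the RFC; depends on
// story 2 (schema migration). No UI work in this story.
//
// Required unit tests:
//   - saving a brief returns a new, monotonically increasing version
//   - fetching without a version returns the latest one
//   - fetching an explicit version returns it unchanged

export interface BriefStore {
  /** Persist a brief and return the version number assigned to it. */
  save(userId: string, subjectUrl: string, content: string): Promise<number>;

  /** Fetch a brief; omit `version` to get the most recent one. */
  fetch(userId: string, subjectUrl: string, version?: number): Promise<string | null>;
}
```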
📄 Docs, Docs, Docs
Our codebase has already grown far too large to attach in its entirety to our conversations, so we have to be selective. During RFC writing and story breakdown, it helps to have documentation of a module’s interface rather than having to attach the full implementation. Prior relevant RFCs are also useful to attach, so make sure you store these somewhere convenient. AI has completely changed the calculus of both producing and consuming code documentation. Whereas a human team can rely on a degree of tribal knowledge about the system, tools that provide long-term memory for AI aren’t nearly as effective (at least for now). Maintaining well-organised documentation is therefore more important than ever. Fortunately, AI has proven effective at rapidly writing documentation, so make sure not to skimp on this work.
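As a rule of thumb, aim for the level of a header file: enough signatures and behavioural notes for an AI to build against, none of the implementation. Something like this hypothetical module summary:

```typescript
// profile/fetcher.ts: the interface summary we'd attach to a conversation
// instead of the full implementation. A hypothetical module for illustration.

/** Normalised subset of a LinkedIn profile used by the briefing pipeline. */
export interface ProfileSummary {
  name: string;
  headline: string;
  /** Source URLs backing each field, for citation management. */
  sources: string[];
}

/**
 * Fetches and normalises a public profile.
 * Rejects with a descriptive error if the URL does not resolve to a profile.
 */
export declare function fetchProfile(url: string): Promise<ProfileSummary>;
```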
🤖 The Secret Sauce: AI Teammates, Not Just AI Tools
Here’s where things get interesting. Most teams use AI like a fancy autocomplete — throw some context at Claude or ChatGPT and hope for the best. We realised this approach misses the point entirely.
The real breakthrough came when we stopped thinking about “AI tools” and started thinking about AI teammates. Just like you wouldn’t hire a generic “developer” — you hire a backend engineer, a frontend specialist, a DevOps expert — we created specialised AI personas with distinct expertise, communication styles, and responsibilities.
Meet Bertha Eckstein-Diener, Our Backend Engineer
Our star teammate is Bertha, named after the pioneering writer and feminist historian. She’s more than a prompt — she’s a fully realised professional with her own philosophy, communication style, and approach to problem-solving.
Bertha brings a unique perspective to backend engineering, drawing from her namesake’s experience navigating complex territories and bridging different worlds. Her philosophy? Start with the simplest viable solution, then navigate toward complexity only when the territory demands it.
Here’s what makes Bertha different from generic AI assistance:
Specialised Expertise: She’s fluent in TypeScript, Node.js, GraphQL, PostgreSQL, and GCP. She has well-developed opinions about when to use each one and why.
Consistent Design Philosophy: Bertha always starts by questioning whether we’re solving the right problem at the right scale. She’ll push back if we’re over-engineering for our current 100-user base when a simple solution would work fine.
Documentation-First Mindset: She creates comprehensive technical documentation that reads like a field guide for future developers, complete with decision rationale and cultural context for our choices.
Story Breakdown Excellence: Give Bertha an RFC, and she’ll break it down into implementable stories that our coding assistants can execute flawlessly. Each story includes interface definitions, test requirements, and implementation notes.
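None of this requires exotic tooling. In practice, a persona boils down to a structured definition we feed into the system prompt. A simplified sketch, with illustrative field names:

```typescript
// Simplified sketch of a persona definition. The real thing lives in a
// longer prompt document; these field names are illustrative.
interface Persona {
  name: string;
  role: string;
  expertise: string[];
  /** Guiding principles applied to every design decision. */
  philosophy: string[];
  /** Standing instructions about tone and output format. */
  communicationStyle: string;
}

const bertha: Persona = {
  name: "Bertha Eckstein-Diener",
  role: "Backend Engineer",
  expertise: ["TypeScript", "Node.js", "GraphQL", "PostgreSQL", "GCP"],
  philosophy: [
    "Start with the simplest viable solution.",
    "Question whether we are solving the right problem at the right scale.",
    "Document decisions with their rationale, not just their outcome.",
  ],
  communicationStyle: "Direct and opinionated; pushes back on over-engineering.",
};
```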
The Context Window Reality Check
This became especially critical when we built our prompt execution engine. Our 6-prompt pipeline is a directed graph where prompts can execute in parallel and stream results back as they arrive.
Designing this system required careful upfront architectural thinking that simply wouldn’t fit in any single LLM conversation. The RFC process forced us to think through the parallelisation strategy, error handling approaches, and streaming architecture before we started coding.
Without this structured approach, we would have ended up with a mess of fragmented conversations and inconsistent implementation decisions. Instead, Bertha could take our RFC and break it into focused stories that each fit comfortably within an AI agent’s context window.
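To give a flavour of what those focused stories produced, here’s a heavily simplified sketch of the orchestration pattern. It is not our production code: retries, error propagation, and tracing are stripped out, and the names are illustrative.

```typescript
interface PromptNode {
  id: string;
  /** IDs of the prompts whose output this one consumes. */
  dependsOn: string[];
  run(inputs: Record<string, string>): Promise<string>;
}

async function executePipeline(
  nodes: PromptNode[],
  onResult: (id: string, output: string) => void,
): Promise<void> {
  // Assumes `nodes` is topologically sorted: dependencies before dependents.
  const running = new Map<string, Promise<string>>();

  for (const node of nodes) {
    const task = (async () => {
      // Await only this node's own dependencies; independent branches of
      // the graph therefore execute in parallel.
      const inputs: Record<string, string> = {};
      for (const dep of node.dependsOn) {
        inputs[dep] = await running.get(dep)!;
      }
      const output = await node.run(inputs);
      onResult(node.id, output); // stream each result back as it arrives
      return output;
    })();
    running.set(node.id, task);
  }

  await Promise.all(running.values());
}
```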
The 1:1 Process That Changes Everything
Here’s the habit that surprises people most: we have regular “1:1 meetings” with our AI teammates. These run less like status check-ins and more like structured performance reviews, where we discuss what’s working, what isn’t, and how to improve their effectiveness.
During these sessions, Bertha reflects on her recent work and updates her approach. For example, after a few projects where initial solutions proved too complex, she evolved her philosophy to explicitly validate assumptions about scale and concurrency before designing for them.
We store these reflections in our Notion database through a custom MCP server (which you can grab on GitHub), creating a learning feedback loop that makes our AI teammates more effective over time.
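For the curious, the heart of such a server is a single tool definition. Here’s a simplified sketch using the official TypeScript MCP SDK. It is not the actual code from our repo, and the Notion write is stubbed out:

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({ name: "persona-reflections", version: "0.1.0" });

server.tool(
  "store_reflection",
  {
    persona: z.string().describe("Which teammate is reflecting, e.g. Bertha"),
    reflection: z.string().describe("What worked, what didn't, what changes"),
  },
  async ({ persona, reflection }) => {
    // In the real server this writes a row to a Notion database so the
    // persona can re-read its own history in future sessions.
    await saveToNotion(persona, reflection); // hypothetical helper
    return { content: [{ type: "text", text: `Stored reflection for ${persona}` }] };
  },
);

async function saveToNotion(persona: string, reflection: string): Promise<void> {
  // Stub: call the Notion API here.
}

await server.connect(new StdioServerTransport());
```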
Beyond Bertha: Building Your AI Engineering Team
The persona approach scales naturally across different engineering disciplines. While backend engineering is our focus today, we’re developing specialists for:
Frontend Architecture: Someone who thinks deeply about user experience and component design
DevOps & Infrastructure: Expertise in deployment pipelines and monitoring
API Design: Focused on creating clean, intuitive interfaces between systems
Each persona has their own communication style, areas of expertise, and philosophical approach. The key is giving them enough depth and personality that they feel like genuine colleagues, not just sophisticated autocomplete.
📈 What We’ve Actually Shipped
Six weeks. That’s how long it took our 3-person team to build a sophisticated web application that would typically require 4–6 months.
Our LinkedIn briefing application lets users sign up with their LinkedIn profile, then paste a link to someone they’re about to meet. Within 2 minutes, they get a comprehensive brief covering:
Connection points you share with this person
Conversation topics based on their interests and background
Mindful approach areas — topics to handle with care
Complete professional background with links to sources
Under the hood, this requires a sophisticated 6-prompt pipeline with intermediate processing steps. During those 2 minutes, users see an engaging UI that shows snippets of the research happening in real-time, keeping them invested in the process.
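There are many ways to wire that up; one minimal approach is to stream each step’s result to the browser with Server-Sent Events, reusing the `executePipeline` shape sketched earlier. Treat this as an illustrative sketch rather than our production code:

```typescript
import http from "node:http";

// Shapes from the orchestration sketch earlier in this post.
interface PromptNode {
  id: string;
  dependsOn: string[];
  run(inputs: Record<string, string>): Promise<string>;
}
declare function executePipeline(
  nodes: PromptNode[],
  onResult: (id: string, output: string) => void,
): Promise<void>;
declare const briefingSteps: PromptNode[];

http
  .createServer((_req, res) => {
    // Server-Sent Events: one long-lived response the browser reads with
    // `new EventSource("/progress")`.
    res.writeHead(200, {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
      Connection: "keep-alive",
    });

    executePipeline(briefingSteps, (stepId, output) => {
      // Push a research snippet to the UI the moment a step finishes.
      const payload = JSON.stringify({ stepId, snippet: output.slice(0, 200) });
      res.write(`event: progress\ndata: ${payload}\n\n`);
    }).then(() => res.end());
  })
  .listen(3000);
```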
The Technical Complexity We Conquered
This isn’t a simple CRUD app. The architecture needed to handle:
Versioned brief storage that preserves original responses while allowing us to refresh content as we improve features
Real-time progress updates during the 2-minute generation process
Prompt pipeline orchestration with error handling and retry logic (the retry wrapper is sketched after this list)
Source citation management linking back to original research
LLM tracing and observability tooling
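Most of these are well-trodden patterns once they’re broken out of the pipeline. The retry wrapper around each prompt step, for instance, reduces to a few lines; this sketch omits what a production version needs, such as distinguishing retryable errors like rate limits from permanent failures:

```typescript
// A minimal retry-with-backoff wrapper of the kind we put around each
// prompt step. A sketch, not a production implementation.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        // Exponential backoff: 500ms, 1s, 2s, ...
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** (attempt - 1)));
      }
    }
  }
  throw lastError;
}
```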
Bertha was instrumental in designing the storage architecture. The challenge: allow rapid iteration on our briefing features while ensuring users can always reference their existing briefs. Her solution balanced evolution capability with system simplicity — avoiding over-engineering while maintaining clear upgrade paths.
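Her design is easy to sketch in types (simplified, with illustrative field names): brief versions are immutable rows keyed by a stable ID plus a version number, so refreshing a brief appends a row rather than mutating one.

```typescript
// Simplified sketch of the versioned-brief model; field names are illustrative.
// The invariant: rows are immutable once written, so users can always reopen
// exactly the brief they saw, while improved pipelines append new versions.
interface BriefVersion {
  briefId: string;         // stable across all versions of the same brief
  version: number;         // monotonically increasing per briefId
  subjectUrl: string;      // the LinkedIn profile the brief is about
  pipelineVersion: string; // which iteration of the prompt pipeline produced it
  createdAt: Date;
  sources: string[];       // citation links back to the original research
  content: BriefContent;
}

interface BriefContent {
  connectionPoints: string[];
  conversationTopics: string[];
  mindfulAreas: string[];
  professionalBackground: string;
}

// The latest brief is simply the highest version for a briefId; a refresh
// inserts a new row rather than updating an old one.
```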
The Velocity Difference
Based on our previous experience, this would have taken 2–3x longer with traditional development approaches. But the speed isn’t the only benefit — the code quality in some areas actually exceeds what we’d typically write by hand.
Where AI excels: Comprehensive unit test coverage, consistent error handling patterns, thorough documentation. AI agents don’t get fatigued by repetitive but important tasks.
Where humans still lead: The final 10–20% of refinement, complex business logic edge cases, and architectural judgment calls that require deep product understanding.
Early User Validation
With ~50 users on our mailing list, we have several heavy users who are making new connections regularly and using the application several times per week. For such a specific use case, that level of repeat usage tells us people are finding genuine value in the briefs.
More importantly, these users keep coming back — which means the briefs are actually helping them have better conversations, not just satisfying curiosity.
A final note of caution: although we’ve found that AI has completely rewritten the economics of writing code, the same isn’t true for maintaining it. AI tools have a tendency to solve problems by adding logic and complexity rather than asking what can be removed. It still takes a tremendous amount of engineering judgement to leverage them effectively, especially when it comes to knowing what not to build — just because you can build it doesn’t mean you should take on the burden of maintaining it.
You can grab the code for the Persona MCP server, along with instructions on setting it up, from GitHub. You can also check out Bertha’s persona, our simple but effective RFC template, and an example user story for a coding agent.