Case Study
Building FurballRescue:
AI as a Development Partner
ReallySmall builds software for organizations that do meaningful work. Our first product, FurballRescue, is a management platform for foster-based animal rescues — organizations that rescue animals and place them in volunteer foster homes until they find permanent families.
This is the story of how we built it, and what we learned about working with AI along the way. It's not a clean arc. There were wrong turns.
Starting Point
We started with an old custom .NET MVC application that Lab Rescue of the LRCP had been using since 2012. It tracked animals, foster families, medical records, and adoptions — and it had served the organization well for over a decade. But we knew there was an opportunity to do much more.
Foster-based rescue organizations — the ones without a physical shelter, relying entirely on volunteer foster homes — were underserved by every product on the market. Most rescue software was built for brick-and-mortar shelters or municipal animal control. We wanted to build something purpose-fit for foster-based rescues, and we decided to use AI as deeply as we could throughout the entire process.
Research Before Code
Before writing a single line of the new product, we invested significant time in research — all of it AI-assisted.
Domain Clarification
We pointed Claude at the legacy codebase and had it produce a complete rewrite. The initial output used Blazor, which turned out not to be the right architecture for a multi-tenant SaaS product — but the exercise forced a deep analysis of every entity, relationship, and workflow in the original system. Fourteen years of accumulated domain knowledge was now clearly documented and well-understood.
Market Analysis
Claude surveyed every significant product in the rescue management space — features, pricing, target audiences, and gaps. We studied what rescue directors publicly said they needed, what frustrated them about existing tools, and where the market left foster-based organizations behind.
Workflow Mapping
We mapped the complete operational lifecycle: intake requests, medical triage, foster placement, adoption applications, home checks, follow-ups, syndication to platforms like Petfinder and Adopt-a-Pet, and board-level reporting. This drew on direct experience with rescue operations and extensive web research into how foster rescues across the country described their workflows and challenges.
Efficiency Analysis
We had Claude adopt the perspective of an organizational efficiency expert, examining the mapped workflows for unnecessary human labor. This surfaced problems we hadn't expected — manual data entry steps that could be eliminated, communication gaps that created duplicate work, reporting processes that consumed hours of a director's week. These findings shaped the product backlog as much as any feature request.
All of this research was documented not just for our reference, but in formats that AI could consume as context in future sessions. That turned out to be one of the most important practices of the entire project.
The Wrong Architecture (Twice)
If this were a marketing piece, we'd skip this part. But it's the most useful section for anyone thinking about using AI for real product development.
The first rewrite came out as a Blazor Server application. It worked — Claude had successfully reverse-engineered the legacy domain — but Blazor relies on persistent SignalR WebSocket connections for every active user, which wouldn't scale well for a multi-tenant SaaS product with many concurrent users across many organizations. We pivoted to ASP.NET MVC with server-rendered Razor views. Straightforward, stateless, horizontally scalable.
Then there was the database.
We started with SQLite — lightweight, no external dependencies, perfect for getting features built quickly. When it came time to deploy, we invested heavily in a sophisticated replication setup using Litestream on Azure Container Apps. The idea was elegant: SQLite on fast local ephemeral disk, with Litestream continuously replicating WAL changes to Azure Blob Storage. Near-zero latency writes, one-second durability.
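A minimal sketch of the kind of Litestream configuration this setup implies — container and path names are invented here, not taken from the actual deployment; Litestream's Azure Blob replica type is `abs`, with the storage account credentials supplied via the `AZURE_STORAGE_ACCOUNT` and `AZURE_STORAGE_KEY` environment variables:

```yaml
# litestream.yml — illustrative only; names are invented.
# Continuously replicate the SQLite database's WAL changes from local
# disk to an Azure Blob Storage container.
dbs:
  - path: /data/furballrescue.db
    replicas:
      - type: abs                  # Azure Blob Storage
        bucket: litestream-replicas # blob container name
        path: furballrescue
        sync-interval: 1s          # roughly one-second durability window
```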
The reality was a gauntlet:
- Litestream's documented directory replication mode silently didn't work in v0.5.x.
- Container Apps' ephemeral storage meant every restart wiped everything — and health probes triggered database provisioning before the restore finished.
- Azure Files' SMB mounts were incompatible with SQLite's WAL mode, causing silent corruption.
- Git Bash on Windows silently mangled Linux paths in Azure CLI commands, setting environment variables to C:/Program Files/Git/data instead of /data.
We generated pages of hard-won operational knowledge.
It all worked, eventually. Then we realized that per-tenant PostgreSQL on Azure's managed Flexible Server would give us everything we needed — real multi-tenancy, proper concurrent access, managed backups, no ephemeral storage gymnastics — for about the same cost. We migrated. The Litestream documents went to the archive.
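The per-tenant model is simple to sketch. Assuming invented names (this is not FurballRescue's actual code), each tenant organization gets its own database on one shared managed PostgreSQL server, and the application resolves a connection string from the request's tenant before opening a connection:

```python
# Illustrative sketch only — identifiers are invented.
# One PostgreSQL database per tenant on a shared Flexible Server;
# the connection string uses Npgsql-style "Key=Value" pairs to match
# the .NET stack described above.

def tenant_database(tenant_slug: str) -> str:
    """Map a tenant's URL slug to its dedicated database name."""
    safe = tenant_slug.replace("-", "_")  # slugs may contain hyphens
    return f"tenant_{safe}"

def connection_string(tenant_slug: str, host: str, user: str) -> str:
    """Build a per-tenant connection string (password injected from secrets)."""
    return f"Host={host};Database={tenant_database(tenant_slug)};Username={user}"
```

The appeal over the Litestream design is that tenancy isolation, concurrent access, and backups all become the managed server's problem rather than the application's.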
None of this was wasted. Each wrong turn clarified what we actually needed. The SQLite phase let us build features without worrying about database infrastructure. The Litestream phase taught us exactly what our deployment requirements were. And AI was the one doing the bulk of the implementation work at each stage — including the migrations between them. The cost of a wrong architectural bet was days, not months, because the rework was AI-assisted too.
AI doesn't prevent wrong decisions. It makes wrong decisions cheap enough to recover from.
Building the Product
With the architecture settled, development moved quickly. We kept two practices that made this possible.
Documentation as context. We kept project documentation relentlessly current — architectural decisions, domain concepts, conventions, API patterns. This wasn't primarily for human readers. It was so that every AI session started with accurate, complete context. Without this discipline, each session would have begun cold, producing generic or inconsistent output. With it, we could hand AI large, complex tasks and trust the results.
Multi-perspective review. Periodically, we'd have Claude review its own work from different expert viewpoints — as a security architect identifying vulnerabilities, as a senior developer spotting code smells, as a rescue director evaluating whether the UX matched real operational workflows, as a product manager looking for feature gaps. Each perspective consistently surfaced improvements that a single-viewpoint review would miss. These generated focused backlogs that were typically addressed the same day.
AI was the primary production tool throughout — not just for application code, but for infrastructure. DNS migrations, Amazon SES configuration, Azure resource provisioning, GitHub CI/CD setup, SSL certificate management — all handled through Claude Code at the command line. Work that would normally involve hours in cloud portals was completed conversationally in minutes.
The application grew to include multi-tenant architecture with per-tenant databases, transactional email, Stripe payment processing, automated syndication to adoption platforms, daily backups with cloud archival, a full audit trail, medical protocols, custom fields, volunteer hour tracking, expense management, transport coordination, and a public intake request portal.
Testing
We approached testing through Behavior-Driven Development (BDD), and we were thorough about it.
Every user role in the system was audited: what each role needed to accomplish, what permissions it held, what the testable behaviors should be. All 35 permissions were traced through roles, controllers, and views to verify the authorization matrix. Every controller action was checked for test coverage, and gaps were filled. Tests were then evaluated for substance — we wanted meaningful scenarios that validated real business behavior, not trivial assertions.
The result was 265 BDD scenarios across 53 feature files, running as integration tests that drive a real browser against a real database. They run on every push through GitHub Actions.
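A representative scenario — invented here for illustration, in the Gherkin style such suites typically use — looks like this:

```gherkin
# Hypothetical example; feature and role names are not from the real suite.
Feature: Adoption application review
  Scenario: A coordinator approves a pending application
    Given a tenant organization with an animal available for adoption
    And a pending adoption application for that animal
    When a user with the "Adoption Coordinator" role approves the application
    Then the animal's status changes to "Adoption Pending"
    And the applicant receives a notification email
```

Each such scenario is backed by step definitions that drive the browser, which is what makes the suite reusable beyond CI.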
More Than CI
Load testing. The BDD suite was adapted into a load testing harness that provisioned 20 new tenant organizations on live Azure infrastructure, created 5 users in each, and ran the entire test suite simultaneously across all 100 users. The complete run — every user exercising the full intake-to-adoption workflow — finished in about two and a half minutes.
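The shape of that harness can be sketched as follows — a hypothetical Python outline with invented names, not the real implementation: run the same end-to-end suite concurrently for every simulated user and collect per-user timings.

```python
# Illustrative sketch of a BDD-suite-as-load-harness (names invented).
import time
from concurrent.futures import ThreadPoolExecutor

def run_full_suite(user_id: str) -> float:
    """Stand-in for one user driving the full intake-to-adoption flow
    through a real browser against the live deployment."""
    start = time.perf_counter()
    # ... the real harness would execute every BDD scenario here ...
    return time.perf_counter() - start

def load_test(user_ids: list[str], max_workers: int = 100) -> dict[str, float]:
    """Run the suite for all users concurrently; return per-user durations."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(user_ids, pool.map(run_full_suite, user_ids)))

# 20 tenant organizations x 5 users each = 100 concurrent simulated users.
timings = load_test([f"org{o}-user{u}" for o in range(20) for u in range(5)])
```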
Demo content. Since the integration tests operate the application through a real browser, we captured screenshots at each step. Against a database populated with realistic mock data — AI-generated animal profiles with real photos, plausible foster families, complete medical histories — those screenshots became the visual backbone of a product walkthrough video. Claude scripted the narration to follow an animal from intake through adoption, with the application visible at every step.
The same test suite that verified correctness also proved scalability and produced marketing material.
Documentation
With the application production-ready, we built documentation for every audience.
First, a discovery pass: Claude searched the entire codebase for features that had been built but not yet documented, finding several. Then we built outward from the system-level references — architecture, domain model, permissions, API — into audience-specific documentation:
- Developer guides covering architecture, data migration, configuration, deployment, and testing
- Admin guides for roles, permissions, email templates, syndication, payments, and backups
- User guides walking through every operational workflow from animal intake to board reporting
- Public-facing guides for foster onboarding, adopter information, and intake requests
- Marketing materials including feature overview, competitive analysis, and go-to-market strategy
52 documentation pages total, each synthesized from verified sources — the codebase, BDD test cases, role definitions, and earlier research.
What We Learned
Research Compounds
Every phase produced material that the next phase consumed. Market research informed the backlog. The backlog drove development. Development artifacts fed test design. Tests generated demo content and load testing. Documentation synthesized everything. The value of each step was amplified by the steps before it.
Context Is Everything
AI with good context produces dramatically better output than AI without it. Maintaining accurate documentation was the highest-leverage activity in the entire project — more than any single feature or optimization.
Perspective Matters
The most valuable thing we did with AI wasn't generating code. It was having it review its own work from different expert viewpoints. A security architect sees different problems than a product manager. A rescue director sees different problems than a senior developer. Running these perspectives against the same codebase was the quality multiplier.
Wrong Turns Are Cheap
We built the wrong architecture twice. With AI handling the implementation and migration work, the cost of those wrong bets was days rather than months. This changes the calculus on experimentation. You can try an approach, learn from it, and pivot — without the sunk-cost paralysis that makes teams cling to bad decisions.
Direction Is the Work
The human contribution throughout this project was less about production and more about vision, perspective, and process. Casting a clear purpose for the product. Offering challenges and alternative viewpoints. Shaping AI's direction and correcting its course. Knowing when to push deeper and when to move on. The nature of the work changes when AI handles execution — but the work doesn't get easier. It gets different.
By the Numbers
The Product
FurballRescue is a production multi-tenant SaaS platform purpose-built for foster-based animal rescues. It tracks animals from intake through adoption, manages foster families and partner organizations, handles medical records and protocols, processes adoption fee payments, sends email notifications, syndicates listings to Petfinder and Adopt-a-Pet, and generates board-level reports including Live Release Rate.
FurballRescue is built by ReallySmall — a small software company proving that AI collaboration makes it possible to deliver thoughtful, professional software at independent-company economics.