I Replaced 11 Serverless Functions With One Worker

By Ian Strang · February 16, 2026

The stats update was taking 45 seconds. Users would click "Complete Match," watch a spinner, and wait. And wait. Sometimes the request would time out entirely, leaving the match in a broken state.

It was August 2025, and the architecture that had worked for a simple stats tracker was failing under real load.

The Original Design

When a match completed, the system needed to update statistics: player totals, season standings, personal bests, hall of fame records, match reports. Each calculation was implemented as a Supabase Edge Function.

The flow:

Match Complete → Call Edge Function 1 → Wait → Call Edge Function 2 → Wait → ... → Call Edge Function 11 → Done

Eleven functions, called sequentially. Each function:

  • Made a database call to run a SQL aggregation
  • Returned immediately
  • Was about 109 lines of code
  • Was 95% identical to the other ten functions

The total time: 45+ seconds. The Vercel timeout limit: 30 seconds for hobby plans. The system was hitting the ceiling.
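
For illustration, the calling side looked something like this sketch (the function slugs and variables are invented; only the sequential awaits matter):

// Sketch of the old orchestration; names are illustrative, not the actual code.
const FUNCTIONS = [
  'update-player-totals',
  'update-season-standings',
  // ... 9 more, one per aggregation
];

for (const name of FUNCTIONS) {
  // Each call blocks until the previous aggregation has finished.
  await fetch(`${SUPABASE_URL}/functions/v1/${name}`, {
    method: 'POST',
    headers: { Authorization: `Bearer ${SERVICE_ROLE_KEY}` },
    body: JSON.stringify({ matchId, tenantId }),
  });
}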

The Analysis

I asked the AI to analyze the edge functions. The report was embarrassing:

Edge Function 1:  109 lines
Edge Function 2:  107 lines
Edge Function 3:  108 lines
...
Edge Function 11: 112 lines

Code duplication: 95%
Unique logic per function: ~5 lines (the SQL function name)

Eleven nearly identical files. The only meaningful difference was which SQL function each one called. Classic copy-paste development that had accumulated over months of feature additions.
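
Reconstructed for illustration, each of the eleven files looked something like this (Supabase Edge Functions run on Deno; every identifier here is a guess at the shape, with the rpc() name being the one meaningful difference):

import { createClient } from 'npm:@supabase/supabase-js@2';

Deno.serve(async (req) => {
  const { matchId, tenantId } = await req.json();

  const supabase = createClient(
    Deno.env.get('SUPABASE_URL')!,
    Deno.env.get('SUPABASE_SERVICE_ROLE_KEY')!,
  );

  // The ~5 unique lines: which SQL function this particular file calls.
  const { error } = await supabase.rpc('update_player_totals', {
    match_id: matchId,
    tenant_id: tenantId,
  });

  if (error) {
    return new Response(JSON.stringify({ error: error.message }), { status: 500 });
  }
  return new Response(JSON.stringify({ ok: true }), { status: 200 });
});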

The New Architecture

The solution was a background worker with a job queue:

Match Complete → Enqueue Job → Return Immediately (< 1 second)
                     ↓
Background Worker → Process Job → Update Stats → Callback to Invalidate Cache

The user sees immediate feedback. The heavy processing happens asynchronously. If something fails, the job retries automatically.
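
On the Next.js side, the enqueue step can be a single insert. A minimal sketch, assuming the job queue lives in the background_job_status table (the route path and column names are illustrative):

// app/api/matches/complete/route.ts (illustrative path)
import { NextResponse } from 'next/server';
import { createClient } from '@supabase/supabase-js';

const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!,
);

export async function POST(req: Request) {
  const { matchId, tenantId } = await req.json();

  // Enqueue and return immediately; the worker picks the job up on its next poll.
  const { error } = await supabase.from('background_job_status').insert({
    job_type: 'stats_update',
    payload: { matchId, tenantId },
    status: 'pending',
  });

  if (error) {
    return NextResponse.json({ error: error.message }, { status: 500 });
  }
  return NextResponse.json({ queued: true }, { status: 202 });
}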

The Worker Service

A standalone Node.js service that polls a job queue:

// Poll loop: claim the next queued job, process it, otherwise back off briefly.
// sleep() isn't built in, so a one-line promise-based helper covers it.
const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

while (true) {
  const job = await getNextJob();
  if (job) {
    try {
      await processJob(job);
      await markJobComplete(job.id);
    } catch (err) {
      await markJobFailed(job.id, err); // recorded for the retry logic described below
    }
  } else {
    await sleep(1000); // queue is empty, wait a second before polling again
  }
}

Parallel Processing

The eleven sequential edge functions became parallel operations within a single job:

// One job fans out into all eleven aggregations at once; Promise.all
// resolves when the slowest one finishes and rejects on the first failure.
async function processStatsJob(matchId: number, tenantId: string) {
  await Promise.all([
    updatePlayerTotals(matchId, tenantId),
    updateSeasonStandings(matchId, tenantId),
    updatePersonalBests(matchId, tenantId),
    updateHallOfFame(matchId, tenantId),
    // ... 7 more
  ]);
}

What took 45 seconds sequentially now takes 30-60 seconds of wall-clock time in parallel (Promise.all finishes when its slowest aggregation does), but the user doesn't wait for any of it.

Retry Mechanisms

Jobs can fail. Network issues, database locks, temporary errors. The worker includes automatic retry logic with exponential backoff.
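
A minimal sketch of that policy, assuming a capped attempt count (the constants, the Job type, and the helper names are illustrative):

const MAX_ATTEMPTS = 5;
const BASE_DELAY_MS = 1_000;

async function runWithRetry(job: Job): Promise<void> {
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    try {
      await processJob(job);
      return; // success: stop retrying
    } catch (err) {
      if (attempt === MAX_ATTEMPTS) throw err; // exhausted: job is marked failed
      // Exponential backoff: 1s, 2s, 4s, 8s between attempts.
      const delayMs = BASE_DELAY_MS * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}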

Job Monitoring

An admin UI shows job status: pending, processing, completed, failed. Failed jobs show error messages. Admins can manually retry or investigate.
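
The manual retry is just a status flip. A sketch, assuming the same job table (column names are illustrative):

// Re-queue a failed job so the worker picks it up on its next poll.
async function retryFailedJob(jobId: string) {
  await supabase
    .from('background_job_status')
    .update({ status: 'pending', last_error: null })
    .eq('id', jobId)
    .eq('status', 'failed'); // only failed jobs can be re-queued
}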

The Cache Invalidation Problem

Next.js has a function called revalidateTag() that invalidates cached data. But it only works within the Next.js runtime. The worker is a separate Node.js process — it can't call Next.js functions directly.

The solution: an HTTP callback.

// Worker calls back to Next.js after processing
await fetch('https://app.caposport.com/api/internal/cache/invalidate', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${INTERNAL_SECRET}`,
    'Content-Type': 'application/json', // the body is JSON, so say so
  },
  body: JSON.stringify({ tags: ['season_stats', 'player_stats', 'match_reports'] })
});

The Next.js API route receives the callback and invalidates the relevant cache tags. This pattern — external service triggering Next.js cache invalidation via HTTP — has become standard for all background processing in the system.
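
The receiving side is a thin handler around revalidateTag(). A minimal sketch, assuming the bearer-token check mirrors the worker's header (the file path follows the URL in the snippet above; the error shapes are illustrative):

// app/api/internal/cache/invalidate/route.ts
import { revalidateTag } from 'next/cache';
import { NextResponse } from 'next/server';

export async function POST(req: Request) {
  // Shared-secret check: reject anything that isn't the worker.
  const auth = req.headers.get('authorization');
  if (auth !== `Bearer ${process.env.INTERNAL_SECRET}`) {
    return NextResponse.json({ error: 'unauthorized' }, { status: 401 });
  }

  const { tags } = (await req.json()) as { tags: string[] };
  for (const tag of tags) revalidateTag(tag);

  return NextResponse.json({ revalidated: tags });
}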

The Implementation

Files created:

/worker/
├── src/
│   ├── jobs/statsUpdateJob.ts        # Main processor
│   ├── lib/statsProcessor.ts         # Parallel executor
│   ├── lib/cache.ts                  # Cache invalidation client
│   └── types/jobTypes.ts             # TypeScript types
├── package.json
└── README.md

Database additions: a background_job_status table for tracking, a job queue with priority support, and retry counts with error logging.
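
A plausible shape for a row in that table, expressed as the TypeScript type the worker would use (field names are illustrative, not the actual schema):

// types/jobTypes.ts (sketch)
export type JobStatus = 'pending' | 'processing' | 'completed' | 'failed';

export interface BackgroundJob {
  id: string;
  job_type: string;                  // e.g. 'stats_update'
  payload: Record<string, unknown>;  // e.g. { matchId, tenantId }
  status: JobStatus;
  priority: number;                  // higher-priority jobs are claimed first
  attempts: number;                  // incremented on each retry
  last_error: string | null;         // surfaced in the admin UI
  created_at: string;
  updated_at: string;
}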

Total: about 2,000 lines of production code, replacing 1,200 lines of duplicated edge functions.

The Results

Metric              Before         After
User wait time      45+ seconds    < 1 second
Processing time     45 seconds     30-60 seconds
Timeout risk        High           None
Retry capability    None           Automatic
Monitoring          None           Full visibility

The processing still takes time, but users don't experience it. They click "Complete Match," see a success message, and move on. The stats update in the background.

What the AI Learned

The AI initially generated the worker without retry logic. Error handling was an afterthought. I had to explicitly request it.

The cache invalidation callback pattern took three attempts to get right. The AI's first version tried to import Next.js functions into the worker (doesn't work). The second version used the wrong authentication pattern. The third version worked.

The job monitoring UI was also an afterthought. In retrospect, it should have been designed upfront. Knowing job status is essential for debugging production issues.

The Broader Pattern

I talk more about the overall approach in How I Actually Vibe Code. The worker rewrite established a pattern used throughout the system.

The performance improvements that followed built on this foundation. React Query handles client-side caching; the worker handles server-side processing. Together, they transformed a slow application into a responsive one.

The booking system uses the same worker infrastructure for refund queue processing and webhook retries. The pattern proved reusable.

The Lesson

Edge functions and serverless architectures are convenient for simple operations. They become limiting for complex, long-running processes.

The shift from "11 edge functions called sequentially" to "1 worker processing jobs in parallel" wasn't just a performance improvement. It was an architectural upgrade that enabled features that couldn't have been built otherwise.

The 95% code duplication was a smell. The AI recognized it when asked to analyze the codebase. A human might have defended the duplication as "separation of concerns." The AI saw it as technical debt worth eliminating.

Sometimes the right solution isn't optimizing what you have. It's replacing it with something fundamentally different.
