Building Production APIs: Architecture Patterns That Scale

A production API needs five things beyond basic CRUD: authentication with proper token management, consistent error responses, rate limiting, request validation, and structured logging. Most APIs ship without at least two of these and pay for it later — usually when the first real user hits a race condition at 2 AM or a bad actor discovers there is no rate limit on the login endpoint.

After building APIs across 29 production projects at Mobibean, I have seen the same gaps repeat. Teams get the happy path working and ship. Then they spend months patching the things they skipped. This post covers the patterns that prevent those patches — concrete, with code, and prioritized by how much pain they save.

The 5 Non-Negotiable Patterns for Production APIs

Every API that serves real users needs these five things before launch. Not after. Not when you "get around to it." Before the first deploy.

1. Authentication With Proper Token Management

Short-lived JWT access tokens (15-minute expiry) paired with longer-lived refresh tokens are the standard pattern for most APIs. API keys work for server-to-server communication where there is no user session.

// JWT token pair generation
import jwt from "jsonwebtoken";

interface TokenPair {
  accessToken: string;
  refreshToken: string;
}

function generateTokenPair(userId: string): TokenPair {
  const accessToken = jwt.sign(
    { sub: userId, type: "access" },
    process.env.JWT_SECRET!,
    { expiresIn: "15m" }
  );

  const refreshToken = jwt.sign(
    { sub: userId, type: "refresh" },
    process.env.JWT_REFRESH_SECRET!,
    { expiresIn: "7d" }
  );

  return { accessToken, refreshToken };
}

Store refresh tokens in the database so you can revoke them. Never store JWTs in localStorage if your API serves a browser client — use httpOnly cookies instead.
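
One way to make refresh tokens revocable is to persist a hash of each token and delete the row when the token is used or revoked. The sketch below assumes an in-memory Map standing in for a database table; RefreshTokenStore and its methods are hypothetical names for illustration, not part of jsonwebtoken.

```typescript
import { createHash, randomBytes } from "node:crypto";

interface StoredRefreshToken {
  userId: string;
  expiresAt: number; // epoch milliseconds
}

// In-memory stand-in for a refresh_tokens table (hypothetical schema).
class RefreshTokenStore {
  private readonly byHash = new Map<string, StoredRefreshToken>();

  // Persist only a hash, so a leaked table does not expose usable tokens.
  private static hash(token: string): string {
    return createHash("sha256").update(token).digest("hex");
  }

  save(token: string, userId: string, ttlMs: number, now: number = Date.now()): void {
    this.byHash.set(RefreshTokenStore.hash(token), { userId, expiresAt: now + ttlMs });
  }

  // Returns the owning user ID and deletes the token (single use).
  // Returns null if the token is unknown, revoked, or expired.
  consume(token: string, now: number = Date.now()): string | null {
    const key = RefreshTokenStore.hash(token);
    const row = this.byHash.get(key);
    if (!row) return null;
    this.byHash.delete(key);
    return row.expiresAt > now ? row.userId : null;
  }

  // Revoke every session for a user, for example after a password change.
  revokeAllForUser(userId: string): void {
    this.byHash.forEach((row, key) => {
      if (row.userId === userId) this.byHash.delete(key);
    });
  }
}

const store = new RefreshTokenStore();
const token = randomBytes(32).toString("hex"); // could equally be the signed refresh JWT
store.save(token, "user_1", 7 * 24 * 60 * 60 * 1000);
```

Deleting the row on use gives you refresh token rotation: every refresh issues a new pair, and the old refresh token stops working.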

2. Consistent Error Response Format

Every error from your API should have the same shape. Every single one. Frontend developers should never have to guess what an error response looks like.

// Standard error response — same shape, always
interface ApiError {
  error: {
    code: string;        // machine-readable: "VALIDATION_ERROR"
    message: string;     // human-readable: "Email is required"
    details?: unknown[];  // field-level errors, stack trace in dev
    requestId: string;   // for support tickets and log correlation
  };
}
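
A small factory keeps that shape consistent, because no handler builds the object by hand. This is a minimal sketch; apiError is a hypothetical helper name, and the interface is repeated so the snippet stands alone.

```typescript
interface ApiError {
  error: {
    code: string;
    message: string;
    details?: unknown[];
    requestId: string;
  };
}

// Builds the standard error body. The details key is omitted entirely
// when there is nothing to report, so clients never see an empty array.
function apiError(
  code: string,
  message: string,
  requestId: string,
  details?: unknown[]
): ApiError {
  const body: ApiError = { error: { code, message, requestId } };
  if (details && details.length > 0) {
    body.error.details = details;
  }
  return body;
}

const notFound = apiError("NOT_FOUND", "User not found", "req_abc123");
```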

3. Rate Limiting

Without rate limiting, one aggressive client can take down your entire API. Implement per-user limits and global limits separately.

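// Rate limiting middleware (rate-limiter-flexible counts requests in fixed windows)
import { RateLimiterRedis, RateLimiterRes } from "rate-limiter-flexible";
import { Redis } from "ioredis";

const redis = new Redis(process.env.REDIS_URL!);

const rateLimiter = new RateLimiterRedis({
  storeClient: redis,
  keyPrefix: "rl",
  points: 100,       // 100 requests
  duration: 60,      // per 60 seconds
  blockDuration: 60, // block for 60s if exceeded
});

async function rateLimitMiddleware(req, res, next) {
  try {
    await rateLimiter.consume(req.userId || req.ip);
    next();
  } catch (rejection) {
    // consume() rejects with an Error when the store itself fails (for
    // example, Redis is unreachable). That is not a rate limit hit, so
    // pass it to the error handler instead of answering 429.
    if (rejection instanceof Error) {
      return next(rejection);
    }
    // Otherwise the rejection describes the exceeded limit.
    const limit = rejection as RateLimiterRes;
    const retryAfterSeconds = Math.ceil(limit.msBeforeNext / 1000);
    res.set("Retry-After", String(retryAfterSeconds));
    res.status(429).json({
      error: {
        code: "RATE_LIMIT_EXCEEDED",
        message: `Too many requests. Try again in ${retryAfterSeconds} seconds.`,
        requestId: req.id,
      },
    });
  }
}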
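
The middleware above applies one limit per client. To show what "per-user and global limits separately" can look like, here is a dependency-free sketch using an in-memory fixed-window counter. FixedWindowLimiter, allowRequest, and the limit values are assumptions for illustration; with multiple API instances the counters would need to live in a shared store such as Redis.

```typescript
interface WindowCounter {
  windowStart: number; // epoch milliseconds
  count: number;
}

class FixedWindowLimiter {
  private readonly counters = new Map<string, WindowCounter>();
  private readonly points: number;
  private readonly durationMs: number;

  constructor(points: number, durationMs: number) {
    this.points = points;
    this.durationMs = durationMs;
  }

  // Returns true if the request is allowed, false if the key is over its limit.
  tryConsume(key: string, now: number = Date.now()): boolean {
    const entry = this.counters.get(key);
    if (!entry || now - entry.windowStart >= this.durationMs) {
      // First request for this key, or the previous window has ended.
      this.counters.set(key, { windowStart: now, count: 1 });
      return true;
    }
    if (entry.count >= this.points) return false;
    entry.count += 1;
    return true;
  }
}

const perUserLimiter = new FixedWindowLimiter(100, 60_000);  // 100 per minute per user
const globalLimiter = new FixedWindowLimiter(5_000, 60_000); // 5,000 per minute overall

function allowRequest(userKey: string, now: number = Date.now()): boolean {
  // Check the per-user limit first so one noisy client is rejected
  // before it can use up the shared global budget.
  if (!perUserLimiter.tryConsume(userKey, now)) return false;
  return globalLimiter.tryConsume("global", now);
}
```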

4. Input Validation at the Boundary

Validate every request before it reaches your business logic. Zod or a similar schema validation library makes this straightforward.

import { z } from "zod";

const CreateUserSchema = z.object({
  email: z.string().email("Invalid email format"),
  name: z.string().min(1, "Name is required").max(100),
  role: z.enum(["admin", "member", "viewer"]),
});

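// Validate at the route handler, not inside business logic
app.post("/users", async (req, res, next) => {
  const result = CreateUserSchema.safeParse(req.body);
  if (!result.success) {
    return res.status(400).json({
      error: {
        code: "VALIDATION_ERROR",
        message: "Invalid request body",
        details: result.error.issues,
        requestId: req.id,
      },
    });
  }

  try {
    // result.data is now typed and validated
    // (assumes userService.create resolves with the created user)
    const user = await userService.create(result.data);
    return res.status(201).json(user);
  } catch (err) {
    // Hand unexpected failures to the error-handling middleware
    return next(err);
  }
});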

5. Structured Logging With Correlation IDs

When something breaks in production, you need to trace a request through your entire system. Assign a unique ID to every request and include it in every log entry.

import { randomUUID } from "crypto";
import pino from "pino";

const logger = pino();

function requestIdMiddleware(req, res, next) {
  req.id = req.headers["x-request-id"] || randomUUID();
  res.setHeader("x-request-id", req.id);
  req.log = logger.child({ requestId: req.id, userId: req.userId });
  next();
}

This costs almost nothing to implement and saves hours during every incident.
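
Passing req.log through every function call gets tedious in deeper service code. One option, sketched here with Node's built-in AsyncLocalStorage, is to keep the request ID in an async context so any log call can pick it up. runWithRequestId and logLine are hypothetical helpers, and logLine returns the JSON line instead of writing it so the example stays easy to inspect.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

interface RequestContext {
  requestId: string;
}

const requestContext = new AsyncLocalStorage<RequestContext>();

// Runs fn with the request ID available to everything it calls,
// including code that runs later in the same async chain.
function runWithRequestId<T>(requestId: string, fn: () => T): T {
  return requestContext.run({ requestId }, fn);
}

// Builds one structured log line, tagged with the current request ID
// (null when called outside of a request).
function logLine(message: string, fields: Record<string, unknown> = {}): string {
  const ctx = requestContext.getStore();
  return JSON.stringify({
    requestId: ctx ? ctx.requestId : null,
    msg: message,
    ...fields,
  });
}

const line = runWithRequestId("req_abc123", () => logLine("user created", { userId: "u_1" }));
```

In an Express app, the middleware would call runWithRequestId(req.id, next) so the rest of the request runs inside the context.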

API Architecture Decision Table

These are the decisions you face at the start of every API project. The right answer depends on context, not dogma.

Decision	Option A	Option B	When to Choose A	When to Choose B
Protocol	REST	GraphQL	Multiple consumers with simple data needs; team knows REST well	Single frontend with complex nested data; need to reduce over-fetching
Architecture	Monolith	Microservices	Team under 10; product still finding market fit; MVP stage	Clear domain boundaries; independent deployment needed; team over 20
Database	PostgreSQL	MongoDB	Relational data; need transactions; most SaaS apps	Document-oriented data; very early prototyping; schema changes constantly
Auth	JWT tokens	Session cookies	Mobile clients; API-first; third-party consumers	Server-rendered apps; simpler security model; single domain
Versioning	URL path (/v1/)	Header-based	Public APIs; clear consumer expectations	Internal APIs; gradual migration; fewer consumers
Hosting	Serverless (Lambda, Vercel)	Containers (ECS, Fly.io)	Unpredictable traffic; event-driven workloads	Steady traffic; WebSockets; long-running processes
Caching	Redis	In-memory (node-cache)	Multiple instances; shared state needed	Single instance; simple caching needs
Task processing	Queue (SQS, BullMQ)	Synchronous	Email, webhooks, reports, anything slow	Only when response time is under 200ms anyway

For most SaaS applications we build: REST, monolith, PostgreSQL, JWT, URL versioning, containers, Redis. That stack handles the first 100K users without rearchitecting.

Database Design for API-First Applications

Start with PostgreSQL. This is not a controversial opinion — it is the correct default for 90% of API projects. PostgreSQL gives you relational integrity, JSON columns for flexible data, full-text search, and an ecosystem of tooling that no other database matches.

Connection Pooling From Day One

Every database connection consumes memory on both your application server and the database. Without a connection pool, each API request opens a new connection, and you hit the database connection limit under moderate load.

Use PgBouncer or your ORM's built-in pooling. Configure this before your first deploy, not after you get a "too many connections" error in production.
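
Pool size is mostly arithmetic: the database's connection limit, minus what you reserve for migrations and admin sessions, divided by the number of API instances. A rough sketch, where poolSizePerInstance is a hypothetical helper and the reserved count is an assumption to tune for your setup:

```typescript
// Returns the maximum pool size each API instance can use without the
// fleet as a whole exceeding the database connection limit.
function poolSizePerInstance(
  maxConnections: number,
  reservedConnections: number,
  instanceCount: number
): number {
  if (instanceCount < 1) {
    throw new RangeError("instanceCount must be at least 1");
  }
  const perInstance = Math.floor((maxConnections - reservedConnections) / instanceCount);
  if (perInstance < 1) {
    throw new RangeError("Not enough database connections for this many instances");
  }
  return perInstance;
}

// PostgreSQL's default max_connections is 100. Reserving 10 for migrations
// and admin sessions leaves 90 to share across 6 API instances.
const poolMax = poolSizePerInstance(100, 10, 6); // 15
```

If the result is too small to be useful, that is the signal to put PgBouncer in front of the database instead of raising each instance's pool size.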

Migrations Over Manual Schema Changes

Never modify a production database schema by hand. Use a migration tool (Prisma Migrate, Drizzle Kit, Knex migrations) and version your schema changes alongside your code. Every schema change should be reviewable in a pull request.

Index Strategy

Start with indexes on your primary query patterns: foreign keys, columns used in WHERE clauses, and columns used for sorting. Do not add indexes preemptively on every column — each index slows down writes and uses disk space.

Use EXPLAIN ANALYZE on your slowest queries and add indexes based on actual data, not guesses.

When to Add Redis

Add Redis when you need one of these:

Caching: Frequently read, rarely changing data (user profiles, configuration)
Sessions or rate limiting: Shared state across multiple API instances
Queues: Background job processing with BullMQ
Pub/sub: Real-time features across instances

Do not add Redis "just in case." It is another piece of infrastructure to monitor, back up, and keep running. Add it when you have a specific need.
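
For the caching case, the usual pattern is cache-aside: check the cache, fall back to the database on a miss, and store the result with a TTL. The sketch below uses an in-memory Map as a stand-in for Redis so it runs without infrastructure; TtlCache and getOrLoad are hypothetical names.

```typescript
interface CacheEntry<T> {
  value: T;
  expiresAt: number; // epoch milliseconds
}

class TtlCache<T> {
  private readonly entries = new Map<string, CacheEntry<T>>();
  private readonly ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Cache-aside read: return a fresh cached value if there is one,
  // otherwise call load() (the database query) and cache its result.
  async getOrLoad(key: string, load: () => Promise<T>, now: number = Date.now()): Promise<T> {
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > now) return hit.value;
    const value = await load();
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
    return value;
  }

  // Call this after a write so readers do not see stale data until the TTL ends.
  invalidate(key: string): void {
    this.entries.delete(key);
  }
}

const profileCache = new TtlCache<{ name: string }>(60_000);
```

This version does not deduplicate concurrent misses for the same key, so a cold cache under load can still send several identical queries to the database.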

Error Handling That Does Not Drive Frontend Developers Crazy

Bad error handling is the number one complaint frontend developers have about backend APIs. The fix is straightforward: be consistent and be specific.

Standard Error Response Shape

Every error response from your API should follow this structure:

{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Invalid request body",
    "details": [
      { "field": "email", "message": "Invalid email format" },
      { "field": "name", "message": "Name is required" }
    ],
    "requestId": "req_abc123"
  }
}

The code is machine-readable (frontend uses it for conditional logic). The message is human-readable (for logs and debugging). The details array carries field-level errors for forms.

HTTP Status Codes: Use Them Correctly

Status Code	Meaning	When to Use
200	OK	Successful GET, PUT, PATCH
201	Created	Successful POST that created a resource
204	No Content	Successful DELETE
400	Bad Request	Validation errors, malformed JSON
401	Unauthorized	Missing or invalid authentication
403	Forbidden	Authenticated but insufficient permissions
404	Not Found	Resource does not exist
409	Conflict	Duplicate entry, state conflict
422	Unprocessable Entity	Valid JSON but business logic rejection
429	Too Many Requests	Rate limit exceeded
500	Internal Server Error	Unhandled exception (never return details to client)

The distinction between 401 and 403 matters. 401 means "I don't know who you are." 403 means "I know who you are, and you can't do this." Frontend developers use this distinction to decide whether to redirect to login (401) or show a permissions error (403).

Return All Validation Errors at Once

Nothing frustrates a user more than fixing one form error, submitting again, and getting a different error. Validate the entire request body and return every error in a single response. The Zod example above does this by default.
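
Validation libraries typically report each issue with a path and a message. To get from there to the { field, message } entries in the standard error shape, a small mapper is enough. This is a sketch: IssueLike is a hypothetical type modeled on the path and message fields of a Zod issue, and toFieldErrors is an illustrative name.

```typescript
interface IssueLike {
  path: (string | number)[];
  message: string;
}

interface FieldError {
  field: string;
  message: string;
}

// Converts every issue into a field-level error, preserving order.
// Nested paths become dotted names, for example "address.zip".
function toFieldErrors(issues: IssueLike[]): FieldError[] {
  return issues.map((issue) => ({
    field: issue.path.length > 0 ? issue.path.join(".") : "(root)",
    message: issue.message,
  }));
}

const details = toFieldErrors([
  { path: ["email"], message: "Invalid email format" },
  { path: ["address", "zip"], message: "Required" },
]);
```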

Never Leak Internal Errors

In development, include stack traces in error responses. In production, log them server-side and return a generic message to the client. A stack trace in a 500 response tells attackers which framework you use, what your file structure looks like, and where to probe next.
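
A sketch of that split, assuming a hypothetical AppError class for errors that are safe to show to clients. Anything else is treated as unexpected and collapsed into a generic 500. The caller is still responsible for logging the original error server-side.

```typescript
// Errors thrown on purpose, with a status and code that are safe to expose.
class AppError extends Error {
  readonly status: number;
  readonly code: string;

  constructor(status: number, code: string, message: string) {
    super(message);
    // Keeps instanceof working when compiling to older JavaScript targets.
    Object.setPrototypeOf(this, AppError.prototype);
    this.status = status;
    this.code = code;
  }
}

interface ErrorBody {
  error: { code: string; message: string; requestId: string; stack?: string };
}

function toErrorResponse(
  err: unknown,
  requestId: string,
  isProduction: boolean
): { status: number; body: ErrorBody } {
  if (err instanceof AppError) {
    return {
      status: err.status,
      body: { error: { code: err.code, message: err.message, requestId } },
    };
  }
  // Unknown failure: never echo its message, it may contain internals.
  const body: ErrorBody = {
    error: { code: "INTERNAL_ERROR", message: "Something went wrong.", requestId },
  };
  if (!isProduction && err instanceof Error && err.stack) {
    body.error.stack = err.stack; // development only
  }
  return { status: 500, body };
}
```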

Versioning Without Pain

Most APIs do not need versioning on day one. Add it when you need to make a breaking change and you have consumers who cannot update immediately.

URL Versioning: The Simple Default

GET /v1/users/123
GET /v2/users/123

URL versioning is visible, cacheable, and easy to understand. Every developer who sees /v1/ in a URL knows what it means. Header-based versioning (for example, Accept: application/vnd.example.v2+json) is more "correct" according to REST purists, but it is harder to test, harder to cache (responses must vary on the header), and harder to explain to new team members.

For public APIs, use URL versioning. For internal APIs where you control all consumers, you often do not need versioning at all — just coordinate deployments.

How to Sunset Old Versions

Add a Sunset header (RFC 8594) to responses from the old version with the date the version will be shut down
Add a Deprecation header with the deprecation date, plus a Link header pointing to migration documentation
Send email notifications to API key owners 90, 60, and 30 days before shutdown
Monitor traffic to the old version — do not shut it down while it still receives significant traffic
After the sunset date, return 410 Gone instead of silently breaking
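
The headers from the first two steps can be generated by one helper. This is a sketch under assumptions: deprecationHeaders is a hypothetical function, the Sunset value uses the HTTP date format from RFC 8594, and the Deprecation value uses the "@" plus Unix timestamp form as I understand it from RFC 9745. Check both against the specifications before relying on the exact syntax.

```typescript
// Builds the response headers an old API version should send
// between its deprecation date and its shutdown date.
function deprecationHeaders(
  deprecatedAt: Date,
  sunsetAt: Date,
  migrationGuideUrl: string
): Record<string, string> {
  return {
    // Date the version was (or will be) marked deprecated, in Unix seconds.
    Deprecation: `@${Math.floor(deprecatedAt.getTime() / 1000)}`,
    // Date after which the version stops responding normally.
    Sunset: sunsetAt.toUTCString(),
    // Where consumers can read how to migrate.
    Link: `<${migrationGuideUrl}>; rel="deprecation"`,
  };
}

const headers = deprecationHeaders(
  new Date("2025-01-01T00:00:00Z"),
  new Date("2025-07-01T00:00:00Z"),
  "https://example.com/docs/v2-migration" // placeholder URL
);
```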

When You Actually Need Versioning

You need versioning when you must change a response shape or remove a field that existing clients depend on. Adding new fields to a response is not a breaking change — existing clients simply ignore fields they do not recognize. Adding optional request parameters is not a breaking change either. Version when you must break the contract, not when you add to it.

Testing Production APIs

Unit tests for individual functions matter less for APIs than integration tests that verify the full request-response cycle. Your users do not call your internal functions — they call your HTTP endpoints.

Integration Tests First

Write tests that send real HTTP requests to your API and verify the response. Test the entire path: middleware, validation, business logic, database, response serialization.

import request from "supertest";
import { app } from "../src/app"; // hypothetical path to your Express app

describe("POST /v1/users", () => {
  it("creates a user with valid input", async () => {
    const res = await request(app)
      .post("/v1/users")
      .send({ email: "test@example.com", name: "Test", role: "member" })
      .expect(201);

    expect(res.body).toMatchObject({
      id: expect.any(String),
      email: "test@example.com",
    });
  });

  it("returns 400 with validation errors for invalid input", async () => {
    const res = await request(app)
      .post("/v1/users")
      // Invalid email and missing name: exactly two validation issues
      .send({ email: "not-an-email", role: "member" })
      .expect(400);

    expect(res.body.error.code).toBe("VALIDATION_ERROR");
    expect(res.body.error.details).toHaveLength(2);
  });
});

Contract Testing Between Services

If your API is consumed by other services, use contract testing (Pact is the standard tool) to verify that both sides agree on the request and response format. This catches breaking changes before they reach production.

Load Testing Before Launch

Run load tests with realistic traffic patterns before every launch. k6 and Artillery are both good options. Test with expected peak traffic, not average traffic. If your API serves a product that sends marketing emails, test with 10x normal traffic — that is what happens when 50,000 people click a link at the same time.

Health Check Endpoints

Every API needs a /health endpoint that returns 200 when the service is running and can reach its database. Load balancers, monitoring systems, and deployment pipelines all depend on this.

Add a /health/detailed endpoint (behind authentication) that reports database connection status, Redis status, queue depth, and memory usage. This saves time during incidents.
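
A sketch of the logic behind such an endpoint, assuming each dependency exposes a check that resolves when healthy and rejects otherwise. runHealthChecks and the check names are hypothetical; in a real service the checks would be something like a SELECT 1 against the database and a PING against Redis.

```typescript
// A check resolves if the dependency is reachable and rejects if it is not.
type HealthCheck = () => Promise<void>;

interface HealthReport {
  status: 200 | 503;
  checks: Record<string, "ok" | "fail">;
}

// Runs all checks in parallel. One failing dependency marks the whole
// service unhealthy, but every check still gets reported.
async function runHealthChecks(checks: Record<string, HealthCheck>): Promise<HealthReport> {
  const names = Object.keys(checks);
  const results = await Promise.all(
    names.map((name) =>
      checks[name]().then(
        () => "ok" as const,
        () => "fail" as const
      )
    )
  );

  const report: Record<string, "ok" | "fail"> = {};
  names.forEach((name, index) => {
    report[name] = results[index];
  });

  const healthy = results.every((result) => result === "ok");
  return { status: healthy ? 200 : 503, checks: report };
}
```

The plain /health route only needs the status; the authenticated /health/detailed route can return the full report. In practice each check should also have a timeout so a hung dependency cannot hang the health endpoint.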

Common API Mistakes We See in Code Audits

These are the patterns we find most often when reviewing APIs built by other teams. Every one of them causes real production problems. If you are building an API for a SaaS product or preparing for launch, audit your code for these before shipping.

No Pagination

Returning all records from a database query works fine with 50 rows. It crashes your server with 50,000. Every list endpoint needs pagination from day one — cursor-based for large datasets, offset-based if you need page numbers.
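
Cursor-based pagination boils down to "give me the next N rows after this ID." The sketch below shows the logic against an in-memory array so it runs as-is; in SQL the same idea is WHERE id > cursor ORDER BY id LIMIT n. paginateById is a hypothetical helper, and it assumes rows are already sorted by a unique, sortable ID.

```typescript
interface Page<T> {
  items: T[];
  nextCursor: string | null; // null means there are no more pages
}

function paginateById<T extends { id: string }>(
  sortedRows: T[],
  limit: number,
  cursor: string | null = null
): Page<T> {
  if (limit < 1) throw new RangeError("limit must be at least 1");

  let start = 0;
  if (cursor !== null) {
    const after: string = cursor;
    // First row whose ID sorts strictly after the cursor.
    start = sortedRows.findIndex((row) => row.id > after);
    if (start === -1) return { items: [], nextCursor: null };
  }

  const items = sortedRows.slice(start, start + limit);
  const hasMore = start + limit < sortedRows.length;
  return {
    items,
    // The cursor is the last ID on this page; the client sends it back as-is.
    nextCursor: hasMore ? items[items.length - 1].id : null,
  };
}

const rows = ["a", "b", "c", "d", "e"].map((id) => ({ id }));
const firstPage = paginateById(rows, 2);
const secondPage = paginateById(rows, 2, firstPage.nextCursor);
```

Unlike offset pagination, this stays stable when rows are inserted or deleted between requests, because the position is tied to an ID and not to a row count.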

N+1 Queries

Fetching a list of users and then making a separate database query for each user's profile is the classic N+1. Use eager loading (Prisma include, SQL JOIN) or dataloader patterns to collapse these into one query, or a small fixed number of batched queries, regardless of how many users are in the list.
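
The batching idea, stripped of any ORM: collect the IDs, fetch all related rows in one round trip, and join them in memory. In this sketch attachProfiles is a hypothetical helper, and the fetchProfiles argument stands in for a query like SELECT ... WHERE user_id IN (...).

```typescript
interface User {
  id: string;
  name: string;
}

interface Profile {
  userId: string;
  bio: string;
}

// One round trip for many user IDs (hypothetical data-access function).
type FetchProfilesByUserIds = (userIds: string[]) => Promise<Profile[]>;

async function attachProfiles(
  users: User[],
  fetchProfiles: FetchProfilesByUserIds
): Promise<Array<User & { profile: Profile | null }>> {
  if (users.length === 0) return [];

  // A single batched query instead of one query per user.
  const profiles = await fetchProfiles(users.map((user) => user.id));

  const byUserId = new Map<string, Profile>();
  profiles.forEach((profile) => byUserId.set(profile.userId, profile));

  // Users without a profile get null instead of being dropped.
  return users.map((user) => ({ ...user, profile: byUserId.get(user.id) ?? null }));
}
```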

No Request Timeouts

If your API calls a third-party service and that service hangs, your API hangs too — and holds a database connection the entire time. Set timeouts on every external HTTP call and every database query. Five seconds is a reasonable default for most external calls.
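
A generic timeout wrapper is a few lines. This is a minimal sketch; withTimeout is a hypothetical helper name, and it only stops waiting for the work, it does not cancel it.

```typescript
// Rejects if the work has not settled within ms milliseconds.
// The timer is always cleared so it cannot keep the process alive.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      reject(new Error(`${label} timed out after ${ms}ms`));
    }, ms);

    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error: unknown) => {
        clearTimeout(timer);
        reject(error);
      }
    );
  });
}
```

For outgoing HTTP calls, prefer real cancellation where the client supports it. With fetch on recent Node versions, passing signal: AbortSignal.timeout(5000) aborts the request itself, which also frees the socket.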

Synchronous Email and Notification Sending

Sending an email inside a request handler means your user waits for the SMTP server to respond before they get their API response. Move email, push notifications, webhook deliveries, and report generation to a background queue. The user gets a fast response; the email goes out seconds later.
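
The shape of that change, sketched with an in-memory queue so it runs without infrastructure. InMemoryQueue, signUp, and the job fields are hypothetical; a real setup would use BullMQ or SQS with a worker in a separate process, plus retries for failed jobs, which this sketch leaves out.

```typescript
interface EmailJob {
  to: string;
  template: string;
}

class InMemoryQueue<T> {
  private readonly jobs: T[] = [];

  enqueue(job: T): void {
    this.jobs.push(job);
  }

  // Processes everything currently queued, in order, and returns the count.
  async drain(handler: (job: T) => Promise<void>): Promise<number> {
    let processed = 0;
    while (this.jobs.length > 0) {
      const job = this.jobs.shift() as T;
      await handler(job);
      processed += 1;
    }
    return processed;
  }
}

const emailQueue = new InMemoryQueue<EmailJob>();

// Request handler logic: record the job and respond immediately.
// No SMTP call happens on the request path.
function signUp(email: string): { status: number; body: { email: string } } {
  emailQueue.enqueue({ to: email, template: "welcome" });
  return { status: 201, body: { email } };
}
```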

Business Logic in Route Handlers

Route handlers should validate input, call a service function, and format the response. Business logic belongs in a service layer that can be tested independently and reused across different routes, background jobs, and CLI commands.

Frequently Asked Questions

REST or GraphQL — which should I choose?

Choose REST if you have multiple consumers (mobile app, web app, third-party integrations) with straightforward data needs. REST is simpler to cache, simpler to monitor, and every developer understands it. Choose GraphQL if you have a single frontend that needs to fetch deeply nested data and you want to avoid multiple round trips. For most projects we build using AI-augmented development, REST is the right default. GraphQL adds operational complexity (schema management, query cost analysis, N+1 prevention) that is only justified when you have the specific problems it solves.

How many endpoints is too many?

There is no magic number, but if your API has more than 50 endpoints and is maintained by a small team, you probably have too many. The symptom is not the endpoint count itself — it is that nobody can remember all the endpoints, documentation falls behind, and inconsistencies creep in. Group related endpoints behind clear resource boundaries and consider whether some functionality should be a separate service.

When should I move from a monolith to microservices?

Not until the monolith is causing specific, measurable problems. The most common trigger is deployment speed: when deploying one feature requires testing and deploying the entire application, and your team is large enough that this happens multiple times per day. Another trigger is scaling: when one part of your system needs to handle 100x the traffic of the rest, breaking it into a separate service lets you scale independently. If your team is under 10 people and your deploy pipeline takes less than 15 minutes, a monolith is almost certainly the better choice.

How do I handle API authentication for mobile apps?

Use the same JWT access/refresh token pattern described above, but store tokens in the device's secure storage (Keychain on iOS, Keystore on Android). Never store tokens in plain SharedPreferences or AsyncStorage. Implement token refresh automatically in your HTTP client so the user is never forced to re-login unless the refresh token expires. Set refresh token expiry to 30-90 days for mobile apps, since users expect persistent sessions.
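
Automatic refresh in the HTTP client can be sketched like this. AuthenticatedClient, SendRequest, and RefreshTokens are hypothetical names; the transport and the refresh call are injected so the example does not depend on a particular HTTP library or storage API.

```typescript
interface HttpResponse {
  status: number;
  body: unknown;
}

type SendRequest = (path: string, accessToken: string) => Promise<HttpResponse>;
type RefreshTokens = () => Promise<string>; // resolves with a new access token

class AuthenticatedClient {
  private accessToken: string;
  private refreshing: Promise<string> | null = null;
  private readonly send: SendRequest;
  private readonly refresh: RefreshTokens;

  constructor(initialAccessToken: string, send: SendRequest, refresh: RefreshTokens) {
    this.accessToken = initialAccessToken;
    this.send = send;
    this.refresh = refresh;
  }

  async request(path: string): Promise<HttpResponse> {
    const first = await this.send(path, this.accessToken);
    if (first.status !== 401) return first;
    // Retry exactly once with a fresh token. A second 401 goes back to
    // the caller, which should then send the user to the login screen.
    const token = await this.refreshOnce();
    return this.send(path, token);
  }

  // Concurrent 401s share one refresh call instead of each starting their own.
  private refreshOnce(): Promise<string> {
    if (this.refreshing === null) {
      const clear = () => {
        this.refreshing = null;
      };
      this.refreshing = this.refresh().then(
        (token) => {
          this.accessToken = token;
          clear();
          return token;
        },
        (error: unknown) => {
          clear();
          throw error;
        }
      );
    }
    return this.refreshing;
  }
}
```

Sharing one in-flight refresh matters most when refresh tokens are single use: two parallel refresh calls would otherwise invalidate each other.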

Production API architecture is not about choosing the trendiest framework or following every best practice blog post. It is about making the five or six decisions that prevent the most common failures and implementing them before launch. Get authentication, error handling, rate limiting, validation, and logging right, and you have an API that can grow with your product instead of holding it back.

If you are building an API and want architecture guidance from someone who has shipped this 29 times, reach out. We offer API development as a standalone service and as part of full SaaS builds.