Designing a Scalable Payment System

Tags: payments, distributed systems, architecture, golang

Designing a payment system requires high consistency, reliability, and security. In this article, we'll design a high-level architecture for processing payments at enterprise scale, focusing on Idempotency, Transactional Integrity, and Asynchronous Processing.

Core Engineering Principles

[!IMPORTANT] In financial systems, Reliability > Latency. It is better to wait 500ms for a confirmed transaction than to have a 50ms response that might lead to double charging or lost records.

  1. Idempotency: Every payment request must carry a unique idempotency_key. If a network timeout occurs and the client retries, the key lets us recognize the retry and avoid charging the customer twice.
  2. ACID Transactions: Financial records must be atomic and consistent. We use relational databases (PostgreSQL/MySQL) with strict locking for ledger updates.
  3. Scalable State Machine: A payment goes through several states: PENDING → PROCESSING → SUCCEEDED / FAILED.
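
The state machine in point 3 can be enforced with a small transition table. The following is a minimal sketch, not the article's implementation: the type and function names (`Status`, `CanTransition`, `Transition`) are hypothetical, and the set of allowed edges is an assumption read off the state list above.

```go
package main

import "fmt"

// Status is a payment state. The values mirror the states named in the article.
type Status string

const (
	Pending    Status = "PENDING"
	Processing Status = "PROCESSING"
	Succeeded  Status = "SUCCEEDED"
	Failed     Status = "FAILED"
)

// allowed maps each state to the states it may move to (assumed edges).
// Terminal states (SUCCEEDED, FAILED) have no outgoing edges.
var allowed = map[Status][]Status{
	Pending:    {Processing},
	Processing: {Succeeded, Failed},
}

// CanTransition reports whether moving from one state to another is legal.
func CanTransition(from, to Status) bool {
	for _, next := range allowed[from] {
		if next == to {
			return true
		}
	}
	return false
}

// Transition returns the new state, or the old state and an error if the move is illegal.
func Transition(from, to Status) (Status, error) {
	if !CanTransition(from, to) {
		return from, fmt.Errorf("illegal transition %s -> %s", from, to)
	}
	return to, nil
}

func main() {
	s, err := Transition(Pending, Processing)
	fmt.Println(s, err) // PROCESSING <nil>

	_, err = Transition(Succeeded, Pending)
	fmt.Println(err) // illegal transition SUCCEEDED -> PENDING
}
```

Keeping the legal edges in one table means a bug elsewhere cannot silently move a finished payment back into an active state.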

High-Level Architecture

The architecture follows a hexagonal pattern to decouple our core logic from external payment providers and downstream consumers.

Database Schema (ERD)

A robust payment system starts with a well-designed schema for auditability.

Implementation: Idempotency in Golang

We use Redis to store and validate idempotency keys quickly, before the request reaches the relational database.

func (s *PaymentService) ProcessPayment(ctx context.Context, req *PaymentRequest) (*PaymentResponse, error) {
    // 1. Claim the idempotency key in Redis.
    // SETNX (set if not exists) is atomic: it reports true only for the first caller.
    acquired, err := s.redis.SetNX(ctx, req.IdempotencyKey, "PROCESSING", 30*time.Minute).Result()
    if err != nil {
        return nil, fmt.Errorf("idempotency check failed: %w", err)
    }
    if !acquired {
        // The key is already held, so this is a duplicate or concurrent request.
        // A fuller version would log the attempt and return the stored result, if any.
        return nil, ErrDuplicateRequest
    }

    // 2. Wrap the writes in a DB transaction.
    err = s.db.WithTransaction(func(tx *sql.Tx) error {
        // a. Create PENDING record
        // b. Record audit log
        return nil
    })
    if err != nil {
        // Best-effort release so the client can retry; the 30-minute TTL is the fallback.
        _ = s.redis.Del(ctx, req.IdempotencyKey).Err()
        return nil, err
    }

    // The provider call and state transitions are omitted in this sketch.
    return &PaymentResponse{Status: "SUCCESS"}, nil
}
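
The duplicate branch above only returns an error, although its comment mentions returning the previously stored result. One way to do that is to record the final result under the key once processing finishes. The sketch below uses an in-memory map as a stand-in for Redis so that it runs on its own; `Store`, `Begin`, `Complete`, and `Release` are hypothetical names, not part of any Redis client.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrInProgress means another attempt currently holds the key.
var ErrInProgress = errors.New("request with this idempotency key is still in progress")

// entry is what we remember for one idempotency key.
type entry struct {
	done   bool
	result string
}

// Store is an in-memory stand-in for Redis (an assumption for illustration).
type Store struct {
	mu      sync.Mutex
	entries map[string]*entry
}

func NewStore() *Store { return &Store{entries: make(map[string]*entry)} }

// Begin claims the key. It returns claimed=true for the first caller,
// the stored result when a finished request is retried, and
// ErrInProgress when another attempt still holds the key.
func (s *Store) Begin(key string) (claimed bool, result string, err error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	e, ok := s.entries[key]
	if !ok {
		s.entries[key] = &entry{}
		return true, "", nil
	}
	if e.done {
		return false, e.result, nil
	}
	return false, "", ErrInProgress
}

// Complete stores the final result so later retries can replay it.
func (s *Store) Complete(key, result string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.entries[key] = &entry{done: true, result: result}
}

// Release drops the claim after a failure so the client may retry.
func (s *Store) Release(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.entries, key)
}

func main() {
	s := NewStore()
	claimed, _, _ := s.Begin("key-1")
	fmt.Println(claimed) // true

	_, _, err := s.Begin("key-1")
	fmt.Println(err) // request with this idempotency key is still in progress

	s.Complete("key-1", "SUCCEEDED")
	claimed, result, _ := s.Begin("key-1")
	fmt.Println(claimed, result) // false SUCCEEDED
}
```

With Redis, the same three outcomes would map to a successful SETNX, a read of a stored final value, and a read of the "PROCESSING" marker respectively.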

Failure Mode Analysis

| Scenario | Impact | Mitigation Strategy |
| --- | --- | --- |
| Provider Timeout | Unknown State | Polling/Webhook Reconciliation. Call the provider's status API before retrying. |
| Database Down | Service Outage | Local Buffering/Outbox Pattern. Store requests in a persistent queue temporarily. |
| Kafka Delay | Stale Data | Eventual Consistency. Use unique transaction IDs for consumer-side idempotency. |
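
Consumer-side idempotency, the mitigation named for delayed or redelivered events, can be as simple as remembering which transaction IDs have already been applied. This is a rough in-memory sketch: `Event` and `Ledger` are hypothetical types, and a real consumer would persist the seen-set, ideally in the same database transaction as the balance update.

```go
package main

import "fmt"

// Event is a hypothetical message shape; TransactionID is the deduplication key.
type Event struct {
	TransactionID string
	AmountMinor   int64
}

// Ledger applies each transaction at most once, even if it is delivered repeatedly.
type Ledger struct {
	seen    map[string]struct{}
	balance int64
}

func NewLedger() *Ledger { return &Ledger{seen: make(map[string]struct{})} }

// Apply returns true if the event changed the balance and false for a duplicate.
func (l *Ledger) Apply(e Event) bool {
	if _, dup := l.seen[e.TransactionID]; dup {
		return false
	}
	l.seen[e.TransactionID] = struct{}{}
	l.balance += e.AmountMinor
	return true
}

func main() {
	l := NewLedger()
	events := []Event{
		{TransactionID: "tx-1", AmountMinor: 500},
		{TransactionID: "tx-2", AmountMinor: 250},
		{TransactionID: "tx-1", AmountMinor: 500}, // redelivery of tx-1
	}
	for _, e := range events {
		fmt.Println(e.TransactionID, l.Apply(e))
	}
	fmt.Println("balance:", l.balance) // balance: 750
}
```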

[!TIP] Consultant's Choice: For startups, start with a synchronous flow for simplicity. For enterprise scale (1000+ tps), adopt an Asynchronous Orchestration pattern to avoid blocking threads on external API calls.

Observability & Monitoring

To maintain 99.99% availability, track these core metrics:

  • Payment Success Rate (PSR): Percentage of attempted transactions that succeed.
  • Mean Time to Reconcile (MTTR): How long it takes for a "lost" transaction to be recovered.
  • Provider Latency: P99 response times of external gateways.
  • Error Distribution: Monitor for spikes in IDEMPOTENCY_MISMATCH or CARD_DECLINED.
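
To make two of these metrics concrete, here is a small sketch of how PSR and a P99 latency could be computed from raw counts and samples. It assumes PSR means succeeded / (succeeded + failed) and uses the nearest-rank percentile method; a production system would normally rely on its metrics backend rather than hand-rolled math, and the function names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// SuccessRate returns succeeded / (succeeded + failed) as a percentage.
func SuccessRate(succeeded, failed int) float64 {
	total := succeeded + failed
	if total == 0 {
		return 0
	}
	return 100 * float64(succeeded) / float64(total)
}

// Percentile returns the nearest-rank p-th percentile (0 < p <= 100)
// of latency samples given in milliseconds. The input is not modified.
func Percentile(samplesMs []float64, p float64) float64 {
	if len(samplesMs) == 0 {
		return 0
	}
	sorted := append([]float64(nil), samplesMs...)
	sort.Float64s(sorted)
	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

func main() {
	latencies := []float64{120, 80, 95, 400, 110, 90, 105, 85, 100, 130}
	fmt.Printf("PSR %.2f%%, p99 %.0f ms\n", SuccessRate(9990, 10), Percentile(latencies, 99))
	// PSR 99.90%, p99 400 ms
}
```

Note how a single slow call dominates P99 in a small sample, which is why tail percentiles are tracked separately from averages.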

This architecture ensures that even if the server crashes mid-transaction, we can reconcile the state later using the pending record and idempotency key.
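
A reconciliation sweep along those lines might look like the following. This is a sketch under stated assumptions: `Payment` and `StatusLookup` are hypothetical, the lookup function stands in for a provider status API, timestamps are plain Unix seconds to keep the example small, and treating "provider has no record" as FAILED is an assumed policy that a real system should confirm with its provider.

```go
package main

import "fmt"

// Payment carries only the fields the sweep needs (hypothetical shape).
type Payment struct {
	IdempotencyKey string
	Status         string
	CreatedAtUnix  int64
}

// StatusLookup stands in for a provider status API call (assumed signature).
// found is false when the provider has no record of the key.
type StatusLookup func(idempotencyKey string) (status string, found bool)

// Reconcile resolves PENDING payments older than staleAfterSec in place
// and returns how many records changed.
func Reconcile(payments []Payment, lookup StatusLookup, nowUnix, staleAfterSec int64) int {
	changed := 0
	for i := range payments {
		p := &payments[i]
		if p.Status != "PENDING" || nowUnix-p.CreatedAtUnix < staleAfterSec {
			continue // already final, or too fresh to treat as lost
		}
		status, found := lookup(p.IdempotencyKey)
		switch {
		case !found:
			// Assumed policy: the provider never saw the request, so no money moved.
			p.Status = "FAILED"
		case status == "SUCCEEDED" || status == "FAILED":
			p.Status = status
		default:
			continue // provider is still working on it; check again next sweep
		}
		changed++
	}
	return changed
}

func main() {
	payments := []Payment{
		{IdempotencyKey: "a", Status: "PENDING", CreatedAtUnix: 0},
		{IdempotencyKey: "b", Status: "PENDING", CreatedAtUnix: 0},
		{IdempotencyKey: "c", Status: "PENDING", CreatedAtUnix: 990},
	}
	provider := map[string]string{"a": "SUCCEEDED"}
	lookup := func(key string) (string, bool) {
		s, ok := provider[key]
		return s, ok
	}
	n := Reconcile(payments, lookup, 1000, 300)
	fmt.Println(n, payments[0].Status, payments[1].Status, payments[2].Status)
	// 2 SUCCEEDED FAILED PENDING
}
```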