Scaling
Scale your FastLaunchAPI application to handle increased traffic and workload. This guide covers horizontal scaling, load balancing, database optimization, and performance monitoring strategies.
Overview
Scaling is the process of increasing your application's capacity to handle more users, requests, and data. FastLaunchAPI supports both vertical and horizontal scaling approaches.
Scaling Strategies
🔄 Horizontal Scaling
Scale out by adding more application instances
📊 Database Scaling
Optimize database performance and implement read replicas
⚡ Caching Strategies
Implement Redis caching and CDN for better performance
🔧 Load Balancing
Distribute traffic across multiple application instances
Horizontal Scaling
Dokku Horizontal Scaling
Dokku makes horizontal scaling simple with built-in process management and load balancing capabilities.
Scale Web Processes
Increase the number of web workers to handle more concurrent requests. Learn more about Dokku process scaling:
# Scale to 3 web processes
dokku ps:scale your-app-name web=3
# Scale workers for background tasks
dokku ps:scale your-app-name worker=2 beat=1
Monitor your application's resource usage to determine optimal scaling:
# Check current process status
dokku ps:report your-app-name
# Monitor resource usage
dokku resource:report your-app-name
Configure Resource Limits
Set appropriate resource limits to prevent individual processes from consuming too many resources. See Dokku resource management:
# Set memory limits (in MB)
dokku resource:limit your-app-name --memory 512
dokku resource:limit your-app-name --memory-swap 1024
# Set CPU limits
dokku resource:limit your-app-name --cpu-quota 50000
Health Checks
Configure health checks to ensure your scaled instances are healthy. Learn about Dokku health checks:
# Configure health check endpoint
dokku checks:set your-app-name web /health
# Set health check options
dokku checks:set your-app-name web --wait=5 --timeout=30 --attempts=3
Your FastLaunchAPI includes a built-in health check endpoint:
@app.get("/health")
async def health_check():
return {
"status": "healthy",
"timestamp": datetime.utcnow(),
"version": "1.0.0"
}
Auto-scaling Configuration
For automatic scaling based on metrics, consider using external monitoring tools or cloud provider auto-scaling features.
#!/bin/bash
# Example auto-scaling script (run via cron)
CPU_USAGE=$(dokku resource:report your-app-name | grep "cpu percent" | awk '{print $3}' | cut -d'%' -f1)
CPU_USAGE=${CPU_USAGE%.*}   # strip any decimal part so the integer comparison below works
if [ "$CPU_USAGE" -gt 80 ]; then
    CURRENT_SCALE=$(dokku ps:scale your-app-name | grep web | awk '{print $2}')
    NEW_SCALE=$((CURRENT_SCALE + 1))
    dokku ps:scale your-app-name web=$NEW_SCALE
fi
Database Scaling
PostgreSQL Optimization
Connection Pooling
Optimize database connections using connection pooling. Update your database configuration:
# database.py
from sqlalchemy import create_engine
from sqlalchemy.pool import QueuePool

engine = create_engine(
    DATABASE_URL,
    poolclass=QueuePool,
    pool_size=20,
    max_overflow=30,
    pool_pre_ping=True,
    pool_recycle=3600
)
For Dokku PostgreSQL, inspect the service and adjust connection limits from within psql:
# Inspect current database settings
dokku postgres:info your-app-db
# Connect to adjust limits (e.g. ALTER SYSTEM SET max_connections = 200;)
dokku postgres:connect your-app-db
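With the pool configured, each request should borrow a session and return it promptly so connections recycle. A minimal FastAPI dependency sketch (assuming the pooled `engine` above; `get_db` and the route are illustrative, not part of the template):
# deps.py -- illustrative sketch
from fastapi import Depends, FastAPI
from sqlalchemy import text
from sqlalchemy.orm import Session, sessionmaker

from database import engine  # the pooled engine defined above

app = FastAPI()
SessionLocal = sessionmaker(bind=engine, autoflush=False)

def get_db():
    db = SessionLocal()
    try:
        yield db          # hand the session to the route handler
    finally:
        db.close()        # returns the underlying connection to the pool

@app.get("/db-ping")
def db_ping(db: Session = Depends(get_db)):
    return {"ok": db.execute(text("SELECT 1")).scalar() == 1}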
Database Indexing
Optimize your database queries with proper indexing. Add indexes for frequently queried fields:
# In your SQLAlchemy models
from datetime import datetime
from sqlalchemy import Column, DateTime, Index, Integer, String

class User(Base):
    __tablename__ = "users"

    id = Column(Integer, primary_key=True)
    email = Column(String, unique=True, index=True)  # Indexed for fast lookups
    created_at = Column(DateTime, default=datetime.utcnow, index=True)

    # Composite index for complex queries
    __table_args__ = (
        Index('idx_user_email_created', 'email', 'created_at'),
    )
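For instance, a lookup that filters on `email` and orders by `created_at` matches the column order of `idx_user_email_created`, so PostgreSQL can satisfy it from the index (a hedged illustration, assuming a `Session` named `db`):
# Served by idx_user_email_created: equality on the leading column,
# ordering on the second
recent = (
    db.query(User)
    .filter(User.email == "user@example.com")
    .order_by(User.created_at.desc())
    .all()
)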
Read Replicas
For high-read workloads, implement read replicas. Configure read/write splitting:
# database.py
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

class DatabaseManager:
    def __init__(self):
        self.write_engine = create_engine(DATABASE_WRITE_URL)
        self.read_engine = create_engine(DATABASE_READ_URL)

    def get_write_session(self):
        return sessionmaker(bind=self.write_engine)()

    def get_read_session(self):
        return sessionmaker(bind=self.read_engine)()
Set up a second database service in Dokku (note that `postgres:create` provisions an independent Postgres service; streaming replication from the primary must be configured separately):
# Create a second Postgres service to act as the read replica
dokku postgres:create your-app-db-replica
dokku postgres:link your-app-db-replica your-app-name
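Wiring the `DatabaseManager` into routes might look like this (a sketch; `db_manager` is created at startup, and `UserCreate` is a hypothetical Pydantic schema):
db_manager = DatabaseManager()

@app.get("/api/users/{user_id}")
async def read_user(user_id: int):
    session = db_manager.get_read_session()   # reads hit the replica
    try:
        return session.get(User, user_id)
    finally:
        session.close()

@app.post("/api/users")
async def create_user(payload: UserCreate):
    session = db_manager.get_write_session()  # writes always hit the primary
    try:
        user = User(email=payload.email)
        session.add(user)
        session.commit()
        return user
    finally:
        session.close()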
Database Monitoring
Monitor database performance with built-in tools:
# Check database metrics
dokku postgres:info your-app-db
# Monitor slow queries
dokku postgres:connect your-app-db
# Inside PostgreSQL:
# SELECT * FROM pg_stat_activity WHERE state = 'active';
Caching Strategies
Redis Caching Implementation
Application-Level Caching
Implement caching for frequently accessed data:
# cache.py
import hashlib
import json
from functools import wraps

import redis

redis_client = redis.from_url(REDIS_URL)

def cache_result(expiration=3600):
    def decorator(func):
        @wraps(func)
        async def wrapper(*args, **kwargs):
            # Use a stable digest: built-in hash() varies between processes,
            # so it would never hit the cache from other scaled instances
            arg_digest = hashlib.md5((str(args) + str(kwargs)).encode()).hexdigest()
            cache_key = f"{func.__name__}:{arg_digest}"

            # Try to get from cache
            cached = redis_client.get(cache_key)
            if cached:
                return json.loads(cached)

            # Execute function and cache result
            result = await func(*args, **kwargs)
            redis_client.setex(cache_key, expiration, json.dumps(result))
            return result
        return wrapper
    return decorator

# Usage in your routes
@app.get("/api/users/{user_id}")
@cache_result(expiration=1800)  # Cache for 30 minutes
async def get_user(user_id: int):
    # Database query here
    pass
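Cached entries go stale when the underlying data changes, so pair the decorator with an invalidation helper. A minimal sketch, reusing the `redis_client` and key scheme above (`invalidate_cache` is an illustrative name, not part of the template):
def invalidate_cache(func_name: str):
    # SCAN iterates the keyspace without blocking Redis the way KEYS would
    for key in redis_client.scan_iter(f"{func_name}:*"):
        redis_client.delete(key)

# Call it after any write that changes the cached data
@app.put("/api/users/{user_id}")
async def update_user(user_id: int):
    # ... apply the update to the database ...
    invalidate_cache("get_user")  # drop all cached get_user responses
    return {"status": "updated"}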
Session Caching
Use Redis for session storage to improve performance:
# session.py
import redis
from fastapi import FastAPI
from starlette.middleware.sessions import SessionMiddleware  # session middleware lives in Starlette, not fastapi.middleware

app = FastAPI()

app.add_middleware(
    SessionMiddleware,
    secret_key=SECRET_KEY,
    session_cookie="fastlaunchapi_session"
)

# Custom session storage
class RedisSessionStorage:
    def __init__(self, redis_url: str):
        # decode_responses=True returns str values instead of bytes
        self.redis = redis.from_url(redis_url, decode_responses=True)

    def get_session(self, session_id: str):
        return self.redis.hgetall(f"session:{session_id}")

    def set_session(self, session_id: str, data: dict, expiration: int = 3600):
        # hset with mapping= replaces the deprecated hmset
        self.redis.hset(f"session:{session_id}", mapping=data)
        self.redis.expire(f"session:{session_id}", expiration)
API Response Caching
Cache API responses for better performance:
# middleware/cache_middleware.py
import hashlib
import json

from fastapi import Request
from fastapi.responses import JSONResponse

class CacheMiddleware:
    def __init__(self, app, redis_client):
        self.app = app
        self.redis = redis_client

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http":
            request = Request(scope, receive)

            # Create cache key from the full URL
            cache_key = f"api:{hashlib.md5(str(request.url).encode()).hexdigest()}"

            # Check cache and short-circuit on a hit
            cached_response = self.redis.get(cache_key)
            if cached_response:
                response = JSONResponse(content=json.loads(cached_response))
                await response(scope, receive, send)
                return

        # Cache miss (or non-HTTP scope): fall through to the app. Populating
        # the cache requires wrapping `send` to capture the response body,
        # which is omitted here for brevity.
        await self.app(scope, receive, send)
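To register it — a sketch; Starlette's `add_middleware` forwards keyword arguments to the middleware constructor, so the `redis_client` parameter above can be supplied at registration time:
import redis
from fastapi import FastAPI

app = FastAPI()
redis_client = redis.from_url(REDIS_URL)

# Instantiated by Starlette as CacheMiddleware(app, redis_client=redis_client)
app.add_middleware(CacheMiddleware, redis_client=redis_client)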
Redis Scaling
Scale your Redis instance for better performance:
# Inspect the Redis service and its memory usage
dokku redis:info your-app-redis
# Promote the service to be the app's primary REDIS_URL
dokku redis:promote your-app-redis your-app-name
# Configure Redis for production
dokku redis:connect your-app-redis
# Inside Redis CLI:
# CONFIG SET maxmemory 2gb
# CONFIG SET maxmemory-policy allkeys-lru
Load Balancing
Nginx Load Balancing
Dokku automatically configures Nginx as a reverse proxy and load balancer for your scaled application instances.
Configure Nginx
Customize Nginx configuration for better load balancing. Create an Nginx configuration file:
# nginx.conf.sigil
{{ range $port_map := .PROXY_PORT_MAP | split " " }}
{{ $port_map_list := $port_map | split ":" }}
{{ $scheme := index $port_map_list 0 }}
{{ $listen_port := index $port_map_list 1 }}
{{ $upstream_port := index $port_map_list 2 }}

upstream {{ $.APP }}-{{ $upstream_port }} {
  {{ range $container := $.DOKKU_APP_CONTAINERS }}
  server {{ $container.Address }}:{{ $upstream_port }};
  {{ end }}
}

server {
  listen {{ $listen_port }};
  server_name {{ $.NOSSL_SERVER_NAME }};

  location / {
    proxy_pass http://{{ $.APP }}-{{ $upstream_port }};
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;

    # Load balancing configuration
    proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
    proxy_connect_timeout 30s;
    proxy_send_timeout 30s;
    proxy_read_timeout 30s;
  }
}
{{ end }}
Health Check Configuration
Configure health checks for load balancing:
# Add to your Nginx configuration
location /health {
  access_log off;
  proxy_pass http://{{ $.APP }}-{{ $upstream_port }};
  proxy_set_header Host $host;
}
Rate Limiting
Implement rate limiting to protect your application:
# Add rate limiting (limit_req_zone directives belong in the http context)
http {
  limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;
  limit_req_zone $binary_remote_addr zone=login:10m rate=1r/s;
}

server {
  location /api/ {
    limit_req zone=api burst=20 nodelay;
    proxy_pass http://{{ $.APP }}-{{ $upstream_port }};
  }

  location /auth/login {
    limit_req zone=login burst=5 nodelay;
    proxy_pass http://{{ $.APP }}-{{ $upstream_port }};
  }
}
Performance Monitoring
Application Metrics
Built-in Monitoring
Implement monitoring endpoints in your FastLaunchAPI application:
# monitoring.py
import time
from datetime import datetime

import psutil
from fastapi import APIRouter

router = APIRouter()

@router.get("/metrics")
async def get_metrics():
    return {
        "timestamp": datetime.utcnow(),
        "cpu_usage": psutil.cpu_percent(),
        "memory_usage": psutil.virtual_memory().percent,
        "disk_usage": psutil.disk_usage('/').percent,
        "active_connections": len(psutil.net_connections()),
        "uptime": time.time() - psutil.boot_time()
    }

@router.get("/health/detailed")
async def detailed_health():
    return {
        "status": "healthy",
        "database": await check_database_health(),
        "redis": await check_redis_health(),
        "external_apis": await check_external_apis()
    }
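The `check_*` helpers aren't shown elsewhere in this guide; minimal sketches might look like this, assuming the SQLAlchemy `engine` and a `redis_client` from the earlier sections:
from sqlalchemy import text

async def check_database_health() -> str:
    try:
        with engine.connect() as conn:
            conn.execute(text("SELECT 1"))  # cheapest possible round trip
        return "healthy"
    except Exception:
        return "unhealthy"

async def check_redis_health() -> str:
    try:
        redis_client.ping()
        return "healthy"
    except Exception:
        return "unhealthy"

async def check_external_apis() -> str:
    # Replace with real checks against your upstreams (Stripe, email provider, ...)
    return "not_checked"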
Performance Logging
Implement performance logging middleware:
# middleware/performance_middleware.py
import logging
import time

from fastapi import Request

logger = logging.getLogger(__name__)

class PerformanceMiddleware:
    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        start_time = time.time()
        await self.app(scope, receive, send)
        process_time = time.time() - start_time

        if scope["type"] == "http":
            request = Request(scope, receive)
            logger.info(
                f"Request: {request.method} {request.url.path} "
                f"- Time: {process_time:.4f}s"
            )
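Since it's plain ASGI middleware, registration is one line (a sketch):
from fastapi import FastAPI

app = FastAPI()
app.add_middleware(PerformanceMiddleware)  # logs wall-clock time for every request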
Resource Monitoring
Monitor system resources with Dokku:
# Check resource usage
dokku resource:report your-app-name
# Monitor logs for performance issues
dokku logs your-app-name -t
# Check process status
dokku ps:report your-app-name
Database Connection Optimization
Connection Pool Configuration
Proper connection pool configuration is crucial for scaling database operations effectively.
# database.py
from sqlalchemy import create_engine
from sqlalchemy.pool import QueuePool

# Production configuration
engine = create_engine(
    DATABASE_URL,
    poolclass=QueuePool,
    pool_size=20,          # Number of connections to maintain
    max_overflow=30,       # Additional connections when needed
    pool_pre_ping=True,    # Verify connections before use
    pool_recycle=3600,     # Recycle connections every hour
    echo=False,            # Disable SQL logging in production
    future=True
)

# Connection pool monitoring
@app.get("/db-pool-status")
async def get_db_pool_status():
    return {
        "pool_size": engine.pool.size(),
        "checked_in": engine.pool.checkedin(),
        "checked_out": engine.pool.checkedout(),
        "overflow": engine.pool.overflow(),
        "status": engine.pool.status()  # summary string; QueuePool exposes no invalidated() accessor
    }
Celery Worker Scaling
Background Task Scaling
Scale Celery Workers
Scale your Celery workers based on task queue length:
# Scale workers dynamically
dokku ps:scale your-app-name worker=4
# Monitor worker status
dokku run your-app-name celery -A celery_setup inspect active
dokku run your-app-name celery -A celery_setup inspect stats
Queue Monitoring
Monitor your task queues:
# celery_monitoring.py
from fastapi import APIRouter

router = APIRouter()

@router.get("/celery/status")
async def celery_status():
    from celery_setup import celery_app

    inspect = celery_app.control.inspect()
    stats = inspect.stats()
    active = inspect.active()

    return {
        "workers": stats,
        "active_tasks": active,
        "queue_length": get_queue_length()
    }

def get_queue_length():
    from celery_setup import celery_app

    # LLEN on the default queue key; assumes a Redis broker
    with celery_app.connection() as conn:
        return conn.default_channel.client.llen('celery')
Task Routing
Configure task routing for better performance:
# celery_setup.py
from celery import Celery

celery_app = Celery('fastlaunchapi')

# Configure task routing
celery_app.conf.update(
    task_routes={
        'email.send_email': {'queue': 'email'},
        'payments.process_payment': {'queue': 'payments'},
        'ai.generate_content': {'queue': 'ai'},
    },
    worker_prefetch_multiplier=1,
    task_acks_late=True,
    worker_max_tasks_per_child=1000,
)
Scale specific queues (each named worker must exist as a process type in your Procfile — see the sketch below):
# Scale specific worker types
dokku ps:scale your-app-name worker=2 email-worker=1 ai-worker=1
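A hypothetical Procfile with one process type per queue (the exact commands depend on your app; `main:app` and the concurrency values here are illustrative):
# Procfile (illustrative)
web: uvicorn main:app --host 0.0.0.0 --port $PORT
worker: celery -A celery_setup worker -Q celery,payments --concurrency=2
email-worker: celery -A celery_setup worker -Q email --concurrency=2
ai-worker: celery -A celery_setup worker -Q ai --concurrency=1
beat: celery -A celery_setup beat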
Security Considerations for Scaling
Scaling Security Checklist:
- Implement rate limiting
- Use connection pooling securely
- Monitor for DDoS attacks
- Secure inter-service communication
- Implement proper authentication for scaled services
- Use HTTPS for all communications
- Monitor and log security events
Rate Limiting Implementation
# rate_limiting.py
import time
import uuid

from fastapi import HTTPException, Request

class RateLimiter:
    def __init__(self, redis_client, max_requests=100, window=60):
        self.redis = redis_client
        self.max_requests = max_requests
        self.window = window

    async def check_rate_limit(self, request: Request):
        client_ip = request.client.host
        key = f"rate_limit:{client_ip}"
        current_time = int(time.time())

        pipe = self.redis.pipeline()
        # Sliding window: drop entries older than the window, count the rest
        pipe.zremrangebyscore(key, 0, current_time - self.window)
        pipe.zcard(key)
        # Unique member per request, so hits within the same second aren't collapsed
        pipe.zadd(key, {f"{current_time}:{uuid.uuid4().hex}": current_time})
        pipe.expire(key, self.window)
        results = pipe.execute()

        current_requests = results[1]
        if current_requests >= self.max_requests:
            raise HTTPException(
                status_code=429,
                detail="Rate limit exceeded"
            )
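To enforce the limit on a route, the checker can be attached as a dependency — a sketch, assuming a module-level `redis_client`:
from fastapi import Depends

limiter = RateLimiter(redis_client, max_requests=100, window=60)

# FastAPI injects the Request into check_rate_limit; a 429 aborts the request
@app.get("/api/data", dependencies=[Depends(limiter.check_rate_limit)])
async def get_data():
    return {"ok": True}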
Troubleshooting Scaling Issues
Common Scaling Problems
Common Issues:
- Database connection pool exhaustion
- Memory leaks in scaled processes
- Uneven load distribution
- Session persistence issues
- Cache invalidation problems
- Network bottlenecks
# Check database connections
dokku postgres:info your-app-db
# Monitor slow queries
dokku postgres:connect your-app-db
# Inside PostgreSQL:
# SELECT query, mean_time, calls FROM pg_stat_statements ORDER BY mean_time DESC LIMIT 10;
# Check connection pool status
dokku run your-app-name python -c "from database import engine; print(engine.pool.status())"
# Check memory usage
dokku resource:report your-app-name
# Monitor memory leaks
dokku logs your-app-name | grep -i "memory\|oom"
# Restart processes if needed
dokku ps:restart your-app-name
# Check response times
dokku logs your-app-name | grep "response_time"
# Monitor CPU usage
dokku run your-app-name top
# Check network connections
dokku run your-app-name netstat -an
Best Practices for Scaling
📊 Monitor Everything
Implement comprehensive monitoring and alerting
🔄 Gradual Scaling
Scale gradually and monitor the impact
🧪 Load Testing
Test your scaling configuration under load
🔒 Security First
Maintain security standards while scaling
Load Testing
Test your scaled application:
# Install load testing tools
pip install locust

# Create a load test script:
# locustfile.py
from locust import HttpUser, task, between

class WebsiteUser(HttpUser):
    wait_time = between(1, 3)

    @task(3)
    def index_page(self):
        self.client.get("/")

    @task(1)
    def api_endpoint(self):
        self.client.get("/api/users/me")

# Run the load test
locust -f locustfile.py --host=https://your-app.com
Next Steps
🔐 Security
Implement security best practices for scaled applications
📊 Monitoring
Set up comprehensive monitoring and alerting
🚀 Deployment
Optimize your deployment process for scaled applications
🔧 CI/CD
Implement continuous integration and deployment
This scaling guide provides comprehensive strategies for growing your FastLaunchAPI application. Start with horizontal scaling and gradually implement advanced techniques based on your specific needs and traffic patterns.