Scaling
Scale your FastLaunchAPI application to handle increased traffic and workload. This guide covers horizontal scaling, load balancing, database optimization, and performance monitoring strategies.
Overview
Scaling is the process of increasing your application's capacity to handle more users, requests, and data. FastLaunchAPI supports both vertical and horizontal scaling approaches.
Scaling Strategies
🔄 Horizontal Scaling
Scale out by adding more application instances
📊 Database Scaling
Optimize database performance and implement read replicas
⚡ Caching Strategies
Implement Redis caching and CDN for better performance
🔧 Load Balancing
Distribute traffic across multiple application instances
Horizontal Scaling
Dokku Horizontal Scaling
Dokku makes horizontal scaling simple with built-in process management and load balancing capabilities.
Scale Web Processes
Increase the number of web workers to handle more concurrent requests. Learn more about Dokku process scaling:
# Scale to 3 web processes
dokku ps:scale your-app-name web=3
# Scale workers for background tasks
dokku ps:scale your-app-name worker=2 beat=1
Monitor your application's resource usage to determine optimal scaling:
# Check current process status
dokku ps:report your-app-name
# Monitor resource usage
dokku resource:report your-app-name
Configure Resource Limits
Set appropriate resource limits to prevent individual processes from consuming too many resources. See Dokku resource management:
# Set memory limits (in MB)
dokku resource:limit your-app-name --memory 512
dokku resource:limit your-app-name --memory-swap 1024
# Set CPU limits
dokku resource:limit your-app-name --cpu-quota 50000
Health Checks
Configure health checks to ensure your scaled instances are healthy. Learn about Dokku health checks:
# Configure health check endpoint
dokku checks:set your-app-name web /health
# Set health check options
dokku checks:set your-app-name web --wait=5 --timeout=30 --attempts=3
Your FastLaunchAPI includes a built-in health check endpoint:
@app.get("/health")
async def health_check():
return {
"status": "healthy",
"timestamp": datetime.utcnow(),
"version": "1.0.0"
}
Auto-scaling Configuration
For automatic scaling based on metrics, consider using external monitoring tools or cloud provider auto-scaling features.
#!/bin/bash
# Example auto-scaling script (run via cron)
CPU_USAGE=$(dokku resource:report your-app-name | grep "cpu percent" | awk '{print $3}' | cut -d'%' -f1)
CPU_USAGE=${CPU_USAGE%.*}   # strip any decimal part so the integer comparison below works
if [ "$CPU_USAGE" -gt 80 ]; then
    CURRENT_SCALE=$(dokku ps:scale your-app-name | grep web | awk '{print $2}')
    NEW_SCALE=$((CURRENT_SCALE + 1))
    dokku ps:scale your-app-name web=$NEW_SCALE
fi
Database Scaling
PostgreSQL Optimization
Connection Pooling
Optimize database connections using connection pooling. Update your database configuration:
# database.py
from sqlalchemy import create_engine
from sqlalchemy.pool import QueuePool

engine = create_engine(
    DATABASE_URL,
    poolclass=QueuePool,
    pool_size=20,
    max_overflow=30,
    pool_pre_ping=True,
    pool_recycle=3600
)
For Dokku PostgreSQL, inspect the service and adjust connection limits from within psql:
# Inspect current database settings
dokku postgres:info your-app-db
# Connect to adjust limits (e.g. ALTER SYSTEM SET max_connections = 200;)
dokku postgres:connect your-app-db
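With the pool configured, each request should borrow a session and return it promptly so connections recycle. A minimal FastAPI dependency sketch (assuming the pooled `engine` above; `get_db` and the route are illustrative, not part of the template):
# deps.py -- illustrative sketch
from fastapi import Depends, FastAPI
from sqlalchemy import text
from sqlalchemy.orm import Session, sessionmaker

from database import engine  # the pooled engine defined above

app = FastAPI()
SessionLocal = sessionmaker(bind=engine, autoflush=False)

def get_db():
    db = SessionLocal()
    try:
        yield db          # hand the session to the route handler
    finally:
        db.close()        # returns the underlying connection to the pool

@app.get("/db-ping")
def db_ping(db: Session = Depends(get_db)):
    return {"ok": db.execute(text("SELECT 1")).scalar() == 1}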
Database Indexing
Optimize your database queries with proper indexing. Add indexes for frequently queried fields:
# In your SQLAlchemy models
from datetime import datetime
from sqlalchemy import Column, DateTime, Index, Integer, String

class User(Base):
    __tablename__ = "users"

    id = Column(Integer, primary_key=True)
    email = Column(String, unique=True, index=True)  # Indexed for fast lookups
    created_at = Column(DateTime, default=datetime.utcnow, index=True)

    # Composite index for complex queries
    __table_args__ = (
        Index('idx_user_email_created', 'email', 'created_at'),
    )
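For instance, a lookup that filters on `email` and orders by `created_at` matches the column order of `idx_user_email_created`, so PostgreSQL can satisfy it from the index (a hedged illustration, assuming a `Session` named `db`):
# Served by idx_user_email_created: equality on the leading column,
# ordering on the second
recent = (
    db.query(User)
    .filter(User.email == "user@example.com")
    .order_by(User.created_at.desc())
    .all()
)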
Read Replicas
For high-read workloads, implement read replicas. Configure read/write splitting:
# database.py
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

class DatabaseManager:
    def __init__(self):
        self.write_engine = create_engine(DATABASE_WRITE_URL)
        self.read_engine = create_engine(DATABASE_READ_URL)

    def get_write_session(self):
        return sessionmaker(bind=self.write_engine)()

    def get_read_session(self):
        return sessionmaker(bind=self.read_engine)()
Set up a second database service in Dokku (note that `postgres:create` provisions an independent Postgres service; streaming replication from the primary must be configured separately):
# Create a second Postgres service to act as the read replica
dokku postgres:create your-app-db-replica
dokku postgres:link your-app-db-replica your-app-name
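Wiring the `DatabaseManager` into routes might look like this (a sketch; `db_manager` is created at startup, and `UserCreate` is a hypothetical Pydantic schema):
db_manager = DatabaseManager()

@app.get("/api/users/{user_id}")
async def read_user(user_id: int):
    session = db_manager.get_read_session()   # reads hit the replica
    try:
        return session.get(User, user_id)
    finally:
        session.close()

@app.post("/api/users")
async def create_user(payload: UserCreate):
    session = db_manager.get_write_session()  # writes always hit the primary
    try:
        user = User(email=payload.email)
        session.add(user)
        session.commit()
        return user
    finally:
        session.close()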
Database Monitoring
Monitor database performance with built-in tools:
# Check database metrics
dokku postgres:info your-app-db
# Monitor slow queries
dokku postgres:connect your-app-db
# Inside PostgreSQL:
# SELECT * FROM pg_stat_activity WHERE state = 'active';
Caching Strategies
Redis Caching Implementation
Application-Level Caching
Implement caching for frequently accessed data:
# cache.py
import hashlib
import json
from functools import wraps

import redis

redis_client = redis.from_url(REDIS_URL)

def cache_result(expiration=3600):
    def decorator(func):
        @wraps(func)
        async def wrapper(*args, **kwargs):
            # Use a stable digest: built-in hash() varies between processes,
            # so it would never hit the cache from other scaled instances
            arg_digest = hashlib.md5((str(args) + str(kwargs)).encode()).hexdigest()
            cache_key = f"{func.__name__}:{arg_digest}"

            # Try to get from cache
            cached = redis_client.get(cache_key)
            if cached:
                return json.loads(cached)

            # Execute function and cache result
            result = await func(*args, **kwargs)
            redis_client.setex(cache_key, expiration, json.dumps(result))
            return result
        return wrapper
    return decorator

# Usage in your routes
@app.get("/api/users/{user_id}")
@cache_result(expiration=1800)  # Cache for 30 minutes
async def get_user(user_id: int):
    # Database query here
    pass
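Cached entries go stale when the underlying data changes, so pair the decorator with an invalidation helper. A minimal sketch, reusing the `redis_client` and key scheme above (`invalidate_cache` is an illustrative name, not part of the template):
def invalidate_cache(func_name: str):
    # SCAN iterates the keyspace without blocking Redis the way KEYS would
    for key in redis_client.scan_iter(f"{func_name}:*"):
        redis_client.delete(key)

# Call it after any write that changes the cached data
@app.put("/api/users/{user_id}")
async def update_user(user_id: int):
    # ... apply the update to the database ...
    invalidate_cache("get_user")  # drop all cached get_user responses
    return {"status": "updated"}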
Session Caching
Use Redis for session storage to improve performance:
# session.py
import redis
from fastapi import FastAPI
from starlette.middleware.sessions import SessionMiddleware  # session middleware lives in Starlette, not fastapi.middleware

app = FastAPI()

app.add_middleware(
    SessionMiddleware,
    secret_key=SECRET_KEY,
    session_cookie="fastlaunchapi_session"
)

# Custom session storage
class RedisSessionStorage:
    def __init__(self, redis_url: str):
        # decode_responses=True returns str values instead of bytes
        self.redis = redis.from_url(redis_url, decode_responses=True)

    def get_session(self, session_id: str):
        return self.redis.hgetall(f"session:{session_id}")

    def set_session(self, session_id: str, data: dict, expiration: int = 3600):
        # hset with mapping= replaces the deprecated hmset
        self.redis.hset(f"session:{session_id}", mapping=data)
        self.redis.expire(f"session:{session_id}", expiration)
API Response Caching
Cache API responses for better performance:
# middleware/cache_middleware.py
import hashlib
import json

from fastapi import Request
from fastapi.responses import JSONResponse

class CacheMiddleware:
    def __init__(self, app, redis_client):
        self.app = app
        self.redis = redis_client

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http":
            request = Request(scope, receive)

            # Create cache key from the full URL
            cache_key = f"api:{hashlib.md5(str(request.url).encode()).hexdigest()}"

            # Check cache and short-circuit on a hit
            cached_response = self.redis.get(cache_key)
            if cached_response:
                response = JSONResponse(content=json.loads(cached_response))
                await response(scope, receive, send)
                return

        # Cache miss (or non-HTTP scope): fall through to the app. Populating
        # the cache requires wrapping `send` to capture the response body,
        # which is omitted here for brevity.
        await self.app(scope, receive, send)
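To register it — a sketch; Starlette's `add_middleware` forwards keyword arguments to the middleware constructor, so the `redis_client` parameter above can be supplied at registration time:
import redis
from fastapi import FastAPI

app = FastAPI()
redis_client = redis.from_url(REDIS_URL)

# Instantiated by Starlette as CacheMiddleware(app, redis_client=redis_client)
app.add_middleware(CacheMiddleware, redis_client=redis_client)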
Redis Scaling
Scale your Redis instance for better performance:
# Inspect the Redis service and its memory usage
dokku redis:info your-app-redis
# Promote the service to be the app's primary REDIS_URL
dokku redis:promote your-app-redis your-app-name
# Configure Redis for production
dokku redis:connect your-app-redis
# Inside Redis CLI:
# CONFIG SET maxmemory 2gb
# CONFIG SET maxmemory-policy allkeys-lru
Load Balancing
Nginx Load Balancing
Dokku automatically configures Nginx as a reverse proxy and load balancer for your scaled application instances.
Configure Nginx
Customize Nginx configuration for better load balancing. Create an Nginx configuration file:
# nginx.conf.sigil
{{ range $port_map := .PROXY_PORT_MAP | split " " }}
{{ $port_map_list := $port_map | split ":" }}
{{ $scheme := index $port_map_list 0 }}
{{ $listen_port := index $port_map_list 1 }}
{{ $upstream_port := index $port_map_list 2 }}

upstream {{ $.APP }}-{{ $upstream_port }} {
  {{ range $container := $.DOKKU_APP_CONTAINERS }}
  server {{ $container.Address }}:{{ $upstream_port }};
  {{ end }}
}

server {
  listen {{ $listen_port }};
  server_name {{ $.NOSSL_SERVER_NAME }};

  location / {
    proxy_pass http://{{ $.APP }}-{{ $upstream_port }};
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;

    # Load balancing configuration
    proxy_next_upstream error timeout invalid_header http_500 http_502 http_503 http_504;
    proxy_connect_timeout 30s;
    proxy_send_timeout 30s;
    proxy_read_timeout 30s;
  }
}
{{ end }}
Health Check Configuration
Configure health checks for load balancing:
# Add to your Nginx configuration
location /health {
  access_log off;
  proxy_pass http://{{ $.APP }}-{{ $upstream_port }};
  proxy_set_header Host $host;
}
Rate Limiting
Implement rate limiting to protect your application:
# Add rate limiting (limit_req_zone directives belong in the http context)
http {
  limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;
  limit_req_zone $binary_remote_addr zone=login:10m rate=1r/s;
}

server {
  location /api/ {
    limit_req zone=api burst=20 nodelay;
    proxy_pass http://{{ $.APP }}-{{ $upstream_port }};
  }

  location /auth/login {
    limit_req zone=login burst=5 nodelay;
    proxy_pass http://{{ $.APP }}-{{ $upstream_port }};
  }
}
Performance Monitoring
Application Metrics
Built-in Monitoring
Implement monitoring endpoints in your FastLaunchAPI application:
# monitoring.py
import time
from datetime import datetime

import psutil
from fastapi import APIRouter

router = APIRouter()

@router.get("/metrics")
async def get_metrics():
    return {
        "timestamp": datetime.utcnow(),
        "cpu_usage": psutil.cpu_percent(),
        "memory_usage": psutil.virtual_memory().percent,
        "disk_usage": psutil.disk_usage('/').percent,
        "active_connections": len(psutil.net_connections()),
        "uptime": time.time() - psutil.boot_time()
    }

@router.get("/health/detailed")
async def detailed_health():
    return {
        "status": "healthy",
        "database": await check_database_health(),
        "redis": await check_redis_health(),
        "external_apis": await check_external_apis()
    }
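The `check_*` helpers aren't shown elsewhere in this guide; minimal sketches might look like this, assuming the SQLAlchemy `engine` and a `redis_client` from the earlier sections:
from sqlalchemy import text

async def check_database_health() -> str:
    try:
        with engine.connect() as conn:
            conn.execute(text("SELECT 1"))  # cheapest possible round trip
        return "healthy"
    except Exception:
        return "unhealthy"

async def check_redis_health() -> str:
    try:
        redis_client.ping()
        return "healthy"
    except Exception:
        return "unhealthy"

async def check_external_apis() -> str:
    # Replace with real checks against your upstreams (Stripe, email provider, ...)
    return "not_checked"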
Performance Logging
Implement performance logging middleware:
# middleware/performance_middleware.py
import logging
import time

from fastapi import Request

logger = logging.getLogger(__name__)

class PerformanceMiddleware:
    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        start_time = time.time()
        await self.app(scope, receive, send)
        process_time = time.time() - start_time

        if scope["type"] == "http":
            request = Request(scope, receive)
            logger.info(
                f"Request: {request.method} {request.url.path} "
                f"- Time: {process_time:.4f}s"
            )
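Since it's plain ASGI middleware, registration is one line (a sketch):
from fastapi import FastAPI

app = FastAPI()
app.add_middleware(PerformanceMiddleware)  # logs wall-clock time for every request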
Resource Monitoring
Monitor system resources with Dokku:
# Check resource usage
dokku resource:report your-app-name
# Monitor logs for performance issues
dokku logs your-app-name -t
# Check process status
dokku ps:report your-app-name
Database Connection Optimization
Connection Pool Configuration
Proper connection pool configuration is crucial for scaling database operations effectively.
# database.py
from sqlalchemy import create_engine
from sqlalchemy.pool import QueuePool

# Production configuration
engine = create_engine(
    DATABASE_URL,
    poolclass=QueuePool,
    pool_size=20,          # Number of connections to maintain
    max_overflow=30,       # Additional connections when needed
    pool_pre_ping=True,    # Verify connections before use
    pool_recycle=3600,     # Recycle connections every hour
    echo=False,            # Disable SQL logging in production
    future=True
)

# Connection pool monitoring
@app.get("/db-pool-status")
async def get_db_pool_status():
    return {
        "pool_size": engine.pool.size(),
        "checked_in": engine.pool.checkedin(),
        "checked_out": engine.pool.checkedout(),
        "overflow": engine.pool.overflow(),
        "status": engine.pool.status()  # summary string; QueuePool exposes no invalidated() accessor
    }
Celery Worker Scaling
Background Task Scaling
Scale Celery Workers
Scale your Celery workers based on task queue length:
# Scale workers dynamically
dokku ps:scale your-app-name worker=4
# Monitor worker status
dokku run your-app-name celery -A celery_setup inspect active
dokku run your-app-name celery -A celery_setup inspect stats
Queue Monitoring
Monitor your task queues:
# celery_monitoring.py
from fastapi import APIRouter

router = APIRouter()

@router.get("/celery/status")
async def celery_status():
    from celery_setup import celery_app

    inspect = celery_app.control.inspect()
    stats = inspect.stats()
    active = inspect.active()

    return {
        "workers": stats,
        "active_tasks": active,
        "queue_length": get_queue_length()
    }

def get_queue_length():
    from celery_setup import celery_app

    # LLEN on the default queue key; assumes a Redis broker
    with celery_app.connection() as conn:
        return conn.default_channel.client.llen('celery')
Task Routing
Configure task routing for better performance:
# celery_setup.py
from celery import Celery

celery_app = Celery('fastlaunchapi')

# Configure task routing
celery_app.conf.update(
    task_routes={
        'email.send_email': {'queue': 'email'},
        'payments.process_payment': {'queue': 'payments'},
        'ai.generate_content': {'queue': 'ai'},
    },
    worker_prefetch_multiplier=1,
    task_acks_late=True,
    worker_max_tasks_per_child=1000,
)
Scale specific queues (each named worker must exist as a process type in your Procfile — see the sketch below):
# Scale specific worker types
dokku ps:scale your-app-name worker=2 email-worker=1 ai-worker=1
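A hypothetical Procfile with one process type per queue (the exact commands depend on your app; `main:app` and the concurrency values here are illustrative):
# Procfile (illustrative)
web: uvicorn main:app --host 0.0.0.0 --port $PORT
worker: celery -A celery_setup worker -Q celery,payments --concurrency=2
email-worker: celery -A celery_setup worker -Q email --concurrency=2
ai-worker: celery -A celery_setup worker -Q ai --concurrency=1
beat: celery -A celery_setup beat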
Security Considerations for Scaling
Scaling Security Checklist:
- Implement rate limiting
- Use connection pooling securely
- Monitor for DDoS attacks
- Secure inter-service communication
- Implement proper authentication for scaled services
- Use HTTPS for all communications
- Monitor and log security events
Rate Limiting Implementation
# rate_limiting.py
import time
import uuid

from fastapi import HTTPException, Request

class RateLimiter:
    def __init__(self, redis_client, max_requests=100, window=60):
        self.redis = redis_client
        self.max_requests = max_requests
        self.window = window

    async def check_rate_limit(self, request: Request):
        client_ip = request.client.host
        key = f"rate_limit:{client_ip}"
        current_time = int(time.time())

        pipe = self.redis.pipeline()
        # Sliding window: drop entries older than the window, count the rest
        pipe.zremrangebyscore(key, 0, current_time - self.window)
        pipe.zcard(key)
        # Unique member per request, so hits within the same second aren't collapsed
        pipe.zadd(key, {f"{current_time}:{uuid.uuid4().hex}": current_time})
        pipe.expire(key, self.window)
        results = pipe.execute()

        current_requests = results[1]
        if current_requests >= self.max_requests:
            raise HTTPException(
                status_code=429,
                detail="Rate limit exceeded"
            )
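To enforce the limit on a route, the checker can be attached as a dependency — a sketch, assuming a module-level `redis_client`:
from fastapi import Depends

limiter = RateLimiter(redis_client, max_requests=100, window=60)

# FastAPI injects the Request into check_rate_limit; a 429 aborts the request
@app.get("/api/data", dependencies=[Depends(limiter.check_rate_limit)])
async def get_data():
    return {"ok": True}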
Troubleshooting Scaling Issues
Common Scaling Problems
Common Issues:
- Database connection pool exhaustion
- Memory leaks in scaled processes
- Uneven load distribution
- Session persistence issues
- Cache invalidation problems
- Network bottlenecks
# Check database connections
dokku postgres:info your-app-db
# Monitor slow queries
dokku postgres:connect your-app-db
# Inside PostgreSQL:
# SELECT query, mean_time, calls FROM pg_stat_statements ORDER BY mean_time DESC LIMIT 10;
# Check connection pool status
dokku run your-app-name python -c "from database import engine; print(engine.pool.status())"
# Check memory usage
dokku resource:report your-app-name
# Monitor memory leaks
dokku logs your-app-name | grep -i "memory\|oom"
# Restart processes if needed
dokku ps:restart your-app-name
# Check response times
dokku logs your-app-name | grep "response_time"
# Monitor CPU usage
dokku run your-app-name top
# Check network connections
dokku run your-app-name netstat -an
Best Practices for Scaling
📊 Monitor Everything
Implement comprehensive monitoring and alerting
🔄 Gradual Scaling
Scale gradually and monitor the impact
🧪 Load Testing
Test your scaling configuration under load
🔒 Security First
Maintain security standards while scaling
Load Testing
Test your scaled application:
# Install load testing tools
pip install locust

# Create a load test script:
# locustfile.py
from locust import HttpUser, task, between

class WebsiteUser(HttpUser):
    wait_time = between(1, 3)

    @task(3)
    def index_page(self):
        self.client.get("/")

    @task(1)
    def api_endpoint(self):
        self.client.get("/api/users/me")

# Run the load test
locust -f locustfile.py --host=https://your-app.com
Next Steps
🔐 Security
Implement security best practices for scaled applications
📊 Monitoring
Set up comprehensive monitoring and alerting
🚀 Deployment
Optimize your deployment process for scaled applications
🔧 CI/CD
Implement continuous integration and deployment
This scaling guide provides comprehensive strategies for growing your FastLaunchAPI application. Start with horizontal scaling and gradually implement advanced techniques based on your specific needs and traffic patterns.