Server Management

Scaling AcelleMail for 100K+ Emails Per Day

At 100K sends/day you cross the threshold where the default AcelleMail config starts to drag — queue depths climb, MySQL slow queries appear, occasional PHP-FPM 502s. This guide walks the bottleneck math, the configs to tune (queue pool size, MySQL InnoDB, PHP-FPM, nginx, Redis), when to split components onto separate hosts, and the cost/benefit at each step.

December 15, 2025 12 min read Advanced

What this is for

At 100K sends/day, the default AcelleMail config starts to drag. You'll notice:

Queue depths that don't drain (1k+ jobs queued even when "nothing is happening")
MySQL slow query log filling up
Occasional 502 Bad Gateway when admin pages load slowly
Workers getting OOM-killed during list imports
Customers complaining about "sending taking forever"

None of these are individually catastrophic, but each one compounds. 100K/day is the threshold where one well-tuned box starts to struggle — and where 30 minutes of targeted tuning gets you another 5× headroom. This guide walks the math, the configs, and the architectural decisions, in the order you should hit them.

Step 0 — Do the bottleneck math

100K sends/day = ~70 sends/minute if perfectly even. In reality, you get bursts — a 9 AM Tuesday campaign blast might be 100k sends in the first 20 minutes = 5,000 sends/minute peak.

Each "send" is one job pulled from Redis → one HTTP call to the sending provider (SES/Mailgun/SendGrid) → one row inserted into email_log. A healthy worker handles ~20-50 sends/minute (limited by sending-provider API latency, ~50-200ms per call).

So to handle 5,000 sends/minute peak, you need:

5,000 sends/min ÷ 30 sends/min/worker = ~170 worker-slots needed at peak

That's significantly above the default 15-worker pool. Either accept that bursts will queue up and drain over 5-10 minutes (often fine), or scale the worker pool.

The right answer depends on your customer expectations. For most use cases, a 10-minute queue drain on a 100k-send burst is acceptable — and the default 15 + auto-scale-to-20 via queue:adjust handles it. For latency-sensitive transactional sends (welcome emails, password resets), bump the pool higher.
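The arithmetic in this step is easy to rerun with your own numbers. A minimal Python sketch, assuming the illustrative figures used above (100k sends, a 20-minute burst, ~30 sends/min per worker); the function names are hypothetical, and the per-worker rate should be replaced with one measured on your own install:

```python
import math

def workers_needed(peak_sends_per_min: float, sends_per_min_per_worker: float) -> int:
    """Worker slots required to keep pace with a given peak rate."""
    return math.ceil(peak_sends_per_min / sends_per_min_per_worker)

def drain_minutes(queued_sends: int, workers: int, sends_per_min_per_worker: float) -> float:
    """Minutes for a fixed-size pool to work through a backlog."""
    return queued_sends / (workers * sends_per_min_per_worker)

daily_sends = 100_000
average_per_min = daily_sends / (24 * 60)   # perfectly even spread over the day
peak_per_min = daily_sends / 20             # the whole day's volume in 20 minutes

print(round(average_per_min))                      # 69, the "~70 sends/minute" above
print(workers_needed(peak_per_min, 30))            # 167, rounded to ~170 in the text
print(round(drain_minutes(daily_sends, 15, 30)))   # backlog drain with the default pool
```

The drain estimate scales inversely with the per-worker rate, so it is very sensitive to that one input. Measure real throughput on a test campaign before deciding whether the default pool is enough.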

Step 1 — Queue worker pool sizing

Per the supervisor setup guide, the two-tier pool default is 2 master + 15 worker = 17 total at Medium tier (4 vCPU / 8 GB).

For 100k+/day, the right starting point is Large tier (8 vCPU / 16 GB) with 4 + 30 workers:

# /etc/supervisor/conf.d/acellemail-master.conf
numprocs=4   # was 2

# /etc/supervisor/conf.d/acellemail-worker.conf
numprocs=30  # was 15

sudo supervisorctl reread && sudo supervisorctl update
sudo supervisorctl status
# Should now show 4 + 30 = 34 RUNNING processes

Memory math: each worker peaks ~256-512 MB. 30 × 384 MB ≈ 11.5 GB just for workers. Add MySQL (3 GB), Redis (1 GB), PHP-FPM web (1 GB), nginx (negligible), and the OS (1 GB), and the worst case is ~17.5 GB, which is slightly more than a 16 GB box has. That worst case assumes every worker peaks at once; if you see swapping or OOM kills, scale to 24 GB.
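The same budget can be kept as a small script so it is easy to redo when a component changes. A sketch in Python using the per-component estimates quoted above; the 384 MB per-worker figure is the midpoint of the quoted range, and `fits` is a hypothetical helper:

```python
# Peak-memory budget in MB, using the per-component estimates from this step.
WORKER_PEAK_MB = 384  # midpoint of the 256-512 MB range

budget_mb = {
    "queue workers (30)": 30 * WORKER_PEAK_MB,
    "MySQL": 3 * 1024,
    "Redis": 1 * 1024,
    "PHP-FPM web": 1 * 1024,
    "OS": 1 * 1024,
}
total_mb = sum(budget_mb.values())

def fits(total_mb: int, ram_gb: int) -> bool:
    """True when the worst-case total stays within the box's RAM."""
    return total_mb <= ram_gb * 1024

for name, mb in budget_mb.items():
    print(f"{name:20s} {mb / 1024:5.2f} GB")
print(f"{'total':20s} {total_mb / 1024:5.2f} GB")
print("fits in 16 GB:", fits(total_mb, 16))
print("fits in 24 GB:", fits(total_mb, 24))
```

This is a worst case in which every worker peaks at the same moment. Treat it as an upper bound and compare it with what `free -m` reports during a real campaign.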

Verify queue drain rate is acceptable:

# Trigger a 10k send-test campaign, then watch the drain:
watch -n 2 'redis-cli llen queues:batch; redis-cli llen queues:high'

You want both to be back to near-zero within 5 minutes.
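To turn two readings from that `watch` output into a time-to-empty estimate, a small calculation is enough. A Python sketch, assuming you note the queue depth twice a known number of seconds apart; the function name and the sample depths are hypothetical:

```python
from typing import Optional

def minutes_to_empty(depth_before: int, depth_after: int, seconds_between: float) -> Optional[float]:
    """Estimate minutes until the queue is empty from two LLEN readings.

    Returns None when the queue is not shrinking between the readings.
    """
    drained = depth_before - depth_after
    if drained <= 0:
        return None
    rate_per_sec = drained / seconds_between
    return depth_after / rate_per_sec / 60

# Example: 10,000 jobs, then 9,400 jobs ten seconds later.
print(minutes_to_empty(10_000, 9_400, 10))
```

A `None` result during a campaign means jobs are arriving at least as fast as workers consume them, which is the signal to revisit the pool size.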

Step 2 — MySQL tuning

Edit /etc/mysql/mysql.conf.d/mysqld.cnf:

[mysqld]
# Cache hot indexes + data in RAM. Set to 50-75% of system RAM if MySQL is alone.
# For a co-located AcelleMail + MySQL on 16 GB, 4-6 GB is a good baseline.
innodb_buffer_pool_size = 4G

# Larger redo logs = fewer flushes under heavy write load (campaign blast).
# On MySQL 8.0.30+ this variable is deprecated; set innodb_redo_log_capacity instead.
innodb_log_file_size = 512M

# Allow more concurrent connections (30 workers + web + admin + cron)
max_connections = 300

# Query cache: deprecated in MySQL 5.7.20 and removed in MySQL 8.0.
# MySQL 8.0+ refuses to start if these variables are present, so uncomment them only on 5.7:
# query_cache_type = 0
# query_cache_size = 0

# Trade durability for throughput: durable enough for queues; up to ~1s of commits can be lost on an OS crash or power failure
innodb_flush_log_at_trx_commit = 2

# Optimize for SSD (default is OK for most installs; tune if you see I/O bottleneck in slow log)
innodb_io_capacity = 2000
innodb_io_capacity_max = 4000

# Per-table tablespaces (newer default, but verify)
innodb_file_per_table = 1

Restart MySQL:

sudo systemctl restart mysql

Verify the buffer pool is actually using the new size:

sudo mysql -e "SHOW VARIABLES LIKE 'innodb_buffer_pool_size';"
# Expect: 4294967296 (4 GB in bytes)

innodb_flush_log_at_trx_commit = 2 trade-off: the default 1 writes and flushes the redo log to disk on every commit (full durability). Setting 2 still writes the log on every commit but flushes it to disk only about once per second, so up to ~1s of commits can be lost on an OS crash or power failure (a crash of mysqld alone loses nothing). For email-sending workloads this is acceptable: at most about one second of email_log writes might be lost, but the sending itself completed (recipient got the email). The throughput gain is significant (often 3-5×). If you need full durability, leave it at 1 and accept slower writes.
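To put a number on that exposure, multiply the send rate by the flush interval. A rough Python sketch, assuming one email_log row per send (as described in Step 0) and a one-second flush interval; `rows_at_risk` is a hypothetical helper:

```python
def rows_at_risk(sends_per_min: float, flush_interval_s: float = 1.0) -> float:
    """Upper bound on email_log inserts not yet flushed to disk at any instant."""
    return sends_per_min / 60 * flush_interval_s

print(round(rows_at_risk(5_000)))    # 83 rows during the Step 0 peak burst
print(round(rows_at_risk(70), 1))    # 1.2 rows at the even daily average
```

Those rows are log records for emails that were already handed to the provider, so the practical cost is a small gap in reporting rather than lost mail.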

Watch the slow query log

# Enable in mysqld.cnf:
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 1.0
log_queries_not_using_indexes = 0

sudo systemctl restart mysql

# Watch:
sudo tail -f /var/log/mysql/slow.log

Common AcelleMail slow queries and their fixes:

SELECT ... FROM subscribers WHERE list_id = X AND status = 'subscribed' slow → ensure compound index on (list_id, status) exists (run SHOW INDEX FROM subscribers to verify)
SELECT ... FROM email_log WHERE campaign_id = X slow → add index on campaign_id
SELECT ... FROM jobs ORDER BY id LIMIT 1 slow → switch queue driver to Redis (it eliminates jobs table polling)
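When the slow log gets long, it helps to rank entries instead of tailing them. A minimal Python sketch that reads the standard slow-log layout and sorts by Query_time; the sample text is made up for illustration, and `mysqldumpslow` or `pt-query-digest` are the more complete tools for real analysis:

```python
import re
from typing import List, Tuple

# Matches the statistics line MySQL writes before each logged statement.
STATS = re.compile(
    r"# Query_time: (?P<secs>[\d.]+)\s+Lock_time: [\d.]+\s+"
    r"Rows_sent: \d+\s+Rows_examined: (?P<rows>\d+)"
)

def parse_slow_log(text: str) -> List[Tuple[float, int, str]]:
    """Return (query_time_s, rows_examined, sql) tuples, slowest first."""
    entries = []
    current = None
    for line in text.splitlines():
        match = STATS.match(line)
        if match:
            current = [float(match["secs"]), int(match["rows"]), []]
            entries.append(current)
        elif line.startswith("#"):
            current = None  # header of the next entry (# Time / # User@Host)
        elif current is not None and line.strip() and not line.startswith(("SET timestamp", "use ")):
            current[2].append(line.strip())
    return sorted(
        ((secs, rows, " ".join(sql)) for secs, rows, sql in entries),
        key=lambda entry: entry[0],
        reverse=True,
    )

# Made-up sample in the slow-log layout, for illustration only.
SAMPLE = """\
# Time: 2025-12-15T09:01:02.000000Z
# User@Host: acellemail[acellemail] @ localhost []  Id:    42
# Query_time: 1.204000  Lock_time: 0.000100 Rows_sent: 500  Rows_examined: 90000
SET timestamp=1765789262;
SELECT id FROM subscribers WHERE list_id = 3 AND status = 'subscribed';
# Time: 2025-12-15T09:01:05.000000Z
# User@Host: acellemail[acellemail] @ localhost []  Id:    43
# Query_time: 2.513000  Lock_time: 0.000110 Rows_sent: 1  Rows_examined: 482113
SET timestamp=1765789265;
SELECT COUNT(*) FROM email_log WHERE campaign_id = 17;
"""

for secs, rows, sql in parse_slow_log(SAMPLE):
    print(f"{secs:6.2f}s  {rows:>8} rows examined  {sql}")
```

A high rows-examined count next to a low rows-sent count is the usual sign of a missing index, which matches the fixes listed above.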

Step 3 — PHP-FPM tuning

Edit /etc/php/8.3/fpm/pool.d/www.conf:

pm = dynamic
pm.max_children = 50      ; default 5
pm.start_servers = 10     ; default 2
pm.min_spare_servers = 5  ; default 1
pm.max_spare_servers = 20 ; default 3
pm.max_requests = 500     ; recycle workers periodically to prevent memory creep

; Important: limit how long a single FPM worker can hang before being killed
; — protects against a slow outbound HTTP call blocking the pool
request_terminate_timeout = 60s

Restart PHP-FPM:

sudo systemctl restart php8.3-fpm

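Memory math: 50 workers × ~80 MB peak = ~4 GB worst case. That is more than the 1 GB budgeted for PHP-FPM in Step 1, so on a 16 GB box that also runs 30 queue workers, watch actual pool usage and either lower pm.max_children or move to 24 GB if the pool regularly runs near its cap.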

request_terminate_timeout = 60s is a hard lesson from production. Without it, a single web request that hangs (slow remote API call, DNS timeout, blocked syscall) holds a worker indefinitely. With the default pm.max_children = 5, five such requests = entire FPM pool blocked = site appears down. The 60-second timeout kills the hung worker, and the PHP-FPM master process spawns a replacement. Set this on every production install.
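The pool's memory ceiling follows directly from pm.max_children, so it can be sized from a RAM budget rather than guessed. A small Python sketch, assuming the ~80 MB per-child peak quoted in this step; `max_children_for` is a hypothetical helper, and the per-child figure is worth checking against `ps` output on your own server:

```python
def max_children_for(ram_budget_mb: int, per_child_mb: int) -> int:
    """Largest pm.max_children that stays inside a RAM budget."""
    return ram_budget_mb // per_child_mb

PER_CHILD_MB = 80  # peak per FPM child, as estimated in this step

print(50 * PER_CHILD_MB / 1024)              # worst case in GB at pm.max_children = 50
print(max_children_for(4096, PER_CHILD_MB))  # 51 children fit in a 4 GB budget
print(max_children_for(1024, PER_CHILD_MB))  # 12 children fit in a 1 GB budget
```

Working backwards from the budget makes the trade-off explicit: the RAM you reserve for the web pool is RAM the queue workers cannot use.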

Step 4 — Nginx tuning

Edit /etc/nginx/nginx.conf:

worker_processes auto;                # one per CPU core

events {
    worker_connections 2048;          # default 768 — too low for high-traffic sites
    use epoll;                        # default on Linux; explicit for clarity
    multi_accept on;
}

http {
    # gzip on responses
    gzip on;
    gzip_min_length 1024;
    gzip_types text/plain text/css text/javascript application/javascript application/json application/xml;
    gzip_comp_level 5;

    # Reasonable client + buffer settings for the admin UI
    client_max_body_size 300M;
    client_body_buffer_size 128k;
    client_header_buffer_size 8k;
    large_client_header_buffers 4 16k;

    # Keepalives reduce TCP overhead from the admin's repeated AJAX polls
    keepalive_timeout 65;
    keepalive_requests 100;           # nginx 1.19.10+ already defaults to 1000; drop this line there
}

Reload nginx:

sudo nginx -t && sudo systemctl reload nginx

Step 5 — Redis on the same box (still — for now)

At 100k/day, Redis stays on the same box. Splitting it off is premature optimization until you hit ~1M/day or you need HA. Confirm Redis is the queue driver per Redis for Queue Processing:

grep QUEUE_CONNECTION /var/www/acellemail/.env
# Expect: QUEUE_CONNECTION=redis

# Confirm Redis is sized for the workload:
redis-cli config get maxmemory
# Reported in bytes. Should be at least 2 GB (2147483648) for 100k/day; 4 GB for headroom

redis-cli config get maxmemory-policy
# MUST be: noeviction (queues require this — see Redis article)

Step 6 — When to split DB to its own host

Co-located DB starts to bottleneck around 5-10M sends/month (~150k-300k/day average). Symptoms:

iostat -x 1 shows sustained 80%+ disk %util during campaigns
top reports mysqld consistently in the top 2 CPU consumers
SHOW PROCESSLIST shows 100+ active connections with worker queries waiting

At that point, move MySQL to its own host:

DigitalOcean Managed DB — easy, ~$15/mo for the smallest tier. Adds 1-2ms latency per query (private network). Worth it for the operational simplicity.
AWS RDS — same model. db.t3.medium is a reasonable starting point at ~$55/mo on-demand or ~$35 with 1-year Reserved.
Self-managed on a separate droplet — cheapest, more ops work. Pick if you already have MySQL expertise in-house.

In AcelleMail's .env:

DB_HOST=10.0.0.5         # private IP of the DB box (NEVER use public IP)
DB_DATABASE=acellemail
DB_USERNAME=acellemail
DB_PASSWORD=...

# Force private network if your cloud provides one

Run php artisan config:clear after.

Step 7 — When to add a second app server

At ~10-20M sends/month (300k-600k/day), one app server starts to struggle even after all the tuning above. The right next step is horizontal scaling:

[Load Balancer]
       ↓
  ┌────┴────┐
[App1]   [App2]   ← stateless — run web UI + workers
       ↓
[MySQL on managed]
[Redis on its own box, replicated]
[Object storage for shared storage/ — S3 / DO Spaces / B2]

Key changes from single-server:

Move storage/ to shared object storage (S3 / DO Spaces). Both app servers must see the same files.
Move sessions to Redis (SESSION_DRIVER=redis in .env). Otherwise customers get logged out when the LB routes them to a different app server.
Add a sticky-session policy on the LB for the WYSIWYG editor (it does background autosaves keyed to session).
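Run cron (the Laravel scheduler) on only one node, or guard scheduled tasks with a shared lock such as Laravel's onOneServer(), which needs a shared cache like Redis. Running cron on both nodes without a lock = double-firing every scheduled task. Supervisor-managed queue workers can run on every app node, since each Redis job is handed to a single worker.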
Add a Redis password if Redis is now on a network reachable from multiple boxes.

This is the boundary where you should consider Docker / Kubernetes — orchestrating multiple stateless app instances by hand gets tedious fast. See the Docker deployment guide.

Step 8 — Sending-provider rate limits

You can scale AcelleMail all you want, but your sending provider's rate limit caps your throughput. Watch these:

Provider	Default limit	How to raise
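Amazon SES	1 send/sec and 200 sends/day in the sandbox; production access typically starts around 14/s and 50,000/day and ramps with reputation	Request a quota increase through AWS Support or Service Quotas; limits also rise automatically as reputation builds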
SendGrid	100 sends/sec on Pro plan	Upgrade plan
Mailgun	100 sends/sec on starter; higher tiers go to 1000/s	Upgrade plan
Postmark	10 sends/sec default; up to 100/s on request	Email support

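At 100k/day = ~1.16 sends/second average, you're well within any provider's per-second limits on average. Bursts are a different story: the Step 0 peak of 5,000 sends/minute is ~83/sec, above the entry-level caps in the table, so the provider rather than your worker pool can end up setting the drain time. At 1M/day = ~11.6/sec average, you'll bump SES's 14/s cap during even modest peaks. Plan ahead: see SES Sending Limits Cookbook.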
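Average and burst rates are worth computing side by side, because the provider cap applies to the burst. A short Python sketch, using the 14 sends/sec SES figure from the table as an example cap; the helper names are hypothetical:

```python
def sends_per_second(sends: int, window_seconds: int) -> float:
    """Average send rate over a time window."""
    return sends / window_seconds

def minutes_at_cap(sends: int, provider_cap_per_sec: float) -> float:
    """Fastest possible delivery of a burst when the provider cap is the limit."""
    return sends / provider_cap_per_sec / 60

DAY = 24 * 60 * 60

print(round(sends_per_second(100_000, DAY), 2))       # 1.16 average at 100k/day
print(round(sends_per_second(1_000_000, DAY), 2))     # 11.57 average at 1M/day
print(round(sends_per_second(100_000, 20 * 60), 1))   # 83.3 during a 20-minute burst
print(round(minutes_at_cap(100_000, 14)))             # 119 minutes at a 14/s cap
```

If the time at the provider cap is longer than the drain time your worker pool could achieve, adding workers will not help until the cap is raised.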

Quick-reference tuning checklist

After all the above, your config delta from default looks like:

Component	Setting	Default	Tuned (100k/day)
Supervisor master	numprocs	2	4
Supervisor worker	numprocs	15	30
MySQL	innodb_buffer_pool_size	128M	4G
MySQL	max_connections	151	300
MySQL	innodb_log_file_size	50M	512M
MySQL	innodb_flush_log_at_trx_commit	1	2
PHP-FPM	pm.max_children	5	50
PHP-FPM	request_terminate_timeout	(unset)	60s
nginx	worker_connections	768	2048
Redis	maxmemory	(unset)	2-4G
Redis	maxmemory-policy	noeviction (already)	noeviction (verify)
`.env`	QUEUE_CONNECTION	database	redis
`.env`	CACHE_DRIVER	file	redis
`.env`	SESSION_DRIVER	file	redis

Common issues

Symptom	Cause	Fix
Queue depth grows during campaign blast, doesn't drain	Worker pool too small	Step 1 — bump worker numprocs
MySQL CPU pegs at 100% during sends	Buffer pool too small; reading from disk constantly	Step 2 — increase `innodb_buffer_pool_size`
Random 502 Bad Gateway on admin pages	PHP-FPM pool exhausted	Step 3 — bump `pm.max_children`; add `request_terminate_timeout`
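Out-of-memory failures during list import	`memory_limit` too low for big CSV (PHP logs "Allowed memory size exhausted")	`php.ini` `memory_limit = 1G` (for fpm + cli). If the kernel OOM killer is firing instead (check `dmesg`), reduce worker numprocs or add RAM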
Workers slow even when queue is empty	Misconfigured `--sleep` (workers spinning)	Verify supervisor configs include `--sleep=3`
Mail-merge campaigns take minutes per recipient	Heavy template + many merge tags	Profile with `xhprof` / `blackfire`; cache merged content where possible
Sending IP getting rate-limited by Gmail/Outlook	Throughput exceeds receiver's per-IP cap	Add IP rotation; see Multi-Server Rotation Pattern
Free disk space dropping fast	`email_log` table growing without bound	`system:cleanup` daily task should prune; verify cron is firing
`php artisan` operations slow	Cached config / view files stale	`php artisan optimize:clear` after major config changes

When to stop tuning and just scale up

A single tuned 8 vCPU / 16 GB box handles ~1M-2M sends/month comfortably. Above that, add hardware before tuning further:

2M+/mo → 16 GB → 24-32 GB RAM; consider RDS for DB
10M+/mo → Multi-app-server architecture (Step 7)
20M+/mo → Multi-region, dedicated sending IPs, custom partition for email_log, full DBA review

The cost of an additional $30-100/mo of hardware is much less than the cost of 4 hours of your time spent micro-tuning.

FAQ

Should I tune PHP-FPM pm = static or pm = ondemand instead of dynamic? For a busy-most-of-the-time AcelleMail (campaigns going out throughout the day), dynamic is the best balance. static wastes RAM on idle nights. ondemand adds latency on the first request after idle. dynamic is the right default for 100k/day.

Why not Cloudflare in front? Cloudflare can absorb traffic spikes to public AcelleMail pages (tracking pixel endpoints, unsubscribe links). It can't help with worker throughput (those are outbound SES/API calls from the server). Worth adding for the tracking-pixel layer; not a substitute for the tuning above.

Should I tune the MySQL tmp_table_size / max_heap_table_size? Defaults are usually fine. If SHOW STATUS LIKE 'Created_tmp_disk_tables' shows a high number relative to Created_tmp_tables, bump both to 256M.

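What about HTTP/2 / HTTP/3? nginx has supported HTTP/2 since 1.9.5 and HTTP/3 since 1.25.0. Modest improvement on admin UI responsiveness; no impact on send throughput. HTTP/2 is worth enabling: listen 443 ssl http2; in the vhost (on nginx 1.25.1+ the listen parameter is deprecated in favour of a separate http2 on; directive).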

Does Acelle support read replicas? Yes — Laravel's read / write database config in config/database.php supports separate read endpoints. Useful at 5M+/mo when read-heavy operations (campaign reports, subscriber search) start to slow down. Pre-configured Laravel pattern; AcelleMail honours it.

Can I cap a single campaign's send rate? Yes — set a per-sending-server throttle. See Sending Throttling Strategies for the full configuration.

Related articles

Setting Up Queue Workers and Cron Jobs — the supervisor baseline (this article extends it)
Redis for Queue Processing — Redis is a prerequisite for any meaningful scaling
Automated Database Backups — protect your scaled-up DB
Server Requirements and Hosting Options — pick the right tier for your volume
Install AcelleMail on AWS EC2 — for the multi-AZ / multi-server architecture
Docker Deployment Guide for AcelleMail — for the multi-app-server pattern
Configuring Amazon SES with AcelleMail — pair with SES at scale
SES Sending Limits Cookbook — raise the sending-provider cap
Multi-Server Rotation Pattern — multi-IP setup beyond a single SES account
Sending Throttling Strategies — pace sends to avoid receiver throttling
Post-Install Hardening Checklist — apply hardening after every scale-up


