What this is for#
AcelleMail ships with QUEUE_CONNECTION=database as the default in .env.example line 40. The database driver puts all queue jobs in a MySQL/MariaDB table and uses row-level locks to coordinate between worker processes. It works out of the box, requires no extra services, and is right for installations up to ~500k emails per day.
Past that threshold, database-queue contention starts to dominate worker time. Workers spend more time waiting for locks than processing jobs. The symptoms are subtle at first — a small drop in throughput — then sudden when a campaign batch pushes contention past a critical point.
This guide walks through the three queue drivers AcelleMail supports (database, Redis, AWS SQS): the threshold where each wins, the migration path between them, the AUTOMATION_QUEUE_CONNECTION separation pattern for high-volume installs, and the operational gotchas you only learn after the first 3am queue incident.
How AcelleMail uses queues#
Before picking a driver, understand what the queues actually do. From resources/documents/supervisor_{master,worker}_config.tmpl:
Master pool (2 processes by default):
queue:work --queue=import,default --tries=1 --max-time=180
The master pool handles:
import — CSV imports of subscribers (chunky long-running work)
default — miscellaneous Laravel default queue
Worker pool (15 processes by default):
queue:work --queue=high,batch,single,automation,automation-dispatch --tries=1 --max-time=180
The worker pool handles email sending:
high — admin-priority sends (welcome email to a new admin user, password resets, etc.)
batch — campaign sends (the high-volume firehose)
single — individual transactional sends
automation — automation-step emails (drip sequences, welcome series, etc.)
automation-dispatch — the meta job that scans for due automations every 5 minutes
So a single AcelleMail install runs 17 worker processes (2 master + 15 worker), each polling the queue table or service for work.
The implication for queue drivers: with 17 workers all polling the same queue backend, the polling pattern itself matters a lot. A database queue with 17 SELECT FOR UPDATE statements firing every few seconds is a different load profile than 17 BLPOP'ing Redis clients.
The three driver options#
Database (default)#
How it works. A jobs table in your application database. Workers run (roughly) SELECT * FROM jobs WHERE queue = ? AND reserved_at IS NULL ORDER BY id LIMIT 1 FOR UPDATE SKIP LOCKED, claim the row by setting reserved_at, process it, then DELETE it. Failed jobs go to a failed_jobs table.
Pros.
- Zero infrastructure overhead — already have MySQL.
- Easy to inspect (SELECT * FROM jobs shows current backlog).
- Easy to clear (TRUNCATE in a panic).
- Transaction-safe with the rest of your data (campaign create + queue dispatch can commit in one transaction; see the sketch after this list).
- Backups capture queue state alongside application state.
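That transaction-safety point is concrete enough to sketch. A minimal example, assuming illustrative names (Campaign and SendCampaignJob are stand-ins, not AcelleMail's actual classes):
use Illuminate\Support\Facades\DB;
// With QUEUE_CONNECTION=database, the inserted job row lives in the same
// database as the campaign row, so both commit or roll back together.
DB::transaction(function () {
    $campaign = Campaign::create(['name' => 'October newsletter']);
    SendCampaignJob::dispatch($campaign)->onQueue('batch');
});
This guarantee is specific to the database driver; on Redis or SQS the dispatch escapes the transaction, which is why Laravel offers after-commit dispatching for those drivers.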
Cons.
- Polling load on MySQL. Each worker polls every few seconds (configurable). 17 workers × every 3 seconds = ~5-6 SELECT queries per second baseline. Manageable but not free.
- Row-lock contention at high concurrency. SKIP LOCKED helps but isn't free.
- No native blocking semantics. Workers must poll; they can't subscribe-and-block like Redis BLPOP.
- Queue table grows. Completed jobs are deleted, but unprocessed and retried jobs accumulate, and inspecting the backlog gets slower as the table grows.
Right when: <500k emails/day, single-server install, no separate cache layer planned, ops team is database-only-comfortable.
Redis#
How it works. Redis lists per queue name. Workers block on the list with BLPOP (BLPOP queues:batch 5 waits up to 5 seconds for work). When a campaign dispatches, jobs are RPUSHed onto the list. Failed jobs still go to the failed-job store, which is configured separately from the queue driver (the database failed_jobs table by default).
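Queue names map directly onto Redis keys. A sketch of the mapping (SendBatchJob is an illustrative name, not AcelleMail's real job class):
// Dispatching onto the "batch" queue of the redis connection...
SendBatchJob::dispatch($subscriberChunk)
    ->onConnection('redis')
    ->onQueue('batch');
// ...is effectively an RPUSH of the serialized job onto the Redis list
// "queues:batch", which a worker blocked in BLPOP picks up immediately.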
Pros.
- Blocking semantics. Workers sleep until work arrives — no polling overhead.
- Sub-millisecond enqueue/dequeue. 100k jobs/sec on commodity Redis.
- Lower MySQL load. Moving queue traffic off the application DB is a bigger win than people expect.
- Native pub/sub for monitoring. Tools like Horizon (Laravel's official queue dashboard) plug in cleanly.
- Memory-only is fast; persistence is configurable (RDB snapshots or AOF for durability).
Cons.
- Another moving part. Redis is a service to run, monitor, back up, secure.
- Memory cost. Backlogged queues live in RAM. A 1M-job backlog at ~2KB each = ~2GB Redis memory.
- Weaker inspection ergonomics. There is no "SELECT * FROM queue"; you work through redis-cli or a dashboard.
- Failover complexity. Single-Redis is a single point of failure; HA Redis (Sentinel, Cluster) is non-trivial.
Right when: 500k-5M emails/day, single Redis instance acceptable, ops team has Redis experience, throughput sensitivity matters.
AWS SQS#
How it works. Managed queue service from AWS. Workers long-poll SQS for messages. Each message has a visibility timeout: if the worker doesn't delete (acknowledge) the message within the timeout, it becomes visible again and another worker picks it up.
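On the Laravel side, SQS is a connection block plus credentials. A sketch of the standard config/queue.php shape (verify the key names against your AcelleMail version):
'sqs' => [
    'driver' => 'sqs',
    'key' => env('AWS_ACCESS_KEY_ID'),
    'secret' => env('AWS_SECRET_ACCESS_KEY'),
    // Queue URL prefix (region + account); the queue name is appended to it
    'prefix' => env('SQS_PREFIX', 'https://sqs.us-east-1.amazonaws.com/your-account-id'),
    'queue' => env('SQS_QUEUE', 'default'),
    'region' => env('AWS_DEFAULT_REGION', 'us-east-1'),
],
Then QUEUE_CONNECTION=sqs in .env selects it.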
Pros.
- Fully managed. AWS handles HA, durability, scaling.
- Massive scale. No practical upper limit on messages/sec.
- Pay-per-use. $0.40 per million requests, and each message costs roughly three requests (send, receive, delete). At 10M emails/month that's on the order of $12/month, which is trivial.
- Cross-region durability. Critical for multi-region setups.
- Native DLQ (dead-letter queue) support.
Cons.
- AWS lock-in. Or at least an AWS tax: polling SQS from servers outside AWS adds latency, and your host may bill the egress traffic.
- Per-message ordering not guaranteed on standard SQS. FIFO SQS exists but has lower throughput.
- 30s default visibility timeout can cause duplicate processing of slow jobs. Tune carefully.
- Network round-trip per poll. Slightly higher per-job latency than Redis (5-10ms vs sub-1ms).
- Per-message size limit 256KB. Rarely hit but does affect large payload jobs.
Right when: Multi-region setup, AWS-native ops team, >5M emails/day, durability requirements demand managed service.
The threshold question#
Rough heuristics:
| Daily email volume | Recommended driver | Why |
| --- | --- | --- |
| < 500k emails/day | Database | Default; no operational complexity benefit yet |
| 500k - 5M emails/day | Redis | Sweet spot for self-hosted scale |
| > 5M emails/day | Redis (HA) or SQS | Database is a bottleneck; managed service worth considering |
| Multi-region | SQS | Cross-region durability is hard with self-hosted Redis |
These are guidelines, not laws. Other factors:
- Existing ops competency. If your team already runs Redis for caching, the Redis option is cheap to add. If not, the learning curve matters.
- Database server load. If MySQL is already at 60%+ CPU during peak campaigns, moving queue off MySQL is a quick win even at lower email volumes.
- Monitoring requirements. Database queues are easy to inspect with familiar SQL tools. Redis/SQS need separate dashboards.
- Failure-mode tolerance. With the database queue, the queue backend is the application database, so "queue down" means the whole app is down anyway. With Redis, the app can stay up while the queue is down, but by default a dispatch to an unreachable Redis throws an exception rather than buffering locally, so "queue locally and drain when Redis recovers" takes application-level handling (a crude sketch follows this list).
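Laravel won't buffer locally for you; that behavior has to be built. A crude sketch of the idea, with illustrative names (SendJob and the queue_fallback table are not part of AcelleMail):
use Illuminate\Support\Facades\DB;
try {
    SendJob::dispatch($message)->onQueue('single');
} catch (\Throwable $e) {
    // Redis unreachable: park the payload in MySQL and have a cron sweep
    // re-dispatch rows from queue_fallback once Redis is back.
    DB::table('queue_fallback')->insert([
        'payload' => serialize($message),
        'created_at' => now(),
    ]);
}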
Switching from database to Redis#
The migration is straightforward in concept; the operational details matter.
1. Install + secure Redis#
# Ubuntu/Debian
sudo apt install -y redis-server
sudo systemctl enable --now redis-server
# Verify
redis-cli ping # → PONG
# Secure: edit /etc/redis/redis.conf
sudo sed -i 's/^# requirepass .*/requirepass YOUR_LONG_RANDOM_PASSWORD/' /etc/redis/redis.conf
sudo systemctl restart redis-server
Set bind 127.0.0.1 if Redis lives on the AcelleMail VPS. For separate Redis, bind to the internal-network IP and firewall port 6379 to AcelleMail's IP only.
2. Update .env#
QUEUE_CONNECTION=redis
REDIS_CLIENT=phpredis # or 'predis'
REDIS_HOST=127.0.0.1
REDIS_PASSWORD=YOUR_LONG_RANDOM_PASSWORD
REDIS_PORT=6379
REDIS_DB=0
REDIS_CACHE_DB=1 # if also using Redis as cache (recommended)
If running Redis on a separate server, set REDIS_HOST to its private IP and ensure connectivity (firewall, security groups).
3. Drain the database queue first#
You don't want jobs split across two backends during the cutover. Stop new sends, wait for the database jobs table to empty:
SELECT COUNT(*) FROM jobs; -- watch until 0
SELECT COUNT(*) FROM failed_jobs; -- check; handle separately
Or pause active campaigns and let the queue drain naturally. For high-volume installs where you can't pause, dual-run is possible: set QUEUE_CONNECTION=redis, but keep one worker on the old backend (php artisan queue:work database --queue=high,batch,single,automation,automation-dispatch --stop-when-empty) until it drains.
4. Restart workers#
sudo supervisorctl restart acellemail:*
Workers pick up the new QUEUE_CONNECTION from .env.
5. Verify#
# Inside Redis
redis-cli
> KEYS *
# Should see keys like "queues:batch", "queues:default", etc. after first send
# In application logs
tail -f storage/logs/laravel.log
# Watch for queue-related warnings
Send a test campaign to a small list. Verify the job processes successfully end-to-end.
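You can also confirm the cutover from inside the app. A quick check in php artisan tinker (a sketch):
config('queue.default');            // should now return "redis"
Queue::connection()->size('batch'); // backlog depth of the batch queue on the new backend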
The AUTOMATION_QUEUE_CONNECTION separation pattern#
For installs with heavy automation use, AcelleMail's .env supports an optional separate queue connection for automation jobs:
QUEUE_CONNECTION=redis # main campaign sends
AUTOMATION_QUEUE_CONNECTION=database # automations + automation-dispatch
Why separate?
The batch queue (campaign sends) and the automation queue (drip + welcome + sequence sends) have very different load profiles:
- Campaign sends are bursty. A big newsletter dispatch creates 100k+ jobs in seconds, then quiets down.
- Automation sends are steady. A welcome-series email per new subscriber, a birthday email per matching profile per day. Continuous low-rate.
If both share the same Redis instance, a campaign burst can backlog the queue and delay automation sends. By splitting them onto separate connections, you isolate the load.
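Under the hood this is just Laravel's per-job connection selection. A sketch of the pattern (AutomationStepJob and the queue.automation_connection config key are illustrative; AcelleMail's actual wiring lives in its own job classes):
// Resolve the automation connection, falling back to the main one. Reading
// it through config() keeps `php artisan config:cache` working.
$connection = config('queue.automation_connection')
    ?? config('queue.default');
AutomationStepJob::dispatch($step)
    ->onConnection($connection)
    ->onQueue('automation');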
When to use:
- 5M+ emails/month
- Time-sensitive automations (welcome series, abandoned cart) where delay is unacceptable
- Heavy automation volume relative to campaign volume
Most installs don't need this. Keep both on the same driver unless you have specific reason to split.
Operational gotchas#
1. Backpressure on database driver#
When the jobs table grows past ~100k rows, MySQL query plans on reserved_at IS NULL ORDER BY id LIMIT 1 start to degrade. The index helps but doesn't eliminate the problem.
Symptom. Workers report "no jobs" repeatedly while there are clearly jobs in the table. Investigation shows the worker queried for WHERE queue = 'batch' AND reserved_at IS NULL LIMIT 1 and the database optimizer scanned thousands of rows to find a single match.
Fix. Add a composite index if missing:
CREATE INDEX idx_jobs_queue_reserved ON jobs (queue, reserved_at, id);
Or migrate to Redis, where this whole class of problem doesn't exist.
2. Redis memory blowup during bounce-handler runs#
The bounce-handler command (handler:run every 30 min per routes/console.php) processes bounces in chunks. If you have a sudden bounce-rate spike (a list import gone wrong), Redis can balloon to GBs in minutes.
Symptom. Redis memory usage spikes, then crashes with OOM.
Fix. Set maxmemory + maxmemory-policy noeviction in /etc/redis/redis.conf. With noeviction, Redis refuses writes when full instead of evicting queue jobs (which would lose work). Pair with monitoring alerts.
3. SQS visibility timeout vs slow campaign jobs#
A campaign batch dispatch can take 30-60 seconds for very large lists. With the default 30s visibility timeout, SQS makes the message visible again while the first worker is still running it, and a second worker processes it too.
Symptom. Some recipients get duplicate emails after a campaign send.
Fix. Increase the visibility timeout on the queue: aws sqs set-queue-attributes --queue-url ... --attributes VisibilityTimeout=300. For a message already in flight, a worker can extend it with aws sqs change-message-visibility.
4. Stale workers after deployment#
After deploying new AcelleMail code, workers keep running with the old code in memory. They process new jobs with the stale logic.
Symptom. Jobs that should use the new feature use the old behavior; logs show old class definitions.
Fix. Restart workers after every deployment:
sudo supervisorctl restart acellemail:*
Or use Laravel's queue:restart command for graceful restart:
sudo -u www-data php artisan queue:restart
Workers will finish their current job, then restart with new code. Add this to your deploy script.
5. Failed jobs invisible without inspection habits#
Failed jobs land in the failed_jobs table by default regardless of queue driver (the failed-job store is configured separately from the queue itself). Either way, it's easy to forget to look.
Symptom. Days later, you notice 500 jobs in failed_jobs from a misconfigured template last Tuesday. Subscribers never got those emails.
Fix. Cron-scheduled check on failed_jobs count + alerting:
# In a monitoring script
COUNT=$(php artisan tinker --execute='echo \DB::table("failed_jobs")->count();')
if [ "$COUNT" -gt 0 ]; then
# alert ops
fi
For Redis-backed queues, Laravel Horizon dashboard shows failed jobs visually. Worth the setup time for any install with Redis already running.
6. Long-running batch jobs hitting max-time#
The supervisor config has --max-time=180 (3 minutes). Workers self-terminate after 180s and supervisor restarts them. This is normal and intentional — workers can leak memory; periodic restarts protect against that.
Symptom. Long campaign batches that take >3 min seem to silently restart mid-stream.
Fix. Acelle's batch-dispatch design accounts for this — the campaign's dispatchWithBatchMonitor creates a Job Batch that survives worker restarts. Each batch sub-job is short (a few subscribers). If you see longer-running jobs, that's usually a sign of bad batching logic or a custom worker that's not following Acelle's patterns.
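For context, the Laravel primitive this design leans on looks like the following (a generic Laravel sketch, not AcelleMail's exact code; SendToChunk is illustrative):
use Illuminate\Support\Facades\Bus;
// Many short sub-jobs tracked as one batch; the batch record outlives
// any individual worker process.
Bus::batch(
    $subscriberChunks->map(fn ($chunk) => new SendToChunk($chunk))->all()
)->onQueue('batch')->dispatch();
Because each sub-job finishes in seconds, a worker hitting --max-time mid-batch only delays the chunks it hadn't started; the batch carries on after the restart.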
Inspecting queue state#
Database queue#
-- Current queue depth
SELECT queue, COUNT(*) FROM jobs GROUP BY queue;
-- Stuck/long-running jobs (reserved_at is a unix timestamp)
SELECT id, queue, payload->>'$.displayName' AS job, attempts, reserved_at
FROM jobs
WHERE reserved_at IS NOT NULL
AND reserved_at < UNIX_TIMESTAMP(NOW() - INTERVAL 5 MINUTE);
-- Failed jobs
SELECT id, queue, exception, failed_at
FROM failed_jobs
ORDER BY failed_at DESC LIMIT 20;
Redis queue#
redis-cli
> LLEN queues:batch # depth of batch queue
> LRANGE queues:batch 0 9 # peek at the first 10 jobs in batch queue
> KEYS queues:* # all active queue keys (fine for a quick look; prefer SCAN on busy instances)
Note that failed jobs still land in the database failed_jobs table by default, even with the Redis queue driver, so check there (or in Horizon) rather than in Redis.
For pretty visualization, use Laravel Horizon (Composer install + small config). It's overkill for tiny installs but excellent for any operator running Redis-backed queues seriously.
SQS queue#
aws sqs get-queue-attributes \
--queue-url https://sqs.region.amazonaws.com/account/your-queue \
--attribute-names All
# ApproximateNumberOfMessages = current backlog
# ApproximateNumberOfMessagesNotVisible = currently being processed
When to think about queue at all#
You don't need to think about queue drivers until you hit one of:
- Your jobs table exceeds 100k rows during peak
- MySQL CPU hits >70% during campaign sends
- Sub-second delivery latency matters (transactional sends, etc.)
- You're planning >2M emails/day routinely
- An incident makes you realize you couldn't restore the queue from a backup easily
If none of those apply, the database default is doing its job. Spend the ops cycles elsewhere.
Related reading#