Backup and Maintenance
Day-to-day operations: backups, log management, service upgrades, and troubleshooting.
Database backup
Manual backup
bash
# Dump all SlaunchX databases
docker exec slaunchx-mysql-test mysqldump -uroot -p<root-password> \
--databases slaunchx slaunchx_tron_wallet slaunchx_solana_wallet \
--single-transaction --quick --routines --triggers \
> /home/backups/mysql-$(date +%Y%m%d-%H%M%S).sql
# Verify backup integrity
tail -1 /home/backups/mysql-*.sql
# Expected: "-- Dump completed on ..."
Automated daily backup (Cron)
bash
cat > /opt/slaunchx/scripts/backup-mysql.sh << 'EOF'
#!/bin/bash
BACKUP_DIR=/home/backups/mysql
RETENTION_DAYS=14
TIMESTAMP=$(date +%Y%m%d-%H%M%S)
mkdir -p $BACKUP_DIR
docker exec slaunchx-mysql-test mysqldump -uroot -p<root-password> \
--databases slaunchx slaunchx_tron_wallet slaunchx_solana_wallet \
--single-transaction --quick --routines --triggers \
> "$BACKUP_DIR/backup-$TIMESTAMP.sql"
# Verify
if [ $? -eq 0 ] && [ -s "$BACKUP_DIR/backup-$TIMESTAMP.sql" ]; then
sha256sum "$BACKUP_DIR/backup-$TIMESTAMP.sql" > "$BACKUP_DIR/backup-$TIMESTAMP.sha256"
echo "Backup successful: backup-$TIMESTAMP.sql"
else
echo "ERROR: Backup failed!" >&2
exit 1
fi
# Cleanup old backups
find $BACKUP_DIR -name "backup-*.sql" -mtime +$RETENTION_DAYS -delete
find $BACKUP_DIR -name "backup-*.sha256" -mtime +$RETENTION_DAYS -delete
EOF
chmod +x /opt/slaunchx/scripts/backup-mysql.sh
# Schedule: daily at 03:40 UTC
crontab -e
# Add: 40 3 * * * /opt/slaunchx/scripts/backup-mysql.sh >> /var/log/slaunchx-backup.log 2>&1
Restore from backup
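The backup script writes a .sha256 file next to each dump, so it may be worth checking a dump against that file before restoring it. The following is a minimal sketch; `verify_backup` is a hypothetical helper, not part of the original scripts, and it assumes GNU `sha256sum` is available on the host.

```shell
#!/usr/bin/env bash
# verify_backup is a hypothetical helper (not part of the original scripts).
# It checks a dump against the .sha256 file that backup-mysql.sh writes next to it.
verify_backup() {
  local dump="$1"
  local sum_file="${dump%.sql}.sha256"
  if [ ! -s "$dump" ] || [ ! -f "$sum_file" ]; then
    echo "missing dump or checksum: $dump" >&2
    return 1
  fi
  # sha256sum -c re-hashes the path recorded inside the checksum file
  sha256sum -c --quiet "$sum_file"
}

# Usage (the path is a placeholder):
# verify_backup /home/backups/mysql/backup-YYYYMMDD-HHMMSS.sql && echo "checksum OK"
```

Because the checksum file records the dump's full path, the check only works while the dump stays at the location where it was created.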
bash
docker exec -i slaunchx-mysql-test mysql -uroot -p<root-password> < /home/backups/mysql/backup-YYYYMMDD-HHMMSS.sql
Redis backup
With AOF persistence enabled (--appendonly yes), Redis reloads its data automatically on restart. To take an explicit backup:
bash
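# Trigger an RDB snapshot (BGSAVE returns immediately; the save runs in the background)
docker exec slaunchx-redis-test redis-cli -a <password> BGSAVE
# Copy the dump file once the save has finished (LASTSAVE reports the time of the last completed save)
docker cp slaunchx-redis-test:/data/dump.rdb /home/backups/redis-$(date +%Y%m%d).rdb
MinIO backup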
bash
# Mirror all buckets
docker exec slaunchx-minio-test mc mirror local/slaunchx /backup/minio/
Log management
Docker log rotation
Every container is started with --log-opt max-size=50m --log-opt max-file=3, which caps each container at 150 MB of logs (3 x 50 MB files, rotated automatically).
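The 150 MB cap is simply max-size multiplied by max-file. The sketch below spells out that arithmetic and shows one way to confirm the options on a running container; `max_log_mb` is a hypothetical helper introduced here for illustration.

```shell
#!/usr/bin/env bash
# max_log_mb is a hypothetical helper: worst-case log usage per container, in MB,
# given the max-size (in MB) and max-file values passed to --log-opt.
max_log_mb() {
  local max_size_mb="$1" max_file="$2"
  echo $(( max_size_mb * max_file ))
}

max_log_mb 50 3   # prints 150

# To confirm the options on a running container (container name taken from this guide):
# docker inspect --format '{{json .HostConfig.LogConfig}}' slaunchx-app-prometheus-test
```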
Viewing logs
bash
# Recent logs
docker logs --tail 200 slaunchx-app-prometheus-test
# Logs since a timestamp
docker logs --since "2026-03-23T10:00:00" slaunchx-app-prometheus-test
# Follow real-time
docker logs -f slaunchx-app-prometheus-test
# Filter errors
docker logs slaunchx-app-prometheus-test 2>&1 | grep -iE "error|exception|fatal" | tail -30
Log location
Docker stores container logs at:
{docker-data-root}/containers/{container-id}/{container-id}-json.log
Run docker info | grep "Docker Root Dir" to find the data root directory.
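As a rough illustration of the path pattern, the sketch below assembles the log path from a data root and a full container ID. `container_log_path` is a hypothetical helper; the commented usage assumes a Docker host and read access to the data root (usually root).

```shell
#!/usr/bin/env bash
# container_log_path is a hypothetical helper that assembles the json-file log path
# from the Docker data root and the full container ID.
container_log_path() {
  local data_root="$1" container_id="$2"
  # Strip a trailing slash from the data root so the path has no double slash
  printf '%s/containers/%s/%s-json.log\n' "${data_root%/}" "$container_id" "$container_id"
}

# Usage on a Docker host:
# root=$(docker info --format '{{.DockerRootDir}}')
# id=$(docker inspect --format '{{.Id}}' slaunchx-app-prometheus-test)
# ls -lh "$(container_log_path "$root" "$id")"
#
# Docker also reports the path directly:
# docker inspect --format '{{.LogPath}}' slaunchx-app-prometheus-test
```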
Service upgrades
Standard upgrade procedure
bash
MODULE=app-prometheus
ENV=test
PORT=18020
IMAGE=localhost:5000/slaunchx/$MODULE:dev-latest
# 1. Pull new image
docker pull $IMAGE
# 2. Stop and remove old container
docker stop slaunchx-$MODULE-$ENV
docker rm slaunchx-$MODULE-$ENV
# 3. Start new container (same command as initial deploy)
docker run -d \
--name slaunchx-$MODULE-$ENV \
--network slaunchx-intra \
-p 0.0.0.0:$PORT:$PORT \
--env-file /opt/slaunchx/config/$MODULE/$ENV.env \
-e SPRING_PROFILES_ACTIVE=$ENV \
--log-opt max-size=50m --log-opt max-file=3 \
--restart unless-stopped \
$IMAGE
# 4. Verify health
sleep 15 # wait for Spring Boot startup
curl -s http://localhost:$PORT/prometheus/actuator/health
Using the deploy script
bash
# Upgrade a single module (handles stop/rm/pull/run)
ci/local/deploy.sh test app-prometheus
# Upgrade all modules
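ci/local/deploy.sh test
Database migrations during upgrades
Flyway runs automatically at startup. If the new image contains new migration files:
- Flyway detects the new files and applies them in order
- If a migration fails, the container exits; check the logs
- Do not patch the database by hand to work around a failed migration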
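When a container exits after an upgrade, one way to tell whether a migration is the cause is to filter the Flyway history for rows marked unsuccessful. The sketch below assumes the history rows are piped in as tab-separated text (what `mysql -N -B` emits); `failed_migrations` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# failed_migrations is a hypothetical helper. It reads tab-separated rows of
# "version<TAB>description<TAB>success" on stdin and prints the rows whose
# success column is 0. It exits with status 1 if any such row is found.
failed_migrations() {
  awk -F'\t' '$3 == 0 { print "FAILED: " $1 " " $2; bad = 1 } END { exit bad }'
}

# Usage (container, user, and schema names as used elsewhere in this guide):
# docker exec -i slaunchx-mysql-test mysql -uslaunchx -p<password> slaunchx -N -B \
#   -e "SELECT version, description, success FROM flyway_schema_history ORDER BY installed_rank" \
#   | failed_migrations
```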
bash
# Check Flyway history
docker exec -i slaunchx-mysql-test mysql -uslaunchx -p<password> slaunchx \
-e "SELECT version, description, success FROM flyway_schema_history ORDER BY installed_rank DESC LIMIT 10;"
Registry cleanup
Old Docker images accumulate in the private registry. Clean them up periodically:
bash
# Run garbage collection
docker exec slaunchx-registry bin/registry garbage-collect \
/etc/docker/registry/config.yml --delete-untagged
# Check registry disk usage
du -sh /home/registry/data/
Troubleshooting
Container fails to start
bash
# Check exit code and logs
docker inspect slaunchx-app-prometheus-test --format '{{.State.ExitCode}}'
docker logs --tail 50 slaunchx-app-prometheus-test
| Exit code | Meaning |
|---|---|
| 0 | Normal shutdown |
| 1 | Application error (check the logs) |
| 137 | SIGKILL, typically an OOM kill (increase -Xmx or host memory) |
| 143 | SIGTERM (normal docker stop) |
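For scripted checks, the table can be turned into a small lookup. This is only a sketch: `explain_exit_code` is a hypothetical helper, and the fallback for other codes above 128 relies on the usual shell convention that such a code means "terminated by signal (code minus 128)".

```shell
#!/usr/bin/env bash
# explain_exit_code is a hypothetical helper mirroring the exit-code table.
explain_exit_code() {
  case "$1" in
    0)   echo "normal shutdown" ;;
    1)   echo "application error (check the logs)" ;;
    137) echo "SIGKILL, typically an OOM kill" ;;
    143) echo "SIGTERM (normal docker stop)" ;;
    *)
      # Codes above 128 conventionally mean "terminated by signal (code - 128)"
      if [ "$1" -gt 128 ] 2>/dev/null; then
        echo "terminated by signal $(( $1 - 128 ))"
      else
        echo "unlisted exit code $1"
      fi
      ;;
  esac
}

# Usage:
# explain_exit_code "$(docker inspect slaunchx-app-prometheus-test --format '{{.State.ExitCode}}')"
#
# Docker also records whether the kernel OOM killer was involved:
# docker inspect slaunchx-app-prometheus-test --format '{{.State.OOMKilled}}'
```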
MySQL connection issues
bash
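# Open a shell inside the app container
docker exec -it slaunchx-app-prometheus-test bash
# Then test the TCP port with bash's /dev/tcp (curl speaks HTTP and may report failure even when MySQL is reachable):
# (exec 3<>/dev/tcp/slaunchx-mysql-test/3306) && echo "MySQL reachable" || echo "Cannot reach MySQL"
# Test credentials
docker exec -it slaunchx-mysql-test mysql -uslaunchx -p<password> -e "SELECT 1"
Redis connection issues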
bash
docker exec -it slaunchx-redis-test redis-cli -a <password> INFO server | head -5
RabbitMQ queue backlog
bash
# Check queue depths
curl -s -u slaunchx:<password> http://localhost:18672/api/queues | \
python3 -c "import sys,json; [print(f'{q[\"name\"]}: {q[\"messages\"]}') for q in json.load(sys.stdin)]"
Disk space
bash
# Docker disk usage
docker system df
# Reclaim unused resources (careful in production)
docker system prune --volumes # removes stopped containers, unused networks, dangling images, build cache, and unused volumes
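Before pruning, it can help to know how full the relevant filesystem actually is. The sketch below wraps `df -P` in two hypothetical helpers, `disk_usage_pct` and `check_disk`; the 80% default threshold is an arbitrary assumption, not a value from this guide.

```shell
#!/usr/bin/env bash
# disk_usage_pct is a hypothetical helper: prints the used percentage (without the % sign)
# of the filesystem holding the given path, parsed from POSIX-format df output.
disk_usage_pct() {
  df -P "$1" | awk 'NR == 2 { gsub(/%/, "", $5); print $5 }'
}

# check_disk is a hypothetical helper: warns when usage is at or above a threshold.
# The default of 80 is an arbitrary choice for illustration.
check_disk() {
  local path="$1" limit="${2:-80}" used
  used=$(disk_usage_pct "$path")
  if [ -z "$used" ]; then
    echo "cannot read disk usage for $path" >&2
    return 2
  fi
  if [ "$used" -ge "$limit" ]; then
    echo "WARNING: $path is at ${used}% (limit ${limit}%)"
    return 1
  fi
  echo "OK: $path is at ${used}%"
}

# Usage (backup path from this guide):
# check_disk /home/backups 80
```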