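# DevOps Prompts: CI/CD, Docker, and K8s

Prompts for using AI to help write Dockerfiles, GitHub Actions workflows, and K8s configuration

## AI Assistance in DevOps Work

DevOps work involves writing a large number of configuration files: Dockerfiles, CI/CD pipelines, K8s manifests, Terraform configuration, and so on. These formats are complex, full of details, and easy to get wrong.

AI is particularly useful in this setting:

- It knows the syntax and best practices of the common configuration formats
- It can quickly generate configuration that follows conventions
- It can help you track down configuration errors
- It can explain complicated configuration options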
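## Prompt 1: Dockerfile Generation

Please generate a production-grade Dockerfile for the following application.

Application details:

- Language/framework: [language and framework]
- Build command: [e.g. npm run build]
- Start command: [e.g. node dist/main.js]
- Port: [port number]
- Environment variables: [list]

Requirements:

1. Use a multi-stage build
2. Minimize the image size
3. Run as a non-root user
4. Make good use of layer caching
5. Include a health check
6. Add a comment explaining each step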
### Example Output: Node.js Application

```dockerfile
# ============================================
# Stage 1: install dependencies
# ============================================
FROM node:20-alpine AS deps
WORKDIR /app

# Copy only the dependency manifests to take advantage of Docker layer caching
COPY package.json package-lock.json ./

# Install production dependencies only, then clear the npm cache to shrink the layer.
# (--omit=dev replaces the deprecated --only=production flag.)
RUN npm ci --omit=dev && \
    npm cache clean --force

# ============================================
# Stage 2: build
# ============================================
FROM node:20-alpine AS builder
WORKDIR /app

COPY package.json package-lock.json ./
RUN npm ci

# Copy the source code and build
COPY . .
RUN npm run build

# ============================================
# Stage 3: production image
# ============================================
FROM node:20-alpine AS runner
WORKDIR /app

# Security: create a non-root user and put it in its own group
RUN addgroup --system --gid 1001 appgroup && \
    adduser --system --uid 1001 --ingroup appgroup appuser

# Copy production dependencies from the deps stage
COPY --from=deps /app/node_modules ./node_modules

# Copy build output from the builder stage
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/package.json ./

# Switch to the non-root user
USER appuser

# Expose the port
EXPOSE 3000

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1

# Start the application
CMD ["node", "dist/main.js"]
```
### Python Application Dockerfile

```dockerfile
# ============================================
# Stage 1: build
# ============================================
FROM python:3.12-slim AS builder
WORKDIR /app

# Install build dependencies
RUN apt-get update && \
    apt-get install -y --no-install-recommends gcc && \
    rm -rf /var/lib/apt/lists/*

# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

# ============================================
# Stage 2: production image
# ============================================
FROM python:3.12-slim AS runner
WORKDIR /app

# Copy the installed dependencies from the builder stage
COPY --from=builder /install /usr/local

# Copy the application code
COPY . .

# Security: run as a non-root user
RUN useradd --create-home appuser
USER appuser

EXPOSE 8000

HEALTHCHECK --interval=30s --timeout=3s \
    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')" || exit 1

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```
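Several of the Prompt 1 requirements (multi-stage build, non-root user, health check) can be checked mechanically before an AI-generated Dockerfile is accepted. The following is a minimal Python sketch under simplifying assumptions: the function name `check_dockerfile` and its rules are illustrative, it inspects instruction keywords line by line, and it is not a substitute for a real linter.

```python
def check_dockerfile(text: str) -> list[str]:
    """Return human-readable findings for the text of a Dockerfile."""
    findings: list[str] = []
    # Keep non-empty lines that are not comments.
    instructions = [
        ln.strip() for ln in text.splitlines()
        if ln.strip() and not ln.strip().startswith("#")
    ]
    from_idx = [i for i, ln in enumerate(instructions)
                if ln.upper().startswith("FROM ")]
    if len(from_idx) < 2:
        findings.append("no multi-stage build (fewer than two FROM instructions)")

    stage_names: set[str] = set()
    for i in from_idx:
        # Skip flags such as --platform=... to find the image reference.
        tokens = [t for t in instructions[i].split()[1:] if not t.startswith("--")]
        if not tokens:
            continue
        image = tokens[0]
        last_part = image.rsplit("/", 1)[-1]
        pinned = "@" in image or (":" in last_part and not last_part.endswith(":latest"))
        # A FROM that names an earlier stage is not a base image.
        if image not in stage_names and not pinned:
            findings.append(f"unpinned base image: {image}")
        if len(tokens) >= 3 and tokens[1].upper() == "AS":
            stage_names.add(tokens[2])

    # Only the final stage determines how the container runs.
    final_stage = instructions[from_idx[-1]:] if from_idx else instructions
    users = [ln.split()[1] for ln in final_stage if ln.upper().startswith("USER ")]
    if not users or users[-1].split(":")[0] in ("root", "0"):
        findings.append("final stage runs as root (no non-root USER instruction)")
    if not any(ln.upper().startswith("HEALTHCHECK ") for ln in final_stage):
        findings.append("final stage has no HEALTHCHECK")
    return findings
```

An empty list means none of these particular checks fired; it does not mean the Dockerfile is correct, so building and running the image in a test environment is still needed.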
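## Prompt 2: Docker Compose Generation

Please generate a docker-compose.yml for the following microservice architecture.

Services:
[service names, ports, dependencies]

Infrastructure:
[databases, caches, message queues, etc.]

Requirements:

1. Dependencies between services and their startup order
2. Network isolation
3. Data persistence (volumes)
4. Environment variable management
5. Health checks
6. Handling of differences between development and production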
### Example Output

```yaml
# The top-level `version` key is obsolete in the Compose Specification, so it is omitted.
services:
  # ============ Infrastructure ============
  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: myapp
      POSTGRES_USER: ${DB_USER:-postgres}
      POSTGRES_PASSWORD: ${DB_PASSWORD:-postgres}
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql
    ports:
      - "5432:5432"
    healthcheck:
      # Probe with the same user and database the container was initialized with
      test: ["CMD-SHELL", "pg_isready -U ${DB_USER:-postgres} -d myapp"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - backend

  redis:
    image: redis:7-alpine
    command: redis-server --appendonly yes
    volumes:
      - redis_data:/data
    ports:
      - "6379:6379"
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - backend

  # ============ Application services ============
  api:
    build:
      context: ./api
      dockerfile: Dockerfile
    environment:
      DATABASE_URL: postgresql://${DB_USER:-postgres}:${DB_PASSWORD:-postgres}@postgres:5432/myapp
      REDIS_URL: redis://redis:6379
      NODE_ENV: production
    ports:
      - "3000:3000"
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "wget", "--spider", "http://localhost:3000/health"]
      interval: 30s
      timeout: 5s
      retries: 3
    networks:
      - backend
      - frontend
    restart: unless-stopped

  web:
    build:
      context: ./web
      dockerfile: Dockerfile
    ports:
      - "80:80"
    depends_on:
      api:
        condition: service_healthy
    networks:
      - frontend
    restart: unless-stopped

volumes:
  postgres_data:
  redis_data:

networks:
  frontend:
  backend:
```
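The startup order implied by `depends_on` in the example is a topological ordering of the service graph. The following Python sketch illustrates that idea with the standard-library `graphlib` module; the `depends_on` dictionary is transcribed by hand from the example, and `startup_order` is an illustrative helper, not something Docker Compose provides.

```python
from graphlib import CycleError, TopologicalSorter

# Service -> services it depends on, mirroring depends_on in the example.
depends_on = {
    "postgres": [],
    "redis": [],
    "api": ["postgres", "redis"],
    "web": ["api"],
}


def startup_order(deps: dict[str, list[str]]) -> list[str]:
    """Return one valid start order; raises CycleError on circular dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = startup_order(depends_on)
print(order)
```

An ordering alone only says which container is started first. It is the `condition: service_healthy` entries in the example that make `api` wait until the `postgres` and `redis` health checks pass.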
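## Prompt 3: GitHub Actions Workflow

Please generate a GitHub Actions CI/CD workflow for the following project.

Project details:

- Language/framework: [description]
- Test framework: [description]
- Deployment target: [description]
- Branching strategy: [description]

Workflow requirements:

1. On pull requests: lint + test + build
2. On merge to main: deploy to staging automatically
3. On tag: deploy to production automatically
4. Include cache optimization
5. Include parallel execution
6. Include Slack/DingTalk notifications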
### Example Output

```yaml
name: CI/CD Pipeline

on:
  pull_request:
    branches: [main, develop]
  push:
    branches: [main]
    tags: ['v*']

env:
  NODE_VERSION: '20'
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  # ============ Code quality checks ============
  lint:
    name: Lint & Type Check
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: 'npm'
      - run: npm ci
      - run: npm run lint
      - run: npm run type-check

  # ============ Tests ============
  test:
    name: Test
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:16-alpine
        env:
          POSTGRES_DB: test
          POSTGRES_PASSWORD: test
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: 'npm'
      - run: npm ci
      - run: npm test -- --coverage
        env:
          DATABASE_URL: postgresql://postgres:test@localhost:5432/test
      - uses: actions/upload-artifact@v4
        with:
          name: coverage
          path: coverage/

  # ============ Build the Docker image ============
  build:
    name: Build Docker Image
    needs: [lint, test]
    runs-on: ubuntu-latest
    if: github.event_name == 'push'
    permissions:
      contents: read
      packages: write
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/metadata-action@v5
        id: meta
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=ref,event=branch
            type=semver,pattern={{version}}
            type=sha
      - uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

  # ============ Deploy to staging ============
  deploy-staging:
    name: Deploy to Staging
    needs: [build]
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    environment: staging
    steps:
      - uses: actions/checkout@v4
      - name: Deploy to Staging
        run: |
          echo "Deploying to staging..."
          # kubectl set image deployment/app app=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:sha-${GITHUB_SHA::7}

  # ============ Deploy to production ============
  deploy-production:
    name: Deploy to Production
    needs: [build]
    runs-on: ubuntu-latest
    if: startsWith(github.ref, 'refs/tags/v')
    environment: production
    steps:
      - uses: actions/checkout@v4
      - name: Deploy to Production
        run: |
          echo "Deploying to production..."
          # The semver rule above publishes the image tag without the leading "v"
          # (git tag v1.2.3 -> image tag 1.2.3), so strip it from the ref name here.
          # kubectl set image deployment/app app=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${GITHUB_REF_NAME#v}
```
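The deploy steps have to reference an image tag that the build job actually pushed, so it helps to spell out what the three `tags` rules produce. The Python sketch below is only an approximation of how `docker/metadata-action` handles these three rules, written from its documented behavior; `image_tags` is a hypothetical helper, and it ignores everything else the action does, such as sanitizing unusual characters or adding extra tags by default.

```python
import re


def image_tags(ref: str, sha: str) -> list[str]:
    """Approximate the tags produced by the three rules in the example workflow."""
    tags: list[str] = []
    if ref.startswith("refs/heads/"):
        # type=ref,event=branch -> the branch name ("/" is not valid in a tag)
        tags.append(ref.removeprefix("refs/heads/").replace("/", "-"))
    elif ref.startswith("refs/tags/"):
        # type=semver,pattern={{version}} -> the version without a leading "v"
        match = re.fullmatch(
            r"v?(\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?)",
            ref.removeprefix("refs/tags/"),
        )
        if match:
            tags.append(match.group(1))
    # type=sha -> "sha-" followed by the short commit SHA
    tags.append(f"sha-{sha[:7]}")
    return tags


print(image_tags("refs/heads/main", "0123456789abcdef"))
print(image_tags("refs/tags/v1.2.3", "abcdef0123456789"))
```

Under these assumptions, a push to `main` yields a `main` tag plus a `sha-` tag, while a `v1.2.3` git tag yields `1.2.3` rather than `v1.2.3`. This is worth confirming against the registry before wiring up a production deploy.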
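## Prompt 4: Kubernetes Manifest

Please generate a Kubernetes deployment configuration for the following service.

Service details:

- Service name: [name]
- Image: [image address]
- Port: [port]
- Resource requirements: [CPU/memory]
- Replicas: [count]
- Environment variables: [list]

Requirements:

1. Deployment + Service + Ingress
2. Resource constraints (requests/limits)
3. Health checks (liveness/readiness)
4. Rolling update strategy
5. ConfigMap and Secret management
6. HPA (horizontal autoscaling)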
### Example Output

```yaml
# configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: api-config
  namespace: production
data:
  NODE_ENV: "production"
  LOG_LEVEL: "info"
  REDIS_HOST: "redis-master"
---
# secret.yaml (in real use, manage this through sealed-secrets or an external secret manager)
apiVersion: v1
kind: Secret
metadata:
  name: api-secrets
  namespace: production
type: Opaque
stringData:
  DATABASE_URL: "postgresql://user:pass@postgres:5432/myapp"
  JWT_SECRET: "your-jwt-secret-here"
---
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
  namespace: production
  labels:
    app: api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: ghcr.io/myorg/api:v1.0.0
          ports:
            - containerPort: 3000
          envFrom:
            - configMapRef:
                name: api-config
            - secretRef:
                name: api-secrets
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 512Mi
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 30
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
---
# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: api
  namespace: production
spec:
  selector:
    app: api
  ports:
    - port: 80
      targetPort: 3000
  type: ClusterIP
---
# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```
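To judge whether the `averageUtilization` targets in the HPA are sensible, it helps to see how a utilization reading turns into a replica count. The Python sketch below follows the scaling formula from the Kubernetes documentation, `ceil(currentReplicas * currentMetric / targetMetric)`. It is a simplification: `desired_replicas` is an illustrative function, the tolerance of 0.1 is assumed to be the cluster default, and stabilization windows and pod readiness are ignored.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_utilization: float,
    target_utilization: float,
    min_replicas: int,
    max_replicas: int,
    tolerance: float = 0.1,
) -> int:
    """Replica count one metric would ask for, clamped to the HPA bounds."""
    ratio = current_utilization / target_utilization
    # Readings close enough to the target cause no scaling at all.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


# CPU target of 70% with bounds 3..10, as in the example manifest.
print(desired_replicas(3, 140, 70, 3, 10))  # load doubled
print(desired_replicas(3, 35, 70, 3, 10))   # load halved
```

When several metrics are configured, as with CPU and memory in the example, the HPA evaluates each one separately and uses the largest resulting replica count.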
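## Prompt 5: Terraform/IaC

Please write the following infrastructure configuration in Terraform.

Infrastructure requirements:
[describe the cloud resources needed]

Cloud platform: [AWS/GCP/Azure]

Requirements:

1. Use a modular structure
2. Use variables and outputs
3. Configure state management
4. Follow security best practices (least privilege, encryption)
5. Add explanatory comments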
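## Prompt 6: Troubleshooting

Please help me troubleshoot the following DevOps problem.

Problem description:
[describe the symptoms]

Environment:

- Platform: [K8s/Docker/VM]
- Relevant logs:
[log content]

Troubleshooting steps already tried:
[what has been done so far]

Please:

1. Analyze the possible causes (ordered by likelihood)
2. Give the commands to investigate
3. Give a fix

### Template for Common Problems

Problem: a Pod is stuck in CrashLoopBackOff

Troubleshooting steps:

1. kubectl describe pod [pod-name] -n [namespace]
2. kubectl logs [pod-name] -n [namespace] --previous
3. Check whether the resource limits are too low
4. Check whether the health check configuration is reasonable
5. Check whether the environment variables and Secrets are correct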
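Steps 3 to 5 above can be partly narrowed down from the Pod status itself. The Python sketch below takes the parsed output of `kubectl get pod [pod-name] -n [namespace] -o json` and suggests which step to look at first. The function name `triage_crashloop` and the mapping from exit codes to causes are heuristic assumptions, not rules Kubernetes defines.

```python
def triage_crashloop(pod: dict) -> list[str]:
    """Suggest next checks for containers stuck in CrashLoopBackOff.

    `pod` is the parsed JSON of a single Pod object.
    """
    hints: list[str] = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        waiting = status.get("state", {}).get("waiting") or {}
        if waiting.get("reason") != "CrashLoopBackOff":
            continue
        name = status.get("name", "?")
        restarts = status.get("restartCount", 0)
        # lastState describes how the previous run of the container ended.
        last = status.get("lastState", {}).get("terminated") or {}
        reason, code = last.get("reason"), last.get("exitCode")
        if reason == "OOMKilled":
            hint = "killed for exceeding its memory limit; review resources.limits.memory"
        elif code in (137, 143):
            hint = "terminated by a signal; check whether the liveness probe is too strict"
        else:
            hint = ("exited on its own; read `kubectl logs --previous` "
                    "and verify environment variables and Secrets")
        hints.append(f"{name} (restarts={restarts}, exit={code}): {hint}")
    return hints
```

In practice the input would come from something like `json.loads(...)` applied to the `kubectl` output. The hints are starting points for the manual steps, not a diagnosis.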
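## Prompt 7: Monitoring and Alerting

Please design a monitoring plan for the following service.

Service architecture:
[describe the service architecture]

Requirements:

1. Prometheus metric design
2. Grafana dashboard JSON
3. Alerting rules (PrometheusRule)
4. Alert severity levels and notification channels
5. SLO/SLI definitions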
### SLO Design Example

```yaml
# SLO definitions
slos:
  - name: API availability
    target: 99.9%  # allows about 43 minutes of downtime per 30-day month
    indicator:
      type: availability
      metric: |
        sum(rate(http_requests_total{status!~"5.."}[5m]))
        /
        sum(rate(http_requests_total[5m]))

  - name: API latency
    target: 95%  # 95% of requests complete within 200ms
    indicator:
      type: latency
      metric: |
        histogram_quantile(0.95,
          sum(rate(http_request_duration_seconds_bucket[5m])) by (le)
        ) < 0.2

# Alerting rules
alerts:
  - name: HighErrorRate
    severity: critical
    condition: error_rate > 1% for 5m
    action: Page the on-call engineer immediately

  - name: HighLatency
    severity: warning
    condition: p95_latency > 500ms for 10m
    action: Notify the team Slack channel
```
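The "43 minutes per month" figure next to the 99.9% target is the error budget: the share of the window in which the service is allowed to fail. The short Python sketch below shows the arithmetic. The function names and the 30-day window are assumptions made for illustration.

```python
def error_budget_minutes(target_percent: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability for an availability target."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (1 - target_percent / 100)


def budget_consumed(total_requests: int, failed_requests: int,
                    target_percent: float) -> float:
    """Fraction of the request-based error budget already used (1.0 = exhausted)."""
    allowed_failures = total_requests * (1 - target_percent / 100)
    return failed_requests / allowed_failures


# 99.9% over 30 days: 43,200 minutes * 0.001, roughly 43.2 minutes.
print(round(error_budget_minutes(99.9), 1))
# 500 failures out of 1,000,000 requests uses about half of a 99.9% budget.
print(round(budget_consumed(1_000_000, 500, 99.9), 2))
```

Tracking the consumed fraction over time is one way to decide when an alert such as `HighErrorRate` should page someone rather than only post to a channel.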
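## DevOps Configuration Review Prompt

Please review the following DevOps configuration file and point out problems and suggested improvements.

Configuration file:

```yaml
[configuration content]
```

Please check:

- Security issues (hard-coded secrets, excessive permissions)
- Best practices (image tags, resource limits)
- Reliability (health checks, restart policies)
- Maintainability (comments, naming conventions)
- Cost optimization (over-provisioned resources)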
---
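## Summary

| Scenario | Prompt | Key points |
|------|--------|---------|
| Dockerfile | Prompt 1 | Multi-stage build, non-root user, cache optimization |
| Docker Compose | Prompt 2 | Dependency order, health checks, network isolation |
| GitHub Actions | Prompt 3 | Caching, parallelism, environment separation |
| K8s | Prompt 4 | Resource limits, HPA, rolling updates |
| Terraform | Prompt 5 | Modularity, state management, least privilege |
| Troubleshooting | Prompt 6 | Log analysis, systematic investigation |
| Monitoring | Prompt 7 | SLO/SLI, alert severity levels |

DevOps configuration files have strict formats and many details, and a single small mistake can cause a serious problem. AI can help you quickly generate configuration that follows best practices, but always validate it in a test environment before it goes to production.

> The essence of infrastructure as code is not "managing infrastructure with code" but "treating infrastructure the way you treat code": version control, code review, and automated testing, with none of them left out.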