
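fikolis/airflow

Apache Airflow is an open-source workflow orchestration platform for defining, scheduling, and monitoring complex workflows as code. The Airflow Docker image is the officially provided containerized distribution. It simplifies deployment, keeps environments consistent, and integrates quickly into containerized infrastructure.

Main uses: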
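The official Airflow image is hosted on Docker Hub; the default tag is apache/airflow:latest. Pin a specific version (for example apache/airflow:2.8.0) so that deployments stay stable.

```bash
# Pull the latest image
docker pull apache/airflow:latest
# Pull a specific version
docker pull apache/airflow:2.8.0
```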
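Since the advice above is to avoid the floating latest tag, a deployment script could refuse unpinned image references before pulling. The following is only a minimal sketch: the function name `is_pinned_image` is hypothetical, and it deliberately ignores digests and registries with a port number.

```shell
# Hypothetical guard: succeed only if the image reference carries an explicit,
# non-"latest" tag. Limitations: digests (image@sha256:...) and registry hosts
# with a port (localhost:5000/airflow) are not handled by this simple pattern.
is_pinned_image() {
  case "$1" in
    *:latest) return 1 ;;   # explicit "latest" tag
    *:*)      return 0 ;;   # some other explicit tag
    *)        return 1 ;;   # no tag at all, which Docker resolves to "latest"
  esac
}

for image in apache/airflow:2.8.0 apache/airflow:latest apache/airflow; do
  if is_pinned_image "$image"; then
    echo "$image: pinned"
  else
    echo "$image: not pinned"
  fi
done
```

A script could call the guard first and only run docker pull when it succeeds.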
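The following example targets single-machine testing. It uses the default SequentialExecutor and a SQLite metadata database (production environments need PostgreSQL or MySQL instead). Note that the SQLite file at /opt/airflow/airflow.db lives in each container's own filesystem unless that path is placed on a shared volume, so this setup is only suitable for quick experiments.

The first run must initialize the metadata database, which stores workflow metadata, task state, and so on:

```bash
# Mount the local DAG directory and let the entrypoint migrate the database.
# The tag is pinned to 2.8.0, following the recommendation to avoid latest;
# "version" is just a harmless command to run once the migration has finished.
docker run --rm \
  -e AIRFLOW__CORE__EXECUTOR=SequentialExecutor \
  -e AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=sqlite:////opt/airflow/airflow.db \
  -e _AIRFLOW_DB_UPGRADE=true \
  -v "$(pwd)/dags:/opt/airflow/dags" \
  apache/airflow:2.8.0 version
```

After initialization, start the Web UI (port 8080) and the scheduler (which triggers tasks):

```bash
# Web UI: publish port 8080, show configuration details in the UI,
# and mount the DAG and log directories (the latter persists task logs).
docker run -d \
  --name airflow-webserver \
  -p 8080:8080 \
  -e AIRFLOW__CORE__EXECUTOR=SequentialExecutor \
  -e AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=sqlite:////opt/airflow/airflow.db \
  -e AIRFLOW__WEBSERVER__EXPOSE_CONFIG=true \
  -v "$(pwd)/dags:/opt/airflow/dags" \
  -v "$(pwd)/logs:/opt/airflow/logs" \
  apache/airflow:2.8.0 webserver

# Scheduler
docker run -d \
  --name airflow-scheduler \
  -e AIRFLOW__CORE__EXECUTOR=SequentialExecutor \
  -e AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=sqlite:////opt/airflow/airflow.db \
  -v "$(pwd)/dags:/opt/airflow/dags" \
  -v "$(pwd)/logs:/opt/airflow/logs" \
  apache/airflow:2.8.0 scheduler
```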
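The webserver takes a while to come up after the container starts, so a script that continues right after docker run may hit a closed port. One way to handle this, sketched here with a hypothetical `retry` helper, is to poll the /health endpoint that the compose healthcheck in this document also uses. The attempt count and delay are arbitrary assumptions.

```shell
# Hypothetical helper: run a command up to <attempts> times, sleeping <delay>
# seconds between tries. Returns 0 on the first success, 1 if every try fails.
retry() {
  _attempts=$1
  _delay=$2
  shift 2
  _i=1
  while [ "$_i" -le "$_attempts" ]; do
    if "$@"; then
      return 0
    fi
    _i=$((_i + 1))
    if [ "$_i" -le "$_attempts" ]; then
      sleep "$_delay"
    fi
  done
  return 1
}

# Self-contained demonstration with a command that always succeeds.
if retry 3 0 true; then
  echo "command succeeded"
fi

# Against a running webserver the call might look like this (assumed values):
# retry 30 2 curl --fail --silent --output /dev/null http://localhost:8080/health
```

Keeping the retried command as plain arguments means the same helper works for curl, docker inspect, or any other readiness probe.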
For production, docker-compose is recommended for managing the multiple components (web service, scheduler, metadata database, message queue). The following is a distributed deployment example based on CeleryExecutor, which requires PostgreSQL and Redis:

docker-compose.yml

```yaml
version: '3.8'

x-airflow-common: &airflow-common
  image: apache/airflow:2.8.0
  environment: &airflow-common-env
    AIRFLOW__CORE__EXECUTOR: CeleryExecutor
    AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: postgresql+psycopg2://airflow:airflow@postgres/airflow
    AIRFLOW__CELERY__BROKER_URL: redis://:@redis:6379/0
    AIRFLOW__CORE__LOAD_EXAMPLES: 'false'   # disable the example DAGs
    AIRFLOW__WEBSERVER__EXPOSE_CONFIG: 'true'
    _AIRFLOW_DB_UPGRADE: 'true'             # upgrade the database automatically at startup
    _AIRFLOW_WWW_USER_CREATE: 'true'        # create the default admin user
    _AIRFLOW_WWW_USER_USERNAME: admin       # admin username
    _AIRFLOW_WWW_USER_PASSWORD: admin       # admin password (change it in production)
  volumes:
    - ./dags:/opt/airflow/dags
    - ./logs:/opt/airflow/logs
    - ./plugins:/opt/airflow/plugins        # custom plugin directory
  depends_on:
    - postgres
    - redis

services:
  postgres:   # metadata database (stores workflow state)
    image: postgres:15
    environment:
      POSTGRES_USER: airflow
      POSTGRES_PASSWORD: airflow
      POSTGRES_DB: airflow
    volumes:
      - postgres-db-volume:/var/lib/postgresql/data

  redis:   # Celery message queue (distributes tasks)
    image: redis:latest
    ports:
      - "6379:6379"
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 30s
      timeout: 30s
      retries: 3

  airflow-webserver:   # Web UI service
    <<: *airflow-common
    command: webserver
    ports:
      - "8080:8080"
    healthcheck:
      test: ["CMD", "curl", "--fail", "http://localhost:8080/health"]
      interval: 30s
      timeout: 30s
      retries: 3
    restart: always

  airflow-scheduler:   # scheduler
    <<: *airflow-common
    command: scheduler
    restart: always

  airflow-worker:   # task execution node (can be scaled out to multiple instances)
    <<: *airflow-common
    command: celery worker
    restart: always

  airflow-init:   # initialization service (first run only)
    <<: *airflow-common
    command: version
    environment:
      <<: *airflow-common-env
      _AIRFLOW_DB_UPGRADE: 'true'
      _AIRFLOW_WWW_USER_CREATE: 'true'
      _AIRFLOW_WWW_USER_USERNAME: admin
      _AIRFLOW_WWW_USER_PASSWORD: admin

volumes:
  postgres-db-volume:
```
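```bash
# Create the mounted directories and open up their permissions
# (avoids permission errors inside the containers)
mkdir -p ./dags ./logs ./plugins
chmod -R 777 ./dags ./logs ./plugins  # tighten as needed in production
# Start all services
docker-compose up -d
# Check service status
docker-compose ps
```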
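The CeleryExecutor deployment depends on both a metadata database connection and a broker URL, and a missing value typically only shows up once containers start failing. A pre-flight check along the following lines could catch that earlier. This is a sketch: `require_vars` is a hypothetical helper, and the values are the ones from the compose example rather than anything to use in production.

```shell
# Hypothetical helper: return non-zero if any of the named variables is
# unset or empty, reporting each missing name on stderr.
require_vars() {
  _missing=0
  for _name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $_name.
    eval "_value=\${$_name:-}"
    if [ -z "$_value" ]; then
      echo "missing required variable: $_name" >&2
      _missing=1
    fi
  done
  return "$_missing"
}

# Example values copied from the compose file in this document.
AIRFLOW__DATABASE__SQL_ALCHEMY_CONN='postgresql+psycopg2://airflow:airflow@postgres/airflow'
AIRFLOW__CELERY__BROKER_URL='redis://:@redis:6379/0'

if require_vars AIRFLOW__DATABASE__SQL_ALCHEMY_CONN AIRFLOW__CELERY__BROKER_URL; then
  echo "CeleryExecutor settings present"
fi
```

The helper only checks presence; it does not validate that the database or broker is reachable.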
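Airflow reads its core settings from environment variables named AIRFLOW__<SECTION>__<KEY>, where SECTION is a section of the configuration file and KEY is an option within it. Commonly used variables:

| Environment variable | Description | Default / example |
|---|---|---|
| AIRFLOW__CORE__EXECUTOR | Executor type | SequentialExecutor/CeleryExecutor |
| AIRFLOW__DATABASE__SQL_ALCHEMY_CONN | Metadata database connection string | postgresql+psycopg2://user:pass@host/db |
| AIRFLOW__CELERY__BROKER_URL | Celery message queue address (required for CeleryExecutor) | redis://:@redis:6379/0 |
| AIRFLOW__WEBSERVER__EXPOSE_CONFIG | Whether the Web UI shows configuration details | false/true |
| AIRFLOW__CORE__LOAD_EXAMPLES | Whether to load the example DAGs | true/false |
| _AIRFLOW_DB_UPGRADE | Whether to run airflow db upgrade when the container starts | true (initializes the database) |
| _AIRFLOW_WWW_USER_CREATE | Whether to create a Web UI admin user | true |
| _AIRFLOW_WWW_USER_USERNAME | Admin username | admin |
| _AIRFLOW_WWW_USER_PASSWORD | Admin password | admin (must be changed in production) |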
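The AIRFLOW__<SECTION>__<KEY> naming rule is mechanical, so it can be expressed as a small function. The sketch below uses a hypothetical helper name, `airflow_env_name`, and simply upper-cases the section and key and joins them with double underscores.

```shell
# Hypothetical helper: map a config section and key to the corresponding
# environment variable name, AIRFLOW__<SECTION>__<KEY>.
airflow_env_name() {
  _section=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
  _key=$(printf '%s' "$2" | tr '[:lower:]' '[:upper:]')
  printf 'AIRFLOW__%s__%s\n' "$_section" "$_key"
}

# Options that appear in the table above.
airflow_env_name core executor
airflow_env_name database sql_alchemy_conn
airflow_env_name celery broker_url
```

The variables starting with a single underscore, such as _AIRFLOW_DB_UPGRADE, do not follow this rule; they are switches read by the image at startup rather than configuration options.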
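Data that Airflow needs to persist:

- DAG files: /opt/airflow/dags (for example -v ./dags:/opt/airflow/dags)
- Task logs: /opt/airflow/logs (for example -v ./logs:/opt/airflow/logs)

The Web UI listens on port 8080 inside the container and is reached through the host port mapping (for example http://localhost:8080). Log in with the credentials set through the _AIRFLOW_WWW_USER_USERNAME and _AIRFLOW_WWW_USER_PASSWORD environment variables (admin/admin in the examples here).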
Web UI features include:

