zrudko/homework

zrudko

Homework container

Downloads: 0 · Status: community image · Maintainer: zrudko · Repository type: image · Last updated: 6 years ago

Homework

Documentation

Basic links

http://homework.itchy.cz.s3-website.eu-central-1.amazonaws.com
https://s3.eu-central-1.amazonaws.com/homework.itchy.cz/archive/covid-19/05-06-2020.csv
https://hub.docker.com/r/zrudko/homework

Service updates

Each commit to https://github.com/westfood/homework triggers Docker Hub to build the image https://hub.docker.com/r/zrudko/homework. Each new build artefact overwrites the previous one. To archive previous artefacts we would need a tagging strategy, but that is overkill for this homework.

Fargate pulls the new version of the image from Docker Hub whenever the task is scheduled (every 6 hours).

Prepare AWS environment for service | WIP

This playbook prepares an ECS cluster called homework-runner for running the dockerized service as a scheduled task every 6 hours. It should be idempotent, so there should be no issue running it again and again; this step could therefore be part of a deployment pipeline. Secrets would be provided from the runner environment (be it Jenkins, GitHub Actions or whatever). An S3 bucket homework.itchy.cz with static website hosting enabled is created to provide the index page for the HTTP endpoint and to hold the archive of full datasets.

I had to define the scheduled task via a CloudWatch event in the console; I am still researching the proper way to define it programmatically.

The S3 bucket name is [***]; it is defined via the Ansible declaration for production.
The ECS cluster name is defined via defaults/main.yml for the deploy-to-aws role.
The task has an IAM role which allows it to push to S3 from the ECS cluster.

Deployment to AWS from local machine

docker run --env-file aws-credentials zrudko/homework:latest ansible-playbook deploy.yaml -i prod
Provide an env-file with your credentials, such as the one below. Ideally use the filename aws-credentials, as it is listed in .gitignore to mitigate hasty commits of plaintext credentials.

AWS_ACCESS_KEY_ID=NOT_BIG_SECRET
AWS_SECRET_ACCESS_KEY=SECRET
AWS_REGION=eu-central-1

Service for updating dataset

By running the Docker image (zrudko/homework:latest) we get the dataset from the COVID-19 repository https://github.com/CSSEGISandData/COVID-19 maintained by Johns Hopkins University. It is actually just the Ansible playbook role update-dataset. Python + boto3 as a Lambda function would be a better approach, but this was very quick and it is a DSL approach. Plus I wanted to try the way Fargate runs containers. So Fargate pulls and runs zrudko/homework:latest. That's it.

I did not add any logic to test whether the URL of the dataset file with today's date yields an HTTP 200. I just use shell date with -1 day to get yesterday's date.
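
The -1 day lookup can be illustrated in Python, the language this README names as the preferred alternative. This is only a sketch, not the code the image runs (the image uses shell date inside an Ansible role). The MM-DD-YYYY.csv naming is taken from the archive link above; BASE_URL and both function names are assumptions made for illustration.

```python
from datetime import date, timedelta

# Assumed location of the daily reports in the CSSE repository;
# this README does not state the exact path.
BASE_URL = (
    "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/"
    "csse_covid_19_data/csse_covid_19_daily_reports/"
)


def dataset_filename(today: date, days_back: int = 1) -> str:
    """Return the MM-DD-YYYY.csv name for the report `days_back` days before `today`."""
    report_day = today - timedelta(days=days_back)
    return report_day.strftime("%m-%d-%Y") + ".csv"


def dataset_url(today: date, days_back: int = 1) -> str:
    """Build the full download URL for the same report."""
    return BASE_URL + dataset_filename(today, days_back)


if __name__ == "__main__":
    print(dataset_url(date.today()))
```

Passing the date in, rather than reading the clock inside the function, keeps the lookup easy to check for month and leap-year boundaries.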

The service run without any arguments (docker run zrudko/homework:latest) should exit 0 after fulfilling its purpose.
It is meant to run as a Fargate task triggered by the CloudWatch event scheduler.
AWS credentials for accessing S3 should be provided by the IAM role once the service is running as a Fargate task.
The Dockerfile CMD is set to run ansible-playbook update-public-page.yaml -i prod, which gets the new dataset, parses it via the read_csv module, pushes it to the S3 archive and publishes Czechia-related data to S3 as HTML.
The service is stupid. It tries to get the dataset from the previous day. If that fails, it should not change the published HTML or the archive. There is no error handling.
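
The parse-and-publish step described above could look roughly like this in plain Python. It is a stand-in for the Ansible read_csv step, not a copy of it. The column name Country_Region, the function names and the sample values are assumptions; the numbers in the usage example are invented placeholders, not real figures.

```python
import csv
import html
import io


def extract_country_rows(csv_text, country="Czechia", column="Country_Region"):
    """Return the CSV rows (as dicts) whose `column` equals `country`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row.get(column) == country]


def render_table(rows, columns):
    """Render the selected columns of `rows` as a minimal HTML table."""
    header = "".join(f"<th>{html.escape(name)}</th>" for name in columns)
    body = "".join(
        "<tr>"
        + "".join(f"<td>{html.escape(row.get(name) or '')}</td>" for name in columns)
        + "</tr>"
        for row in rows
    )
    return f"<table><tr>{header}</tr>{body}</table>"


if __name__ == "__main__":
    # Placeholder data, not real case counts.
    sample = "Country_Region,Confirmed,Deaths\nCzechia,100,5\nAustria,200,7\n"
    columns = ["Country_Region", "Confirmed", "Deaths"]
    print(render_table(extract_country_rows(sample), columns))
```

Escaping every cell with html.escape means an unexpected value in the third-party dataset cannot inject markup into the published page.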

Run service from local machine

Provide the docker run argument --env-file aws-credentials if you want to run the service from your computer.

Before going to production

If the dataset is on GitHub and owned by a 3rd party, we should find a good way to monitor repository updates, maybe using the GitHub API for periodic checks and triggering updates based on them. The main issue with the COVID-19 repo is that we get new datasets with a delay because of the -1 day hack when requesting the dataset. A simple method could be used: if the dataset file for today is not found, then try yesterday. If the data were under our control, a push-based approach would be best.
Switch from Fargate with Ansible to a Lambda function. Run a Python function when the repo is updated.
Publish HTML via CloudFront to use HTTPS and enjoy caching (and deal with cache invalidations).
Build and deployment should be handled by a pipeline runner. Secrets should be provided to the automation. For my projects I would go with GitHub Actions and build pipelines there. With Bitbucket I would go with their service.
Test whether the playbook is able to clean up AWS resources.
Provide tags for billing and inline tags with common resource definitions in operation.
Create a develop branch and deploy to a testing environment first. This would require removing the hardcoded CMD for prod group_vars from the Dockerfile. I usually decide on the target environment from the branch name.
Monitor success/failure of tasks.
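
The "try today, then yesterday" idea from the first item can be sketched as follows. It is a hypothetical helper, not part of the current service: first_available and the exists callback are names introduced here, and the availability check is injected so the logic can be exercised without network access.

```python
from datetime import date, timedelta
from typing import Callable, Optional


def first_available(
    today: date,
    exists: Callable[[str], bool],
    max_days_back: int = 1,
) -> Optional[str]:
    """Return the newest MM-DD-YYYY.csv name for which `exists` is true.

    Checks today's file first, then steps back one day at a time up to
    `max_days_back`. Returns None when nothing is found, so the caller
    can leave the published page and the archive untouched.
    """
    for days_back in range(max_days_back + 1):
        name = (today - timedelta(days=days_back)).strftime("%m-%d-%Y") + ".csv"
        if exists(name):
            return name
    return None


if __name__ == "__main__":
    # Stand-in for a real availability check such as an HTTP HEAD request.
    published = {"05-06-2020.csv"}
    print(first_available(date(2020, 5, 7), published.__contains__))
```

In a real run, exists would wrap whatever check fits the data source, for example an HTTP request against the file URL that returns True on a 200 response.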

Initial thoughts before starting work

Actually, the best solution would be Python + boto3 in Lambda for getting the URL and publishing to S3. But then there would not be much room for Ansible or for dockerization. Maybe dockerization would be useful for keeping the Ansible deployment DSL and the Lambda function code in one place. But I have no experience with Fargate, so I wanted to solve it via Fargate.

Update after playing with Ansible and Fargate: it is considerably harder to deploy Fargate tasks via Ansible modules. Running CloudFormation via Ansible is probably a better approach. Lambda would definitely be much faster deployment-wise. Terraform would serve deployment better, but scheduled tasks have been available as a plugin in Terraform.

Fargate with scheduled task
- docker
  - getting dataset
    - optional: check file hash before putting to S3, if some file has been updated
  - parse for latest data
  - update table in S3
  - optional: S3 as https via Cloudfront / update cache afterwards (maybe S3 could do https out of box)
Operation
- Fargate + task definition via Terraform
  - check complexity of CloudFormation versus Terraform
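
The optional hash check from the outline above could be as small as the sketch below. It assumes the digest of the previously archived file is stored somewhere retrievable (for instance as object metadata); that storage, and the names sha256_hex and needs_upload, are assumptions and not something this homework implements.

```python
import hashlib
from typing import Optional


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


def needs_upload(new_data: bytes, stored_digest: Optional[str]) -> bool:
    """True when nothing is archived yet or the content has changed."""
    return stored_digest is None or sha256_hex(new_data) != stored_digest


if __name__ == "__main__":
    archived = sha256_hex(b"a,b\n1,2\n")
    print(needs_upload(b"a,b\n1,2\n", archived))  # unchanged content
    print(needs_upload(b"a,b\n1,3\n", archived))  # changed content
```

Comparing digests avoids rewriting the archive and re-rendering the page when the upstream file is republished unchanged.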

Homework definition

Goal

Show the ability to automate the deployment of a dockerized application.

Infrastructure and tools:

AWS EC2 // I used ECS + Fargate instead
AWS S3 // Done
Docker and your preferred docker image // Alpine for size
Ansible // Done
Python or shell // Just a bit of shell, as I did not prepare a Lambda function. If I did, I would use boto3 for S3 communication, look for some CSV-to-dict library and render HTML somehow.

Task

Download regularly (e.g. daily / hourly) some dataset from a free data provider. If you don't know any, choose from:

a. https://github.com/CSSEGISandData/COVID-19/ // every 6 hours
b. [***]

Store downloaded dataset to S3 bucket // https://s3.eu-central-1.amazonaws.com/homework.itchy.cz/archive/covid-19/05-06-2020.csv has no directory index
From every downloaded dataset, extract some specific data (eg data relevant for Czechia, Prague, ...) // [*]
Display all extracted data using a single HTML page served from S3. A simple table is enough. // [*]

Instructions

Use well-known languages (preferable Python 3 or shell) to create scripts/application // Just bit of shell
Create a docker to encapsulate the application logic // Done
Use latest Ansible to create roles and playbooks // Done, if it's OK to *** latest by alpine package maintainers
Put all your source code in a public git repository (e.g. Github) // https://github.com/westfood/homework
Use Readme.MD file for the documentation (while evaluating we will use it to run the code) // done
If you find problems, or do not implement something, you should mention it there // I did not use much shell or Python - is it OK?
You don't need to provide automation for AWS infrastructure (EC2, S3) setup but you should document it // WIP via deploy-to-aws role, S3 ready, working on handling fargate

Bonus points

Replace EC2 with AWS serverless offering // Do Fargate scheduled tasks count as serverless?
Document the next steps to make this small app being ready for production // should be there
Automate even the infrastructure setup (cloudformation, terraform) // WIP via Ansible and shell
Create a CI / CD pipelines // Just CD
Use your imagination and provide more than expected // Not much; let's say HTTPS should be served via CloudFront if the page were consumed heavily.
