
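nathanhowell/parquet-tools

This is a small container image that bundles the AdoptOpenJDK 8 JRE and the parquet-tools library. It was originally created to work around some Hadoop libraries failing to load on Java 9. Java 11 no longer has that limitation, but the image remains a convenient way to run parquet-tools commands against Parquet files.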
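Commands covered here:

- cat: print the contents of a Parquet file
- schema: print the schema of a Parquet file
- meta: print the metadata of a Parquet file

Run the image with docker run, bind-mount a local Parquet file to a path inside the container, and pass that path to the parquet-tools command you want. The general form is:

```console
$ docker run --rm -it -v [local file path]:[path in container] nathanhowell/parquet-tools [parquet-tools command] [path in container]
```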
Options:

- --rm: remove the container automatically when it exits
- -it: attach an interactive terminal
- -v: bind-mount a local file into the container

cat prints the contents of a Parquet file:

```console
$ docker run --rm -it -v $(pwd)/users.parquet:/tmp/file.parquet nathanhowell/parquet-tools cat /tmp/file.parquet
name = Alyssa
favorite_numbers:
.array = 3
.array = 9
.array = 15
.array = 20

name = Ben
favorite_color = red
favorite_numbers:
```
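The invocations in this README differ only in the subcommand and the host file, so they lend themselves to a small wrapper. The following is a minimal sketch and not part of the image: the function name `pq` and the `PQ_DOCKER` variable are hypothetical, introduced only for illustration. It resolves the file to an absolute path (Docker treats a non-absolute `-v` source as a named volume rather than a bind mount), mounts it read-only, and leaves out `-it`, because allocating a TTY fails when the command runs from a script or a pipe.

```shell
# Hypothetical wrapper around the docker run form used in this README.
# Usage: pq <cat|schema|meta> <path-to-parquet-file>
pq() {
  pq_sub=$1
  pq_file=$2

  # Docker needs an absolute host path for a bind mount.
  pq_dir=$(cd "$(dirname "$pq_file")" && pwd) || return 1
  pq_abs="$pq_dir/$(basename "$pq_file")"

  # PQ_DOCKER is an assumption of this sketch: set it to "echo docker"
  # to print the command instead of running it.
  # -it is omitted so the wrapper also works in scripts and pipes.
  ${PQ_DOCKER:-docker} run --rm \
    -v "$pq_abs:/tmp/file.parquet:ro" \
    nathanhowell/parquet-tools "$pq_sub" /tmp/file.parquet
}
```

With this in place, `pq schema users.parquet` would run the schema command on users.parquet in the current directory, and `PQ_DOCKER="echo docker" pq schema users.parquet` prints the command without starting a container.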
schema prints the schema of a Parquet file:

```console
$ docker run --rm -it -v $(pwd)/users.parquet:/tmp/file.parquet nathanhowell/parquet-tools schema /tmp/file.parquet
message example.avro.User {
  required binary name (STRING);
  optional binary favorite_color (STRING);
  required group favorite_numbers (LIST) {
    repeated int32 array;
  }
}
```
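Because the schema is printed as plain text, it can be post-processed with standard tools. As a minimal sketch, the hypothetical `leaf_fields` helper below lists leaf column names by taking the third word of every line that ends in a semicolon. It assumes the layout shown above (repetition, type, name, optional annotation) and runs on a copy of that sample output rather than on a live container.

```shell
# Sample copied from the schema output above; in practice this text
# would come from the schema command's standard output.
schema_sample='message example.avro.User {
  required binary name (STRING);
  optional binary favorite_color (STRING);
  required group favorite_numbers (LIST) {
    repeated int32 array;
  }
}'

# Hypothetical helper: print the name of every leaf field.
# Leaf fields are the lines ending in ";"; group lines end in "{".
leaf_fields() {
  awk '/;[[:space:]]*$/ {
         name = $3            # third word is the field name
         sub(/;$/, "", name)  # drop the ";" when no annotation follows
         print name
       }'
}

printf '%s\n' "$schema_sample" | leaf_fields
```

For the sample schema this prints name, favorite_color and array, one per line. Nested names are not qualified with their parent group, so a real script may need to track the enclosing group as well.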
meta prints the metadata of a Parquet file:

```console
$ docker run --rm -it -v $(pwd)/users.parquet:/tmp/file.parquet nathanhowell/parquet-tools meta /tmp/file.parquet
file:             file:/tmp/file.parquet
creator:          parquet-mr version 1.4.3
extra:            avro.schema = {"type":"record","name":"User","namespace":"example.avro","fields":[{"name":"name","type":"string"},{"name":"favorite_color","type":["string","null"]},{"name":"favorite_numbers","type":{"type":"array","items":"int"}}]}

file schema:      example.avro.User
--------------------------------------------------------------------------------
name:             REQUIRED BINARY L:STRING R:0 D:0
favorite_color:   OPTIONAL BINARY L:STRING R:0 D:1
favorite_numbers: REQUIRED F:1
.array:           REPEATED INT32 R:1 D:1

row group 1:      RC:2 TS:109 OFFSET:4
--------------------------------------------------------------------------------
name:             BINARY SNAPPY DO:0 FPO:4 SZ:36/34/0.94 VC:2 ENC:PLAIN,BIT_PACKED ST:[no stats for this column]
favorite_color:   BINARY SNAPPY DO:0 FPO:40 SZ:32/30/0.94 VC:2 ENC:RLE,PLAIN,BIT_PACKED ST:[no stats for this column]
favorite_numbers:
.array:           INT32 SNAPPY DO:0 FPO:72 SZ:45/45/1.00 VC:5 ENC:RLE,PLAIN ST:[no stats for this column]
```
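The row group lines in this output are easy to aggregate. The sketch below sums the `RC` field over all row groups. It takes `RC` to be the per-row-group row count, which is an assumption based on the usual parquet-tools legend and not something this README states. The helper name `total_rows` is hypothetical, and the input is a copy of the sample line above rather than live output.

```shell
# Sample line copied from the meta output above.
meta_sample='row group 1:      RC:2 TS:109 OFFSET:4'

# Hypothetical helper: add up the RC values of all "row group" lines.
# Assumes RC is the row count of each row group.
total_rows() {
  awk '/^row group/ {
         for (i = 1; i <= NF; i++)
           if ($i ~ /^RC:/) {
             sub(/^RC:/, "", $i)  # keep only the number
             total += $i
           }
       }
       END { print total + 0 }'
}

printf '%s\n' "$meta_sample" | total_rows
```

For the sample file, which has a single row group with `RC:2`, this prints 2.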
The make native command compiles parquet-tools with the GraalVM AOT compiler, which cuts startup time by more than a second on a development machine.

