
library/spark
Maintained by:
Apache Spark
Where to get help:
Apache Spark™ community
Dockerfile links
4.0.1-scala2.13-java21-python3-ubuntu, 4.0.1-java21-python3, 4.0.1-java21, python3, latest
4.0.1-scala2.13-java21-r-ubuntu, 4.0.1-java21-r
4.0.1-scala2.13-java21-ubuntu, 4.0.1-java21-scala
4.0.1-scala2.13-java21-python3-r-ubuntu
4.0.1-scala2.13-java17-python3-ubuntu, 4.0.1-python3, 4.0.1, python3-java17
4.0.1-scala2.13-java17-r-ubuntu, 4.0.1-r, r
4.0.1-scala2.13-java17-ubuntu, 4.0.1-scala, scala
4.0.1-scala2.13-java17-python3-r-ubuntu
3.5.7-scala2.12-java17-python3-ubuntu, 3.5.7-java17-python3, 3.5.7-java17
3.5.7-scala2.12-java17-r-ubuntu, 3.5.7-java17-r
3.5.7-scala2.12-java17-ubuntu, 3.5.7-java17-scala
3.5.7-scala2.12-java17-python3-r-ubuntu
3.5.7-scala2.12-java11-python3-ubuntu, 3.5.7-python3, 3.5.7
3.5.7-scala2.12-java11-r-ubuntu, 3.5.7-r
3.5.7-scala2.12-java11-ubuntu, 3.5.7-scala
3.5.7-scala2.12-java11-python3-r-ubuntu
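The fully qualified tags above appear to follow one naming pattern: Spark version, Scala version, Java version, optional `python3` and `r` markers, then the base distribution. The sketch below splits such a tag into its parts. The regular expression is inferred from the listed tags only and is not an official grammar; `parse_full_tag` is a hypothetical helper name.

```python
import re

# Pattern inferred from the full tag names listed above, e.g.
# "4.0.1-scala2.13-java21-python3-r-ubuntu". This is an assumption,
# not an official tag grammar.
FULL_TAG = re.compile(
    r"^(?P<spark>\d+\.\d+\.\d+)"
    r"-scala(?P<scala>\d+\.\d+)"
    r"-java(?P<java>\d+)"
    r"(?P<python>-python3)?"
    r"(?P<r>-r)?"
    r"-(?P<base>[a-z]+)$"
)


def parse_full_tag(tag):
    """Split a fully qualified tag into its components, or return None."""
    match = FULL_TAG.match(tag)
    if match is None:
        # Short aliases such as "latest" or "4.0.1-r" do not match.
        return None
    return {
        "spark": match["spark"],
        "scala": match["scala"],
        "java": int(match["java"]),
        "python": match["python"] is not None,
        "r": match["r"] is not None,
        "base": match["base"],
    }
```

For example, `parse_full_tag("3.5.7-scala2.12-java11-r-ubuntu")` reports Spark 3.5.7, Scala 2.12, Java 11, R support, and no Python, while a short alias such as `latest` returns `None`.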
Where to file issues:
[***]
Supported architectures: (more info)
amd64, arm64v8
Published image artifact details:
repo-info repo's repos/spark/ directory (history)
(image metadata, transfer size, etc)
Image updates:
official-images repo's library/spark label
official-images repo's library/spark file (history)
Source of this description:
docs repo's spark/ directory (history)
Apache Spark™ is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, pandas API on Spark for pandas workloads, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing.
You can find the latest Spark documentation, including a programming guide, on the project web page. This README file only contains basic setup instructions.
The easiest way to start using Spark is through the Scala shell:
    docker run -it spark /opt/spark/bin/spark-shell
Try the following command, which should return 1,000,000,000:
    scala> spark.range(1000 * 1000 * 1000).count()
The easiest way to start using PySpark is through the Python shell:
    docker run -it spark:python3 /opt/spark/bin/pyspark
And run the following command, which should also return 1,000,000,000:
    >>> spark.range(1000 * 1000 * 1000).count()
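To see why the expected result is 1,000,000,000, note that `spark.range(n)` produces the ids 0 through n - 1, so counting them gives n. The plain-Python analogy below needs no Spark installation; it only illustrates the arithmetic and is not how Spark computes the count.

```python
# Plain-Python analogy (no Spark needed): spark.range(n) yields the ids
# 0, 1, ..., n - 1, the same half-open interval as Python's range(n).
n = 1000 * 1000 * 1000

# len() on a range is computed arithmetically; nothing is materialized.
expected_count = len(range(n))

print(f"{expected_count:,}")
```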
The easiest way to start using R on Spark is through the R shell:
    docker run -it spark:r /opt/spark/bin/sparkR
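The three shells differ only in the image tag and the launcher path. As a rough sketch, the mapping can be captured in a small lookup that builds the `docker run` argument list for each language. The image tags and paths come from the commands in this README; the `SHELLS` table and `shell_command` helper are hypothetical names used for illustration.

```python
import shlex

# Image tag and shell path per language, taken from the commands in this
# README. The dictionary keys are illustrative labels, not Spark terms.
SHELLS = {
    "scala": ("spark", "/opt/spark/bin/spark-shell"),
    "python": ("spark:python3", "/opt/spark/bin/pyspark"),
    "r": ("spark:r", "/opt/spark/bin/sparkR"),
}


def shell_command(language):
    """Return the `docker run` argument list for the given language's shell."""
    image, shell_path = SHELLS[language]
    return ["docker", "run", "-it", image, shell_path]


if __name__ == "__main__":
    # Print each command as a copy-pasteable shell line.
    for language in SHELLS:
        print(shlex.join(shell_command(language)))
```

The returned list can be passed to `subprocess.run` from an interactive terminal to start the shell, assuming Docker is installed and the image can be pulled.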
[***]
See more in [***]
Apache Spark, Spark, Apache, the Apache feather logo, and the Apache Spark project logo are trademarks of The Apache Software Foundation.
Licensed under the Apache License, Version 2.0.
As with all Docker images, these likely also contain other software which may be under other licenses (such as Bash, etc. from the base distribution, along with any direct or indirect dependencies of the primary software being contained).
Some additional license information that could be auto-detected may be found in the repo-info repository's spark/ directory.
As for any pre-built image usage, it is the image user's responsibility to ensure that any use of this image complies with any relevant licenses for all software contained within.

