
sedgewickmm18/flinkVery much influenced by Spark Example
The Docker images are available on docker hub, for the docker build, please see the description here.
kubectl command line tool somewhere in your path.console$ kubectl create -f namespace.yaml
Now list all namespaces:
console$ kubectl get namespaces NAME STATUS AGE default Active 3h flink Active 20m kube-system Active 3h
In order not to pass the --namespace flink argument each time, we define a context and use it:
console$ kubectl config set-context flink --namespace=flink --cluster=${YOUR_CLUSTER_NAME} --user=${YOUR_USER_NAME} $ kubectl config use-context flink
You can find your cluster name and user name in kubernetes config in ~/.kube/config.
The Job Manager service is the master service for a Flink cluster.
Use the
jobmanager-controller.yaml
file to create a
replication controller
running the Flink Job Manager.
console$ kubectl create -f jobmanager-controller.yaml replicationcontroller "jobmanager-controller" created
Then, use the
jobmanager-service.yaml file to
create a logical service endpoint that Flink Task Managers can use to access the Job Manager pod.
console$ kubectl create -f jobmanager-service.yaml service "jobmanager" created
You can then create a service for the Flink Job Manager WebUI:
console$ kubectl create -f jobmanager-webui-service.yaml service "jobmanager-webui" created
console$ kubectl get pods NAME READY STATUS RESTARTS AGE jobmanager-controller-5u0q5 1/1 Running 0 8m
Check logs to see the status of the Job Manager. (Use the pod name retrieved on the previous step.)
console$ kubectl logs jobmanager-controller-5u0q5 [...] -------------------------------------------------------------------------------- 2016-11-12 21:34:32,100 INFO org.apache.flink.runtime.jobmanager.JobManager - Starting JobManager (Version: 1.1.3, Rev:3c95f71, Date:19.10.2016 @ 17:54:57 CEST) 2016-11-12 21:34:32,100 INFO org.apache.flink.runtime.jobmanager.JobManager - Current user: root 2016-11-12 21:34:32,100 INFO org.apache.flink.runtime.jobmanager.JobManager - JVM: OpenJDK 64-Bit Server VM - Oracle Corporation - 1.8/25.66-b17 2016-11-12 21:34:32,101 INFO org.apache.flink.runtime.jobmanager.JobManager - Maximum heap size: 245 MiBytes 2016-11-12 21:34:32,101 INFO org.apache.flink.runtime.jobmanager.JobManager - JAVA_HOME: /usr/lib/jvm/java-1.8-openjdk 2016-11-12 21:34:32,103 INFO org.apache.flink.runtime.jobmanager.JobManager - Hadoop version: 2.3.0 2016-11-12 21:34:32,103 INFO org.apache.flink.runtime.jobmanager.JobManager - JVM Options: 2016-11-12 21:34:32,103 INFO org.apache.flink.runtime.jobmanager.JobManager - -Xms256m 2016-11-12 21:34:32,103 INFO org.apache.flink.runtime.jobmanager.JobManager - -Xmx256m 2016-11-12 21:34:32,103 INFO org.apache.flink.runtime.jobmanager.JobManager - -Dlog.file=/opt/flink-1.1.3-custom-akka3/log/flink--jobmanager-0-jobmanager-controller-i4oc9.log 2016-11-12 21:34:32,103 INFO org.apache.flink.runtime.jobmanager.JobManager - -Dlog4j.configuration=file:/opt/flink-1.1.3-custom-akka3/conf/log4j.properties 2016-11-12 21:34:32,104 INFO org.apache.flink.runtime.jobmanager.JobManager - -Dlogback.configurationFile=file:/opt/flink-1.1.3-custom-akka3/conf/logback.xml 2016-11-12 21:34:32,104 INFO org.apache.flink.runtime.jobmanager.JobManager - Program Arguments: 2016-11-12 21:34:32,104 INFO org.apache.flink.runtime.jobmanager.JobManager - --configDir 2016-11-12 21:34:32,104 INFO org.apache.flink.runtime.jobmanager.JobManager - /opt/flink-1.1.3-custom-akka3/conf 2016-11-12 21:34:32,104 INFO org.apache.flink.runtime.jobmanager.JobManager - --executionMode 2016-11-12 21:34:32,104 INFO org.apache.flink.runtime.jobmanager.JobManager - cluster 2016-11-12 21:34:32,104 INFO org.apache.flink.runtime.jobmanager.JobManager - Classpath: /opt/flink-1.1.3-custom-akka3/lib/flink-dist_2.10-1.1.3.jar:/opt/flink-1.1.3-custom-akka3/lib/flink-python_2.10-1.1.3.jar:/opt/flink-1.1.3-custom-akka3/lib/log4j-1.2.17.jar:/opt/flink-1.1.3-custom-akka3/lib/slf4j-log4j12-1.7.7.jar::: 2016-11-12 21:34:32,104 INFO org.apache.flink.runtime.jobmanager.JobManager - -------------------------------------------------------------------------------- 2016-11-12 21:34:32,106 INFO org.apache.flink.runtime.jobmanager.JobManager - Registered UNIX signal handlers for [TERM, HUP, INT] 2016-11-12 21:34:32,222 INFO org.apache.flink.runtime.jobmanager.JobManager - Loading configuration from /opt/flink-1.1.3-custom-akka3/conf 2016-11-12 21:34:32,232 INFO org.apache.flink.runtime.jobmanager.JobManager - Starting JobManager without high-availability 2016-11-12 21:34:32,236 INFO org.apache.flink.runtime.jobmanager.JobManager - Starting JobManager on 0.0.0.0:6123 with execution mode CLUSTER 2016-11-12 21:34:32,258 INFO org.apache.flink.runtime.jobmanager.JobManager - Security is not enabled. Starting non-authenticated JobManager. 2016-11-12 21:34:32,265 INFO org.apache.flink.runtime.jobmanager.JobManager - Starting JobManager 2016-11-12 21:34:32,266 INFO org.apache.flink.runtime.jobmanager.JobManager - Starting JobManager actor system at 0.0.0.0:6123 2016-11-12 21:34:32,330 INFO org.apache.flink.runtime.akka.AkkaUtils$ - Using listening address "0.0.0.0":6123 and external address "10.0.0.240":6123 2016-11-12 21:34:32,577 INFO akka.event.slf4j.Slf4jLogger - Slf4jLogger started 2016-11-12 21:34:32,615 INFO Remoting - Starting remoting 2016-11-12 21:34:32,740 INFO Remoting - Remoting started; listening on addresses :[akka.tcp://flink@10.0.0.240:6123] 2016-11-12 21:34:32,745 INFO org.apache.flink.runtime.jobmanager.JobManager - Starting JobManager web frontend 2016-11-12 21:34:32,765 INFO org.apache.flink.runtime.webmonitor.WebMonitorUtils - Determined location of JobManager log file: /opt/flink-1.1.3-custom-akka3/log/flink--jobmanager-0-jobmanager-controller-i4oc9.log 2016-11-12 21:34:32,765 INFO org.apache.flink.runtime.webmonitor.WebMonitorUtils - Determined location of JobManager stdout file: /opt/flink-1.1.3-custom-akka3/log/flink--jobmanager-0-jobmanager-controller-i4oc9.out 2016-11-12 21:34:32,808 INFO org.apache.flink.runtime.webmonitor.WebRuntimeMonitor - Using directory /tmp/flink-web-42636f9e-2e7b-468b-9a50-f3bf2a3ad196 for the web interface files 2016-11-12 21:34:32,808 INFO org.apache.flink.runtime.webmonitor.WebRuntimeMonitor - Using directory /tmp/flink-web-upload-7af4968b-96cd-4e3e-8fec-2430bdf32e09 for web frontend JAR file uploads 2016-11-12 21:34:33,020 INFO org.apache.flink.runtime.webmonitor.WebRuntimeMonitor - Web frontend listening at 0:0:0:0:0:0:0:0:8081 2016-11-12 21:34:33,021 INFO org.apache.flink.runtime.jobmanager.JobManager - Starting JobManager actor 2016-11-12 21:34:33,026 INFO org.apache.flink.runtime.blob.BlobServer - Created BLOB server storage directory /tmp/blobStore-66dae3f5-eabd-4f11-bbae-35a0222dd192 2016-11-12 21:34:33,027 INFO org.apache.flink.runtime.blob.BlobServer - Started BLOB server at 0.0.0.0:40125 - max concurrent requests: 50 - max backlog: 1000 2016-11-12 21:34:33,031 INFO org.apache.flink.runtime.checkpoint.savepoint.SavepointStoreFactory - Using job manager savepoint state backend. 2016-11-12 21:34:33,034 INFO org.apache.flink.runtime.metrics.MetricRegistry - No metrics reporter configured, no metrics will be exposed/reported. 2016-11-12 21:34:33,039 INFO org.apache.flink.runtime.jobmanager.MemoryArchivist - Started memory archivist akka://flink/user/archive 2016-11-12 21:34:33,042 INFO org.apache.flink.runtime.jobmanager.JobManager - Starting JobManager at akka.tcp://flink@10.0.0.240:6123/user/jobmanager. 2016-11-12 21:34:33,044 INFO org.apache.flink.runtime.webmonitor.WebRuntimeMonitor - Starting with JobManager akka.tcp://flink@10.0.0.240:6123/user/jobmanager on port 8081 2016-11-12 21:34:33,044 INFO org.apache.flink.runtime.webmonitor.JobManagerRetriever - New leader reachable under akka.tcp://flink@10.0.0.240:6123/user/jobmanager:null. 2016-11-12 21:34:33,051 INFO org.apache.flink.runtime.clusterframework.standalone.StandaloneResourceManager - Trying to associate with JobManager leader akka.tcp://flink@10.0.0.240:6123/user/jobmanager 2016-11-12 21:34:33,079 INFO org.apache.flink.runtime.jobmanager.JobManager - JobManager akka.tcp://flink@10.0.0.240:6123/user/jobmanager was granted leadership with leader session ID None. 2016-11-12 21:34:33,085 INFO org.apache.flink.runtime.clusterframework.standalone.StandaloneResourceManager - Resource Manager associating with leading JobManager Actor[akka://flink/user/jobmanager#***] - leader session null
After you know the master is running, you can use the cluster proxy to connect to the Flink WebUI:
consolekubectl proxy --port=8081 &
At which point the UI will be available at http://localhost:8080/api/v1/proxy/namespaces/flink/services/jobmanager-webui/.
I had some issues with proxying so I resorted to port forwarding for the pod running the jobmanager
Just remember that running kubectl get po will tell you the correct name of your jobmanager pod.
consolekubectl port-forward jobmanager-controller-9pc74 8081:8081
Use the taskmanager-controller.yaml file to create a
replication controller that manages the Task Manager pods.
console$ kubectl create -f taskmanager-controller.yaml replicationcontroller "taskmanager-controller" created
If you launched the Flink WebUI, your Task Managers should just appear in the UI when they're ready. (It may take a little bit to pull the images and launch the pods.) You can also interrogate the status in the following way:
console$ kubectl get pods NAME READY STATUS RESTARTS AGE jobmanager-controller-5u0q5 1/1 Running 0 25m taskmanager-controller-e8otp 1/1 Running 0 6m taskmanager-controller-fiivl 1/1 Running 0 6m $ kubectl logs jobmanager-controller-5u0q5 [...] 2016-11-12 21:36:25,651 INFO org.apache.flink.runtime.instance.InstanceManager - Registered TaskManager at taskmanager-controller-ypz70 (akka.tcp://flink@172.17.0.5:36580/user/taskmanager) as c23e0d5439461e2a8e50494974763c0c. Current number of registered hosts is 1. Current number of alive task slots is 2. 2016-11-12 21:36:25,653 INFO org.apache.flink.runtime.clusterframework.standalone.StandaloneResourceManager - TaskManager ResourceID{resourceId='4aa8f076fb7fa6f726d9292e56c9da0c'} has started. 2016-11-12 21:36:25,666 INFO org.apache.flink.runtime.clusterframework.standalone.StandaloneResourceManager - TaskManager ResourceID{resourceId='12ad33ad4d6c98a02081935133336500'} has started. 2016-11-12 21:36:25,667 INFO org.apache.flink.runtime.instance.InstanceManager - Registered TaskManager at taskmanager-controller-2chea (akka.tcp://flink@172.17.0.6:42142/user/taskmanager) as 6ba1e278196cab39238e407e8fee3833. Current number of registered hosts is 2. Current number of alive task slots is 4. 2016-11-12 21:36:25,968 INFO org.apache.flink.runtime.instance.InstanceManager - Registered TaskManager at taskmanager-controller-q8aei (akka.tcp://flink@172.17.0.4:36143/user/taskmanager) as 889f76511e7f778f18c2d4a9c75e6920. Current number of registered hosts is 3. Current number of alive task slots is 6.
Assuming you still have the kubectl proxy or kubectl port-forward running from the previous section, you should now see the Task Managers in the UI as well.
You can introspect your pods running jobmanager and taskmanagers with kubectl exec; for example the following command sequence returns the logs of a taskmanager after retrieving the name of the most recent one.
kubectl exec taskmanager-controller-dc0zx -- ls /opt/flink/log kubectl exec taskmanager-controller-dc0zx -- less /opt/flink/log/flink--taskmanager-0-taskmanager-controller-dc0zx.log
Native logs help when trying to understand why a taskmanager does not connect to a jobmanager.
See also here for reference
First start the jobmanager with the following command (type jobmanager, DNS name also jobmanager)
docker run -ti -p 8081:8081 --name jobmanager flink jobmanager jobmanager
then start a single taskmanager with
docker run -ti --link jobmanager flink taskmanager jobmanager num
so that the taskmanager looks for an entity with DNS name jobmanager to register and uses its numeric IP address for its Akka actor.
See above for kubectl proxy examples. It works well on real Kubernetes cluster but minikube might need a different approach.
One option is to use Web UI, upload a JAR and submit a job from there.
This issue below has been solved with Flink 1.1.4 and 1.2.0, I'm including it as reference.
FLINK-2821 which is strictly speaking not a Flink bug, seems to be preventing Task Managers from talking to Job Managers because of the recipient IP address mismatch. Therefore we're using a Docker image with a custom Akka 3 build of Flink 1.1.3 instead of the vanilla 1.1.3.


manifest unknown 错误
TLS 证书验证失败
DNS 解析超时
410 错误:版本过低
402 错误:流量耗尽
身份认证失败错误
429 限流错误
凭证保存错误
来自真实用户的反馈,见证轩辕镜像的优质服务