1. Overview
2. Startup
lcc@lcc flink-1.9.0$ bin/start-cluster.sh
Starting cluster.
Starting standalonesession daemon on host lcc.
Starting taskexecutor daemon on host lcc.
[deploy@kylin1 third]$ bin/start-cluster.sh
lcc@lcc flink-1.9.0$ jps -ml | grep flink
53531 org.apache.flink.runtime.taskexecutor.TaskManagerRunner --configDir /Users/lcc/soft/flink/flink-1.9.0/conf
53085 org.apache.flink.runtime.entrypoint.StandaloneSessionClusterEntrypoint --configDir /Users/lcc/soft/flink/flink-1.9.0/conf --executionMode cluster
Here we can see that TaskManagerRunner and StandaloneSessionClusterEntrypoint have been started. How were they started?
3. Web UI
Address: http://localhost:8081/#/overview
Default port: 8081
4. The start-cluster.sh script
#!/usr/bin/env bash
################################################################################
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
################################################################################
bin=`dirname "$0"`
bin=`cd "$bin"; pwd`
# Source the configuration
. "$bin"/config.sh
# Start the JobManager instance(s)
shopt -s nocasematch
if [[ $HIGH_AVAILABILITY == "zookeeper" ]]; then
# HA Mode
readMasters
echo "Starting HA cluster with ${#MASTERS[@]} masters."
for ((i=0;i<${#MASTERS[@]};++i)); do
master=${MASTERS[i]}
webuiport=${WEBUIPORTS[i]}
if [ ${MASTERS_ALL_LOCALHOST} = true ] ; then
"${FLINK_BIN_DIR}"/jobmanager.sh start "${master}" "${webuiport}"
else
ssh -n $FLINK_SSH_OPTS $master -- "nohup /bin/bash -l \"${FLINK_BIN_DIR}/jobmanager.sh\" start ${master} ${webuiport} &"
fi
done
else
echo "Starting cluster."
# Start single JobManager on this machine
"$FLINK_BIN_DIR"/jobmanager.sh start
fi
shopt -u nocasematch
# Start TaskManager instance(s)
TMSlaves start
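Steps:
- Source the configuration; see config.sh for details.
- Run jobmanager.sh to start the JobManager; see jobmanager.sh for details.
- Run TMSlaves start to start the TaskManager instance(s); TMSlaves is defined in config.sh.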
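The `shopt -s nocasematch` line in the script makes the `[[ ... == ... ]]` comparison case-insensitive, so a configured value such as `ZooKeeper` still selects the HA branch. Below is a minimal sketch of that branching. The `cluster_mode` helper is hypothetical and not part of Flink; it only prints which branch would be taken.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Flink): mirrors the HA check in start-cluster.sh
# and prints the branch that would be taken instead of starting anything.
cluster_mode() {
    local high_availability=$1
    local mode
    shopt -s nocasematch            # make [[ == ]] case-insensitive
    if [[ $high_availability == "zookeeper" ]]; then
        mode="ha"                   # start one JobManager per configured master
    else
        mode="standalone"           # start a single JobManager on this machine
    fi
    shopt -u nocasematch            # restore the default behaviour
    echo "$mode"
}

cluster_mode "ZooKeeper"   # prints: ha
cluster_mode "none"        # prints: standalone
```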
5. jobmanager.sh
#!/usr/bin/env bash
# Start/stop a Flink JobManager.
USAGE="Usage: jobmanager.sh ((start|start-foreground) [host] [webui-port])|stop|stop-all"
STARTSTOP=$1
HOST=$2 # optional when starting multiple instances
WEBUIPORT=$3 # optional when starting multiple instances
# If the first argument is not one of these commands, print the usage and exit
if [[ $STARTSTOP != "start" ]] && [[ $STARTSTOP != "start-foreground" ]] && [[ $STARTSTOP != "stop" ]] && [[ $STARTSTOP != "stop-all" ]]; then
echo $USAGE
exit 1
fi
bin=`dirname "$0"`
bin=`cd "$bin"; pwd`
. "$bin"/config.sh
ENTRYPOINT=standalonesession
if [[ $STARTSTOP == "start" ]] || [[ $STARTSTOP == "start-foreground" ]]; then
if [ ! -z "${FLINK_JM_HEAP_MB}" ] && [ "${FLINK_JM_HEAP}" == 0 ]; then
echo "used deprecated key \`${KEY_JOBM_MEM_MB}\`, please replace with key \`${KEY_JOBM_MEM_SIZE}\`"
else
flink_jm_heap_bytes=$(parseBytes ${FLINK_JM_HEAP})
FLINK_JM_HEAP_MB=$(getMebiBytes ${flink_jm_heap_bytes})
fi
if [[ ! ${FLINK_JM_HEAP_MB} =~ $IS_NUMBER ]] || [[ "${FLINK_JM_HEAP_MB}" -lt "0" ]]; then
echo "[ERROR] Configured JobManager memory size is not a valid value. Please set '${KEY_JOBM_MEM_SIZE}' in ${FLINK_CONF_FILE}."
exit 1
fi
if [ "${FLINK_JM_HEAP_MB}" -gt "0" ]; then
export JVM_ARGS="$JVM_ARGS -Xms"$FLINK_JM_HEAP_MB"m -Xmx"$FLINK_JM_HEAP_MB"m"
fi
# Add JobManager-specific JVM options
export FLINK_ENV_JAVA_OPTS="${FLINK_ENV_JAVA_OPTS} ${FLINK_ENV_JAVA_OPTS_JM}"
# Startup parameters
args=("--configDir" "${FLINK_CONF_DIR}" "--executionMode" "cluster")
if [ ! -z $HOST ]; then
args+=("--host")
args+=("${HOST}")
fi
if [ ! -z $WEBUIPORT ]; then
args+=("--webui-port")
args+=("${WEBUIPORT}")
fi
fi
if [[ $STARTSTOP == "start-foreground" ]]; then
exec "${FLINK_BIN_DIR}"/flink-console.sh $ENTRYPOINT "${args[@]}"
else
# This line expands to: /flink-1.9.0/bin/flink-daemon.sh start standalonesession --configDir /flink-1.9.0/conf --executionMode cluster
"${FLINK_BIN_DIR}"/flink-daemon.sh $STARTSTOP $ENTRYPOINT "${args[@]}"
fi
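The heap handling in the script above can be tried in isolation. The sketch below is a simplified version and not Flink's code: `jm_heap_args` is a hypothetical helper, and the `IS_NUMBER` pattern is an assumption about what config.sh defines.

```shell
#!/usr/bin/env bash
# Assumption: config.sh defines IS_NUMBER roughly like this.
IS_NUMBER="^[0-9]+$"

# Hypothetical helper: prints the JVM heap flags for a heap size given in MB,
# or returns 1 if the value is not a non-negative integer.
jm_heap_args() {
    local heap_mb=$1
    if [[ ! ${heap_mb} =~ $IS_NUMBER ]]; then
        echo "[ERROR] invalid JobManager heap size: '${heap_mb}'" >&2
        return 1
    fi
    # A size of 0 means "do not set -Xms/-Xmx at all".
    if [ "${heap_mb}" -gt 0 ]; then
        echo "-Xms${heap_mb}m -Xmx${heap_mb}m"
    fi
}

jm_heap_args 1024                      # prints: -Xms1024m -Xmx1024m
jm_heap_args 1g || echo "rejected"     # error on stderr, then prints: rejected
```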
The startup command looks roughly like this:
/Users/lcc/soft/flink/flink-1.9.0/bin/flink-daemon.sh start standalonesession
--configDir /Users/lcc/soft/flink/flink-1.9.0/conf
--executionMode cluster
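To see where these arguments come from, here is a small sketch of how jobmanager.sh assembles the argument array. `build_jm_args` is a hypothetical helper, and `/opt/flink/conf`, `master1` and `8081` are placeholder values chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: builds the entrypoint arguments the way jobmanager.sh does
# and prints one argument per line.
build_jm_args() {
    local conf_dir=$1 host=$2 webui_port=$3
    local args=("--configDir" "${conf_dir}" "--executionMode" "cluster")
    # Host and web UI port are optional; they are only passed in the HA case.
    if [ -n "${host}" ]; then
        args+=("--host" "${host}")
    fi
    if [ -n "${webui_port}" ]; then
        args+=("--webui-port" "${webui_port}")
    fi
    printf '%s\n' "${args[@]}"
}

build_jm_args /opt/flink/conf                 # four arguments, as in the command above
build_jm_args /opt/flink/conf master1 8081    # adds --host and --webui-port
```

Using an array rather than a plain string keeps each argument intact even if a path contains spaces, which is why the script expands it as `"${args[@]}"`.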
Next, let's look at what flink-daemon.sh does.
6. flink-daemon.sh
The code is as follows:
#!/usr/bin/env bash
# Start/stop a Flink daemon.
USAGE="Usage: flink-daemon.sh (start|stop|stop-all) (taskexecutor|zookeeper|historyserver|standalonesession|standalonejob) [args]"
STARTSTOP=$1
DAEMON=$2
ARGS=("${@:3}") # get remaining arguments as array
bin=`dirname "$0"`
bin=`cd "$bin"; pwd`
. "$bin"/config.sh
# When called from jobmanager.sh, $DAEMON = standalonesession
case $DAEMON in
(taskexecutor)
CLASS_TO_RUN=org.apache.flink.runtime.taskexecutor.TaskManagerRunner
;;
(zookeeper)
CLASS_TO_RUN=org.apache.flink.runtime.zookeeper.FlinkZooKeeperQuorumPeer
;;
(historyserver)
CLASS_TO_RUN=org.apache.flink.runtime.webmonitor.history.HistoryServer
;;
(standalonesession)
CLASS_TO_RUN=org.apache.flink.runtime.entrypoint.StandaloneSessionClusterEntrypoint
;;
(standalonejob)
CLASS_TO_RUN=org.apache.flink.container.entrypoint.StandaloneJobClusterEntryPoint
;;
(*)
echo "Unknown daemon '${DAEMON}'. $USAGE."
exit 1
;;
esac
if [ "$FLINK_IDENT_STRING" = "" ]; then
FLINK_IDENT_STRING="$USER"
fi
FLINK_TM_CLASSPATH=`constructFlinkClassPath`
# e.g. /tmp/flink-lcc-standalonesession.pid
pid=$FLINK_PID_DIR/flink-$FLINK_IDENT_STRING-$DAEMON.pid
mkdir -p "$FLINK_PID_DIR"
# Log files for daemons are indexed from the process ID's position in the PID
# file. The following lock prevents a race condition during daemon startup
# when multiple daemons read, index, and write to the PID file concurrently.
# The lock is created on the PID directory since a lock file cannot be safely
# removed. The daemon is started with the lock closed and the lock remains
# active in this script until the script exits.
command -v flock >/dev/null 2>&1
if [[ $? -eq 0 ]]; then
exec 200<"$FLINK_PID_DIR"
flock 200
fi
# Ascending ID depending on number of lines in pid file.
# This allows us to start multiple daemon of each type.
# Equals 0 when no pid file exists yet
id=$([ -f "$pid" ] && echo $(wc -l < "$pid") || echo "0")
FLINK_LOG_PREFIX="${FLINK_LOG_DIR}/flink-${FLINK_IDENT_STRING}-${DAEMON}-${id}-${HOSTNAME}"
log="${FLINK_LOG_PREFIX}.log"
out="${FLINK_LOG_PREFIX}.out"
# Example value: -Dlog.file=/Users/lcc/soft/flink/flink-1.9.0/log/flink-lcc-standalonesession-0-lcc.log
log_setting=("-Dlog.file=${log}" "-Dlog4j.configuration=file:${FLINK_CONF_DIR}/log4j.properties" "-Dlogback.configurationFile=file:${FLINK_CONF_DIR}/logback.xml")
# 18 on this machine (JDK 1.8)
JAVA_VERSION=$(${JAVA_RUN} -version 2>&1 | sed 's/.*version "\(.*\)\.\(.*\)\..*"/\1\2/; 1q')
# Only set JVM 8 arguments if we have correctly extracted the version
if [[ ${JAVA_VERSION} =~ ${IS_NUMBER} ]]; then
if [ "$JAVA_VERSION" -lt 18 ]; then
JVM_ARGS="$JVM_ARGS -XX:MaxPermSize=256m"
fi
fi
case $STARTSTOP in
(start)
# Rotate log files
rotateLogFilesWithPrefix "$FLINK_LOG_DIR" "$FLINK_LOG_PREFIX"
# Print a warning if daemons are already running on host
if [ -f "$pid" ]; then
active=()
while IFS='' read -r p || [[ -n "$p" ]]; do
kill -0 $p >/dev/null 2>&1
if [ $? -eq 0 ]; then
active+=($p)
fi
done < "${pid}"
count="${#active[@]}"
if [ ${count} -gt 0 ]; then
echo "[INFO] $count instance(s) of $DAEMON are already running on $HOSTNAME."
fi
fi
# Evaluate user options for local variable expansion
FLINK_ENV_JAVA_OPTS=$(eval echo ${FLINK_ENV_JAVA_OPTS})
echo "Starting $DAEMON daemon on host $HOSTNAME."
# The command that is executed, for example:
# /Library/Java/JavaVirtualMachines/jdk1.8.0_211.jdk/Contents/Home/bin/java -Xms1024m -Xmx1024m
# -Dlog.file=/Users/lcc/soft/flink/flink-1.9.0/log/flink-lcc-standalonesession-0-lcc.log
# -Dlog4j.configuration=file:/Users/lcc/soft/flink/flink-1.9.0/conf/log4j.properties
# -Dlogback.configurationFile=file:/Users/lcc/soft/flink/flink-1.9.0/conf/logback.xml
# -classpath manglePathList /Users/lcc/soft/flink/flink-1.9.0/lib/flink-table-blink_2.11-1.9.0.jar:
# /Users/lcc/soft/flink/flink-1.9.0/lib/flink-table_2.11-1.9.0.jar:
# /Users/lcc/soft/flink/flink-1.9.0/lib/log4j-1.2.17.jar:
# /Users/lcc/soft/flink/flink-1.9.0/lib/slf4j-log4j12-1.7.15.jar:
# /Users/lcc/soft/flink/flink-1.9.0/lib/flink-dist_2.11-1.9.0.jar::
# /Users/lcc/soft/hadoop/hadoop-2.7.4/etc/hadoop::
# /Users/lcc/soft/hbase/hbase-1.2.0/conf/
# org.apache.flink.runtime.entrypoint.StandaloneSessionClusterEntrypoint
# --configDir /Users/lcc/soft/flink/flink-1.9.0/conf
# --executionMode cluster > /Users/lcc/soft/flink/flink-1.9.0/log/flink-lcc-standalonesession-0-lcc.out
# 200<&- 2>&1 < /dev/null &
$JAVA_RUN $JVM_ARGS ${FLINK_ENV_JAVA_OPTS} "${log_setting[@]}" -classpath "`manglePathList "$FLINK_TM_CLASSPATH:$INTERNAL_HADOOP_CLASSPATHS"`" ${CLASS_TO_RUN} "${ARGS[@]}" > "$out" 200<&- 2>&1 < /dev/null &
mypid=$!
# Add to pid file if successful start
if [[ ${mypid} =~ ${IS_NUMBER} ]] && kill -0 $mypid > /dev/null 2>&1 ; then
echo $mypid >> "$pid"
else
echo "Error starting $DAEMON daemon."
exit 1
fi
;;
(stop)
if [ -f "$pid" ]; then
# Remove last in pid file
to_stop=$(tail -n 1 "$pid")
if [ -z $to_stop ]; then
rm "$pid" # If all stopped, clean up pid file
echo "No $DAEMON daemon to stop on host $HOSTNAME."
else
sed \$d "$pid" > "$pid.tmp" # all but last line
# If all stopped, clean up pid file
[ $(wc -l < "$pid.tmp") -eq 0 ] && rm "$pid" "$pid.tmp" || mv "$pid.tmp" "$pid"
if kill -0 $to_stop > /dev/null 2>&1; then
echo "Stopping $DAEMON daemon (pid: $to_stop) on host $HOSTNAME."
kill $to_stop
else
echo "No $DAEMON daemon (pid: $to_stop) is running anymore on $HOSTNAME."
fi
fi
else
echo "No $DAEMON daemon to stop on host $HOSTNAME."
fi
;;
(stop-all)
if [ -f "$pid" ]; then
mv "$pid" "${pid}.tmp"
while read to_stop; do
if kill -0 $to_stop > /dev/null 2>&1; then
echo "Stopping $DAEMON daemon (pid: $to_stop) on host $HOSTNAME."
kill $to_stop
else
echo "Skipping $DAEMON daemon (pid: $to_stop), because it is not running anymore on $HOSTNAME."
fi
done < "${pid}.tmp"
rm "${pid}.tmp"
fi
;;
(*)
echo "Unexpected argument '$STARTSTOP'. $USAGE."
exit 1
;;
esac
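The `JAVA_VERSION` extraction in the script is easy to misread, so here is a small sketch that applies the same sed expression to a captured version banner instead of calling a real JVM. `parse_java_version` is a hypothetical helper, and the banner string is modelled on the JDK 1.8.0_211 used in this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: applies the same sed expression as flink-daemon.sh
# to a version banner passed in as a string.
parse_java_version() {
    echo "$1" | sed 's/.*version "\(.*\)\.\(.*\)\..*"/\1\2/; 1q'
}

banner='java version "1.8.0_211"'
JAVA_VERSION=$(parse_java_version "$banner")
echo "$JAVA_VERSION"    # prints: 18 (major "1" and minor "8" concatenated)

# Same decision as the script: only JVMs older than Java 8 get -XX:MaxPermSize.
if [[ ${JAVA_VERSION} =~ ^[0-9]+$ ]] && [ "$JAVA_VERSION" -lt 18 ]; then
    echo "would add -XX:MaxPermSize=256m"
else
    echo "no PermGen flag needed"
fi
```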
In the end, two commands are executed:
lcc@lcc flink-1.9.0$ bin/start-cluster.sh
Starting cluster.
/Users/lcc/soft/flink/flink-1.9.0/bin/flink-daemon.sh start standalonesession
--configDir /Users/lcc/soft/flink/flink-1.9.0/conf --executionMode cluster
Starting standalonesession daemon on host lcc.
/Library/Java/JavaVirtualMachines/jdk1.8.0_211.jdk/Contents/Home/bin/java
-Xms1024m -Xmx1024m
-Dlog.file=/Users/lcc/soft/flink/flink-1.9.0/log/flink-lcc-standalonesession-0-lcc.log
-Dlog4j.configuration=file:/Users/lcc/soft/flink/flink-1.9.0/conf/log4j.properties
-Dlogback.configurationFile=file:/Users/lcc/soft/flink/flink-1.9.0/conf/logback.xml
-classpath manglePathList /Users/lcc/soft/flink/flink-1.9.0/lib/flink-table-blink_2.11-1.9.0.jar
:/Users/lcc/soft/flink/flink-1.9.0/lib/flink-table_2.11-1.9.0.jar:
/Users/lcc/soft/flink/flink-1.9.0/lib/log4j-1.2.17.jar:
/Users/lcc/soft/flink/flink-1.9.0/lib/slf4j-log4j12-1.7.15.jar:
/Users/lcc/soft/flink/flink-1.9.0/lib/flink-dist_2.11-1.9.0.jar::
/Users/lcc/soft/hadoop/hadoop-2.7.4/etc/hadoop::
/Users/lcc/soft/hbase/hbase-1.2.0/conf/
org.apache.flink.runtime.entrypoint.StandaloneSessionClusterEntrypoint
--configDir /Users/lcc/soft/flink/flink-1.9.0/conf
--executionMode cluster > /Users/lcc/soft/flink/flink-1.9.0/log/flink-lcc-standalonesession-0-lcc.out 200<&- 2>&1 < /dev/null &
Starting taskexecutor daemon on host lcc.
/Library/Java/JavaVirtualMachines/jdk1.8.0_211.jdk/Contents/Home/bin/java
-XX:+UseG1GC -Xms922M -Xmx922M -XX:MaxDirectMemorySize=8388607T
-Dlog.file=/Users/lcc/soft/flink/flink-1.9.0/log/flink-lcc-taskexecutor-0-lcc.log
-Dlog4j.configuration=file:/Users/lcc/soft/flink/flink-1.9.0/conf/log4j.properties
-Dlogback.configurationFile=file:/Users/lcc/soft/flink/flink-1.9.0/conf/logback.xml
-classpath manglePathList
/Users/lcc/soft/flink/flink-1.9.0/lib/flink-table-blink_2.11-1.9.0.jar:
/Users/lcc/soft/flink/flink-1.9.0/lib/flink-table_2.11-1.9.0.jar:
/Users/lcc/soft/flink/flink-1.9.0/lib/log4j-1.2.17.jar:
/Users/lcc/soft/flink/flink-1.9.0/lib/slf4j-log4j12-1.7.15.jar:
/Users/lcc/soft/flink/flink-1.9.0/lib/flink-dist_2.11-1.9.0.jar::
/Users/lcc/soft/hadoop/hadoop-2.7.4/etc/hadoop::/Users/lcc/soft/hbase/hbase-1.2.0/conf/
org.apache.flink.runtime.taskexecutor.TaskManagerRunner
--configDir /Users/lcc/soft/flink/flink-1.9.0/conf > /Users/lcc/soft/flink/flink-1.9.0/log/flink-lcc-taskexecutor-0-lcc.out 200<&- 2>&1 < /dev/null &
lcc@lcc flink-1.9.0$
From the output above we can see that two processes were started in total:
- Start standalonesession
- Start taskexecutor
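Each daemon type started this way is tracked in its own PID file, and the number of lines in that file doubles as the instance id used in the log file name. The sketch below replays that bookkeeping with a temporary file and made-up PIDs (1001 and 1002) standing in for two instances of the same daemon; nothing is actually started or killed.

```shell
#!/usr/bin/env bash
# Sketch of the PID-file bookkeeping in flink-daemon.sh, using a temporary
# file and made-up PIDs instead of real daemons.
pid_file=$(mktemp)
rm -f "$pid_file"    # start from "no PID file", as on a fresh host

# Same expression as the script: the instance id is the current line count.
next_id() {
    [ -f "$pid_file" ] && echo $(wc -l < "$pid_file") || echo "0"
}

echo "first id:  $(next_id)"     # prints: first id:  0
echo 1001 >> "$pid_file"         # "start" appends the new PID
echo "second id: $(next_id)"     # prints: second id: 1
echo 1002 >> "$pid_file"

# "stop" takes the last PID and keeps everything but the last line.
to_stop=$(tail -n 1 "$pid_file")
sed '$d' "$pid_file" > "$pid_file.tmp" && mv "$pid_file.tmp" "$pid_file"
echo "would stop: $to_stop, remaining: $(cat "$pid_file")"
rm -f "$pid_file"
```

This is why stopping works in reverse start order: `stop` always removes the most recently appended PID.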
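7. Starting the TaskManager
start-cluster.sh is the cluster startup script. It starts the JobManager by running jobmanager.sh and the TaskManager by running TMSlaves start. TMSlaves is defined in config.sh.
config.sh
Here we only look at the TMSlaves() function. As the code shows, it starts the TaskManager by calling taskmanager.sh.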
# TMSlaves start|stop
TMSlaves() {
CMD=$1
readSlaves
if [ ${SLAVES_ALL_LOCALHOST} = true ] ; then
# all-local setup
for slave in ${SLAVES[@]}; do
"${FLINK_BIN_DIR}"/taskmanager.sh "${CMD}"
done
else
# non-local setup
# Stop TaskManager instance(s) using pdsh (Parallel Distributed Shell) when available
command -v pdsh >/dev/null 2>&1
if [[ $? -ne 0 ]]; then
for slave in ${SLAVES[@]}; do
ssh -n $FLINK_SSH_OPTS $slave -- "nohup /bin/bash -l \"${FLINK_BIN_DIR}/taskmanager.sh\" \"${CMD}\" &"
done
else
PDSH_SSH_ARGS="" PDSH_SSH_ARGS_APPEND=$FLINK_SSH_OPTS pdsh -w $(IFS=, ; echo "${SLAVES[*]}") \
"nohup /bin/bash -l \"${FLINK_BIN_DIR}/taskmanager.sh\" \"${CMD}\""
fi
fi
}
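The pdsh branch relies on a compact idiom to turn the `SLAVES` array into the comma-separated host list that `pdsh -w` expects. A minimal sketch of just that idiom, with hypothetical host names in place of the ones `readSlaves` would load:

```shell
#!/usr/bin/env bash
# Hypothetical host names; in Flink the array is filled by readSlaves.
SLAVES=("worker1" "worker2" "worker3")

# IFS is set inside the command substitution's subshell, so "${SLAVES[*]}"
# is joined with commas without changing IFS for the rest of the script.
host_list=$(IFS=, ; echo "${SLAVES[*]}")
echo "$host_list"    # prints: worker1,worker2,worker3
```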
8. The main method
public static void main(String[] args) throws Exception {
    // startup checks and logging
    // Log environment information
    EnvironmentInformation.logEnvironmentInfo(LOG, "TaskManager", args);
    // Register signal handlers
    SignalHandler.register(LOG);
    // Install the JVM shutdown safeguard as a shutdown hook
    JvmShutdownSafeguard.installAsShutdownHook(LOG);
    // Get the maximum number of open file handles
    long maxOpenFileHandles = EnvironmentInformation.getOpenFileHandlesLimit();
    if (maxOpenFileHandles != -1L) {
        LOG.info("Maximum number of open file descriptors is {}.", maxOpenFileHandles);
    } else {
        LOG.info("Cannot determine the maximum number of open file descriptors");
    }
    // Load the configuration
    final Configuration configuration = loadConfiguration(args);
    // Initialize the file system
    FileSystem.initialize(configuration, PluginUtils.createPluginManagerFromRootFolder(configuration));
    // Install the security modules
    SecurityUtils.install(new SecurityConfiguration(configuration));
    try {
        SecurityUtils.getInstalledContext().runSecured(new Callable<Void>() {
            @Override
            public Void call() throws Exception {
                /** Start the TaskManager */
                runTaskManager(configuration, ResourceID.generate());
                return null;
            }
        });
    } catch (Throwable t) {
        final Throwable strippedThrowable = ExceptionUtils.stripException(t, UndeclaredThrowableException.class);
        LOG.error("TaskManager initialization failed.", strippedThrowable);
        System.exit(STARTUP_FAILURE_RETURN_CODE);
    }
}
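Main steps:
- Log environment information
- Register signal handlers
- Install the shutdown hook
- Get the maximum number of open file handles
- Load the configuration
- Initialize the file system
- Install the security modules
- Start the TaskManager
9. Starting the TaskManager
Code: org.apache.flink.runtime.taskexecutor.TaskManagerRunner#runTaskManager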
public static void runTaskManager(Configuration configuration, ResourceID resourceId) throws Exception {
    // Mainly initializes a set of services and creates an org.apache.flink.runtime.taskexecutor.TaskExecutor
    final TaskManagerRunner taskManagerRunner = new TaskManagerRunner(configuration, resourceId);
    // Calls the TaskExecutor's start() method
    taskManagerRunner.start();
}
9.1 Initialization
Initialization of TaskManagerRunner (omitted here).
9.2 Start
public void start() throws Exception {
    taskManager.start();
}
This then calls org.apache.flink.runtime.rpc.RpcEndpoint#start:
public final void start() {
    rpcServer.start();
}
From here on the call goes through Akka's remote invocation layer, which I have not yet dug into.