1. Why not use Pod directly
Pod, the core Kubernetes object, groups one or more containers so that they share resources such as network and storage and are always scheduled together, which lets them cooperate closely.
Because a Pod represents a real application better than a single container does, Kubernetes does not orchestrate workloads at the container level; it uses the Pod as the smallest unit of scheduling and operations in the cluster.
Although object-oriented design ideas are mostly used in software development, they turn out to fit Kubernetes surprisingly well. Because Kubernetes uses YAML to describe resources, the business is reduced to individual objects that have internal attributes and external relationships and that must cooperate with one another. We do not need to program any of this; Kubernetes handles it automatically (the Go implementation of Kubernetes itself also applies object-oriented ideas).
There are many basic principles in object-oriented design. Two of them, I think, describe Kubernetes objects well: "single responsibility" and "composition over inheritance".
- "Single responsibility" means that an object should focus on doing one thing well, should not try to do everything, and should stay small enough in granularity to be easy to reuse and manage.
- "Composition over inheritance" means that objects should be connected at runtime wherever possible to keep them loosely coupled, instead of hard-coding the relationships between them.
With these two principles in mind, the Kubernetes resource objects become much clearer. Pod is already a fairly complete object responsible for managing containers, so we should not blindly pile more features onto it, like drawing feet on a snake; we should preserve its independence. Functions that go beyond the container should be defined in other objects, which "compose" the Pod as one of their members.
2. Job/CronJob
This section introduces two new Kubernetes objects, Job and CronJob, which compose Pods to handle offline business.
- There are many "online business" applications, such as Nginx, Node.js, MySQL, and Redis. Once started, they basically never stop; they are always online.
- "Offline business", by contrast, always exits and never runs indefinitely, so its scheduling strategy is very different from that of "online business". It has to account for management concerns such as run timeouts, status checks, retries on failure, and collecting the computed results.
However, these business features are not inherently related to container management. If Pod implemented them, it would take on unnecessary obligations and violate "single responsibility". We should therefore split these functions off into another object and let that object control how Pods run and do the additional work.
"Offline business" can also be divided into two types. One is a "temporary task", which is finished after running, and will be rescheduled next time if there is a need; the other is a "scheduled task", which can run on time and periodically without too much intervention.
In Kubernetes these correspond to:
- A "temporary task" is the API object Job;
- A "scheduled task" is the API object CronJob.
Using these two objects, you can schedule and manage any offline business in Kubernetes.
3. Use YAML to describe the job
The "header" part of a Job's YAML file still consists of the usual required fields. Briefly:
- apiVersion is not v1 but batch/v1.
- kind is Job, which matches the name of the object.
- metadata must still contain name, and you can add arbitrary labels with labels.
For both Job and CronJob the apiVersion field is batch/v1, which indicates that they belong not to the core group but to the batch group.
You can also use kubectl explain job to view its field descriptions. To generate a YAML template file, however, you cannot use kubectl run, because kubectl run can only create Pods. To create API objects other than Pod, use kubectl create followed by the type name of the object.
For example, to create a Job named echo-job from the busybox image, the command looks like this:
export out="--dry-run=client -o yaml" # define a shell variable
kubectl create job echo-job --image=busybox $out
This produces a basic YAML file. After saving it and making a few changes, we have a Job object:
apiVersion: batch/v1
kind: Job
metadata:
  name: echo-job
spec:
  template:
    spec:
      restartPolicy: OnFailure
      containers:
      - image: busybox
        name: echo-job
        imagePullPolicy: IfNotPresent
        command: ["/bin/echo"]
        args: ["hello", "world"]
You will notice that the description of a Job is very similar to that of a Pod, but there are some differences. The main one is in the spec field: it contains an extra template field, and inside that another spec, which looks a bit odd.
If you followed the object-oriented design ideas above, the reason for this becomes clear. The Job object applies composition: the template field defines an "application template" with a Pod embedded in it, so that the Job can create Pods from this template.
And because the Pod is managed by the Job and does not deal with the apiserver directly, there is no need to repeat "header fields" such as apiVersion. It only needs the key spec field, which describes the container-related information. You could call it a "headless" Pod object.
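For comparison, here is a sketch of what the same workload might look like as a standalone Pod. The name echo-pod is hypothetical and chosen only for illustration. The point is that a standalone Pod is submitted to the apiserver on its own and therefore carries its own apiVersion and kind, which the Pod template inside a Job does not need.

```yaml
# Hypothetical standalone Pod, shown only for comparison with the Job template.
apiVersion: v1            # Pod belongs to the core group
kind: Pod
metadata:
  name: echo-pod          # illustrative name, not from the article
spec:                     # a Job embeds an equivalent block under spec.template.spec
  restartPolicy: OnFailure
  containers:
  - image: busybox
    name: echo-pod
    imagePullPolicy: IfNotPresent
    command: ["/bin/echo"]
    args: ["hello", "world"]
```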
In fact, echo-job adds very little extra functionality; it is just a thin wrapper around a Pod:
Image source: https://time.geekbang.org/column/article/531566
Overall, the Pod's work is very simple: containers specifies the name and image, and command runs /bin/echo to print "hello world".
However, because of the special nature of Job workloads, we also need to add a restartPolicy field in spec to decide what happens when the Pod fails. OnFailure restarts the container in place on failure, while Never does not restart the container and instead lets the Job create a new Pod.
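As a rough sketch of the other choice, the fragment below sets restartPolicy to Never in the echo-job template. With that setting a failed container is not restarted in place; the Job creates a replacement Pod instead, which typically leaves the failed Pod around for inspection. Only this one field differs from the echo-job manifest shown earlier.

```yaml
# Sketch: same echo-job, but a failure leads to a new Pod rather than an in-place restart.
apiVersion: batch/v1
kind: Job
metadata:
  name: echo-job
spec:
  template:
    spec:
      restartPolicy: Never      # a Job template accepts only Never or OnFailure
      containers:
      - image: busybox
        name: echo-job
        imagePullPolicy: IfNotPresent
        command: ["/bin/echo"]
        args: ["hello", "world"]
```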
4. Operate Job in Kubernetes
Now let's create the Job object and run this simple offline job, again with kubectl apply:
kubectl apply -f job.yml
After creation, Kubernetes extracts the Pod from the template definition in the YAML and runs it under the control of the Job. You can use kubectl get job and kubectl get pod to view the status of the Job and the Pod respectively:
$ kubectl apply -f job.yml
job.batch/echo-job created
wohu@dev:~/k8s$ kubectl get pod
NAME             READY   STATUS      RESTARTS   AGE
echo-job-lt2sq   0/1     Completed   0          6s
ngx              1/1     Running     0          5h37m
wohu@dev:~/k8s$ kubectl get job
NAME       COMPLETIONS   DURATION   AGE
echo-job   1/1           2s         11s
As you can see, because the Pod is managed by a Job, it does not restart repeatedly and report errors. Instead it is shown as Completed to indicate that the task finished, and the Job lists how many runs completed successfully. There is only one here, so it shows 1/1.
You can also see that the Pod is automatically given a name made of the Job name (echo-job) plus a random string (lt2sq). You can use kubectl logs to get the Pod's output:
$ kubectl logs echo-job-lt2sq
hello world
This framework of describing objects in YAML gives Kubernetes a lot of flexibility. You can add fields at the Job level and at the Pod level to customize the business, an advantage that plain container technology cannot match. A few such fields are listed below; for more detail, refer to the Job documentation.
- activeDeadlineSeconds sets the timeout for running the Pods.
- backoffLimit sets the number of retries after Pod failures.
- completions is how many Pods must finish for the Job to be complete; the default is 1.
- parallelism, which is related to completions, is the number of Pods allowed to run concurrently, to avoid using too many resources.
Note that these 4 fields are not under the template field but directly under spec, so they belong to the Job level and control the Pod objects created from the template.
Next, I will create another Job object named "sleep-job", which sleeps for a random period of time and then exits, simulating a long-running job (for example MapReduce). The Job is set to a 15-second timeout and at most 2 retries. A total of 4 Pods must complete, with at most 2 Pods running at the same time:
apiVersion: batch/v1
kind: Job
metadata:
  name: sleep-job
spec:
  activeDeadlineSeconds: 15
  backoffLimit: 2
  completions: 4
  parallelism: 2
  template:
    spec:
      restartPolicy: OnFailure
      containers:
      - image: busybox
        name: echo-job
        imagePullPolicy: IfNotPresent
        command:
        - sh
        - -c
        - sleep $(($RANDOM % 10 + 1)) && echo done
A note on Job configuration:
In my experience, job.spec.template.spec.containers.image could not carry an image version number, only the bare image name; the full image:version form worked only when defined in a Pod. Otherwise Kubernetes tried to pull the image from the Internet. With Internet access that is of course fine, but in an offline environment the pull fails with an error, even though that version of the image exists locally and imagePullPolicy is set to Never or IfNotPresent.
For example, in my offline environment, setting the image in the Job to - image: busybox:1.35.0 produced an error saying the image could not be pulled.
After creating the Job with kubectl apply, we can use kubectl get pod -w to watch the Pod status in real time and see Pods being queued, created, and run one after another:
kubectl apply -f sleep-job.yml
kubectl get pod -w
When all the Pods have finished running, we use kubectl get again to look at the status of the Job and the Pods. We will see that the Job's completed count is 4, as expected, and that all 4 Pods are in the Completed state.
A Job is not deleted immediately after it finishes, which makes it easier to retrieve the results. But too many finished Jobs consume system resources, so you can use the ttlSecondsAfterFinished field to set a retention time limit.
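As an illustration, ttlSecondsAfterFinished sits at the Job level, next to fields such as completions. The sketch below reuses echo-job and picks 60 seconds as an arbitrary example value, not a recommendation.

```yaml
# Sketch: echo-job with a retention limit; 60 is an arbitrary example value.
apiVersion: batch/v1
kind: Job
metadata:
  name: echo-job
spec:
  ttlSecondsAfterFinished: 60   # Job becomes eligible for cleanup 60s after it finishes
  template:
    spec:
      restartPolicy: OnFailure
      containers:
      - image: busybox
        name: echo-job
        imagePullPolicy: IfNotPresent
        command: ["/bin/echo"]
        args: ["hello", "world"]
```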
5. Use YAML to describe CronJob
Two things to note before creating a CronJob:
- Because the name CronJob is a bit long, Kubernetes provides the shorthand cj, which you can also see with kubectl api-resources;
- A CronJob has to run on a schedule, so we also need to pass the --schedule parameter on the command line.
Use kubectl create directly to generate a template for the CronJob:
export out="--dry-run=client -o yaml" # define a shell variable
kubectl create cj echo-cj --image=busybox --schedule="" $out
Then we edit the YAML template to produce the CronJob object:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: echo-cj
spec:
  schedule: '*/1 * * * *'
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - image: busybox
            name: echo-cj
            imagePullPolicy: IfNotPresent
            command: ["/bin/echo"]
            args: ["hello", "world"]
Again we focus on its spec field, and you will find three spec levels nested one inside another:
- The first spec is the CronJob's own specification.
- The second spec belongs to jobTemplate and defines a Job object.
- The third spec belongs to template and defines the Pod that runs in the Job.
So a CronJob is in fact a new object produced by composing a Job.
Besides the jobTemplate field that defines the Job object, CronJob has a new field, schedule, which defines the rule for running the task periodically. It uses standard Cron syntax to specify minute, hour, day, month, and day of the week, the same as crontab on Linux.
Apart from the different names, CronJob is used almost the same way as Job: use kubectl apply to create the CronJob, and use kubectl get cj and kubectl get pod to view the status:
kubectl apply -f cronjob.yml
kubectl get cj
kubectl get pod
To save resources, a CronJob does not keep finished Jobs indefinitely. By default it keeps only the 3 most recent successful results, but this can be changed with the successfulJobsHistoryLimit field.
Cron time-setting syntax: https://crontab.guru/
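A sketch of how these CronJob-level settings might be written together. The schedule and the limit values are arbitrary examples chosen for illustration; successfulJobsHistoryLimit and failedJobsHistoryLimit are both fields of the CronJob spec, alongside schedule and jobTemplate.

```yaml
# Sketch: echo-cj with an explicit schedule and history limits (example values only).
apiVersion: batch/v1
kind: CronJob
metadata:
  name: echo-cj
spec:
  # Cron fields, left to right: minute hour day-of-month month day-of-week
  schedule: '*/5 * * * *'          # example: every 5 minutes
  successfulJobsHistoryLimit: 5    # example: keep the 5 most recent successful Jobs
  failedJobsHistoryLimit: 2        # example: keep the 2 most recent failed Jobs
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - image: busybox
            name: echo-cj
            imagePullPolicy: IfNotPresent
            command: ["/bin/echo"]
            args: ["hello", "world"]
```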
6. Summary
Through this nesting, these Kubernetes API objects form a "control chain":
CronJob uses timing rules to control Job, Job uses concurrency counts to control Pod, Pod defines parameters to control containers, containers isolate and control processes, and processes finally implement the business function. This layered form is a bit like the Decorator design pattern: each link in the chain performs its own duty and completes the task under the unified direction of Kubernetes.
- Pod is the smallest scheduling unit of Kubernetes, but to preserve its independence, no redundant functionality should be added to it.
- Kubernetes provides two API objects for offline business, Job and CronJob, which handle "temporary tasks" and "scheduled tasks" respectively.
- The key field of Job is spec.template, which defines the Pod template; other important fields are completions, parallelism, and so on.
- The key fields of CronJob are spec.jobTemplate and spec.schedule, which define the Job template and the schedule on which it runs.