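Core Concepts of Spring Batch
As the figure below shows, the JobLauncher starts a job. A job contains one or more steps, and each step contains an ItemReader (reads data), an ItemProcessor (processes data), and an ItemWriter (writes data). The job's metadata and run state are stored in the JobRepository.
Spring Batch consists mainly of the following parts:
- JobRepository: the container in which jobs are registered; it holds job metadata and run state
- JobLauncher: the interface used to start a Job
- Job: the task that is actually executed; it contains one or more Steps
- Step: contains an ItemReader, an ItemProcessor, and an ItemWriter
- ItemReader: the interface used to read data
- ItemProcessor: the interface used to process data
- ItemWriter: the interface used to write data
These components only need to be registered as Spring beans. To enable batch processing support, add @EnableBatchProcessing to a configuration class. Spring Batch ships with many ItemReader and ItemWriter implementations for reading from and writing to different data sources; data processing and validation are done by implementing the ItemProcessor interface.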
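To make the division of labour between reader, processor, and writer concrete, here is a minimal plain-Java model of what a step does with them. It is only a sketch and does not use Spring Batch: ChunkLoopSketch, runStep, and demo are hypothetical names, java.util.function.Supplier and Function stand in for ItemReader and ItemProcessor, and the real framework additionally handles transactions, restart, and error handling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

public class ChunkLoopSketch {

    // Hypothetical helper, not a Spring Batch API: reads until the reader
    // returns null, processes each item, and groups the processed items into
    // chunks of at most chunkSize. Each chunk models one call to the writer.
    static <I, O> List<List<O>> runStep(Supplier<I> reader,
                                        Function<I, O> processor,
                                        int chunkSize) {
        List<List<O>> written = new ArrayList<>();
        List<O> chunk = new ArrayList<>();
        I item;
        while ((item = reader.get()) != null) { // null signals end of input
            chunk.add(processor.apply(item));
            if (chunk.size() == chunkSize) {
                written.add(chunk);             // a full chunk goes to the "writer"
                chunk = new ArrayList<>();
            }
        }
        if (!chunk.isEmpty()) {
            written.add(chunk);                 // last, possibly smaller, chunk
        }
        return written;
    }

    // Runs the model on three sample strings with a chunk size of 2.
    static String demo() {
        Iterator<String> source = Arrays.asList("a", "b", "c").iterator();
        Supplier<String> reader = () -> source.hasNext() ? source.next() : null;
        Function<String, String> processor = s -> s.toUpperCase();
        return runStep(reader, processor, 2).toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [[A, B], [C]]
    }
}
```

With a chunk size of 1, every item would form its own chunk, so the writer would be called once per item.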
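Job Runtime Concepts
One complete run of a Job is called a JobInstance. JobInstances are distinguished by their JobParameters (Spring Batch takes the view that the same job should not be run more than once): if the JobParameters are identical, it is the same JobInstance. If a run fails or throws an exception midway, running it again still belongs to the same JobInstance, and each such attempt is called a JobExecution. Each execution of a step is called a StepExecution.
The relationships are:
Job 1->n JobInstance 1->n JobExecution 1->n StepExecution
JobExecution and StepExecution each hold an ExecutionContext, which stores key-value pairs that can be used to keep run state.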
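The identity rule above (same job name plus same parameters means the same JobInstance, and every attempt is a separate JobExecution) can be modelled in a few lines of plain Java. This is a toy sketch, not the Spring Batch API: JobInstanceSketch, recordExecution, and the "date" parameter are hypothetical, and the sketch does not model the framework's checks on re-running an instance that has already completed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class JobInstanceSketch {

    // Toy stand-in for a job repository: one entry per JobInstance, holding
    // the status of every execution recorded for that instance.
    private final Map<String, List<String>> executionsByInstance = new LinkedHashMap<>();

    // Records one execution and returns how many executions the instance now has.
    int recordExecution(String jobName, Map<String, String> parameters, String status) {
        // Sort the parameters so the key does not depend on insertion order.
        String instanceKey = jobName + new TreeMap<String, String>(parameters).toString();
        List<String> executions =
                executionsByInstance.computeIfAbsent(instanceKey, k -> new ArrayList<>());
        executions.add(status);
        return executions.size();
    }

    int instanceCount() {
        return executionsByInstance.size();
    }

    static String demo() {
        JobInstanceSketch repo = new JobInstanceSketch();
        Map<String, String> day1 = new TreeMap<>();
        day1.put("date", "2018-05-01"); // hypothetical parameter
        Map<String, String> day2 = new TreeMap<>();
        day2.put("date", "2018-05-02");

        int first = repo.recordExecution("processJob", day1, "FAILED");    // first attempt
        int retry = repo.recordExecution("processJob", day1, "COMPLETED"); // same instance
        int other = repo.recordExecution("processJob", day2, "COMPLETED"); // new instance
        return "instances=" + repo.instanceCount()
                + " executions=" + first + "," + retry + "," + other;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints instances=2 executions=1,2,1
    }
}
```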
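Hands-on Example
The example below reads three strings from a variable, converts them to upper case, and prints them to the console. A listener prints a line to the console when the job completes, and the job is triggered from a web endpoint.
Versions:
- spring-boot: 2.0.1.RELEASE
- spring-batch: 4.0.1.RELEASE (the version that Spring Boot 2.0.1 depends on)
POM file: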
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.javainuse</groupId>
<artifactId>springboot-batch</artifactId>
<version>0.0.1</version>
<packaging>jar</packaging>
<name>SpringBatch</name>
<description>Spring Batch-Spring Boot</description>
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>2.0.1.RELEASE</version>
<relativePath /> <!-- lookup parent from repository -->
</parent>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
<java.version>1.8</java.version>
</properties>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>com.h2database</groupId>
<artifactId>h2</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-devtools</artifactId>
<optional>true</optional>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-batch</artifactId>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
</plugin>
</plugins>
</build>
</project>
Item Reader
Read: reads three strings from an array
package com.javainuse.step;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.NonTransientResourceException;
import org.springframework.batch.item.ParseException;
import org.springframework.batch.item.UnexpectedInputException;
public class Reader implements ItemReader<String> {
private String[] messages = { "javainuse.com",
"Welcome to Spring Batch Example",
"We use H2 Database for this example" };
private int count = 0;
@Override
public String read() throws Exception, UnexpectedInputException, ParseException, NonTransientResourceException {
if (count < messages.length) {
return messages[count++];
} else {
count = 0;
}
return null;
}
}
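The reader follows the ItemReader convention of returning null once the input is exhausted, and it also resets its counter at that point. The plain-Java sketch below copies that logic without the Spring Batch interface to show the effect: the same instance can be drained a second time. ReaderContractSketch, drain, and demo are hypothetical names; that the reset exists so that one reader instance can serve repeated job launches is an assumption, not something the listing states.

```java
public class ReaderContractSketch {

    private final String[] messages = { "javainuse.com",
            "Welcome to Spring Batch Example",
            "We use H2 Database for this example" };
    private int count = 0;

    // Same logic as Reader.read() above, minus the Spring Batch interface.
    String read() {
        if (count < messages.length) {
            return messages[count++];
        }
        count = 0;   // rewind so a later run starts from the first message
        return null; // null tells the caller that the input is exhausted
    }

    // Drains the reader once and reports how many items it produced.
    static int drain(ReaderContractSketch reader) {
        int items = 0;
        while (reader.read() != null) {
            items++;
        }
        return items;
    }

    // Drains the same instance twice, as two consecutive job runs would.
    static String demo() {
        ReaderContractSketch reader = new ReaderContractSketch();
        return drain(reader) + "," + drain(reader);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 3,3
    }
}
```

Without the reset, the second pass would return null immediately and produce no items.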
Item Processor
Process: converts each string to upper case
package com.javainuse.step;
import org.springframework.batch.item.ItemProcessor;
public class Processor implements ItemProcessor<String, String> {
@Override
public String process(String data) throws Exception {
return data.toUpperCase();
}
}
Item Writer
Write: prints the upper-cased strings to the console
package com.javainuse.step;
import java.util.List;
import org.springframework.batch.item.ItemWriter;
public class Writer implements ItemWriter<String> {
@Override
public void write(List<? extends String> messages) throws Exception {
for (String msg : messages) {
System.out.println("Writing the data " + msg);
}
}
}
Listener
Listener: prints a line to the console after the job completes successfully
package com.javainuse.listener;
import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.listener.JobExecutionListenerSupport;
public class JobCompletionListener extends JobExecutionListenerSupport {
@Override
public void afterJob(JobExecution jobExecution) {
if (jobExecution.getStatus() == BatchStatus.COMPLETED) {
System.out.println("BATCH JOB COMPLETED SUCCESSFULLY");
}
}
}
Config
Spring Boot configuration
package com.javainuse.config;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecutionListener;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import com.javainuse.listener.JobCompletionListener;
import com.javainuse.step.Processor;
import com.javainuse.step.Reader;
import com.javainuse.step.Writer;
@Configuration
public class BatchConfig {
@Autowired
public JobBuilderFactory jobBuilderFactory;
@Autowired
public StepBuilderFactory stepBuilderFactory;
@Bean
public Job processJob() {
return jobBuilderFactory.get("processJob")
.incrementer(new RunIdIncrementer()).listener(listener())
.flow(orderStep1()).end().build();
}
@Bean
public Step orderStep1() {
return stepBuilderFactory.get("orderStep1").<String, String> chunk(1)
.reader(new Reader()).processor(new Processor())
.writer(new Writer()).build();
}
@Bean
public JobExecutionListener listener() {
return new JobCompletionListener();
}
}
Controller
Controller: sets up the web route; request /invokejob to launch the job
package com.javainuse.controller;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
@RestController
public class JobInvokerController {
@Autowired
JobLauncher jobLauncher;
@Autowired
Job processJob;
@RequestMapping("/invokejob")
public String handle() throws Exception {
JobParameters jobParameters = new JobParametersBuilder().addLong("time", System.currentTimeMillis())
.toJobParameters();
jobLauncher.run(processJob, jobParameters);
return "Batch job has been invoked";
}
}
application.properties
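Configuration: by default, all jobs are executed when the application starts. Setting spring.batch.job.enabled to false turns that off, so the job runs only when jobLauncher.run is called. The two lines after it configure the database; they are optional, because the embedded H2 database is used.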
spring.batch.job.enabled=false
spring.datasource.url=jdbc:h2:file:./DB
spring.jpa.properties.hibernate.hbm2ddl.auto=update
Application
Spring Boot entry class: annotated with @EnableBatchProcessing
package com.javainuse;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
@SpringBootApplication
@EnableBatchProcessing
public class SpringBatchApplication {
public static void main(String[] args) {
SpringApplication.run(SpringBatchApplication.class, args);
}
}
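Going Further
Option 1: building remote-partitioned Steps with Spring Batch on Spring Boot
https://gitee.com/kailing/partitionjob
Option 2: advanced Spring Batch, remote-partitioned Steps based on RabbitMQ
http://www.kailing.pub/article/index/arcid/196.html
Reference:
https://jamie-wang.iteye.com/blog/1876320
This post was put together from a dozen or so other posts; in case of any infringement, please contact the author.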