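Spring Batch reads data through implementations of the ItemReader interface. These include FlatFileItemReader for flat text files, StaxEventItemReader for XML files, JsonItemReader for JSON files, and JdbcPagingItemReader for paged database reads. For the other available implementations, see https://docs.spring.io/spring-batch/docs/4.2.x/reference/html/appendix.html#itemReadersAppendix. This article covers only these four commonly used readers.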

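Project Setup

Create a new Spring Boot project, version 2.2.4.RELEASE, with the artifactId spring-batch-itemreader. The project structure is shown in the figure below:

Spring Batch Reading Data - Figure 1

The rest of the preparation (database layer, project configuration, and dependencies) is the same as the setup steps in the Spring Batch getting-started article, so it is not repeated here.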

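Simple Data Reading

As mentioned above, Spring Batch reads data through implementations of the ItemReader interface, so we can write our own ItemReader implementation to read simple data.

Create a reader package under cc.mrbird.batch, and in it create MySimpleIteamReader, an implementation of ItemReader:

    package cc.mrbird.batch.reader;

    import org.springframework.batch.item.ItemReader;

    import java.util.Iterator;
    import java.util.List;

    public class MySimpleIteamReader implements ItemReader<String> {

        private final Iterator<String> iterator;

        public MySimpleIteamReader(List<String> data) {
            this.iterator = data.iterator();
        }

        @Override
        public String read() {
            // Return items one at a time; null signals that the data is exhausted
            return iterator.hasNext() ? iterator.next() : null;
        }
    }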

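The generic type parameter specifies the type of the items being read; here the reader is backed by a List of String. The read() implementation is simple: it walks through the collection and returns null once there is nothing left, which tells Spring Batch that the input is exhausted.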
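The read-until-null contract can be tried out without Spring on the classpath. The following is a minimal sketch under that assumption: SimpleReader, ListBackedReader, and ReadContractSketch are hypothetical stand-ins written for illustration, not Spring Batch types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the read() contract; this is NOT Spring Batch's ItemReader.
interface SimpleReader<T> {
    T read(); // returns the next item, or null when the input is exhausted
}

final class ListBackedReader implements SimpleReader<String> {
    private final Iterator<String> iterator;

    ListBackedReader(List<String> data) {
        this.iterator = data.iterator();
    }

    @Override
    public String read() {
        return iterator.hasNext() ? iterator.next() : null;
    }
}

final class ReadContractSketch {
    // Drains a reader the way a caller would: keep reading until null comes back.
    static <T> List<T> drain(SimpleReader<T> reader) {
        List<T> items = new ArrayList<>();
        T item;
        while ((item = reader.read()) != null) {
            items.add(item);
        }
        return items;
    }

    public static void main(String[] args) {
        SimpleReader<String> reader =
                new ListBackedReader(Arrays.asList("java", "c++", "javascript", "python"));
        System.out.println(drain(reader)); // all four items, in order
        System.out.println(reader.read()); // null: the reader stays exhausted
    }
}
```

One consequence of this contract is that null cannot be used as a regular item value, because the caller would treat it as the end of the input.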

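Next, create a job package under cc.mrbird.batch, and in it create the MySimpleItemReaderDemo class to test the MySimpleIteamReader defined above. The code of MySimpleItemReaderDemo is as follows:

    import cc.mrbird.batch.reader.MySimpleIteamReader;
    import org.springframework.batch.core.Job;
    import org.springframework.batch.core.Step;
    import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
    import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
    import org.springframework.batch.item.ItemReader;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.context.annotation.Bean;
    import org.springframework.stereotype.Component;

    import java.util.Arrays;
    import java.util.List;

    @Component
    public class MySimpleItemReaderDemo {

        @Autowired
        private JobBuilderFactory jobBuilderFactory;
        @Autowired
        private StepBuilderFactory stepBuilderFactory;

        @Bean
        public Job mySimpleItemReaderJob() {
            return jobBuilderFactory.get("mySimpleItemReaderJob")
                    .start(step())
                    .build();
        }

        private Step step() {
            return stepBuilderFactory.get("step")
                    .<String, String>chunk(2)
                    .reader(mySimpleItemReader())
                    // Simple console output; writers are covered in detail later
                    .writer(list -> list.forEach(System.out::println))
                    .build();
        }

        private ItemReader<String> mySimpleItemReader() {
            List<String> data = Arrays.asList("java", "c++", "javascript", "python");
            return new MySimpleIteamReader(data);
        }
    }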

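In the code above, the mySimpleItemReader() method creates a MySimpleIteamReader and passes it the List data. The code is largely the same as in the previous article; the main difference is how the Step is created.

In MySimpleItemReaderDemo the Step is still created through StepBuilderFactory, but with the chunk() method instead of tasklet(). A chunk is literally a "block" and can be thought of as a block of data. The generic parameters <String, String> specify the type of the items read and the type of the items written, and the argument to chunk() sets the chunk size: with a value of 2, the writer is invoked once for every 2 items read. The reader() method then specifies how data is read and accepts an ItemReader implementation, here our custom MySimpleIteamReader. The writer() method specifies how data is output; since writing is not the focus of this article, the items are simply printed.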
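To make the chunk size concrete, here is a rough model in plain Java of how items read one at a time are grouped before being handed to the writer. ChunkSketch and toChunks are hypothetical names for this illustration; the real chunk-oriented step also handles transactions, restarts, and error handling, which this sketch leaves out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class ChunkSketch {
    // Groups items into chunks of at most chunkSize.
    // Each inner list stands for one call to the writer.
    static <T> List<List<T>> toChunks(Iterator<T> source, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        List<T> current = new ArrayList<>();
        while (source.hasNext()) {
            current.add(source.next());
            if (current.size() == chunkSize) {
                chunks.add(current);
                current = new ArrayList<>();
            }
        }
        // A final, smaller chunk is still written when the input runs out
        if (!current.isEmpty()) {
            chunks.add(current);
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("java", "c++", "javascript", "python");
        // prints [[java, c++], [javascript, python]]
        System.out.println(toChunks(data.iterator(), 2));
    }
}
```

With four items and a chunk size of 2 the writer is called twice; with five items it would be called three times, the last time with a single item.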

Start the project. The console prints the following log:

    2020-03-07 11:17:32.303 INFO 28381 --- [ main] o.s.b.c.l.support.SimpleJobLauncher : Job: [SimpleJob: [name=mySimpleItemReaderJob]] launched with the following parameters: [{}]
    2020-03-07 11:17:32.369 INFO 28381 --- [ main] o.s.batch.core.job.SimpleStepHandler : Executing step: [step]
    java
    c++
    javascript
    python
    2020-03-07 11:17:32.428 INFO 28381 --- [ main] o.s.batch.core.step.AbstractStep : Step: [step] executed in 59ms
    2020-03-07 11:17:32.451 INFO 28381 --- [ main] o.s.b.c.l.support.SimpleJobLauncher : Job: [SimpleJob: [name=mySimpleItemReaderJob]] completed with the following parameters: [{}] and the following status: [COMPLETED] in 125ms

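Text Data Reading

Spring Batch reads text data with FlatFileItemReader. Before showing how to use it, let's prepare a data file.

Create a file named file under the resources directory with the following content:

    // Demo data for file reading
    1,11,12,13
    2,21,22,23
    3,31,32,33
    4,41,42,43
    5,51,52,53
    6,61,62,63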

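The file contains comma-separated data, one record per line (in batch processing, text data files generally follow some regular layout). While reading, each line has to be converted into a POJO, so we need a matching POJO class. Create an entity package under cc.mrbird.batch, and in it create the TestData class:

    public class TestData {

        private int id;
        private String field1;
        private String field2;
        private String field3;

        // getters, setters, and toString omitted
    }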

Because a line of the file splits on commas into 1, 11, 12, and 13, the corresponding TestData POJO has four properties: id, field1, field2, and field3.
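Before bringing in the Spring Batch classes, the line-to-object mapping can be sketched in plain Java. This is only an illustration of the idea: Row is a hypothetical stand-in for TestData, and mapLine is not part of any library.

```java
final class LineMappingSketch {
    // Hypothetical stand-in for the article's TestData POJO.
    static final class Row {
        final int id;
        final String field1;
        final String field2;
        final String field3;

        Row(int id, String field1, String field2, String field3) {
            this.id = id;
            this.field1 = field1;
            this.field2 = field2;
            this.field3 = field3;
        }

        @Override
        public String toString() {
            return "Row{id=" + id + ", field1='" + field1 + "', field2='" + field2
                    + "', field3='" + field3 + "'}";
        }
    }

    // Splits one comma-separated line into four tokens and maps them to a Row.
    static Row mapLine(String line) {
        String[] tokens = line.split(",", -1); // -1 keeps trailing empty fields
        if (tokens.length != 4) {
            throw new IllegalArgumentException("expected 4 fields but got: " + line);
        }
        return new Row(Integer.parseInt(tokens[0].trim()), tokens[1], tokens[2], tokens[3]);
    }

    public static void main(String[] args) {
        System.out.println(mapLine("1,11,12,13"));
    }
}
```

The two steps in mapLine, splitting a line into tokens and then turning tokens into an object, are the same two responsibilities that the reader configuration below assigns to a tokenizer and a field set mapper.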

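Next, create FileItemReaderDemo in the job package:

    import cc.mrbird.batch.entity.TestData;
    import org.springframework.batch.core.Job;
    import org.springframework.batch.core.Step;
    import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
    import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
    import org.springframework.batch.item.ItemReader;
    import org.springframework.batch.item.file.FlatFileItemReader;
    import org.springframework.batch.item.file.mapping.DefaultLineMapper;
    import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.context.annotation.Bean;
    import org.springframework.core.io.ClassPathResource;
    import org.springframework.stereotype.Component;

    @Component
    public class FileItemReaderDemo {

        // Factory for building jobs
        @Autowired
        private JobBuilderFactory jobBuilderFactory;
        // Factory for building steps
        @Autowired
        private StepBuilderFactory stepBuilderFactory;

        @Bean
        public Job fileItemReaderJob() {
            return jobBuilderFactory.get("fileItemReaderJob")
                    .start(step())
                    .build();
        }

        private Step step() {
            return stepBuilderFactory.get("step")
                    .<TestData, TestData>chunk(2)
                    .reader(fileItemReader())
                    .writer(list -> list.forEach(System.out::println))
                    .build();
        }

        private ItemReader<TestData> fileItemReader() {
            FlatFileItemReader<TestData> reader = new FlatFileItemReader<>();
            reader.setResource(new ClassPathResource("file")); // location of the file
            reader.setLinesToSkip(1); // skip the first line

            // One of the three AbstractLineTokenizer implementations; splits a line on a
            // fixed delimiter. The default constructor uses a comma; another delimiter
            // can be passed to the parameterized constructor.
            DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
            // Field names, similar to a table header
            tokenizer.setNames("id", "field1", "field2", "field3");

            // Converts each line into a TestData object
            DefaultLineMapper<TestData> mapper = new DefaultLineMapper<>();
            mapper.setLineTokenizer(tokenizer);
            // How the tokenized text is mapped to the POJO
            mapper.setFieldSetMapper(fieldSet -> {
                TestData data = new TestData();
                data.setId(fieldSet.readInt("id"));
                data.setField1(fieldSet.readString("field1"));
                data.setField2(fieldSet.readString("field2"));
                data.setField3(fieldSet.readString("field3"));
                return data;
            });
            reader.setLineMapper(mapper);
            return reader;
        }
    }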

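In the code above, the fileItemReader() method contains the actual text-reading logic; the comments describe each step. The source of the default constructor of DelimitedLineTokenizer is shown below:

Spring Batch Reading Data - Figure 2

The constant DELIMITER_COMMA is defined as public static final String DELIMITER_COMMA = ",";. If the data is separated by some other character such as | instead of a comma, the delimiter can be passed to the parameterized constructor: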

    DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer("|");
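For comparison, here is what splitting on a custom delimiter looks like in plain Java. This is a sketch for illustration only (PipeDelimiterSketch and tokenize are hypothetical names); it shows one detail that is easy to miss when doing the split by hand: String.split takes a regular expression, so a delimiter such as | has to be quoted.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class PipeDelimiterSketch {
    // Splits a line on a literal delimiter. Pattern.quote is needed because
    // String.split interprets its argument as a regular expression, and an
    // unquoted "|" means "empty string OR empty string".
    static List<String> tokenize(String line, String delimiter) {
        return Arrays.asList(line.split(Pattern.quote(delimiter), -1));
    }

    public static void main(String[] args) {
        // prints [1, 11, 12, 13]
        System.out.println(tokenize("1|11|12|13", "|"));
    }
}
```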

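DelimitedLineTokenizer is one of the three implementations of AbstractLineTokenizer:

Spring Batch Reading Data - Figure 3

As the names suggest, FixedLengthTokenizer cuts each line at specified fixed lengths, and RegexLineTokenizer matches the data with a regular expression. They are not demonstrated here; feel free to try them yourself.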
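The two other strategies can be sketched in plain Java to show the idea behind them. This is not the API of FixedLengthTokenizer or RegexLineTokenizer: the class, the method names, and the 0-based half-open column ranges are all assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class OtherTokenizersSketch {
    // Cuts a line at fixed column boundaries.
    // Each range is {startInclusive, endExclusive}, counted from 0.
    static List<String> byFixedLength(String line, int[][] ranges) {
        List<String> tokens = new ArrayList<>();
        for (int[] range : ranges) {
            tokens.add(line.substring(range[0], range[1]).trim());
        }
        return tokens;
    }

    // Matches the whole line against a pattern and returns its capture groups.
    static List<String> byRegex(String line, Pattern pattern) {
        Matcher matcher = pattern.matcher(line);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("line does not match pattern: " + line);
        }
        List<String> tokens = new ArrayList<>();
        for (int group = 1; group <= matcher.groupCount(); group++) {
            tokens.add(matcher.group(group));
        }
        return tokens;
    }

    public static void main(String[] args) {
        int[][] columns = {{0, 2}, {2, 4}, {4, 6}, {6, 8}};
        System.out.println(byFixedLength("1 111213", columns));
        System.out.println(byRegex("1,11,12,13",
                Pattern.compile("(\\d+),(\\d+),(\\d+),(\\d+)")));
    }
}
```

Fixed-length slicing suits files where every field occupies a known column range, while the regular-expression approach suits lines whose fields are easier to describe by pattern than by position or delimiter.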

With FileItemReaderDemo written, start the project. The console prints the following log:

    2020-03-07 12:06:11.876 INFO 29042 --- [ main] o.s.b.c.l.support.SimpleJobLauncher : Job: [SimpleJob: [name=fileItemReaderJob]] launched with the following parameters: [{}]
    2020-03-07 12:06:11.937 INFO 29042 --- [ main] o.s.batch.core.job.SimpleStepHandler : Executing step: [step]
    TestData{id=1, field1='11', field2='12', field3='13'}
    TestData{id=2, field1='21', field2='22', field3='23'}
    TestData{id=3, field1='31', field2='32', field3='33'}
    TestData{id=4, field1='41', field2='42', field3='43'}
    TestData{id=5, field1='51', field2='52', field3='53'}
    TestData{id=6, field1='61', field2='62', field3='63'}
    2020-03-07 12:06:12.020 INFO 29042 --- [ main] o.s.batch.core.step.AbstractStep : Step: [step] executed in 83ms
    2020-03-07 12:06:12.044 INFO 29042 --- [ main] o.s.b.c.l.support.SimpleJobLauncher : Job: [SimpleJob: [name=fileItemReaderJob]] completed with the following parameters: [{}] and the following status: [COMPLETED] in 146ms

Database Data Reading

Before reading from the database, let's prepare some test data. Create a TEST table in the springbatch database with the following SQL:

    -- ----------------------------
    -- Table structure for TEST
    -- ----------------------------
    DROP TABLE IF EXISTS `TEST`;
    CREATE TABLE `TEST` (
      `id` bigint(10) NOT NULL COMMENT 'ID',
      `field1` varchar(10) NOT NULL COMMENT 'Field 1',
      `field2` varchar(10) NOT NULL COMMENT 'Field 2',
      `field3` varchar(10) NOT NULL COMMENT 'Field 3',
      PRIMARY KEY (`id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

    -- ----------------------------
    -- Records of TEST
    -- ----------------------------
    BEGIN;
    INSERT INTO `TEST` VALUES (1, '11', '12', '13');
    INSERT INTO `TEST` VALUES (2, '21', '22', '23');
    INSERT INTO `TEST` VALUES (3, '31', '32', '33');
    INSERT INTO `TEST` VALUES (4, '41', '42', '43');
    INSERT INTO `TEST` VALUES (5, '51', '52', '53');
    INSERT INTO `TEST` VALUES (6, '61', '62', '63');
    COMMIT;

The columns of the TEST table match the fields of the TestData entity class created above.

Then create the DataSourceItemReaderDemo class in the job package to test reading from the database:

    import cc.mrbird.batch.entity.TestData;
    import org.springframework.batch.core.Job;
    import org.springframework.batch.core.Step;
    import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
    import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
    import org.springframework.batch.item.ItemReader;
    import org.springframework.batch.item.database.JdbcPagingItemReader;
    import org.springframework.batch.item.database.Order;
    import org.springframework.batch.item.database.support.MySqlPagingQueryProvider;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.context.annotation.Bean;
    import org.springframework.stereotype.Component;

    import javax.sql.DataSource;
    import java.util.HashMap;
    import java.util.Map;

    @Component
    public class DataSourceItemReaderDemo {

        @Autowired
        private JobBuilderFactory jobBuilderFactory;
        @Autowired
        private StepBuilderFactory stepBuilderFactory;
        // Inject the data source
        @Autowired
        private DataSource dataSource;

        @Bean
        public Job dataSourceItemReaderJob() throws Exception {
            return jobBuilderFactory.get("dataSourceItemReaderJob")
                    .start(step())
                    .build();
        }

        private Step step() throws Exception {
            return stepBuilderFactory.get("step")
                    .<TestData, TestData>chunk(2)
                    .reader(dataSourceItemReader())
                    .writer(list -> list.forEach(System.out::println))
                    .build();
        }

        private ItemReader<TestData> dataSourceItemReader() throws Exception {
            JdbcPagingItemReader<TestData> reader = new JdbcPagingItemReader<>();
            reader.setDataSource(dataSource); // data source
            reader.setFetchSize(5); // JDBC fetch size: rows fetched from the database at a time
            reader.setPageSize(5); // rows per page

            // Query to run: select id,field1,field2,field3 from TEST
            MySqlPagingQueryProvider provider = new MySqlPagingQueryProvider();
            provider.setSelectClause("id,field1,field2,field3"); // columns to select
            provider.setFromClause("from TEST"); // table to query

            // Convert each row into a TestData object
            reader.setRowMapper((resultSet, rowNum) -> {
                TestData data = new TestData();
                data.setId(resultSet.getInt(1)); // column 1: id
                data.setField1(resultSet.getString(2)); // column 2: field1, read as a String
                data.setField2(resultSet.getString(3));
                data.setField3(resultSet.getString(4));
                return data;
            });

            Map<String, Order> sort = new HashMap<>(1);
            sort.put("id", Order.ASCENDING);
            provider.setSortKeys(sort); // sort by id, ascending
            reader.setQueryProvider(provider);

            // Initializes internal state such as the NamedParameterJdbcTemplate
            reader.afterPropertiesSet();
            return reader;
        }
    }

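The main steps in dataSourceItemReader() are: give the JdbcPagingItemReader its data source, then set the page size, the SQL used to fetch the data, the sort order, and the mapping from query results to the POJO. The method calls afterPropertiesSet() on the JdbcPagingItemReader at the end because the reader is created by hand rather than managed as a Spring bean, so the JDBC template has to be initialized explicitly (source of afterPropertiesSet()):

Spring Batch Reading Data - Figure 4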
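The sort key matters because paging needs a stable order to know where the next page starts. As a rough mental model of key-based paging (an assumption about the general technique, not a transcription of the SQL that Spring Batch generates), each page can be thought of as "the next pageSize rows whose key is greater than the last key already read". KeysetPagingSketch and its methods are hypothetical names, and an in-memory list stands in for the table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class KeysetPagingSketch {
    // Returns up to pageSize ids greater than lastId, in ascending order.
    // A null lastId means "start from the beginning".
    static List<Integer> nextPage(List<Integer> sortedIds, Integer lastId, int pageSize) {
        List<Integer> page = new ArrayList<>();
        for (Integer id : sortedIds) {
            if (page.size() == pageSize) {
                break;
            }
            if (lastId == null || id > lastId) {
                page.add(id);
            }
        }
        return page;
    }

    // Reads page after page until an empty page signals the end of the data.
    static List<List<Integer>> readAllPages(List<Integer> sortedIds, int pageSize) {
        List<List<Integer>> pages = new ArrayList<>();
        Integer lastId = null;
        while (true) {
            List<Integer> page = nextPage(sortedIds, lastId, pageSize);
            if (page.isEmpty()) {
                break;
            }
            pages.add(page);
            lastId = page.get(page.size() - 1);
        }
        return pages;
    }

    public static void main(String[] args) {
        // Six rows with a page size of 5, as in the demo above
        // prints [[1, 2, 3, 4, 5], [6]]
        System.out.println(readAllPages(Arrays.asList(1, 2, 3, 4, 5, 6), 5));
    }
}
```

Note that the page size and the chunk size are independent settings: in the demo the reader fetches rows in pages of 5, while the step still hands items to the writer in chunks of 2.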

Start the project. The console prints the following log:

    2020-03-07 16:01:05.366 INFO 30264 --- [ main] o.s.b.c.l.support.SimpleJobLauncher : Job: [SimpleJob: [name=dataSourceItemReaderJob]] launched with the following parameters: [{}]
    2020-03-07 16:01:05.420 INFO 30264 --- [ main] o.s.batch.core.job.SimpleStepHandler : Executing step: [step]
    TestData{id=1, field1='11', field2='12', field3='13'}
    TestData{id=2, field1='21', field2='22', field3='23'}
    TestData{id=3, field1='31', field2='32', field3='33'}
    TestData{id=4, field1='41', field2='42', field3='43'}
    TestData{id=5, field1='51', field2='52', field3='53'}
    TestData{id=6, field1='61', field2='62', field3='63'}
    2020-03-07 16:01:05.512 INFO 30264 --- [ main] o.s.batch.core.step.AbstractStep : Step: [step] executed in 92ms
    2020-03-07 16:01:05.534 INFO 30264 --- [ main] o.s.b.c.l.support.SimpleJobLauncher : Job: [SimpleJob: [name=dataSourceItemReaderJob]] completed with the following parameters: [{}] and the following status: [COMPLETED] in 147ms

XML Data Reading

With the help of Spring OXM, Spring Batch can easily read data files in XML format. Create file.xml under the resources directory with the following content:

    <?xml version="1.0" encoding="utf-8" ?>
    <tests>
        <test>
            <id>1</id>
            <field1>11</field1>
            <field2>12</field2>
            <field3>13</field3>
        </test>
        <test>
            <id>2</id>
            <field1>21</field1>
            <field2>22</field2>
            <field3>23</field3>
        </test>
        <test>
            <id>3</id>
            <field1>31</field1>
            <field2>32</field2>
            <field3>33</field3>
        </test>
        <test>
            <id>4</id>
            <field1>41</field1>
            <field2>42</field2>
            <field3>43</field3>
        </test>
        <test>
            <id>5</id>
            <field1>51</field1>
            <field2>52</field2>
            <field3>53</field3>
        </test>
        <test>
            <id>6</id>
            <field1>61</field1>
            <field2>62</field2>
            <field3>63</field3>
        </test>
    </tests>

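The XML file consists of repeated <test></test> elements. Each <test> element contains four child elements whose names correspond one-to-one to the properties of the TestData entity class.

With the XML file ready, add the spring-oxm dependency to the pom, together with XStream, which the XStreamMarshaller used below relies on:

    <dependency>
        <groupId>org.springframework</groupId>
        <artifactId>spring-oxm</artifactId>
    </dependency>
    <dependency>
        <groupId>com.thoughtworks.xstream</groupId>
        <artifactId>xstream</artifactId>
        <version>1.4.11.1</version>
    </dependency>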

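Next, create XmlFileItemReaderDemo in the job package to demonstrate reading the XML file:

    import cc.mrbird.batch.entity.TestData;
    import org.springframework.batch.core.Job;
    import org.springframework.batch.core.Step;
    import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
    import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
    import org.springframework.batch.item.ItemReader;
    import org.springframework.batch.item.xml.StaxEventItemReader;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.context.annotation.Bean;
    import org.springframework.core.io.ClassPathResource;
    import org.springframework.oxm.xstream.XStreamMarshaller;
    import org.springframework.stereotype.Component;

    import java.util.HashMap;
    import java.util.Map;

    @Component
    public class XmlFileItemReaderDemo {

        @Autowired
        private JobBuilderFactory jobBuilderFactory;
        @Autowired
        private StepBuilderFactory stepBuilderFactory;

        @Bean
        public Job xmlFileItemReaderJob() {
            return jobBuilderFactory.get("xmlFileItemReaderJob")
                    .start(step())
                    .build();
        }

        private Step step() {
            return stepBuilderFactory.get("step")
                    .<TestData, TestData>chunk(2)
                    .reader(xmlFileItemReader())
                    .writer(list -> list.forEach(System.out::println))
                    .build();
        }

        private ItemReader<TestData> xmlFileItemReader() {
            StaxEventItemReader<TestData> reader = new StaxEventItemReader<>();
            reader.setResource(new ClassPathResource("file.xml")); // XML file to read
            // Name of the element that is the root of each fragment (one item per <test>)
            reader.setFragmentRootElementName("test");

            // Converts each XML fragment into a TestData object
            XStreamMarshaller marshaller = new XStreamMarshaller();
            // Map the element name to the target type
            Map<String, Class<TestData>> map = new HashMap<>(1);
            map.put("test", TestData.class);
            marshaller.setAliases(map);

            reader.setUnmarshaller(marshaller);
            return reader;
        }
    }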

In xmlFileItemReader(), the XML file is read with StaxEventItemReader. The code is fairly simple; the comments explain each step.
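The notion of a fragment root element can be illustrated with the StAX API that ships with the JDK. The sketch below is only an approximation of the idea, not the implementation of StaxEventItemReader: FragmentSketch and readFragments are hypothetical names, and each fragment is collected into a Map instead of being unmarshalled into TestData.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class FragmentSketch {
    // Streams through the document and turns every <fragmentRoot> element
    // into one item: a map from child element name to its text.
    static List<Map<String, String>> readFragments(String xml, String fragmentRoot)
            throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        List<Map<String, String>> items = new ArrayList<>();
        Map<String, String> current = null; // non-null while inside a fragment
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                String name = reader.getLocalName();
                if (name.equals(fragmentRoot)) {
                    current = new LinkedHashMap<>();
                } else if (current != null) {
                    // getElementText() consumes the text and the child's end tag
                    current.put(name, reader.getElementText());
                }
            } else if (event == XMLStreamConstants.END_ELEMENT
                    && reader.getLocalName().equals(fragmentRoot)) {
                items.add(current);
                current = null;
            }
        }
        reader.close();
        return items;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<tests>"
                + "<test><id>1</id><field1>11</field1></test>"
                + "<test><id>2</id><field1>21</field1></test>"
                + "</tests>";
        System.out.println(readFragments(xml, "test"));
    }
}
```

Because the document is streamed one fragment at a time, the whole file never has to be held in memory, which is what makes this style of reading suitable for large XML inputs.<test><id>2</id><field1>21</field1></test></tests>";
java.util.List<java.util.Map<String, String>> items = FragmentSketch.readFragments(xml, "test");
assert items.size() == 2;
assert items.get(0).get("id").equals("1");
assert items.get(1).get("field1").equals("21");
</test>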

Start the project. The console prints the following log:

    2020-03-07 16:23:47.775 INFO 30450 --- [ main] o.s.b.c.l.support.SimpleJobLauncher : Job: [SimpleJob: [name=xmlFileItemReaderJob]] launched with the following parameters: [{}]
    2020-03-07 16:23:47.820 INFO 30450 --- [ main] o.s.batch.core.job.SimpleStepHandler : Executing step: [step]
    TestData{id=1, field1='11', field2='12', field3='13'}
    TestData{id=2, field1='21', field2='22', field3='23'}
    TestData{id=3, field1='31', field2='32', field3='33'}
    TestData{id=4, field1='41', field2='42', field3='43'}
    TestData{id=5, field1='51', field2='52', field3='53'}
    TestData{id=6, field1='61', field2='62', field3='63'}
    2020-03-07 16:23:47.961 INFO 30450 --- [ main] o.s.batch.core.step.AbstractStep : Step: [step] executed in 140ms
    2020-03-07 16:23:47.984 INFO 30450 --- [ main] o.s.b.c.l.support.SimpleJobLauncher : Job: [SimpleJob: [name=xmlFileItemReaderJob]] completed with the following parameters: [{}] and the following status: [COMPLETED] in 200ms

JSON Data Reading

Create a file.json file under the resources directory with the following content:

    [
      {
        "id": 1,
        "field1": "11",
        "field2": "12",
        "field3": "13"
      },
      {
        "id": 2,
        "field1": "21",
        "field2": "22",
        "field3": "23"
      },
      {
        "id": 3,
        "field1": "31",
        "field2": "32",
        "field3": "33"
      }
    ]

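The properties of each JSON object correspond one-to-one to the properties of TestData. Create JSONFileItemReaderDemo in the job package to test reading the JSON file:

    import cc.mrbird.batch.entity.TestData;
    import org.springframework.batch.core.Job;
    import org.springframework.batch.core.Step;
    import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
    import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
    import org.springframework.batch.item.ItemReader;
    import org.springframework.batch.item.json.JacksonJsonObjectReader;
    import org.springframework.batch.item.json.JsonItemReader;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.context.annotation.Bean;
    import org.springframework.core.io.ClassPathResource;
    import org.springframework.stereotype.Component;

    @Component
    public class JSONFileItemReaderDemo {

        @Autowired
        private JobBuilderFactory jobBuilderFactory;
        @Autowired
        private StepBuilderFactory stepBuilderFactory;

        @Bean
        public Job jsonFileItemReaderJob() {
            return jobBuilderFactory.get("jsonFileItemReaderJob")
                    .start(step())
                    .build();
        }

        private Step step() {
            return stepBuilderFactory.get("step")
                    .<TestData, TestData>chunk(2)
                    .reader(jsonItemReader())
                    .writer(list -> list.forEach(System.out::println))
                    .build();
        }

        private ItemReader<TestData> jsonItemReader() {
            // Location of the JSON file
            ClassPathResource resource = new ClassPathResource("file.json");
            // Target type each JSON object is converted to
            JacksonJsonObjectReader<TestData> jacksonJsonObjectReader =
                    new JacksonJsonObjectReader<>(TestData.class);
            JsonItemReader<TestData> reader = new JsonItemReader<>(resource, jacksonJsonObjectReader);
            // Give the reader a name
            reader.setName("testDataJsonItemReader");
            return reader;
        }
    }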

Start the project. The console output is as follows:

    2020-03-07 16:40:52.508 INFO 30599 --- [ main] o.s.b.c.l.support.SimpleJobLauncher : Job: [SimpleJob: [name=jsonFileItemReaderJob]] launched with the following parameters: [{}]
    2020-03-07 16:40:52.554 INFO 30599 --- [ main] o.s.batch.core.job.SimpleStepHandler : Executing step: [step]
    TestData{id=1, field1='11', field2='12', field3='13'}
    TestData{id=2, field1='21', field2='22', field3='23'}
    TestData{id=3, field1='31', field2='32', field3='33'}
    2020-03-07 16:40:52.622 INFO 30599 --- [ main] o.s.batch.core.step.AbstractStep : Step: [step] executed in 67ms
    2020-03-07 16:40:52.642 INFO 30599 --- [ main] o.s.b.c.l.support.SimpleJobLauncher : Job: [SimpleJob: [name=jsonFileItemReaderJob]] completed with the following parameters: [{}] and the following status: [COMPLETED] in 124ms

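Reading Multiple Text Files

Reading multiple text files is essentially the same as reading a single file. The difference is that a delegating reader is layered on top of the single-file reader.

Create two files, file1 and file2, under the resources directory. The content of file1 is:

    // Demo data for file reading
    1,11,12,13
    2,21,22,23
    3,31,32,33
    4,41,42,43
    5,51,52,53
    6,61,62,63

The content of file2 is:

    // Demo data for file reading
    7,71,72,73
    8,81,82,83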

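Then create MultiFileIteamReaderDemo in the job package to demonstrate reading multiple files:

    import cc.mrbird.batch.entity.TestData;
    import org.springframework.batch.core.Job;
    import org.springframework.batch.core.Step;
    import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
    import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
    import org.springframework.batch.item.ItemReader;
    import org.springframework.batch.item.file.FlatFileItemReader;
    import org.springframework.batch.item.file.MultiResourceItemReader;
    import org.springframework.batch.item.file.mapping.DefaultLineMapper;
    import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.context.annotation.Bean;
    import org.springframework.core.io.ClassPathResource;
    import org.springframework.core.io.Resource;
    import org.springframework.stereotype.Component;

    @Component
    public class MultiFileIteamReaderDemo {

        @Autowired
        private JobBuilderFactory jobBuilderFactory;
        @Autowired
        private StepBuilderFactory stepBuilderFactory;

        @Bean
        public Job multiFileItemReaderJob() {
            return jobBuilderFactory.get("multiFileItemReaderJob")
                    .start(step())
                    .build();
        }

        private Step step() {
            return stepBuilderFactory.get("step")
                    .<TestData, TestData>chunk(2)
                    .reader(multiFileItemReader())
                    .writer(list -> list.forEach(System.out::println))
                    .build();
        }

        private ItemReader<TestData> multiFileItemReader() {
            MultiResourceItemReader<TestData> reader = new MultiResourceItemReader<>();
            // Delegate that reads each individual file; same approach as the earlier file example
            reader.setDelegate(fileItemReader());
            Resource[] resources = new Resource[]{
                    new ClassPathResource("file1"),
                    new ClassPathResource("file2")
            };
            reader.setResources(resources); // the files to read
            return reader;
        }

        private FlatFileItemReader<TestData> fileItemReader() {
            FlatFileItemReader<TestData> reader = new FlatFileItemReader<>();
            reader.setLinesToSkip(1); // skip the first line

            // One of the three AbstractLineTokenizer implementations; splits a line on a
            // fixed delimiter. The default constructor uses a comma; another delimiter
            // can be passed to the parameterized constructor.
            DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
            // Field names, similar to a table header
            tokenizer.setNames("id", "field1", "field2", "field3");

            // Converts each line into a TestData object
            DefaultLineMapper<TestData> mapper = new DefaultLineMapper<>();
            mapper.setLineTokenizer(tokenizer);
            // How the tokenized text is mapped to the POJO
            mapper.setFieldSetMapper(fieldSet -> {
                TestData data = new TestData();
                data.setId(fieldSet.readInt("id"));
                data.setField1(fieldSet.readString("field1"));
                data.setField2(fieldSet.readString("field2"));
                data.setField3(fieldSet.readString("field3"));
                return data;
            });
            reader.setLineMapper(mapper);
            return reader;
        }
    }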

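The fileItemReader() method in the code above was already covered in the text data reading section. The key to reading multiple files is the multiFileItemReader() method: it uses a MultiResourceItemReader to set the locations of the files and registers the single-file reader as its delegate. Unlike the earlier example, the delegate does not set a resource itself, because the MultiResourceItemReader supplies each file to it in turn.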
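The delegation idea can be modelled in a few lines of plain Java. This is a simplified sketch, not how MultiResourceItemReader is implemented: MultiFileSketch and readAll are hypothetical names, and each "file" is just a list of lines. It reads the files in order and applies the same per-file logic, including skipping the header line, to each one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class MultiFileSketch {
    // Reads every "file" in order, skipping the first linesToSkip lines of each one,
    // and returns the remaining lines as a single sequence.
    static List<String> readAll(List<List<String>> files, int linesToSkip) {
        List<String> lines = new ArrayList<>();
        for (List<String> file : files) {
            int start = Math.min(linesToSkip, file.size());
            for (int i = start; i < file.size(); i++) {
                lines.add(file.get(i));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> file1 = Arrays.asList("// header", "1,11,12,13", "2,21,22,23");
        List<String> file2 = Arrays.asList("// header", "7,71,72,73");
        // prints [1,11,12,13, 2,21,22,23, 7,71,72,73]
        System.out.println(readAll(Arrays.asList(file1, file2), 1));
    }
}
```

To the step, the combined reader looks like one continuous source of items, which matches the log output of the demo: records 7 and 8 from file2 follow directly after record 6 from file1, and the header line of each file is skipped.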

Start the project. The console prints the following log:

    2020-03-07 16:55:24.480 INFO 30749 --- [ main] o.s.b.c.l.support.SimpleJobLauncher : Job: [SimpleJob: [name=multiFileItemReaderJob]] launched with the following parameters: [{}]
    2020-03-07 16:55:24.536 INFO 30749 --- [ main] o.s.batch.core.job.SimpleStepHandler : Executing step: [step]
    TestData{id=1, field1='11', field2='12', field3='13'}
    TestData{id=2, field1='21', field2='22', field3='23'}
    TestData{id=3, field1='31', field2='32', field3='33'}
    TestData{id=4, field1='41', field2='42', field3='43'}
    TestData{id=5, field1='51', field2='52', field3='53'}
    TestData{id=6, field1='61', field2='62', field3='63'}
    TestData{id=7, field1='71', field2='72', field3='73'}
    TestData{id=8, field1='81', field2='82', field3='83'}
    2020-03-07 16:55:24.617 INFO 30749 --- [ main] o.s.batch.core.step.AbstractStep : Step: [step] executed in 81ms
    2020-03-07 16:55:24.643 INFO 30749 --- [ main] o.s.b.c.l.support.SimpleJobLauncher : Job: [SimpleJob: [name=multiFileItemReaderJob]] completed with the following parameters: [{}] and the following status: [COMPLETED] in 153ms