Does your application need to process a large amount of data in bulk? Then this article on Spring Batch is the right place to start.

Spring Batch is a lightweight framework for processing large volumes of data at scheduled intervals. Enterprises use it to process billions of transactions every day. It also provides logging/tracing, transaction management, job processing statistics, job restart, skip handling, and resource management.

The diagram below shows how a Spring Batch application is put together.

[Figure: Spring Batch components]

As the diagram shows, a batch application consists of several interconnected components.

Let us look at each of them briefly.

  • JobLauncher: As the name says, it is used to launch a job. A trigger such as a cron schedule or a command-line invocation causes the JobLauncher to launch a job. It works with the Job and the JobRepository.
  • JobRepository: Manages the state of a job and of the steps in its execution. In particular, it holds the job's metadata and statistics.
  • Job: The actual unit of work to be executed. It contains the main processing to be carried out.
  • Step: A job is executed as a sequence of steps. Steps come in two types: chunk-based and tasklet-based. Both are covered below.

To understand the remaining three components (ItemReader, ItemProcessor, and ItemWriter), let us look at the chunk-based model.

When the job is launched by the JobLauncher, these three components come into play.

The data is first read by an ItemReader, then passed to an ItemProcessor that applies business logic, and finally written by an ItemWriter. For example, say you have a job that reads a CSV file containing employee information (name, age, address) and writes only the employees older than 30 to XML. Here you read the CSV file using an ItemReader, keep only the employees older than 30 using an ItemProcessor, and then write the result to XML using an ItemWriter. This is the chunk processing model.
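The read, process, write cycle described above can be sketched in plain Java, without Spring Batch on the classpath. This is only a simplified illustration, not the framework's implementation: `ChunkFlowSketch`, `Employee`, and the methods `read`, `process`, `write`, and `run` are hypothetical names, the input is an in-memory list rather than a real CSV file, and the XML is built by string concatenation purely for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration of the chunk-based model (hypothetical names, no Spring Batch API).
class ChunkFlowSketch {

    // Minimal stand-in for an employee row: name, age, address.
    static final class Employee {
        final String name;
        final int age;
        final String address;

        Employee(String name, int age, String address) {
            this.name = name;
            this.age = age;
            this.address = address;
        }
    }

    // "Reader": turns one CSV line such as "John,34,London" into an Employee.
    static Employee read(String csvLine) {
        String[] parts = csvLine.split(",");
        return new Employee(parts[0].trim(), Integer.parseInt(parts[1].trim()), parts[2].trim());
    }

    // "Processor": keeps employees older than 30 and returns null to drop the rest.
    static Employee process(Employee e) {
        return e.age > 30 ? e : null;
    }

    // "Writer": renders a whole chunk of employees as XML fragments in one call.
    static List<String> write(List<Employee> chunk) {
        List<String> out = new ArrayList<>();
        for (Employee e : chunk) {
            out.add("<employee><name>" + e.name + "</name><age>" + e.age + "</age></employee>");
        }
        return out;
    }

    // Drives the cycle: read and process item by item, write once per chunk.
    // In this sketch a chunk is closed after chunkSize items have been read.
    static List<String> run(List<String> csvLines, int chunkSize) {
        List<String> written = new ArrayList<>();
        List<Employee> chunk = new ArrayList<>();
        int readInChunk = 0;
        for (String line : csvLines) {
            Employee processed = process(read(line));
            readInChunk++;
            if (processed != null) {
                chunk.add(processed);
            }
            if (readInChunk == chunkSize) {
                written.addAll(write(chunk));
                chunk.clear();
                readInChunk = 0;
            }
        }
        if (!chunk.isEmpty()) {
            written.addAll(write(chunk)); // flush the final, partially filled chunk
        }
        return written;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("John,34,London", "Jimmy,39,Sweden", "Renard,21,France");
        for (String xml : run(lines, 2)) {
            System.out.println(xml);
        }
    }
}
```

The point of the split is that the reader and processor deal with one item at a time, while the writer receives a whole group of items at once.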

The other model is the tasklet model. It is used when a step does not need the read, process, and write stages between input and output. If you want to send an email, send a reminder, or stop the job once a certain condition is met, the tasklet-based model is the one to use.
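As a rough illustration of the tasklet idea, the sketch below models a step as a single unit of work that is called repeatedly until it reports that it is done. In Spring Batch itself the corresponding interface is `Tasklet`, whose `execute` method returns a `RepeatStatus`; the `SimpleTasklet` interface, the `runStep` method, and the reminder text here are hypothetical stand-ins so that the example runs with the JDK alone.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration of the tasklet model (hypothetical names, no Spring Batch API).
class TaskletSketch {

    // One unit of work; returns true when finished, false to be called again.
    interface SimpleTasklet {
        boolean execute();
    }

    // Calls the tasklet until it reports completion and returns the number of calls made.
    static int runStep(SimpleTasklet tasklet) {
        int calls = 0;
        boolean finished = false;
        while (!finished) {
            finished = tasklet.execute();
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        List<String> sent = new ArrayList<>();
        // A one-shot task: "send" a reminder, then report that the step is finished.
        SimpleTasklet reminder = () -> {
            sent.add("Reminder: the nightly batch has completed");
            return true;
        };
        int calls = runStep(reminder);
        System.out.println(calls + " call(s), messages: " + sent);
    }
}
```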

Now let's work through an example. Say you have a pipe-delimited text file containing records (name, address, age) and you want to write that data to an XML file. Below is the input file:

examResult.txt

John Kennedy   |   London|   34
Jimmy Snuka    |   Sweden  |   39
Renard konig   |   France  |   21

And here is the mapped POJO, with fields corresponding to the columns of the file above:

com.techninfo.springbatch.ExamResult

package com.techninfo.springbatch;

import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;

// Domain object for one input row; the JAXB annotations map its properties to XML tags.
@XmlRootElement(name = "ExamResult")
public class ExamResult {

    private String studentName;
    private String address;
    private String age;

    @XmlElement(name = "studentName")
    public String getStudentName() {
        return studentName;
    }

    public void setStudentName(String studentName) {
        this.studentName = studentName;
    }

    @XmlElement(name = "address")
    public String getAddress() {
        return address;
    }

    public void setAddress(String address) {
        this.address = address;
    }

    @XmlElement(name = "age")
    public String getAge() {
        return age;
    }

    public void setAge(String age) {
        this.age = age;
    }

    // Readable form for log output.
    @Override
    public String toString() {
        return "ExamResult [studentName=" + studentName + ", address=" + address + ", age=" + age + "]";
    }
}

Also note that we have used JAXB annotations in order to map the class properties to XML tags.

 

Step 4: Create a FieldSetMapper

FieldSetMapper is responsible for mapping each field from the input to a domain object.

com.techninfo.springbatch.ExamResultFieldSetMapper

package com.techninfo.springbatch;

import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.validation.BindException;

// Maps the tokens of one input line, by position, onto an ExamResult.
public class ExamResultFieldSetMapper implements FieldSetMapper<ExamResult> {

    @Override
    public ExamResult mapFieldSet(FieldSet fieldSet) throws BindException {
        ExamResult result = new ExamResult();
        result.setStudentName(fieldSet.readString(0)); // first column: name
        result.setAddress(fieldSet.readString(1));     // second column: address
        result.setAge(fieldSet.readString(2));         // third column: age
        return result;
    }
}
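To make the index-based mapping concrete, here is a plain-Java sketch of what conceptually happens to one input line before the mapper sees it: the line is split on the delimiter, and the resulting tokens are addressed by position (0, 1, 2). `LineTokenizerSketch` and `tokenize` are hypothetical names; the code only imitates the idea of a delimited tokenizer and makes no claim about how Spring Batch implements it.

```java
import java.util.Arrays;

// Plain-Java illustration of delimiter-based tokenizing (hypothetical names, no Spring Batch API).
class LineTokenizerSketch {

    // Splits a pipe-delimited line into tokens and trims surrounding whitespace from each one.
    static String[] tokenize(String line) {
        // "|" is a regex metacharacter, so it must be escaped; -1 keeps trailing empty tokens.
        String[] tokens = line.split("\\|", -1);
        for (int i = 0; i < tokens.length; i++) {
            tokens[i] = tokens[i].trim();
        }
        return tokens;
    }

    public static void main(String[] args) {
        String[] tokens = tokenize("John Kennedy   |   London|   34");
        // Index 0 is the name, 1 the address, 2 the age, matching the mapper's readString calls.
        System.out.println(Arrays.toString(tokens));
    }
}
```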

Step 5: Create an ItemProcessor

ItemProcessor is optional, and is called after an item is read but before it is written. It gives us the opportunity to apply business logic to each item. In our case, for example, we will filter out all the items whose age is less than 30, so the final result will only contain records with age >= 30.

com.techninfo.springbatch.ExamResultItemProcessor

package com.techninfo.springbatch;

import org.springframework.batch.item.ItemProcessor;

public class ExamResultItemProcessor implements ItemProcessor<ExamResult, ExamResult> {

    @Override
    public ExamResult process(ExamResult result) throws Exception {
        System.out.println("Processing result :" + result);

        /*
         * Only return results whose age is 30 or more.
         * Returning null tells Spring Batch to filter the item out.
         */
        // age is held as a String, so it is parsed before the comparison
        if (Integer.parseInt(result.getAge().trim()) < 30) {
            return null;
        }

        return result;
    }
}
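The filtering rule itself can be checked in isolation. The sketch below applies the same "age below 30 is dropped" rule to plain string arrays, so it runs without Spring Batch; `AgeFilterSketch`, `MIN_AGE`, and `filter` are hypothetical names, and the record layout (name, address, age) is assumed from the input file shown earlier.

```java
// Plain-Java illustration of the processor's filter rule (hypothetical names, no Spring Batch API).
class AgeFilterSketch {

    static final int MIN_AGE = 30;

    // Returns the record unchanged when its age is at least MIN_AGE, or null to filter it out.
    static String[] filter(String[] record) {
        // The age arrives as text, possibly padded with spaces, so trim before parsing.
        int age = Integer.parseInt(record[2].trim());
        return age < MIN_AGE ? null : record;
    }

    public static void main(String[] args) {
        String[][] records = {
            {"John Kennedy", "London", "34"},
            {"Jimmy Snuka", "Sweden", "39"},
            {"Renard konig", "France", " 21"}
        };
        for (String[] record : records) {
            String verdict = filter(record) == null ? "filtered out" : "kept";
            System.out.println(record[0] + ": " + verdict);
        }
    }
}
```

Note that a record with age exactly 30 is kept, which matches the age >= 30 condition stated above.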

Step 7: Create Spring Context with job configuration

src/main/resources/spring-batch-context.xml

<beans xmlns="http://www.springframework.org/schema/beans"
    xmlns:batch="http://www.springframework.org/schema/batch"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch-3.0.xsd
        http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-4.0.xsd">

    <!-- JobRepository and JobLauncher are configuration/setup classes -->
    <bean id="jobRepository" class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean" />

    <bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
        <property name="jobRepository" ref="jobRepository" />
    </bean>

    <!-- ItemReader reads the input file one complete line at a time -->
    <bean id="flatFileItemReader" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
        <property name="resource" value="classpath:examResult.txt" />

        <property name="lineMapper">
            <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">

                <property name="fieldSetMapper">
                    <!-- Mapper which maps the individual items in a record to properties of the POJO -->
                    <bean class="com.techninfo.springbatch.ExamResultFieldSetMapper" />
                </property>

                <property name="lineTokenizer">
                    <!-- A tokenizer class to be used when items in the input record are separated by specific characters -->
                    <bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
                        <property name="delimiter" value="|" />
                    </bean>
                </property>

            </bean>
        </property>
    </bean>

    <!-- XML ItemWriter which writes the data in XML format -->
    <bean id="xmlItemWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter">
        <property name="resource" value="file:xml/examResult.xml" />
        <property name="rootTagName" value="UniversityExamResultList" />

        <property name="marshaller">
            <bean class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
                <property name="classesToBeBound">
                    <list>
                        <value>com.techninfo.springbatch.ExamResult</value>
                    </list>
                </property>
            </bean>
        </property>
    </bean>

    <!-- Optional ItemProcessor to perform business logic/filtering on the input records -->
    <bean id="itemProcessor" class="com.techninfo.springbatch.ExamResultItemProcessor" />

    <!-- Step will need a transaction manager -->
    <bean id="transactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager" />

    <!-- Actual Job -->
    <batch:job id="examResultJob">
        <batch:step id="step1">
            <batch:tasklet transaction-manager="transactionManager">
                <batch:chunk reader="flatFileItemReader" writer="xmlItemWriter" processor="itemProcessor" commit-interval="10" />
            </batch:tasklet>
        </batch:step>
    </batch:job>

</beans>

As you can see, we have set up a job with a single step. The step uses FlatFileItemReader to read the records, itemProcessor to process them, and StaxEventItemWriter to write them. commit-interval specifies the number of items processed before the transaction is committed, which is also when the write happens. Grouping several records into a single transaction and writing them as one chunk improves performance. Although none is configured here, a job listener can be added to run arbitrary logic before and after the job.
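To see what a commit interval of 10 means in practice, the following plain-Java sketch splits a list of items into consecutive groups of at most that size; each group stands for one transaction. `CommitIntervalSketch` and `chunks` are hypothetical names, and the 25 input items are made up for the illustration. It models only the grouping, not transactions or restart behaviour.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration of grouping items by commit interval (hypothetical names, no Spring Batch API).
class CommitIntervalSketch {

    // Splits items into consecutive chunks holding at most commitInterval items each.
    static <T> List<List<T>> chunks(List<T> items, int commitInterval) {
        if (commitInterval <= 0) {
            throw new IllegalArgumentException("commitInterval must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        for (int start = 0; start < items.size(); start += commitInterval) {
            int end = Math.min(start + commitInterval, items.size());
            result.add(new ArrayList<>(items.subList(start, end)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 1; i <= 25; i++) {
            items.add(i);
        }
        // With 25 items and an interval of 10, the last chunk is only partially filled.
        for (List<Integer> chunk : chunks(items, 10)) {
            System.out.println("commit after writing " + chunk.size() + " item(s)");
        }
    }
}
```

A larger interval means fewer commits but more work repeated if a chunk fails, so the value is a trade-off rather than a fixed rule.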

Step 8: Create Main application to finally run the job

Create a Java application to run the job.

com.techninfo.springbatch.Main

package com.techninfo.springbatch;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobExecutionException;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;

public class Main {

    @SuppressWarnings("resource")
    public static void main(String[] args) {

        // Load the Spring context that defines the job, its step, and the launcher
        ApplicationContext context = new ClassPathXmlApplicationContext("spring-batch-context.xml");

        JobLauncher jobLauncher = (JobLauncher) context.getBean("jobLauncher");
        Job job = (Job) context.getBean("examResultJob");

        try {
            // Launch the job with empty parameters and report its final status
            JobExecution execution = jobLauncher.run(job, new JobParameters());
            System.out.println("Job Exit Status : " + execution.getStatus());
        } catch (JobExecutionException e) {
            System.out.println("Job ExamResult failed");
            e.printStackTrace();
        }
    }
}

When you run the above program as a Java application, all input records are processed. Below is the generated XML file, found in the project's xml folder:

<?xml version="1.0" encoding="UTF-8"?>
<UniversityExamResultList>
    <ExamResult>
        <age>34</age>
        <address>London</address>
        <studentName>John Kennedy </studentName>
    </ExamResult>
    <ExamResult>
        <age>39</age>
        <address>Sweden</address>
        <studentName>Jimmy Snuka </studentName>
    </ExamResult>
</UniversityExamResultList>

That's all for a basic understanding of Spring Batch. For more information, visit Spring Batch.