Batch processing in Mule 4

What is batch processing?

  • Batch processing allows for the efficient processing of records in groups or batches.

  • Batch processing is done by a Batch Job containing one or more Batch Steps.

Configuration of Batch Job:

Mule splits the incoming records into blocks based on the batch block size. For example, if there are 10,000 records and the block size is set to 100, the records are divided into 100 blocks. A batch job instance, by contrast, is created each time the batch job is executed.

The scheduling strategy can be set to either ORDERED_SEQUENTIAL or ROUND_ROBIN.

Configuration Parameters:

  • Name: Denotes the batch job name.

  • Max Failed Records: Defines how many records may fail before the batch job is halted. The default value of 0 halts the job on the first failed record. A value of -1 means the job never halts, regardless of the number of failed records.

  • Scheduling Strategy: Determines how batch job instances are executed. Choices include:

      • ORDERED_SEQUENTIAL (default): Job instances run one at a time, ordered by their creation timestamp.

      • ROUND_ROBIN: All available batch job instances execute using a round-robin algorithm.

  • Job Instance ID: An optional expression that gives each batch job instance a unique, recognizable identifier.

  • Batch Block Size: Defaults to 100. Defines the number of records in each block assigned to an execution thread.
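As a rough illustration, the parameters above map onto attributes of the batch:job element in a Mule 4 configuration file. The sketch below is a fragment that assumes it sits inside a flow in a mule document with the batch namespace declared. The job name, the step name, and the jobInstanceId expression are made-up placeholders, not values from any real project.

```xml
<!-- Fragment: place inside a <flow>; assumes the batch namespace is declared on <mule>. -->
<!-- jobName: Name | maxFailedRecords: Max Failed Records (-1 = never halt) -->
<!-- schedulingStrategy: ORDERED_SEQUENTIAL (default) or ROUND_ROBIN -->
<!-- jobInstanceId: hypothetical expression producing a unique ID per instance -->
<!-- blockSize: Batch Block Size (100 is the default, shown here explicitly) -->
<batch:job jobName="customerSyncBatchJob"
           maxFailedRecords="-1"
           schedulingStrategy="ROUND_ROBIN"
           jobInstanceId="#['customer-sync-' ++ uuid()]"
           blockSize="100">
    <batch:process-records>
        <batch:step name="logRecordStep">
            <logger level="INFO" message="#[payload]"/>
        </batch:step>
    </batch:process-records>
</batch:job>
```

Any attribute left out falls back to its default, so a job that only sets jobName would halt on the first failed record and use ORDERED_SEQUENTIAL scheduling.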

Phases of Batch Jobs

Each batch job contains three different phases:

1. Load and Dispatch: This is an implicit phase that works behind the scenes. In this phase, Mule turns the serialized message payload into a collection of records for processing in the batch steps.

2. Process: This is the mandatory phase of the batch. It can have one or more batch steps to asynchronously process the records.

3. On Complete: This is the optional phase of the batch. It provides a summary of the records processed, showing how many records succeeded and how many failed, so that the developer can address any issues.

The result summary exposes counters such as failedRecords, loadedRecords, processedRecords, successfulRecords, and totalRecords.
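The three phases can be sketched in configuration roughly as follows. This is a minimal example rather than a complete application: the flow, job, and step names are hypothetical, the flow has no event source, and the surrounding mule element with its namespace declarations is assumed. In the On Complete phase the payload is the job result object, so its counters can be read directly.

```xml
<flow name="batchPhasesExampleFlow">
    <!-- Load and Dispatch happens implicitly when the message reaches batch:job. -->
    <batch:job jobName="phasesExampleBatchJob">
        <!-- Process phase (mandatory): one or more batch steps. -->
        <batch:process-records>
            <batch:step name="firstStep">
                <logger level="INFO" message="#[payload]"/>
            </batch:step>
        </batch:process-records>
        <!-- On Complete phase (optional): the payload here is the job result summary. -->
        <batch:on-complete>
            <logger level="INFO"
                    message="#['Total: $(payload.totalRecords), successful: $(payload.successfulRecords), failed: $(payload.failedRecords)']"/>
        </batch:on-complete>
    </batch:job>
</flow>
```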

Batch Step

Each batch step can contain processors that work on individual records for any type of processing, such as enrichment, transformation, or routing. This helps segregate the processing logic.

A batch step offers two options for filtering the records it processes:

Accept Expression – an expression that must evaluate to true for a record to be processed, e.g. #[payload.age > 27]

Accept Policy – can hold only the predefined values below

  1. NO_FAILURES (Default) – Batch step processes only those records that were processed successfully in all preceding steps.

  2. ONLY_FAILURES – Batch step processes only those records that failed to process in a preceding batch step.

  3. ALL – Batch step processes all records, regardless of whether they failed to process in a preceding batch step.
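The two filters could be applied to steps along these lines. This is a sketch only: the step names are hypothetical, the loggers stand in for real processing, and the fragment assumes it is placed inside a batch:job element.

```xml
<batch:process-records>
    <!-- Runs only for records with age above 27 that have not failed so far (default NO_FAILURES policy). -->
    <batch:step name="adultRecordsStep" acceptExpression="#[payload.age > 27]">
        <logger level="INFO" message="#[payload]"/>
    </batch:step>
    <!-- Runs only for records that failed in an earlier step. -->
    <batch:step name="failureHandlingStep" acceptPolicy="ONLY_FAILURES">
        <logger level="WARN" message="#[payload]"/>
    </batch:step>
</batch:process-records>
```

For the second step to receive any records, the enclosing job would likely need a Max Failed Records setting that tolerates failures (for example -1), since the default of 0 halts the job on the first failure.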

Batch Aggregator

As the name suggests, a Batch Aggregator executes its processors on records in bulk to improve performance. Most target systems today (for example Salesforce, a database, or a REST API) accept a collection of records, and sending records one at a time can become a performance bottleneck. The Batch Aggregator helps by sending data to the target system in bulk.

The size of a Batch Aggregator can be defined in one of the following ways (the options are mutually exclusive):

Aggregator size – processes a fixed number of records at a time, e.g. 10 records
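A minimal sketch of an aggregator inside a batch step, assuming a fixed aggregator size of 10. The step name is hypothetical and the loggers are placeholders for real per-record and bulk operations. The fragment belongs inside batch:process-records.

```xml
<batch:step name="bulkUpsertStep">
    <!-- Per-record processing runs first, once for each record. -->
    <logger level="DEBUG" message="#[payload]"/>
    <batch:aggregator size="10">
        <!-- Inside the aggregator, payload is an array of up to 10 records. -->
        <logger level="INFO" message="#['Sending $(sizeOf(payload)) records in one call']"/>
        <!-- A bulk connector operation (database bulk insert, Salesforce upsert, and so on) would replace this logger. -->
    </batch:aggregator>
</batch:step>
```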