Merge Multiple JSON Files

Overview

Sometimes it's necessary to merge multiple structured files into a single file, for example prior to a bulk SQL execution. This guide explains how to use the MergeRecord processor to do so. For this project we used GenerateRecord to create many JSON documents containing auto-generated first and last name attributes, then split that result set into individual JSON files.

Initial Setup

The content of each file looks something like:

{
  "first_name" : "Krystina",
  "last_name" : "Carroll"
}

Processor Configuration

MergeRecord depends on two Controller Services: a Record Reader and a Record Writer. Since the input JSON objects are flat, we can use the default values when creating each Controller Service. For the Reader we'll need a new JsonTreeReader service. Since this component will most likely be reused across multiple integration workflows, we'll create it scoped to the Root canvas and name it 'Inferred Json Reader'.

Inferred Json Reader Setup

We also need a JsonRecordSetWriter, which will be used to create the JSON output. This Controller Service should also be created at the Root canvas to allow for reuse across different flows. As above, the default values are fine.

Json Record Set Writer Setup

Ensure that both Controller Services are enabled, then configure the MergeRecord processor to use these services. Additionally, the Minimum Number of Records and Max Bin Age properties will need to be configured.

In the example below, Minimum Number of Records is set to 1000 and Max Bin Age is 5s. This tells the processor to release the merged file once it holds 1000 records or 5 seconds have elapsed since the bin received its first file, whichever comes first. Note: If Max Bin Age is left empty, the processor will wait indefinitely until 1000 entries are ready to be merged.

Merge Record Configuration
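The release rule described above can be modelled in a few lines of Python. This is only a simplified sketch to make the two thresholds concrete, not NiFi's actual implementation: the Bin class and its field names are hypothetical, and it assumes the bin's age is measured from the moment the first record is added.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Bin:
    """Hypothetical stand-in for a merge bin; not a NiFi class."""
    min_records: int                      # models Minimum Number of Records
    max_age_s: Optional[float]            # models Max Bin Age; None means "left empty"
    records: List[dict] = field(default_factory=list)
    created_at: Optional[float] = None    # time the first record arrived (assumption)

    def add(self, record: dict, now: float) -> None:
        # The age clock starts when the first record lands in the bin.
        if self.created_at is None:
            self.created_at = now
        self.records.append(record)

    def ready(self, now: float) -> bool:
        # An empty bin is never released.
        if not self.records:
            return False
        # Threshold 1: enough records have accumulated.
        if len(self.records) >= self.min_records:
            return True
        # With no age limit, keep waiting for more records indefinitely.
        if self.max_age_s is None:
            return False
        # Threshold 2: the bin has been open long enough.
        return now - self.created_at >= self.max_age_s
```

With min_records=1000 and max_age_s=5.0, a bin holding only 382 records is not ready at now=1.0 but becomes ready at now=5.0. The same bin with max_age_s=None never becomes ready until it reaches 1000 records.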

Result

Once the merge is complete, the result will be 10 flowfiles. The first nine flowfiles each contain 1000 elements, and the tenth contains the remaining 382:

Result Queue
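The flowfile counts follow from simple division. The sketch below assumes the input was 9382 records in total, a figure inferred from the nine full files plus the 382 remaining; merged_file_sizes is a hypothetical helper name used only for illustration.

```python
from typing import List


def merged_file_sizes(total_records: int, batch_size: int) -> List[int]:
    """Return the record count of each merged file, in release order."""
    full_batches, remainder = divmod(total_records, batch_size)
    sizes = [batch_size] * full_batches
    # A final, smaller file is produced only if records are left over
    # (in the flow above, this one is released by Max Bin Age).
    if remainder:
        sizes.append(remainder)
    return sizes


sizes = merged_file_sizes(9382, 1000)
print(len(sizes), sizes[-1])  # 10 382
```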

The content of each flowfile is now a JSON array:

Result Content
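To make the output shape concrete, here is a minimal Python sketch that combines individual JSON documents into one JSON array. It only illustrates the before and after formats and does not reflect how MergeRecord works internally. The function name merge_json_documents and the second sample record are made up for the example.

```python
import json
from typing import Iterable


def merge_json_documents(documents: Iterable[str]) -> str:
    """Parse each single-object JSON document and emit one JSON array."""
    records = [json.loads(doc) for doc in documents]
    return json.dumps(records, indent=2)


inputs = [
    '{"first_name": "Krystina", "last_name": "Carroll"}',
    '{"first_name": "Alex", "last_name": "Example"}',  # hypothetical record
]
merged = merge_json_documents(inputs)
print(merged)
```

Parsing the merged string with json.loads returns a list of two objects, each with the same first_name and last_name keys as the original files.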