How to process large JSON files in Java

Processing large JSON files can be challenging due to memory constraints and performance issues. In this tutorial, we explore best practices for efficiently handling large JSON files in Java.

Why Handle Large JSON Files Efficiently?

Large JSON files can quickly exhaust memory and slow down applications if processed naïvely. By employing streaming APIs and other optimizations, you can reduce memory usage and enhance performance.

Prerequisites

Java Development Kit (JDK) installed
Jackson library for JSON processing

Add the following dependency to your Maven project. It pulls in jackson-core, which provides the streaming API, as a transitive dependency:

<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-databind</artifactId>
    <version>2.15.0</version>
</dependency>

Example JSON File

Suppose we have a JSON file named large-file.json with the following structure:

[
  { "id": 1, "name": "Alice", "email": "[email protected]" },
  { "id": 2, "name": "Bob", "email": "[email protected]" },
  ...
]

Approaches for Handling Large JSON Files

1. Streaming API with Jackson

The Jackson library provides a streaming API that processes JSON data incrementally, reducing memory consumption.

Code Example

import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import java.io.File;
import java.io.IOException;

public class LargeJsonStreamExample {
    public static void main(String[] args) throws IOException {
        File jsonFile = new File("large-file.json");
        JsonFactory jsonFactory = new JsonFactory();

        try (JsonParser parser = jsonFactory.createParser(jsonFile)) {
            while (!parser.isClosed()) {
                JsonToken token = parser.nextToken();

                if (JsonToken.START_OBJECT.equals(token)) {
                    // Parse individual JSON objects
                    while (!JsonToken.END_OBJECT.equals(token)) {
                        token = parser.nextToken();
                        if (JsonToken.FIELD_NAME.equals(token)) {
                            String fieldName = parser.getCurrentName();
                            token = parser.nextToken();
                            System.out.println(fieldName + ": " + parser.getValueAsString());
                        }
                    }
                }
            }
        }
    }
}

Explanation

The JsonParser reads tokens from the JSON file sequentially.
Memory usage stays low because the parser holds only a small read buffer and the current token, never the whole document.
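
The example above assumes flat objects. As a hedged sketch of how the same token loop might cope with nested values, the class below pulls out only the top-level "name" field of each record and uses skipChildren() to jump over anything else, including nested objects and arrays. The class name StreamingNameExtractor, the method forEachName, and the choice of the "name" field are illustrative assumptions, not part of the original tutorial.

```java
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import java.io.IOException;
import java.io.InputStream;
import java.util.function.Consumer;

// Hypothetical helper: streams a top-level JSON array of objects and
// hands the "name" of each record to a callback.
final class StreamingNameExtractor {

    static long forEachName(InputStream in, Consumer<String> handler) throws IOException {
        JsonFactory factory = new JsonFactory();
        long records = 0;
        try (JsonParser parser = factory.createParser(in)) {
            if (parser.nextToken() != JsonToken.START_ARRAY) {
                throw new IOException("Expected a top-level JSON array");
            }
            // Each iteration handles one record of the array.
            while (parser.nextToken() == JsonToken.START_OBJECT) {
                records++;
                // Walk the fields of the current record.
                while (parser.nextToken() == JsonToken.FIELD_NAME) {
                    String field = parser.getCurrentName();
                    JsonToken value = parser.nextToken(); // advance to the field's value
                    if ("name".equals(field) && value == JsonToken.VALUE_STRING) {
                        handler.accept(parser.getText());
                    } else {
                        // Skips a nested object or array entirely; does nothing for scalars.
                        parser.skipChildren();
                    }
                }
            }
        }
        return records;
    }
}
```

Passing a callback instead of returning a list keeps the memory footprint independent of the number of records; the caller decides whether to aggregate, write out, or discard each value.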

2. Reading JSON Line by Line

For JSON files with newline-delimited objects (NDJSON), reading line by line can be effective.

Code Example

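import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class NdjsonReaderExample {
    public static void main(String[] args) {
        String filePath = "large-file.ndjson";

        // Read as UTF-8 explicitly rather than relying on the platform default charset
        try (BufferedReader reader = Files.newBufferedReader(Paths.get(filePath), StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {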
                System.out.println("Processing: " + line);
                // Process each JSON object line by line
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Explanation

Suitable for JSON files where each line is a complete JSON object.
Memory-efficient as only one line is loaded at a time.
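
The reader above only prints each line. One way the per-line processing might look, sketched here with Jackson's ObjectMapper.readTree, is to parse every non-blank line into a JsonNode and pass it to a callback. The class name NdjsonJacksonExample and the method forEachRecord are hypothetical, and skipping blank lines is an assumption about how the input should be treated.

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.BufferedReader;
import java.io.IOException;
import java.util.function.Consumer;

// Hypothetical helper: parses NDJSON one line at a time.
final class NdjsonJacksonExample {

    // ObjectMapper is thread-safe once configured, so a single instance is reused.
    private static final ObjectMapper MAPPER = new ObjectMapper();

    static long forEachRecord(BufferedReader reader, Consumer<JsonNode> handler) throws IOException {
        long count = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue; // assumption: blank lines are tolerated and ignored
            }
            // Throws JsonProcessingException (an IOException) if the line is not valid JSON.
            JsonNode record = MAPPER.readTree(line);
            handler.accept(record);
            count++;
        }
        return count;
    }
}
```

A caller could then read a field with something like record.path("name").asText(), which returns an empty string rather than failing when the field is absent.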

3. Using Jackson’s ObjectReader for Bulk Processing

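When you want data binding without loading the whole document, Jackson’s ObjectReader can return a MappingIterator that deserializes one element at a time. This is more convenient than the raw streaming API, at the cost of building an object for every element.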

Code Example

import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.File;
import java.io.IOException;
import java.util.Map;

public class ChunkedProcessingExample {
    public static void main(String[] args) throws IOException {
        File jsonFile = new File("large-file.json");
        ObjectMapper mapper = new ObjectMapper();

        try (MappingIterator<Map<String, Object>> iterator = mapper.readerFor(Map.class).readValues(jsonFile)) {
            while (iterator.hasNext()) {
                Map<String, Object> jsonObject = iterator.next();
                System.out.println("Processing: " + jsonObject);
            }
        }
    }
}

Explanation

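The MappingIterator deserializes one element of the top-level array at a time, so only the current element is held in memory as a Map.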
Combines ease of use with reasonable memory efficiency.
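
The same iteration can bind each element to a typed class instead of a Map. The following is a minimal sketch under stated assumptions: the User class, its fields (taken from the sample file's id, name, and email), and the method forEachUser are hypothetical names introduced for illustration.

```java
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.IOException;
import java.io.InputStream;
import java.util.function.Consumer;

// Hypothetical helper: iterates a top-level JSON array as typed objects.
final class TypedIterationExample {

    // Assumed shape of one record; unknown fields are ignored rather than rejected.
    @JsonIgnoreProperties(ignoreUnknown = true)
    public static final class User {
        public int id;
        public String name;
        public String email;
    }

    static long forEachUser(InputStream in, Consumer<User> handler) throws IOException {
        ObjectMapper mapper = new ObjectMapper();
        long count = 0;
        // readValues returns a MappingIterator that binds one array element per next() call.
        try (MappingIterator<User> users = mapper.readerFor(User.class).readValues(in)) {
            while (users.hasNext()) {
                handler.accept(users.next());
                count++;
            }
        }
        return count;
    }
}
```

Binding to a small class rather than a Map avoids allocating a hash map and boxed values per record, which can matter when the file holds millions of elements.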

Best Practices

Prefer Streaming APIs: Use streaming for very large files to minimize memory usage.
Split Large Files: When possible, divide large JSON files into smaller parts.
Optimize Data Structures: Use efficient data structures to store and process JSON data.
Validate JSON Early: Validate the JSON format before processing to avoid runtime errors.
Monitor Memory Usage: Use tools like JVisualVM to monitor and optimize memory consumption.
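
As an illustration of the "Split Large Files" advice, here is a hedged sketch that splits an NDJSON file into parts of a fixed number of records using only the JDK. It assumes one JSON object per line; the class name NdjsonSplitter, the method split, and the part-N.ndjson naming scheme are hypothetical. Splitting a single JSON array would instead need one of the streaming approaches described earlier.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits an NDJSON file into smaller NDJSON files.
final class NdjsonSplitter {

    static List<Path> split(Path source, Path outputDir, int linesPerPart) throws IOException {
        if (linesPerPart <= 0) {
            throw new IllegalArgumentException("linesPerPart must be positive");
        }
        Files.createDirectories(outputDir);
        List<Path> parts = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(source, StandardCharsets.UTF_8)) {
            BufferedWriter writer = null;
            try {
                int linesInPart = 0;
                String line;
                while ((line = reader.readLine()) != null) {
                    if (line.trim().isEmpty()) {
                        continue; // assumption: blank lines carry no record
                    }
                    // Start a new part file when none is open or the current one is full.
                    if (writer == null || linesInPart == linesPerPart) {
                        if (writer != null) {
                            writer.close();
                        }
                        Path part = outputDir.resolve("part-" + parts.size() + ".ndjson");
                        writer = Files.newBufferedWriter(part, StandardCharsets.UTF_8);
                        parts.add(part);
                        linesInPart = 0;
                    }
                    writer.write(line);
                    writer.newLine();
                    linesInPart++;
                }
            } finally {
                if (writer != null) {
                    writer.close();
                }
            }
        }
        return parts;
    }
}
```

Only one line and one open writer are held at a time, so the splitter itself stays within a small, constant amount of memory regardless of the input size.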

Conclusion

Handling large JSON files in Java requires careful planning and the right tools. By leveraging Jackson’s streaming API, line-by-line processing, or chunked processing, you can efficiently manage large datasets without running into memory or performance issues.

