Parsing Large JSON Files Without Crashing

Why JSON.parse Fails on Large Files

JSON.parse() needs the entire JSON text in memory as a single string and parses it synchronously. For small files (under a few MB) this is effectively instant. For large files, three problems appear:

  • A 500 MB JSON file needs at least 500 MB for the string alone, plus memory for the parsed objects; the total is often 2–4× the raw file size
  • The synchronous parse blocks the Node.js event loop for seconds, so no other requests or timers run in the meantime
  • If the data exceeds the available heap, Node.js crashes with `JavaScript heap out of memory`

How Large Is Too Large?

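The default heap limit depends on the Node.js version and the machine's memory, so check it rather than assuming a value. On a server where the heap is capped at around 512 MB, JSON.parse() starts to struggle at around 50–100 MB of input. You can raise the limit with `--max-old-space-size=4096` (the value is in MB), but this is a workaround, not a solution for truly large files.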

bash
node --max-old-space-size=4096 script.js
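
To see which limit actually applies to a given process, Node's built-in v8 module can report it. This is a minimal sketch; the helper name heapLimitMB is made up for illustration.

```javascript
import { getHeapStatistics } from 'v8';

// Hypothetical helper: returns the V8 heap size limit of the current
// process, rounded to whole megabytes.
function heapLimitMB() {
  return Math.round(getHeapStatistics().heap_size_limit / 1024 / 1024);
}

console.log(`Heap limit: ${heapLimitMB()} MB`);
```

Running it once with and once without `--max-old-space-size` shows whether the flag took effect.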

Option 1: Streaming Parser (Best for Large Files)

A streaming JSON parser reads the file incrementally instead of loading it all into memory. A widely used choice is the stream-json package, used here together with its companion package stream-chain:

bash
npm install stream-json stream-chain

Process a large array of objects one item at a time:

javascript
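import { createReadStream } from 'fs';
// stream-json 1.x and stream-chain 2.x are CommonJS packages, so import the
// default export and destructure it (named imports may fail under Node's ESM loader)
import streamChain from 'stream-chain';
import streamJson from 'stream-json';
import StreamArray from 'stream-json/streamers/StreamArray.js';

const { chain } = streamChain;
const { parser } = streamJson;
const { streamArray } = StreamArray;

const pipeline = chain([
  createReadStream('large-data.json'),
  parser(),
  streamArray(),
]);

let count = 0;

pipeline.on('data', ({ value }) => {
  // value is one fully assembled element of the top-level array
  processRecord(value);
  count++;
});

pipeline.on('end', () => {
  console.log(`Processed ${count} records`);
});

// Malformed JSON and file errors surface here
pipeline.on('error', (err) => {
  console.error('Parse failed:', err.message);
});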

This keeps only the record being processed, plus a small amount of stream buffer, in memory. Even a 10 GB file can be processed in roughly constant memory, provided each individual array element is small.

Option 2: JSON Lines / NDJSON Format

If you control the data source, use newline-delimited JSON (NDJSON, also called JSON Lines) instead of one giant JSON array. Each line is a complete JSON value, usually an object:

{"id": 1, "name": "Ravi"}
{"id": 2, "name": "Priya"}
{"id": 3, "name": "Arjun"}

Read line by line with readline:

javascript
import { createReadStream } from 'fs';
import { createInterface } from 'readline';

const rl = createInterface({
  input: createReadStream('data.jsonl'),
  crlfDelay: Infinity,
});

for await (const line of rl) {
  if (line.trim()) {
    const record = JSON.parse(line);
    await processRecord(record);
  }
}

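For data you control, this is usually the simplest and fastest approach: each line is parsed independently with the native JSON.parse(), so memory use is bounded by the longest line rather than by the file size.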
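
One practical gap in the loop above is that a single malformed line aborts the run with no hint of where it occurred. The following sketch wraps the same readline approach in an async generator that reports the line number; the name readNdjson is an assumption made for illustration, not part of any library.

```javascript
import { createReadStream } from 'fs';
import { createInterface } from 'readline';

// Hypothetical helper: yields one parsed record per non-empty line and
// includes the line number in the error when a line is not valid JSON.
async function* readNdjson(path) {
  const rl = createInterface({
    input: createReadStream(path),
    crlfDelay: Infinity,
  });
  let lineNumber = 0;
  for await (const line of rl) {
    lineNumber++;
    if (!line.trim()) continue; // skip blank lines
    let record;
    try {
      record = JSON.parse(line);
    } catch (err) {
      throw new Error(`Invalid JSON on line ${lineNumber} of ${path}: ${err.message}`);
    }
    yield record;
  }
}
```

It can be consumed with `for await (const record of readNdjson('data.jsonl'))`, which reads the next line only after the loop body has finished with the previous record.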

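Option 3: Batch Processing After a Full Parse

If you must keep standard JSON and the file is only moderately large (100–500 MB, and comfortably within your heap limit), you can parse it in one call and process the result in batches. This does not reduce peak memory, because the whole file is still read and parsed up front; it keeps the event loop responsive and limits how much work is in flight at once. A file longer than V8's maximum string length (see `buffer.constants.MAX_STRING_LENGTH`) cannot be read into a string at all and needs one of the streaming options.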

javascript
import { readFileSync } from 'fs';

const data = JSON.parse(readFileSync('data.json', 'utf8'));
const BATCH = 1000;

for (let i = 0; i < data.length; i += BATCH) {
  const batch = data.slice(i, i + BATCH);
  await processBatch(batch);
  // Yield to the event loop so pending I/O and timers can run between batches
  await new Promise(resolve => setImmediate(resolve));
}
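
The index arithmetic in the loop above can be factored into a small generator, which keeps the batching logic in one place. This is only a sketch, and inBatches is a hypothetical name.

```javascript
// Hypothetical helper: yields consecutive slices of `items`, each holding
// at most `size` elements.
function* inBatches(items, size) {
  for (let i = 0; i < items.length; i += size) {
    yield items.slice(i, i + size);
  }
}

const sizes = [...inBatches([1, 2, 3, 4, 5], 2)].map((batch) => batch.length);
console.log(sizes); // [ 2, 2, 1 ]
```

Each slice is a shallow copy, so the extra memory per batch is one small array of references, not a copy of the records themselves.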

Option 4: Use a Database

If you repeatedly query large JSON datasets, import them into SQLite, PostgreSQL, or MongoDB instead of parsing on every request:

bash
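# SQLite: flatten the JSON array to CSV with jq, then import it
sqlite3 data.db "CREATE TABLE IF NOT EXISTS users (id INTEGER, name TEXT);"
jq -r '.[] | [.id, .name] | @csv' data.json > users.csv
sqlite3 data.db ".import --csv users.csv users"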

Databases are indexed, queryable, and do not hold the entire dataset in memory.

Measuring Parse Time

Use console.time() to see how long a parse blocks the event loop:

javascript
console.time('parse');
const data = JSON.parse(jsonString);
console.timeEnd('parse'); // prints the elapsed time, e.g. about 2.3 s for a 200 MB file
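
Time is only half of the picture, since the parsed objects usually take more memory than the raw text. The sketch below records both elapsed time and the change in used heap; measureParse is a hypothetical helper, and the heap figure is approximate because garbage collection can run at any point.

```javascript
import { performance } from 'perf_hooks';

// Hypothetical helper: parses `jsonString` and reports the elapsed time in
// milliseconds and the approximate change in used heap in megabytes.
function measureParse(jsonString) {
  const heapBefore = process.memoryUsage().heapUsed;
  const start = performance.now();
  const data = JSON.parse(jsonString);
  const ms = performance.now() - start;
  const heapDeltaMB = (process.memoryUsage().heapUsed - heapBefore) / 1024 / 1024;
  return { data, ms, heapDeltaMB };
}

// Small synthetic input, used only to show the call shape.
const sample = JSON.stringify(Array.from({ length: 100000 }, (_, id) => ({ id })));
const { ms, heapDeltaMB } = measureParse(sample);
console.log(`parse: ${ms.toFixed(1)} ms, heap change: ${heapDeltaMB.toFixed(1)} MB`);
```

Comparing the heap change with the length of the input string gives a rough multiplier for your own data, which is more useful than a generic estimate.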

Tips for Generating Large JSON Files

When writing large JSON output, stream it instead of building a giant in-memory string:

javascript
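import { createWriteStream } from 'fs';
import { once } from 'events';

const out = createWriteStream('output.json');

// write() returns false when the stream's buffer is full; waiting for
// 'drain' keeps pending output from piling up in memory
const write = async (chunk) => {
  if (!out.write(chunk)) await once(out, 'drain');
};

await write('[\n');

for (let i = 0; i < records.length; i++) {
  await write(JSON.stringify(records[i]));
  if (i < records.length - 1) await write(',\n');
}

out.end('\n]\n');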
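
If the consumer accepts NDJSON (Option 2), the writer gets simpler because there are no brackets or commas to track. The sketch below assumes the records come from any sync or async iterable; writeNdjson is a hypothetical helper name.

```javascript
import { createWriteStream } from 'fs';
import { Readable } from 'stream';
import { pipeline } from 'stream/promises';

// Hypothetical helper: writes each record as one JSON line. `records` can be
// an array, a generator, or an async iterable such as a database cursor.
async function writeNdjson(records, path) {
  async function* lines() {
    for await (const record of records) {
      yield JSON.stringify(record) + '\n';
    }
  }
  // pipeline() handles backpressure and rejects if either stream fails.
  await pipeline(Readable.from(lines()), createWriteStream(path));
}
```

Because the source is pulled lazily, rows never need to exist as one in-memory array before they are written.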

For a quick browser-based check of a large JSON file's structure, paste the first few thousand characters into JSONKit's formatter at /json-formatter to verify the top-level shape without loading the entire file.
