The Problem with Large JSON Arrays
Most JSON parsers, including JavaScript's JSON.parse and Python's json.load, need the entire document in memory before they return anything. A 2 GB array of log entries needs at least 2 GB of RAM before you see a single record. For large datasets, this is a dealbreaker.
[
  { "id": 1, "event": "page_view", "url": "/home" },
  { "id": 2, "event": "click", "url": "/pricing" }
]
The opening [ and closing ] mean a parser cannot know the array is done until it reads the very last byte.
What is NDJSON?
NDJSON (Newline Delimited JSON) — also called JSON Lines — is a simple convention: one complete, valid JSON value per line, separated by newline characters. No wrapping array. No commas between records.
{ "id": 1, "event": "page_view", "url": "/home" }
{ "id": 2, "event": "click", "url": "/pricing" }
{ "id": 3, "event": "purchase", "amount": 49.99 }
Each line is independently parseable. You can process record 1 before reading record 2. You can start reading from the middle of the file by skipping ahead to the next newline. You can append records without rewriting the file.
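Starting from the middle of a file can be sketched as follows. This is a minimal illustration, not a library API: the function name read_from_offset and the sample data are assumptions made for the example. The idea is to seek to an arbitrary byte offset, discard the partial line you landed in, and parse whole lines from there.

```python
import json
import os
import tempfile


def read_from_offset(path, offset):
    """Yield records from an NDJSON file, starting near a byte offset."""
    with open(path, "rb") as f:
        if offset > 0:
            # Step back one byte and discard through the next newline.
            # If the offset already sits on a line boundary, this consumes
            # only the previous line's trailing newline; otherwise it drops
            # the partial record we landed in.
            f.seek(offset - 1)
            f.readline()
        for raw in f:
            if raw.strip():
                yield json.loads(raw)


# Demo with a small temporary file (hypothetical data).
records = [{"id": i, "event": "click"} for i in range(1, 6)]
with tempfile.NamedTemporaryFile(
    "w", suffix=".ndjson", delete=False, encoding="utf-8"
) as tmp:
    for r in records:
        tmp.write(json.dumps(r) + "\n")
    demo_path = tmp.name

middle = os.path.getsize(demo_path) // 2
tail = list(read_from_offset(demo_path, middle))
os.remove(demo_path)
print([r["id"] for r in tail])  # ids of the records after the midpoint
```

The same trick lets several workers split one large file by byte ranges, as long as each worker also stops at the first line boundary past the end of its range.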
File Extension and MIME Type
- Extension: .ndjson or .jsonl
- MIME type: application/x-ndjson
- The JSON Lines site (jsonlines.org) uses .jsonl; the NDJSON spec uses .ndjson — both are the same format
Reading NDJSON in Node.js
Line by line without loading the full file:
import { createReadStream } from "fs";
import { createInterface } from "readline";
// Top-level await below requires an ES module (.mjs or "type": "module")
const rl = createInterface({
  input: createReadStream("events.ndjson"),
  crlfDelay: Infinity, // treat \r\n as a single line break
});
for await (const line of rl) {
  if (!line.trim()) continue; // skip blank lines
  const record = JSON.parse(line);
  console.log(record.event); // process each record as it arrives
}
Memory usage scales with the size of a single line, not the size of the file.
Writing NDJSON in Node.js
import { createWriteStream } from "fs";
const out = createWriteStream("events.ndjson");
const records = [
  { id: 1, event: "page_view" },
  { id: 2, event: "click" },
];
for (const record of records) {
  // JSON.stringify without an indent argument never emits raw newlines,
  // so each record stays on a single line
  out.write(JSON.stringify(record) + "\n");
}
out.end();
Reading NDJSON in Python
import json
with open("events.ndjson", "r", encoding="utf-8") as f:
    for line in f:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        print(record["event"])
Appending Records
One of NDJSON's biggest advantages is that you can append new records without rewriting the existing contents of the file:
import { appendFileSync } from "fs";
function logEvent(event) {
  appendFileSync("events.ndjson", JSON.stringify(event) + "\n");
}
This is why NDJSON is popular for log files, event streams, and audit trails.
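The same append-only pattern can be sketched in Python. The function names log_event and read_events, the file name, and the sample records are assumptions for illustration only. The sketch also shows a reader that skips a line that fails to parse and reports its line number, so one damaged record does not block the rest of the log.

```python
import json
import os
import tempfile


def log_event(path, event):
    """Append one record as a single line; existing lines are not rewritten."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")


def read_events(path):
    """Return (records, bad_line_numbers), skipping lines that fail to parse."""
    records, bad = [], []
    with open(path, "r", encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            if not line.strip():
                continue
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError:
                bad.append(lineno)
    return records, bad


# Demo in a temporary directory (hypothetical file name and data).
with tempfile.TemporaryDirectory() as tmpdir:
    log_path = os.path.join(tmpdir, "events.ndjson")
    log_event(log_path, {"id": 1, "event": "page_view"})
    with open(log_path, "a", encoding="utf-8") as f:
        f.write('{"id": 2, "event": "cli\n')  # simulate a damaged record
    log_event(log_path, {"id": 3, "event": "purchase"})
    events, bad_lines = read_events(log_path)

print(len(events), bad_lines)
```

Whether to skip, log, or fail on a bad line is a design choice; skipping suits best-effort log analysis, while an audit trail would more likely treat it as an error.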
NDJSON in HTTP APIs
You can stream NDJSON over HTTP by setting the Content-Type to application/x-ndjson and flushing each record as it's ready — the client processes records as they arrive instead of waiting for the full response.
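On the receiving side, network chunks rarely line up with record boundaries, so a client has to buffer bytes until it sees a newline. The sketch below assumes the response body arrives as an iterable of byte chunks. The function name iter_ndjson and the sample chunks are hypothetical, and no real HTTP library is involved.

```python
import json


def iter_ndjson(chunks):
    """Yield one parsed record per complete line from an iterable of byte chunks."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # Everything before the last newline is complete; the remainder may
        # be the start of a record that continues in the next chunk.
        *lines, buffer = buffer.split(b"\n")
        for line in lines:
            if line.strip():
                yield json.loads(line)
    if buffer.strip():
        # The final record may arrive without a trailing newline.
        yield json.loads(buffer)


# Simulated network chunks (hypothetical data) that split a record in two.
chunks = [
    b'{"id": 1, "event": "page_view"}\n{"id": 2, "ev',
    b'ent": "click"}\n',
    b'{"id": 3, "event": "purchase"}',
]
received = list(iter_ndjson(chunks))
print([r["id"] for r in received])
```

Many HTTP clients expose a line iterator that does this buffering for you. The sketch only shows what such an iterator has to handle.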
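// Express.js streaming endpoint
import express from "express";
const app = express();
// The handler must be async to use for await
app.get("/events/stream", async (req, res) => {
  res.setHeader("Content-Type", "application/x-ndjson");
  // Node.js already uses chunked encoding when no Content-Length is set;
  // this header only makes it explicit
  res.setHeader("Transfer-Encoding", "chunked");
  const events = getEventStream(); // async generator, defined elsewhere in your app
  for await (const event of events) {
    res.write(JSON.stringify(event) + "\n");
  }
  res.end();
});
When to Use NDJSON vs Regular JSON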
- Use regular JSON for API responses, config files, and small-to-medium payloads where the entire document is meaningful as a unit.
- Use NDJSON for log files, event streams, data exports, ETL pipelines, and any dataset where records are independent and the total size is unpredictable.
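Moving between the two formats is mechanical, which makes it easy to switch once a dataset outgrows a single array. The following is a minimal sketch with assumed names (array_to_ndjson, ndjson_to_list) and sample data. It loads the whole array at once, so it only suits inputs that still fit in memory.

```python
import io
import json


def array_to_ndjson(array_text, out):
    """Write each element of a JSON array as one NDJSON line."""
    for item in json.loads(array_text):  # loads the whole array at once
        # Compact separators keep each record short and on a single line.
        out.write(json.dumps(item, separators=(",", ":")) + "\n")


def ndjson_to_list(lines):
    """Parse an iterable of NDJSON lines back into a list, ignoring blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


source = '[{"id": 1, "event": "page_view"}, {"id": 2, "event": "click"}]'
buf = io.StringIO()
array_to_ndjson(source, buf)
ndjson_text = buf.getvalue()
print(ndjson_text, end="")

# Round trip: the NDJSON lines parse back to the original records.
round_trip = ndjson_to_list(ndjson_text.splitlines())
```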
Real-World Uses
- Elasticsearch bulk API uses NDJSON for indexing multiple documents in one request
- Docker's default json-file logging driver writes each log line as a JSON object on its own line
- GitHub Archive stores all public GitHub events as compressed NDJSON files
- OpenAI's fine-tuning data format is NDJSON
- BigQuery load jobs accept newline-delimited JSON files
Validate individual NDJSON records with JSONKit's validator — paste one line at a time to check each record's structure.