zxml

Low-latency XML DOM parsing for Zig with comptime-specialized parse modes and an in-tree benchmark/conformance harness.

Features

Single-pass XML parsing over []const u8 input.
DOM layout backed by contiguous node/attribute arrays and span slices into source bytes.
Comptime parse configuration via Document.parse(input, .{ ... }).
Two parser profiles: strict and turbo.
Raw borrowed accessors plus allocator-backed decoded helpers for text and attribute values.
In-tree conformance suites and external parser benchmark harness.
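
The layout described above (contiguous node and attribute arrays whose names and values are spans into the source bytes) can be pictured with a small sketch. Everything here is hypothetical and uses only the standard library: `Span`, `Attr`, `Node`, and their fields are illustrative names, not zxml's actual types, and the arrays are filled by hand rather than by a parser.

```zig
const std = @import("std");

// Hypothetical types for illustration only; zxml's real field names may differ.
const Span = struct {
    start: u32,
    end: u32,

    // Resolve the span against the original, unmodified source bytes.
    fn slice(self: Span, src: []const u8) []const u8 {
        return src[self.start..self.end];
    }
};

const Attr = struct { name: Span, value: Span };

const Node = struct {
    name: Span,
    first_attr: u32, // index of the first attribute in the attribute array
    attr_count: u32, // number of attributes stored contiguously from first_attr
};

pub fn main() void {
    const src = "<root id='r'><child>text</child></root>";

    // Hand-filled arrays standing in for what a parser would produce.
    const attrs = [_]Attr{
        .{ .name = .{ .start = 6, .end = 8 }, .value = .{ .start = 10, .end = 11 } },
    };
    const nodes = [_]Node{
        .{ .name = .{ .start = 1, .end = 5 }, .first_attr = 0, .attr_count = 1 },
        .{ .name = .{ .start = 14, .end = 19 }, .first_attr = 1, .attr_count = 0 },
    };

    for (nodes) |node| {
        std.debug.print("{s}", .{node.name.slice(src)});
        for (attrs[node.first_attr .. node.first_attr + node.attr_count]) |attr| {
            std.debug.print(" {s}={s}", .{ attr.name.slice(src), attr.value.slice(src) });
        }
        std.debug.print("\n", .{});
    }
}
```

Because names and values are stored as offsets rather than copied strings, a layout like this needs no per-node allocation, and the source buffer must outlive any slice resolved from it.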

Performance

Source: bench/results/latest.json (quick profile).

Parse Throughput (Average Across Fixtures)

stream-turbo  │████████████████████│ 3725.24 MB/s (100.00%)
stream-strict │███████████████████░│ 3577.71 MB/s (96.04%)
ours-turbo    │█████████████████░░░│ 3077.73 MB/s (82.62%)
ours-strict   │████████████████░░░░│ 2942.62 MB/s (78.99%)
pugixml       │████████░░░░░░░░░░░░│ 1455.80 MB/s (39.08%)
rapidxml      │███████░░░░░░░░░░░░░│ 1340.28 MB/s (35.98%)

Stable Gate Snapshot

| Profile | Passed | Rule |
| --- | --- | --- |
| `quick` | 20/20 | `ours-turbo >= max(pugixml, rapidxml)` |
| `quick` | 20/20 | `stream-turbo >= ours-turbo && stream-strict >= ours-strict` |
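
As a rough illustration of what the two gate rules express, the sketch below evaluates them against the average throughputs listed in the chart above. The `Throughput` struct and both function names are hypothetical. The 20/20 result in the table is not an average, so the bench harness evidently applies the rules to individual runs or fixtures; this sketch only mirrors the comparisons.

```zig
const std = @import("std");

// Hypothetical container for one set of measurements, in MB/s.
const Throughput = struct {
    stream_turbo: f64,
    stream_strict: f64,
    ours_turbo: f64,
    ours_strict: f64,
    pugixml: f64,
    rapidxml: f64,
};

// Rule 1: ours-turbo >= max(pugixml, rapidxml)
fn beatsExternalParsers(t: Throughput) bool {
    return t.ours_turbo >= @max(t.pugixml, t.rapidxml);
}

// Rule 2: stream-turbo >= ours-turbo && stream-strict >= ours-strict
fn streamingNotSlower(t: Throughput) bool {
    return t.stream_turbo >= t.ours_turbo and t.stream_strict >= t.ours_strict;
}

pub fn main() void {
    // Average values copied from the throughput chart.
    const avg = Throughput{
        .stream_turbo = 3725.24,
        .stream_strict = 3577.71,
        .ours_turbo = 3077.73,
        .ours_strict = 2942.62,
        .pugixml = 1455.80,
        .rapidxml = 1340.28,
    };
    std.debug.print("rule 1: {}\nrule 2: {}\n", .{
        beatsExternalParsers(avg),
        streamingNotSlower(avg),
    });
}
```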

Quick Start

zig build test
zig build conformance
zig build bench-compare

Minimal parse:

const std = @import("std");
const zxml = @import("zxml");

pub fn main() !void {
    const src = "<root id='r'><child>text</child></root>";
    const options: zxml.ParseOptions = .{ .mode = .strict, .validate_closing_tags = true };
    var doc = try options.parse(std.heap.page_allocator, src);
    defer doc.deinit();

    const root = doc.nodeAt(1).?;
    std.debug.print("{s} {s}\n", .{ root.nameSlice(), root.getAttributeValueRaw("id").? });
}

Library API

zxml.ParseOptions
zxml.ParseMode
zxml.ParseError
zxml.IndexInt
zxml.MaxInputLen
options.parse(allocator, input)
options.Document()
zxml.Types(options).Document / .Node / .Attribute / .StreamingParser

const options: zxml.ParseOptions = .{};
const Document = options.Document();
const StreamingParser = zxml.Types(options).StreamingParser;

Index width is configurable at build time, following the same config-module pattern as htmlparser:

zig build test -Dintlen=u64

Supported widths are u16, u32, u64, and usize. The default is u32.
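
To give a feel for the trade-off behind the index width, the sketch below prints the size of a two-offset span and the largest representable offset for several widths. It assumes, without confirmation from the zxml source, that offsets into the input are stored using the configured index type; `Span` and `maxOffset` are hypothetical names.

```zig
const std = @import("std");

// Hypothetical span type parameterized by index width; not zxml's actual definition.
fn Span(comptime Index: type) type {
    return struct { start: Index, end: Index };
}

// Largest byte offset representable in the given index type.
fn maxOffset(comptime Index: type) u64 {
    return std.math.maxInt(Index);
}

pub fn main() void {
    inline for (.{ u16, u32, u64 }) |Index| {
        std.debug.print("{s}: span = {d} bytes, max offset = {d}\n", .{
            @typeName(Index),
            @sizeOf(Span(Index)),
            maxOffset(Index),
        });
    }
}
```

Under that assumption, a narrower index shrinks every stored span at the cost of a smaller maximum input size, which is presumably what `zxml.MaxInputLen` reflects.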

ParseOptions.parse returns an initialized document; Document.parse remains available for document reuse:

const options: zxml.ParseOptions = .{
    .mode = .turbo,
    .validate_closing_tags = false,
    .expand_dtd_entities = false,
    .max_entity_value_len = 4096,
    .drop_whitespace_text_nodes = true,
    .include_misc_nodes = true,
};
var doc = try options.parse(allocator, input);

Parsing is always non-destructive and the original input is always []const u8.

Serialize without reparsing:

var out: std.Io.Writer.Allocating = .init(allocator);
defer out.deinit();
try doc.write(&out.writer);

Incremental streaming keeps parser state and resumes from saved offsets:

var stream = zxml.Types(options).StreamingParser.init(allocator);
defer stream.deinit();
_ = try stream.parseAvailable(buffer_so_far, &ctx, onNode);
try stream.finish();

Use raw accessors when you want borrowed source slices:

const attr_raw = root.getAttributeValueRaw("id").?;
const text_raw = root.firstChild().?.valueRawSlice();

Use allocator-backed helpers when you want decoded values without mutating the source:

const attr = try root.getAttributeValue(std.heap.page_allocator, "id") orelse return;
defer std.heap.page_allocator.free(attr);

const inner = try root.innerText(std.heap.page_allocator);
defer std.heap.page_allocator.free(inner);
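
For readers wondering why the decoded helpers take an allocator, here is a minimal standalone sketch of non-mutating decoding. It is not zxml's implementation: `decodePredefined` is a hypothetical helper that handles only the five predefined XML entities and writes the result into a fresh allocation, leaving the borrowed source slice untouched.

```zig
const std = @import("std");

// Hypothetical helper, not part of zxml: decodes the five predefined XML
// entities from `raw` into newly allocated memory owned by the caller.
fn decodePredefined(allocator: std.mem.Allocator, raw: []const u8) ![]u8 {
    // Each predefined entity decodes to a single byte, so the decoded text
    // is never longer than the raw text.
    const buf = try allocator.alloc(u8, raw.len);
    defer allocator.free(buf);

    const table = [_]struct { name: []const u8, ch: u8 }{
        .{ .name = "&lt;", .ch = '<' },
        .{ .name = "&gt;", .ch = '>' },
        .{ .name = "&amp;", .ch = '&' },
        .{ .name = "&apos;", .ch = '\'' },
        .{ .name = "&quot;", .ch = '"' },
    };

    var i: usize = 0; // read position in raw
    var n: usize = 0; // write position in buf
    outer: while (i < raw.len) {
        if (raw[i] == '&') {
            for (table) |entry| {
                if (std.mem.startsWith(u8, raw[i..], entry.name)) {
                    buf[n] = entry.ch;
                    n += 1;
                    i += entry.name.len;
                    continue :outer;
                }
            }
        }
        // Not a recognized entity: copy the byte through unchanged.
        buf[n] = raw[i];
        n += 1;
        i += 1;
    }
    // Return an exactly sized copy; the scratch buffer is freed by the defer.
    return allocator.dupe(u8, buf[0..n]);
}

pub fn main() !void {
    const allocator = std.heap.page_allocator;
    const raw = "a &lt; b &amp;&amp; c";
    const decoded = try decodePredefined(allocator, raw);
    defer allocator.free(decoded);
    std.debug.print("raw:     {s}\ndecoded: {s}\n", .{ raw, decoded });
}
```

The raw accessors can return slices of the input directly because they skip this step; the decoded helpers need somewhere to put the rewritten bytes, hence the allocator parameter and the matching `free`.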

DTD/entity expansion is disabled by default. When expand_dtd_entities = true, zxml parses internal <!ENTITY ...> declarations from the document doctype into a document-owned hash map and uses that map during decoded value access. max_entity_value_len caps the length of each stored expanded entity value.

turbo keeps DOM construction but drops expensive validation work by default. strict enforces stronger well-formedness checks and is the correctness-first profile.
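
The general Zig technique behind comptime-specialized profiles can be sketched as follows. This is an illustration of the language feature, not zxml's code: `Options` and `closingTagMatches` are hypothetical, with a single flag named after the `validate_closing_tags` option shown earlier. Because the options value is a comptime parameter, each distinct configuration is compiled as its own specialization, and the disabled check does not appear in the generated code.

```zig
const std = @import("std");

// Hypothetical options struct with one flag, for illustration only.
const Options = struct {
    validate_closing_tags: bool = true,
};

// `options` is known at compile time, so the branch below is resolved during
// compilation rather than tested on every call.
fn closingTagMatches(
    comptime options: Options,
    open_name: []const u8,
    close_name: []const u8,
) bool {
    if (options.validate_closing_tags) {
        return std.mem.eql(u8, open_name, close_name);
    } else {
        // Validation disabled: accept whatever closing name was seen.
        return true;
    }
}

pub fn main() void {
    const strict: Options = .{};
    const turbo: Options = .{ .validate_closing_tags = false };

    std.debug.print("strict accepts <a></b>: {}\n", .{closingTagMatches(strict, "a", "b")});
    std.debug.print("turbo accepts <a></b>:  {}\n", .{closingTagMatches(turbo, "a", "b")});
}
```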

Build And Validation

zig build test
zig build conformance
zig build tools -- run-conformance --suite bench/conformance/well_formedness_w3c_core.json
zig build bench-compare

Benchmark and conformance details are documented in bench/README.md.
