oxsv: xsv, in assembly.
Drop-in replacement for the core xsv commands. Zero runtime dependencies. SIMD-native. Single binary.
Why
Your 8GB CSV crashed VSCode. Excel can't open it. pandas ate 16GB of RAM and died.
oxsv counts the rows of a 1GB CSV in 0.17 seconds.
Benchmark
1GB CSV, 6.27M rows, 18 columns (WSL2, DDR4-3200)
| Command | oxsv | xsv (Rust) | Speedup |
|---|---|---|---|
| count | 0.17s | 13.14s | 76x |
| headers | 0.002s | 0.044s | 22x |
| head | 0.002s | 0.023s | 12x |
| select | 0.82s | 19.27s | 24x |
| search | 1.31s | 25.63s | 20x |
| frequency | 1.14s | 29.17s | 26x |
| index | 1.11s | 14.22s | 13x |
| stats | 12.97s | 74.25s | 5.7x |
| sort | 2.15s | 47.58s | 22x |
| fmt | 2.43s | 37.26s | 15x |
SIMD microbenchmark (ASM vs naive C, 16MB buffer)
| Function | ASM | C | Speedup |
|---|---|---|---|
| count_byte | 8.9 GB/s | 3.3 GB/s | 2.7x |
| find_byte | 11.3 GB/s | 2.3 GB/s | 5.0x |
| find_byte early | 8 ns | 88 ns | 10.8x |
Binary size
| Binary | Size |
|---|---|
| oxsv | 27 KB |
| xsv (Rust) | 4.8 MB |
178x smaller. oxsv is smaller than most favicons.
Install
make && sudo make install
Build requirements: nasm, gcc, Linux x86-64.
Usage
oxsv count huge.csv # count rows
oxsv headers huge.csv # show column names
oxsv head -n 20 huge.csv # first 20 rows
oxsv select name,email huge.csv # extract columns
oxsv search -s status "active" huge.csv # filter rows
oxsv search -s name "田中" huge.csv # UTF-8 works
oxsv stats huge.csv # column statistics
oxsv frequency -s status huge.csv # value counts
oxsv sort -s amount -N huge.csv # numeric sort
oxsv index huge.csv # build index → fast slice
oxsv slice -i 99999000 -l 100 huge.csv # instant with index
Pipe from S3:
aws s3 cp s3://bucket/huge.csv - | oxsv search --no-mmap -s status "active"
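A pipe has no length and cannot be memory-mapped, which is presumably why the command above passes --no-mmap. The following is a minimal C sketch of the streaming alternative, assuming a plain read() loop over a fixed buffer. The function name `count_newlines_stream` is hypothetical and this is not oxsv's actual code, which is written in assembly.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Count newline bytes arriving on a descriptor that cannot be mmap'ed,
 * such as a pipe. Returns -1 on a read error.
 * Hypothetical helper for illustration only. */
long count_newlines_stream(int fd)
{
    static char buf[1 << 16];           /* 64 KiB, reused for every read */
    long count = 0;

    for (;;) {
        ssize_t got = read(fd, buf, sizeof buf);
        if (got == 0)
            return count;               /* end of stream */
        if (got < 0) {
            if (errno == EINTR)
                continue;               /* interrupted by a signal, retry */
            return -1;
        }
        /* Scan only the bytes this read() actually delivered. */
        const char *p = buf, *end = buf + got;
        while ((p = memchr(p, '\n', (size_t)(end - p))) != NULL) {
            count++;
            p++;
        }
    }
}
```

Memory use is fixed at the buffer size however much data flows through, but the data can only be visited once and in order.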
How It Works
mmap -- The file is memory-mapped, not loaded. Physical RAM usage stays in single-digit megabytes regardless of file size. A 64GB CSV on a 4GB machine works fine.
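As a rough illustration of the mmap approach, here is a minimal C sketch that counts newlines in a mapped file. It uses only standard POSIX calls (open, fstat, mmap, munmap). The function name `count_newlines_mmap` is hypothetical, and oxsv's own implementation is assembly and may differ in every detail.

```c
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Count newline bytes in a file by mapping it instead of copying it into
 * a heap buffer. Returns -1 on error.
 * Hypothetical helper for illustration only. */
long count_newlines_mmap(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* mmap rejects zero-length mappings */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    const char *data = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping outlives the descriptor */
    if (data == MAP_FAILED)
        return -1;

    /* Pages are faulted in from the page cache only as the scan touches
     * them, so nothing here allocates memory proportional to the file. */
    long count = 0;
    const char *p = data, *end = data + len;
    while ((p = memchr(p, '\n', (size_t)(end - p))) != NULL) {
        count++;
        p++;
    }

    munmap((void *)data, len);
    return count;
}
```

Because the mapping is read-only and file-backed, the kernel can evict pages that have already been scanned, which is what lets a file larger than RAM be processed.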
AVX2 -- Delimiters are scanned 32 bytes at a time using SIMD instructions, so throughput is bounded by memory bandwidth rather than by the CPU. On DDR4-3200 that bound is ~30GB/s, which sets the floor on scan time -- oxsv gets close.
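The 32-bytes-at-a-time scan can be sketched in C with AVX2 intrinsics. This is an assumption about the general technique, not a transcription of oxsv's assembly. The name `count_byte` is borrowed from the microbenchmark table, while `count_byte_scalar` and `count_byte_avx2` are hypothetical. The run-time check uses GCC's `__builtin_cpu_supports`, standing in for the CPUID dispatch the project does itself.

```c
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

/* Portable fallback: one byte per iteration. */
static size_t count_byte_scalar(const uint8_t *buf, size_t len, uint8_t c)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        n += (buf[i] == c);
    return n;
}

/* AVX2 path: compare 32 bytes against the target byte per iteration.
 * The target attribute lets GCC compile this without -mavx2. */
__attribute__((target("avx2")))
static size_t count_byte_avx2(const uint8_t *buf, size_t len, uint8_t c)
{
    const __m256i needle = _mm256_set1_epi8((char)c);
    size_t n = 0, i = 0;

    for (; i + 32 <= len; i += 32) {
        /* Unaligned load of the next 32 input bytes. */
        __m256i chunk = _mm256_loadu_si256((const __m256i *)(buf + i));
        /* Each matching byte becomes 0xFF, every other byte 0x00. */
        __m256i eq = _mm256_cmpeq_epi8(chunk, needle);
        /* Collapse to one bit per byte, then count the set bits. */
        unsigned mask = (unsigned)_mm256_movemask_epi8(eq);
        n += (size_t)__builtin_popcount(mask);
    }
    /* Handle the tail that is shorter than one vector. */
    return n + count_byte_scalar(buf + i, len - i, c);
}

/* Choose an implementation at run time based on CPU features. */
size_t count_byte(const uint8_t *buf, size_t len, uint8_t c)
{
    if (__builtin_cpu_supports("avx2"))
        return count_byte_avx2(buf, len, c);
    return count_byte_scalar(buf, len, c);
}
```

The loop body has no data-dependent branch, so its cost per 32 bytes is the same whether the chunk contains zero delimiters or thirty-two.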
No runtime -- The C layer parses arguments. Everything that touches your data is hand-written x86-64 assembly. No libc in the hot path, no allocator, no GC.
Architecture
main.c argument parsing only (never touches file data)
oxsv.asm mmap + CPUID dispatch
cmd/*.asm one file per subcommand
core/*.asm SIMD scanner, CSV parser, buffered I/O, syscall wrappers
The binary is 27KB stripped.
Flags
| Flag | Short | Description |
|---|---|---|
| --delimiter | -d | Field delimiter (default: ,) |
| --quote | -q | Enable quoted field parsing |
| --no-headers | -n | File has no header row |
| --output | -o | Write to file instead of stdout |
| --no-mmap | | Read from stdin/pipe |
Index
oxsv index huge.csv # creates huge.csv.idx in same directory
With an index, slice and tail are O(1). Without it, they stream.
If the source file changes, the index is silently ignored.
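To see why an index turns slice into a constant-time seek, consider a table of row start offsets. The C sketch below is an assumption about the general idea. The actual layout of the .idx file is not described here, and `row_index`, `row_index_build`, and `row_index_seek` are hypothetical names. For brevity it treats every newline as a row boundary and ignores quoted fields.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory index: byte offset of the start of every row. */
typedef struct {
    size_t *offsets;   /* offsets[i] = where row i begins */
    size_t  rows;
} row_index;

/* One linear pass over the data records where each row starts.
 * Returns 0 on success, -1 if allocation fails. */
int row_index_build(row_index *idx, const char *data, size_t len)
{
    size_t cap = 1024;
    idx->rows = 0;
    idx->offsets = malloc(cap * sizeof *idx->offsets);
    if (!idx->offsets)
        return -1;

    size_t start = 0;
    while (start < len) {
        if (idx->rows == cap) {         /* grow the table geometrically */
            size_t *grown = realloc(idx->offsets, 2 * cap * sizeof *grown);
            if (!grown) {
                free(idx->offsets);
                idx->offsets = NULL;
                return -1;
            }
            idx->offsets = grown;
            cap *= 2;
        }
        idx->offsets[idx->rows++] = start;
        /* The next row begins one byte past the next newline, if any. */
        const char *nl = memchr(data + start, '\n', len - start);
        start = nl ? (size_t)(nl - data) + 1 : len;
    }
    return 0;
}

/* With the index, locating a row is a single array lookup, no scanning.
 * Returns NULL if the row number is out of range. */
const char *row_index_seek(const row_index *idx, const char *data, size_t row)
{
    return row < idx->rows ? data + idx->offsets[row] : NULL;
}
```

Building the table costs one full scan, which is paid once. Every later lookup is independent of how deep into the file the requested row sits.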
UTF-8
oxsv assumes UTF-8. SIMD scans for ASCII delimiters (, \n ") which
never appear inside UTF-8 multibyte sequences. Japanese, emoji, whatever -- it
just works.
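The property relied on above is that every byte of a UTF-8 multibyte sequence has its high bit set (0x80 or above), while the delimiters are all below 0x80. A small C sketch makes this concrete. Both function names are hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Count comma-separated fields in one row by comparing raw bytes.
 * No UTF-8 decoding is performed. */
size_t count_fields(const uint8_t *row, size_t len)
{
    size_t fields = 1;
    for (size_t i = 0; i < len; i++)
        fields += (row[i] == ',');
    return fields;
}

/* Count bytes in the ASCII range (below 0x80). For text made only of
 * multibyte UTF-8 characters this is always zero, so such text cannot
 * contain a byte that looks like a comma, quote, or newline. */
size_t count_ascii_bytes(const uint8_t *s, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        n += (s[i] < 0x80);
    return n;
}
```

For example, 田中 encodes as the six bytes E7 94 B0 E4 B8 AD, none of which is below 0x80, so a row such as 田中,active splits into exactly two fields under a byte-wise scan.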
Shift_JIS is not supported. Convert the file first with iconv -f SHIFT_JIS -t UTF-8.
xsv Compatibility
oxsv implements the core xsv commands with the same names and similar flags. Differences:
- search uses exact match by default (xsv uses regex)
- join, sample, split, flatten, table are not implemented
License
MIT. Copyright (c) 2026 rxxuzi.
Acknowledgments
- xsv by BurntSushi -- command interface design