Uniq: improve performances #13249

Open
sylvestre wants to merge 2 commits into uutils:main from sylvestre:uniq-perf

@sylvestre

Contributor

Closes: #13199

sylvestre added 2 commits July 1, 2026 22:06
is_c_locale() was called on every line inside key_end_index() when
-w/--check-chars is set, doing up to 3 std::env::var_os() lookups
each time. Locale env vars can't change mid-process, so this was
pure per-line overhead, causing uniq -w to be ~5x slower than GNU
uniq even for small -w values.

Compute is_c_locale() once at startup and cache it on the Uniq
struct instead.
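
The caching described above could look roughly like the following sketch. It is not the code from this PR: `locale_is_c`, the `c_locale` field, and the variable precedence (`LC_ALL`, `LC_CTYPE`, `LANG`) are assumptions made for illustration, and the real `is_c_locale()` and `key_end_index()` may differ in both signature and behavior.

```rust
use std::env;
use std::ffi::OsString;

/// Hypothetical stand-in for is_c_locale(): checks up to three variables
/// (assumed order: LC_ALL, LC_CTYPE, LANG) through a lookup closure.
fn locale_is_c(lookup: impl Fn(&str) -> Option<OsString>) -> bool {
    for name in ["LC_ALL", "LC_CTYPE", "LANG"] {
        if let Some(value) = lookup(name) {
            if !value.is_empty() {
                // The first non-empty variable decides the result.
                return value == "C" || value == "POSIX";
            }
        }
    }
    true // nothing set: treat as the C locale
}

/// Simplified model of the Uniq struct (only the fields needed here).
struct Uniq {
    check_chars: Option<usize>,
    /// Computed once at construction, never re-read per line.
    c_locale: bool,
}

impl Uniq {
    fn new(check_chars: Option<usize>) -> Self {
        Self {
            check_chars,
            // The only place the environment is consulted.
            c_locale: locale_is_c(|name| env::var_os(name)),
        }
    }

    /// Byte index where the comparison key ends (assumed semantics).
    fn key_end_index(&self, line: &[u8]) -> usize {
        match self.check_chars {
            None => line.len(),
            // C locale: one character is one byte.
            Some(n) if self.c_locale => n.min(line.len()),
            // Otherwise count UTF-8 characters, falling back to bytes
            // when the line is not valid UTF-8.
            Some(n) => match std::str::from_utf8(line) {
                Ok(s) => s.char_indices().nth(n).map_or(line.len(), |(i, _)| i),
                Err(_) => n.min(line.len()),
            },
        }
    }
}
```

With this shape, the per-line path reads a `bool` field instead of querying the environment, and the lookup closure keeps the locale check testable without touching process-wide state.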

Fixes uutils#13199

write_line() issued two separate write_all() calls per output line
(line content, then the terminator byte), each going through the
dynamically-dispatched Box<dyn Write> from open_output_file(). Merge
them into a single write via a reused scratch buffer.
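
A minimal sketch of the single-write idea, under stated assumptions: the `write_line` signature below and the `CountingWriter` helper are hypothetical and exist only to show that one `write_all()` per line reaches the underlying writer. The actual function in uutils may take different parameters.

```rust
use std::io::{self, Write};

/// Hypothetical version of write_line(): assemble the line and its
/// terminator in a reused scratch buffer, then issue a single write_all().
fn write_line(
    out: &mut dyn Write,
    scratch: &mut Vec<u8>,
    line: &[u8],
    terminator: u8,
) -> io::Result<()> {
    scratch.clear(); // keeps the allocation from the previous line
    scratch.extend_from_slice(line);
    scratch.push(terminator);
    out.write_all(scratch)
}

/// Illustration-only writer that records how many write() calls it receives.
struct CountingWriter {
    data: Vec<u8>,
    calls: usize,
}

impl Write for CountingWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.calls += 1;
        self.data.extend_from_slice(buf);
        Ok(buf.len()) // accept everything, so write_all() needs one call
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Writes each line through the helper and returns the instrumented writer.
fn write_all_lines(lines: &[&[u8]]) -> io::Result<CountingWriter> {
    let mut out = CountingWriter { data: Vec::new(), calls: 0 };
    let mut scratch = Vec::new();
    for line in lines {
        write_line(&mut out, &mut scratch, line, b'\n')?;
    }
    Ok(out)
}
```

Because the scratch `Vec` is cleared rather than reallocated, the extra copy is the only added cost, traded against halving the number of dynamically dispatched calls.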

Also match the input BufReader's capacity to the existing 128KB
output buffer (previously the 8KB std default), for consistency.
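
For reference, sizing the input buffer explicitly can be sketched as below. The constant name `BUF_SIZE` and the helper `buffered_input` are assumptions for illustration; only `BufReader::with_capacity` and `BufReader::capacity` are standard-library APIs.

```rust
use std::io::{BufReader, Read};

/// Assumed name for the shared 128KB buffer size.
const BUF_SIZE: usize = 128 * 1024;

/// Wraps any reader in a BufReader sized to match the output buffer,
/// instead of relying on the std default capacity.
fn buffered_input<R: Read>(reader: R) -> BufReader<R> {
    BufReader::with_capacity(BUF_SIZE, reader)
}
```

`BufReader::new` would pick the standard library's default capacity, so the explicit call is what keeps the input and output sides aligned.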

Measured on a 20x-repeated /usr/share/dict/words (~80MB, pinned to
one CPU core to reduce noise): -w 1 dropped from 429.7ms to 395.5ms
(~8%), -w 512 from 578.9ms to 497.8ms (~14%).

@github-actions

github-actions Bot commented Jul 1, 2026

GNU testsuite comparison:

Skip an intermittent issue tests/misc/io-errors (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/tail/retry (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/pr/bounded-memory (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/rm/isatty (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/tail/symlink (passes in this run but fails in the 'main' branch)
Note: The gnu test tests/tail/pipe-f is now being skipped but was previously passing.

Development

Successfully merging this pull request may close these issues.

perf(uniq): -w benchmark

1 participant