1033 Commits

Author SHA1 Message Date
Daniel Lemire
34164f547b 8.2.10 v8.2.10 2026-06-14 09:52:47 -04:00
Daniel Lemire
4eec7bec38
Merge pull request #394 from fastfloat/int-overflow-simdjson-approach
Int overflow check with a faster approach
2026-06-14 09:52:09 -04:00
Daniel Lemire
fd970ab05e updating visual studio 2026-06-13 21:41:53 -04:00
Daniel Lemire
a7249f86ed replace checked re-parse with O(1) simdjson-style overflow check
The previous commit detects multi-wrap u64 overflow at the max_digits
boundary by re-parsing the digits through a checked multiply-add loop
(O(max_digits)). Replace that with the constant-time check used in
simdjson: the leading digit plus a single threshold comparison.

For a max_digits-length value, ms = min_safe_u64(base) == base^(max_digits-1)
is the smallest such value and also the width of each leading-digit band
[d*ms, (d+1)*ms). Since that width is < 2^64, the only band that can
straddle 2^64 is d == dmax (the largest leading digit that still fits),
and there it straddles at most once, so a single threshold dmax*ms
separates wrapped from non-wrapped values. A leading digit above dmax
always overflows; below dmax always fits. dmax and the threshold derive
from the existing min_safe_u64 table, so no new tables are needed and
dmax*ms cannot itself overflow.
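
As a minimal sketch of the argument above (not the library's actual code), the check can be written as follows. The names `digit_value_sketch` and `parse_u64_sketch` are hypothetical, and ms is computed with a loop here rather than read from the min_safe_u64 table.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: map one character to its digit value, or 255 if it is
// not an alphanumeric digit.
inline unsigned digit_value_sketch(char c) {
  if (c >= '0' && c <= '9') return unsigned(c - '0');
  if (c >= 'a' && c <= 'z') return unsigned(c - 'a') + 10;
  if (c >= 'A' && c <= 'Z') return unsigned(c - 'A') + 10;
  return 255;
}

// Parse `s` in `base` (2..36) with a wrapping accumulator, then decide
// overflow from the digit count, the leading digit and one threshold
// comparison. Returns false on overflow or invalid input.
inline bool parse_u64_sketch(const std::string &s, unsigned base,
                             std::uint64_t &out) {
  if (s.empty() || base < 2 || base > 36) return false;
  // ms = base^(max_digits-1). Assumption: computed here for self-containment;
  // the commit derives it from an existing table instead.
  std::uint64_t ms = 1;
  std::size_t max_digits = 1;
  while (ms <= UINT64_MAX / base) {
    ms *= base;
    ++max_digits;
  }
  std::size_t i = 0;
  while (i < s.size() && s[i] == '0') ++i; // leading zeros carry no value
  const std::size_t start = i;
  std::uint64_t wrapped = 0;
  for (; i < s.size(); ++i) {
    const unsigned d = digit_value_sketch(s[i]);
    if (d >= base) return false;
    wrapped = wrapped * base + d; // unsigned arithmetic wraps mod 2^64
  }
  const std::size_t n = s.size() - start; // significant digits
  if (n > max_digits) return false;       // always overflows
  if (n == max_digits) {
    const std::uint64_t dmax = UINT64_MAX / ms; // largest lead that can fit
    const std::uint64_t lead = digit_value_sketch(s[start]);
    if (lead > dmax) return false; // band lies entirely above 2^64
    // Only the lead == dmax band straddles 2^64; a wrapped value falls
    // below the start of the band, dmax*ms.
    if (lead == dmax && wrapped < dmax * ms) return false;
  }
  out = wrapped;
  return true;
}
```

In base 10 this gives ms = 10^19 and dmax = 1, so a 20-digit input starting with 2..9 is rejected on the leading digit alone, even though its wrapped value may look plausible.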

Add a programmatic, self-verifying test for parse_int_string overflow
detection covering bases 2..36, complementing the hand-picked strings
added earlier. Every generated input is cross-checked against an
independent trusted oracle (a plain 64-bit checked multiply-add); on
success the parsed value is also compared exactly and full consumption
of the input is asserted.

Per base it exercises:
  - an exact-boundary sweep of the 64 values straddling 2^64
    (UINT64_MAX-31 .. 2^64+31), built by walking the digit string;
  - UINT64_MAX, 2^64 and the all-max-digit value, each also with
    leading zeros;
  - random max_digits-length values across every leading digit, with
    the heaviest sampling on the lead == dmax band that straddles 2^64,
    and full coverage of lead > dmax (the multi-wrap region the naive
    min_safe check accepted by mistake);
  - max_digits-1 (never overflows) and max_digits+1 (always overflows).
A small signed (int64_t) section checks the exact INT64_MIN/INT64_MAX
limits round-trip and that INT64_MAX+1 / INT64_MIN-1 are rejected in
every base.
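
For illustration, an oracle of the kind described above, plus a helper that walks a digit string across a boundary, might look like the sketch below. This is not the test added by the commit; `oracle_digit_value`, `oracle_parse_u64` and `increment_digits` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: digit value of an alphanumeric character, else 255.
inline unsigned oracle_digit_value(char c) {
  if (c >= '0' && c <= '9') return unsigned(c - '0');
  if (c >= 'a' && c <= 'z') return unsigned(c - 'a') + 10;
  if (c >= 'A' && c <= 'Z') return unsigned(c - 'A') + 10;
  return 255;
}

// Plain checked multiply-add: before each step, verify that
// acc * base + d still fits in 64 bits.
inline bool oracle_parse_u64(const std::string &s, unsigned base,
                             std::uint64_t &out) {
  if (s.empty()) return false;
  std::uint64_t acc = 0;
  for (char c : s) {
    const unsigned d = oracle_digit_value(c);
    if (d >= base) return false;
    // acc * base + d <= UINT64_MAX  <=>  acc <= (UINT64_MAX - d) / base
    if (acc > (UINT64_MAX - d) / base) return false;
    acc = acc * base + d;
  }
  out = acc;
  return true;
}

// Add one to a string of valid digits in place, walking from the last digit
// and carrying leftwards. Lets a test step across 2^64 without big integers.
inline void increment_digits(std::string &s, unsigned base) {
  static const char alphabet[] = "0123456789abcdefghijklmnopqrstuvwxyz";
  for (std::size_t i = s.size(); i-- > 0;) {
    const unsigned d = oracle_digit_value(s[i]);
    if (d + 1 < base) {
      s[i] = alphabet[d + 1];
      return;
    }
    s[i] = '0'; // carry into the next more significant digit
  }
  s.insert(s.begin(), '1'); // carried out of the top digit
}
```

Starting from the digit string of UINT64_MAX and calling `increment_digits` repeatedly yields the upper half of a boundary sweep, with the oracle expected to reject every string from 2^64 onwards.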
2026-06-13 21:34:34 -04:00
sahvx655-wq
632cc97b5b detect uint64 overflow that wraps past min_safe in parse_int_string 2026-06-13 21:21:29 -04:00
Daniel Lemire
8234a89623 8.2.9 v8.2.9 2026-06-11 20:29:24 -04:00
Daniel Lemire
0dce102cb4
Merge pull request #391 from sahvx655-wq/int-fast-path-wide-units
reject non-digit wide code units in uint8/uint16 integer fast path
2026-06-11 20:28:10 -04:00
Daniel Lemire
30868f8734
Merge pull request #392 from biojppm/fix/gcc9_compile_error
Fix compile error with gcc 9: use of [[unlikely]]
2026-06-11 09:37:38 -04:00
Joao Paulo Magalhaes
8e6edc8ad2
Fix compile error with gcc 9: use of [[unlikely]] 2026-06-10 15:37:26 +01:00
sahvx655-wq
82882b237d gate uint8/uint16 base-10 fast paths to single-byte code units 2026-06-10 12:12:34 +05:30
Daniel Lemire
937198691a
Merge pull request #389 from correctmost/cm/remove-unreachable-return
Remove an unreachable return statement
2026-06-09 11:18:15 -04:00
Daniel Lemire
0352ba3fef
Merge pull request #390 from correctmost/cm/remove-unreachable-block
Remove an else if statement that is always false
2026-06-09 11:17:35 -04:00
correctmost
6ae691372f Remove an else if statement that is always false
Commit b334317d added the same std::isnan(v) check as an earlier
condition.

The warning was reported by cppcheck.
2026-06-09 03:48:54 -04:00
correctmost
8fe7a9405b Remove an unreachable return statement
The redundant statement was reported by cppcheck.
2026-06-09 03:37:58 -04:00
Daniel Lemire
e8ec8e8f34 8.2.8 v8.2.8 2026-06-08 15:29:36 -04:00
Daniel Lemire
c05156ff60
Merge pull request #388 from biojppm/fix/clang_compile_error
Fix compile error in clang<10: fails on pragma -Wc++20-extensions
2026-06-08 15:28:45 -04:00
Joao Paulo Magalhaes
23e245f2b3
Fix compile error in clang<10: fails on pragma -Wc++20-extensions
This fixes a compile error in all clang versions lower than 10,
triggered by a diagnostic-ignore pragma that names a warning group
unknown to those compiler versions:

```
/__w/ext/fast_float/include/fast_float/parse_number.h:361:34: error: unknown warning group '-Wc++20-extensions', ignored [-Werror,-Wunknown-pragmas]
```

The fix requires looking at __clang_major__, whose numbering is
unfortunately different in Apple clang, so a version dispatch is required.
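
A version dispatch of this kind could be sketched as below. The macro name `SKETCH_KNOWS_CXX20_EXT` and the function are hypothetical, and the Apple threshold is an assumption chosen for illustration rather than a value taken from the commit; only the "clang 10" cutoff comes from the message above.

```cpp
// Decide whether this compiler knows the -Wc++20-extensions warning group.
#if defined(__clang__)
#if defined(__apple_build_version__)
// Apple clang uses its own version numbering. ASSUMPTION: the threshold
// below is illustrative only.
#define SKETCH_KNOWS_CXX20_EXT (__clang_major__ >= 12)
#else
// Upstream clang: the message reports the failure for versions below 10.
#define SKETCH_KNOWS_CXX20_EXT (__clang_major__ >= 10)
#endif
#else
#define SKETCH_KNOWS_CXX20_EXT 0
#endif

#if SKETCH_KNOWS_CXX20_EXT
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wc++20-extensions"
#endif

// Code that uses a C++20 extension in an older language mode would sit here.
// A plain function stands in for it so the sketch compiles everywhere.
inline bool is_negative_sketch(int x) { return x < 0; }

#if SKETCH_KNOWS_CXX20_EXT
#pragma clang diagnostic pop
#endif
```

Older compilers never see the pragma, so -Wunknown-pragmas combined with -Werror has nothing to reject.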
2026-06-08 12:39:48 +01:00
Daniel Lemire
e0b53eaf63 8.2.7 v8.2.7 2026-06-07 14:14:42 -04:00
Daniel Lemire
3044c9b182
Merge pull request #387 from fastfloat/pr386
Using unlikely markers for PR386
2026-06-07 14:12:38 -04:00
Daniel Lemire
29bd11571b one too many 2026-06-07 11:19:47 -04:00
Daniel Lemire
b1fbfe932a silencing -Wc++20-extensions at the point of use solely 2026-06-07 11:18:09 -04:00
Daniel Lemire
520fded4a3 addressing comments by @jwakely 2026-06-06 13:13:49 -04:00
Daniel Lemire
b72e07132c let us use 'unlikely' hints. 2026-06-05 22:01:27 -04:00
fcostaoliveira
3067491f41 clang-format (clang-format-17 comment reflow + signature wrap; no semantic change) 2026-06-03 09:35:26 +01:00
fcostaoliveira
cb5d9cd9a4 Skip materializing the integer/fraction spans on the hot path
parsed_number_string_t carries two span<UC const> members (integer, fraction)
that are only read on the rare slow paths (digit_comp, and the >19-significant-
digit truncation recompute). Materializing them on every parse forces the ~56/64-
byte struct to be written out and marshaled through the by-value return, which
shows up as backend/store pressure on the hot path.

This adds a runtime `store_spans` flag (default true, so all existing callers are
unchanged) to parse_number_string; from_chars_float_advanced parses with it false,
attempts the Clinger and Eisel-Lemire fast paths inline, and only re-parses with
spans on the two rare slow branches. The re-parse is pushed into a single
`fastfloat_noinline` (noinline+cold) helper so the force-inlined hot scanner is
emitted once rather than duplicated into the caller (without this the extra inline
copies regress some targets, e.g. ARM gcc, by bloating the hot frame and lengthening
the loop-carried dependency chain).

A runtime flag is used deliberately rather than a template parameter: a template
would create a second instantiation of the whole scanner whose icache cost wipes
out the gain.
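
The shape of this change can be illustrated with a heavily simplified sketch. None of the names below are the library's: `span_sketch`, `parsed_sketch`, `scan_digits`, `rescan_with_spans` and `parse_hot` are hypothetical, the scanner only handles a run of decimal digits, and "more than 19 digits" stands in for the rare slow branches.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_NOINLINE_COLD __attribute__((noinline, cold))
#else
#define SKETCH_NOINLINE_COLD
#endif

struct span_sketch {
  const char *ptr = nullptr;
  std::size_t len = 0;
};

struct parsed_sketch {
  std::uint64_t mantissa = 0;
  std::size_t digit_count = 0;
  bool valid = false;
  span_sketch integer; // only filled in when store_spans is true
};

// One scanner, one instantiation: the flag is a runtime argument, so the
// hot and cold callers share the same code.
inline parsed_sketch scan_digits(const char *first, const char *last,
                                 bool store_spans = true) {
  parsed_sketch out;
  const char *p = first;
  while (p != last && *p >= '0' && *p <= '9') {
    // May wrap for very long inputs; the slow path is expected to redo it.
    out.mantissa = out.mantissa * 10 + std::uint64_t(*p - '0');
    ++p;
  }
  out.digit_count = std::size_t(p - first);
  out.valid = out.digit_count > 0;
  if (store_spans) {
    out.integer.ptr = first;
    out.integer.len = out.digit_count;
  }
  return out;
}

// Rare path: re-parse with spans, kept out of line so the caller's hot
// frame does not carry a second inlined copy of the scanner.
SKETCH_NOINLINE_COLD inline parsed_sketch
rescan_with_spans(const char *first, const char *last) {
  return scan_digits(first, last, true);
}

inline parsed_sketch parse_hot(const char *first, const char *last) {
  parsed_sketch fast = scan_digits(first, last, false);
  if (fast.valid && fast.digit_count > 19) {
    return rescan_with_spans(first, last); // slow branch needs the spans
  }
  return fast; // common case: spans were never written
}
```

The default argument mirrors the "default true" behavior described above: existing callers of the scanner keep receiving spans, and only the hot entry point opts out.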

Measured (per-parser microbench, median of 5, pinned core), fast_float from_chars
<double>/<float>, vs the current tip:
  - Intel Ice Lake (Xeon 8360Y): +17-19% (gcc), Intel TMA shows backend-bound
    26.0% -> 2.2% and retiring 60.3% -> 77.3% on short floats (the eliminated span
    spill), with -36% pipeline slots.
  - Intel Cascade Lake (Xeon 6248): +18-22% (gcc), +13-23% (clang).
  - ARM Neoverse-V2 (Graviton4): +73-196% (gcc), +8-11% (clang) -- the struct spill
    dominated the gcc hot loop there.
Correctness: the full float exhaustive suite (exhaustive32, exhaustive32_64,
exhaustive32_midpoint, random64) passes, and a 2^32 sweep is byte-identical to the
current tip. Public from_chars / from_chars_advanced / parsed_number_string_t are
unchanged.
2026-06-03 09:30:42 +01:00
Daniel Lemire
6258cbc5a1
Merge pull request #380 from fastfloat/dependabot/github_actions/github-actions-0eb558eb98
Bump the github-actions group across 1 directory with 3 updates
2026-06-02 14:02:10 -04:00
Daniel Lemire
254f10ce39
Merge pull request #385 from jwakely/patch-2
Fix spelling
2026-06-02 14:01:41 -04:00
Jonathan Wakely
1b11407da9 Fix spelling
Run clang-format to reformat the long lines.
2026-06-02 15:30:37 +01:00
Daniel Lemire
f0ed8cdf52 display the latest version. 2026-06-01 18:28:09 -04:00
Daniel Lemire
cfd12ebcf1 8.2.6 v8.2.6 2026-06-01 18:07:41 -04:00
Daniel Lemire
06f3e27411
Merge pull request #383 from redis-performance/pr/parallel-exhaustive
Parallelize the exhaustive float32 sweeps across hardware threads (~75-88x)
2026-06-01 18:07:01 -04:00
fcostaoliveira
b642d9202f tests: parallelize exhaustive32 and exhaustive32_64 sweeps too
Same std::thread split as exhaustive32_midpoint; preserves each test's existing
failure behavior (abort for exhaustive32, stop-flag for exhaustive32_64).
2026-06-01 21:09:46 +01:00
Daniel Lemire
ed861322d8
Merge pull request #382 from redis-performance/pr/four-digit-followup
Add a 4-digit SWAR follow-up to loop_parse_if_eight_digits (clang)
2026-06-01 15:45:15 -04:00
Daniel Lemire
0f682cd6eb
Merge pull request #381 from redis-performance/pr/integer-scan-unroll
Unroll the integer-part digit scan (straight-line for the common 1-5 digit case)
2026-06-01 13:44:06 -04:00
fcostaoliveira
b20c420964 tests: parallelize the exhaustive midpoint sweep across hardware threads 2026-06-01 13:01:10 +01:00
fcostaoliveira
7589a4fea5 Add a 4-digit SWAR follow-up to loop_parse_if_eight_digits (clang)
After the 8-digit SWAR block loop, consume a remaining 4-7 digit run in one
read4_to_u32 + parse_four_digits_unrolled step instead of byte-by-byte (reusing
the existing 4-digit helpers). The parsed result is identical; this is purely a
faster way to consume the same digits.
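
For readers unfamiliar with SWAR digit parsing, a 4-digit step can be sketched as below. This is an illustration of the general technique, not the library's helpers: all names ending in `_sketch` are hypothetical, and the word is assembled byte by byte so the example does not depend on host endianness.

```cpp
#include <cstdint>
#include <cstring>

// Load four bytes as a little-endian 32-bit word (first character in the
// lowest byte), independent of the host byte order.
inline std::uint32_t load4_le_sketch(const char *p) {
  unsigned char b[4];
  std::memcpy(b, p, 4);
  return std::uint32_t(b[0]) | (std::uint32_t(b[1]) << 8) |
         (std::uint32_t(b[2]) << 16) | (std::uint32_t(b[3]) << 24);
}

// True when all four bytes are ASCII '0'..'9'. A byte above '9' sets its
// high bit after adding 0x46; a byte below '0' sets it after subtracting 0x30.
inline bool four_digits_sketch(std::uint32_t v) {
  return (((v + 0x46464646u) | (v - 0x30303030u)) & 0x80808080u) == 0;
}

// Convert four ASCII digits held in one word to their numeric value.
inline std::uint32_t parse_four_sketch(std::uint32_t v) {
  v -= 0x30303030u;      // ASCII to digit values, one per byte
  v = v * 10 + (v >> 8); // bytes 0 and 2 now hold two-digit pairs
  // Combine the two pairs: 100 * first_pair + second_pair.
  return (((v & 0x00FF00FFu) * 0x00640001u) >> 16) & 0xFFFFu;
}

// Consume exactly four digits at p if available, folding them into i.
inline const char *consume_four_sketch(const char *p, const char *pend,
                                       std::uint64_t &i) {
  if (pend - p >= 4) {
    const std::uint32_t v = load4_le_sketch(p);
    if (four_digits_sketch(v)) {
      i = i * 10000 + parse_four_sketch(v);
      p += 4;
    }
  }
  return p; // any remaining digits are left to a byte-by-byte loop
}
```

A remainder of 4 to 7 digits takes this branch once and then falls back to the scalar loop for at most three more digits.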

Gated to clang: on gcc the extra 4-digit check regresses inputs whose remainder
is < 4 digits (e.g. the 17-digit fraction of uniform [0,1] -> -3% on 'random'),
because the check becomes pure overhead there; clang does not show that.

m8g.metal-24xl (Graviton4), -O3 -march=native, simple_fastfloat_benchmark,
from_chars->double, clang 18, base vs patch back-to-back (2 samples):
  canada.txt +11.7%, mesh.txt +7.4%, random ~flat. No regression.
2026-06-01 11:55:50 +01:00
fcostaoliveira
b64d014e2f Unroll the integer-part digit scan (straight-line for the common 1-5 digit case)
parse_number_string scans the integer part one byte at a time in a while loop,
while the fraction already uses the 8-digit SWAR loop. Most integer parts are
1-5 digits, so the loop back-edge dominates. Peel the first five iterations into
nested ifs, falling through to the original while for longer runs. Semantics are
identical (i = 10*i + digit, advancing p); no behavior change.
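
The peel described above has roughly the following shape. This is a sketch under assumed names (`is_digit_sketch`, `scan_integer_peeled`) operating on plain `char`, not the code in parse_number_string.

```cpp
#include <cstdint>

inline bool is_digit_sketch(char c) { return c >= '0' && c <= '9'; }

// Accumulate a run of decimal digits starting at p into i and return the
// position after the run. The first five digits are handled by nested ifs
// (no loop back-edge); longer runs fall through to the original while loop.
inline const char *scan_integer_peeled(const char *p, const char *pend,
                                       std::uint64_t &i) {
  if (p != pend && is_digit_sketch(*p)) {
    i = 10 * i + std::uint64_t(*p - '0');
    ++p;
    if (p != pend && is_digit_sketch(*p)) {
      i = 10 * i + std::uint64_t(*p - '0');
      ++p;
      if (p != pend && is_digit_sketch(*p)) {
        i = 10 * i + std::uint64_t(*p - '0');
        ++p;
        if (p != pend && is_digit_sketch(*p)) {
          i = 10 * i + std::uint64_t(*p - '0');
          ++p;
          if (p != pend && is_digit_sketch(*p)) {
            i = 10 * i + std::uint64_t(*p - '0');
            ++p;
            // Sixth digit onwards: the unpeeled loop.
            while (p != pend && is_digit_sketch(*p)) {
              i = 10 * i + std::uint64_t(*p - '0');
              ++p;
            }
          }
        }
      }
    }
  }
  return p;
}
```

Each level performs the same update as one iteration of the loop, so the result matches the plain while loop for every input length.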

AWS m8g.metal-24xl (Graviton4), -O3 -march=native, simple_fastfloat_benchmark,
from_chars->double. base vs patch measured back-to-back, mean of 2 runs:
  canada: gcc +3.1%, clang +2.8%
  mesh:   gcc +5.4%, clang +5.1%
  random: ~flat (1-digit integer part)
No regression; gcc and clang agree.

Alternatives benchmarked and rejected: reusing loop_parse_if_eight_digits for the
integer part regressed 5-8% (integer parts are too short for 8-digit SWAR setup);
a counted for(k<5) loop matched on gcc but clang optimized it worse (canada -0.9%).
The explicit peel is the only form solidly positive on both compilers.
2026-06-01 09:55:08 +01:00
dependabot[bot]
b3ec8d89cf
Bump the github-actions group across 1 directory with 3 updates
Bumps the github-actions group with 3 updates in the / directory: [actions/setup-node](https://github.com/actions/setup-node), [mymindstorm/setup-emsdk](https://github.com/mymindstorm/setup-emsdk) and [jidicula/clang-format-action](https://github.com/jidicula/clang-format-action).


Updates `actions/setup-node` from 6.3.0 to 6.4.0
- [Release notes](https://github.com/actions/setup-node/releases)
- [Commits](53b83947a5...48b55a011b)

Updates `mymindstorm/setup-emsdk` from 14 to 16
- [Release notes](https://github.com/mymindstorm/setup-emsdk/releases)
- [Commits](6ab9eb1bda...4528d102f7)

Updates `jidicula/clang-format-action` from 4.17.0 to 4.18.0
- [Release notes](https://github.com/jidicula/clang-format-action/releases)
- [Commits](3a18028048...654a770daa)

---
updated-dependencies:
- dependency-name: actions/setup-node
  dependency-version: 6.4.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: github-actions
- dependency-name: jidicula/clang-format-action
  dependency-version: 4.18.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: github-actions
- dependency-name: mymindstorm/setup-emsdk
  dependency-version: '16'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-06-01 00:15:05 +00:00
Daniel Lemire
7790aa6231 gh pages 2026-05-27 13:11:22 -04:00
Daniel Lemire
ee866c2b92 Merge branch 'main' of github.com:fastfloat/fast_float 2026-05-27 12:43:52 -04:00
Daniel Lemire
baffc57197 new site 2026-05-27 12:43:38 -04:00
Daniel Lemire
3644f5137c
Update README.md to remove and modify content
Removed references to Redis and Valkey, and updated the description of the fast_float library.
2026-05-27 12:13:56 -04:00
Daniel Lemire
05087a303d 8.2.5 v8.2.5 2026-04-16 14:39:03 -04:00
Daniel Lemire
b2b1e203ba removing msys 32-bit 2026-04-16 14:38:27 -04:00
Daniel Lemire
b57ec064ad
Merge pull request #379 from mlippautz/patch-1
Replace std::min with ternary operators to avoid <algorithm> dependency
2026-04-16 14:37:13 -04:00
Michael Lippautz
001c04cc8a Remove <algorithm> include and replace std::min with ternary operators
Replaces uses of std::min with ternary operators in ascii_number.h, digit_comparison.h, and float_common.h to remove the dependency on the <algorithm> header in those files.
2026-04-16 17:17:19 +00:00
Michael Lippautz
b063de82c7
Include <algorithm> in float_common.h
`fastfloat_strncasecmp` relies on `std::min`.
2026-04-16 09:51:58 +02:00
Daniel Lemire
d7ad33a80e
Merge pull request #377 from BYVoid/fix-bazel-bzlmod
Fix Bazel build with bzlmod
2026-03-27 09:37:23 -04:00
Carbo
2027a39ba0 Update MODULE.bazel version to 8.2.4
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 14:42:55 +09:00
Carbo
9817d5ddaa Fix Bazel build with bzlmod by loading cc_library rule
With bzlmod, native rules like cc_library are no longer implicitly available
and must be explicitly loaded from rules_cc. Add the rules_cc dependency to
MODULE.bazel and the corresponding load statement to BUILD.bazel.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 14:39:16 +09:00