fast_float

mirror of https://github.com/fastfloat/fast_float.git synced 2026-08-01 01:06:35 +08:00

Author	SHA1	Message	Date
fcostaoliveira	cb5d9cd9a4	Skip materializing the integer/fraction spans on the hot path parsed_number_string_t carries two span<UC const> members (integer, fraction) that are only read on the rare slow paths (digit_comp, and the >19-significant-digit truncation recompute). Materializing them on every parse forces the ~56/64-byte struct to be written out and marshaled through the by-value return, which shows up as backend/store pressure on the hot path. This adds a runtime `store_spans` flag (default true, so all existing callers are unchanged) to parse_number_string; from_chars_float_advanced parses with it false, attempts the Clinger and Eisel-Lemire fast paths inline, and only re-parses with spans on the two rare slow branches. The re-parse is pushed into a single `fastfloat_noinline` (noinline+cold) helper so the force-inlined hot scanner is emitted once rather than duplicated into the caller (without this the extra inline copies regress some targets, e.g. ARM gcc, by bloating the hot frame and lengthening the loop-carried dependency chain). A runtime flag is used deliberately rather than a template parameter: a template would create a second instantiation of the whole scanner whose icache cost wipes out the gain. Measured (per-parser microbench, median of 5, pinned core), fast_float from_chars <double>/<float>, vs the current tip: - Intel Ice Lake (Xeon 8360Y): +17-19% (gcc), Intel TMA shows backend-bound 26.0% -> 2.2% and retiring 60.3% -> 77.3% on short floats (the eliminated span spill), with -36% pipeline slots. - Intel Cascade Lake (Xeon 6248): +18-22% (gcc), +13-23% (clang). - ARM Neoverse-V2 (Graviton4): +73-196% (gcc), +8-11% (clang) -- the struct spill dominated the gcc hot loop there. Correctness: the full float exhaustive suite (exhaustive32, exhaustive32_64, exhaustive32_midpoint, random64) passes, and a 2^32 sweep is byte-identical to the current tip. Public from_chars / from_chars_advanced / parsed_number_string_t are unchanged.	2026-06-03 09:30:42 +01:00
Jonathan Wakely	1b11407da9	Fix spelling Run clang-format to reformat the long lines.	2026-06-02 15:30:37 +01:00
Daniel Lemire	cfd12ebcf1	8.2.6	2026-06-01 18:07:41 -04:00
Daniel Lemire	ed861322d8	Merge pull request #382 from redis-performance/pr/four-digit-followup Add a 4-digit SWAR follow-up to loop_parse_if_eight_digits (clang)	2026-06-01 15:45:15 -04:00
fcostaoliveira	7589a4fea5	Add a 4-digit SWAR follow-up to loop_parse_if_eight_digits (clang) After the 8-digit SWAR block loop, consume a remaining 4-7 digit run in one read4_to_u32 + parse_four_digits_unrolled step instead of byte-by-byte (reusing the existing 4-digit helpers). The parsed result is identical; this is purely a faster way to consume the same digits. Gated to clang: on gcc the extra 4-digit check regresses inputs whose remainder is < 4 digits (e.g. the 17-digit fraction of uniform [0,1] -> -3% on 'random'), because the check becomes pure overhead there; clang does not show that. m8g.metal-24xl (Graviton4), -O3 -march=native, simple_fastfloat_benchmark, from_chars->double, clang 18, base vs patch back-to-back (2 samples): canada.txt +11.7%, mesh.txt +7.4%, random ~flat. No regression.	2026-06-01 11:55:50 +01:00
fcostaoliveira	b64d014e2f	Unroll the integer-part digit scan (straight-line for the common 1-5 digit case) parse_number_string scans the integer part one byte at a time in a while loop, while the fraction already uses the 8-digit SWAR loop. Most integer parts are 1-5 digits, so the loop back-edge dominates. Peel the first five iterations into nested ifs, falling through to the original while for longer runs. Semantics are identical (i = 10*i + digit, advancing p); no behavior change. AWS m8g.metal-24xl (Graviton4), -O3 -march=native, simple_fastfloat_benchmark, from_chars->double. base vs patch measured back-to-back, mean of 2 runs: canada: gcc +3.1%, clang +2.8% mesh: gcc +5.4%, clang +5.1% random: ~flat (1-digit integer part) No regression; gcc and clang agree. Alternatives benchmarked and rejected: reusing loop_parse_if_eight_digits for the integer part regressed 5-8% (integer parts are too short for 8-digit SWAR setup); a counted for(k<5) loop matched on gcc but clang optimized it worse (canada -0.9%). The explicit peel is the only form solidly positive on both compilers.	2026-06-01 09:55:08 +01:00
Daniel Lemire	05087a303d	8.2.5	2026-04-16 14:39:03 -04:00
Michael Lippautz	001c04cc8a	Remove <algorithm> include and replace std::min with ternary operators Replaces uses of std::min with ternary operators in ascii_number.h, digit_comparison.h, and float_common.h to remove the dependency on the <algorithm> header in those files.	2026-04-16 17:17:19 +00:00
Michael Lippautz	b063de82c7	Include <algorithm> in float_common.h `fastfloat_strncasecmp` relies on `std::min`.	2026-04-16 09:51:58 +02:00
Daniel Lemire	18e55e48a8	lint	2026-03-10 17:06:04 -04:00
Daniel Lemire	eb9ab42c0a	8.2.4	2026-03-10 12:10:12 -04:00
Koleman Nix	2606bcdf2f	A few inlines	2026-03-07 15:36:09 -05:00
Xisco Fauli	3c6a64b87d	fix warning C4702: unreachable code	2026-02-06 11:28:34 +01:00
Daniel Lemire	01ce95dfe4	v8.2.3	2026-02-03 11:27:40 -05:00
sleepingieght	4fa83ccff4	fix early return error in fastfloat_strncasecmp	2026-01-21 19:21:06 +05:30
Daniel Lemire	babb1f3335	Merge pull request #356 from sleepingeight/surya/opt-cmp optimize fastfloat_strncasecmp	2026-01-18 18:56:05 -05:00
Shikhar	b14e6a466a	simpler optimizations Signed-off-by: Shikhar <shikharish05@gmail.com>	2026-01-02 05:21:01 +05:30
Shikhar	13d4b94183	small fix	2026-01-02 05:21:01 +05:30
Shikhar	d0af1cfdbd	optimize uint16 parsing Signed-off-by: Shikhar <shikharish05@gmail.com>	2026-01-02 05:21:01 +05:30
Daniel Lemire	d5bc4e1b2e	Merge pull request #358 from shikharish/uint8-base-fix add base check for uint8	2025-12-31 13:44:12 -05:00
Daniel Lemire	97b54ca9e7	v8.2.2	2025-12-31 13:12:46 -05:00
Shikhar	4dc5225797	add base check for uint8 parsing Signed-off-by: Shikhar <shikharish05@gmail.com>	2025-12-31 22:07:45 +05:30
Shikhar	fb522b66d0	fix endianness bug in uint8 parsing Signed-off-by: Shikhar <shikharish05@gmail.com>	2025-12-31 21:51:23 +05:30
sleepingieght	4eb0d806fa	add specialisations	2025-12-30 20:27:45 +05:30
sleepingieght	265cb849f3	optimise fastfloat_strncasecmp	2025-12-30 01:15:22 +05:30
Daniel Lemire	11ce67e5eb	v8.2.1	2025-12-29 11:09:40 -05:00
Daniel Lemire	f4f9da1e6b	fix for issue 354	2025-12-29 10:55:20 -05:00
Daniel Lemire	dd77fb5e4c	v8.2.0	2025-12-27 12:08:58 -05:00
Daniel Lemire	b4d26ec866	v8.1.1	2025-12-27 12:06:36 -05:00
Pavel Novikov	cb813a7765	fixed UB	2025-12-27 00:15:30 +03:00
Shikhar	780c341359	fix macro Signed-off-by: Shikhar <shikharish05@gmail.com>	2025-12-25 00:45:51 +05:30
Shikhar	fdb0eddf99	c++14 constexpr Signed-off-by: Shikhar <shikharish05@gmail.com>	2025-12-25 00:45:51 +05:30
Shikhar	fce0ab61df	uint8_t parsing Signed-off-by: Shikhar <shikharish05@gmail.com>	2025-12-25 00:45:51 +05:30
Raine 'Gravecat' Simmons	9d78a01ff7	Fixed formatting with clang-format	2025-11-22 21:53:37 +00:00
Raine 'Gravecat' Simmons	409d6215b4	Fixes compilation on GCC/MinGW	2025-11-22 16:11:06 +00:00
Pavel Novikov	1ea4d2563e	made function non-template +fixed a couple of typos	2025-09-30 12:18:29 +03:00
Daniel Lemire	7262d9454e	lint	2025-09-29 15:08:24 -04:00
Daniel Lemire	fd98fd6689	specialize for std::float32_t and std::float64_t explicitly credit: @lemire	2025-09-29 21:43:36 +03:00
Pavel Novikov	197c0ffca7	clang format	2025-09-29 21:43:35 +03:00
Pavel Novikov	13345cab65	added template overload for `integer_times_pow10()`	2025-09-18 21:29:25 +03:00
Daniel Lemire	88b1e5321c	version 8.1.0	2025-09-18 09:38:45 -06:00
Daniel Lemire	2aa6d0ba72	Merge pull request #326 from fastfloat/patch803 Release candidate 8.0.3	2025-09-18 09:37:20 -06:00
Daniel Lemire	0b6d911220	format	2025-09-18 08:30:28 -06:00
Pavel Novikov	7a77227521	minor fix of forward declaration	2025-09-18 17:02:29 +03:00
Daniel Lemire	e20c952456	Merge pull request #320 from toughengineer/int_multiplication_by_power_of_10 Implemented multiplication of integer by a power of 10	2025-09-18 07:48:09 -06:00
Daniel Lemire	bb956b29db	release candidate 8.0.3	2025-09-18 07:44:53 -06:00
Daniel Lemire	48fc5404d4	compatibility fix	2025-09-18 07:44:05 -06:00
InvalidUsernameException	9d81c71aef	Do not mis-parse certain wide-character emojis as integer When calling ch_to_digit() with a UTF-16 or UTF-32 code unit, it simply truncates away any data stored in the non-low byte(s) of the code unit. It then uses a lookup table to determine whether the low byte corresponds to an ASCII digit. This is incorrect because as soon as any bit outside the low byte is set, the number will never correspond to an ASCII digit anymore. To fix this, we produce a mask that is all zeroes if any bit outside the low byte is set in the code unit, all ones otherwise. ANDing this mask with the original code unit forces the table lookup to return the sentinel value from the zero-index if any high bit was set and causes the code unit not to be parsed as an integer. This bug was discovered when loading Mastodon posts inside the Ladybird browser where some of Mastodon's JavaScript would trigger the code path that erroneously parsed the emoji as an integer. It had the visible effect that some digits inside the posts would get rendered as one of the emojis that parsed to that digit. For more details see this issue: https://github.com/LadybirdBrowser/ladybird/issues/6205 The emojis in the test case are simply all the emojis used on Mastodon that caused the bug. They can be found here: `06803422da/app/javascript/mastodon/features/emoji/emoji_map.json`	2025-09-15 23:12:28 +02:00
WenLei	6677924083	float_common.h: Support RISC-V	2025-09-11 11:11:30 +08:00
Pavel Novikov	0a230326ab	now finally got the anti-ambiguity overloads right, right?	2025-09-06 02:22:43 +03:00
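
Note on 9d81c71aef: the masking fix described in that commit message can be sketched roughly as below. This is not the library's code. The names `digit_table`, `ch_to_digit_truncating` and `ch_to_digit_masked`, the sentinel value 255, and the widening to `std::uint32_t` are all assumptions made for illustration; only the idea (zero the index when any bit above the low byte is set, so the lookup lands on the sentinel at index 0) comes from the message.

```cpp
#include <array>
#include <cstdint>

// Hypothetical 256-entry table: digit value for '0'..'9', 255 (assumed
// sentinel) for every other byte. Index 0 therefore holds the sentinel.
inline const std::array<std::uint8_t, 256> &digit_table() {
  static const std::array<std::uint8_t, 256> table =
      []() -> std::array<std::uint8_t, 256> {
    std::array<std::uint8_t, 256> t;
    t.fill(255);
    for (int d = 0; d < 10; ++d) {
      t['0' + d] = static_cast<std::uint8_t>(d);
    }
    return t;
  }();
  return table;
}

// Behaviour the commit describes as buggy: the code unit is truncated to its
// low byte, so any wide code unit ending in 0x30..0x39 looks like a digit.
template <typename UC> std::uint8_t ch_to_digit_truncating(UC c) {
  return digit_table()[static_cast<std::uint8_t>(c)];
}

// Masked variant: mask is all ones when no bit above the low byte is set,
// and zero otherwise, so wide code units index entry 0 (the sentinel).
template <typename UC> std::uint8_t ch_to_digit_masked(UC c) {
  const std::uint32_t u = static_cast<std::uint32_t>(c);
  const std::uint32_t fits_in_byte = (u & ~std::uint32_t(0xFF)) == 0 ? 1u : 0u;
  const std::uint32_t mask = std::uint32_t(0) - fits_in_byte;
  return digit_table()[(u & mask) & 0xFF];
}
```

For example, the UTF-32 code unit 0x1F630 has low byte 0x30 (ASCII '0'): the truncating variant reports digit 0 for it, while the masked variant returns the sentinel. The same holds for a UTF-16 low surrogate such as 0xDE30.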

371 Commits