221 Commits

Author SHA1 Message Date
Daniel Lemire
fd970ab05e updating visual studio 2026-06-13 21:41:53 -04:00
Daniel Lemire
a7249f86ed replace checked re-parse with O(1) simdjson-style overflow check
The previous commit detects multi-wrap u64 overflow at the max_digits
boundary by re-parsing the digits through a checked multiply-add loop
(O(max_digits)). Replace that with the constant-time check used in
simdjson: the leading digit plus a single threshold comparison.

For a max_digits-length value, min_safe_u64(base) == base^(max_digits-1)
is the smallest such value and also the width of each leading-digit band
[d*ms, (d+1)*ms). Since that width is < 2^64, the only band that can
straddle 2^64 is d == dmax (the largest leading digit that still fits),
and there it straddles at most once, so a single threshold dmax*ms
separates wrapped from non-wrapped values. A leading digit above dmax
always overflows; below dmax always fits. dmax and the threshold derive
from the existing min_safe_u64 table, so no new tables are needed and
dmax*ms cannot itself overflow.
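
The argument above can be sketched in a few lines of C++. This is an illustration only, not the code from the commit: the names max_digits_u64, min_safe_u64_sketch, digit_value and overflows_at_max_digits are hypothetical, and the sketch recomputes base^(max_digits-1) in a loop where the library is described as using its existing min_safe_u64 table. It assumes the input is exactly max_digits valid digits for the base.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Value of an alphanumeric digit, or -1 if c is not one (hypothetical helper).
inline int digit_value(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'z') return c - 'a' + 10;
  if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
  return -1;
}

// Number of digits needed to write UINT64_MAX in the given base (2..36).
inline int max_digits_u64(unsigned base) {
  int n = 0;
  for (uint64_t v = UINT64_MAX; v != 0; v /= base) ++n;
  return n;
}

// base^(max_digits-1): the smallest max_digits-length value, and also the
// width of each leading-digit band. It always fits in 64 bits.
inline uint64_t min_safe_u64_sketch(unsigned base) {
  uint64_t ms = 1;
  for (int i = 1; i < max_digits_u64(base); ++i) ms *= base;
  return ms;
}

// Assumes s holds exactly max_digits valid digits for `base`.
// Returns true if the value of s does not fit in a uint64_t.
inline bool overflows_at_max_digits(const std::string &s, unsigned base) {
  assert(static_cast<int>(s.size()) == max_digits_u64(base));
  uint64_t wrapped = 0; // value of s modulo 2^64 (unsigned wrap is defined)
  for (char c : s) wrapped = wrapped * base + static_cast<uint64_t>(digit_value(c));
  const uint64_t ms = min_safe_u64_sketch(base);
  const uint64_t dmax = UINT64_MAX / ms; // largest leading digit that still fits
  const uint64_t lead = static_cast<uint64_t>(digit_value(s[0]));
  if (lead != dmax) return lead > dmax; // above dmax: overflow; below: fits
  // lead == dmax: the band wraps at most once, so one threshold decides.
  return wrapped < dmax * ms;
}
```

With these definitions, a base-10 input such as 30000000000000000000 is rejected by the leading-digit test alone, without relying on the wrapped value.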

Add a programmatic, self-verifying test for parse_int_string overflow
detection covering bases 2..36, complementing the hand-picked strings
added earlier. Every generated input is cross-checked against an
independent trusted oracle (a plain 64-bit checked multiply-add); on
success the parsed value is also compared exactly and full consumption
of the input is asserted.
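
For illustration, an oracle of this kind could look like the sketch below. It is an assumption about the general shape of a checked multiply-add, not the test code from the commit; checked_parse_u64 and digit_value are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Value of an alphanumeric digit, or -1 if c is not one (hypothetical helper).
inline int digit_value(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'z') return c - 'a' + 10;
  if (c >= 'A' && c <= 'Z') return c - 'A' + 10;
  return -1;
}

// Parses all of s as an unsigned integer in `base` (2..36).
// Returns false on an empty string, an invalid digit, or uint64 overflow;
// on success stores the exact value in `out`.
inline bool checked_parse_u64(const std::string &s, unsigned base, uint64_t &out) {
  if (s.empty()) return false;
  uint64_t v = 0;
  for (char c : s) {
    const int d = digit_value(c);
    if (d < 0 || static_cast<unsigned>(d) >= base) return false;
    // v * base + d <= UINT64_MAX  <=>  v <= (UINT64_MAX - d) / base,
    // so the check itself never overflows.
    if (v > (UINT64_MAX - static_cast<uint64_t>(d)) / base) return false;
    v = v * base + static_cast<uint64_t>(d);
  }
  out = v;
  return true;
}
```

Because every step is checked before the multiply-add is performed, the oracle is slow but easy to trust, which is the property a cross-check needs.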

Per base it exercises:
  - an exact-boundary sweep of the 64 values straddling 2^64
    (UINT64_MAX-31 .. 2^64+31), built by walking the digit string;
  - UINT64_MAX, 2^64 and the all-max-digit value, each also with
    leading zeros;
  - random max_digits-length values across every leading digit, with
    the heaviest sampling on the lead == dmax band that straddles 2^64,
    and full coverage of lead > dmax (the multi-wrap region the naive
    min_safe check accepted by mistake);
  - max_digits-1 (never overflows) and max_digits+1 (always overflows).
A small signed (int64_t) section checks the exact INT64_MIN/INT64_MAX
limits round-trip and that INT64_MAX+1 / INT64_MIN-1 are rejected in
every base.
2026-06-13 21:34:34 -04:00
sahvx655-wq
632cc97b5b detect uint64 overflow that wraps past min_safe in parse_int_string 2026-06-13 21:21:29 -04:00
Daniel Lemire
8234a89623 8.2.9 2026-06-11 20:29:24 -04:00
sahvx655-wq
82882b237d gate uint8/uint16 base-10 fast paths to single-byte code units 2026-06-10 12:12:34 +05:30
Daniel Lemire
937198691a
Merge pull request #389 from correctmost/cm/remove-unreachable-return
Remove an unreachable return statement
2026-06-09 11:18:15 -04:00
correctmost
6ae691372f Remove an else if statement that is always false
Commit b334317d added the same std::isnan(v) check as an earlier
condition.

The warning was reported by cppcheck.
2026-06-09 03:48:54 -04:00
correctmost
8fe7a9405b Remove an unreachable return statement
The redundant statement was reported by cppcheck.
2026-06-09 03:37:58 -04:00
fcostaoliveira
b642d9202f tests: parallelize exhaustive32 and exhaustive32_64 sweeps too
Same std::thread split as exhaustive32_midpoint; preserves each test's existing
failure behavior (abort for exhaustive32, stop-flag for exhaustive32_64).
2026-06-01 21:09:46 +01:00
fcostaoliveira
b20c420964 tests: parallelize the exhaustive midpoint sweep across hardware threads 2026-06-01 13:01:10 +01:00
Daniel Lemire
50c19fad17 init 2026-03-10 11:53:45 -04:00
재욱
3e2b5d3dc3 refactor verification calls for double and float limits 2026-02-04 15:36:31 +09:00
재욱
f43d6711bc Add additional verification cases for double and float limits 2026-02-04 15:27:46 +09:00
Shikhar
97cb3ec28d lint
Signed-off-by: Shikhar <shikharish05@gmail.com>
2025-12-25 03:06:22 +05:30
Daniel Lemire
120bdfd713 adding some ipv4 test 2025-12-24 15:43:43 -05:00
Daniel Lemire
6b72e26ba7 documenting better which types we support 2025-12-15 10:28:06 -05:00
Pavel Novikov
88f6c5e367
Added corner cases around max value/infinity 2025-10-04 14:39:00 +03:00
Pavel Novikov
e9438e64ba
fixed copy&paste error and minor mess 2025-09-29 19:54:25 +03:00
Pavel Novikov
7abb574ffc
added doc to README and examples 2025-09-29 13:00:40 +03:00
Pavel Novikov
01e505797b
added tests + some refactoring 2025-09-18 21:29:25 +03:00
Daniel Lemire
e20c952456
Merge pull request #320 from toughengineer/int_multiplication_by_power_of_10
Implemented multiplication of integer by a power of 10
2025-09-18 07:48:09 -06:00
InvalidUsernameException
9d81c71aef Do not mis-parse certain wide-character emojis as integer
When calling ch_to_digit() with a UTF-16 or UTF-32 code unit, it simply
truncates away any data stored in the non-low byte(s) of the code unit.
It then uses a lookup table to determine whether the low byte
corresponds to an ASCII digit. This is incorrect because as soon as any
bit outside the low byte is set, the number will never correspond to an
ASCII digit anymore.

To fix this, we produce a mask that is all zeroes if any bit outside the
low byte is set in the code unit, all ones otherwise. ANDing this mask
with the original code unit forces the table lookup to return the
sentinel value from the zero-index if any high bit was set and causes
the code unit not to be parsed as an integer.
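
A minimal sketch of the masking idea follows. It is not the library's implementation: ch_to_digit_sketch, make_table and kNotDigit are hypothetical names, and the table is built at run time here only to keep the example short.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

constexpr uint8_t kNotDigit = 255; // hypothetical sentinel for "not a digit"

// 256-entry lookup table indexed by the low byte. Index 0 (NUL) holds the
// sentinel, which is what the masking trick relies on.
inline std::array<uint8_t, 256> make_table() {
  std::array<uint8_t, 256> t;
  t.fill(kNotDigit);
  for (int i = 0; i < 10; ++i)
    t[static_cast<std::size_t>('0' + i)] = static_cast<uint8_t>(i);
  for (int i = 0; i < 26; ++i) {
    t[static_cast<std::size_t>('a' + i)] = static_cast<uint8_t>(10 + i);
    t[static_cast<std::size_t>('A' + i)] = static_cast<uint8_t>(10 + i);
  }
  return t;
}

// Works for char, char16_t and char32_t code units.
template <typename UC> uint8_t ch_to_digit_sketch(UC c) {
  static const std::array<uint8_t, 256> table = make_table();
  using U = typename std::make_unsigned<UC>::type;
  const U u = static_cast<U>(c);
  // Bits of the code unit that lie outside the low byte.
  const U high = static_cast<U>(u & static_cast<U>(~static_cast<U>(0xFF)));
  // All ones if no high bit is set, all zeroes otherwise (branch-free).
  const U mask = static_cast<U>(-static_cast<U>(high == 0));
  // A code unit with high bits collapses to index 0, i.e. the sentinel.
  return table[static_cast<uint8_t>(u & mask)];
}
```

In this sketch the code point U+1F537, whose low byte equals that of the digit 7, maps to the sentinel, whereas plain truncation would have returned 7.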

This bug was discovered when loading Mastodon posts inside the Ladybird
browser where some of Mastodon's JavaScript would trigger the code path
that erroneously parsed the emoji as integer. It had the visible effect
that some digits inside the posts would get rendered as one of the
emojis that parsed to that digit. For more details see this issue:
https://github.com/LadybirdBrowser/ladybird/issues/6205

The emojis in the test case are simply all the emojis used on Mastodon
that caused the bug. They can be found here:
06803422da/app/javascript/mastodon/features/emoji/emoji_map.json
2025-09-15 23:12:28 +02:00
Pavel Novikov
e12463583f
added lacking overloads to avoid potential ambiguity 2025-09-06 00:12:41 +03:00
Pavel Novikov
6702cd4244
added doc section in the README,
added example code test executable
2025-09-05 13:36:23 +03:00
Pavel Novikov
20a7383442
renamed the function, cleaned up return type 2025-09-05 13:36:23 +03:00
Pavel Novikov
763558b9ac
cleaned up tests 2025-09-05 13:34:48 +03:00
Daniel Lemire
42db9ac1de
Merge branch 'main' into P2497R0 2025-09-03 12:04:36 -04:00
Pavel Novikov
cc90f240ee
added some tests 2025-09-02 23:03:38 +03:00
Daniel Lemire
2d2b42bb38 forked doctest 2025-06-03 18:15:52 -04:00
Daniel Lemire
73b27b7d68 hmmm 2025-06-02 09:52:34 -04:00
Daniel Lemire
a1e272f515 lint 2025-05-19 18:16:14 -04:00
Daniel Lemire
0458c20061 adding missing file 2025-05-19 18:09:34 -04:00
Daniel Lemire
81b8306c5f implementation of https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2497r0.html 2025-05-19 18:08:36 -04:00
Daniel Lemire
95dedd0aed turning json option into macro parameter 2025-03-09 15:13:43 -04:00
Daniel Lemire
717112d257 lint 2025-02-06 20:25:09 -05:00
Daniel Lemire
f0c709e3e4 ignoring failures 2025-02-06 20:24:43 -05:00
Anders Dalvander
f23ced2e4e fix for supplemental 2024-12-04 01:02:20 +01:00
Anders Dalvander
baaf58d2dd fix -Werror=maybe-uninitialized 2024-12-04 00:13:20 +01:00
Anders Dalvander
63bbefad6b templates and types 2024-12-03 23:47:21 +01:00
Anders Dalvander
ac453a091a overly precise tests for imprecise floats 2024-12-03 23:23:35 +01:00
Anders Dalvander
3b9ff76143 duplicate tests for both float and double 2024-12-03 23:23:34 +01:00
Anders Dalvander
c62b853648 float.rounds_to_nearest 2024-12-03 23:23:34 +01:00
Anders Dalvander
b3acae22ea fix parse_zero and parse_negative_zero output 2024-12-03 23:23:34 +01:00
Anders Dalvander
74e00e1401 fix double test in float region in basictest 2024-12-03 23:23:34 +01:00
Anders Dalvander
558bec8b9b fix logging in basictest 2024-12-03 23:23:34 +01:00
Daniel Lemire
6f8fd6728d make it build 2024-12-03 23:23:34 +01:00
Daniel Lemire
c526899951 cleaning. 2024-12-03 23:23:34 +01:00
Daniel Lemire
bfcff49c83 16-bit float support 2024-12-03 23:23:34 +01:00
Anders Dalvander
3775a81ced formatted code 2024-12-01 16:39:28 +01:00
Anders Dalvander
0a1bf11560 harmonize ifdef checks 2024-12-01 16:36:45 +01:00