219 Commits

Author SHA1 Message Date
Lenard Szolnoki
6732e397d8 Add constexpr testing
When enabled, modify `verify` macro to also verify at compile time,
when the arguments are constant expressions.
2023-03-04 22:36:58 +00:00
Lenard Szolnoki
52618851fd Make all float_common.h functions constexpr in C++20 2023-03-03 22:43:52 +00:00
Lenard Szolnoki
6d2fb68f5c Simplify to_float
* Use right-sized uint type for bit fiddling
** This removes the need for special-casing on endianness
* Replace ternary with just shifting the sign at the right place
** This seems to improve codegen (fewer instructions, no cmov)
2023-03-01 23:39:01 +00:00
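The sign-shifting idea described in 6d2fb68f5c can be illustrated with a minimal sketch. It assumes an IEEE-754 binary32 `float`, and `assemble_float` is a hypothetical helper written for illustration, not the library's `to_float`. Because the bits are assembled in an integer of the same width as the float, no endianness special case is needed, and the sign is placed with a shift instead of a ternary.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Hypothetical helper (not the library's to_float): build a binary32 value
// from its sign, biased exponent, and 23 explicit mantissa bits.
inline float assemble_float(bool negative, std::uint32_t biased_exponent,
                            std::uint32_t mantissa) {
  assert(biased_exponent < 256);
  assert(mantissa < (std::uint32_t(1) << 23));
  std::uint32_t word = mantissa;          // bits 0..22
  word |= biased_exponent << 23;          // bits 23..30
  word |= std::uint32_t(negative) << 31;  // bit 31, no branch or ternary
  float value;
  std::memcpy(&value, &word, sizeof(value));  // same-size copy, endian-neutral
  return value;
}
```

For example, `assemble_float(true, 127, 0)` yields `-1.0f`, since a biased exponent of 127 encodes 2^0.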
Lenard Szolnoki
0e4b873d81 Fix space_lut so it's accepted by MSVC and clang 2023-02-25 18:27:10 +00:00
Lenard Szolnoki
a6991ea44f Add comment to the FASTFLOAT_CONSTEXPR14 macro definition 2023-02-25 11:11:09 +00:00
Lenard Szolnoki
be6084863c Low-risk C++14 constexpr functions 2023-02-25 10:50:45 +00:00
Daniel Lemire
3e2da540ef Support rccpfastfloat. 2023-01-19 20:28:10 -05:00
Sergey Fedorov
ff7fba01d0 float_common.h: add support for ppc32 2023-01-18 14:15:14 +08:00
Daniel Lemire
c8aac4a63d Guard endian 2023-01-07 13:28:12 -05:00
Joao Paulo Magalhaes
7f7838b36a Fix compile warning: implicit double->float type conversion
With Intel 2021.1:
```
/home/runner/work/c4core/c4core/src/c4/ext/fast_float_all.h:319:49: error: implicit conversion between floating point types of different sizes [-Werror,-Wimplicit-float-size-conversion]
constexpr static float powers_of_ten_float[] = {1e0, 1e1, 1e2, 1e3,
1e4, 1e5,
```
2022-12-27 11:09:17 +00:00
Daniel Lemire
29b1a03d5b Make sure that macros have actual values when defined (makes debugging easier) 2022-11-16 15:49:09 -05:00
Daniel Lemire
6ceb29a7e4 We might reenable clinger. 2022-11-16 16:21:34 +00:00
Daniel Lemire
a2cf502395 Typo. 2022-11-03 19:41:30 -04:00
Daniel Lemire
3e29bf78c7 Nicer constants. 2022-11-03 19:40:05 -04:00
Daniel Lemire
e958ff4269 Simplified clinger. 2022-11-03 18:51:37 -04:00
Sutou Kouhei
5a71e5bc40 Don't use __umulh() with MinGW on ARM64 2022-10-28 15:33:37 +09:00
Daniel Lemire
6876616f0f Update float_common.h 2022-08-04 15:05:22 -04:00
Daniel Lemire
ac81b01696 Added __EMSCRIPTEN__ patch 2022-08-04 13:58:48 -04:00
Jonathan Wakely
61f4840188 Make endianness detection more portable
The current check for endianness fails on platforms using newlib as the
C library, because it provides <machine/endian.h> not <endian.h>. This
could be fixed by adding `|| defined(__NEWLIB__)` to the check for
targets that provide <machine/endian.h> (i.e. BSD-like targets).

A more portable solution is to just check if the compiler has already
defined the necessary macros (which is true for GCC and Clang and Intel,
at least). Then no header is needed, and it works for platforms that
aren't explicitly listed in the conditionals.
2022-01-18 10:17:01 +00:00
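The approach described in 61f4840188 (prefer macros the compiler already defines over a platform header) might look roughly like the sketch below. `EXAMPLE_IS_BIG_ENDIAN` and both functions are hypothetical names used only for illustration; the fallback value of 0 is an assumption made here, and a real header would consult platform-specific headers at that point.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// GCC, Clang and Intel predefine these macros, so no <endian.h> or
// <machine/endian.h> is required when they are available.
#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__)
#define EXAMPLE_IS_BIG_ENDIAN (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#else
// Assumption for this sketch only: treat unknown targets as little endian.
#define EXAMPLE_IS_BIG_ENDIAN 0
#endif

// Runtime probe used to cross-check the compile-time answer.
inline bool runtime_is_big_endian() {
  const std::uint32_t probe = 1;
  unsigned char first_byte;
  std::memcpy(&first_byte, &probe, 1);
  return first_byte == 0;  // the low-order byte is stored last on big endian
}

// Debug-time sanity check that the macro agrees with the running machine.
inline void verify_endianness_macro() {
  assert(runtime_is_big_endian() == (EXAMPLE_IS_BIG_ENDIAN != 0));
}
```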
Antoine Pitrou
133099ab4e Fix #117: compilation warning with gcc 6.3.0
Fix the following warning:
```
/arrow/cpp/src/arrow/vendored/fast_float/digit_comparison.h:62:50: error: right shift count >= width of type [-Werror=shift-count-overflow]
       am.power2 = int32_t((bits & exponent_mask) >> binary_format<T>::mantissa_explicit_bits());
                           ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
2021-11-30 20:35:09 +01:00
Daniel Lemire
d148241404 Removing CXX20 support 2021-09-20 09:49:23 -04:00
Daniel Lemire
b334317dd2 Minor fixes 2021-09-14 21:31:34 -04:00
Alex Huszagh
fc0c8680a5 Implement the big-integer arithmetic algorithm.
Replaces the existing decimal implementation, for substantial
performance improvements with near-halfway cases. This is especially
fast with a large number of digits.

**Big Integer Implementation**

A small subset of big-integer arithmetic has been added, with the
`bigint` struct. It uses a stack-allocated vector with enough bits to
store the float with the largest number of significant digits. This is
log2(10^(769 + 342)), to account for the largest possible magnitude
exponent and number of digits (3600 bits), and then rounded up to 4k bits.

The limb size is determined by the architecture: most 64-bit
architectures have efficient 128-bit multiplication, either by a single
hardware instruction or 2 native multiplications for the high and low
bits. This includes x86_64, mips64, s390x, aarch64, powerpc64, riscv64,
and the only known exceptions are sparcv8 and sparcv9. Therefore, we
define a limb size of 64 bits on 64-bit architectures except SPARC;
otherwise we fall back to 32-bit limbs.
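As a rough illustration of the limb selection just described, the sketch below picks a limb type at compile time and derives a fixed limb count from a 4000-bit budget. All `example_` names are hypothetical, and using `UINTPTR_MAX` to detect a 64-bit target is an assumption of this sketch rather than the check the commit uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// 64-bit limbs on 64-bit targets except SPARC, 32-bit limbs otherwise.
#if UINTPTR_MAX > 0xFFFFFFFFu && !defined(__sparc__) && !defined(__sparc)
typedef std::uint64_t example_limb;
#else
typedef std::uint32_t example_limb;
#endif

constexpr std::size_t example_limb_bits = sizeof(example_limb) * 8;
constexpr std::size_t example_bigint_bits = 4000;  // "rounded up to 4k bits"
constexpr std::size_t example_bigint_limbs =
    example_bigint_bits / example_limb_bits;

// Number of limbs needed to hold a value of the given bit width.
inline std::size_t limbs_needed(std::size_t bits) {
  assert(bits <= example_bigint_bits);
  return (bits + example_limb_bits - 1) / example_limb_bits;
}
```

With 64-bit limbs this gives 62 limbs; with 32-bit limbs, 125.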

A simple stackvector is used, which just has operations to add elements,
index, and truncate the vector.

`bigint` is then just a wrapper around this, with methods for
big-integer arithmetic. For our algorithms, we just need multiplication
by a power (x * b^N), multiplication by a bigint or scalar value, and
addition by a bigint or scalar value. Scalar addition and multiplication
use compiler extensions when possible (__builtin_add_overflow and
__uint128_t); if not, we implement simple logic shown to optimize
well on MSVC. Big-integer multiplication is done via grade school
multiplication, which at these sizes is more efficient than any
asymptotically faster algorithm. Multiplication by a power is then done
via bitshifts for powers-of-two, and by iterative multiplications of a
large and then scalar value for powers-of-5.
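The scalar primitives mentioned above could be sketched as follows, assuming 64-bit limbs stored least significant first. The function names and the use of `std::vector` are choices made for this illustration (the commit uses its own stack-allocated vector), and the portable fallbacks are one possible formulation, not necessarily the logic the commit ships.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// x + y, reporting whether the sum wrapped around.
inline std::uint64_t scalar_add(std::uint64_t x, std::uint64_t y,
                                bool& overflow) {
#if defined(__GNUC__) && (__GNUC__ >= 5 || defined(__clang__))
  std::uint64_t z;
  overflow = __builtin_add_overflow(x, y, &z);
  return z;
#else
  const std::uint64_t z = x + y;
  overflow = z < x;  // unsigned wrap-around means the sum shrank
  return z;
#endif
}

// x * y + carry; returns the low 64 bits and stores the high 64 in carry.
inline std::uint64_t scalar_mul(std::uint64_t x, std::uint64_t y,
                                std::uint64_t& carry) {
#if defined(__SIZEOF_INT128__)
  const __uint128_t z = __uint128_t(x) * y + carry;
  carry = std::uint64_t(z >> 64);
  return std::uint64_t(z);
#else
  // Portable fallback: schoolbook multiplication on 32-bit halves.
  const std::uint64_t m = 0xFFFFFFFFu;
  const std::uint64_t ll = (x & m) * (y & m), lh = (x & m) * (y >> 32);
  const std::uint64_t hl = (x >> 32) * (y & m), hh = (x >> 32) * (y >> 32);
  const std::uint64_t mid = (ll >> 32) + (lh & m) + (hl & m);
  std::uint64_t lo = (mid << 32) | (ll & m);
  std::uint64_t hi = hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
  lo += carry;
  hi += (lo < carry) ? 1 : 0;
  carry = hi;
  return lo;
#endif
}

// Multiply a little-endian limb vector by a scalar in place.
inline void small_mul(std::vector<std::uint64_t>& limbs, std::uint64_t y) {
  std::uint64_t carry = 0;
  for (std::uint64_t& limb : limbs) limb = scalar_mul(limb, y, carry);
  if (carry != 0) limbs.push_back(carry);
}
```

Multiplying the single limb `UINT64_MAX` by 2, for instance, grows the vector to two limbs: `UINT64_MAX - 1` followed by `1`.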

**compute_float**

Compute float has been slightly modified so if the algorithm cannot
round correctly, it returns a normalized, extended-precision adjusted
mantissa with the power2 shifted by INT16_MIN so the exponent is always
negative. `compute_error` and `compute_error_scaled` have been added.

**Digit Optimizations**

To improve performance for numbers with many digits,
`parse_eight_digits_unrolled` is used for both integers and fractions,
and uses a while loop rather than two nested if statements. This adds no
noticeable performance cost for common floats, but dramatically improves
performance for numbers with many digits (without these optimizations,
~65% of the total runtime cost is in parse_number_string).
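To make the eight-digits-at-a-time loop concrete, here is a self-contained sketch of the general SWAR technique. The helper names `load8` and `parse_digit_run` are hypothetical, the word is assembled byte by byte so the sketch does not depend on host endianness, and overflow of the accumulated value is deliberately left to the caller. It is an approximation of the idea, not the code from the commit.

```cpp
#include <cassert>
#include <cstdint>

// Pack 8 characters into a word with the first character in the low byte.
inline std::uint64_t load8(const char* p) {
  std::uint64_t word = 0;
  for (int i = 0; i < 8; ++i)
    word |= std::uint64_t(static_cast<unsigned char>(p[i])) << (8 * i);
  return word;
}

// True when every byte of val is an ASCII digit '0'..'9'.
inline bool is_made_of_eight_digits(std::uint64_t val) {
  return (((val + 0x4646464646464646) | (val - 0x3030303030303030)) &
          0x8080808080808080) == 0;
}

// Convert 8 packed ASCII digits into their numeric value.
inline std::uint32_t parse_eight_digits(std::uint64_t val) {
  const std::uint64_t mask = 0x000000FF000000FF;
  const std::uint64_t mul1 = 0x000F424000000064;  // 100 + (1000000 << 32)
  const std::uint64_t mul2 = 0x0000271000000001;  // 1 + (10000 << 32)
  val -= 0x3030303030303030;                      // '0'..'9' -> 0..9 per byte
  val = (val * 10) + (val >> 8);                  // combine digit pairs
  val = (((val & mask) * mul1) + (((val >> 16) & mask) * mul2)) >> 32;
  return std::uint32_t(val);
}

// Consume a run of digits, eight at a time while possible, then one by one.
// Advances p past the digits; the result wraps on overflow.
inline std::uint64_t parse_digit_run(const char*& p, const char* end) {
  std::uint64_t value = 0;
  while (end - p >= 8 && is_made_of_eight_digits(load8(p))) {
    value = value * 100000000 + parse_eight_digits(load8(p));
    p += 8;
  }
  while (p != end && *p >= '0' && *p <= '9') {
    value = value * 10 + std::uint64_t(*p - '0');
    ++p;
  }
  return value;
}
```

The `while` loop is what lets long digit runs stay on the fast path for as many 8-byte blocks as the input provides.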

**Parsed Number**

Two fields have been added to `parsed_number_string`, which contain
slices of the integer and fraction digits. This is extremely cheap, since
the work is already done, and the strings are pre-tokenized during
parsing. This allows us on overflow to re-parse these tokenized strings,
without checking if each character is an integer. Likewise, for the
big-integer algorithms, we can merely re-parse the pre-tokenized
strings.

**Slow Algorithm**

The new algorithm is `digit_comp`, which takes the parsed number string
and the `adjusted_mantissa` from `compute_float`. The significant digits
are parsed into a big integer, and the exponent relative to the
significant digits is calculated. If the exponent is >= 0, we use
`positive_digit_comp`, otherwise, we use `negative_digit_comp`.

`positive_digit_comp` is quite simple: we scale the significant digits
to the exponent, and then we get the high 64-bits for the native float,
determine if any lower bits were truncated, and use that to direct
rounding.
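The "high 64 bits plus truncation flag" step can be sketched like this, assuming 64-bit limbs stored least significant first with a non-zero top limb. `high_bits`, `leading_zeros`, and `hi64` here are illustrative stand-ins written for this note; the portable bit-counting loop is used only to keep the sketch free of compiler builtins.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct high_bits {
  std::uint64_t bits;  // top 64 bits, normalized so bit 63 is set
  bool truncated;      // true if any discarded lower bit was non-zero
};

// Count leading zero bits of a non-zero value.
inline int leading_zeros(std::uint64_t x) {
  assert(x != 0);
  int n = 0;
  while ((x & (std::uint64_t(1) << 63)) == 0) {
    x <<= 1;
    ++n;
  }
  return n;
}

inline high_bits hi64(const std::vector<std::uint64_t>& limbs) {
  assert(!limbs.empty() && limbs.back() != 0);
  const std::size_t n = limbs.size();
  const int shift = leading_zeros(limbs[n - 1]);
  high_bits out{limbs[n - 1] << shift, false};
  if (n >= 2) {
    const std::uint64_t next = limbs[n - 2];
    if (shift > 0) {
      out.bits |= next >> (64 - shift);      // borrow bits from the next limb
      out.truncated = (next << shift) != 0;  // leftover bits of that limb
    } else {
      out.truncated = next != 0;
    }
    // Any non-zero limb further down also counts as truncated.
    for (std::size_t i = 0; i + 2 < n; ++i) out.truncated |= limbs[i] != 0;
  }
  return out;
}
```

The `truncated` flag is what distinguishes an exact halfway case from one that lies just above halfway when rounding.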

`negative_digit_comp` is a little more complex, but still straightforward:
we use the parsed significant digits as the real digits, and calculate
the theoretical digits from `b+h`, the halfway point between `b` and
`b+u`, the next-positive float. To get `b`, we round the adjusted
mantissa down, create an extended-precision representation, and
calculate the halfway point. We now have a base-10 exponent for the real
digits, and a base-2 exponent for the theoretical digits. We scale these
two to the same exponent by multiplying the theoretical digits by
`5**-real_exp`. We then get the base-2 exponent as `theor_exp -
real_exp`, and if this is positive, we multiply the theoretical digits by
the corresponding power of two, otherwise, we multiply the real digits by
it. Now, both are scaled to the same magnitude, and we simply compare the
digits in the big integer, and use that to direct rounding.
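The final compare-and-round step can be sketched as below. It assumes both big integers are already scaled to the same magnitude and stored as little-endian limb vectors without leading zero limbs; `compare` and `round_up` are hypothetical names, and round-to-nearest with ties to even is assumed as the rounding mode.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Three-way comparison of two normalized little-endian limb vectors.
inline int compare(const std::vector<std::uint64_t>& a,
                   const std::vector<std::uint64_t>& b) {
  assert(a.empty() || a.back() != 0);
  assert(b.empty() || b.back() != 0);
  if (a.size() != b.size()) return a.size() > b.size() ? 1 : -1;
  for (std::size_t i = a.size(); i-- > 0;) {
    if (a[i] != b[i]) return a[i] > b[i] ? 1 : -1;
  }
  return 0;
}

// ord is compare(real_digits, theoretical_digits), where the theoretical
// digits represent the halfway point b+h. Above halfway rounds up to b+u,
// below halfway stays at b, and an exact tie goes to the even mantissa.
inline bool round_up(int ord, bool mantissa_is_odd) {
  if (ord > 0) return true;
  if (ord < 0) return false;
  return mantissa_is_odd;
}
```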

**Rust-Isms**

A few Rust-isms have been added, since they simplify logic assertions.
These can be trivially removed or reworked, as needed.

- a `slice` type has been added, which is a pointer and length.
- `FASTFLOAT_ASSERT`, `FASTFLOAT_DEBUG_ASSERT`, and `FASTFLOAT_TRY` have
  been added
  - `FASTFLOAT_ASSERT` aborts, even in release builds, if the condition
    fails.
  - `FASTFLOAT_DEBUG_ASSERT` defaults to `assert`, for logic errors.
  - `FASTFLOAT_TRY` is like a Rust `Option` type, which propagates
    errors.

Specifically, `FASTFLOAT_TRY` is useful in combination with
`FASTFLOAT_ASSERT` to ensure there are no memory corruption errors
possible in the big-integer arithmetic. Although the `bigint` type
ensures we have enough storage for all valid floats, memory issues are
quite a severe class of vulnerabilities, and due to the low performance
cost of checks, we abort if we would have out-of-bounds writes. This can
only occur when we are adding items to the vector, which is a very small
number of steps. Therefore, we abort if our memory safety guarantees
ever fail. lexical has never aborted, so it's unlikely we will ever fail
these guarantees.
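A minimal sketch of how such macros and a bounds-checked stack vector can fit together is shown below. The `EXAMPLE_` macros, `small_stackvec`, and `append_pair` are hypothetical stand-ins invented for this note; the real `FASTFLOAT_ASSERT` and `FASTFLOAT_TRY` definitions may differ in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Aborts even in release builds if the condition fails.
#define EXAMPLE_ASSERT(x) \
  do {                    \
    if (!(x)) std::abort(); \
  } while (0)

// Propagates failure to the caller, similar in spirit to Rust's `?`.
#define EXAMPLE_TRY(x)      \
  do {                      \
    if (!(x)) return false; \
  } while (0)

// Fixed-capacity vector living entirely on the stack.
template <typename T, std::size_t Capacity>
struct small_stackvec {
  T data[Capacity];
  std::size_t length = 0;

  // Returns false instead of writing out of bounds.
  bool try_push(T value) {
    if (length < Capacity) {
      data[length++] = value;
      return true;
    }
    return false;
  }
};

// Inner helper: reports failure upward without deciding what to do about it.
inline bool append_pair(small_stackvec<int, 3>& v, int a, int b) {
  EXAMPLE_TRY(v.try_push(a));
  EXAMPLE_TRY(v.try_push(b));
  return true;
}
```

A top-level caller would then wrap the call as `EXAMPLE_ASSERT(append_pair(v, a, b));`, so an exhausted buffer aborts the program rather than corrupting memory.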
2021-09-10 18:53:53 -05:00
Jonas Rahlf
4e13ec151b check for HAS_CXX20_CONSTEXPR before attempting to do c++20 stuff 2021-09-02 23:20:28 +02:00
Jonas Rahlf
e5d5e576a6 use #if defined __has_include properly 2021-09-02 22:22:03 +02:00
Jonas Rahlf
b17eafd06f change compiler check for bit_cast so it compiles with older compilers 2021-09-02 22:00:57 +02:00
Jonas Rahlf
d8ee88e7f6 initial version with working constexpr for c++20 compliant compilers 2021-09-01 00:52:25 +02:00
Daniel Lemire
94c78adb2e Typo 2021-06-07 10:34:44 -04:00
Daniel Lemire
93a2c79cf2 Adding m_arm detection. 2021-06-07 10:27:52 -04:00
Daniel Lemire
f54b41c09e Tweak for 32-bit Windows 2021-06-07 09:14:09 -04:00
Daniel Lemire
06e61729c9 making constexpr as inline. 2021-06-01 09:46:43 -04:00
Alex Huszagh
b712b6f9a5 Add support for other architectures. 2021-05-24 11:37:38 -05:00
Eugene Golushkov
87e5a95585 Prevent fast_float::from_chars from parsing whitespaces and leading '+' sign, similar to MSVC and integer LLVM std::from_chars behavior. See C++17 20.19.3.(7.1) and http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0067r5.html 2021-03-04 20:21:45 +02:00
Tim Paine
48d30f789b add support for missing flag on emscripten 2021-03-01 22:43:57 -05:00
Neal Richardson
ca0a4646e9 Locate endian header on Solaris 2021-02-17 13:46:04 -08:00
Daniel Lemire
192b271c12 Removing dead code 2021-01-07 18:03:33 -05:00
Daniel Lemire
a27fcc230d This should be mostly correct. 2021-01-07 17:46:47 -05:00
Daniel Lemire
ca51b646c8 Update float_common.h 2021-01-07 16:44:39 -05:00
Daniel Lemire
002966323c Update float_common.h 2021-01-07 16:44:07 -05:00
Daniel Lemire
47ffc1303b Removing spurious 's'. 2020-12-29 15:29:46 -05:00
Daniel Lemire
a1a7347464 Minor tweaks to better handle cygwin/clang. 2020-12-22 15:55:48 -05:00
Joao Paulo Magalhaes
e65f977135 fix: never include iostream unless it's absolutely necessary 2020-11-24 00:24:17 +00:00
Daniel Lemire
4583e75e3e Merge branch 'main' into dlemire/extended_fast_path 2020-11-22 13:10:07 -05:00
Joao Paulo Magalhaes
037136a966 fix: add bitness for ppc64le 2020-11-21 19:01:26 +00:00
Joao Paulo Magalhaes
ed6664d93e add bitness for s390 2020-11-21 09:08:11 +01:00
Daniel Lemire
7bf5db7216 Tuning. 2020-11-20 17:05:06 -05:00
Joao Paulo Magalhaes
9afc814fb6 tidy float_common.h: put feature test macros at the top 2020-11-20 09:44:27 +00:00
Joao Paulo Magalhaes
bfa33b3ed1 fix mingw compile errors 2020-11-20 00:48:21 +00:00
Joao Paulo Magalhaes
f7b13da349 fix: readjust full_multiplication() and leading_zeroes() on windows 2020-11-20 00:48:20 +00:00
Joao Paulo Magalhaes
5ce64de524 fix: full 64bit multiplication working on 32bit gcc/clang 2020-11-20 00:48:20 +00:00
Joao Paulo Magalhaes
8a04a06a88 leading_zeroes(): 0 is not a valid input 2020-11-20 00:48:20 +00:00
Joao Paulo Magalhaes
449c628645 __emulu() is needed for mingw32 2020-11-20 00:48:19 +00:00
Joao Paulo Magalhaes
c4693cc86f re #33: win32 is working 2020-11-20 00:48:19 +00:00
Joao Paulo Magalhaes
829ac72f87 re #33: 32bit version. gcc compiles successfully, fails tests. 2020-11-20 00:48:19 +00:00
Daniel Lemire
1b5e3f3945 patching be support. (typo) 2020-11-16 12:56:57 -05:00
Daniel Lemire
7ff364b59a This might add support for big endian systems (untested). 2020-11-16 12:04:57 -05:00
Daniel Lemire
8a0a0c4fc1 Being pedantic. 2020-11-15 14:51:54 -05:00
Daniel Lemire
e5917323ec Pedantic member initialization. 2020-11-15 14:47:43 -05:00
Daniel Lemire
e79741ede2 Minor cleaning. 2020-11-12 22:35:32 -05:00
Daniel Lemire
1e92d59997 Sign conversion pedantry. 2020-11-11 20:43:36 -05:00
Maksim Kita
68633178d5 Fixed odr with inlining and anonymous namespace 2020-11-08 15:20:11 +03:00
Daniel Lemire
288efd35eb Minor cleaning. 2020-11-02 21:42:01 -05:00
Daniel Lemire
47d3d443d8 Minor fix. 2020-10-27 21:26:11 -04:00
Daniel Lemire
c53bfc4176 Minor tweak. 2020-10-27 20:10:42 -04:00
Daniel Lemire
eb1103393e Minor tuning. 2020-10-27 19:34:21 -04:00
Daniel Lemire
05ad45dfb5 Let us try the long path. 2020-10-27 18:26:16 -04:00
Daniel Lemire
59f4535adf This branch improves portability (under Windows). 2020-10-21 16:44:54 -04:00
Daniel Lemire
8a43fdb6a1 Minor fix (credit: @pitrou) 2020-10-19 15:11:44 -04:00
Daniel Lemire
1701be0224 First commit 2020-10-19 12:38:13 -04:00