A way to hashing floating-point in Lua 5.4

Last updated on 8 minutes ago

Hash for Float

This function is going to create a integer hash value for floating-point input, and this artical is about to dive into the details. The calculation of hash for floating-point conceptlly is pretty intuitive. The core expression is n = frexp(n, &i); return (n * INT_MAX) + i; But, there are several subtleties to be considered...

Main function defines here:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
/*
** Hash for floating-point numbers.
** The main computation should be just
** n = frexp(n, &i); return (n * INT_MAX) + i
** but there are some numerical subtleties.
** In a two-complement representation, INT_MAX does not has an exact
** representation as a float, but INT_MIN does; because the absolute
** value of 'frexp' is smaller than 1 (unless 'n' is inf/NaN), the
** absolute value of the product 'frexp * -INT_MIN' is smaller or equal
** to INT_MAX. Next, the use of 'unsigned int' avoids overflows when
** adding 'i'; the use of '~u' (instead of '-u') avoids problems with
** INT_MIN.
*/
#if !defined(l_hashfloat)
static int l_hashfloat (lua_Number n) {
int i;
lua_Integer ni;
n = l_mathop(frexp)(n, &i) * -cast_num(INT_MIN);
if (!lua_numbertointeger(n, &ni)) { /* is 'n' inf/-inf/NaN? */
lua_assert(luai_numisnan(n) || l_mathop(fabs)(n) == cast_num(HUGE_VAL));
return 0;
}
else { /* normal case */
unsigned int u = cast_uint(i) + cast_uint(ni);
return cast_int(u <= cast_uint(INT_MAX) ? u : ~u);
}
}
#endif

Some auxiliary function defines here:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
/*
@@ lua_numbertointeger converts a float number with an integral value
** to an integer, or returns 0 if float is not within the range of
** a lua_Integer. (The range comparisons are tricky because of
** rounding. The tests here assume a two-complement representation,
** where MININTEGER always has an exact representation as a float;
** MAXINTEGER may not have one, and therefore its conversion to float
** may have an ill-defined value.)
*/
#define lua_numbertointeger(n,p) \
((n) >= (LUA_NUMBER)(LUA_MININTEGER) && \
(n) < -(LUA_NUMBER)(LUA_MININTEGER) && \
(*(p) = (LUA_INTEGER)(n), 1))


#define luai_numeq(a,b) ((a)==(b))
#define luai_numisnan(a) (!luai_numeq((a), (a)))

/* type casts (a macro highlights casts in the code) */
#define cast(t, exp) ((t)(exp))
#define cast_num(i) cast(lua_Number, (i))
#define cast_uint(i) cast(unsigned int, (i))

There are several key points to be considered:

  • How to check if a floating-point number is inf/-inf/nan:
1
2
3
4
5
6
7
8
9
10
11
12
double inf = INFINITY;
double ninf = -INFINITY;
double nan = NAN;

/* after including math.h header, we can directly use built-in macro */
bool c1 = isinf(inf);
bool c2 = isinf(ninf);
bool c3 = isnan(nan);

/* or, we can manually write a macro to check it */
#define isinf(n) (fabs(n) == HUGE_VAL) /* use built-in macro to represent int */
#define isnan(n) (!((n) == (n))) /* and equality check for nan results faile */
  • Implication of lua_numbertointeger

The main role of this function is checking if a integer number within the range of integer(which is long long) and converting it to integer. Take an example of long long for integer, LLONG_MAX can not be exactly represented by double but LLONG_MIN does. Besides, the absolute value of LLONG_MIN is one greater than LLONG_MAX. Therefore, if we want to check if a integer(say num) to be converted within the range of long long, we can manually check it by num >= double(LLONG_MIN) && num < double(-LLONG_MIN).

  • Implication of cast_int(u <= cast_uint(INT_MAX) ? u : ~u)

Firstly, u is unsigned int, which is always greater than and equal to zero. By contrast, int has negative value. Secondly, if u is nun-negative value, we just return it. If u is negative value, we cannot directly return -u, since abs(INT_MIN) = abs(INT_MAX) + 1, it will cause overflow. Therefore, we just flip all bits for u but not adding one(negative a negative value is equvalent to fliping all bits and adding one).


A way to hashing floating-point in Lua 5.4
https://nishikichisato.github.io/2024/12/21/LuaSource/hashfloat/
Author
Nishiki Chisato
Posted on
December 21, 2024
Updated on
December 21, 2024
Licensed under