[erlang-bugs] Mac OS X - trunc for large float causes ERTS_FP_CHECK_INIT at [...]: detected unhandled FPE at [...]

Wed May 4 01:48:20 CEST 2011

On Tue, May 3, 2011 at 3:35 PM, Mikael Pettersson <mikpe@REDACTED> wrote:
> On Tue, 3 May 2011 07:18:34 -0700, Bob Ippolito <bob@REDACTED> wrote:
>> On Tue, May 3, 2011 at 1:04 AM, Mikael Pettersson <mikpe@REDACTED> wrote:
>> > Bob Ippolito writes:
>> >  > I only see this error on Mac OS X. I have not been able to reproduce in Linux.
>> >  >
>> >  > Here's an example, any number larger than 16#7ffffffffffffe00 will
>> >  > cause this error.
>> >  >
>> >  > Erlang R14B02 (erts-5.8.3) [source] [64-bit] [smp:4:4] [rq:4]
>> >  > [async-threads:4] [hipe] [kernel-poll:true]
>> >  >
>> >  > Eshell V5.8.3  (abort with ^G)
>> >  > 1> trunc(16#7ffffffffffffdff * 1.0).
>> >  > 9223372036854774784
>> >  > 2> trunc(16#7ffffffffffffdff * 1.0).
>> >  > 9223372036854774784
>> >  > 3> trunc(16#7ffffffffffffe00 * 1.0).
>> >  > 9223372036854775808
>> >  > 4> trunc(16#7ffffffffffffe00 * 1.0).
>> >  > ERTS_FP_CHECK_INIT at 0x10086210: detected unhandled FPE at
>> >  > 0x19223372036854775808
>> >  > 5> trunc(16#7ffffffffffffe00 * 1.0).
>> >  > ERTS_FP_CHECK_INIT at 0x10086210: detected unhandled FPE at
>> >  > 0x19223372036854775808
>> >  > 6> io:format("~s~n", [os:cmd("uname -a")]).
>> >  > Darwin ba.local 10.7.0 Darwin Kernel Version 10.7.0: Sat Jan 29
>> >  > 15:17:16 PST 2011; root:xnu-1504.9.37~1/RELEASE_I386 i386
>> >  >
>> >  > Here's another example:
>> >  >
>> >  > Erlang R14B02 (erts-5.8.3) [source] [64-bit] [smp:4:4] [rq:4]
>> >  > [async-threads:4] [hipe] [kernel-poll:true]
>> >  >
>> >  > Eshell V5.8.3  (abort with ^G)
>> >  > 1> <<F/float>> = <<67,224,0,0,0,0,0,0>>, trunc(F).
>> >  > 9223372036854775808
>> >  > 2> <<F/float>> = <<67,224,0,0,0,0,0,0>>, trunc(F).
>> >  > ERTS_FP_CHECK_INIT at 0x10083e24: detected unhandled FPE at
>> >  > 0x19223372036854775808
>> >  > 3> <<F/float>> = <<67,224,0,0,0,0,0,0>>, trunc(F).
>> >  > ERTS_FP_CHECK_INIT at 0x10083e24: detected unhandled FPE at
>> >  > 0x19223372036854775808
>> >
>> > It means that the code at 0x19223372036854775808 in the
>> > Erlang VM needs to use the proper ERTS_FP_CHECK_<foo> macros.
>> >
>> > Please attach gdb (or whatever debugger Darwin uses) to a running
>> > Erlang VM and disassemble the code at 0x19223372036854775808.
>> > We need to know the name of the enclosing function, and preferably
>> > also the actual instruction sequence that throws the FPE. If gdb
>> > can show the exact original source code line then that's even better.
>> >
>> > If this is in libc rather than the Erlang VM itself, then we need
>> > a call trace to identify which code in the VM called out to this
>> > FP-throwing code.  For that you should probably plant a breakpoint
>> > at 0x19223372036854775808 and then evaluate one of those Erlang
>> > expressions above to trigger the FPE.
>> >
>>
>> Well, it's actually saying 0x1, the result of the trunc is
>> 9223372036854775808  and the message string is truncated to 64
>> characters which is not enough to show it all. Perhaps the buffer size
>> in erts_fp_check_init_error should be adjusted.
>
> Something in your terminal or email client ate a \r\n sequence after the
> 0x1 from erts_fp_check_init_error() making it appear glued together with
> the 9223372036854775808 that the erlang prompt printed.

Not my terminal or email client; this is a bug in
erts_fp_check_init_error. It only allocates a 64-byte buffer for the
error message. The rest of the pointer address and the \r\n are eaten
because the buffer is too small to fit the whole error message. buf[64]
is too small: the format string itself is already 57 chars (including
the terminating NUL).

void erts_fp_check_init_error(volatile unsigned long *fpexnp)
{
    char buf[64];
    snprintf(buf, sizeof buf,
         "ERTS_FP_CHECK_INIT at %p: detected unhandled FPE at %p\r\n",
         __builtin_return_address(0), (void*)*fpexnp);
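
For illustration, the truncation can be reproduced outside the VM. The sketch below is not ERTS code: format_fpe_message and fpe_message_truncated are hypothetical helpers that reuse only the format string quoted above. It assumes %p prints a pointer as 0x followed by hex digits, which is implementation-defined. With two 10-character pointers the full message needs 72 characters, but a 64-byte buffer keeps only 63, so the output stops right after "0x1" and the \r\n is lost.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of ERTS: formats the same message as
 * erts_fp_check_init_error() into a caller-supplied buffer. Returns
 * snprintf()'s result, i.e. the length the complete message would need
 * (excluding the terminating NUL), or a negative value on error. */
int format_fpe_message(char *buf, size_t size,
                       void *return_addr, void *fpe_addr)
{
    return snprintf(buf, size,
                    "ERTS_FP_CHECK_INIT at %p: detected unhandled FPE at %p\r\n",
                    return_addr, fpe_addr);
}

/* Returns 1 if a message needing `needed` characters did not fit into a
 * buffer of `size` bytes, 0 if it fit completely. */
int fpe_message_truncated(int needed, size_t size)
{
    return needed < 0 || (size_t)needed >= size;
}
```

Called with a 64-byte buffer and the pointers 0x10086210 and 0x10025433 from the sessions above, the return value exceeds 63 and the buffer ends in "0x1"; a 128-byte buffer holds the whole message including the trailing \r\n.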

> That 0x1 means that the #ifdef tests in erts/emulator/sys/unix/sys_float.c
> failed to enable the proper sigaction-style SIGFPE handler in your Erlang
> VM build, and instead fell back to a very primitive plain SIGFPE handler.
> If you run gdb or objdump on your Erlang VM (the beam.smp executable not
> the erl frontend) I bet you won't find an fpe_sig_action() function but
> a smallish fpe_sig_handler() one instead.
>
> The machine-specific fpe_sig_action() handler is absolutely mandatory for
> reliable FP exceptions.

Maybe you missed it in my previous email, it's not 0x1, it is
0x10025433. I showed that by breaking at the function that prints the
error.

Breakpoint 1, erts_fp_check_init_error (fpexnp=0x110f2528) at sys/unix/sys_float.c:87
87      {
(gdb) p (void*)*fpexnp
$1 = (void *) 0x10025433

There is an fpe_sig_action, and no fpe_sig_handler.

$ gdb beam.smp
GNU gdb 6.3.50-20050815 (Apple version gdb-1469) (Wed May  5 04:36:56 UTC 2010)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin"...Reading symbols for shared libraries .... done

(gdb) list fpe_sig_action
523	#elif defined(__sun__) && defined(__x86_64__)
524	#define mc_pc(mc)	((mc)->gregs[REG_RIP])
525	#endif
526	
527	static void fpe_sig_action(int sig, siginfo_t *si, void *puc)
528	{
529	    ucontext_t *uc = puc;
530	    unsigned long pc;
531	
532	#if defined(__linux__)
(gdb) list fpe_sig_handler
Function "fpe_sig_handler" not defined.

> The bug then is that the C compiler on your Mac OS X didn't set up the
> proper preprocessor symbols so that the sigaction-style handler could
> be enabled.
>
> What does `gcc -E -dM -xc /dev/null | sort' say? (For 'gcc' substitute
> whatever compiler and extra compiler options you used to build Erlang.)

#define OBJC_NEW_PROPERTIES 1
#define _LP64 1
#define __APPLE_CC__ 5664
#define __APPLE__ 1
#define __BLOCKS__ 1
#define __CHAR_BIT__ 8
#define __CONSTANT_CFSTRINGS__ 1
#define __DBL_DENORM_MIN__ 4.9406564584124654e-324
#define __DBL_DIG__ 15
#define __DBL_EPSILON__ 2.2204460492503131e-16
#define __DBL_HAS_DENORM__ 1
#define __DBL_HAS_INFINITY__ 1
#define __DBL_HAS_QUIET_NAN__ 1
#define __DBL_MANT_DIG__ 53
#define __DBL_MAX_10_EXP__ 308
#define __DBL_MAX_EXP__ 1024
#define __DBL_MAX__ 1.7976931348623157e+308
#define __DBL_MIN_10_EXP__ (-307)
#define __DBL_MIN_EXP__ (-1021)
#define __DBL_MIN__ 2.2250738585072014e-308
#define __DEC128_DEN__ 0.000000000000000000000000000000001E-6143DL
#define __DEC128_EPSILON__ 1E-33DL
#define __DEC128_MANT_DIG__ 34
#define __DEC128_MAX_EXP__ 6144
#define __DEC128_MAX__ 9.999999999999999999999999999999999E6144DL
#define __DEC128_MIN_EXP__ (-6143)
#define __DEC128_MIN__ 1E-6143DL
#define __DEC32_DEN__ 0.000001E-95DF
#define __DEC32_EPSILON__ 1E-6DF
#define __DEC32_MANT_DIG__ 7
#define __DEC32_MAX_EXP__ 96
#define __DEC32_MAX__ 9.999999E96DF
#define __DEC32_MIN_EXP__ (-95)
#define __DEC32_MIN__ 1E-95DF
#define __DEC64_DEN__ 0.000000000000001E-383DD
#define __DEC64_EPSILON__ 1E-15DD
#define __DEC64_MANT_DIG__ 16
#define __DEC64_MAX_EXP__ 384
#define __DEC64_MAX__ 9.999999999999999E384DD
#define __DEC64_MIN_EXP__ (-383)
#define __DEC64_MIN__ 1E-383DD
#define __DECIMAL_DIG__ 21
#define __DEC_EVAL_METHOD__ 2
#define __DYNAMIC__ 1
#define __ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__ 1067
#define __FINITE_MATH_ONLY__ 0
#define __FLT_DENORM_MIN__ 1.40129846e-45F
#define __FLT_DIG__ 6
#define __FLT_EPSILON__ 1.19209290e-7F
#define __FLT_EVAL_METHOD__ 0
#define __FLT_HAS_DENORM__ 1
#define __FLT_HAS_INFINITY__ 1
#define __FLT_HAS_QUIET_NAN__ 1
#define __FLT_MANT_DIG__ 24
#define __FLT_MAX_10_EXP__ 38
#define __FLT_MAX_EXP__ 128
#define __FLT_MAX__ 3.40282347e+38F
#define __FLT_MIN_10_EXP__ (-37)
#define __FLT_MIN_EXP__ (-125)
#define __FLT_MIN__ 1.17549435e-38F
#define __FLT_RADIX__ 2
#define __GNUC_GNU_INLINE__ 1
#define __GNUC_MINOR__ 2
#define __GNUC_PATCHLEVEL__ 1
#define __GNUC__ 4
#define __GXX_ABI_VERSION 1002
#define __INTMAX_MAX__ 9223372036854775807L
#define __INTMAX_TYPE__ long int
#define __INT_MAX__ 2147483647
#define __LDBL_DENORM_MIN__ 3.64519953188247460253e-4951L
#define __LDBL_DIG__ 18
#define __LDBL_EPSILON__ 1.08420217248550443401e-19L
#define __LDBL_HAS_DENORM__ 1
#define __LDBL_HAS_INFINITY__ 1
#define __LDBL_HAS_QUIET_NAN__ 1
#define __LDBL_MANT_DIG__ 64
#define __LDBL_MAX_10_EXP__ 4932
#define __LDBL_MAX_EXP__ 16384
#define __LDBL_MAX__ 1.18973149535723176502e+4932L
#define __LDBL_MIN_10_EXP__ (-4931)
#define __LDBL_MIN_EXP__ (-16381)
#define __LDBL_MIN__ 3.36210314311209350626e-4932L
#define __LITTLE_ENDIAN__ 1
#define __LONG_LONG_MAX__ 9223372036854775807LL
#define __LONG_MAX__ 9223372036854775807L
#define __LP64__ 1
#define __MACH__ 1
#define __MMX__ 1
#define __NO_INLINE__ 1
#define __PIC__ 2
#define __PTRDIFF_TYPE__ long int
#define __REGISTER_PREFIX__
#define __SCHAR_MAX__ 127
#define __SHRT_MAX__ 32767
#define __SIZE_TYPE__ long unsigned int
#define __SSE2_MATH__ 1
#define __SSE2__ 1
#define __SSE3__ 1
#define __SSE_MATH__ 1
#define __SSE__ 1
#define __SSP__ 1
#define __STDC_HOSTED__ 1
#define __STDC__ 1
#define __UINTMAX_TYPE__ long unsigned int
#define __USER_LABEL_PREFIX__ _
#define __VERSION__ "4.2.1 (Apple Inc. build 5664)"
#define __WCHAR_MAX__ 2147483647
#define __WCHAR_TYPE__ int
#define __WINT_TYPE__ int
#define __amd64 1
#define __amd64__ 1
#define __block __attribute__((__blocks__(byref)))
#define __k8 1
#define __k8__ 1
#define __pic__ 2
#define __strong
#define __tune_core2__ 1
#define __weak __attribute__((objc_gc(weak)))
#define __x86_64 1
#define __x86_64__ 1
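
Of the symbols in this dump, the ones a platform check would most plausibly key on are __APPLE__, __MACH__, and __x86_64__ / __amd64__. The sketch below only illustrates that kind of preprocessor test. target_os and target_arch are hypothetical names, and the real conditions in sys_float.c are not reproduced here.

```c
/* Illustrative only: reports which platform and architecture the compiler's
 * predefined macros indicate. This is not the logic used by sys_float.c. */

/* Returns a short name for the operating system the code is compiled for. */
const char *target_os(void)
{
#if defined(__APPLE__) && defined(__MACH__)
    return "darwin";
#elif defined(__linux__)
    return "linux";
#elif defined(__sun__)
    return "solaris";
#else
    return "unknown";
#endif
}

/* Returns a short name for the CPU architecture the code is compiled for. */
const char *target_arch(void)
{
#if defined(__x86_64__) || defined(__amd64__)
    return "x86_64";
#elif defined(__i386__)
    return "i386";
#else
    return "unknown";
#endif
}
```

With the compiler that produced the dump above, these would return "darwin" and "x86_64", since __APPLE__, __MACH__ and __x86_64__ are all defined.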

-bob