[erlang-bugs] Segfault in erl_interface attempting to decode certain large binaries

Sverker Eriksson sverker@REDACTED
Thu Jul 24 18:35:00 CEST 2008


Hi Jonathan,

I've looked at the test files you attached.

The file 'badbinary' is clearly broken. It ends in a long run of zeros:

 > hexdump -C badbinary
:
0000ffb0 34 31 68 02 64 00 04 64 61 74 61 6d 00 00 00 08 |41h.d..datam....|
0000ffc0 54 65 72 72 65 6e 63 65 6a 6c 00 00 00 02 68 02 |Terrencejl....h.|
0000ffd0 64 00 03 72 6f 77 6d 00 00 00 24 38 37 32 61 36 |d..rowm...$872a6|
0000ffe0 62 62 61 2d 34 64 31 65 2d 34 34 64 34 2d 61 38 |bba-4d1e-44d4-a8|
0000fff0 61 62 2d 62 66 31 37 65 39 65 38 38 00 00 00 00 |ab-bf17e9e88....|
00010000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
0001e640 00 00 00 00 |....|
0001e644

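The zeros start at offset 0xfffc, just short of 0x10000, and continue to
the end of the file at 0x1e644. The last binary in the dump declares a
length of 0x24 (36) bytes, but only 33 bytes of it are present before the
zeros begin.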
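
A quick way to spot this kind of padding without reading a hexdump is to
look for the start of the trailing zero run. The following is only a
minimal sketch; the function name trailing_zero_start is made up for this
example and is not part of erl_interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an erl_interface function).
 * Returns the offset at which the trailing run of zero bytes begins.
 * If the buffer does not end in a zero byte, the result equals len. */
size_t trailing_zero_start(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    while (len > 0 && buf[len - 1] == 0)
        len--;
    return len;
}
```

Going by the hexdump above, calling this on the contents of 'badbinary'
should report 0xfffc for a file length of 0x1e644. A result far below the
file length is only a hint, since a valid term can legitimately end in
zero bytes.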

The segfault happens when erl_interface runs into the malformed data and
then fails to recover cleanly, which is a bug in erl_interface. I will
look into that.
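
Until that is fixed, one possible workaround is to sanity-check the buffer
before handing it to erl_decode. The sketch below walks the external term
format with bounds checks and rejects truncated or padded input. It is not
how erl_interface decodes terms. It assumes the buffer starts with the
version byte 131, it only knows the tags visible in the dump above plus a
few common ones, and the names be, skip_term and etf_looks_valid are
invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian unsigned integer of n bytes (n <= 4). */
static uint32_t be(const unsigned char *p, int n)
{
    uint32_t v = 0;
    while (n-- > 0)
        v = (v << 8) | *p++;
    return v;
}

/* Skip one term starting at buf[*pos]. Returns 0 on success, -1 if the
 * term is truncated, nested too deeply, or uses a tag unknown to this
 * sketch. */
static int skip_term(const unsigned char *buf, size_t len, size_t *pos,
                     int depth)
{
    unsigned char tag;
    size_t left;
    uint32_t n, i;

    if (depth > 256 || *pos >= len)
        return -1;
    tag = buf[(*pos)++];
    left = len - *pos;              /* bytes available after the tag */

    switch (tag) {
    case 97:                        /* SMALL_INTEGER_EXT: 1 byte */
        if (left < 1) return -1;
        *pos += 1;
        return 0;
    case 98:                        /* INTEGER_EXT: 4 bytes */
        if (left < 4) return -1;
        *pos += 4;
        return 0;
    case 100:                       /* ATOM_EXT: 2-byte length + chars */
    case 107:                       /* STRING_EXT: 2-byte length + chars */
        if (left < 2) return -1;
        n = be(buf + *pos, 2);
        if (left - 2 < n) return -1;
        *pos += 2 + (size_t)n;
        return 0;
    case 109:                       /* BINARY_EXT: 4-byte length + data */
        if (left < 4) return -1;
        n = be(buf + *pos, 4);
        if (left - 4 < n) return -1;
        *pos += 4 + (size_t)n;
        return 0;
    case 106:                       /* NIL_EXT: no payload */
        return 0;
    case 104:                       /* SMALL_TUPLE_EXT: 1-byte arity */
        if (left < 1) return -1;
        n = buf[(*pos)++];
        for (i = 0; i < n; i++)
            if (skip_term(buf, len, pos, depth + 1) != 0) return -1;
        return 0;
    case 108:                       /* LIST_EXT: 4-byte count, elements, tail */
        if (left < 4) return -1;
        n = be(buf + *pos, 4);
        *pos += 4;
        for (i = 0; i < n; i++)
            if (skip_term(buf, len, pos, depth + 1) != 0) return -1;
        return skip_term(buf, len, pos, depth + 1);   /* the tail */
    default:                        /* includes tag 0, i.e. zero padding */
        return -1;
    }
}

/* Returns 1 if buf holds exactly one well-formed term (within the tag
 * subset above) preceded by the version byte 131, otherwise 0. */
int etf_looks_valid(const unsigned char *buf, size_t len)
{
    size_t pos = 1;

    assert(buf != NULL || len == 0);
    if (len < 2 || buf[0] != 131)
        return 0;
    if (skip_term(buf, len, &pos, 0) != 0)
        return 0;
    return pos == len;              /* reject trailing bytes as well */
}
```

A caller would run etf_looks_valid on the bytes read from disk or from the
socket and only call erl_decode when it returns 1. Rejecting trailing
bytes is a deliberate choice here, so that zero padding after an otherwise
complete term is also reported.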

I can't see how you could decode this data from within Erlang.

/Sverker, Erlang/OTP Ericsson


jlist@REDACTED wrote:
> All,
>
> I have a TCP interface between an Erlang system and a C system.  Both
> send/receive marshaled binary Erlang terms and I have not had any problems
> to date.
>
> Today I began doing some more serious testing with larger chunks of binary
> to be decoded in C.
>
> We ran into a bug (it seems) with erl_interface 3.5.5.4 that is causing it
> to segfault during decoding.  The backtrace looks like this:
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 46912496233216 (LWP 4091)]
> 0x00000000004032c4 in _erl_free_term ()
> (gdb) bt
> #0  0x00000000004032c4 in _erl_free_term ()
> #1  0x000000000040496b in erl_decode_it ()
> #2  0x0000000000404937 in erl_decode_it ()
> #3  0x0000000000404c93 in erl_decode_it ()
> #4  0x0000000000404c93 in erl_decode_it ()
> #5  0x0000000000405311 in erl_decode ()
> #6  0x0000000000401b38 in main (argc=1, argv=0x7fff51ec7938) at badtest.c:28
>
> The unfortunate part is that the way this large binary term is generated
> cannot be done in any kind of sample code (it’s being pulled off an external
> database).
>
> Testing code:  http://jgray.la/erlang/erl_decode_segfault_test.tar.gz
>
> However, I have created a set of test files in C which recreate the
> segfault.  I stored the binary in a flat file (as ‘badbinary’) and have a
> testing program which reads it off disk and attempts to decode it.  To prove
> the approach is sane (and that this segfault is related to something strange
> about the decoding of this particular binary, not the size or general format
> of the binary) there is a ‘goodbinary’ file and testing program for that.
>
> To use the test code:
>
> Untar/Ungzip the file.  You may need to edit the Makefile to fix the paths
> to your erl_interface library.
>
> ‘make’ and then you can:
>
> ./badtest  (this reads ‘badbinary’ and attempts to decode, causes segfault)
> ./goodtest (this reads ‘goodbinary’ and successfully decodes it)  [nearly
> identical code to badtest.c but reads different file w/ different size]
>
> Also included is
>
> ./makegoodbin (a simple program that generates a large ETERM in an identical
> format to the badbinary but contains duplicated binary data everywhere) 
>
>
> Notes:
>
> * The marshaled binary Erlang term being sent to C can be successfully
> decoded/unmarshaled from within Erlang without a problem
> * This is reproducible with many different large Erlang terms generated from
> our database queries.  ‘makegoodbin.c’ creates a term identical in format to
> those causing problems, however it does not have the random distribution of
> binary sizes and content, and so I’m not able to reproduce the problem in
> this way.
> * The entire system, end-to-end including this decoding step, works
> perfectly in most cases.  However when the data goes into the 100k+ range,
> the segfaults start to happen.  That’s why I created the ‘makegoodbin’ which
> follows the same format.  Unfortunately that works even at sizes of >1MB
> adding to the confusion of the problem.
>
>
> Any help is appreciated.  Thanks.
>
> I apologize if this is a repost.  I never saw my original post hit the list
> and did not receive any responses.
>
> Jonathan Gray
> Streamy Inc.
>



