[erlang-bugs] erts_port[].drv_ptr == 0, when erts_port[].status not free

Raimo Niskanen raimo+erlang-bugs@REDACTED
Thu Jul 3 17:59:57 CEST 2008


Thank you for your bug report.

We will look into your problem when the developers concerned
come back from their vacation.

Can you give us the host OS and Erlang release too?



On Tue, Jul 01, 2008 at 07:24:05PM -0500, Paul Fisher wrote:
> We have a system where we run lots of linked-in driver ports that get
> created/used/closed frequently and sometimes very quickly.  Today when
> several open_port/2, port_command/2 and port_close/1 cycles happened in
> rapid succession, a SIGSEGV occurred in erl_bif_ddll.c:
> 
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 1125235040 (LWP 12087)]
> 0x0000000000449712 in erl_ddll_try_unload_2 (p=0x2aaaab11fc90,
>     name_term=659339, options=46912503328425) at beam/erl_bif_ddll.c:592
> 
> The emulator was run on a Q6600 (quad-core, 2.4 GHz), and started with
> +A 8, and the linked-in driver executes the bulk of its work with
> driver_async().  There were continuously 8 driver cycles running for
> 5-10 seconds before the segfault occurred.
> 
> (gdb) where
> #0  0x0000000000449712 in erl_ddll_try_unload_2 (p=0x2aaaab11fc90,
>     name_term=659339, options=46912503328425) at beam/erl_bif_ddll.c:592
> #1  0x000000000052337f in process_main () at beam/beam_emu.c:2073
> #2  0x000000000049c213 in sched_thread_func (vesdp=0x2ae18cb74f98)
>     at beam/erl_process.c:741
> #3  0x00000000005b6818 in thr_wrapper (vtwd=0x7fff1eb77de0)
>     at common/ethread.c:474
> #4  0x00002ae18c530f1a in start_thread () from /lib/libpthread.so.0
> #5  0x00002ae18c8135d2 in clone () from /lib/libc.so.6
> #6  0x0000000000000000 in ?? ()
> 
> So the code at the point of the SIGSEGV @ erl_bif_ddll.c:592 says:
> 
>         for (j = 0; j < erts_max_ports; j++) {
> =>          if (!(erts_port[j].status &  FREE_PORT_FLAGS)
>                 && erts_port[j].drv_ptr->handle == dh) {
> 
> It appears that the code assumes that the erts_port array entry being
> evaluated during the search has a valid (non-zero) drv_ptr value
> whenever the entry is not marked as free.  At the time of the crash,
> this is clearly not the case:
> 
> (gdb) p j
> $8 = 896
> 
> (gdb) p erts_port[j]
> $7 = {sched = {next = 0x0, prev = 0x0, taskq = 0x0, exe_taskq = 0x0},
>   timeout_task = {counter = 0}, refc = {counter = 2}, lock = 0x81b3c8,
>   xports = 0x0, id = 14343, connected = 0, caller = 0, data = 0, bp = 0x0,
>   nlinks = 0x0, monitors = 0x0, bytes_in = 0, bytes_out = 0, ptimer = 0x0,
>   tracer_proc = 18446744073709551611, trace_flags = 0, ioq = {size = 0,
>     v_start = 0x0, v_end = 0x0, v_head = 0x0, v_tail = 0x0, v_small = {{
>         iov_base = 0x0, iov_len = 0}, {iov_base = 0x0, iov_len = 0}, {
>         iov_base = 0x0, iov_len = 0}, {iov_base = 0x0, iov_len = 0}, {
>         iov_base = 0x0, iov_len = 0}}, b_start = 0x0, b_end = 0x0,
>     b_head = 0x0, b_tail = 0x0, b_small = {0x0, 0x0, 0x0, 0x0, 0x0}},
>   dist_entry = 0x0, name = 0x0, drv_ptr = 0x0, drv_data = 0, suspended = 0x0,
>   linebuf = 0x0, status = 4096, control_flags = 0, reg = 0x0,
>   port_data_lock = 0x0}
> 
> (gdb) p erts_port[j].drv_ptr
> $6 = (ErlDrvEntry *) 0x0
> 
> 
> So the real questions are: 1) whether the assumption built into this
> code is correct; and 2) if so, how we got into the position of violating
> it.  I'd appreciate some insight into what could be going on here, and
> where I should start looking.
> 
> 
> -- 
> paul
> 
> _______________________________________________
> erlang-bugs mailing list
> erlang-bugs@REDACTED
> http://www.erlang.org/mailman/listinfo/erlang-bugs
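
For illustration only, the assumption questioned in the report can be
reduced to a small standalone C sketch.  This is not ERTS code and not a
proposed fix: the types, the flag value and all sketch_* names below are
hypothetical stand-ins, and the only thing taken from the report is the
shape of the loop and the observed state (an entry that is not marked
free but has drv_ptr == 0).  The sketch simply skips such an entry
instead of dereferencing it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the ERTS types.
 * SKETCH_FREE_FLAGS is an assumed value, not the real FREE_PORT_FLAGS. */
#define SKETCH_FREE_FLAGS 0x1u

typedef struct {
    void *handle;              /* handle of the loaded driver library */
} sketch_drv_entry;

typedef struct {
    unsigned status;           /* port status bits */
    sketch_drv_entry *drv_ptr; /* may be NULL, as seen in the dump */
} sketch_port;

/* Count the ports that are not free and whose driver uses handle dh.
 * Unlike the quoted loop, an entry with a NULL drv_ptr is skipped
 * rather than dereferenced. */
static int sketch_count_ports_using(const sketch_port *ports, int n,
                                    const void *dh)
{
    int j, count = 0;

    assert(ports != NULL || n == 0);
    for (j = 0; j < n; j++) {
        if (!(ports[j].status & SKETCH_FREE_FLAGS)
            && ports[j].drv_ptr != NULL
            && ports[j].drv_ptr->handle == dh) {
            count++;
        }
    }
    return count;
}

/* Small usage example: one normal port, one port in the state shown in
 * the dump (status 4096, drv_ptr NULL), and one free slot. */
static int sketch_demo(void)
{
    static int dummy_handle;
    sketch_drv_entry drv = { &dummy_handle };
    sketch_port ports[3];

    ports[0].status = 0u;                ports[0].drv_ptr = &drv;
    ports[1].status = 4096u;             ports[1].drv_ptr = NULL;
    ports[2].status = SKETCH_FREE_FLAGS; ports[2].drv_ptr = NULL;

    return sketch_count_ports_using(ports, 3, &dummy_handle);
}
```

With the guard in place, sketch_demo() counts only the first port; the
second entry is skipped instead of faulting.  Whether a guard like this
is appropriate in the emulator, or whether the entry should never be
visible in that state at all, is exactly the open question in the report.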

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB


