[erlang-bugs] [BUG] gen_tcp:connect/3, 4 returns socket for closed port

Edwin Fine erlang-questions_efine@REDACTED
Wed Aug 13 08:00:47 CEST 2008


*Problem Statement*

When gen_tcp:connect/3 or /4 is called on a host/port that has no program
listening on it, gen_tcp:connect returns {ok, Sock} at unpredictable
intervals instead of the expected {error, econnrefused}. If
gen_tcp:recv(Sock, 0) is then called immediately on the socket just returned,
it returns {error, econnrefused}. The connection options used were [binary,
{packet, raw}, {active, false}]. Note that gen_tcp:connect succeeds when
there is a program listening on that same host/port, so it's unlikely to be
a firewall issue.
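
The behaviour described above can be checked for a single attempt with a
sketch like the following. This is not the attached test program; the module
name connect_probe, the function probe/2, the returned atoms and the
1000 ms recv timeout are all assumptions made for illustration. Only the
connect options are taken from the report.

```erlang
-module(connect_probe).
-export([probe/2]).

%% Connect options as given in the report.
-define(OPTS, [binary, {packet, raw}, {active, false}]).

%% Try one connect to a host/port where nothing is expected to listen.
%% Returns refused | {false_positive, RecvResult} | {other, Reason}.
%% (Hypothetical return values, chosen for this sketch.)
probe(Host, Port) ->
    case gen_tcp:connect(Host, Port, ?OPTS) of
        {error, econnrefused} ->
            %% The expected outcome for a closed port.
            refused;
        {ok, Sock} ->
            %% Unexpected: a socket came back although nothing listens.
            %% The report saw {error, econnrefused} from recv here.
            %% The timeout is an assumption so the call cannot block forever.
            RecvResult = gen_tcp:recv(Sock, 0, 1000),
            ok = gen_tcp:close(Sock),
            {false_positive, RecvResult};
        {error, Reason} ->
            {other, Reason}
    end.
```

Calling connect_probe:probe(Host, Port) against a closed port should give
refused; a {false_positive, _} result would correspond to the behaviour
reported here.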

*Reproducing the error*

The attached test program demonstrates the bug most quickly if you run with
more than one scheduler and with kernel poll enabled.

It was run in the shell as

gen_tcp_connect_bug:go(Host, Port).

It seemed to make no difference whether gen_tcp:connect/3 or
gen_tcp:connect/4 was called.
The destination system was another computer on the same local subnet, which
was running Windows XP.
Running the test with only one scheduler (+S 1) didn't return a false
positive in over 4 million connect() attempts (after which I terminated the
program). It may be reasonable to assume that the issue won't appear with
one scheduler.

   - erl +S 4 +K true: false positive returned in between 1 and a few
   thousand attempts.
   - erl +S 4 +K false: false positive returned in around 70,000 or more
   attempts.
   - erl +S 1 +K true or false: no false positive returned in > 4.2 million
   attempts.
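
A rough sketch of a loop that counts attempts until the first false positive
is shown below. It is not the attached gen_tcp_connect_bug.erl, whose
contents are not reproduced in this message; the module name connect_loop,
the go/3 arity with an attempt cap, the return tuples and the recv timeout
are assumptions. The sketch is sequential and single-process, which may
differ from how the original test was structured.

```erlang
-module(connect_loop).
-export([go/2, go/3]).

%% Connect options as given in the report.
-define(OPTS, [binary, {packet, raw}, {active, false}]).

%% Default cap on attempts (an arbitrary choice for this sketch).
go(Host, Port) ->
    go(Host, Port, 1000000).

%% Returns {false_positive, AttemptNo, RecvResult},
%%         {unexpected, AttemptNo, Reason} or {gave_up, Max}.
go(Host, Port, Max) ->
    %% Print the settings that the report found to matter (+S and +K).
    io:format("schedulers=~p kernel_poll=~p~n",
              [erlang:system_info(schedulers),
               erlang:system_info(kernel_poll)]),
    loop(Host, Port, 1, Max).

loop(_Host, _Port, N, Max) when N > Max ->
    {gave_up, Max};
loop(Host, Port, N, Max) ->
    case gen_tcp:connect(Host, Port, ?OPTS) of
        {error, econnrefused} ->
            %% Expected result for a closed port: try again.
            loop(Host, Port, N + 1, Max);
        {ok, Sock} ->
            %% False positive: record what recv says, then stop.
            RecvResult = gen_tcp:recv(Sock, 0, 1000),
            ok = gen_tcp:close(Sock),
            {false_positive, N, RecvResult};
        {error, Reason} ->
            {unexpected, N, Reason}
    end.
```

Started under erl +S 4 +K true and pointed at a closed port, a loop of this
kind would be expected to stop with {false_positive, N, _} if the problem
reproduces, and to end with {gave_up, Max} otherwise.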

The attached Wireshark text file trace shows that on every occasion, gen_tcp
sent a SYN and received an RST, even for the request that returned an open
socket, so it doesn't seem to be a TCP/IP problem. The binary version of the
trace is available if required.

This seems to suggest that the bug is related to SMP mode. It may also have
something to do with 64-bit mode, but I haven't got a 32-bit system to try
it on. The false positive probably shows up much sooner with kernel poll
enabled because of timing.

*Test Environment*

   - Intel Q6600, Intel XBX2 MB, 8GB RAM, On-board Broadcom Gigabit Ethernet
   - Ubuntu Linux x86_64 Gutsy (Linux 2.6.24-16-generic #1 SMP Thu Apr 10
   12:47:45 UTC 2008 x86_64 GNU/Linux)
   - Erlang R12B-3

Regards,
Edwin Fine
-- 
For every expert there is an equal and opposite expert - Arthur C. Clarke
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gen_tcp_connect_bug.erl
Type: text/x-erlang
Size: 1085 bytes
Desc: not available
URL: <http://erlang.org/pipermail/erlang-bugs/attachments/20080813/7b2c576f/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gen_tcp_connect_bug.wireshark_cap.20080813.txt.gz
Type: application/x-gzip
Size: 11680 bytes
Desc: not available
URL: <http://erlang.org/pipermail/erlang-bugs/attachments/20080813/7b2c576f/attachment-0001.bin>

