[erlang-bugs] [erlang-questions] Process/FD leak in SSL R15B01

Ingela Anderton Andin Ingela.Anderton.Andin@REDACTED
Wed Oct 24 11:24:50 CEST 2012


Hi!

Loïc Hoguin wrote:
> This doesn't make a difference so far.

This would only make a difference if you do not set the active
option explicitly.
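
To illustrate why the earlier patch quoted below only matters when an option is left unset, here is a minimal sketch of the difference between proplists:get_value/2 and proplists:get_value/3. The module and function names are hypothetical, and the default of 0 for packet is taken from the inet documentation, not from the ssl code.

```erlang
-module(defaults_demo).
-export([packet_without_default/1, packet_with_default/1]).

%% proplists:get_value/2 returns 'undefined' when the key is absent,
%% so an option the user never set ends up as 'undefined'.
packet_without_default(Opts) ->
    proplists:get_value(packet, Opts).

%% proplists:get_value/3 returns the supplied default instead.
%% Assumption: 0 (raw) is used here as the inet default for 'packet'.
packet_with_default(Opts) ->
    proplists:get_value(packet, Opts, 0).
```

If the option is given explicitly, both variants return the same value, which is why the patch makes no difference in that case.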

Anyway, I have a theory that the inet driver can hang if you call
recv on a socket that the other side has shut down for writing, or
that there is some strange race condition. I have no evidence for
this yet; for now I just think it fits the scenario of what seems
to happen.
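
For reference, this is a rough sketch of the situation I have in mind: the peer shuts down its write side and we then call recv in passive mode. It assumes loopback TCP is available, the module name is hypothetical, and it shows the behaviour I would expect from a healthy driver. It is not a reproduction of the hang.

```erlang
-module(half_close_demo).
-export([run/0]).

%% Assumption: loopback TCP works; port 0 asks the OS for a free port.
run() ->
    {ok, L} = gen_tcp:listen(0, [binary, {active, false}, {reuseaddr, true}]),
    {ok, Port} = inet:port(L),
    {ok, C} = gen_tcp:connect({127,0,0,1}, Port, [binary, {active, false}]),
    {ok, S} = gen_tcp:accept(L, 5000),
    %% The peer shuts down its write side, which sends a FIN.
    ok = gen_tcp:shutdown(C, write),
    %% recv with a timeout, so that a misbehaving driver cannot block
    %% the caller forever. Normally this should report the close.
    Result = gen_tcp:recv(S, 0, 5000),
    gen_tcp:close(S),
    gen_tcp:close(C),
    gen_tcp:close(L),
    Result.
```

The stuck processes look as if a recv like this one never returned, even though no timeout-free recv should block once the other side is gone.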

I think that ssl should have a new terminate clause that avoids
socket operations that are not logically necessary, even if the
inet driver hanging can be considered a bug in itself.

So as a first step, try this patch and see if your problem goes away.
If it does, that is the solution for you and the ssl application, and
we will then have to pinpoint the actual problem in the inet driver
and fix that.


diff --git a/lib/ssl/src/ssl_connection.erl b/lib/ssl/src/ssl_connection.erl
index 1319b54..c9c162b 100644
--- a/lib/ssl/src/ssl_connection.erl
+++ b/lib/ssl/src/ssl_connection.erl
@@ -984,7 +984,7 @@ handle_info({CloseTag, Socket}, StateName,
             ok
      end,
      handle_normal_shutdown(?ALERT_REC(?FATAL, ?CLOSE_NOTIFY), StateName, State),
-    {stop, normal, State};
+    {stop, {shutdown, transport_closed}, State};

  handle_info({ErrorTag, Socket, econnaborted}, StateName,
             #state{socket = Socket, start_or_recv_from = StartFrom, role = Role,
@@ -1022,6 +1022,14 @@ terminate(_, _, #state{terminated = true}) ->
      %% we want to guarantee that Transport:close has been called
      %% when ssl:close/1 returns.
      ok;
+
+terminate({shutdown, transport_closed}, _, #state{negotiated_version = Version,
+                                                 send_queue = SendQueue,
+                                                 renegotiation = Renegotiate} = State) ->
+    handle_trusted_certs_db(State),
+    notify_senders(SendQueue),
+    notify_renegotiater(Renegotiate);
+
  terminate(Reason, connection, #state{negotiated_version = Version,
                                       connection_states = ConnectionStates,
                                       transport_cb = Transport,


Regards Ingela Erlang/OTP team - Ericsson AB


> On 10/17/2012 09:51 AM, Ingela Anderton Andin wrote:
>> Hi!
>>
>> My problem goes away with the following patch
>>
>> diff --git a/lib/ssl/src/ssl.erl b/lib/ssl/src/ssl.erl
>> index 7788f75..771bfa5 100644
>> --- a/lib/ssl/src/ssl.erl
>> +++ b/lib/ssl/src/ssl.erl
>> @@ -869,10 +869,10 @@ internal_inet_values() ->
>>
>> socket_options(InetValues) ->
>>      #socket_options{
>> -               mode   = proplists:get_value(mode, InetValues),
>> -               header = proplists:get_value(header, InetValues),
>> -               active = proplists:get_value(active, InetValues),
>> -               packet = proplists:get_value(packet, InetValues),
>> +               mode   = proplists:get_value(mode, InetValues, lists),
>> +               header = proplists:get_value(header, InetValues, 0),
>> +               active = proplists:get_value(active, InetValues, active),
>> +               packet = proplists:get_value(packet, InetValues, 0),
>>                 packet_size = proplists:get_value(packet_size, InetValues)
>>                }.
>>
>>
>> i.e. default values were not properly handled. I know too little
>> about your configuration to say if this is your problem too. If not,
>> it would be great if you could give me a way to recreate your problem.
>>
>> Regards Ingela Erlang/OTP team - Ericsson AB
>>
>>
>> Ingela Anderton Andin wrote:
>>> Hi!
>>>
>>> This is puzzling. Links seem to be intact. And the supervisor should
>>> have killed the gen_fsm-process if it gets stuck in terminate.
>>>
>>> I tried to recreate your problem, I did get a process leak problem,
>>> however it did not manifest itself in quite the same way as yours.
>>>
>>> In my case I have an active process that seems not to have received
>>> the tcp_closed message. The fsm process emulates the active option, as
>>> it uses active once to receive TLS packets. If I set the active option
>>> the process will terminate. At the moment I have not found the root
>>> cause of why it is not working as expected, i.e. whether it is the
>>> emulating code that does something wrong or perhaps the inet driver.
>>> Will have to keep digging.
>>>
>>> Regards Ingela Erlang/OTP team - Ericsson AB
>>>
>>>
>>> Loïc Hoguin wrote:
>>>> 103> erlang:port_info(Port).
>>>> [{name,"tcp_inet"},
>>>>  {links,[<0.18199.1670>]},
>>>>  {id,51824890},
>>>>  {connected,<0.18199.1670>},
>>>>  {input,0},
>>>>  {output,3583}]
>>>> 104> Pid.
>>>> <0.18199.1670>
>>>>
>>>> On 10/16/2012 11:55 AM, Ingela Anderton Andin wrote:
>>>>> Hi!
>>>>>
>>>>> Ok, next question can you do a port_info on the linked port?
>>>>>
>>>>> Regards Ingela Erlang/OTP Team - Ericsson AB
>>>>>
>>>>> Loïc Hoguin wrote:
>>>>>> Hey,
>>>>>>
>>>>>> Here's one:
>>>>>>
>>>>>> [{current_function,{prim_inet,recv0,3}},
>>>>>>  {initial_call,{proc_lib,init_p,5}},
>>>>>>  {status,waiting},
>>>>>>  {message_queue_len,2},
>>>>>>  {messages,[{system,{<0.1523.2358>,#Ref<0.0.9161.247946>},
>>>>>>                     get_status},
>>>>>> {system,{<0.19941.2364>,#Ref<0.0.9166.119462>},get_status}]},
>>>>>>  {links,[<0.897.0>,#Port<0.51824890>]},
>>>>>>  {dictionary,[{ssl_manager,ssl_manager},
>>>>>>               {'$ancestors',[ssl_connection_sup,ssl_sup,<0.894.0>]},
>>>>>>               {'$initial_call',{ssl_connection,init,1}}]},
>>>>>>  {trap_exit,false},
>>>>>>  {error_handler,error_handler},
>>>>>>  {priority,normal},
>>>>>>  {group_leader,<0.893.0>},
>>>>>>  {total_heap_size,10946},
>>>>>>  {heap_size,4181},
>>>>>>  {stack_size,21},
>>>>>>  {reductions,8272},
>>>>>>  {garbage_collection,[{min_bin_vheap_size,46368},
>>>>>>                       {min_heap_size,233},
>>>>>>                       {fullsweep_after,10},
>>>>>>                       {minor_gcs,1}]},
>>>>>>  {suspending,[]}]
>>>>>>
>>>>>> The two get_status were me trying to inspect and getting a timeout.
>>>>>>
>>>>>> Will try commenting out the function, that was my guess also. Doesn't
>>>>>> explain the other half of the processes though which still seem to be
>>>>>> running happily despite the process owning the socket being dead for
>>>>>> days.
>>>>>>
>>>>>> On 10/16/2012 11:18 AM, Ingela Anderton Andin wrote:
>>>>>>> Hi!
>>>>>>>
>>>>>>> This sounds really strange. It would be interesting to see all
>>>>>>> process_info available for the process.
>>>>>>>
>>>>>>> Something you could try is to comment out the invocation of the
>>>>>>> function workaround_transport_delivery_problems in the terminate
>>>>>>> function of the ssl_connection process. This function can call
>>>>>>> recv(S, 0), which sounds like the probable recv that hangs even
>>>>>>> though it should not.
>>>>>>>
>>>>>>> Regards Ingela Erlang/OTP team - Ericsson AB
>>>>>>>
>>>>>>>
>>>>>>> Loïc Hoguin wrote:
>>>>>>>> On 10/15/2012 06:09 PM, Attila Rajmund Nohl wrote:
>>>>>>>>> 2012/10/15 Loïc Hoguin <essen@REDACTED>:
>>>>>>>>> [...]
>>>>>>>>>>> lists:foldl(fun(X, Sum) -> case erlang:process_info(X) of
>>>>>>>>>>> undefined ->
>>>>>>>>>>> Sum; [{current_function, XXX}|_] -> case lists:keyfind(XXX, 1,
>>>>>>>>>>> Sum)
>>>>>>>>>>> of false
>>>>>>>>>>> -> Curr = 0; {_, Curr} -> ok end, lists:keystore(XXX, 1, Sum,
>>>>>>>>>>> {XXX,
>>>>>>>>>>> Curr +
>>>>>>>>>>> 1}) end end, [], List).
>>>>>>>>>> [{{prim_inet,recv0,3},25856},{{gen_fsm,loop,7},26574}]
>>>>>>>>>>
>>>>>>>>>> Not sure which one is the ESTABLISHED list and which one is the
>>>>>>>>>> FIN_WAIT2.
>>>>>>>>>> Of course, I can't use sys:get_status/1 on the PIDs stuck in
>>>>>>>>>> prim_inet:recv0/3 because the receive there is quite specific.
>>>>>>>>>> So I can't get the stacktrace. The other case doesn't seem to
>>>>>>>>>> give anything useful (for my level of knowledge, anyway).
>>>>>>>>>
>>>>>>>>> You can get the stacktrace with erlang:process_info(Pid,
>>>>>>>>> backtrace).
>>>>>>>>
>>>>>>>> Thanks for the tip!
>>>>>>>>
>>>>>>>> So yeah, this one is stuck while trying to terminate.
>>>>>>>>
>>>>>>>> Program counter: 0x00007f05fd6a5608 (prim_inet:recv0/3 + 224)
>>>>>>>> CP: 0x0000000000000000 (invalid)
>>>>>>>> arity = 0
>>>>>>>>
>>>>>>>> 0x00007f052e1b1eb0 Return addr 0x00007f05a3248a98
>>>>>>>> (ssl_connection:terminate/3 + 800)
>>>>>>>> y(0)     57928
>>>>>>>> y(1)     #Port<0.51824890>
>>>>>>>>
>>>>>>>> 0x00007f052e1b1ec8 Return addr 0x00007f05a3b29670
>>>>>>>> (gen_fsm:terminate/7 + 168)
>>>>>>>> y(0)     []
>>>>>>>> y(1)     []
>>>>>>>> y(2)     []
>>>>>>>> y(3)     []
>>>>>>>> y(4)     #Port<0.51824890>
>>>>>>>> y(5)     gen_tcp
>>>>>>>>
>>>>>>>> 0x00007f052e1b1f00 Return addr 0x00007f05a3bb41d0
>>>>>>>> (proc_lib:init_p_do_apply/3 + 56)
>>>>>>>> y(0)     []
>>>>>>>> y(1)
>>>>>>>> {state,server,{#Ref<0.0.8553.184512>,<0.18913.1670>},gen_tcp,tcp,tcp_closed,tcp_error,"localhost",8443,#Port<0.51824890>,{ssl_options,[],verify_none,{#Fun<ssl.1.54384637>,[]},false,false,undefined,1,"/home/obfuscated/obfuscatedrlang/lib/obfuscatedunchat-1.6.0/priv/ssl/cert.pem",undefined,"/home/obfuscated/obfuscatedrlang/lib/obfuscatedunchat-1.6.0/priv/ssl/cert.pem",undefined,undefined,undefined,"/home/obfuscated/obfuscatedrlang/lib/obfuscatedunchat-1.6.0/priv/ssl/cacert.pem",undefined,undefined,[<<2 
>>>>>>>> bytes>>,<<2 bytes>>,<<2 bytes>>,<<2 bytes>>,<<2 bytes>>,<<2
>>>>>>>> bytes>>,<<2 bytes>>,<<2 bytes>>,<<2 bytes>>,<<2 bytes>>,<<2
>>>>>>>> bytes>>,<<2 bytes>>,<<2
>>>>>>>> bytes>>],#Fun<ssl.0.54384637>,true,268435456,false,[],undefined,false},{socket_options,binary,0,0,0,once},{connection_states,{connection_state,{security_parameters,<<2 
>>>>>>>> bytes>>,0,7,1,16,128,16,unknown,2,20,0,<<48 bytes>>,<<32
>>>>>>>> bytes>>,<<32
>>>>>>>> bytes>>,undefined},undefined,{cipher_state,<<16 bytes>>,<<16
>>>>>>>> bytes>>,undefined},<<20 bytes>>,6,true,<<12 bytes>>,<<12
>>>>>>>> bytes>>},{connection_state,{security_parameters,undefined,0,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,<<32 
>>>>>>>> bytes>>,undefined},undefined,undefined,undefined,undefined,true,undefined,undefined},{connection_state,{security_parameters,<<2 
>>>>>>>> bytes>>,0,7,1,16,128,16,unknown,2,20,0,<<48 bytes>>,<<32
>>>>>>>> bytes>>,<<32
>>>>>>>> bytes>>,undefined},undefined,{cipher_state,<<16 bytes>>,<<16
>>>>>>>> bytes>>,undefined},<<20 bytes>>,13,true,<<12 bytes>>,<<12
>>>>>>>> bytes>>},{connection_state,{security_parameters,undefined,0,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,<<32 
>>>>>>>> bytes>>,undefined},undefined,undefined,undefined,undefined,true,undefined,undefined}},[],<<0 
>>>>>>>> bytes>>,<<0 bytes>>,{<<0 bytes>>,<<0 
>>>>>>>> bytes>>},[],12308,{session,<<32
>>>>>>>> bytes>>,undefined,<<1257 bytes>>,0,<<2 bytes>>,<<48
>>>>>>>> bytes>>,true,63517514196},24599,ssl_session_cache,{3,1},undefined,false,rsa,undefined,{'RSAPrivateKey','two-prime',26952275589898844250103204854000460899755240864557148991279029405309749386179997816352098133722767607711339559172144166023544257819579600281768301281374192936110073288270279982220728800798557215504319561694659173220853716332492625335748497090542115981135939145850167689577529567336032326581411005966667046488818079418004669621093520594249388264789813277716494693460309931110183497107534349074298533842111855672958994036657571757894555006279600552417098362361837531438833518633912632305124934722467790401548511827982945839067677876435394001531838872958423949934302335970305331259903589271491819721745867851063767101711,65537,72894561061836369429952440001397404435381791098810666626550009339627280447692213756327103684977349565388024121690992163627770872919426951949943259565185707278720577541631553578110444175680062658469881781442213382568569223796242089823480188432
>>>>>>>> 467004251893513795
>>>>>>>> 29032795180763714863835283712338222389782464852044908949559099608779063302276133782734662944988246892889439189879440128304200546026414115176004328650285319262737051106741309180479178565669060188052206153201268137707738579437817066853295089724557953207910831295502502266942720391639060038564028714207644340973116764361586980768047522941109290140269958017673,175687943987452481712482925881452602286249755596049611520304319723586191396077353211979651292293422727226888628810410185730059847194115171473543563960899974179642474882228165413574970142069071147678024461620581964093179873537217082000749319715442644574561869121959620883833054096268955094522609955548334417459,153409932282113553454184543331491108535630917003675738115171976005561046830080906444372373873384261619709596605477733172587334100633359814892598660495299494503111498422936185897687222105679379604880102069501875829087474529350923531274442007552106843129760936560325773227109973336264866827986047857553670280629,1653
>>>>>>>> 778011280640103960
>>>>>>>> 32533996867303777118782862747708688208092956158440780252498542811490728487320772469810738971008968489145399289747151126453392805772011338882217952599871103786463875892748648418527046734444999723398379211505851743415571090610985953725319559986129544296274474768880307859595812559810466567,99372239273086723091338362047522171285756194037567212940251777246259022537354389739788150444373539745180764989177727522509078951433348961072685633082784597107680994416138776015512122203187528006871997391618377904030112283443023112892909533616156365176077807633229316646127723088806569214057154029767435353361,7323400945433254897172443884831966307350890604864415567101495881249115607826271555845873590083141172347084964771961315371798965570031721570889013199060717814694355367862624287469661994309269724835502396215672690851146331180228281917872372949595188776597872289445053324786304898777105863585382475713592590505,asn1_NOVALUE},{'DHParameter',17976931348623159077083915679378745319786029604
>>>>>>>> 875601170644442368
>>>>>>>> 4197180216158519368947833795864925541502180565485980503646440548199239100050792877003355816639229553136239076508735759914822574862575007425302077447712589550957937778424442426617334727629299387668709205606050270810842907692932019128194467627007,2,asn1_NOVALUE},undefined,undefined,#Ref<0.0.0.13264>,{<0.1092.0>,#Ref<0.0.8553.179746>},0,<<0 
>>>>>>>> bytes>>,true,undefined,undefined,{[],[]},false,true}
>>>>>>>> y(2)     connection
>>>>>>>> y(3)     ssl_connection
>>>>>>>> y(4) {'DOWN',#Ref<0.0.8553.184512>,process,<0.18913.1670>,normal}
>>>>>>>> y(5)     <0.18199.1670>
>>>>>>>> y(6)     normal
>>>>>>>> y(7)     Catch 0x00007f05a3b29670 (gen_fsm:terminate/7 + 168)
>>>>>>>>
>>>>>>>> 0x00007f052e1b1f48 Return addr 0x0000000000883498
>>>>>>>> (<terminate process normally>)
>>>>>>>> y(0)     Catch 0x00007f05a3bb41f0 (proc_lib:init_p_do_apply/3 + 88)
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
> 
> 



