[erlang-questions] Investigate an infinite loop on production servers

Dmitry Kolesnikov dmkolesnikov@REDACTED
Thu May 23 07:13:44 CEST 2013


Hello,

I would agree with Bob about the most probable root cause. You can use entop to check the message queue length and the memory used per process.
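
If entop is not available on the node, the same numbers can be read by hand from a remote shell. The following is only a minimal sketch: the module name mq_top and the function top/1 are made up for illustration, and it relies only on erlang:processes/0 and erlang:process_info/2. It lists the N processes with the longest message queues, together with their memory use and registered name.

```erlang
-module(mq_top).
-export([top/1]).

%% Hypothetical helper, not part of OTP or entop.
%% Returns up to N tuples {Pid, QueueLen, MemoryBytes, Name}, sorted by
%% message queue length, longest first. Name is [] for processes that
%% have no registered name.
top(N) ->
    Rows = [Row || Pid <- erlang:processes(), Row <- info(Pid)],
    Sorted = lists:sort(fun({_, A, _, _}, {_, B, _, _}) -> A >= B end, Rows),
    lists:sublist(Sorted, N).

%% process_info/2 returns undefined if the process has exited in the
%% meantime, so such processes are skipped.
info(Pid) ->
    case erlang:process_info(Pid, [message_queue_len, memory, registered_name]) of
        undefined ->
            [];
        [{message_queue_len, Len}, {memory, Mem}, {registered_name, Name}] ->
            [{Pid, Len, Mem, Name}]
    end.
```

Calling mq_top:top(10) every few seconds while the memory is growing should show whether one process (typically a busy gen_server) has a queue that keeps getting longer.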

Best Regards,
Dmitry >-|-|-*>


On 23.5.2013, at 5.21, Bob Ippolito <bob@REDACTED> wrote:

> This kind of thing tends to happen when you continuously send messages to a process faster than it can handle them. The most common case where I've seen this is when you have a lot of processes communicating with a single gen_server process. If your server has swap enabled, this may appear to make the node "freeze completely but not crash".
> 
> In the past I've diagnosed this by monitoring the message_queue_len of registered processes, but I'm sure there are tools that can help do this for you.
> 
> 
> On Wed, May 22, 2013 at 7:00 PM, Morgan Segalis <msegalis@REDACTED> wrote:
>> Hello everyone,
>> 
>> I'm having a bit of an issue with my production servers.
>> 
>> At some point, it seems to enter an infinite loop that I can't find or reproduce myself on the test servers.
>> 
>> The bug appears completely at random, 1 hour or 10 hours after restarting the Erlang node.
>> The loop eats up all my server's memory in no time and, most of the time, completely freezes the Erlang node without crashing it.
>> 
>> One time I got a crash dump and tried to investigate it with cdv, but I didn't get much information about which process or module was eating up all the memory.
>> I just know that it crashed because of the crash message : "eheap_alloc: Cannot allocate 6801972448 bytes of memory (of type "heap")."
>> 
>> I'm surely too new to Erlang to investigate something like this with cdv, I really would like some pointers on how I can understand this problem and fix it asap.
>> 
>> If you need any information from the crash dump, let me know what you need, I'll copy/paste…
>> 
>> I'm using Erlang R16B (erts-5.10.1) [source] [64-bit] [smp:8:8] [async-threads:10] [kernel-poll:true]
>> 
>> Thank you all for your help !
>> 
>> 
>> _______________________________________________
>> erlang-questions mailing list
>> erlang-questions@REDACTED
>> http://erlang.org/mailman/listinfo/erlang-questions