[erlang-bugs] efile_drv & async thread key

Patrik Nyblom pan@REDACTED
Thu Aug 15 18:34:21 CEST 2013


Hi Rick!

On 08/14/2013 04:48 PM, Rick Reed wrote:
> Hi Patrik!
>
> And you want the requests in the same async queue to enforce ordering per
> file descriptor, or for some other reason?  It seems like ordering isn't an
> issue because ultimately the file calls in Erlang are synchronous, and an
> app would have to enforce ordering itself anyway (we do it by sending all
> the i/o for a file through a single proc and/or setting our own per-file
> locks).
>
Yes, one example is process exit, where close definitely should not be
intermingled with other file operations that are ongoing in other threads.
That definitely happens if you round-robin the file descriptors. I remember
that there have been other situations where the synchronous Erlang interface
is not enough, but I cannot for the life of me remember them right now.
Anyway, process exit is definitely one example :)
> For the app I'm debugging now, it turns out no scheme that ties the port to
> a particular thread is going to work.  The system is running at the limits
> of the hardware, and the ports are long-lived.  Only perfect distribution
> of i/o requests over the available threads prevents certain threads from
> being overloaded and backing up the i/o on the ports that map to them.
Well, given the current design, I'm afraid a really good hash is the 
best I can come up with :(
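
A minimal sketch of why hashing the address tends to spread keys better than
merely shifting away the alignment bits (the idea discussed in the quoted
message further down). This is not the efile_drv.c code: the function names,
the simulated addresses and strides, and the choice of mixing function (the
64-bit finalizer from MurmurHash3) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_QUEUES 64u  /* arbitrary upper bound for this sketch */

/* Hypothetical key: shift away the three zero bits of an
 * 8-byte-aligned address. */
uint64_t key_shift(uint64_t addr)
{
    return addr >> 3;
}

/* Hypothetical key: mix all bits of the address
 * (MurmurHash3 64-bit finalizer, used here only as an example). */
uint64_t key_hash(uint64_t addr)
{
    addr ^= addr >> 33;
    addr *= 0xff51afd7ed558ccdULL;
    addr ^= addr >> 33;
    addr *= 0xc4ceb9fe1a85ec53ULL;
    addr ^= addr >> 33;
    return addr;
}

/* Count how many of nqueues queues receive at least one key when nports
 * simulated ports sit at addresses base, base + stride, base + 2*stride,
 * and so on, and the queue is chosen as key % nqueues. */
unsigned queues_used(uint64_t (*keyfn)(uint64_t), unsigned nqueues,
                     unsigned nports, uint64_t stride)
{
    unsigned char used[MAX_QUEUES];
    const uint64_t base = 0x10000;  /* arbitrary simulated start address */
    unsigned i, count = 0;

    assert(nqueues >= 1 && nqueues <= MAX_QUEUES);
    memset(used, 0, sizeof used);
    for (i = 0; i < nports; i++)
        used[keyfn(base + (uint64_t)i * stride) % nqueues] = 1;
    for (i = 0; i < nqueues; i++)
        count += used[i];
    return count;
}
```

With 16 queues and densely packed 8-byte-aligned addresses, the shifted key
reaches every queue. If the simulated allocator instead hands out addresses
128 bytes apart, queues_used(key_shift, 16, 1000, 128) drops to a single
queue, while the hashed key is not tied to the stride in the same way.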

The I/O should be rethought and rewritten once we have dirty schedulers
instead of the async threads!
>
> I've been running a few of the systems overnight with a patch that disables
> keying in efile_drv.  Now I'm getting a nice flat distribution of i/o across
> the async threads.  Unfortunately, it hasn't completely solved my problem,
> but those systems are doing much better.
Yes, probably. It is not safe though: compressed files in particular,
combined with processes getting exit (kill) signals during file operations,
may make the VM dump core.

With a better distribution of the FDs, maybe you can get results as good as
with round robin, without the risks?
>
> I'm just wondering if there's some other reason that I'm missing (cache/mem
> affinity, platform differences, etc.) for having to map file descriptors to
> particular threads.
I don't think it helps the caches that much. There are far more threads than
cores anyway, so it's bound to generate inter-core communication regardless.
>
> Thanks for looking into this!
Thanks for reporting!
>
> Rr

Cheers,
Patrik
>
>
> On Wed, Aug 14, 2013 at 1:36 AM, Patrik Nyblom <pan@REDACTED> wrote:
>
>     Hi Rick!
>
>
>     On 08/14/2013 02:21 AM, Rick Reed wrote:
>>     I assume the reason for keying the file requests is to prevent a
>>     single port from
>>     soaking up all the async threads?
>     Yes, and it's also important that requests for the same file
>     "descriptor" end up in the same async queue. So we need to store a
>     fixed key in the file descriptor structure.
>
>     I think I will hash the pointer to create the key, not just shift
>     away the "zero-bits", you never know which icky patterns an
>     allocator can create that will distribute the jobs unevenly. The
>     key will only be calculated upon opening, so there will be minimal
>     performance hit due to the more complicated calculations.
>
>     Thanks for reporting - this could cause severe performance issues
>     in applications!
>
>     Cheers,
>     Patrik
>
>>
>>     Rr
>>
>>
>>     On Tue, Aug 13, 2013 at 4:52 AM, Lukas Larsson <lukas@REDACTED> wrote:
>>
>>         And there it is, conclusive proof that I should not be
>>         debugging Rickard's code before lunch.
>>
>>         Found the issue, will create a fix for it. As a workaround
>>         for R16B you can use a prime number as the number of async
>>         threads :)
>>
>>         Lukas
>>
>>
>>         On 13/08/13 10:05, Lukas Larsson wrote:
>>>         Sigh, apparently I spoke too soon.
>>>
>>>         I remembered incorrectly about the change. It was in R16B
>>>         that ErlDrvPort became a ptr and it was an id before R16B.
>>>         Anyways, it is odd that the ptr is 8-byte aligned on your
>>>         system. On mine (Ubuntu 13.04, x86_64) the ptrs are not
>>>         aligned and the load is nicely distributed among async
>>>         threads. If I remember correctly you are using FreeBSD on
>>>         x86_64? I'll check if I can reproduce the behavior you are
>>>         seeing on our FreeBSD machine.
>>>
>>>         Lukas
>>>
>>>         On 13/08/13 09:40, Lukas Larsson wrote:
>>>>         Hello Rick!
>>>>
>>>>         Which version of Erlang are you using? From R16B (I think),
>>>>         the ErlDrvPort datatype no longer is a pointer to the port
>>>>         struct. Instead it is the slot id into the port table and
>>>>         those ids should contain all values. I did a quick test on
>>>>         my computer running the latest on maint on github and seem
>>>>         to get a full spread over all async threads.
>>>>
>>>>         Lukas
>>>>
>>>>         On 13/08/13 05:40, Rick Reed wrote:
>>>>>         It looks to me as though there's a bit of a problem in the
>>>>>         way efile_drv.c generates the
>>>>>         key that's used to select an async driver queue.  It uses
>>>>>         the address of the port which
>>>>>         on our system is 8-byte aligned.  Meanwhile, erl_async.c
>>>>>         does a simple mod operation
>>>>>         with the number of async threads, so the number of threads
>>>>>         that can actually be used
>>>>>         by file operations is 1/8th of the number configured.  I
>>>>>         suspect this isn't intended.
>>>>>
>>>>>         Rr
>>>>>
>>>>>
>>>>>
>>>>>         _______________________________________________
>>>>>         erlang-bugs mailing list
>>>>>         erlang-bugs@REDACTED  <mailto:erlang-bugs@REDACTED>
>>>>>         http://erlang.org/mailman/listinfo/erlang-bugs
>
>
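
The arithmetic in the original report at the bottom of the thread, and the
prime-number workaround suggested for R16B, can be checked with a small
simulation. This is a sketch under assumptions, not the erl_async.c code:
the helper name, the simulated base address, the bound on the thread count
and the port counts are made up for illustration. Only the selection rule
"port address mod number of async threads" with 8-byte-aligned addresses
comes from the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_THREADS 1024u  /* arbitrary upper bound for this sketch */

/* Count the async threads that can be reached when the queue is picked as
 * (port address) % nthreads and the simulated ports sit at consecutive
 * 8-byte-aligned addresses. */
unsigned threads_reached(unsigned nthreads, unsigned nports)
{
    unsigned char used[MAX_THREADS];
    const uint64_t base = 0x10000;  /* arbitrary simulated start address */
    unsigned i, count = 0;

    assert(nthreads >= 1 && nthreads <= MAX_THREADS);
    memset(used, 0, sizeof used);
    for (i = 0; i < nports; i++)
        used[(base + (uint64_t)i * 8u) % nthreads] = 1;
    for (i = 0; i < nthreads; i++)
        count += used[i];
    return count;
}
```

In this simulation threads_reached(64, 10000) is 8, one eighth of the
configured threads, because every address shares the factor 8 with the
thread count. With a prime thread count such as 17 there is no shared
factor, and threads_reached(17, 10000) is 17.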
