[yt-dev] Rockstar on multiple nodes

Matthew Turk matthewturk at gmail.com
Tue Nov 27 16:49:23 PST 2012


Hi Stephen,

On Tue, Nov 27, 2012 at 7:46 PM, Stephen Skory <s at skory.us> wrote:
> Hi Matt,
>
>> That might be it. When I run "top" on the nodes, I'm seeing "python
>> <defunct>" a number of times equal to NUM_WRITERS.
>
> I haven't figured it out but I've learned a little. There are a few
> places where Rockstar is fork()ing, and I've been looking at the one
> around line 790 of client.c (underneath "else if (!strcmp(cmd,
> "rock")) {"). I've added some printfs there and what I'm seeing is
> that the forked processes are the ones going defunct, but they
> are not the PIDs that are reporting Network IO failures. It looks like
> none of the original unforked python tasks are going defunct before
> things hang. Could it be that the forked processes are
> quitting/finishing before they should, and that's why it's hanging?
> But in the words of Tina Turner, what does going from one to two nodes
> have to do with it?
>
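
For reference, a process shows up as "<defunct>" in top when it has
exited but its parent has not yet called wait() on it. Here is a minimal
Python sketch of that behaviour, assuming Linux with /proc mounted. The
helper names child_state and fork_and_observe are made up for
illustration; they are not part of Rockstar or yt.

```python
import os
import time


def child_state(pid):
    """Return the one-letter state field from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The state field comes right after the parenthesised command name.
    return data.rsplit(")", 1)[1].split()[0]


def fork_and_observe():
    """Fork a child that exits at once; report its state before reaping it."""
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately, skipping Python cleanup handlers.
        os._exit(0)

    # Parent: poll until the child has exited but has not been reaped.
    state = child_state(pid)
    for _ in range(100):
        if state == "Z":
            break
        time.sleep(0.01)
        state = child_state(pid)

    # Calling waitpid() reaps the child, which removes the <defunct> entry.
    reaped_pid, status = os.waitpid(pid, 0)
    return state, reaped_pid == pid, os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(fork_and_observe())
```

So the "<defunct>" entries on their own only say that the forked children
have exited and have not been reaped yet; they do not say whether the
children exited earlier than they should have.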

What're the values of:

FORK_READERS_FROM_WRITERS
FORK_PROCESSORS_PER_MACHINE

?

-Matt

> --
> Stephen Skory
> s at skory.us
> http://stephenskory.com/
> 510.621.3687 (google voice)


