[yt-dev] Rockstar on multiple nodes

Stephen Skory s at skory.us
Tue Nov 27 16:46:49 PST 2012


Hi Matt,

> That might be it. When I run "top" on the nodes, I'm seeing "python
> <defunct>" a number of times equal to NUM_WRITERS.

I haven't figured it out, but I've learned a little. There are a few
places where Rockstar is fork()ing, and I've been looking at the one
around line 790 of client.c (underneath "else if (!strcmp(cmd,
"rock")) {"). I've added some printfs there, and what I'm seeing is
that the forked processes are the ones going defunct, but they
are not the PIDs that are reporting Network IO failures. It looks like
none of the original unforked python tasks are going defunct before
things hang. Could it be that the forked processes are
quitting/finishing before they should, and that's why it's hanging?
But in the words of Tina Turner, what does going from one to two nodes
have to do with it?
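
For reference, "<defunct>" in top means the child has already exited
but its parent has not yet called wait() on it. Here is a minimal
sketch of that situation in Python. It is not Rockstar's code: the
names child_state and fork_and_observe are made up for illustration,
and reading /proc/<pid>/stat assumes Linux (elsewhere the state just
comes back as None).

```python
import os
import time


def child_state(pid):
    """Return the one-letter state of `pid` from /proc (Linux only), else None."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except OSError:
        return None
    # The state field is the first token after the parenthesised command name.
    return data.rsplit(")", 1)[1].split()[0]


def fork_and_observe(exit_code=0):
    """Fork a child that exits at once; report its state before it is reaped."""
    pid = os.fork()
    if pid == 0:
        # Child: quit immediately, skipping Python-level cleanup handlers.
        os._exit(exit_code)

    # Parent: poll until the exited child shows up as a zombie ("Z").
    state = None
    for _ in range(500):
        state = child_state(pid)
        if state in ("Z", None):
            break
        time.sleep(0.01)

    # Reaping the child with waitpid() is what clears the defunct entry.
    _, status = os.waitpid(pid, 0)
    return state, os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(fork_and_observe(3))
```

If the forked Rockstar processes show up the same way, it would mean
they have exited and nobody has waited on them yet, which fits the
"quitting before they should" guess but doesn't prove it.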

--
Stephen Skory
s at skory.us
http://stephenskory.com/
510.621.3687 (google voice)


