[yt-users] Running Rockstar Halo Finder to create a merger tree

Brendan Griffen brendan.f.griffen at gmail.com
Wed Dec 18 10:32:23 PST 2013


On Tue, Oct 29, 2013 at 12:30 PM, Hilary Egan <hilaryye at gmail.com> wrote:

> Hi all,
>
> I'm quite confused on a number of points related to running the rockstar
> halo finder, so I hope it's alright that I put all these questions into this
> one email!
>
> 1. I can't seem to run the rockstar halo finder at all without getting
> this error followed by a segmentation fault and crash.
>
> [Warning] Network IO Failure (PID XXXXXX): Connection reset by peer
> [Network] Packet receive retry count at: 1
>

I know this is an old thread so this all might be futile.

I've run into this before, and it can come from a number of network-related
issues. Most often, the default ports that the clients use to communicate
with the server are blocked, already taken, or otherwise invalid. If
everything is running on the same machine, you may be able to try using

PARALLEL_IO_SERVER_INTERFACE = lo

This will force everything to use the local loopback address (127.0.0.1).
Simply waiting a few minutes for other instances to die often solves the
problem. You could also add 'killall rockstar' to your submission script in
case zombie rockstar processes are still running and causing server issues.
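
If you want to rule out the "port already taken" case before submitting, one
option is to try binding the port yourself. This is only a minimal sketch in
Python: the function name port_is_free is my own, and you would pass in
whatever port your Rockstar config actually uses (I am not assuming any
particular default here).

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently be bound to (host, port).

    Hypothetical helper, not part of Rockstar or yt. It only tells you
    whether the port is bindable right now on this machine.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            # Typically "Address already in use" or a permissions error.
            return False
    return True
```

A False result on the loopback address suggests another process (possibly a
leftover rockstar instance) is still holding the port.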

I believe Kraken uses SLURM, in which case the following submission script
(which uses srun) might be helpful. For example, start 128 instances and set
FORK_PROCESSORS_PER_MACHINE to 1 in your cfg file.

You'll have to check the HDF5 module and change a few other things, but here
is a template.

#!/bin/bash
#SBATCH -n 128
#SBATCH -o job.o%j
#SBATCH -e job.e%j
#SBATCH -t 5000
#SBATCH -p queue_name
#SBATCH --mem=32gb
#SBATCH -J rockstarjob --exclusive

module load -S centos6/hdf5-1.8.11_gcc-4.8.0

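# Adjust these paths for your installation.
rsdir=/path/to/rockstar/code/
exe=/path/to/rockstar/executable
outdir=/path/to/output/directory/

# Start the rockstar server in the background.
cd "$rsdir"
"$exe" -c "$rsdir/cfgs/config.cfg" &
# For restarts, comment out the line above and uncomment the line below.
#"$exe" -c "$outdir/restart.cfg" &

# Wait until auto-rockstar.cfg appears in the output directory,
# then launch the clients.
cd "$outdir"
perl -e 'sleep 1 while (!(-e "auto-rockstar.cfg"))'

srun -n 128 "$exe" -c auto-rockstar.cfg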
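
One caveat with the perl line in the template: it waits forever if the server
dies before auto-rockstar.cfg is written. A rough alternative with a timeout
is sketched below in Python. The function name wait_for_file and the default
timeout and poll interval are my own choices, not anything Rockstar
prescribes.

```python
import os
import time


def wait_for_file(path, timeout=600.0, poll=1.0):
    """Poll until `path` exists.

    Returns True as soon as the file is found, or False once `timeout`
    seconds have passed without it appearing. Hypothetical helper; the
    default values are arbitrary.
    """
    deadline = time.monotonic() + timeout
    while not os.path.exists(path):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
    return True
```

In a submission script you could call wait_for_file("auto-rockstar.cfg") from
the output directory and exit with a non-zero status when it returns False,
so the job fails quickly instead of sitting idle until the wall-clock limit.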

You might have already moved on by now.  Hope this helps if not.

Brendan



>
> It sort of seems like this issue (
> http://lists.spacepope.org/htdig.cgi/yt-dev-spacepope.org/2012-November/002681.html)
> but I couldn't really figure out what the resolution was from the thread.
> I'm attempting to run this on Kraken, and it doesn't matter whether I use a
> single compute node or multiple; I get the same error. (I hope this isn't
> the InfiniBand issue the docs warned about. I couldn't figure out if that is
> how Kraken is connected, and I got an error that the suggested flag doesn't
> exist, so I didn't press the issue.)
>
> 2. Whenever I finally do get the halo finder to work, I need the results
> to be in a form that the merger tree can use. It seems as though the
> MergerTree needs the results in the same form as the other halo finders
> give, so would getting the halo list and then dumping it as usual be the
> appropriate strategy? I.e.:
>
>         rh.run()
>         halo_list = rh.halo_list()
>         halo_list.dump('MergerHalos')
>
> 2.5. The docs sort of give mixed messages on whether or not I could just
> be calling MergerTree with the argument halo_finder_function =
> RockstarHaloFinder. At this point I've pretty thoroughly convinced myself
> that I can't, but it would be nice if that was clarified. (Just a
> thoroughly overwhelmed new user's perspective!)
>
> 3. I'm a little confused as to whether or not I have to use a
> TimeSeriesData object rather than the usual single time output when
> instantiating the halo finder. Under "Rockstar Halo Finding" it uses
> TimeSeriesData, unlike the rest of the examples, but under the subheading
> "Output Analysis" it just uses pf. The "Output Analysis" example also
> doesn't call the run() method, which leads me to believe something else
> entirely is going on, but it's not quite clear.
>
> Thanks!
> -Hilary