[Yt-dev] Projection speed improvement patch

Fri Nov 6 07:35:48 PST 2009

Hi John,

> Thanks so much for taking a look at my data and examining the memory usage
> of analyzing a dataset of this size.  I'll have to give it another shot on
> ranger.  I can also see how I/O performance is on the Altix here at
> Princeton, which has a local RAID (just like red).

Awesome.  This is all with the hierarchy-opt branch in mercurial, but
I think I am going to port it back to trunk in the very near future,
now that I have tested it on a number of different datasets.

> You said that I could do projections on my laptop once the computation is
> done on a large machine.  I know the projection structure is stored in the
> .yt file, but are the projected fields also stored in the .yt file?  Or do I
> have to have the data on my laptop?

All the fields are stored as well.  Here's the .yt file for the
projections I made of your data:

[mjturk@login-4-0 RS0064]$ h5ls -r restart0064.yt
/DataFields              Dataset {36}
/Projections             Group
/Projections/0           Group
/Projections/0/Density_Density Dataset {9355404}
/Projections/0/Temperature_Density Dataset {9355404}
/Projections/0/VelocityMagnitude_Density Dataset {9355404}
/Projections/0/pdx       Dataset {9355404}
/Projections/0/pdy       Dataset {9355404}
/Projections/0/px        Dataset {9355404}
/Projections/0/py        Dataset {9355404}
/Projections/0/weight_field_Density Dataset {9355404}

So it's stored as DataType/Axis/Field.  Here we are also storing the
weight field, so that if you add a new field, the weight doesn't need
to be projected again.
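
As a rough illustration of that DataType/Axis/Field layout, here is a
small sketch that splits a few lines of the listing above into a nested
dictionary.  It only parses the text output of h5ls; the helper name
parse_h5ls_listing is made up for this example and is not part of yt.

```python
# Sketch only: split "h5ls -r" dataset paths into DataType/Axis/Field.
# parse_h5ls_listing is a hypothetical helper, not part of yt.

LISTING = """\
/DataFields              Dataset {36}
/Projections             Group
/Projections/0           Group
/Projections/0/Density_Density Dataset {9355404}
/Projections/0/pdx       Dataset {9355404}
/Projections/0/weight_field_Density Dataset {9355404}
"""


def parse_h5ls_listing(text):
    """Return {data_type: {axis: {field: n_elements}}} for three-level paths."""
    layout = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3 or parts[1] != "Dataset":
            continue  # skip groups and blank lines
        path, _, shape = parts
        pieces = path.strip("/").split("/")
        if len(pieces) != 3:
            continue  # e.g. /DataFields does not follow DataType/Axis/Field
        data_type, axis, field = pieces
        n_elements = int(shape.strip("{}"))
        layout.setdefault(data_type, {}).setdefault(int(axis), {})[field] = n_elements
    return layout


layout = parse_h5ls_listing(LISTING)
for field, n_elements in sorted(layout["Projections"][0].items()):
    print(field, n_elements)
```

The weight field shows up as just another entry under the same axis,
next to the fields that were projected with it.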

To make a portable dataset of projections, you'll need three things:
the parameter file, either the .hierarchy or the .harrays file, and the
.yt file.  (In the repo I have tried very hard to make sure that
instantiation only touches the first two.)  Then you can do something
like:

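from yt.mods import *  # assumed convenience import; adjust to your yt version

pf = EnzoStaticOutput("restart0064", data_style="enzo_packed_3d")
pc = PlotCollection(pf, center=[0.5, 0.5, 0.5])
pc.add_projection("Density", 0, "Density")  # field, axis (0 = x), weight field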

This is slightly wordier, in that you have to specify the data_style,
but I think that can be addressed as well.

HOWEVER, it occurred to me this morning that the FixedResolutionBuffer
should be able to accept the .yt file by itself, without the parameter
file or anything like that, and that maybe that should be used for
portable projections.  That would reduce the memory overhead, so I'm
going to take a quick look into that this morning.
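
To make the idea concrete: the px, py, pdx and pdy arrays plus one field
array are, in principle, enough to build an image without the hierarchy.
The sketch below is not yt's FixedResolutionBuffer, just a minimal
pure-Python version of the deposit step.  It assumes px/py are cell
centers and pdx/pdy are cell half-widths in a unit domain; the function
name pixelize and the toy cell values are made up for this example.

```python
import math

# Sketch only: deposit variable-resolution projection cells onto a fixed grid.
# Assumes px/py are cell centers and pdx/pdy are cell half-widths, all in the
# unit domain [0, 1).  pixelize is a hypothetical helper, not yt's code.


def pixelize(px, py, pdx, pdy, values, n):
    """Return an n x n image indexed as image[ix][iy].

    Each pixel takes the value of the cell that covers the pixel center;
    if cells overlap, later cells overwrite earlier ones.
    """
    image = [[0.0] * n for _ in range(n)]
    for x, y, dx, dy, v in zip(px, py, pdx, pdy, values):
        # Pixel i has its center at (i + 0.5) / n; select the pixels whose
        # centers lie in [x - dx, x + dx), and likewise for y.
        i0 = max(int(math.ceil((x - dx) * n - 0.5)), 0)
        i1 = min(int(math.ceil((x + dx) * n - 0.5)), n)
        j0 = max(int(math.ceil((y - dy) * n - 0.5)), 0)
        j1 = min(int(math.ceil((y + dy) * n - 0.5)), n)
        for i in range(i0, i1):
            for j in range(j0, j1):
                image[i][j] = v
    return image


# Toy data: three coarse cells plus one quadrant refined into four fine cells.
px = [0.25, 0.75, 0.25, 0.625, 0.875, 0.625, 0.875]
py = [0.25, 0.25, 0.75, 0.625, 0.625, 0.875, 0.875]
pdx = [0.25, 0.25, 0.25, 0.125, 0.125, 0.125, 0.125]
pdy = [0.25, 0.25, 0.25, 0.125, 0.125, 0.125, 0.125]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]

image = pixelize(px, py, pdx, pdy, values, 4)
for row in image:
    print(row)
```

With real data the arrays would come from /Projections/<axis>/ in the
.yt file rather than from hand-written lists.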

-Matt