[yt-dev] Experimental sub-chunking for grids

Thu Jun 19 07:00:46 PDT 2014

Hi all,

Kacper and I have been going back and forth a bit on getting FLASH
memory usage down.  I thought I knew the reason it was high, but it
turns out I didn't.  After some experimentation, I think the reason is
that the IO chunking was pulling in the full dataset: FLASH has only
one "file", which gave the IO only one way to sub-chunk.

The heuristic right now is the maximum number of "grids" allowed in a
single chunk, and I've set this to 1000.  For FLASH this isn't so bad,
but I wonder if Enzo and other patch-based datasets that have many
grids in a single file might end up suffering.  I don't know whether
it's common for those data types to have >>1000 grids in a file (my
outputs never did), but if it is, there will be overhead from
alternating between reading and iterating during data IO.
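
To make the heuristic concrete, here is a minimal sketch in plain
Python.  It is not the code in the pull request: sub_chunks and
MAX_GRIDS_PER_CHUNK are hypothetical names, and integers stand in for
grid objects.  It only illustrates capping the number of grids handed
to a single read.

```python
from itertools import islice

# The cap described above; the constant name is hypothetical.
MAX_GRIDS_PER_CHUNK = 1000


def sub_chunks(grids, size=MAX_GRIDS_PER_CHUNK):
    """Yield successive lists holding at most `size` grids each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(grids)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


# One "file" holding 2500 grids (integers stand in for grid objects).
grids_in_file = range(2500)
chunk_sizes = [len(c) for c in sub_chunks(grids_in_file)]
print(chunk_sizes)  # [1000, 1000, 500]
```

With 2500 grids in one file this gives three passes of 1000, 1000 and
500 grids, so each pass touches at most 1000 grids, at the cost of
more read/iterate round trips.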

Anyway, we're playing around with it here:

https://bitbucket.org/yt_analysis/yt/pull-request/962/wip-reduce-memory-usage

So if you want, try this out and let us know whether it improves or
degrades things.

-Matt