Hey Matt,<div><br></div><div>Thanks for optimizing the code--I know it isn't a glorious job, but it will help us all in the end. I think your change looks great, but I'm curious: is this the sort of operation that is going to consistently come up in the future, or a one-off change you're making in the enzo front end? If it is something we should be using for future casting, perhaps we should make a helper function to do this, so that everyone can consistently use this optimized code instead of the more common .astype cast. Other than that, I'm +1 on this change.</div>
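<div><br></div><div>To make the helper idea concrete, here is a rough sketch of what I have in mind. The name ensure_float64 is hypothetical (it is not an existing yt function), and it assumes plain NumPy arrays; it just wraps the check-then-cast pattern so that a copy happens only when the dtype actually needs to change.</div>
<pre>
```python
import numpy as np


def ensure_float64(arr):
    """Return arr as a float64 ndarray, copying only when a cast is needed.

    Hypothetical helper for illustration; not part of yt.
    """
    arr = np.asarray(arr)
    if arr.dtype != np.float64:
        # Only pay for the cast (and the copy it implies) when required.
        arr = arr.astype(np.float64)
    return arr


a64 = np.arange(4, dtype="float64")
a32 = np.arange(4, dtype="float32")

same = ensure_float64(a64)  # already float64: handed back without copying
up = ensure_float64(a32)    # float32: cast to float64 exactly once
```
</pre>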
<div><br></div><div>Cameron</div><div><br><div class="gmail_quote">On Thu, Dec 6, 2012 at 12:50 PM, Matthew Turk <span dir="ltr"><<a href="mailto:matthewturk@gmail.com" target="_blank">matthewturk@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">On Thu, Dec 6, 2012 at 2:44 PM, Nathan Goldbaum <<a href="mailto:nathan12343@gmail.com">nathan12343@gmail.com</a>> wrote:<br>
> Pardon my ignorance, but is it the case that computations done in 64 bit mode in enzo are normally saved to disk as 32 bit floats? If so, is there a setting I can change to make sure that my enzo datasets are always written to disk with double precision?<br>
<br>
</div>I disabled that particular anti-feature some time ago, with the<br>
"New_Grid_WriteGrid.C" stuff. Now in Enzo, you write to disk exactly<br>
what you store in memory.<br>
<div class="HOEnZb"><div class="h5"><br>
><br>
> Since most enzo calculations are done in 64 bit anyway and this change allows some pretty significant speedups, I'm +1 on this change.<br>
><br>
> On Dec 6, 2012, at 11:30 AM, Matthew Turk wrote:<br>
><br>
>> Hi all,<br>
>><br>
>> I've been doing some benchmarking of various operations in the Enzo<br>
>> frontend in yt 2.x. I don't believe other frontends suffer from this,<br>
>> for the main reason that they're all 64 bit everywhere.<br>
>><br>
>> The test dataset is about ten gigs, with a bunch of grids. I'm<br>
>> extracting a surface, which means from a practical standpoint that I'm<br>
>> filling ghost zones for every grid inside the region of interest.<br>
>> There are many places in yt that we either upcast to 64-bit floats or<br>
>> that we assume 64-bits. Basically, nearly all yt-defined Cython or C<br>
>> operations assume 64-bit floats.<br>
>><br>
>> There's a large quantity of Enzo data out there that is float32 on<br>
>> disk, which gets passed into yt, where it gets handed around until it<br>
>> is upcast. There are two problems here: 1) We have a tendency to use<br>
>> "astype" instead of "asarray", which means the data is *always*<br>
>> duplicated. 2) We often do this repeatedly for the same set of grid<br>
>> data; nowhere is this more true than when generating ghost zones.<br>
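>><br>
>> To make problem 1 concrete, here is a minimal sketch in plain NumPy,<br>
>> with made-up arrays standing in for grid data (nothing here is yt<br>
>> code). It checks whether the result shares memory with the input:<br>
<pre>
```python
import numpy as np

# Stand-in for a field that is already 64-bit in memory.
grid_data = np.ones(8, dtype="float64")

copied = grid_data.astype("float64")       # astype copies by default
viewed = np.asarray(grid_data, "float64")  # dtype matches: no new buffer

print(np.shares_memory(grid_data, copied))  # False
print(np.shares_memory(grid_data, viewed))  # True

# When the dtype differs, a new float64 buffer is unavoidable either way.
small = np.ones(8, dtype="float32")
upcast = np.asarray(small, "float64")
print(np.shares_memory(small, upcast))      # False
```
</pre>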
>><br>
>> So for the dataset I've been working on, ghost zones are a really<br>
>> intense prospect. And the call to .astype("float64") actually<br>
>> completely dominated the operation. This comes from both copying the<br>
>> data, as well as casting. I found two different solutions.<br>
>><br>
>> The original code:<br>
>><br>
>> g_fields = [grid[field].astype("float64") for field in fields]<br>
>><br>
>> This is bad even if you're using float64 data types, since it will<br>
>> always copy. So it has to go. The total runtime for this dataset was<br>
>> 160s, and the most-expensive function was "astype" at 53 seconds.<br>
>><br>
>> So as a first step, I inserted a cast to "float64" if the dtype of an<br>
>> array inside the Enzo IO system was "float32". This way, all arrays<br>
>> were upcast automatically. This led me to see zero performance<br>
>> improvement. So I checked further and saw the "always copy" bit in<br>
>> astype, which I was ignorant of. This option:<br>
>><br>
>> g_fields = [np.asarray(grid[field], "float64") for field in fields]<br>
>><br>
>> is much faster, and saves a bunch of time. But 7 seconds is still<br>
>> spent inside "np.array", and total runtime is 107.5 seconds. This<br>
>> option is the fastest:<br>
>><br>
>> g_fields = []<br>
>> for field in fields:<br>
>> &nbsp;&nbsp;&nbsp;&nbsp;gf = grid[field]<br>
>> &nbsp;&nbsp;&nbsp;&nbsp;if gf.dtype != "float64": gf = gf.astype("float64")<br>
>> &nbsp;&nbsp;&nbsp;&nbsp;g_fields.append(gf)<br>
>><br>
>> and now total runtime is 95.6 seconds, with the dominant cost *still*<br>
>> in _get_data_from_grid. At this point I am much more happy with the<br>
>> performance, although still quite disappointed, and I'll be doing<br>
>> line-by-line next to figure out any more micro-optimizations.<br>
>><br>
>> Now, the change to _get_data_from_grid *itself* will greatly improve<br>
>> performance for 64-bit datasets. But also updating io.py to upcast<br>
>> 32-bit datasets on read will help speed things up<br>
>> considerably for 32-bit datasets as well. The downside is that it<br>
>> will be difficult to get back raw, unmodified 32-bit data from the<br>
>> grids, rather than 32-bit data that has been cast to 64-bits.<br>
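>><br>
>> As a rough illustration of upcast-on-read (a sketch only: the function<br>
>> name read_field_upcast is hypothetical and this is not the actual<br>
>> io.py code), the reader would cast float32 data once, up front. Since<br>
>> every float32 value is exactly representable as a float64, casting<br>
>> back down recovers the same values, though not the original array:<br>
<pre>
```python
import numpy as np


def read_field_upcast(raw):
    # Hypothetical stand-in for a reader; `raw` plays the role of on-disk data.
    raw = np.asarray(raw)
    if raw.dtype == np.float32:
        # Upcast once at read time so later consumers never need to cast.
        raw = raw.astype(np.float64)
    return raw


on_disk = np.array([0.1, 1.0 / 3.0, 2.5e-7], dtype="float32")
in_memory = read_field_upcast(on_disk)

# float32 to float64 is exact, so the round trip restores the same bits.
round_trip = in_memory.astype("float32")
```
</pre>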
>><br>
>> Is this an okay change to make?<br>
>><br>
>> [+-1][01]<br>
>><br>
>> -Matt<br>
>> _______________________________________________<br>
>> yt-dev mailing list<br>
>> <a href="mailto:yt-dev@lists.spacepope.org">yt-dev@lists.spacepope.org</a><br>
>> <a href="http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org" target="_blank">http://lists.spacepope.org/listinfo.cgi/yt-dev-spacepope.org</a><br>
><br>
</div></div></blockquote></div><br></div>