[yt-dev] chunk size undefined before fill

Douglas Harvey Rudd drudd at uchicago.edu
Thu Mar 14 11:16:32 PDT 2013


> Yup, you're right.  Sorry about that.  I want to take the opportunity
> also to thank you for your work on this -- as I've said before, opaque
> data object support is going to be a huge win for us, and I appreciate
> you sorting through all the myopic, "Matt didn't think this would be a
> problem" sections.  :)
> 

Of course, changing code in this way will always uncover implicit assumptions, 
and these are very difficult to pick up in unit tests (since most of those tests will
reflect the same assumptions).

> 
> Yes, something like this, but is there going to be a case where
> _current_chunk is None?  Or will this always be called after
> _identify_base_chunk?  Incidentally, I am now wondering if we should
> just get rid of reliance on .size in the first place.  It's not clear
> to me that there is a case when we absolutely need it, except *inside*
> the IO handlers, and those should all be able to guess or not rely on
> the size anyway.  Right?
> 

I don't know whether that case can occur, and I didn't know what to do if it
did, so I let it fall back to returning an undefined quantity (_size is None).
We can identify a base chunk, but then it will simply fail because the size is
queried before the read.

>> def _chunked_read(self, chunk):
>>     # There are several items that need to be swapped out
>>     # field_data, size, shape
>>     old_field_data, self.field_data = self.field_data, YTFieldData()
>>     old_chunk, self._current_chunk = self._current_chunk, chunk
>>     self._size = None
>>     old_locked, self._locked = self._locked, False
>>     yield
>>     self.field_data = old_field_data
>>     self._current_chunk = old_chunk
>>     self._size = None
>>     self._locked = old_locked
>> 
>> That may just push the problem further down the call stack, but at least it means this code won't ask for .get_size until needed.
> 
> I actually don't see the diff here, what changed in this function?
> 

I removed the code that copied the old size and restored it after the yield.
Instead, _size is invalidated any time _current_chunk changes, and is then
repopulated as needed.  I'm assuming here that the size can be computed
trivially from the chunks.
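
To make the invalidate-and-recompute idea concrete, here is a minimal sketch,
not the actual yt code.  The Container class, the use of len() on the chunk to
get a size, and the try/finally around the yield are all illustrative
assumptions, and a plain dict stands in for YTFieldData.

```python
from contextlib import contextmanager


class Container:
    """Toy stand-in for a data container (hypothetical, not the yt class)."""

    def __init__(self):
        self.field_data = {}
        self._current_chunk = None
        self._size = None
        self._locked = False

    @property
    def size(self):
        # Computed lazily: None means "not known yet".
        if self._size is None and self._current_chunk is not None:
            # Assumption: the size follows trivially from the chunk.
            self._size = len(self._current_chunk)
        return self._size

    @contextmanager
    def _chunked_read(self, chunk):
        # Swap in fresh state for the duration of the read.
        old_field_data, self.field_data = self.field_data, {}
        old_chunk, self._current_chunk = self._current_chunk, chunk
        self._size = None  # invalidate instead of saving the old size
        old_locked, self._locked = self._locked, False
        try:
            yield
        finally:
            # Restore the previous state; the size is invalidated again
            # and will be recomputed the next time it is requested.
            self.field_data = old_field_data
            self._current_chunk = old_chunk
            self._size = None
            self._locked = old_locked
```

With this layout nothing asks for the size on entry or exit, so a container
with no current chunk simply reports a size of None until a chunk is set.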

Doug

