[yt-dev] Areas of collaboration between glue and yt

Tue Jun 14 12:58:25 PDT 2016

Hi everyone,

A few of us (Austin Gilbert, Penny Qian, Matt Turk, John Zuhone, and
myself) had a discussion by email about possible avenues for collaboration
between glue and yt, and we have decided it would make more sense to
discuss this in the open.

To get started, I've included copies of the relevant parts of the emails
below (oldest at the top). Please feel free to chime in!

We're also planning to have a Google Hangout to discuss this - I've created
a Doodle, which you can fill out if you want to join the discussion:

http://doodle.com/poll/53xkt6iiurukcgx4

In addition to this list, we'll be using the yt project Slack site (with a
special channel for glue) to discuss some of these ideas day-to-day, so
just reply to me off-list if you want to be included in the Slack channel,
and I'll pass on the requests to the yt team.

Cheers,
Tom

---

*Austin Gilbert:*
I have heard through the yt project network that you are looking into
making it compatible with glueviz. As a big fan of user interfaces for data
interaction, this is really exciting to me, and I would like to be as
helpful as possible. Could you tell me what work is needed in order to make
yt fully functional with glue? Aside from existing widgets in the
framework, are you planning on adding some specific to yt? Let me know,
because I am eager to help and to get this up and running.

*Penny Qian:*

Many thanks for the exciting initiative for the collaboration! Regarding
your first question, honestly it is not yet clear to me exactly what we
should do to interface yt and glue, given that Glue is built around a
GUI while yt is a powerful data analysis and visualization package. Could
we explore the following possibilities:

   - Using Glue to free astronomers from writing python code: load data
   into glue, do the analysis in yt (develop a GUI panel for the options if
   needed), then render the analyzed data with the existing widgets in Glue;
   - Extend the data visualization capability of Glue: it is well known
   that yt has implemented a lot of fancy visualizations. Could we load the data
   with Glue’s GUI (the data might actually be handled by yt), do the analysis
   in yt (with a GUI panel in Glue if needed), render the data in yt, and then
   present the data in Glue as a widget?
   - If we develop a yt widget in Glue, we could implement the linked-view
   functionality in this widget, allowing the users to manipulate the data
   visually in Glue and then carry out the data analysis back in yt.

As for the second question, were you talking about adding some specific
widget into the Glue framework, so that Glue can render the data from yt?

*Tom Robitaille:*

Thanks Austin and Penny for getting the conversation started on a yt <->
glue collaboration! :)

One of the places where I think that it would be great to collaborate
between yt and glue is to develop an abstract data layer in glue that
better separates data access and computation from the interactive
visualization, and to leverage yt as a data access and computation layer.
I'll describe a little what I mean by this below.

Two of the main issues with glue currently are that it is:

- not well suited to dealing with large datasets in general
- not well suited to dealing with data that are not on a regular Cartesian
grid (even if not too large)

Currently, glue loads data into Data classes, and viewers then access the
data directly and do computations (for example calculate a histogram of all
values). Calculating what sections of datasets fall inside subsets is also
done outside of the data objects and is not done in a 'smart' way in that
all the data has to be accessed, and the entire subset computed straight
away.

In practice, what this means is that we have a FITS reader that understands
memory mapping, but as soon as you do something like compute a histogram of
all pixel values, all the data has to be read, and you lose the ability to
deal with large files. Similarly, if the user makes a selection in the
cube, often the whole cube has to be read in to determine which pixels have
been selected.
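
To make the memory-mapping point concrete, here is a minimal sketch (not
glue's actual code) of how a histogram could be accumulated chunk by chunk
so that a memory-mapped cube is never read in one go. The function name
`chunked_histogram`, the `chunk_size` parameter, and the small temporary
file standing in for a large cube are all assumptions made for illustration.

```python
import os
import tempfile

import numpy as np


def chunked_histogram(data, bins, value_range, chunk_size=1_000_000):
    """Accumulate a histogram one chunk at a time.

    `data` may be a np.memmap; only `chunk_size` values are read per step.
    """
    flat = data.reshape(-1)  # a view for contiguous arrays, memmaps included
    edges = np.linspace(value_range[0], value_range[1], bins + 1)
    counts = np.zeros(bins, dtype=np.int64)
    for start in range(0, flat.size, chunk_size):
        chunk = flat[start:start + chunk_size]
        counts += np.histogram(chunk, bins=edges)[0]
    return counts, edges


# Small stand-in for a large cube stored on disk (illustrative only)
path = os.path.join(tempfile.mkdtemp(), "cube.dat")
rng = np.random.default_rng(0)
cube = rng.normal(size=(32, 32, 32)).astype(np.float32)
cube.tofile(path)

mapped = np.memmap(path, dtype=np.float32, mode="r", shape=(32, 32, 32))
counts, edges = chunked_histogram(
    mapped, bins=20, value_range=(-5.0, 5.0), chunk_size=4096
)
```

The result matches a histogram computed over the full array with the same
bin edges, but peak memory use is bounded by `chunk_size` rather than by the
size of the cube.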

A better mechanism would be to develop what I refer to as an abstract
data/computation layer, which means that we define an API that any data
object needs to provide for data access, and that also covers computations
such as producing fixed resolution buffers or selecting subsets. The idea
would be that one can then implement a much wider variety of data objects -
for example a data object that would be powered by yt behind the scenes,
but also a data object that actually communicates with a remote computer
cluster on which the data is stored.

The interactive visualization part of glue would then not need to worry
about the details of the data access - it would essentially say 'I need a
fixed resolution buffer with these dimensions', or 'I need a histogram',
and this would be delegated to the data object.
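
As a rough illustration of this delegation, the sketch below defines a
hypothetical interface and one trivial in-memory backend. None of these
names (`AbstractData`, `compute_histogram`,
`compute_fixed_resolution_buffer`, `InMemoryData`, `viewer_draw`) come from
glue or yt; they are assumptions chosen for the example, and a yt-backed or
remote backend would simply be another subclass.

```python
from abc import ABC, abstractmethod

import numpy as np


class AbstractData(ABC):
    """Hypothetical interface: viewers request derived products, not raw arrays."""

    @abstractmethod
    def compute_histogram(self, bins, value_range):
        """Return (counts, edges) for all values in the dataset."""

    @abstractmethod
    def compute_fixed_resolution_buffer(self, bounds, shape):
        """Return a 2D array of `shape` sampling the region `bounds`."""


class InMemoryData(AbstractData):
    """Simplest backend: a regular 2D NumPy array with unit pixel spacing."""

    def __init__(self, array):
        self._array = np.asarray(array)

    def compute_histogram(self, bins, value_range):
        return np.histogram(self._array, bins=bins, range=value_range)

    def compute_fixed_resolution_buffer(self, bounds, shape):
        (y0, y1), (x0, x1) = bounds
        # Nearest-neighbour sampling at the centres of the requested pixels
        ys = y0 + (np.arange(shape[0]) + 0.5) * (y1 - y0) / shape[0]
        xs = x0 + (np.arange(shape[1]) + 0.5) * (x1 - x0) / shape[1]
        iy = np.clip(ys.astype(int), 0, self._array.shape[0] - 1)
        ix = np.clip(xs.astype(int), 0, self._array.shape[1] - 1)
        return self._array[np.ix_(iy, ix)]


def viewer_draw(data):
    """A viewer only talks to the interface, never to the underlying storage."""
    return data.compute_fixed_resolution_buffer(
        bounds=((0, 8), (0, 8)), shape=(4, 4)
    )


data = InMemoryData(np.arange(64).reshape(8, 8))
buffer = viewer_draw(data)
hist_counts, hist_edges = data.compute_histogram(bins=4, value_range=(0, 64))
```

Because `viewer_draw` only depends on the interface, swapping the in-memory
backend for one that reads lazily or talks to a remote machine would not
require changes on the viewer side.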

Of course, yt is perfectly suited to this since it *already* provides a
data abstraction layer - so this would be a matter of defining an API for
glue data objects, then writing a wrapper for yt. In future, one could even
imagine running glue on a laptop, and having a data object that
communicates with a cluster that is running yt.

The end result would be that researchers could *load up a large simulation
in glue and be able to do the kind of linked data visualization that glue
can normally do*, which I think would be extremely powerful.

Of course, related to what Penny said, I think there are a couple of other
avenues for collaboration:

- When using the 3D viewers, we could have an 'export to yt' option which
provides a yt script to produce a production-quality 3D visualization (the
VisPy viewers we have look ok but I don't think the static output from
these is anywhere near as nice as what yt can do). This would simply be a
matter of writing a plugin for a yt exporter.

- It would be fun to investigate the new yt OpenGL rendering and see how
this compares to what we currently use (VisPy), and potentially develop a
new viewer based on the yt OpenGL renderer.
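
For the 'export to yt' idea above, the core of such an exporter could be as
small as a function that writes out a standalone script. The sketch below is
only an assumption of what that might look like: `make_yt_script` is a
hypothetical helper, the dataset path is a placeholder, glue's actual plugin
hooks are not shown, and the generated script assumes a yt version that
provides `yt.create_scene`.

```python
def make_yt_script(dataset_path, field, output_png):
    """Return the text of a standalone yt volume-rendering script.

    Hypothetical helper: how glue would register this as an exporter
    plugin is deliberately left out.
    """
    lines = [
        "import yt",
        "",
        f"ds = yt.load({dataset_path!r})",
        f"sc = yt.create_scene(ds, field={field!r})",
        f"sc.save({output_png!r})",
    ]
    return "\n".join(lines) + "\n"


# Placeholder path and field, for illustration only
script = make_yt_script(
    "my_simulation/output_0001", ("gas", "density"), "render.png"
)
```

The generated text can be saved to a file and run separately wherever yt and
the data are available, which keeps the exporter itself free of any yt
dependency.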

I think it would be great to discuss all of these ideas, and would like to
suggest that we have a Google Hangout in the short term. I'll send out
another email with a link to a Doodle poll!

*Austin Gilbert:*

I think a Google Hangout would be a great way to get started and ensure we
have a unified plan. Additionally, we should go ahead and move this to a
public email list. I would also like to recommend a Slack channel for day
to day communications; the yt community has been using it for a while now
to great effect.

With regard to the data abstraction layer you have described, I think that
yt is definitely well suited to working with data objects and selecting
regions of data in the case of large file sets. The data objects currently
supported in yt enable smart file reading: when you create a subregion of
data, only that data is read from disk, so very large datasets are not
entirely unmanageable. Additionally, yt has a large number of frontends for
different data formats, so incorporating it into data objects could enable a
whole new community to utilize glue. I think yt could accomplish what you
are thinking and glue can accomplish what I'm thinking.

On the user side, I want to make sure that if you incorporate yt, yt users
still get the capabilities of the program they are used to working with
alongside the linking capabilities and user interface that glue provides.
For me this primarily looks like ensuring glue has the ability to utilize
yt's standard plotting routines in some form of widget. I also like the
idea of including the OpenGL features that yt can offer.

I will certainly let others in the yt community know about the hangout to
discuss what could happen.