[yt-dev] Packaging.

Matthew Turk matthewturk at gmail.com
Thu Aug 29 12:26:52 PDT 2013


Hi all,

We need to figure out yt packaging.  This is becoming increasingly
hard, particularly as the number of dependencies grows.  (The upgrade
to IPython 1.0 and Matplotlib 1.3.0 has caused several issues, which
spurred this discussion.)

As it stands, we mainly provide yt through the install script.  Every
time a new version comes out, we check compatibility, we update the
install script, and we deploy that.  Unfortunately, as packages evolve
externally to yt, this results in occasional breakages, new (implicit)
dependencies, and complexity that grows super-exponentially.  I like
the install script, and it is what I use, but I think we need to
re-strategize.  It was built many years ago when packaging was a
different landscape, and when we needed a way to get a relatively
small number of dependencies onto a relatively small set of system
types.
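
As a rough illustration of the "check compatibility" step described
above, here is a minimal sketch in Python.  It is not taken from the
install script; the names REQUIRED, parse_version, at_least and
check_dependencies are hypothetical, and the version pins are
placeholders borrowed from the IPython and Matplotlib releases
mentioned earlier, not yt's actual requirements.  The comparison is
deliberately crude (numeric components only) and assumes Python 3.8
or later for importlib.metadata.

```python
from importlib import metadata

# Hypothetical minimum versions, for illustration only.
REQUIRED = {"ipython": "1.0", "matplotlib": "1.3.0"}


def parse_version(text):
    """Turn '1.3.0' into (1, 3, 0), stopping at the first non-numeric piece."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def at_least(installed, minimum):
    """Return True if `installed` is the same as or newer than `minimum`."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad with zeros so that '1.3' and '1.3.0' compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_dependencies(required, get_version=metadata.version):
    """Map each package name to 'ok', 'too old (x)', or 'missing'."""
    report = {}
    for name, minimum in required.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if at_least(installed, minimum):
            report[name] = "ok"
        else:
            report[name] = "too old (%s)" % installed
    return report


if __name__ == "__main__":
    for name, status in sorted(check_dependencies(REQUIRED).items()):
        print("%-12s %s" % (name, status))
```

A check like this only reports what is installed; it does nothing
about the implicit dependencies and breakages described above, which
is part of why a real package manager is attractive.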

Every day, it seems, brings another problem with the install script.
Not all of these are our fault.  But more importantly, I don't think
we should be spending our time on them, when we can only band-aid
something for so long before it stops being workable.

That being said, installation is the single biggest impediment to
people using yt, so we need to ensure it is still easy and simple.

There are a few options for other installation procedures.  I would
like to retain a stripped down version of the install script for ease
and simplicity, but removing many of the optional installs and
focusing instead on the core packages.

So here are the options.  I'd prefer we choose *one* as the primary
method, and then we (potentially) demonstrate how to use the others.
As a note, part of this process will also be the relicensing as BSD
and shoring up our source-based installations, ensuring that they are
correctly packaged, following best-practices guidelines for Python
source.  I believe I may have dropped the ball somewhat on that front.

 * Conda / Anaconda: This package manager is gaining traction, and I
think that once relicensing is done we stand a good chance of being
included in the base install.  This would mean that someone could
download Conda and just use it.  Even without that inclusion, however,
I've heard good things.  Conda is based on binary distributions, but
we could also manage our own packaging (potentially in an automated
way) and update with some frequency.  Conda is also somewhat tied to
the Wakari platform, and being part of Conda would mean being
available on the IPython-in-the-cloud that is Wakari.  I believe this
works well on supers.
 * Canopy: This is the Enthought package manager, with which Sam has
had some good experience.  I do not have a feeling for how it works
on supers.
 * Source-only: This is the way some packages are managed, but it is
essentially giving up, and while I think it is a good way to go
forward, I'm not sure we'll ever be trivially pip-installable.
 * Keep trying to plug holes as they come up in the install script.

What I think would be very productive is to hear people's experiences
with these package managers.  Sam, Nathan, anybody?

Focusing on a platform-specific manager (brew, macports, apt, rpm) is
a non-starter; they are good options, and we should develop a protocol
for supporting platform-specific packaging systems, but they
bottleneck quite seriously on person-time and we should think
carefully before we tie ourselves to one.

-Matt

PS The period in the subject line was editorial.  I'd very much like
to settle on a path for all of this stuff; packaging remains one of
the hardest issues in scientific python, as Software Carpentry has
noted time and again.  We're now pushing the install script, which is
great for clusters, but it's a remnant of a time before packaging in
Python was as mature as it is now, and before we had as many corner
cases as we do now -- not because they didn't exist, but because we
didn't have enough users to see them.


