[yt-svn] commit/yt-doc: sskory: Updating the merger tree docs to reflect recent changes.

Tue Mar 27 10:23:29 PDT 2012

1 new commit in yt-doc:


https://bitbucket.org/yt_analysis/yt-doc/changeset/93ccf62959c4/
changeset:   93ccf62959c4
user:        sskory
date:        2012-03-27 17:24:08
summary:     Updating the merger tree docs to reflect recent changes.
affected #:  1 file

diff -r 45a9e83d961a1addf095dc5abaadcc8a92962338 -r 93ccf62959c400ad94c560ab9ce4de9e3ec8936a source/analysis_modules/merger_tree.rst

--- a/source/analysis_modules/merger_tree.rst
+++ b/source/analysis_modules/merger_tree.rst
@@ -31,6 +31,8 @@
 Clearly, another requirement is that Python has the
 `sqlite3 library <http://docs.python.org/library/sqlite3.html>`_
 installed.
+This should be built along with everything else yt needs
+if the ``install_script.sh`` was used.
 
 The merger tree can be calculated in parallel, and if necessary, it will run
 the halo finding in parallel as well. Please see the note below about the
@@ -77,18 +79,19 @@
 at the same time (`see more here <http://www.sqlite.org/lockingv3.html#how_to_corrupt>`_).
 NFS disks can store files on multiple physical hard drives, and it can take time
 for changes made by one task to appear to all the parallel tasks.
+Only one task of the merger tree ever interacts with the database,
+so these dangers are minimal,
+but in general it's a good idea to know something about the disk used to
+store the database.
 
-The Merger Tree takes extra caution to ensure that every task sees the exact
-same version of the database before writing to it, and only one task
-ever writes to the database at a time.
-This is accomplished by using MPI Barriers and md5 hashing of the database
-between writes.
 In general, it is recommended to keep the database on a 'real disk' 
-(/tmp for example, if all the tasks are on the same SMP node) if possible,
+(/tmp for example, if all the tasks are on the same SMP node,
+or RAM disk for extra speed) if possible,
 but it should work on a NFS disk as well.
-If the database must be stored on a NFS disk, the documentation for the NFS protocol
-should be consulted to see what settings are available that can minimize the potential for
-file replication problems of the database.
+If a temporary disk is used to store the database while it's being built,
+remember to copy the file to a permanent disk after the merger tree script
+is finished.
+
 
 Running and Using the Halo Merger Tree
 --------------------------------------
@@ -155,16 +158,18 @@
 If the halos are to be found during the course of building the merger tree,
 run with an appropriate number of tasks to the size of the dataset and the
 halo finder used.
-The merger tree itself, which compares halo membership in parallel very effectively,
-is almost completely constrained by the
-read/write times of the SQLite file.
+The speed of the merger tree itself,
+which compares halo membership in parallel very effectively,
+is almost completely constrained by the read/write times of the SQLite file.
 In tests with the halos pre-located, there is not much speedup beyond two MPI tasks.
 There is no negative effect with running the merger tree with more tasks (which is
 why if halos are to be found by the merger tree, the merger tree should be
-run with as many tasks as that step requires), but there is no benefit.
+run with as many tasks as that step requires), and indeed if the simulation
+is a large one, running in parallel does provide memory parallelism,
+which is important.
 
-How The Database Is Handled
----------------------------
+How The Database Is Handled In Analysis Restarts
+------------------------------------------------
 
 The Merger Tree is designed to allow the merger tree database to be built
 incrementally.
@@ -178,6 +183,12 @@
 referencing the same database as before.
 By referencing the same database as before, work does not need to be repeated.
 
+If the merger tree process is interrupted before completion (say, if the 
+job's walltime is exceeded and the scheduler kills it), just run the exact
+same job again.
+The merger tree will check to see what work has already been completed, and
+resume where it left off.
+
 Additional Parameters
 ~~~~~~~~~~~~~~~~~~~~~
 
@@ -197,10 +208,6 @@
     rebuild the database regardless of whether or not the halo files or
     database exist on disk already.
     Default: False.
-  * ``sleep`` (float) - The amount of time in seconds tasks waits between
-    checks to make sure the SQLite database file is globally-identical.
-    This time is used to allow a parallel file system to synch up globally.
-    The value may not be negative or zero. Default: 1.
   * ``index`` (bool) - Whether to add an index to the SQLite file. True makes
     SQL searches faster at the cost of additional disk space. Default=True.

Repository URL: https://bitbucket.org/yt_analysis/yt-doc/
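
The updated docs advise copying the database to a permanent disk when it was built on a temporary one (/tmp or a RAM disk). A minimal sketch of that step follows, using only the Python standard library. The function name `copy_database` and the file names are hypothetical and are not part of yt; the integrity check is an optional extra, not something the merger tree does itself.

```python
import os
import shutil
import sqlite3
import tempfile


def copy_database(tmp_path, permanent_path):
    """Copy a finished SQLite file off temporary storage and sanity-check the copy.

    Hypothetical helper for illustration; not part of yt.
    """
    shutil.copy2(tmp_path, permanent_path)
    conn = sqlite3.connect(permanent_path)
    try:
        # PRAGMA integrity_check yields a single row containing "ok"
        # when SQLite finds no problems with the file.
        result = conn.execute("PRAGMA integrity_check").fetchone()[0]
    finally:
        conn.close()
    return result == "ok"


if __name__ == "__main__":
    # Stand-in for the database the merger tree script would have written.
    workdir = tempfile.mkdtemp()
    tmp_db = os.path.join(workdir, "halos_tmp.db")
    final_db = os.path.join(workdir, "halos.db")
    conn = sqlite3.connect(tmp_db)
    conn.execute("CREATE TABLE demo (id INTEGER)")
    conn.commit()
    conn.close()
    print(copy_database(tmp_db, final_db))
```

Run the copy only after the merger tree script has exited, so that nothing is still writing to the file.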
