summaryrefslogtreecommitdiff
path: root/README
diff options
context:
space:
mode:
authorulrich.bauer@gmail.com <ulrich.bauer@gmail.com@8e3bb3c2-eed4-f18f-5264-0b6c94e6926d>2013-02-28 19:31:04 +0000
committerulrich.bauer@gmail.com <ulrich.bauer@gmail.com@8e3bb3c2-eed4-f18f-5264-0b6c94e6926d>2013-02-28 19:31:04 +0000
commitc817c81d9d79fe77b0729216b7db5a7eab56a23c (patch)
tree44fb7b6b7e41bdf3f555e843338797de86cc3589 /README
parentf32c606eee06544900ebe684daca14b8ac26231e (diff)
initial checkin
git-svn-id: https://phat.googlecode.com/svn/trunk@2 8e3bb3c2-eed4-f18f-5264-0b6c94e6926d
Diffstat (limited to 'README')
-rw-r--r--README121
1 files changed, 121 insertions, 0 deletions
diff --git a/README b/README
new file mode 100644
index 0000000..a165026
--- /dev/null
+++ b/README
@@ -0,0 +1,121 @@
+PHAT (Persistent Homology Algorithm Toolbox) - Version 1.0
+Copyright 2013 IST Austria
+
+Authors:
+
+Ulrich Bauer (ulrich.bauer@ist.ac.at)
+Michael Kerber (mkerber@mpi-inf.mpg.de)
+Jan Reininghaus (jan.reininghaus@ist.ac.at)
+
+Description:
+
+This software library contains methods for computing the persistence pairs of a
+filtered cell complex represented by an ordered boundary matrix with Z_2 coefficients.
+For an introduction to persistent homology, see the textbook [1]. This software package
+contains code for several algorithmic variants:
+
+- The standard algorithm (see [1], p.153)
+- The "row algorithm" from [2] (called pHrow in there)
+- The "twist algorithm", as described in [3] (default algorithm)
+- The "chunk algorithm" presented in [4]
+
+The last two algorithms exploit the special structure of the boundary matrix
+to take shortcuts in the computation. The chunk algorithm makes use of multiple
+CPU cores if it is compiled with OpenMP support.
+
+All algorithms are implemented as classes with a common interface:
+they provide a member function "reduce" which takes a Boundary_matrix object
+as input (to be defined below) and manipulate it to reduced form.
+From this reduced form one can then easily extract the persistence pairs.
+Alternatively, the algorithms can compute the associated "Persistence_pairs"
+directly using the "get_pairs" method.
+
+The algorithm classes, as well as the Boundary_matrix class, are templates
+that take a "Representation" class as parameter. This representation defines
+how columns of the matrix are represented and how low-level operations
+(e.g., column additions) are performed. The right choice of the representation
+class can be as important for the performance of the program as choosing the
+algorithm. We provide the following choices of representation classes:
+
+- "vector_vector": Each column is represented as a sorted std::vector of integers,
+ containing the indices of the non-zero entries of the column. The matrix
+ itself is a std::vector of such columns.
+- "vector_set": Each column is a std::set of integers, with the same meaning
+ as before. The matrix is stored as a std::vector of such columns.
+- "sparse_pivot_column" (default representation): The matrix is stored as in the
+ vector_vector representation. However, when a column is manipulated, it is first
+ converted into a std::set, using an extra data field called the "pivot column".
+ When another column is manipulated later, the pivot column is converted back to
+ the std::vector representation. This can lead to speed improvements when many columns
+ are added to a given pivot column consecutively. In a multicore setup, there is one
+ pivot column per core.
+- "full_pivot_column": The same idea as in the sparse version. However,
+ instead of a std::set, the pivot column is expanded into a bit vector of size n
+ (the dimension of the matrix). To avoid costly initializations,
+ the class remembers which entries have been manipulated for a pivot column
+ and updates only those entries when another column becomes the pivot.
+
+There are two ways to interface with the library:
+
+A) using files:
+ 1) write the boundary matrix / filtration into a file "input" (see below for the file format).
+ 2) compile src/phat.cpp and run it:
+ phat input output (if you use the binary encoding, see below)
+ phat --ascii input output (if you use ascii encoding, see below)
+ 3) read the resulting persistence pairs into your program
+
+B) using the C++ library interface:
+ 0) include all headers found in src/phat.cpp
+ 1) define a boundary matrix object, e.g.
+ phat::boundary_matrix< full_pivot_column > boundary_matrix;
+ 2) set the number of columns:
+ boundary_matrix.set_num_cols(...);
+ 3) initialize each column using
+ boundary_matrix.set_col(...) and boundary_matrix.set_dim(...)
+ 4) define an object to hold the result:
+ phat::persistence_pairs pairs;
+ 5) run an algorithm like this:
+ phat::compute_persistence_pairs< phat::chunk_reduction >( pairs, boundary_matrix );
+ 6) use pairs.get_num_pairs() and pairs.get_pair(...) to examine the result
+
+ A simple example that demonstrates this functionality can be found in src/simple_example.cpp
+
+File Formats:
+
+The library supports input and output in ascii and binary format
+through the methods "[load|save]_[ascii|binary]" in the classes boundary_matrix
+and persistence_pairs. The file formats are defined as follows:
+
+boundary_matrix - ascii:
+ The file represents the filtration of the cell complex, containing one cell
+ per line (empty lines and lines starting with "#" are ignored). A cell is given by
+ a sequence of integers, separated by spaces, where the first integer denotes the
+ dimension of the cell, and all following integers give the indices
+ of the cells that form its boundary (the index of a cell is its position
+ in the filtration, starting with 0).
+ A sample file "single_triangle.dat" can be found in the examples folder.
+
+boundary_matrix - binary:
+ I binary format, the file is simply interpreted as a sequence of 64 bit signed integer
+ numbers. The first number is interpreted as the number of cells of the complex. The
+ descriptions of the cells is expected to follow, with the first number representing the
+ dimension of the cell, the next number, say N, representing the size of the boundary,
+ followed by N numbers denoting the indices of the boundary cells.
+ A sample file "single_triangle.bin" can be found in the examples folder.
+
+persistence_pairs - ascii:
+ The file contains the persistence pairs, sorted by birth index. The first integer in the
+ file is equal to the number of pairs. It is followed by pairs of integers encode the
+ respective birth and death indices.
+ A sample file "single_triangle_persistence_pairs.dat" can be found in the examples folder.
+
+persistence_pairs - binary:
+ Same as ascii format, see above. Only now the integers are encoded as 64bit signed integers.
+ A sample file "single_triangle_persistence_pairs.bin" can be found in the examples folder.
+
+References:
+
+[1] H.Edelsbrunner, J.Harer: Computational Topology, An Introduction. American Mathematical Society, 2010, ISBN 0-8218-4925-5
+[2] V.de Silva, D.Morozov, M.Vejdemo-Johansson: Dualities in persistent (co)homology. Inverse Problems 27, 2011
+[3] C.Chen, M.Kerber: Persistent Homology Computation With a Twist. 27th European Workshop on Computational Geometry, 2011.
+[4] U.Bauer, M.Kerber, J.Reininghaus: Clear and Compress: Computing Persistent Homology in Chunks. arxiv: ????.????