From c817c81d9d79fe77b0729216b7db5a7eab56a23c Mon Sep 17 00:00:00 2001 From: "ulrich.bauer@gmail.com" Date: Thu, 28 Feb 2013 19:31:04 +0000 Subject: initial checkin git-svn-id: https://phat.googlecode.com/svn/trunk@2 8e3bb3c2-eed4-f18f-5264-0b6c94e6926d --- README | 121 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 121 insertions(+) create mode 100644 README (limited to 'README') diff --git a/README b/README new file mode 100644 index 0000000..a165026 --- /dev/null +++ b/README @@ -0,0 +1,121 @@ +PHAT (Persistent Homology Algorithm Toolbox) - Version 1.0 +Copyright 2013 IST Austria + +Authors: + +Ulrich Bauer (ulrich.bauer@ist.ac.at) +Michael Kerber (mkerber@mpi-inf.mpg.de) +Jan Reininghaus (jan.reininghaus@ist.ac.at) + +Description: + +This software library contains methods for computing the persistence pairs of a +filtered cell complex represented by an ordered boundary matrix with Z_2 coefficients. +For an introduction to persistent homology, see the textbook [1]. This software package +contains code for several algorithmic variants: + +- The standard algorithm (see [1], p.153) +- The "row algorithm" from [2] (called pHrow in there) +- The "twist algorithm", as described in [3] (default algorithm) +- The "chunk algorithm" presented in [4] + +The last two algorithms exploit the special structure of the boundary matrix +to take shortcuts in the computation. The chunk algorithm makes use of multiple +CPU cores if it is compiled with OpenMP support. + +All algorithms are implemented as classes with a common interface: +they provide a member function "reduce" which takes a Boundary_matrix object +as input (to be defined below) and manipulate it to reduced form. +From this reduced form one can then easily extract the persistence pairs. +Alternatively, the algorithms can compute the associated "Persistence_pairs" +directly using the "get_pairs" method. + +The algorithm classes, as well as the Boundary_matrix class, are templates +that take a "Representation" class as parameter. This representation defines +how columns of the matrix are represented and how low-level operations +(e.g., column additions) are performed. The right choice of the representation +class can be as important for the performance of the program as choosing the +algorithm. We provide the following choices of representation classes: + +- "vector_vector": Each column is represented as a sorted std::vector of integers, + containing the indices of the non-zero entries of the column. The matrix + itself is a std::vector of such columns. +- "vector_set": Each column is a std::set of integers, with the same meaning + as before. The matrix is stored as a std::vector of such columns. +- "sparse_pivot_column" (default representation): The matrix is stored as in the + vector_vector representation. However, when a column is manipulated, it is first + converted into a std::set, using an extra data field called the "pivot column". + When another column is manipulated later, the pivot column is converted back to + the std::vector representation. This can lead to speed improvements when many columns + are added to a given pivot column consecutively. In a multicore setup, there is one + pivot column per core. +- "full_pivot_column": The same idea as in the sparse version. However, + instead of a std::set, the pivot column is expanded into a bit vector of size n + (the dimension of the matrix). To avoid costly initializations, + the class remembers which entries have been manipulated for a pivot column + and updates only those entries when another column becomes the pivot. + +There are two ways to interface with the library: + +A) using files: + 1) write the boundary matrix / filtration into a file "input" (see below for the file format). + 2) compile src/phat.cpp and run it: + phat input output (if you use the binary encoding, see below) + phat --ascii input output (if you use ascii encoding, see below) + 3) read the resulting persistence pairs into your program + +B) using the C++ library interface: + 0) include all headers found in src/phat.cpp + 1) define a boundary matrix object, e.g. + phat::boundary_matrix< full_pivot_column > boundary_matrix; + 2) set the number of columns: + boundary_matrix.set_num_cols(...); + 3) initialize each column using + boundary_matrix.set_col(...) and boundary_matrix.set_dim(...) + 4) define an object to hold the result: + phat::persistence_pairs pairs; + 5) run an algorithm like this: + phat::compute_persistence_pairs< phat::chunk_reduction >( pairs, boundary_matrix ); + 6) use pairs.get_num_pairs() and pairs.get_pair(...) to examine the result + + A simple example that demonstrates this functionality can be found in src/simple_example.cpp + +File Formats: + +The library supports input and output in ascii and binary format +through the methods "[load|save]_[ascii|binary]" in the classes boundary_matrix +and persistence_pairs. The file formats are defined as follows: + +boundary_matrix - ascii: + The file represents the filtration of the cell complex, containing one cell + per line (empty lines and lines starting with "#" are ignored). A cell is given by + a sequence of integers, separated by spaces, where the first integer denotes the + dimension of the cell, and all following integers give the indices + of the cells that form its boundary (the index of a cell is its position + in the filtration, starting with 0). + A sample file "single_triangle.dat" can be found in the examples folder. + +boundary_matrix - binary: + I binary format, the file is simply interpreted as a sequence of 64 bit signed integer + numbers. The first number is interpreted as the number of cells of the complex. The + descriptions of the cells is expected to follow, with the first number representing the + dimension of the cell, the next number, say N, representing the size of the boundary, + followed by N numbers denoting the indices of the boundary cells. + A sample file "single_triangle.bin" can be found in the examples folder. + +persistence_pairs - ascii: + The file contains the persistence pairs, sorted by birth index. The first integer in the + file is equal to the number of pairs. It is followed by pairs of integers encode the + respective birth and death indices. + A sample file "single_triangle_persistence_pairs.dat" can be found in the examples folder. + +persistence_pairs - binary: + Same as ascii format, see above. Only now the integers are encoded as 64bit signed integers. + A sample file "single_triangle_persistence_pairs.bin" can be found in the examples folder. + +References: + +[1] H.Edelsbrunner, J.Harer: Computational Topology, An Introduction. American Mathematical Society, 2010, ISBN 0-8218-4925-5 +[2] V.de Silva, D.Morozov, M.Vejdemo-Johansson: Dualities in persistent (co)homology. Inverse Problems 27, 2011 +[3] C.Chen, M.Kerber: Persistent Homology Computation With a Twist. 27th European Workshop on Computational Geometry, 2011. +[4] U.Bauer, M.Kerber, J.Reininghaus: Clear and Compress: Computing Persistent Homology in Chunks. arxiv: ????.???? -- cgit v1.2.3