summaryrefslogtreecommitdiff log msg author committer range
 ```1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 ``` ``````Persistent Homology Algorithm Toolkit (PHAT) ============================================ This is a Python interface for the `Persistent Homology Algorithm Toolkit`_, a software library that contains methods for computing the persistence pairs of a filtered cell complex represented by an ordered boundary matrix with Z\ :sub:`2` coefficients. For an introduction to persistent homology, see the textbook [1]_. This software package contains code for several algorithmic variants: * The "standard" algorithm (see [1]_, p.153) * The "row" algorithm from [2]_ (called pHrow in that paper) * The "twist" algorithm, as described in [3]_ (default algorithm) * The "chunk" algorithm presented in [4]_ * The "spectral sequence" algorithm (see [1]_, p.166) All but the standard algorithm exploit the special structure of the boundary matrix to take shortcuts in the computation. The chunk and the spectral sequence algorithms make use of multiple CPU cores if PHAT is compiled with OpenMP support. All algorithms are implemented as function objects that manipulate a given ``boundary_matrix`` (to be defined below) object to reduced form. From this reduced form one can then easily extract the persistence pairs. Alternatively, one can use the ``compute_persistence_pairs function`` which takes an algorithm as a parameter, reduces the given ``boundary_matrix`` and stores the resulting pairs in a given ``persistence_pairs`` object. The ``boundary_matrix`` class takes a "Representation" class as a parameter. This representation defines how columns of the matrix are represented and how low-level operations (e.g., column additions) are performed. The right choice of the representation class can be as important for the performance of the program as choosing the algorithm. We provide the following choices of representation classes: * ``vector_vector``: Each column is represented as a sorted ``std::vector`` of integers, containing the indices of the non-zero entries of the column. The matrix itself is a ``std::vector`` of such columns. * ``vector_heap``: Each column is represented as a heapified ``std::vector`` of integers, containing the indices of the non-zero entries of the column. The matrix itself is a ``std::vector`` of such columns. * ``vector_set``: Each column is a ``std::set`` of integers, with the same meaning as above. The matrix is stored as a ``std::vector`` of such columns. * ``vector_list``: Each column is a sorted ``std::list`` of integers, with the same meaning as above. The matrix is stored as a ``std::vector`` of such columns. * ``sparse_pivot_column``: The matrix is stored as in the vector_vector representation. However, when a column is manipulated, it is first converted into a ``std::set``, using an extra data field called the "pivot column". When another column is manipulated later, the pivot column is converted back to the ``std::vector`` representation. This can lead to significant speed improvements when many columns are added to a given pivot column consecutively. In a multicore setup, there is one pivot column per thread. * ``heap_pivot_column``: The same idea as in the sparse version. Instead of a ``std::set``, the pivot column is represented by a ``std::priority_queue``. * ``full_pivot_column``: The same idea as in the sparse version. However, instead of a ``std::set``, the pivot column is expanded into a bit vector of size n (the dimension of the matrix). To avoid costly initializations, the class remembers which entries have been manipulated for a pivot column and updates only those entries when another column becomes the pivot. * ``bit_tree_pivot_column`` (default representation): Similar to the ``full_pivot_column`` but the implementation is more efficient. Internally it is a bit-set with fast iteration over nonzero elements, and fast access to the maximal element. Installation ------------ Suppose you have checked out the PHAT repository at location \$PHAT. Then you can: cd \$PHAT/python ln -s ../include include # (or copy, we just need \$PHAT/include to be in the current folder as well) pip install pybind11 python setup.py install This will install PHAT for whatever Python installation your ``python`` executable is associated with. Please ensure you use the ``pip`` that comes from the same directory where your ``python`` executable lives! Currently, the PHAT Python bindings are known to work on: * Linux with Python 2.7 (tested on Ubuntu 14.04 with system Python) * Linux with Python 3.5 (tested on Ubuntu 14.04 with Anaconda) * Mac OS X with Python 2.7 (tested on El Capitan with Anaconda) * Mac OS X with Python 3.5 (tested on El Capitan with Anaconda) Other configurations are untested. Please let us know if there is a platform you'd like us to support, we will do so if we can. Sample usage ------------ We will build an ordered boundary matrix of this simplicial complex consisting of a single triangle:: 3 |\\ | \\ | \\ | \\ 4 5| \\ | \\ | 6 \\ | \\ |________\\ 0 2 1 Now the code:: import phat # define a boundary matrix with the chosen internal representation boundary_matrix = phat.boundary_matrix(representation = phat.representations.vector_vector) # set the respective columns -- (dimension, boundary) pairs boundary_matrix.columns = [ (0, []), (0, []), (1, [0,1]), (0, []), (1, [1,3]), (1, [0,3]), (2, [2,4,5])] # or equivalently, boundary_matrix = phat.boundary_matrix(representation = ..., columns = ...) # would combine the creation of the matrix and the assignment of the columns # print some information of the boundary matrix: print("\nThe boundary matrix has %d columns:" % len(boundary_matrix.columns)) for col in boundary_matrix.columns: s = "Column %d represents a cell of dimension %d." % (col.index, col.dimension) if (col.boundary): s = s + " Its boundary consists of the cells " + " ".join([str(c) for c in col.boundary]) print(s) print("Overall, the boundary matrix has %d entries." % len(boundary_matrix)) pairs = boundary_matrix.compute_persistence_pairs() pairs.sort() print("\nThere are %d persistence pairs: " % len(pairs)) for pair in pairs: print("Birth: %d, Death: %d" % pair) References: .. [1] H.Edelsbrunner, J.Harer: Computational Topology, An Introduction. American Mathematical Society, 2010, ISBN 0-8218-4925-5 .. [2] V.de Silva, D.Morozov, M.Vejdemo-Johansson: Dualities in persistent (co)homology. Inverse Problems 27, 2011 .. [3] C.Chen, M.Kerber: Persistent Homology Computation With a Twist. 27th European Workshop on Computational Geometry, 2011. .. [4] U.Bauer, M.Kerber, J.Reininghaus: Clear and Compress: Computing Persistent Homology in Chunks. arXiv:1303.0477_ .. _arXiv:1303.0477: http://arxiv.org/pdf/1303.0477.pdf .. _`Persistent Homology Algorithm Toolkit`: https://bitbucket.org/phat/phat-code ``````