Introduction
=====

This package is a cleaned-up subset of the hodgepodge of Python glue that I regularly use to massage data into and out of persistent homology and other TDA computations. It does *not* compute anything itself, and exists primarily to marshal data into and out of the formats used by the excellent [DIPHA](https://github.com/DIPHA/dipha) software, in order to make it easier to pre- and post-process data in Python.

Caveats
-----

I decided to clean up and release a subset of these scripts to make life slightly easier for those who compute persistent homology and prefer to manipulate data in Python. The scripts come with the following caveats:

- The scripts were too messy to release as they were, and the clean-up process means that this is now essentially untested software again. Beware.
- There is nothing of substance here. This is just glue around DIPHA.
- I use Python as a modern Perl, and my experience with it is limited to quickly manipulating data without too much thought for writing structured software. Beware.
- I make no attempt to accurately reflect the full capabilities of DIPHA.
- Since Python will let you write `Interval("banana", 5)`, so will I, and you and your persistence generator "from banana to 5" can go solve problems down the road.

Installation
-----

The scripts require NumPy and Matplotlib. If you want them to manage running DIPHA as well, then you of course need working dipha and mpirun executables. If the environment variables DIPHA and/or MPIRUN are set, their values specify these executables. If not, the executables are searched for in PATH.

The package can be installed using standard Python tools, for example by doing `python setup.py install --user`.

Todo/missing
----

* Support for general filtered complexes (DIPHA's `DIPHA_WEIGHTED_BOUNDARY_MATRIX`) to allow for non-flag complexes. This has high priority, and should be added soon.
* More flexible plotting.
Use
=====

I primarily use these scripts to convert to and from DIPHA's file formats. The `DiphaRunner` class is available for those who would like to avoid manually running DIPHA.

I am not aiming for great performance, efficiency or generality with these scripts. If you want any of those, you are better off writing the DIPHA files yourself. The scripts are intended to be easy to use.

Examples
-----

### Saving a complete weighted graph in DIPHA's format

This is as simple as:

    import phstuff.diphawrapper as dipha
    import numpy as np

    weights = np.random.uniform(0, 1, (100, 100)) # The graph.
    dipha.save_weight_matrix("weights.dipha", weights)

DIPHA can now be run on "weights.dipha".

### Loading a DIPHA barcode file

To load a dimension-`1` persistence diagram that DIPHA stored in "out.dipha", do:

    import phstuff.diphawrapper as dipha

    barcode = dipha.load_barcode("out.dipha")
    for interval in barcode[1]:
        print(interval)

### Excluding edges/simplices above a certain weight

If we want to exclude all edges with weights above some threshold in the filtration, we can either use `save_edge_list` or `save_masked_weight_matrix`. An example of the latter is shown below:

    import phstuff.diphawrapper as dipha
    import numpy as np
    import numpy.ma as ma

    weights = np.random.uniform(0, 1, (100, 100))
    masked = ma.masked_greater(weights, 0.5) # All weights above 0.5 are
                                             # masked out and will not be
                                             # present in the graph,
                                             # effectively ending the
                                             # filtration at 0.5.
    dipha.save_masked_weight_matrix("weights.dipha", masked)

For more information about masked arrays, see [the NumPy documentation](https://docs.scipy.org/doc/numpy/reference/maskedarray.generic.html). Do remember that the masked entries are the edges that will *not* be present in the graph.

### Running DIPHA from Python

If you prefer to control the entire computation from Python, these scripts can generate the necessary temporary files and run DIPHA on them.
For this, the mpirun and dipha executables must be discoverable through the PATH environment variable, or they must be specified through the MPIRUN and DIPHA environment variables.

    import phstuff.diphawrapper as dipha
    import phstuff.barcode as bc # Assumed home of the plot helper used below.
    import numpy as np
    import matplotlib.pyplot as plt

    weights = np.random.uniform(0, 1, (100, 100))
    dipharunner = dipha.DiphaRunner(2) # Compute up to 2-simplices.
    dipharunner.weight_matrix(weights)
    dipharunner.run()

    for interval in dipharunner.barcode[1]:
        print(interval)

    # Plot the dimension-1 barcode over the range of the weights.
    fig = plt.figure()
    ax = fig.add_subplot(1,1,1)
    bc.plot(ax, dipharunner.barcode[1], weights.min(), weights.max())
    plt.show()

### PH of an alpha complex made with CGAL

TODO.
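
### Checking that the dipha and mpirun executables can be found

Before letting Python run DIPHA, it can be handy to check which executables would be picked up. The sketch below mirrors the lookup rule described under Installation (environment variable first, then PATH). It is not the code this package uses internally: `find_executable` is a hypothetical helper written only for illustration, and treating an empty environment variable as unset is an assumption.

```python
import os
import shutil


def find_executable(env_var, default_name):
    """Return the path given by env_var if set, else search PATH for default_name.

    Returns None if the variable is unset (or empty) and nothing named
    default_name is found in PATH. Hypothetical helper, not part of phstuff.
    """
    override = os.environ.get(env_var)
    if override:
        return override
    return shutil.which(default_name)


if __name__ == "__main__":
    for env_var, name in (("DIPHA", "dipha"), ("MPIRUN", "mpirun")):
        path = find_executable(env_var, name)
        print("{}: {}".format(name, path if path else "not found"))
```

If either line reports "not found", set the corresponding environment variable or extend PATH before trying to run DIPHA from Python.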