Diffstat (limited to 'src/python/doc')
-rw-r--r--  src/python/doc/installation.rst          11
-rw-r--r--  src/python/doc/perslay_params.md         95
-rw-r--r--  src/python/doc/representations.rst       77
-rw-r--r--  src/python/doc/representations_sum.inc   24
4 files changed, 188 insertions, 19 deletions
diff --git a/src/python/doc/installation.rst b/src/python/doc/installation.rst
index 4eefd415..639378ed 100644
--- a/src/python/doc/installation.rst
+++ b/src/python/doc/installation.rst
@@ -371,6 +371,14 @@ PyTorch
`PyTorch <https://pytorch.org/>`_ is currently only used as a dependency of
`PyKeOps`_, and in some tests.
+TensorFlow
+----------
+
+:class:`~gudhi.tensorflow.perslay` from the :doc:`persistence representations </representations>` module
+requires `TensorFlow <https://www.tensorflow.org/install/>`_.
+
+`TensorFlow <https://www.tensorflow.org>`_ is also used in some automatic differentiation tests.
+
Scikit-learn
------------
@@ -393,6 +401,8 @@ mathematics, science, and engineering.
:class:`~gudhi.point_cloud.knn.KNearestNeighbors` can use the Python package
`SciPy <http://scipy.org>`_ as a backend if explicitly requested.
+<<<<<<< HEAD
+=======
TensorFlow
----------
@@ -401,6 +411,7 @@ and :doc:`Rips complex </rips_complex_tflow_itf_ref>` modules require `TensorFlo
for incorporating them in neural nets.
`TensorFlow <https://www.tensorflow.org>`_ is also used in some automatic differentiation tests.
+>>>>>>> 3e0e47b81ba488f6893933d8685fc1e7eec0e501
Bug reports and contributions
*****************************
diff --git a/src/python/doc/perslay_params.md b/src/python/doc/perslay_params.md
new file mode 100644
index 00000000..58537939
--- /dev/null
+++ b/src/python/doc/perslay_params.md
@@ -0,0 +1,95 @@
+PersLay parameters
+------------------
+
+In the following description of PersLay parameters, each parameter, or dictionary key, that contains `_init` in its name is optimized and learned by PersLay during training. If you do not want to optimize the vectorization, set the keys **train_vect** and **train_weight** to False.
+
+* The following keys are mandatory:
+ + layer
+ Either "PermutationEquivariant", "Image", "Landscape", "BettiCurve", "Entropy", "Exponential", "Rational" or "RationalHat". Type of the PersLay layer. "Image" is for `persistence images <https://arxiv.org/abs/1507.06217>`_, "Landscape" is for `persistence landscapes <http://www.jmlr.org/papers/volume16/bubenik15a/bubenik15a.pdf>`_, "Exponential", "Rational" and "RationalHat" are for `structure elements <http://jmlr.org/beta/papers/v20/18-358.html>`_, "PermutationEquivariant" is for the original DeepSet layer, defined in `this article <https://arxiv.org/abs/1703.06114>`_, "BettiCurve" is for `Betti curves <https://www.jstage.jst.go.jp/article/tjsai/32/3/32_D-G72/_pdf>`_ and "Entropy" is for `entropy <https://arxiv.org/abs/1803.08304>`_.
+ + perm_op
+ Either "sum", "mean", "max" or "topk". Permutation invariant operation.
+ + keep
+ Number of top values to keep. Used only if **perm_op** is "topk".
+ + pweight
+ Either "power", "grid", "gmix" or None. Weight function to be applied on persistence diagram points. If "power", this function is a (trainable) coefficient times the distances to the diagonal of the points to a certain power. If "grid", this function is piecewise-constant and defined with pixel values of a grid. If "gmix", this function is defined as a mixture of Gaussians. If None, no weighting is applied.
+ + final_model
+ A TensorFlow / Keras model used to postprocess the persistence diagrams in each channel. Use "identity" if you don't want to postprocess.
+* Depending on what **pweight** is, the following additional keys are required:
+ + if **pweight** is "power":
+ - pweight_init
+ Initializer of the coefficient of the power weight function. It can be either a single value, or a random initializer from tensorflow, such as tensorflow.random_uniform_initializer(0., 1.).
+ - pweight_power
+ Integer used for exponentiating the distances to the diagonal of the persistence diagram points.
+ + if **pweight** is "grid":
+ - pweight_size
+ Grid size of the grid weight function. It is a tuple of integer values, such as (10,10).
+ - pweight_bnds
+ Grid boundaries of the grid weight function. It is a tuple containing two tuples, each containing the minimum and maximum values of each axis of the plane. Example: ((-0.01, 1.01), (-0.01, 1.01)).
+ - pweight_init
+ Initializer for the pixel values of the grid weight function. It can be either a numpy array of values, or a random initializer from tensorflow, such as tensorflow.random_uniform_initializer(0., 1.).
+ + if **pweight** is "gmix":
+ - pweight_num
+ Number of Gaussian functions of the mixture of Gaussians weight function.
+ - pweight_init
+ Initializer of the means and variances of the mixture of Gaussians weight function. It can be either a numpy array of values, or a random initializer from tensorflow, such as tensorflow.random_uniform_initializer(0., 1.).
+* Depending on what **layer** is, the following additional keys are required:
+ + if **layer** is "PermutationEquivariant":
+ - lpeq
+ Sequence of permutation equivariant operations, as defined in [the DeepSet article](https://arxiv.org/abs/1703.06114). It is a list of tuples of the form (*dim*, *operation*). Each tuple defines a permutation equivariant function of dimension *dim* and second permutation operation *operation* (string, either "max", "min", "sum" or None). The second permutation operation is optional and is not applied if *operation* is set to None. Example: [(150, "max"), (75, None)].
+ - lweight_init
+ Initializer for the weight matrices of the permutation equivariant operations. It can be either a numpy array of values, or a random initializer from tensorflow, such as tensorflow.random_uniform_initializer(0., 1.).
+ - lbias_init
+ Initializer for the biases of the permutation equivariant operations. It can be either a numpy array of values, or a random initializer from tensorflow, such as tensorflow.random_uniform_initializer(0., 1.).
+ - lgamma_init
+ Initializer for the Gamma matrices of the permutation equivariant operations. It can be either a numpy array of values, or a random initializer from tensorflow, such as tensorflow.random_uniform_initializer(0., 1.).
+ + if **layer** is "Image":
+ - image_size
+ Persistence image size. It is a tuple of integer values, such as (10,10).
+ - image_bnds
+ Persistence image boundaries. It is a tuple containing two tuples, each containing the minimum and maximum values of each axis of the plane. Example: ((-0.01, 1.01), (-0.01, 1.01)).
+ - lvariance_init
+ Initializer for the bandwidths of the Gaussian functions centered on the persistence image pixels. It can be either a single value, or a random initializer from tensorflow, such as tensorflow.random_uniform_initializer(0., 3.).
+ + if **layer** is "Landscape":
+ - lsample_num
+ Number of samples of the diagonal that will be evaluated on the persistence landscapes.
+ - lsample_init
+ Initializer of the samples of the diagonal. It can be either a numpy array of values, or a random initializer from tensorflow, such as tensorflow.random_uniform_initializer(0., 1.).
+ + if **layer** is "BettiCurve":
+ - lsample_num
+ Number of samples of the diagonal that will be evaluated on the Betti curves.
+ - lsample_init
+ Initializer of the samples of the diagonal. It can be either a numpy array of values, or a random initializer from tensorflow, such as tensorflow.random_uniform_initializer(0., 1.).
+ - theta
+ Sigmoid parameter used for approximating the piecewise constant functions associated to the persistence diagram points.
+ + if **layer** is "Entropy":
+ - lsample_num
+ Number of samples on the diagonal that will be evaluated on the persistence entropies.
+ - lsample_init
+ Initializer of the samples of the diagonal. It can be either a numpy array of values, or a random initializer from tensorflow, such as tensorflow.random_uniform_initializer(0., 1.).
+ - theta
+ Sigmoid parameter used for approximating the piecewise constant functions associated to the persistence diagram points.
+ + if **layer** is "Exponential":
+ - lnum
+ Number of exponential structure elements that will be evaluated on the persistence diagram points.
+ - lmean_init
+ Initializer of the means of the exponential structure elements. It can be either a numpy array of values, or a random initializer from tensorflow, such as tensorflow.random_uniform_initializer(0., 1.).
+ - lvariance_init
+ Initializer of the bandwidths of the exponential structure elements. It can be either a numpy array of values, or a random initializer from tensorflow, such as tensorflow.random_uniform_initializer(3., 3.).
+ + if **layer** is "Rational":
+ - lnum
+ Number of rational structure elements that will be evaluated on the persistence diagram points.
+ - lmean_init
+ Initializer of the means of the rational structure elements. It can be either a numpy array of values, or a random initializer from tensorflow, such as tensorflow.random_uniform_initializer(0., 1.).
+ - lvariance_init
+ Initializer of the bandwidths of the rational structure elements. It can be either a numpy array of values, or a random initializer from tensorflow, such as tensorflow.random_uniform_initializer(3., 3.).
+ - lalpha_init
+ Initializer of the exponents of the rational structure elements. It can be either a numpy array of values, or a random initializer from tensorflow, such as tensorflow.random_uniform_initializer(3., 3.).
+ + if **layer** is "RationalHat":
+ - lnum
+ Number of rational hat structure elements that will be evaluated on the persistence diagram points.
+ - lmean_init
+ Initializer of the means of the rational hat structure elements. It can be either a numpy array of values, or a random initializer from tensorflow, such as tensorflow.random_uniform_initializer(0., 1.).
+ - lr_init
+ Initializer of the threshold of the rational hat structure elements. It can be either a numpy array of values, or a random initializer from tensorflow, such as tensorflow.random_uniform_initializer(3., 3.).
+ - q
+ Norm parameter.
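
As a rough illustration of the keys listed in perslay_params.md above, the sketch below builds a parameter dictionary for an "Image" layer with a "power" weight and checks that every key the description asks for is present. The key names and example values are taken from the description; the names `perslay_parameters`, `MANDATORY`, `PWEIGHT_KEYS`, `LAYER_KEYS` and `missing_keys` are hypothetical helpers introduced here and are not part of GUDHI. The sketch does not call PersLay itself.

```python
# Hypothetical sketch: key names follow the parameter description above,
# but none of these helpers belong to the GUDHI API.
MANDATORY = ["layer", "perm_op", "keep", "pweight", "final_model"]

# Extra keys required for each value of "pweight".
PWEIGHT_KEYS = {
    "power": ["pweight_init", "pweight_power"],
    "grid": ["pweight_size", "pweight_bnds", "pweight_init"],
    "gmix": ["pweight_num", "pweight_init"],
    None: [],
}

# Extra keys required for each value of "layer".
LAYER_KEYS = {
    "PermutationEquivariant": ["lpeq", "lweight_init", "lbias_init", "lgamma_init"],
    "Image": ["image_size", "image_bnds", "lvariance_init"],
    "Landscape": ["lsample_num", "lsample_init"],
    "BettiCurve": ["lsample_num", "lsample_init", "theta"],
    "Entropy": ["lsample_num", "lsample_init", "theta"],
    "Exponential": ["lnum", "lmean_init", "lvariance_init"],
    "Rational": ["lnum", "lmean_init", "lvariance_init", "lalpha_init"],
    "RationalHat": ["lnum", "lmean_init", "lr_init", "q"],
}


def missing_keys(params):
    """Return the required keys that are absent from `params`."""
    required = list(MANDATORY)
    required += PWEIGHT_KEYS.get(params.get("pweight"), [])
    required += LAYER_KEYS.get(params.get("layer"), [])
    return [key for key in required if key not in params]


perslay_parameters = {
    "layer": "Image",
    "perm_op": "sum",
    "keep": 5,                      # ignored unless perm_op is "topk"
    "pweight": "power",
    "pweight_init": 1.0,            # single value instead of a random initializer
    "pweight_power": 1,
    "final_model": "identity",      # no postprocessing
    "image_size": (10, 10),
    "image_bnds": ((-0.01, 1.01), (-0.01, 1.01)),
    "lvariance_init": 3.0,
    "train_vect": True,             # set both to False to freeze the vectorization
    "train_weight": True,
}

print(missing_keys(perslay_parameters))
```

Single values are used for the `_init` keys so that the sketch runs without TensorFlow; the description also allows random initializers such as `tensorflow.random_uniform_initializer(0., 1.)`.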
diff --git a/src/python/doc/representations.rst b/src/python/doc/representations.rst
index b0477197..2d66fa68 100644
--- a/src/python/doc/representations.rst
+++ b/src/python/doc/representations.rst
@@ -8,10 +8,16 @@ Representations manual
.. include:: representations_sum.inc
-This module, originally available at https://github.com/MathieuCarriere/sklearn-tda and named sklearn_tda, aims at bridging the gap between persistence diagrams and machine learning, by providing implementations of most of the vector representations for persistence diagrams in the literature, in a scikit-learn format. More specifically, it provides tools, using the scikit-learn standard interface, to compute distances and kernels on persistence diagrams, and to convert these diagrams into vectors in Euclidean space.
+This module aims at bridging the gap between persistence diagrams and machine learning, by providing implementations of most of the vector representations for persistence diagrams in the literature, in a scikit-learn format. More specifically, it provides tools, using the scikit-learn standard interface, to compute distances and kernels on persistence diagrams, and to convert these diagrams into vectors in Euclidean space. The module also contains `PersLay <http://proceedings.mlr.press/v108/carriere20a.html>`_, a general neural network layer for performing deep learning with persistence diagrams, implemented in TensorFlow.
A diagram is represented as a numpy array of shape (n,2), as can be obtained from :func:`~gudhi.SimplexTree.persistence_intervals_in_dimension` for instance. Points at infinity are represented as a numpy array of shape (n,1), storing only the birth time. The classes in this module can handle several persistence diagrams at once. In that case, the diagrams are provided as a list of numpy arrays. Note that it is not necessary for the diagrams to have the same number of points, i.e., for the corresponding arrays to have the same number of rows: all classes can handle arrays with different shapes.
+This `notebook <https://github.com/GUDHI/TDA-tutorial/blob/master/Tuto-GUDHI-representations.ipynb>`_ explains how to
+efficiently combine machine learning and topological data analysis with the
+:doc:`representations module<representations>` in a scikit-learn fashion. This `notebook <https://github.com/MathieuCarriere/tda-tutorials/blob/perslay/Tuto-GUDHI-perslay-expe.ipynb>`_
+and `this one <https://github.com/MathieuCarriere/tda-tutorials/blob/perslay/Tuto-GUDHI-perslay-visu.ipynb>`_ explain how to use PersLay.
+
+
Examples
--------
@@ -30,8 +36,6 @@ This example computes the first two Landscapes associated to a persistence diagr
l=Landscape(num_landscapes=2,resolution=10).fit_transform(diags)
print(l)
-The output is:
-
.. testoutput::
[[1.02851895 2.05703791 2.57129739 1.54277843 0.89995409 1.92847304
@@ -45,13 +49,62 @@ Various kernels
This small example is also provided
:download:`diagram_vectorizations_distances_kernels.py <../example/diagram_vectorizations_distances_kernels.py>`
-Machine Learning and Topological Data Analysis
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+PersLay
+^^^^^^^
-This `notebook <https://github.com/GUDHI/TDA-tutorial/blob/master/Tuto-GUDHI-representations.ipynb>`_ explains how to
-efficiently combine machine learning and topological data analysis with the
-:doc:`representations module<representations>`.
+.. testcode::
+
+ import numpy as np
+ import tensorflow as tf
+ from sklearn.preprocessing import MinMaxScaler
+ import gudhi.representations as gdr
+ import gudhi.tensorflow as gdtf
+
+ diagrams = [np.array([[0.,4.],[1.,2.],[3.,8.],[6.,8.]])]
+ diagrams = gdr.DiagramScaler(use=True, scalers=[([0,1], MinMaxScaler())]).fit_transform(diagrams)
+ diagrams = tf.RaggedTensor.from_tensor(tf.constant(diagrams, dtype=tf.float32))
+
+ rho = tf.identity
+ phi = gdtf.GaussianPerslayPhi((100, 100), ((-.5, 1.5), (-.5, 1.5)), .1)
+ weight = gdtf.PowerPerslayWeight(1.,0.)
+ perm_op = tf.math.reduce_sum
+
+ perslay = gdtf.Perslay(phi=phi, weight=weight, perm_op=perm_op, rho=rho)
+ vectors = perslay(diagrams)
+ print(vectors)
+
+.. testoutput::
+
+ tf.Tensor(
+ [[[[1.7266072e-16]
+ [4.1706043e-09]
+ [1.1336876e-08]
+ [8.5738821e-12]
+ [2.1243891e-14]]
+
+ [[4.1715076e-09]
+ [1.0074080e-01]
+ [2.7384272e-01]
+ [3.0724244e-02]
+ [7.6157507e-05]]
+
+ [[8.0382870e-06]
+ [1.5802664e+00]
+ [8.2997030e-01]
+ [1.2395413e+01]
+ [3.0724116e-02]]
+
+ [[8.0269419e-06]
+ [1.3065740e+00]
+ [9.0923014e+00]
+ [6.1664842e-02]
+ [1.3949171e-06]]
+
+ [[9.0331329e-13]
+ [1.4954816e-07]
+ [1.5145997e-04]
+ [1.0205092e-06]
+ [7.8093526e-16]]]], shape=(1, 5, 5, 1), dtype=float32)
Preprocessing
-------------
@@ -80,3 +133,11 @@ Metrics
:members:
:special-members:
:show-inheritance:
+
+PersLay
+-------
+.. automodule:: gudhi.tensorflow.perslay
+ :members:
+ :special-members:
+ :show-inheritance:
+
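
The PersLay testcode in the representations.rst hunk above passes four pieces to the layer: `phi`, `weight`, `perm_op` and `rho`. The NumPy sketch below shows one way these pieces can compose, namely `rho(perm_op(weight(p) * phi(p)))` over the points `p` of a single diagram. It is an assumption-laden illustration, not GUDHI's implementation: the function names `gaussian_phi`, `power_weight` and `perslay_sketch` are hypothetical, the Gaussian is a generic unnormalized one that may differ from what `GaussianPerslayPhi` computes, and the example points are arbitrary.

```python
import numpy as np


def gaussian_phi(points, size, bounds, sigma):
    """Evaluate one unnormalized Gaussian bump per diagram point on a grid.

    Returns an array of shape (n_points, size[0], size[1]).
    """
    xs = np.linspace(bounds[0][0], bounds[0][1], size[0])
    ys = np.linspace(bounds[1][0], bounds[1][1], size[1])
    gx, gy = np.meshgrid(xs, ys, indexing="ij")
    sq_dist = (points[:, 0, None, None] - gx) ** 2 \
        + (points[:, 1, None, None] - gy) ** 2
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def power_weight(points, constant, power):
    """Weight each point by constant * (death - birth) ** power."""
    return constant * np.abs(points[:, 1] - points[:, 0]) ** power


def perslay_sketch(points, phi, weight, perm_op=np.sum, rho=lambda v: v):
    """Combine the pieces: rho(perm_op over points of weight(p) * phi(p))."""
    weighted = weight(points)[:, None, None] * phi(points)
    return rho(perm_op(weighted, axis=0))


# Arbitrary example diagram with four (birth, death) points.
diagram = np.array([[0.0, 0.5], [0.125, 0.25], [0.375, 1.0], [0.75, 1.0]])


def phi(pts):
    return gaussian_phi(pts, (5, 5), ((-0.5, 1.5), (-0.5, 1.5)), 0.1)


def weight(pts):
    return power_weight(pts, 1.0, 0.0)


image = perslay_sketch(diagram, phi, weight)
print(image.shape)  # (5, 5)
```

Because `perm_op` reduces over the point axis, reordering the rows of `diagram` leaves `image` unchanged, which is the permutation invariance the layer relies on.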
diff --git a/src/python/doc/representations_sum.inc b/src/python/doc/representations_sum.inc
index 4298aea9..cce91975 100644
--- a/src/python/doc/representations_sum.inc
+++ b/src/python/doc/representations_sum.inc
@@ -1,14 +1,16 @@
.. table::
:widths: 30 40 30
- +------------------------------------------------------------------+----------------------------------------------------------------+-------------------------------------------------------------+
- | .. figure:: | Vectorizations, distances and kernels that work on persistence | :Author: Mathieu Carrière, Martin Royer |
- | img/sklearn-tda.png | diagrams, compatible with scikit-learn. | |
- | | | :Since: GUDHI 3.1.0 |
- | | | |
- | | | :License: MIT |
- | | | |
- | | | :Requires: `Scikit-learn <installation.html#scikit-learn>`_ |
- +------------------------------------------------------------------+----------------------------------------------------------------+-------------------------------------------------------------+
- | * :doc:`representations` |
- +------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------+
+ +------------------------------------------------------------------+----------------------------------------------------------------+------------------------------------------------------------------------------------------------------------+
+ | .. figure:: | Vectorizations, distances and kernels that work on persistence | :Author: Mathieu Carrière, Martin Royer |
+ | img/sklearn-tda.png | diagrams, compatible with scikit-learn and TensorFlow. | |
+ | | | :Since: GUDHI 3.1.0 |
+ | | | |
+ | | | :License: MIT |
+ | | | |
+ | | | :Requires: `Scikit-learn <installation.html#scikit-learn>`_, `TensorFlow <installation.html#tensorflow>`_ |
+ | | | |
+ +------------------------------------------------------------------+----------------------------------------------------------------+------------------------------------------------------------------------------------------------------------+
+ | * :doc:`representations` |
+ +------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+