From 4cdc7f03fb5917134ba8886b026c8990f56bcfeb Mon Sep 17 00:00:00 2001
From: tlacombe <lacombe1993@gmail.com>
Date: Tue, 31 Mar 2020 11:21:27 +0200
Subject: merged doc from barycenters to wasserstein distance

---
 src/python/doc/wasserstein_distance_sum.inc  | 10 +--
 src/python/doc/wasserstein_distance_user.rst | 91 ++++++++++++++++++++++++++--
 2 files changed, 92 insertions(+), 9 deletions(-)

diff --git a/src/python/doc/wasserstein_distance_sum.inc b/src/python/doc/wasserstein_distance_sum.inc
index a97f428d..09424de2 100644
--- a/src/python/doc/wasserstein_distance_sum.inc
+++ b/src/python/doc/wasserstein_distance_sum.inc
@@ -3,11 +3,11 @@
 
    +-----------------------------------------------------------------+----------------------------------------------------------------------+------------------------------------------------------------------+
    | .. figure::                                                     | The q-Wasserstein distance measures the similarity between two       | :Author: Theo Lacombe                                            |
-   |      ../../doc/Bottleneck_distance/perturb_pd.png               | persistence diagrams. It's the minimum value c that can be achieved  |                                                                  |
-   |      :figclass: align-center                                    | by a perfect matching between the points of the two diagrams (+ all  | :Introduced in: GUDHI 3.1.0                                      |
-   |                                                                 | diagonal points), where the value of a matching is defined as the    |                                                                  |
-   |      Wasserstein distance is the q-th root of the sum of the    | q-th root of the sum of all edge lengths to the power q. Edge lengths| :Copyright: MIT                                                  |
-   |      edge lengths to the power q.                               | are measured in norm p, for :math:`1 \leq p \leq \infty`.            |                                                                  |
+   |      ../../doc/Bottleneck_distance/perturb_pd.png               | persistence diagrams using the sum of all edges lengths (instead of  |                                                                  |
+   |      :figclass: align-center                                    | the maximum). It allows to define sophisticated objects such as      | :Introduced in: GUDHI 3.1.0                                      |
+   |                                                                 | barycenters of a family of persistence diagrams.                     |                                                                  |
+   |      Wasserstein distance is the q-th root of the sum of the    |                                                                      | :Copyright: MIT                                                  |
+   |      edge lengths to the power q.                               |                                                                      |                                                                  |
    |                                                                 |                                                                      | :Requires: Python Optimal Transport (POT) :math:`\geq` 0.5.1     |
    +-----------------------------------------------------------------+----------------------------------------------------------------------+------------------------------------------------------------------+
    | * :doc:`wasserstein_distance_user`                              |                                                                                                                                         |
diff --git a/src/python/doc/wasserstein_distance_user.rst b/src/python/doc/wasserstein_distance_user.rst
index a9b21fa5..6de05afc 100644
--- a/src/python/doc/wasserstein_distance_user.rst
+++ b/src/python/doc/wasserstein_distance_user.rst
@@ -9,10 +9,16 @@ Definition
 
 .. include:: wasserstein_distance_sum.inc
 
-Functions
----------
-This implementation uses the Python Optimal Transport library and is based on
-ideas from "Large Scale Computation of Means and Cluster for Persistence
+The q-Wasserstein distance is defined as the minimal value 
+by a perfect matching between the points of the two diagrams (+ all  
+diagonal points), where the value of a matching is defined as the    
+q-th root of the sum of all edge lengths to the power q. Edge lengths
+are measured in norm p, for :math:`1 \leq p \leq \infty`.            
+
+Distance Functions
+------------------
+This first implementation uses the Python Optimal Transport library and is based
+on ideas from "Large Scale Computation of Means and Cluster for Persistence
 Diagrams via Optimal Transport" :cite:`10.5555/3327546.3327645`.
 
 .. autofunction:: gudhi.wasserstein.wasserstein_distance
@@ -84,3 +90,80 @@ The output is:
     point 1 in dgm1 is matched to point 2 in dgm2
     point 2 in dgm1 is matched to the diagonal
     point 1 in dgm2 is matched to the diagonal
+
+
+Barycenters
+-----------
+
+A Frechet mean (or barycenter) is a generalization of the arithmetic 
+mean in a non linear space such as the one of persistence diagrams.  
+Given a set of persistence diagrams :math:`\mu_1 \dots \mu_n`, it is 
+defined as a minimizer of the variance functional, that is of        
+:math:`\mu \mapsto \sum_{i=1}^n d_2(\mu,\mu_i)^2`.                   
+where :math:`d_2` denotes the Wasserstein-2 distance between         
+persistence diagrams.                                                
+It is known to exist and is generically unique. However, an exact    
+computation is in general untractable. Current implementation        
+available is based on (Turner et al., 2014), 
+:cite:`turner2014frechet`
+and uses an EM-scheme to 
+provide a local minimum of the variance functional (somewhat similar 
+to the Lloyd algorithm to estimate a solution to the k-means         
+problem). The local minimum returned depends on the initialization of
+the barycenter.                                                      
+The combinatorial structure of the algorithm limits its              
+scaling on large scale problems (thousands of diagrams and of points 
+per diagram).                                                        
+
+.. figure::                                                    
+     ./img/barycenter.png                                      
+     :figclass: align-center                                   
+                                                               
+     Illustration of Frechet mean between persistence          
+     diagrams.                                                 
+
+
+.. autofunction:: gudhi.barycenter.lagrangian_barycenter
+
+Basic example
+-------------
+
+This example computes the Frechet mean (aka Wasserstein barycenter) between 
+four persistence diagrams.
+It is initialized on the 4th diagram.
+As the algorithm is not convex, its output depends on the initialization and 
+is only a local minimum of the objective function.
+Initialization can be either given as an integer (in which case the i-th 
+diagram of the list is used as initial estimate) or as a diagram. 
+If None, it will randomly select one of the diagram of the list 
+as initial estimate.
+Note that persistence diagrams must be submitted as 
+(n x 2) numpy arrays and must not contain inf values.
+
+
+.. testcode::
+
+    import gudhi.barycenter
+    import numpy as np
+
+    dg1 = np.array([[0.2, 0.5]])
+    dg2 = np.array([[0.2, 0.7]])
+    dg3 = np.array([[0.3, 0.6], [0.7, 0.8], [0.2, 0.3]])
+    dg4 = np.array([])
+    pdiagset = [dg1, dg2, dg3, dg4]
+    bary = gudhi.wasserstein.barycenter.lagrangian_barycenter(pdiagset=pdiagset,init=3)
+
+    message = "Wasserstein barycenter estimated:"    
+    print(message)
+    print(bary)
+
+The output is:
+
+.. testoutput::
+
+    Wasserstein barycenter estimated:
+    [[0.27916667 0.55416667]
+     [0.7375     0.7625    ]
+     [0.2375     0.2625    ]]
+
+
-- 
cgit v1.2.3