author     Rémi Flamary <remi.flamary@gmail.com>  2019-07-02 16:40:30 +0200
committer  Rémi Flamary <remi.flamary@gmail.com>  2019-07-02 16:40:30 +0200
commit     ef00ce42616fe7adf747c23a5590a83b62171a36 (patch)
tree       33eceb68c112464459087b3287ee8545c6f455d1 /docs
parent     85cc12bc7731077846bb77346797165c098fc4ec (diff)
quickstart proof reading
Diffstat (limited to 'docs')
-rw-r--r--  docs/source/quickstart.rst  173
1 files changed, 88 insertions, 85 deletions
diff --git a/docs/source/quickstart.rst b/docs/source/quickstart.rst
index 0dcd7ff..b726149 100644
--- a/docs/source/quickstart.rst
+++ b/docs/source/quickstart.rst
@@ -3,7 +3,9 @@ Quick start guide
=================
In the following we provide some pointers about which functions and classes
-to use for different problems related to optimal transport (OT).
+to use for different problems related to optimal transport (OT) and machine
+learning. Where possible, we refer to concrete examples in the documentation that
+are also available as notebooks on the POT GitHub.
This document is not a tutorial on numerical optimal transport. For this we strongly
recommend reading the very nice book [15]_ .
@@ -16,7 +18,8 @@ Optimal transport and Wasserstein distance
In POT, most functions that solve OT or regularized OT problems have two
versions that return the OT matrix or the value of the optimal solution. For
instance :any:`ot.emd` returns the OT matrix and :any:`ot.emd2` returns the
- Wassertsein distance.
+ Wasserstein distance. This approach has been implemented in practice for all
+ solvers that return an OT matrix (even Gromov-Wasserstein).
Solving optimal transport
^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -32,7 +35,8 @@ as
where :
- :math:`M\in\mathbb{R}_+^{m\times n}` is the metric cost matrix defining the cost to move mass from bin :math:`a_i` to bin :math:`b_j`.
-- :math:`a` and :math:`b` are histograms (positive, sum to 1) that represent the weights of each samples in the source an target distributions.
+- :math:`a` and :math:`b` are histograms on the simplex (positive, sum to 1) that represent the
+weights of the samples in the source and target distributions.
Solving the linear program above can be done using the function :any:`ot.emd`
that will return the optimal transport matrix :math:`\gamma^*`:
@@ -43,7 +47,7 @@ that will return the optimal transport matrix :math:`\gamma^*`:
# M is the ground cost matrix
T=ot.emd(a,b,M) # exact linear program
-The method used for solving the OT problem is the network simplex, it is
+The method implemented for solving the OT problem is the network simplex; it is
implemented in C from [1]_. It has a complexity of :math:`O(n^3)` but the
solver is quite efficient and uses sparsity of the solution.
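For concreteness, here is a minimal, self-contained sketch of the workflow above (the random samples, sizes and uniform weights are illustrative assumptions, not part of the patch):

.. code::

   import numpy as np
   import ot

   # two small empirical distributions in 2D (illustrative data)
   xs = np.random.randn(50, 2)       # source samples
   xt = np.random.randn(60, 2) + 2   # target samples

   a = ot.unif(50)                   # uniform weights on the source
   b = ot.unif(60)                   # uniform weights on the target
   M = ot.dist(xs, xt)               # squared Euclidean cost by default

   T = ot.emd(a, b, M)               # exact linear program, T has shape (50, 60)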
@@ -54,15 +58,16 @@ solver is quite efficient and uses sparsity of the solution.
- :any:`auto_examples/plot_OT_1D`
- :any:`auto_examples/plot_OT_L1_vs_L2`
+
Computing Wasserstein distance
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-The value of the OT solution is often more of interest that the OT matrix :
+The value of the OT solution is often more of interest than the OT matrix:
- .. math::
- OT(a,b)=\min_\gamma \quad \sum_{i,j}\gamma_{i,j}M_{i,j}
+.. math::
+ OT(a,b)=\min_\gamma \quad \sum_{i,j}\gamma_{i,j}M_{i,j}
- s.t. \gamma 1 = a; \gamma^T 1= b; \gamma\geq 0
+ s.t. \gamma 1 = a; \gamma^T 1= b; \gamma\geq 0
It can be computed from an already estimated OT matrix with
@@ -92,7 +97,6 @@ the :math:`W_1` wasserstein distance can be done directly with :any:`ot.emd2`
when providing :code:`M=ot.dist(xs,xt, metric='euclidean')` to use the euclidean
distance.
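A minimal sketch of computing the distance value directly (again with illustrative random data):

.. code::

   import numpy as np
   import ot

   xs = np.random.randn(50, 2)
   xt = np.random.randn(60, 2) + 2
   a, b = ot.unif(50), ot.unif(60)

   M = ot.dist(xs, xt, metric='euclidean')   # Euclidean ground cost
   W1 = ot.emd2(a, b, M)                     # value of the OT problem, here W_1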
-
.. hint::
Examples of use for :any:`ot.emd2` are available in the following examples:
@@ -111,15 +115,15 @@ For instance when the samples are in 1D, then the OT problem can be solved in
functions :any:`ot.emd_1d` and :any:`ot.emd2_1d` to return respectively the OT
matrix and value. Note that since the solution is very sparse the :code:`sparse`
parameter of :any:`ot.emd_1d` allows for solving and returning the solution for
-very large problems. Note that in order to computed directly the :math:`W_p`
+very large problems. Note that in order to compute directly the :math:`W_p`
Wasserstein distance in 1D we provide the function :any:`ot.wasserstein_1d` that
takes :code:`p` as a parameter.
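A short sketch of the 1D solvers with uniform weights (the keyword names and defaults used here are assumptions to check against the API documentation):

.. code::

   import numpy as np
   import ot

   x_a = np.random.randn(100)       # 1D source samples
   x_b = np.random.randn(100) + 1   # 1D target samples

   G = ot.emd_1d(x_a, x_b)                 # OT matrix between the empirical measures
   cost = ot.emd2_1d(x_a, x_b)             # value of the 1D OT problem
   W2 = ot.wasserstein_1d(x_a, x_b, p=2)   # W_p distance, here p=2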
-Another specials for estimating OT and Monge mapping is between Gaussian
+Another special case for estimating OT and Monge mapping is between Gaussian
distributions. In this case there exists a closed form solution given in Remark
2.29 in [15]_ and the Monge mapping is an affine function and can be
also computed from the covariances and means of the source and target
-distributions. In this case when the finite sample dataset is supposed gaussian, we provide
+distributions. In the case where the finite sample dataset is assumed Gaussian, we provide
:any:`ot.da.OT_mapping_linear` that returns the parameters for the Monge
mapping.
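A sketch of the Gaussian (linear Monge) mapping estimation; the data below is synthetic and the return convention (an affine map :math:`x \mapsto xA+b`) is our reading of the API and should be checked against its documentation:

.. code::

   import numpy as np
   import ot

   xs = np.random.randn(200, 2)                                     # assumed Gaussian source
   xt = np.random.randn(200, 2).dot(np.diag([2., .5])) + [3., 1.]   # affinely shifted target

   A, b = ot.da.OT_mapping_linear(xs, xt)   # parameters of the affine Monge map
   xs_mapped = xs.dot(A) + b                # apply the estimated mapping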
@@ -129,8 +133,7 @@ Regularized Optimal Transport
Recent developments have shown the interest of regularized OT both in terms of
computational and statistical properties.
-
-We address in this section the regularized OT problem that can be expressed as
+We address in this section the regularized OT problems that can be expressed as
.. math::
\gamma^* = arg\min_\gamma \quad \sum_{i,j}\gamma_{i,j}M_{i,j} + \lambda\Omega(\gamma)
@@ -148,7 +151,6 @@ We discuss in the following specific algorithms that can be used depending on
the regularization term.
-
Entropic regularized OT
^^^^^^^^^^^^^^^^^^^^^^^
@@ -162,7 +164,8 @@ regularization has the following expression
The use of the regularization term above in the optimization problem has a very
strong impact. First it makes the problem smooth which leads to new optimization
-procedures such as L-BFGS (see :any:`ot.smooth` ). Next it makes the problem
+procedures such as the well-known Sinkhorn algorithm [2]_ or L-BFGS (see
+:any:`ot.smooth` ). Next it makes the problem
strictly convex meaning that there will be a unique solution. Finally the
solution of the resulting optimization problem can be expressed as:
@@ -172,13 +175,13 @@ solution of the resulting optimization problem can be expressed as:
where :math:`u` and :math:`v` are vectors and :math:`K=\exp(-M/\lambda)` where
the :math:`\exp` is taken component-wise. In order to solve the optimization
-problem, on can use an alternative projection algorithm that can be very
+problem, one can use an alternating projection algorithm called Sinkhorn-Knopp that can be very
efficient for large values of regularization.
-The main function is POT are :any:`ot.sinkhorn` and
+The Sinkhorn-Knopp algorithm is implemented in :any:`ot.sinkhorn` and
:any:`ot.sinkhorn2` that return respectively the OT matrix and the value of the
linear term. Note that the regularization parameter :math:`\lambda` in the
-equation above is given to those function with the parameter :code:`reg`.
+equation above is given to those functions with the parameter :code:`reg`.
>>> import ot
>>> a=[.5,.5]
@@ -188,10 +191,7 @@ equation above is given to those function with the parameter :code:`reg`.
array([[ 0.36552929, 0.13447071],
[ 0.13447071, 0.36552929]])
-
-
-More details about the algorithm used is given in the following note.
-
+More details about the algorithms used are given in the following note.
.. note::
The main function to solve entropic regularized OT is :any:`ot.sinkhorn`.
@@ -211,7 +211,7 @@ More details about the algorithm used is given in the following note.
In addition to all those variants of sinkhorn, we have another
implementation solving the problem in the smooth dual or semi-dual in
:any:`ot.smooth`. This solver uses the :any:`scipy.optimize.minimize`
- function to solve the smooth problem with :code:`L-BFGS` algorithm. Tu use
+ function to solve the smooth problem with the :code:`L-BFGS-B` algorithm. To use
this solver, use functions :any:`ot.smooth.smooth_ot_dual` or
:any:`ot.smooth.smooth_ot_semi_dual` with parameter :code:`reg_type='kl'` to
choose entropic/Kullback-Leibler regularization.
@@ -227,13 +227,13 @@ OT matrix and loss for empirical distributions with respectively
:any:`ot.bregman.empirical_sinkhorn` and :any:`ot.bregman.empirical_sinkhorn2`.
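A minimal sketch of these empirical solvers (the random data and regularization value are illustrative):

.. code::

   import numpy as np
   import ot

   Xs = np.random.randn(100, 2)
   Xt = np.random.randn(100, 2) + 1
   reg = 1e-1

   G = ot.bregman.empirical_sinkhorn(Xs, Xt, reg)       # regularized OT matrix
   loss = ot.bregman.empirical_sinkhorn2(Xs, Xt, reg)   # value of the linear term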
-
-
Finally note that we also provide in :any:`ot.stochastic` several implementations
-of stochastic solvers for entropic regularized OT [18]_ [19]_.
+of stochastic solvers for entropic regularized OT [18]_ [19]_. Those pure Python
+implementations are not optimized for speed but provide a robust implementation
+of the algorithms in [18]_ [19]_.
.. hint::
- Examples of use for :any:`ot.sinkhorn` are available in the following examples:
+ Examples of use for :any:`ot.sinkhorn` are available in:
- :any:`auto_examples/plot_OT_2D_samples`
- :any:`auto_examples/plot_OT_1D`
@@ -246,7 +246,7 @@ Other regularization
While entropic OT is the most common and favored in practice, there exist other
kinds of regularization. We provide in POT two specific solvers for other
-regularization terms: namely quadratic regularization and group lasso
+regularization terms, namely quadratic regularization and group lasso
regularization. But we also provide in :any:`ot.optim` two generic solvers that allow solving any
smooth regularization in practice.
@@ -261,14 +261,14 @@ regularization of the form
this regularization term has a similar effect to entropic regularization in
densifying the OT matrix but it keeps some sort of sparsity that is lost with
-entropic regularization as soon as :math:`\lambda>0` [17]_. This problem cen be
+entropic regularization as soon as :math:`\lambda>0` [17]_. This problem can be
solved with POT using solvers from :any:`ot.smooth`, more specifically
functions :any:`ot.smooth.smooth_ot_dual` or
:any:`ot.smooth.smooth_ot_semi_dual` with parameter :code:`reg_type='l2'` to
choose the quadratic regularization.
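A minimal sketch, assuming a small 1D histogram problem (grid, histograms and regularization strength are illustrative):

.. code::

   import numpy as np
   import ot

   n = 50
   a, b = ot.unif(n), ot.unif(n)
   x = np.linspace(0, 1, n).reshape((n, 1))
   M = ot.dist(x, x)
   M /= M.max()

   G_l2 = ot.smooth.smooth_ot_dual(a, b, M, 1e-2, reg_type='l2')   # quadratic regularization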
.. hint::
- Examples of quadratic regularization are available in the following examples:
+ Examples of quadratic regularization are available in:
- :any:`auto_examples/plot_OT_1D_smooth`
- :any:`auto_examples/plot_optim_OTreg`
@@ -288,17 +288,17 @@ where :math:`\mathcal{G}` contains non overlapping groups of lines in the OT
matrix. This regularization proposed in [5]_ will promote sparsity at the group level and for
instance will force target samples to get mass from a small number of groups.
Note that the exact OT solution is already sparse so this regularization does
-not make sens if it is not combined with others such as entropic. Depending on
+not make sense if it is not combined with entropic regularization. Depending on
the choice of :code:`p` and :code:`q`, the problem can be solved with different
approaches. When :code:`q=1` and :code:`p<1` the problem is non convex but can
-be solved using an efficient majoration minimization approach with
+be solved using an efficient majorization-minimization approach with
:any:`ot.sinkhorn_lpl1_mm`. When :code:`q=2` and :code:`p=1` we recover the
-convex gourp lasso and we provide a solver using generalized conditional
+convex group lasso and we provide a solver using generalized conditional
gradient algorithm [7]_ in function
:any:`ot.da.sinkhorn_l1l2_gl`.
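A sketch of both solvers on synthetic labeled data (sizes, labels and regularization values are illustrative assumptions; the entropic weight and the group penalty weight are passed positionally following the API order):

.. code::

   import numpy as np
   import ot

   xs = np.random.randn(60, 2)        # labeled source samples
   labels_s = np.repeat([0, 1], 30)   # their class labels
   xt = np.random.randn(40, 2) + 1    # target samples

   a, b = ot.unif(60), ot.unif(40)
   M = ot.dist(xs, xt)
   M /= M.max()

   # q=1, p<1: non convex group penalty, majorization-minimization solver
   G_lpl1 = ot.da.sinkhorn_lpl1_mm(a, labels_s, b, M, 1e-1, 1.0)

   # q=2, p=1: convex group lasso, generalized conditional gradient solver
   G_l1l2 = ot.da.sinkhorn_l1l2_gl(a, labels_s, b, M, 1e-1, 1.0)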
.. hint::
- Examples of group Lasso regularization are available in the following examples:
+ Examples of group Lasso regularization are available in:
- :any:`auto_examples/plot_otda_classes`
- :any:`auto_examples/plot_otda_d2`
@@ -309,7 +309,7 @@ Generic solvers
Finally we propose in POT generic solvers that can be used to solve any
regularization as long as you can provide a function computing the
-regularization and a function computing its gradient.
+regularization and a function computing its gradient (or sub-gradient).
In order to solve
@@ -319,13 +319,14 @@ In order to solve
s.t. \gamma 1 = a; \gamma^T 1= b; \gamma\geq 0
you can use function :any:`ot.optim.cg` that will use a conditional gradient as
-proposed in [6]_ . you need to provide the regularization function as parameter
+proposed in [6]_ . You need to provide the regularization function as parameter
``f`` and its gradient as parameter ``df``. Note that the conditional gradient relies on
iterative solving of a linearization of the problem using the exact
-:any:`ot.emd` so it can be slow in practice. Still it always returns a
+:any:`ot.emd` so it can be slow in practice. Still, since its iterates always
+remain feasible, it returns a
transport matrix that does not violate the marginals.
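As an illustration, a quadratic regularization :math:`\Omega(\gamma)=\frac{1}{2}\|\gamma\|_F^2` can be plugged into :any:`ot.optim.cg` as follows (the problem data is illustrative):

.. code::

   import numpy as np
   import ot

   n = 50
   a, b = ot.unif(n), ot.unif(n)
   x = np.linspace(0, 1, n).reshape((n, 1))
   M = ot.dist(x, x)
   M /= M.max()

   def f(G):             # regularization term
       return 0.5 * np.sum(G ** 2)

   def df(G):            # its gradient
       return G

   G = ot.optim.cg(a, b, M, 1e-1, f, df)   # conditional gradient solver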
-Another solver is proposed to solve the problem
+Another generic solver is proposed to solve the problem
.. math::
\gamma^* = arg\min_\gamma \quad \sum_{i,j}\gamma_{i,j}M_{i,j}+ \lambda_e\Omega_e(\gamma) + \lambda\Omega(\gamma)
@@ -333,11 +334,12 @@ Another solver is proposed to solve the problem
s.t. \gamma 1 = a; \gamma^T 1= b; \gamma\geq 0
where :math:`\Omega_e` is the entropic regularization. In this case we use a
-generalized conditional gradient [7]_ implemented in :any:`ot.optim.gcg` that does not linearize the entropic term and
+generalized conditional gradient [7]_ implemented in :any:`ot.optim.gcg` that
+does not linearize the entropic term but
relies on :any:`ot.sinkhorn` for its iterations.
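Reusing the same quadratic term as above, a sketch of the generalized conditional gradient call (the regularization weights are illustrative):

.. code::

   import numpy as np
   import ot

   n = 50
   a, b = ot.unif(n), ot.unif(n)
   x = np.linspace(0, 1, n).reshape((n, 1))
   M = ot.dist(x, x)
   M /= M.max()

   def f(G):
       return 0.5 * np.sum(G ** 2)

   def df(G):
       return G

   # first weight: entropic term (handled by sinkhorn), second: the smooth term f
   G = ot.optim.gcg(a, b, M, 1e-2, 1e-1, f, df)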
.. hint::
- Example of generic solvers are available in the following example:
+ An example of the generic solvers is available in:
- :any:`auto_examples/plot_optim_OTreg`
@@ -347,7 +349,7 @@ Wasserstein Barycenters
A Wasserstein barycenter is a distribution that minimizes its Wasserstein
distance with respect to other distributions [16]_. It corresponds to minimizing the
-following problem by seaching a distribution :math:`\mu`
+following problem by searching for a distribution :math:`\mu` such that
.. math::
\min_\mu \quad \sum_{k} w_kW(\mu,\mu_k)
@@ -371,7 +373,6 @@ Barycenters with fixed support
When optimizing a barycenter with a fixed support, the optimization problem can
be expressed as
-
.. math::
\min_a \quad \sum_{k} w_k W(a,b_k)
@@ -379,36 +380,36 @@ where :math:`b_k` are also weights in the simplex. In the non-regularized case,
the problem above is a classical linear program. In this case we propose a
solver :any:`ot.lp.barycenter` that relies on generic LP solvers. By default the
function uses :any:`scipy.optimize.linprog`, but more efficient LP solvers from
-cvxopt can be also used by changing parameter :code:`solver`. Note that these
-solver require to solve a very large linear program and can be very slow in
+cvxopt can also be used by changing the parameter :code:`solver`. Note that this problem
+requires solving a very large linear program and can be very slow in
practice.
Similarly to the OT problem, OT barycenters can be computed in the regularized
-case. When using entropic regularization the problem can be solved with a
+case. When entropic regularization is used, the problem can be solved with a
generalization of the sinkhorn algorithm based on bregman projections [3]_. This
algorithm is provided in function :any:`ot.bregman.barycenter` also available as
:any:`ot.barycenter`. In this case, the algorithm scales better to large
distributions and relies only on matrix multiplications that can be performed in
parallel.
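A minimal sketch of an entropic barycenter of two 1D histograms on a common grid (the Gaussian-bump histograms and regularization are illustrative; the input histograms are assumed to be the columns of ``A``):

.. code::

   import numpy as np
   import ot

   n = 100
   x = np.linspace(0, 1, n)
   M = ot.dist(x.reshape((n, 1)), x.reshape((n, 1)))
   M /= M.max()

   a1 = np.exp(-(x - 0.2) ** 2 / 0.01)   # two illustrative histograms
   a2 = np.exp(-(x - 0.8) ** 2 / 0.01)
   A = np.vstack((a1 / a1.sum(), a2 / a2.sum())).T   # one histogram per column

   bary = ot.barycenter(A, M, 1e-2, weights=np.array([0.5, 0.5]))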
-In addition to teh speedup brought by regularization, one can also greatly
+In addition to the speedup brought by regularization, one can also greatly
accelerate the estimation of Wasserstein barycenter when the support has a
separable structure [21]_. In the case of 2D images for instance one can replace
-the matrix vector production in teh bregman projections by convolution
+the matrix-vector products in the Bregman projections by convolution
operators. We provide an implementation of this algorithm in function
:any:`ot.bregman.convolutional_barycenter2d`.
.. hint::
- Example of Wasserstein (:any:`ot.lp.barycenter`) and regularized Wasserstein
- barycenter (:any:`ot.bregman.barycenter`) computation are available in the following examples:
+ Examples of Wasserstein (:any:`ot.lp.barycenter`) and regularized Wasserstein
+ barycenter (:any:`ot.bregman.barycenter`) computation are available in:
- :any:`auto_examples/plot_barycenter_1D`
- :any:`auto_examples/plot_barycenter_lp_vs_entropic`
- Example of convolutional barycenter
+ An example of convolutional barycenter
(:any:`ot.bregman.convolutional_barycenter2d`) computation
for 2D images is available
- in the following example:
+ in:
- :any:`auto_examples/plot_convolutional_barycenter`
@@ -425,15 +426,15 @@ corresponds to solving the following optimization problem:
s.t. \quad \mu=\sum_{i=1}^n a_i\delta_{x_i}
-We provide an alternating solver based on [20]_ in
+We provide a solver based on [20]_ in
:any:`ot.lp.free_support_barycenter`. This function minimizes the problem and
-return an optimal support :math:`\{x_i\}` for uniform or given weights
+returns a locally optimal support :math:`\{x_i\}` for uniform or given weights
:math:`a`.
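A sketch of the free support solver on two empirical 2D measures (the list/array shapes follow our reading of the API and should be checked):

.. code::

   import numpy as np
   import ot

   measures_locations = [np.random.randn(30, 2), np.random.randn(40, 2) + 2]
   measures_weights = [ot.unif(30), ot.unif(40)]

   k = 10                           # number of support points of the barycenter
   X_init = np.random.randn(k, 2)   # initial free support
   b = ot.unif(k)                   # fixed weights on the barycenter

   X = ot.lp.free_support_barycenter(measures_locations, measures_weights, X_init, b)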
.. hint::
- Example of the fee support barycenter estimation is available
- in the following example:
+ An example of the free support barycenter estimation is available
+ in:
- :any:`auto_examples/plot_free_support_barycenter`
@@ -449,7 +450,8 @@ distribution but that minimizes the transport loss. The existence and uniqueness
optimal mapping is still an open problem in the general case but has been proven
for smooth distributions by Brenier in his eponymous `theorem
<https://who.rocq.inria.fr/Jean-David.Benamou/demiheure.pdf>`__. We provide in
-:any:`ot.da` several solvers for Monge mapping estimation and domain adaptation.
+:any:`ot.da` several solvers for smooth Monge mapping estimation and domain
+adaptation from discrete distributions.
Monge Mapping estimation
^^^^^^^^^^^^^^^^^^^^^^^^
@@ -468,14 +470,14 @@ For a more general mapping estimation we also provide the barycentric mapping
proposed in [6]_ . It is implemented in the class :any:`ot.da.EMDTransport` and
other transport based classes in :any:`ot.da` . Those classes are discussed more
in the following but follow an interface similar to sklearn classes. Finally a
-method proposed in [8]_ that estimate a continuous mapping approximating the
+method proposed in [8]_ that estimates a continuous mapping approximating the
barycentric mapping is provided in :any:`ot.da.joint_OT_mapping_linear` for
linear mapping and :any:`ot.da.joint_OT_mapping_kernel` for non linear mapping.
.. hint::
- Example of the linear Monge mapping estimation is available
- in the following example:
+ An example of the linear Monge mapping estimation is available
+ in:
- :any:`auto_examples/plot_otda_linear_mapping`
@@ -489,12 +491,14 @@ transport labeled source samples onto the target distribution with no labels.
We provide several classes based on :any:`ot.da.BaseTransport` that provide
several OT and mapping estimations. The interface of those classes is similar to
-classifiers in sklearn toolbox. At initialization several parameters (for
-instance regularization parameter) can be set. Then one needs to estimate the
+classifiers in the sklearn toolbox. At initialization, several parameters such as
+the regularization parameter value can be set. Then one needs to estimate the
mapping with function :any:`ot.da.BaseTransport.fit`. Finally one can map the
samples from source to target with :any:`ot.da.BaseTransport.transform` and
-from target to source with :any:`ot.da.BaseTransport.inverse_transform`. Here is
-an example for class :any:`ot.da.EMDTransport`
+from target to source with :any:`ot.da.BaseTransport.inverse_transform`.
+
+Here is
+an example for the class :any:`ot.da.EMDTransport`:
.. code::
@@ -503,12 +507,11 @@ an example for class :any:`ot.da.EMDTransport`
Mapped_Xs= ot_emd.transform(Xs=Xs)
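For orientation, a fuller sketch of the fit/transform workflow around the snippet above, on hypothetical random data:

.. code::

   import numpy as np
   import ot

   Xs = np.random.randn(100, 2)          # source samples
   Xt = np.random.randn(80, 2) + 1       # target samples

   ot_emd = ot.da.EMDTransport()         # exact OT based barycentric mapping
   ot_emd.fit(Xs=Xs, Xt=Xt)              # estimate the coupling

   Mapped_Xs = ot_emd.transform(Xs=Xs)   # map source samples onto the target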
-A list
-of the provided implementation is given in the following note.
+A list of the provided implementations is given in the following note.
.. note::
- Here is a list of the mapping classes inheriting from
+ Here is a list of the OT mapping classes inheriting from
:any:`ot.da.BaseTransport`
* :any:`ot.da.EMDTransport` : Barycentric mapping with EMD transport
@@ -523,7 +526,7 @@ of the provided implementation is given in the following note.
.. hint::
- Example of the use of OTDA classes are available in the following exmaples:
+ Examples of the use of OTDA classes are available in:
- :any:`auto_examples/plot_otda_color_images`
- :any:`auto_examples/plot_otda_mapping`
@@ -533,7 +536,7 @@ of the provided implementation is given in the following note.
Other applications
------------------
-We discuss in the following several implementations that has been used and
+We discuss in the following several OT-related problems and tools that have been
proposed in the OT and machine learning community.
Wasserstein Discriminant Analysis
@@ -551,7 +554,7 @@ consist in finding a linear projector optimizing the following criterion
where :math:`\#` is the push-forward operator, :math:`OT_e` is the entropic OT
loss and :math:`\mu_i` is the
distribution of samples from class :math:`i`. :math:`P` is also constrained to
-be in the Stiefel manifold. WDA can be solved in pot using function
+be in the Stiefel manifold. WDA can be solved in POT using the function
:any:`ot.dr.wda`. It requires :code:`pymanopt` and
:code:`autograd` to be installed for manifold optimization and automatic differentiation
respectively. Note that we also provide the Fisher discriminant estimator in
@@ -564,7 +567,7 @@ respectively. Note that we also provide the Fisher discriminant estimator in
.. hint::
- An example of the use of WDA is available in the following example:
+ An example of the use of WDA is available in:
- :any:`auto_examples/plot_WDA`
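A sketch of a WDA call on synthetic labeled data (requires :code:`pymanopt` and :code:`autograd`; the data, projection dimension and regularization value are illustrative assumptions):

.. code::

   import numpy as np
   import ot

   n = 100
   X = np.random.randn(n, 3)           # 3D samples, two classes
   y = np.repeat([0, 1], n // 2)
   X[y == 1, :2] += 2                  # discriminative information in two dimensions

   p = 2        # dimension of the projection subspace
   reg = 1e0    # entropic regularization of the inner OT problems

   P, proj = ot.dr.wda(X, y, p, reg)   # P lies on the Stiefel manifold
   X_proj = proj(X)                    # project the data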
@@ -582,7 +585,7 @@ problem:
s.t. \quad \gamma\geq 0
-where KL is the Kullback-Leibler divergence. This formulation allwos for
+where KL is the Kullback-Leibler divergence. This formulation allows for
computing approximate mapping between distributions that do not have the same
amount of mass. Interestingly the problem can be solved with a generalization of
the Bregman projections algorithm [10]_. We provide a solver for unbalanced OT
@@ -594,7 +597,7 @@ is available in :any:`ot.barycenter_unbalanced`.
.. hint::
Examples of the use of :any:`ot.sinkhorn_unbalanced` and
- :any:`ot.barycenter_unbalanced` are available in:
+ :any:`ot.barycenter_unbalanced` are available in:
- :any:`auto_examples/plot_UOT_1D`
- :any:`auto_examples/plot_UOT_barycenter_1D`
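A sketch of an unbalanced OT computation between histograms with different total masses (the fifth positional argument is the weight of the marginal relaxation; its name differs across POT versions, so it is passed positionally here):

.. code::

   import numpy as np
   import ot

   n = 50
   x = np.linspace(0, 1, n).reshape((n, 1))
   M = ot.dist(x, x)
   M /= M.max()

   a = ot.unif(n)
   b = 0.5 * ot.unif(n)   # target with half the total mass

   G = ot.sinkhorn_unbalanced(a, b, M, 1e-2, 1e0)   # entropic reg, then marginal relaxation weight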
@@ -609,46 +612,46 @@ from the two distributions. [13]_ proposed instead to realign the metric spaces
by computing a transport between distance matrices. The Gromov-Wasserstein
alignment between two distributions can be expressed as the one minimizing:
-
.. math::
GW = \min_\gamma \sum_{i,j,k,l} L(C1_{i,k},C2_{j,l})*\gamma_{i,j}*\gamma_{k,l}
s.t. \gamma 1 = a; \gamma^T 1= b; \gamma\geq 0
where :math:`C1` is the distance matrix between samples in the source
-distribution and :math:`C2` the one between samples in the target, :math:`L(C1_{i,k},C2_{j,l})` is a measure of similarity between
+distribution and :math:`C2` the one between samples in the target,
+:math:`L(C1_{i,k},C2_{j,l})` is a measure of similarity between
:math:`C1_{i,k}` and :math:`C2_{j,l}` often chosen as
:math:`L(C1_{i,k},C2_{j,l})=\|C1_{i,k}-C2_{j,l}\|^2`. The optimization problem
above is a non-convex quadratic program but we provide a solver that finds a
local minimum using conditional gradient in :any:`ot.gromov.gromov_wasserstein`.
-There also exist an entropic regularized variant of GW that has been proposed in
+There also exists an entropic regularized variant of GW that has been proposed in
[12]_ and we provide an implementation of their algorithm in
:any:`ot.gromov.entropic_gromov_wasserstein`.
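A sketch of a GW computation between two point clouds living in spaces of different dimensions (the data and the squared loss choice are illustrative):

.. code::

   import numpy as np
   import ot

   xs = np.random.randn(30, 2)   # samples in 2D
   xt = np.random.randn(30, 3)   # samples in 3D

   C1 = ot.dist(xs, xs)          # intra-domain distance matrices
   C2 = ot.dist(xt, xt)
   C1 /= C1.max()
   C2 /= C2.max()

   p, q = ot.unif(30), ot.unif(30)
   G = ot.gromov.gromov_wasserstein(C1, C2, p, q, 'square_loss')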
Note that similarly to Wasserstein distance GW allows for the definition of GW
-barycenters that cen be expressed as
+barycenters that can be expressed as
.. math::
\min_{C\geq 0} \quad \sum_{k} w_k GW(C,Ck)
where :math:`Ck` is the distance matrix between samples in distribution
-:math:`k`. Note that interestingly the barycenter is defined a a symmetric
+:math:`k`. Note that interestingly the barycenter is defined as a symmetric
positive matrix. We provide a block coordinate optimization procedure in
:any:`ot.gromov.gromov_barycenters` and
:any:`ot.gromov.entropic_gromov_barycenters` for non-regularized and regularized
barycenters respectively.
Finally note that recently a fusion between Wasserstein and GW, coined Fused
-Groimov-Wasserstein (FGW) has been proposed
+Gromov-Wasserstein (FGW) has been proposed
in [24]_. It allows computing a similarity between objects that are only partly in
the same space. As such it can be used to measure similarity between labeled
graphs for instance and also provide computable barycenters.
-The implementations of FGW is provided in functions
+The implementations of FGW and FGW barycenters are provided in functions
:any:`ot.gromov.fused_gromov_wasserstein` and :any:`ot.gromov.fgw_barycenters`.
.. hint::
- Examples of computation of GW, regularized G and FGW are provided in :
+ Examples of computation of GW, regularized GW and FGW are available in:
- :any:`auto_examples/plot_gromov`
- :any:`auto_examples/plot_fgw`
@@ -663,7 +666,7 @@ GPU acceleration
^^^^^^^^^^^^^^^^
We provide several implementations of our OT solvers in :any:`ot.gpu`. Those
-implementation use the :code:`cupy` toolbox that obviously need to be installed.
+implementations use the :code:`cupy` toolbox that obviously needs to be installed.
.. note::
@@ -677,7 +680,7 @@ implementation use the :code:`cupy` toolbox that obviously need to be installed.
- :any:`ot.gpu.sinkhorn_lpl1_mm` : computation of sinkhorn + group lasso
Note that while the :any:`ot.gpu` module has been designed to be compatible with
-POT, calling its function with numpy array will incur a large overhead due to
+POT, calling its functions with :any:`numpy` arrays will incur a large overhead due to
the memory copy of the array on GPU prior to computation and conversion of the
array after computation. To avoid this overhead, we provide functions
:any:`ot.gpu.to_gpu` and :any:`ot.gpu.to_np` that perform the conversion
@@ -697,7 +700,7 @@ FAQ
1. **How to solve a discrete optimal transport problem ?**
- The solver for discrete is the function :py:mod:`ot.emd` that returns
+ The solver for discrete OT is the function :py:mod:`ot.emd` that returns
the OT transport matrix. If you want to solve a regularized OT you can
use :py:mod:`ot.sinkhorn`.
@@ -711,7 +714,7 @@ FAQ
T=ot.emd(a,b,M) # exact linear program
T_reg=ot.sinkhorn(a,b,M,reg) # entropic regularized OT
- More detailed examples can be seen on this
+ A more detailed example is available in:
:doc:`auto_examples/plot_OT_2D_samples`