:orphan:

.. To get rid of WARNING: document isn't included in any toctree

================
Datasets manual
================

Datasets generators
===================

GUDHI provides generators for several customizable datasets that can be used as inputs for GUDHI complexes and data structures.

Points generators
------------------

The module **points** generates random points on a sphere, and points on a torus sampled either randomly or on a grid.

Points on sphere
^^^^^^^^^^^^^^^^

The function **sphere** generates i.i.d. random points distributed uniformly on a (d-1)-sphere in :math:`R^d`.
The user should provide the number of points to generate, :code:`n_samples`, and the ambient dimension, :code:`ambient_dim`.
The :code:`radius` of the sphere is optional and defaults to **1**.
Only the **'random'** value of the :code:`sample` argument is currently available.

The generated points are given as an array of shape :math:`(n\_samples, ambient\_dim)`.
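
To illustrate what sampling uniformly on a sphere means, one standard technique is to draw a Gaussian vector and rescale it to the requested radius. The sketch below is pure Python and independent of GUDHI: :code:`sample_sphere` is a hypothetical helper, and this is not necessarily how :code:`points.sphere` is implemented.

.. code-block:: python

   import math
   import random

   def sample_sphere(n_samples, ambient_dim, radius=1.0, seed=None):
       """Hypothetical helper: points uniform on a sphere in R^ambient_dim."""
       rng = random.Random(seed)
       points = []
       while len(points) < n_samples:
           # A standard Gaussian vector has a rotation-invariant distribution,
           # so its direction is uniform on the unit sphere.
           gauss = [rng.gauss(0.0, 1.0) for _ in range(ambient_dim)]
           norm = math.sqrt(sum(x * x for x in gauss))
           if norm == 0.0:
               continue  # degenerate draw, try again
           points.append([radius * x / norm for x in gauss])
       return points

   # 50 points on the unit circle in R^2, as a list of 50 rows of length 2
   pts = sample_sphere(n_samples=50, ambient_dim=2, radius=1.0, seed=0)

Each row has Euclidean norm equal to :code:`radius`, and the list has the same :math:`(n\_samples, ambient\_dim)` layout as described above.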

Example
"""""""

.. code-block:: python

   from gudhi.datasets.generators import points
   from gudhi import AlphaComplex

   # Generate 50 random points on the unit circle (a 1-sphere in R^2)
   gen_points = points.sphere(n_samples=50, ambient_dim=2, radius=1, sample="random")

   # Create an alpha complex from the generated points
   alpha_complex = AlphaComplex(points=gen_points)

.. autofunction:: gudhi.datasets.generators.points.sphere

Points on a flat torus
^^^^^^^^^^^^^^^^^^^^^^

You can also generate points on a flat torus.

Two functions are available and produce the same kind of output: the first one depends on **CGAL**, while the second is implemented in pure Python.

In addition, two sample types are provided: points on a d-torus in :math:`R^{2d}` can either be drawn *randomly* (i.i.d.) or be placed on a *grid*.
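
The ambient dimension :math:`2d` comes from viewing the flat d-torus as a product of d circles, each living in its own copy of :math:`R^2`. The following pure-Python sketch assumes unit circles and that standard embedding; :code:`sample_flat_torus` is a hypothetical helper used for illustration only and is not part of the GUDHI API.

.. code-block:: python

   import math
   import random

   def sample_flat_torus(n_samples, dim, seed=None):
       """Hypothetical helper: random points on a flat dim-torus in R^(2*dim)."""
       rng = random.Random(seed)
       points = []
       for _ in range(n_samples):
           point = []
           for _factor in range(dim):
               # One independent uniform angle per circle factor
               angle = rng.uniform(0.0, 2.0 * math.pi)
               point.extend((math.cos(angle), math.sin(angle)))
           points.append(point)
       return points

   # 50 random points on a 3-torus, each with 2 * 3 = 6 coordinates
   pts = sample_flat_torus(n_samples=50, dim=3, seed=0)

Every consecutive pair of coordinates in a row lies on the unit circle.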

First function: **ctorus**
"""""""""""""""""""""""""""

The user should provide the number of points to generate on the torus, :code:`n_samples`, and the dimension :code:`dim` of the torus; the points are generated in :math:`R^{2dim}`.
The :code:`sample` argument is optional and is set to **'random'** by default.
In this case, the generated points are returned as an array of shape :math:`(n\_samples, 2*dim)`.
Otherwise, if :code:`sample` is set to **'grid'**, the points are generated on a grid and are returned as an array of shape:

.. math::

   ( \lfloor n\_samples^{1 \over {dim}} \rfloor^{dim}, 2*dim )

**Note 1:** In **'grid'** mode, the first dimension of the output array is :code:`n_samples` rounded down to the nearest perfect :math:`dim^{th}` power.

**Note 2:** This version is recommended when the user wishes to use **'grid'** as sample type, or **'random'** with a relatively small number of samples (roughly fewer than 150).
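
The grid shape formula above can be checked with a few lines of pure Python. This is a minimal sketch: :code:`integer_root` and :code:`grid_shape` are hypothetical helpers, not GUDHI functions. The integer root is corrected explicitly because a floating-point root can land slightly below an exact integer.

.. code-block:: python

   def integer_root(n, k):
       """Largest integer r such that r**k <= n (for n >= 0 and k >= 1)."""
       r = int(round(n ** (1.0 / k)))
       # Correct floating-point error in either direction
       while r ** k > n:
           r -= 1
       while (r + 1) ** k <= n:
           r += 1
       return r

   def grid_shape(n_samples, dim):
       """Expected output shape for the 'grid' sample type."""
       side = integer_root(n_samples, dim)
       return (side ** dim, 2 * dim)

   print(grid_shape(50, 3))  # (27, 6)

With :code:`n_samples = 50` and :code:`dim = 3`, the grid has 3 points per circle, hence :math:`3^3 = 27` points in :math:`R^6`.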

Example
"""""""
.. code-block:: python

   from gudhi.datasets.generators import points

   # Generate 50 points randomly on a torus in R^6
   gen_points = points.ctorus(n_samples=50, dim=3)

   # Generate points on a torus as a grid in R^6:
   # n_samples = 50 is rounded down to 3^3 = 27 points
   gen_points = points.ctorus(n_samples=50, dim=3, sample='grid')

.. autofunction:: gudhi.datasets.generators.points.ctorus

Second function: **torus**
"""""""""""""""""""""""""""

The user should provide the number of points to generate on the torus, :code:`n_samples`, and the dimension :code:`dim` of the torus; the points are generated in :math:`R^{2dim}`.
The :code:`sample` argument is optional and is set to **'random'** by default.
The other allowed value for :code:`sample` is **'grid'**.

**Note:** This version is recommended when the user wishes to use **'random'** as sample type with a large number of samples and a low dimension.
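
For the **'grid'** sample type, one natural construction is the Cartesian product of :code:`dim` copies of an evenly discretized circle. The sketch below follows that idea in pure Python, assuming unit circles and at least one requested sample; :code:`grid_flat_torus` is a hypothetical helper and is not claimed to match GUDHI's implementation or point ordering.

.. code-block:: python

   import itertools
   import math

   def grid_flat_torus(n_samples, dim):
       """Hypothetical helper: grid points on a flat dim-torus in R^(2*dim)."""
       # Largest number of points per circle such that side**dim <= n_samples
       side = 1
       while (side + 1) ** dim <= n_samples:
           side += 1
       # Evenly spaced points on the unit circle
       circle = [
           (math.cos(2.0 * math.pi * i / side), math.sin(2.0 * math.pi * i / side))
           for i in range(side)
       ]
       # Cartesian product of dim circles, flattened to 2*dim coordinates
       return [
           [coord for xy in combo for coord in xy]
           for combo in itertools.product(circle, repeat=dim)
       ]

   # Requesting 50 samples with dim = 3 yields 3**3 = 27 grid points in R^6
   grid = grid_flat_torus(n_samples=50, dim=3)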

Example
"""""""
.. code-block:: python

   from gudhi.datasets.generators import points

   # Generate 50 points randomly on a torus in R^6
   gen_points = points.torus(n_samples=50, dim=3)

   # Generate points on a torus as a grid in R^6:
   # n_samples = 50 is rounded down to 3^3 = 27 points
   gen_points = points.torus(n_samples=50, dim=3, sample='grid')


.. autofunction:: gudhi.datasets.generators.points.torus


Fetching datasets
=================

GUDHI provides some ready-to-use datasets that are not shipped with the GUDHI installation and need to be fetched explicitly.

By **default**, fetched datasets are stored in a folder named **'gudhi_data'** in the **user home folder**.
Alternatively, this location can be set using the **'GUDHI_DATA'** environment variable.
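
The lookup order described above can be sketched with the standard library alone. :code:`resolve_data_home` is a hypothetical helper written for illustration; it is an assumption about the behavior, not the function GUDHI uses internally.

.. code-block:: python

   import os

   def resolve_data_home(data_home=None):
       """Hypothetical helper mirroring the lookup order described above."""
       if data_home is None:
           # Environment variable first, then 'gudhi_data' in the home folder
           data_home = os.environ.get("GUDHI_DATA", os.path.join("~", "gudhi_data"))
       return os.path.expanduser(data_home)

   print(resolve_data_home())

Setting :code:`GUDHI_DATA` before starting Python changes the printed path accordingly.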

.. autofunction:: gudhi.datasets.remote.fetch_bunny

.. figure:: ./img/bunny.png
     :figclass: align-center

     3D Stanford bunny with 35947 vertices.


.. autofunction:: gudhi.datasets.remote.fetch_spiral_2d

.. figure:: ./img/spiral_2d.png
     :figclass: align-center

     2D spiral with 114562 vertices.

.. autofunction:: gudhi.datasets.remote.clear_data_home