Fig. 1: A two-dimensional manifold non-linearly embedded in three dimensions.

Introduction

When dealing with datasets of high-dimensional points, it is usually advantageous to discover the structure underlying the data. A fundamental piece of information needed to this aim is the minimum number of parameters required to describe the data while minimizing the information loss. This number, usually called intrinsic dimension ($\texttt{i.d.}$), can be interpreted as the dimension of the manifold from which the input data are assumed to be drawn. Given a dataset $\mathbf{X}_N = \{\mathbf{x}_i \}\subseteq\Re^D$, the feature vectors $\mathbf{x}_i$ are generally viewed as points constrained to lie on a low-dimensional manifold $\mathcal{M}$ of dimension $d$ (generally assumed to be smooth, or at least locally smooth) embedded in the higher-dimensional space $\Re^D$; the manifold dimensionality $d$ is the $\texttt{i.d.}$ to be estimated. A well-known simple example is shown in Figure 1. The development of automatic intrinsic dimensionality estimators has gained considerable attention due to its relevance in several application fields. However, most of the proposed solutions prove not to be robust on noisy datasets, and provide unreliable results when the intrinsic dimensionality of the input dataset is high and the manifold on which the points are assumed to lie is nonlinearly embedded in a higher dimensional space. To deal with these problems, we have proposed two algorithms, $\texttt{MiND_KL}$ [3] and $\texttt{DANCo}$ [2], which are briefly described below.

MiND_KL

$\texttt{MiND_KL}$ compares the empirical pdf of the neighborhood distances computed on the dataset with the distributions of the neighborhood distances computed from points uniformly drawn from hyperspheres of known, increasing dimensionality. The $\texttt{i.d.}$ estimate is the dimensionality that minimizes the Kullback-Leibler divergence between the two distributions.
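The following Python sketch conveys the idea under simplifying assumptions: it uses the ratio between the first and the $k$-th nearest-neighbor distance as the distance statistic, draws the reference samples uniformly from unit hyperballs, and replaces the KL estimator of [3] with a crude histogram-based one. All function names are illustrative; this is not the published implementation.

```python
import numpy as np
from scipy.spatial import cKDTree

def normalized_nn_distances(X, k=10):
    """Ratio between the 1st and the k-th nearest-neighbor distance of each point."""
    dists, _ = cKDTree(X).query(X, k=k + 1)   # column 0 is the point itself
    return dists[:, 1] / dists[:, k]

def sample_unit_ball(n, d, rng):
    """n points drawn uniformly from the d-dimensional unit hyperball."""
    g = rng.standard_normal((n, d))
    g /= np.linalg.norm(g, axis=1, keepdims=True)
    return g * (rng.random((n, 1)) ** (1.0 / d))

def kl_histogram(p_samples, q_samples, bins=50, eps=1e-12):
    """Crude KL(p || q) estimate between two 1-D samples via a shared histogram."""
    edges = np.histogram_bin_edges(np.concatenate([p_samples, q_samples]), bins=bins)
    p, _ = np.histogram(p_samples, bins=edges)
    q, _ = np.histogram(q_samples, bins=edges)
    p = (p + eps) / (p + eps).sum()
    q = (q + eps) / (q + eps).sum()
    return float(np.sum(p * np.log(p / q)))

def mind_kl_like(X, max_dim=20, k=10, n_ref=2000, seed=0):
    """Return the candidate d whose hyperball distance statistics best match X."""
    rng = np.random.default_rng(seed)
    observed = normalized_nn_distances(X, k)
    kls = [kl_histogram(observed,
                        normalized_nn_distances(sample_unit_ball(n_ref, d, rng), k))
           for d in range(1, max_dim + 1)]
    return int(np.argmin(kls)) + 1
```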

DANCo

Motivated by the promising results achieved by $\texttt{MiND_KL}$, we proposed an extension, namely $\texttt{DANCo}$, which further reduces the underestimation effect by combining an estimator employing normalized nearest-neighbor distances with one employing mutual angles. More precisely, $\texttt{DANCo}$ compares the statistics estimated on $\mathbf{X}_N$ with those estimated on (uniformly drawn) synthetic datasets of known $\texttt{i.d.}$. The comparisons are performed by two Kullback-Leibler divergences applied to the distribution of normalized nearest-neighbor distances and to the distribution of pairwise angles; the $\texttt{i.d.}$ estimate is the one minimizing the sum of the two divergences (a simplified sketch of both statistics is given after the list below). A fast implementation of $\texttt{DANCo}$ ($\texttt{Fast-DANCo}$) has also been developed. We are now using our algorithms to evaluate:
  • the $\texttt{i.d.}$ of the representations found by the internal layers of deep neural networks trained to recognize objects and faces;
  • the $\texttt{i.d.}$ of data acquired from a variety of sensors and used to model human cognitive/emotional states.
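As a rough illustration of how the two statistics are combined, the hypothetical sketch below (reusing the helpers normalized_nn_distances, kl_histogram and sample_unit_ball from the MiND_KL sketch above) simply adds an angle-based KL term to the distance-based one; the published DANCo instead fits closed-form distance and von Mises angle distributions and evaluates the corresponding divergences analytically.

```python
import numpy as np
from scipy.spatial import cKDTree

def neighbor_angles(X, k=10):
    """Pairwise angles between the directions from each point to its k nearest neighbors."""
    _, idx = cKDTree(X).query(X, k=k + 1)
    angles = []
    for i, nbrs in enumerate(idx):
        V = X[nbrs[1:]] - X[i]                          # k direction vectors
        V /= np.linalg.norm(V, axis=1, keepdims=True)
        C = np.clip(V @ V.T, -1.0, 1.0)
        angles.append(np.arccos(C[np.triu_indices(k, 1)]))
    return np.concatenate(angles)

def danco_like(X, max_dim=20, k=10, n_ref=2000, seed=0):
    """Minimize the sum of distance-based and angle-based KL divergences over d."""
    rng = np.random.default_rng(seed)
    obs_r = normalized_nn_distances(X, k)               # from the MiND_KL sketch
    obs_a = neighbor_angles(X, k)
    scores = []
    for d in range(1, max_dim + 1):
        ref = sample_unit_ball(n_ref, d, rng)           # synthetic data of known i.d.
        kl_r = kl_histogram(obs_r, normalized_nn_distances(ref, k))
        kl_a = kl_histogram(obs_a, neighbor_angles(ref, k))
        scores.append(kl_r + kl_a)
    return int(np.argmin(scores)) + 1
```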
Table 1 compares the performance of our algorithms with that of other well-known state-of-the-art techniques (for further references see [1]).

| $\textbf{Dataset}$ | $d$ | $\texttt{MLE}$ | $\texttt{kNNG}_{1}$ | $\texttt{kNNG}_{2}$ | $\texttt{BPCA}$ | $\texttt{Hein}$ | $\texttt{CD}$ | $\texttt{MiND}_{\texttt{KL}}$ | $\texttt{DANCo}$ | $\texttt{MLSVD}$ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| $\mathcal{M}_1$ | $\textit{10.00}$ | 9.10 | 9.16 | 9.89 | 5.45 | 9.45 | 9.12 | 10.30 | 10.09 | $\textbf{10.00}$ |
| $\mathcal{M}_2$ | $\textit{3.00}$ | 2.88 | 2.95 | 3.03 | $\textbf{3.00}$ | $\textbf{3.00}$ | 2.88 | $\textbf{3.00}$ | $\textbf{3.00}$ | $\textbf{3.00}$ |
| $\mathcal{M}_3$ | $\textit{4.00}$ | 3.83 | 3.75 | 3.82 | $\textbf{4.00}$ | $\textbf{4.00}$ | 3.23 | $\textbf{4.00}$ | $\textbf{4.00}$ | 2.08 |
| $\mathcal{M}_4$ | $\textit{4.00}$ | 3.95 | 4.05 | 4.76 | 4.25 | $\textbf{4.00}$ | 3.88 | 4.15 | $\textbf{4.00}$ | 8.00 |
| $\mathcal{M}_5$ | $\textit{2.00}$ | 1.97 | 1.96 | 2.06 | $\textbf{2.00}$ | $\textbf{2.00}$ | 1.98 | $\textbf{2.00}$ | $\textbf{2.00}$ | $\textbf{2.00}$ |
| $\mathcal{M}_6$ | $\textit{6.00}$ | 6.39 | 6.46 | 11.24 | 12.00 | $\textbf{5.95}$ | 5.91 | 6.50 | 7.00 | 12.00 |
| $\mathcal{M}_7$ | $\textit{2.00}$ | 1.96 | 1.97 | 2.09 | $\textbf{2.00}$ | $\textbf{2.00}$ | 1.93 | 2.07 | $\textbf{2.00}$ | 2.35 |
| $\mathcal{M}_9$ | $\textit{20.00}$ | 14.64 | 15.25 | 10.59 | 13.55 | 15.50 | 13.75 | 19.15 | 19.71 | $\textbf{20.00}$ |
| $\mathcal{M}_{10a}$ | $\textit{10.00}$ | 8.26 | 8.62 | 10.21 | 5.20 | 8.90 | 8.09 | 9.85 | 9.86 | $\textbf{10.00}$ |
| $\mathcal{M}_{10b}$ | $\textit{17.00}$ | 12.87 | 13.69 | 15.38 | 9.46 | 13.85 | 12.30 | 16.25 | 16.62 | $\textbf{17.00}$ |
| $\mathcal{M}_{10c}$ | $\textit{24.00}$ | 16.96 | 17.67 | 21.42 | 13.30 | 17.95 | 15.58 | 22.55 | 24.28 | $\textbf{24.00}$ |
| $\mathcal{M}_{10d}$ | $\textit{70.00}$ | 36.49 | 39.67 | 40.31 | 71.00 | 38.69 | 31.40 | 64.38 | 70.52 | $\textbf{70.00}$ |
| $\mathcal{M}_{11}$ | $\textit{2.00}$ | 2.21 | 1.95 | 2.03 | 1.55 | 2.00 | 2.19 | $\textbf{2.00}$ | $\textbf{2.00}$ | 1.00 |
| $\mathcal{M}_{12}$ | $\textit{20.00}$ | 15.82 | 16.40 | 24.89 | 13.70 | 15.00 | 11.26 | 19.35 | 19.90 | $\textbf{20.00}$ |
| $\mathcal{M}_{13}$ | $\textit{1.00}$ | $\textbf{1.00}$ | 0.97 | 1.07 | 5.70 | $\textbf{1.00}$ | 1.14 | $\textbf{1.00}$ | $\textbf{1.00}$ | $\textbf{1.00}$ |
| $\mathcal{M}_{N1}$ | $\textit{18.00}$ | 12.25 | 14.26 | 19.80 | 36.00 | 14.10 | 10.40 | 17.76 | 18.76 | $\textbf{18.00}$ |
| $\mathcal{M}_{N2}$ | $\textit{24.00}$ | 14.72 | 17.62 | 26.87 | 48.00 | 17.76 | 12.43 | 23.76 | 25.76 | $\textbf{24.00}$ |
| $\mathcal{M}_{beta}$ | $\textit{10.00}$ | 6.36 | 6.45 | 14.77 | 19.70 | 4.00 | 3.05 | 7.00 | 7.00 | $\textbf{10.00}$ |
| $\mathcal{M}_{P3}$ | $\textit{3.00}$ | 2.89 | 2.93 | 3.12 | 7.00 | 2.00 | 2.43 | $\textbf{3.00}$ | $\textbf{3.00}$ | 1.00 |
| $\mathcal{M}_{P6}$ | $\textit{6.00}$ | 4.96 | 4.98 | 5.82 | 7.00 | 2.66 | 3.58 | 5.04 | $\textbf{6.00}$ | 1.00 |
| $\mathcal{M}_{P9}$ | $\textit{9.00}$ | 6.35 | 6.89 | 8.04 | 10.95 | 2.85 | 4.55 | 7.00 | $\textbf{8.00}$ | 1.00 |
| $\texttt{MPE}$ |  | 17.29 | 14.50 | 16.79 | 62.62 | 19.92 | 25.96 | 5.55 | $\textbf{3.70}$ | 26.34 |
Table 1: Results achieved on the synthetic datasets. The bottom row reports the mean percentage error ($\texttt{MPE}$) achieved by each algorithm; for each test dataset, the best estimate is highlighted in boldface.
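Assuming the standard definition of this error measure over the set $\mathcal{T}$ of test datasets,
$$\texttt{MPE} = \frac{100}{|\mathcal{T}|} \sum_{\mathcal{M}\in\mathcal{T}} \frac{|\hat{d}_{\mathcal{M}} - d_{\mathcal{M}}|}{d_{\mathcal{M}}},$$
where $d_{\mathcal{M}}$ is the true $\texttt{i.d.}$ of dataset $\mathcal{M}$ and $\hat{d}_{\mathcal{M}}$ the estimate returned by the algorithm.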

References

[1] P. Campadelli, E. Casiraghi, C. Ceruti, and A. Rozza. Intrinsic dimension estimation: Relevant techniques and a benchmark framework. 2015.
[2] C. Ceruti, S. Bassis, A. Rozza, G. Lombardi, E. Casiraghi, and P. Campadelli. DANCo: an intrinsic Dimensionality estimator exploiting Angle and Norm Concentration. Pattern Recognition, 2014.
[3] G. Lombardi, A. Rozza, C. Ceruti, E. Casiraghi, and P. Campadelli. Minimum neighbor distance estimators of intrinsic dimension. Proc. of ECML-PKDD, 6912:374–389, 2011.