\section{Skeleton uniqueness}
\label{sec:uniqueness}

The most obvious concern when using skeletons to recognize people is
their uniqueness: are skeletons consistently and sufficiently distinct
from one another to give a reasonable hope of using them for person
recognition?

\subsection{Face recognition benchmark}

A good way to understand the uniqueness of a metric is to look at the
performance it gives on the \emph{pair-matching problem}. In this
problem, we are given two measurements of the metric and must
decide whether they come from the same individual (matched pair) or
from two different individuals (unmatched pair).

The \emph{Labeled Faces in the Wild} \cite{lfw} database is specifically
suited to the study of the face pair-matching problem and has been used
to benchmark several face recognition algorithms. The raw data of this
benchmark is publicly available and was derived as follows: the
database is split into 10 subsets. From each of these subsets, 300
matched pairs and 300 unmatched pairs are randomly chosen. Each
algorithm runs 10 separate leave-one-out cross-validation experiments
on these sets of pairs. Averaging the number of true positives and
false positives across the 10 experiments for a given threshold then
yields one point on the true-positive vs.\ false-positive curve (also
known as the ROC curve).
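
As an illustration of this averaging (the notation is ours and is not
taken from \cite{lfw}), write $\mathrm{TP}_k(t)$ and $\mathrm{FP}_k(t)$
for the numbers of true and false positives obtained at threshold $t$
in the $k$-th experiment. Assuming that each experiment is scored on
the 300 matched and 300 unmatched pairs of its held-out subset, the
corresponding point of the curve would be
\[
\left( \frac{1}{10} \sum_{k=1}^{10} \frac{\mathrm{FP}_k(t)}{300},
\; \frac{1}{10} \sum_{k=1}^{10} \frac{\mathrm{TP}_k(t)}{300} \right),
\]
and sweeping $t$ over its range traces out the whole curve.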

\subsection{Experiment design}

In order to run an experiment similar to the one used in the face
pair-matching problem, we use the Goldman Osteological Data Set
\cite{deadbodies}. This data set consists of osteometric measurements
of 1538 skeletons dating from throughout the Holocene. We keep from
these measurements the lengths of six bones (radius, humerus, femur,
tibia, left coxae, right coxae). Because of missing values, this
reduces the size of the data set to 1191 skeletons.

From this data set, 1191 matched pairs and 1191 unmatched pairs are
generated. In practice, the exact measurements of the bones are never
directly accessible: they are always perturbed by noise whose variance
depends on the collection protocol. This is accounted for by adding
independent Gaussian noise to each constituent of the pairs.
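
One way to write this construction down is sketched below. The
notation and two of the modelling choices are our assumptions: that the
noise has the same standard deviation $\sigma$ on each of the six bone
lengths, and that a matched pair consists of two independently
perturbed copies of the same skeleton. Writing $x_i \in \mathbf{R}^6$
for the vector of bone lengths of skeleton $i$, the pairs would be
\[
\mbox{matched: } (x_i + \varepsilon, \; x_i + \varepsilon'),
\qquad
\mbox{unmatched: } (x_i + \varepsilon, \; x_j + \varepsilon'),
\quad i \neq j,
\]
where $\varepsilon$ and $\varepsilon'$ are independent draws from
$\mathcal{N}(0, \sigma^2 I_6)$ and $I_6$ is the $6 \times 6$ identity
matrix.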

\subsection{Results}

The pair-matching problem is then solved with a proximity threshold
algorithm: for a given threshold, a pair is classified as
\emph{matched} if the Euclidean distance between its two constituents
is lower than the threshold, and as \emph{unmatched} otherwise.
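
In symbols (the notation is ours), for a pair $(y, y')$ of noisy
six-dimensional measurement vectors and a threshold $t > 0$, the rule
reads
\[
\hat{m}_t(y, y') =
\left\{
\begin{array}{ll}
\mbox{matched} & \mbox{if } \|y - y'\|_2 < t, \\
\mbox{unmatched} & \mbox{otherwise,}
\end{array}
\right.
\qquad
\|y - y'\|_2 = \Bigl( \sum_{b=1}^{6} (y_b - y'_b)^2 \Bigr)^{1/2},
\]
where $b$ indexes the six bone lengths. A small $t$ rejects almost
every pair and a large $t$ accepts almost every pair, so each value of
$t$ gives a different trade-off between true and false positives.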

This algorithm does not require any training, so it is run on the
whole set of pairs without cross-validation. Figure~\ref{fig:roc}
shows the ROC curve of the proximity threshold algorithm for several
values of the variance of the noise added to the data.
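
For intuition about how the noise level moves the curve, the
true-positive rate has a closed form under one simplifying assumption
that is ours: that the added noise has the same standard deviation
$\sigma$ on each of the six bone lengths. The two constituents of a
matched pair then differ only by the difference of two independent
noise vectors, which is distributed as $\mathcal{N}(0, 2\sigma^2 I_6)$,
so the squared distance divided by $2\sigma^2$ follows a chi-squared
distribution with six degrees of freedom. Writing $D$ for the Euclidean
distance between the constituents of a matched pair and $t$ for the
threshold, this gives
\[
\mathrm{E}[D^2] = 12 \sigma^2,
\qquad
\Pr(D < t) = 1 - e^{-u} \left( 1 + u + \frac{u^2}{2} \right),
\quad u = \frac{t^2}{4 \sigma^2}.
\]
The false-positive rate has no comparable closed form, since it depends
on the empirical distribution of the distances between distinct
skeletons in the data set.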


%%% Local Variables: 
%%% mode: latex
%%% TeX-master: "kinect"
%%% End: