Diffstat (limited to 'uniqueness.tex')
-rw-r--r-- uniqueness.tex 58
1 files changed, 57 insertions, 1 deletions
diff --git a/uniqueness.tex b/uniqueness.tex
index 765242b..9a89c6c 100644
--- a/uniqueness.tex
+++ b/uniqueness.tex
@@ -1,3 +1,59 @@
\section{Skeleton uniqueness}
-\subsection{Face recognition benchmark}
\ No newline at end of file
+The most obvious concern when using skeletons to recognize people is
+their uniqueness: are the skeletons of two different people
+consistently distinct enough to give a reasonable hope of using them
+for person recognition?
+
+\subsection{Face recognition benchmark}
+
+A good way to assess the uniqueness of a metric is to look at the
+performance it achieves on the \emph{pair-matching problem}. In this
+problem, one is given two measurements of the metric and must
+decide whether they come from the same individual (matched pair) or
+from two different individuals (unmatched pair).
+
+The \emph{Labeled Faces in the Wild} database \cite{lfw} is specifically suited
+to studying the face pair-matching problem and has been used to benchmark
+several face recognition algorithms. The raw data of this benchmark is
+publicly available and was derived as follows: the database is
+split into 10 subsets. From each of these subsets, 300 matched pairs and 300
+unmatched pairs are randomly chosen. Each algorithm runs 10 separate leave-one-out
+cross-validation experiments on these sets of pairs. Averaging the number of true positives
+and false positives across the 10 experiments for a
+given threshold then yields one point on the true-positive vs.\
+false-positive curve (also known as the ROC curve).
+
+\subsection{Experiment design}
+
+In order to run an experiment similar to the one used in the face
+pair-matching problem, we use the Goldman Osteological Data Set
+\cite{deadbodies}. This data set consists of osteometric measurements
+of 1538 skeletons dating from throughout the Holocene. We keep from
+these measurements the lengths of six bones (radius, humerus, femur,
+tibia, left coxae, right coxae). Because of missing values, this
+reduces the size of the data set to 1191 skeletons.
+
+From this data set, 1191 matched pairs and 1191 unmatched
+pairs are generated. The exact measurements of the bones are never directly
+accessible, but are always perturbed by noise whose variance depends
+on the collection protocol. This is accounted for by adding
+independent random Gaussian noise to each constituent of the pairs.
+
+\subsection{Results}
+
+The pair-matching problem is then solved with a proximity
+threshold algorithm: for a given threshold, a pair is classified
+as \emph{matched} if the Euclidean distance between its two constituents is
+lower than the threshold, and as \emph{unmatched} otherwise.
+
+This algorithm does not require any training, so it is run on the
+whole set of pairs without cross-validation. Figure~\ref{fig:roc}
+shows the ROC curve of the proximity threshold algorithm for
+several values of the variance of the noise added to the data.
+
+
+%%% Local Variables:
+%%% mode: latex
+%%% TeX-master: "kinect"
+%%% End:
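
The experiment described in the added text (noisy pair generation, then a proximity threshold on the Euclidean distance) can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the code behind the figure: the function names are hypothetical, the bone lengths in the demo are synthetic placeholders rather than the Goldman data, and the way an unmatched partner is drawn (uniformly among the other skeletons) is an assumption, since the text does not specify it.

```python
import math
import random


def add_noise(skeleton, noise_std, rng):
    """Perturb each bone length with independent Gaussian noise."""
    return [x + rng.gauss(0.0, noise_std) for x in skeleton]


def make_pairs(skeletons, noise_std, rng):
    """Return (matched, unmatched): one pair of each kind per skeleton.

    Assumption: the unmatched partner is drawn uniformly among the
    other skeletons; the text does not say how it is chosen.
    """
    n = len(skeletons)
    matched, unmatched = [], []
    for i, s in enumerate(skeletons):
        # A random nonzero cyclic offset guarantees j != i.
        j = (i + rng.randrange(1, n)) % n
        matched.append((add_noise(s, noise_std, rng),
                        add_noise(s, noise_std, rng)))
        unmatched.append((add_noise(s, noise_std, rng),
                          add_noise(skeletons[j], noise_std, rng)))
    return matched, unmatched


def roc_point(matched, unmatched, threshold):
    """(false-positive rate, true-positive rate) for one threshold.

    A pair is classified as matched when the Euclidean distance
    between its two constituents is below the threshold.
    """
    tpr = sum(math.dist(a, b) < threshold for a, b in matched) / len(matched)
    fpr = sum(math.dist(a, b) < threshold for a, b in unmatched) / len(unmatched)
    return fpr, tpr


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-in for the six bone lengths (hypothetical values).
    skeletons = [[rng.uniform(200.0, 500.0) for _ in range(6)]
                 for _ in range(1191)]
    matched, unmatched = make_pairs(skeletons, noise_std=5.0, rng=rng)
    for threshold in (5.0, 20.0, 80.0):
        print(threshold, roc_point(matched, unmatched, threshold))
```

Sweeping the threshold traces one ROC curve; repeating the sweep for several values of `noise_std` gives one curve per noise level.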