
\section{Experiment design}

We conduct a real-life, uncontrolled experiment using the Kinect to test the
algorithm.  First, we discuss the signal outputs of the Kinect.  Second, we
describe the environment in which we collect the data.  Finally, we interpret
the data.

\subsection{Kinect} The Kinect outputs three primary signals in real time: a
color image stream, a depth image stream, and microphone output.  For our
purposes, we focus on the depth image stream.  As the Kinect was designed to
interface directly with the Xbox 360~\cite{xbox}, the tools to interact with it
on a PC are limited.  Libfreenect~\cite{libfreenect} is a reverse-engineered
driver which gives access to the raw depth images from the Kinect.  This raw
data could be used to implement the algorithms of, \eg,
Plagemann~\etal{}~\cite{plagemann:icra10}.  Alternatively,
OpenNI~\cite{openni}, a framework sponsored by PrimeSense~\cite{primesense},
the company behind the technology of the Kinect, offers figure detection and
skeleton fitting algorithms on top of raw access to the data streams.  However,
the skeleton fitting algorithm of OpenNI requires each individual to strike a
specific pose for calibration.  More recently, the Kinect for Windows
SDK~\cite{kinect-sdk} was released, and its skeleton fitting algorithm operates
in real time without calibration.  Given that the Kinect for Windows SDK is the
state of the art, we use it to perform our data collection.

\subsection{Environment}

\begin{itemize}
\item 1 week
\item 23 people
\end{itemize}

\subsection{Data set}

The original data set consists of the sequence of all the frames in which
a skeleton was detected by the Microsoft SDK. For each frame, the
following data is available:
\begin{itemize}
\item the 3D coordinates of 20 body joints
\item the z-value: this is the distance from the detected skeleton to
  the camera
\end{itemize}
For some frames, one or several joints were occluded by another
part of the body. In those cases, the coordinates of these joints are
either absent from the frame or present but tagged as \emph{Inferred}
by the Microsoft SDK. This tag means that even though the joint was not
visible in the frame, the skeleton-fitting algorithm was able to guess
its location.
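
As an illustration of how occluded joints can be handled, the following is a
minimal sketch that checks whether a frame carries all 20 joints with none
tagged \emph{Inferred}.  The names \texttt{Joint}, \texttt{TrackingState} and
\texttt{is\_fully\_tracked}, as well as the dictionary layout of a frame, are
hypothetical and chosen for this example only; they are not the format
produced by the Microsoft SDK.

```python
from dataclasses import dataclass
from enum import Enum

NUM_JOINTS = 20  # number of body joints reported per skeleton


class TrackingState(Enum):
    # Hypothetical labels for this sketch: a joint is either observed
    # directly or guessed by the skeleton-fitting algorithm.
    TRACKED = "Tracked"
    INFERRED = "Inferred"


@dataclass
class Joint:
    x: float
    y: float
    z: float
    state: TrackingState


def is_fully_tracked(joints):
    """Return True when all 20 joints are present and none is Inferred.

    `joints` maps a joint name (hypothetical keys) to a Joint.
    """
    if len(joints) != NUM_JOINTS:
        return False  # at least one joint is absent from the frame
    return all(j.state is TrackingState.TRACKED for j in joints.values())
```

Whether to discard frames with inferred joints or to keep them is a design
choice that this sketch does not settle; it only makes the distinction
explicit.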

Each frame also has a skeleton ID number. If this number stays the
same across several consecutive frames, it means that the skeleton-fitting
algorithm was able to track the skeleton without interruption. This
allows us to define the concept of a \emph{run}: a sequence of consecutive
frames with the same skeleton ID.
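
The definition of a run can be sketched in a few lines of code.  The function
below is a minimal sketch, assuming that the skeleton IDs are available as a
list ordered by frame; the name \texttt{split\_into\_runs} is hypothetical.  It
also assumes that an ID which reappears after an interruption starts a new
run, which is one possible reading of the definition.

```python
from itertools import groupby


def split_into_runs(skeleton_ids):
    """Group consecutive frame indices that share the same skeleton ID.

    Returns a list of (skeleton_id, [frame indices]) pairs, one per run.
    Assumption: an ID that reappears after a gap starts a new run.
    """
    runs = []
    indexed = enumerate(skeleton_ids)
    # groupby only merges adjacent items, which matches the contiguity
    # requirement in the definition of a run.
    for skel_id, group in groupby(indexed, key=lambda pair: pair[1]):
        runs.append((skel_id, [idx for idx, _ in group]))
    return runs


# Six frames: ID 7 for three frames, ID 3 for two, then ID 7 again.
example_runs = split_into_runs([7, 7, 7, 3, 3, 7])
```

In this example, \texttt{example\_runs} contains three runs, since the last
frame with ID 7 is not adjacent to the first three.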


%%% Local Variables: 
%%% mode: latex
%%% TeX-master: "kinect"
%%% End: