| author | unknown <Brano@Toshibicka.(none)> | 2012-03-04 02:36:36 -0800 |
|---|---|---|
| committer | unknown <Brano@Toshibicka.(none)> | 2012-03-04 02:36:36 -0800 |
| commit | 024f8d1577ffcea50fb4f0cdfcb26a9cf2e91c47 (patch) | |
| tree | 9e5be1cd522b2ceb9517349597cbfb309a617ebc /algorithm.tex | |
| parent | a0651121b5bf29743e19e69dd0421fedb0a8b2fd (diff) | |
| download | kinect-024f8d1577ffcea50fb4f0cdfcb26a9cf2e91c47.tar.gz | |
Model justification
Diffstat (limited to 'algorithm.tex')
| -rw-r--r-- | algorithm.tex | 15 |
1 file changed, 11 insertions, 4 deletions
```diff
diff --git a/algorithm.tex b/algorithm.tex
index 8bdd5a9..cdc4f09 100644
--- a/algorithm.tex
+++ b/algorithm.tex
@@ -12,7 +12,7 @@ A mixture of Gaussians \cite{bishop06pattern} is a generative probabilistic mode
   P(\bx, y) = \cN(\bx | \bar{\bx}_y, \Sigma) P(y),
   \label{eq:mixture of Gaussians}
 \end{align}
-where $P(y)$ is the probability of class $y$ and $\cN(\bx | \bar{\bx}_y, \Sigma)$ is a class conditional that is modeled as a multivariate normal distribution. The distribution is centered at a vector $\bar{\bx}_y$, and the density of the class is described by the covariance matrix $\Sigma$. The decision boundary between any two classes $y_1$ and $y_2$, $P(\bx, y_1) \! = \! P(\bx, y_2)$, is linear when all class conditionals have the same covariance matrix $\Sigma$ \cite{bishop06pattern}. Thus, the mixture of Gaussians model is a probabilistic version of the nearest-neighbor (NN) classifier from Section~\ref{sec:uniqueness}.
+where $P(y)$ is the probability of class $y$ and $\cN(\bx | \bar{\bx}_y, \Sigma)$ is a multivariate normal distribution, which is known as a class conditional. The mean of the distribution is $\bar{\bx}_y$ and the variance around $\bar{\bx}_y$ is captured by the covariance matrix $\Sigma$. When all class conditionals have the same covariance matrix $\Sigma$, the decision boundary between any two classes $y$ is linear \cite{bishop06pattern}. In this setting, the mixture of Gaussians model can be viewed as a probabilistic formulation of the nearest-neighbor (NN) classifier from Section~\ref{sec:uniqueness}.
 
 The mixture of Gaussians model has many advantages. First, the model can be easily learned using maximum-likelihood (ML) estimation \cite{bishop06pattern}. In particular, $P(y)$ is the frequency of class $y$ in training data, $\bar{\bx}_y$ is the expectation of $\bx$ given $y$, and the covariance matrix $\Sigma$ is estimated as a weighted sum $\Sigma = \sum_y P(y) \Sigma_y$, where $\Sigma_y$ is the covariance matrix corresponding to class $y$. Second, the inference in the model can be performed in a closed form. In particular, the predicted label is given by $\hat{y} = \arg\max_y P(y | \bx)$, where:
 \begin{align}
@@ -23,7 +23,14 @@ The mixture of Gaussians model has many advantages. First, the model can be easi
 \end{align}
 In practice, the prediction $\hat{y}$ is accepted when the classifier is confident. In other words, $P(\hat{y} | \bx) \! > \! \delta$, where $\delta \in (0, 1)$ is a threshold that controls the precision and recall of the classifier. In general, the higher the threshold $\delta$, the lower the recall and the higher the precision.
 
-In this work, we use the mixture of Gaussians model for skeleton recognition. In this problem, the feature vector $\bx$ are skeleton measurements and each person corresponds to one class $y$.
+In this work, we use the mixture of Gaussians model for skeleton recognition. Skeleton measurements are represented by a vector $\bx$ and each person is assigned to one class $y$. To verify that our approach is suitable for skeleton recognition, we plot for each skeleton feature (Section~\ref{sec:experiment}) the histogram of differences between the feature and its mean value in the corresponding class (Figure~\ref{fig:marginals}). All distributions look approximately normal. This indicates that the class conditionals $P(\bx | y)$ are multivariate normal and our generative model may be nearly optimal.
+
+\begin{figure}[t]
+  \centering
+  \includegraphics[height=4.4in, angle=90, bb=4.5in 1.5in 6.5in 7in]{graphics/Marginals}
+  \caption{The histograms of differences between 9 skeleton features (Section~\ref{sec:experiment}) and their mean value for the corresponding person.}
+  \label{fig:marginals}
+\end{figure}
 
 \subsection{Sequential hypothesis testing}
 
@@ -38,6 +45,6 @@ The mixture of Gaussians model can be extended to temporal inference through seq
 \end{align}
 In practice, the prediction $\hat{y} = \arg\max_y P(y | \bx^{(1)}, \dots, \bx^{(t)})$ is accepted when the classifier is confident. In other words, $P(\hat{y} | \bx^{(1)}, \dots, \bx^{(t)}) > \delta$, where the threshold $\delta \in (0, 1)$ controls the precision and recall of the predictor. In general, the higher the threshold $\delta$, the higher the precision and the lower the recall.
 
-Sequential hypothesis testing is a common technique for smoothing temporal predictions. In particular, note that the prediction at time $t$ depends on all data up to time $t$. This reduces the variance of predictions, especially when input data are noisy, such as in real-world skeleton recognition.
+Sequential hypothesis testing is a common technique for smoothing temporal predictions. In particular, note that the prediction at time $t$ depends on all data up to time $t$. This reduces the variance of predictions, especially when input data are noisy, such as in the domain of skeleton recognition.
 
-In skeleton recognition, the sequence $\bx^{(1)}, \dots, \bx^{(t)}$ are skeleton measurements of a person walking towards the camera, for instance. If the camera detects more people, we use tracking in the camera to identify individual skeleton sequences.
+In skeleton recognition, the sequence $\bx^{(1)}, \dots, \bx^{(t)}$ are skeleton measurements of a person walking towards the camera, for instance. If the camera detects more people, we use tracking to distinguish individual skeleton sequences.
```
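The model described in the first two hunks can be sketched concretely in a few lines of NumPy. The block below is a minimal, illustrative implementation (not part of the commit): class priors and means are the ML estimates, all classes share one covariance matrix $\Sigma = \sum_y P(y)\,\Sigma_y$, and a prediction is accepted only when the posterior $P(\hat{y} | \bx)$ exceeds the threshold $\delta$. The class name, method names, and the default `delta=0.9` are assumptions made for the sketch, not names from the paper.

```python
import numpy as np


class SharedCovarianceGaussianClassifier:
    """Mixture of Gaussians with one covariance matrix shared by all classes (sketch)."""

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes = np.unique(y)
        self.priors = {}   # P(y): frequency of class y in the training data
        self.means = {}    # mean feature vector of class y
        sigma = np.zeros((X.shape[1], X.shape[1]))
        for c in self.classes:
            Xc = X[y == c]
            self.priors[c] = len(Xc) / len(X)
            self.means[c] = Xc.mean(axis=0)
            # Shared covariance as the weighted sum Sigma = sum_y P(y) Sigma_y.
            sigma += self.priors[c] * np.cov(Xc, rowvar=False, bias=True)
        self.sigma_inv = np.linalg.inv(sigma)
        self.log_det_sigma = np.linalg.slogdet(sigma)[1]
        return self

    def log_joint(self, x, c):
        # log P(x, y) = log N(x | mean_y, Sigma) + log P(y); the constant
        # -d/2 * log(2 pi) is dropped because it cancels in the posterior.
        d = np.asarray(x, dtype=float) - self.means[c]
        return (-0.5 * d @ self.sigma_inv @ d
                - 0.5 * self.log_det_sigma
                + np.log(self.priors[c]))

    def predict(self, x, delta=0.9):
        # Posterior P(y | x) is the softmax of the log joints over classes.
        logs = np.array([self.log_joint(x, c) for c in self.classes])
        post = np.exp(logs - logs.max())
        post /= post.sum()
        best = int(np.argmax(post))
        # Accept the prediction only when the classifier is confident enough.
        label = self.classes[best] if post[best] > delta else None
        return label, post[best]
```

Because the covariance is shared, the log joints are quadratic in $\bx$ with identical quadratic terms, so the pairwise decision boundaries are linear, which is the property the first hunk relies on.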
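The sequential hypothesis test from the last hunk only changes how the posterior is formed: per-frame log-likelihoods are accumulated, so the prediction at time $t$ depends on all measurements up to $t$. The sketch below reuses the hypothetical classifier above; `sequential_identify`, `frames`, and the default `delta=0.95` are illustrative assumptions.

```python
import numpy as np


def sequential_identify(model, frames, delta=0.95):
    """Return (label, t) as soon as P(y | x^(1), ..., x^(t)) > delta (sketch)."""
    classes = model.classes
    # Start from the log priors; every frame then contributes log P(x^(t) | y).
    log_post = np.array([np.log(model.priors[c]) for c in classes])
    for t, x in enumerate(frames, start=1):
        log_post += np.array([model.log_joint(x, c) - np.log(model.priors[c])
                              for c in classes])
        post = np.exp(log_post - log_post.max())
        post /= post.sum()
        best = int(np.argmax(post))
        if post[best] > delta:
            # Confident enough: report the identity and how many frames it took.
            return classes[best], t
    # The confidence threshold was never reached within the observed sequence.
    return None, len(frames)
```

As the text notes, raising `delta` trades recall for precision: the test waits for more frames before committing, but commits to fewer wrong identities.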
