| author | Stratis Ioannidis <stratis@stratis-Latitude-E6320.(none)> | 2013-02-11 09:37:09 -0800 |
| committer | Stratis Ioannidis <stratis@stratis-Latitude-E6320.(none)> | 2013-02-11 09:37:09 -0800 |
| commit | 8c09cfd7da709aab03fb004b58ecd8e1eb4fb553 (patch) | |
| tree | fe24e8514094cfcd172fce175bf6df60d0031d9a /problem.tex | |
| parent | 114a6b8eac3e6addebe84b831c5eafbec7bc9ef4 (diff) | |
| download | recommendation-8c09cfd7da709aab03fb004b58ecd8e1eb4fb553.tar.gz | |
muthu
Diffstat (limited to 'problem.tex')
| -rw-r--r-- | problem.tex | 30 |
1 file changed, 22 insertions, 8 deletions
diff --git a/problem.tex b/problem.tex
index 668cd57..3ad3270 100644
--- a/problem.tex
+++ b/problem.tex
@@ -13,7 +13,7 @@ For example, each $i$ may correspond to a human subject; the feature vector
 $x_i$ may correspond to a normalized vector of her age, weight, gender,
 income, \emph{etc.}, and the measurement $y_i$ may capture some biometric
 information (\emph{e.g.}, her red cell blood count, a genetic marker,
-etc.). The magnitude of the coefficient $\beta_i$ captures the effect that feature $i$ has on the measured variable, and its sign captures whether the corellation is positive or negative.
+etc.). The magnitude of the coefficient $\beta_i$ captures the effect that feature $i$ has on the measured variable, and its sign captures whether the correlation is positive or negative.
 The purpose of these experiments is to allow \E\ to
 estimate the model $\beta$.
 In particular,
@@ -35,12 +35,15 @@ This optimization, commonly known as \emph{ridge regression}, includes an additi
 %The estimator $\hat{\beta}$ is unbiased, \emph{i.e.}, $\expt{\hat{\beta}} = \beta$ (where the expectation is over the noise variables $\varepsilon_i$). Furthermore, $\hat{\beta}$ is a multidimensional normal random variable with mean $\beta$ and covariance matrix $(X_S\T{X_S})^{-1}$.
 Note that the estimator $\hat{\beta}$ is a linear map of $y_S$; as $y_S$ is
 a multidimensional normal r.v., so is $\hat{\beta}$ (the randomness coming from
-the noise terms $\varepsilon_i$). In particular, $\hat{\beta}$ has %mean $\beta$% (\emph{i.e.}, it is an \emph{unbiased estimator}) and
- covariance
-$(R+\T{X_S}X_S)^{-1}$.
+the noise terms $\varepsilon_i$).
+%In particular, $\hat{\beta}$ has %mean $\beta$% (\emph{i.e.}, it is an \emph{unbiased estimator}) and
+% covariance
+%$(R+\T{X_S}X_S)^{-1}$.
 Let $V:2^\mathcal{N}\to\reals$ be a \emph{value function}, quantifying how informative a set of experiments $S$ is in estimating $\beta$.
 The classical experimental design problem amounts to finding a set $S$ that maximizes $V(S)$ subject to the constraint $|S|\leq k$.
-A variety of different value functions are used in literature~\cite{pukelsheim2006optimal}; almost all make use of the covariance $(R+\T{X_S}X_S)^{-1}$ of the estimator $\hat{\beta}$. A value function that has natural advantages is the \emph{information gain}: %\emph{$D$-optimality criterion}: %which yields the following optimization problem
+A variety of different value functions are used in literature~\cite{pukelsheim2006optimal}.
+%; almost all make use of the covariance $(R+\T{X_S}X_S)^{-1}$ of the estimator $\hat{\beta}$.
+A value function that has natural advantages is the \emph{information gain}: %\emph{$D$-optimality criterion}: %which yields the following optimization problem
 \begin{align}
 V(S)= I(\beta;y_S) = \entropy(\beta)-\entropy(\beta\mid y_S). \label{informationgain}
 \end{align}
@@ -52,7 +55,7 @@ Under the linear model \eqref{model}, and the Gaussian prior, the information ga
 \begin{align}
 V(S) &= \frac{1}{2}\log\det(R+ \T{X_S}X_S) \label{dcrit} %\\
 \end{align}
-which is indeed a function of the covariance matrix $(R+\T{X_S}X_S)^{-1}$.
+%which is indeed a function of the covariance matrix $(R+\T{X_S}X_S)^{-1}$.
 %defined as $-\infty$ when $\mathrm{rank}(\T{X_S}X_S)<d$.
 %As $\hat{\beta}$ is a multidimensional normal random variable, the
 %$D$-optimality criterion is equal (up to a constant) to the negative of the
@@ -66,14 +69,17 @@
 %\end{align}
 %There are several reasons
 %In addition, the maximization of convex relaxations of this function is a well-studied problem \cite{boyd}.
-Our analysis will focus on the case of a \emph{homotropic} prior, in which the prior covariance is the identity matrix, \emph{i.e.}, $$R=I_d\in \reals^{d\times d}.$$ Intuitively, this corresponds to the simplest prior, in which no direction of $\reals^d$ is a priori favored; equivalently, it also corresponds to the case where ridge regression estimation \eqref{ridge} performed by $\E$ has a penalty term $\norm{\beta}_2^2$.
+Our analysis will focus on the case of a \emph{homotropic} prior, in which the prior covariance is the identity matrix, \emph{i.e.}, $R=I_d\in \reals^{d\times d}.$ Intuitively, this corresponds to the simplest prior, in which no direction of $\reals^d$ is a priori favored; equivalently, it also corresponds to the case where ridge regression estimation \eqref{ridge} performed by $\E$ has a penalty term $\norm{\beta}_2^2$. In Section 5, we will address other $R$'s.
 %Note that \eqref{dcrit} is a submodular set function, \emph{i.e.},
 %$V(S)+V(T)\geq V(S\cup T)+V(S\cap T)$ for all $S,T\subseteq \mathcal{N}$; it is also monotone, \emph{i.e.}, $V(S)\leq V(T)$ for all $S\subset T$.
 \subsection{Budgeted-Feasible Experimental Design: Full Information Case}
-We depart from the above classic experimental design setting by assuming that each experiment is associated with a cost $c_i\in\reals_+$. Moreover, the experimenter $\E$ is limited by a budget $B\in \reals_+$.
+Beyond the cardinality constraint in classical experimental design discussed above, is
+the budgeted version.
+%depart from the above classic experimental design setting by assuming that
+Each experiment is associated with a cost $c_i\in\reals_+$. Moreover, the experimenter $\E$ is limited by a budget $B\in \reals_+$.
 The cost $c_i$ can capture, \emph{e.g.}, the amount the subject $i$ deems sufficient to incentivize her participation in the experiment.
 In the full-information case, the experiment costs are common knowledge; as such, the optimization problem that the experimenter wishes to solve is:
@@ -104,6 +110,14 @@ for all $S\subseteq\mathcal{N}$, since the matrix $\T{X_S}X_S$ is positive semi-
 the optimal value achievable in the full-information case.
 \subsection{Experimental Design Problem: Strategic Case}
+We study the strategic case when $c_i$'s are specified by the subjects and they are strategic.
+In this case, the costs $c_i$ are {\em not} common knowledge and they are reported by the
+% their reporting can be manipulated by the
+experiment subjects.
+ Moreover, though a subject may lie about her true cost $c_i$, she cannot lie about $x_i$ (\emph{i.e.}, all public features are verifiable prior to the experiment) nor $y_i$ (\emph{i.e.}, the subject cannot falsify her measurement).
+The subjects report $c_i$ in order to maximize their expected {\em utility} which is
+$p_i-c_i$.
+
 When the experiment subjects are strategic, the experimental design problem becomes a special case of a \emph{budget feasible reverse auction}, as introduced by \citeN{singer-mechanisms}.
 %comprises a set of items $\mathcal{N}=\{1,\ldots,n\}$ as well as a single buyer. Each item has a cost $c_i\in\reals_+$. Moreover, the buyer has a positive value function $V:2^{\mathcal{N}}\to \reals_+$, as well as a budget $B\in \reals_+$.
 %In the \emph{full information case}, the costs $c_i$ are common
 %knowledge; the objective of the buyer in this context is to select a set $S$
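To make the quantities this patch touches concrete, here is a minimal numpy sketch (an illustration only, not code from the paper; the names `ridge_estimate` and `information_gain` and all data values are assumptions) of the standard ridge/MAP estimator, which under the prior precision $R$ takes the form $(R+\T{X_S}X_S)^{-1}\T{X_S}y_S$, and of the information-gain value $V(S)=\frac{1}{2}\log\det(R+\T{X_S}X_S)$ from \eqref{dcrit} with the homotropic prior $R=I_d$:

```python
import numpy as np

def ridge_estimate(X_S, y_S, R):
    # Standard ridge/MAP estimator (R + X_S^T X_S)^{-1} X_S^T y_S;
    # with R = I_d this matches ridge regression with penalty ||beta||_2^2.
    return np.linalg.solve(R + X_S.T @ X_S, X_S.T @ y_S)

def information_gain(X_S, R):
    # V(S) = (1/2) log det(R + X_S^T X_S), eq. (dcrit) in the diff above.
    # slogdet is numerically safer than log(det(...)) for larger d.
    _, logdet = np.linalg.slogdet(R + X_S.T @ X_S)
    return 0.5 * logdet

# Toy instance (all values assumed for illustration).
rng = np.random.default_rng(0)
n, d = 10, 3
X = rng.normal(size=(n, d))        # row i = public feature vector x_i
beta = rng.normal(size=d)          # ground-truth model
y = X @ beta + rng.normal(size=n)  # noisy measurements y_i
R = np.eye(d)                      # homotropic prior R = I_d

S = [0, 2, 5]                      # an example experiment set
print(ridge_estimate(X[S], y[S], R))
print(information_gain(X[S], R))
```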
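For the budgeted full-information problem added in the second-to-last hunk, $\max V(S)$ subject to $\sum_{i\in S}c_i\leq B$, the sketch below (reusing `information_gain`, `X`, `R`, `n`, and `rng` from above) shows a hypothetical greedy marginal-gain-per-cost heuristic. It is not the mechanism analyzed in the paper, and it assumes the costs $c_i$ are known truthfully, unlike the strategic case of the last hunk, where subjects report $c_i$ to maximize utility $p_i-c_i$:

```python
def greedy_budgeted(X, R, c, B):
    # Greedy heuristic for max V(S) s.t. sum_{i in S} c_i <= B:
    # repeatedly add the affordable experiment with the largest
    # marginal information gain per unit cost.
    S, spent = [], 0.0
    remaining = set(range(len(c)))
    while True:
        v_S = information_gain(X[S], R)  # X[[]] is an empty (0, d) slice
        best, best_ratio = None, 0.0
        for i in remaining:
            if spent + c[i] <= B:
                ratio = (information_gain(X[S + [i]], R) - v_S) / c[i]
                if ratio > best_ratio:
                    best, best_ratio = i, ratio
        if best is None:   # nothing affordable still improves V: stop
            return S
        S.append(best)
        spent += c[best]
        remaining.remove(best)

c = rng.uniform(0.5, 2.0, size=n)  # assumed per-subject costs
print(greedy_budgeted(X, R, c, B=3.0))
```

Greedy rules of this kind are the usual starting point here because, as a commented-out line of the diff notes, \eqref{dcrit} is a monotone submodular set function.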
