From 8c09cfd7da709aab03fb004b58ecd8e1eb4fb553 Mon Sep 17 00:00:00 2001
From: Stratis Ioannidis
Date: Mon, 11 Feb 2013 09:37:09 -0800
Subject: muthu

---
 problem.tex | 30 ++++++++++++++++++++++--------
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/problem.tex b/problem.tex
index 668cd57..3ad3270 100644
--- a/problem.tex
+++ b/problem.tex
@@ -13,7 +13,7 @@ For example, each $i$ may correspond to a human subject; the feature vector
 $x_i$ may correspond to a normalized vector of her age, weight, gender,
 income, \emph{etc.}, and the measurement $y_i$ may capture some biometric
 information (\emph{e.g.}, her red cell blood count, a genetic marker,
-etc.). The magnitude of the coefficient $\beta_i$ captures the effect that feature $i$ has on the measured variable, and its sign captures whether the corellation is positive or negative.
+etc.). The magnitude of the coefficient $\beta_i$ captures the effect that feature $i$ has on the measured variable, and its sign captures whether the correlation is positive or negative.
 
 The purpose of these experiments is to allow \E\ to estimate the model $\beta$.
 In particular,
@@ -35,12 +35,15 @@ This optimization, commonly known as \emph{ridge regression}, includes an additi
 %The estimator $\hat{\beta}$ is unbiased, \emph{i.e.}, $\expt{\hat{\beta}} = \beta$ (where the expectation is over the noise variables $\varepsilon_i$). Furthermore, $\hat{\beta}$ is a multidimensional normal random variable with mean $\beta$ and covariance matrix $(X_S\T{X_S})^{-1}$.
 Note that the estimator $\hat{\beta}$ is a linear map of $y_S$; as $y_S$ is a
 multidimensional normal r.v., so is $\hat{\beta}$ (the randomness coming from
-the noise terms $\varepsilon_i$). In particular, $\hat{\beta}$ has %mean $\beta$% (\emph{i.e.}, it is an \emph{unbiased estimator}) and
- covariance
-$(R+\T{X_S}X_S)^{-1}$.
+the noise terms $\varepsilon_i$).
+%In particular, $\hat{\beta}$ has %mean $\beta$% (\emph{i.e.}, it is an \emph{unbiased estimator}) and
+% covariance
+%$(R+\T{X_S}X_S)^{-1}$.
 Let $V:2^\mathcal{N}\to\reals$ be a \emph{value function}, quantifying how informative a set of experiments $S$ is in estimating $\beta$.
 The classical experimental design problem amounts to finding a set $S$ that maximizes $V(S)$ subject to the constraint $|S|\leq k$.
-A variety of different value functions are used in literature~\cite{pukelsheim2006optimal}; almost all make use of the covariance $(R+\T{X_S}X_S)^{-1}$ of the estimator $\hat{\beta}$. A value function that has natural advantages is the \emph{information gain}: %\emph{$D$-optimality criterion}: %which yields the following optimization problem
+A variety of different value functions are used in the literature~\cite{pukelsheim2006optimal}.
+%; almost all make use of the covariance $(R+\T{X_S}X_S)^{-1}$ of the estimator $\hat{\beta}$.
+A value function that has natural advantages is the \emph{information gain}: %\emph{$D$-optimality criterion}: %which yields the following optimization problem
 \begin{align}
 V(S)= I(\beta;y_S) = \entropy(\beta)-\entropy(\beta\mid y_S). \label{informationgain}
 \end{align}
@@ -52,7 +55,7 @@ Under the linear model \eqref{model}, and the Gaussian prior, the information ga
 \begin{align}
 V(S) &= \frac{1}{2}\log\det(R+ \T{X_S}X_S) \label{dcrit} %\\
 \end{align}
-which is indeed a function of the covariance matrix $(R+\T{X_S}X_S)^{-1}$.
+%which is indeed a function of the covariance matrix $(R+\T{X_S}X_S)^{-1}$.
 %defined as $-\infty$ when $\mathrm{rank}(\T{X_S}X_S)
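
For concreteness, below is a minimal numerical sketch of the two objects the hunks above manipulate: the ridge (posterior-mean) estimator, whose covariance is $(R+\T{X_S}X_S)^{-1}$, and the information-gain objective $V(S)=\frac{1}{2}\log\det(R+\T{X_S}X_S)$ of \eqref{dcrit}. The synthetic data, the identity prior precision, the unit noise variance, and the greedy budgeted selection are illustrative assumptions only, not part of the paper or this patch.

# Sketch (not from the patch): ridge estimator and the log-det value
# function, under an assumed Gaussian prior with precision matrix R and
# unit-variance noise.  All data below are placeholders.
import numpy as np

def value(X_S, R):
    """V(S) = 0.5 * log det(R + X_S^T X_S); slogdet avoids overflow."""
    _, logdet = np.linalg.slogdet(R + X_S.T @ X_S)
    return 0.5 * logdet

def ridge_estimate(X_S, y_S, R):
    """Posterior mean (R + X_S^T X_S)^{-1} X_S^T y_S, via solve() rather
    than an explicit matrix inverse."""
    return np.linalg.solve(R + X_S.T @ X_S, X_S.T @ y_S)

rng = np.random.default_rng(0)
n, d, k = 50, 5, 10                    # experiments, features, budget
X = rng.standard_normal((n, d))        # feature vectors x_i (placeholder)
beta = rng.standard_normal(d)          # ground-truth model
y = X @ beta + rng.standard_normal(n)  # y_i = <beta, x_i> + noise
R = np.eye(d)                          # prior precision (identity, for illustration)

# Greedy heuristic for max V(S) s.t. |S| <= k: repeatedly add the
# experiment with the largest marginal gain.  log det(R + X_S^T X_S) is
# monotone submodular for positive-definite R, so this is a standard
# baseline for the budgeted design problem.
S = []
for _ in range(k):
    S.append(max((i for i in range(n) if i not in S),
                 key=lambda i: value(X[S + [i]], R)))

print("selected:", sorted(S), " V(S) =", round(value(X[S], R), 3))
print("ridge estimate:", ridge_estimate(X[S], y[S], R))

The greedy loop is included only to make the constraint $|S|\leq k$ concrete; it is not presented here as the paper's own algorithm.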