diff options
Diffstat (limited to 'problem.tex')
| -rw-r--r-- | problem.tex | 24 |
1 files changed, 16 insertions, 8 deletions
diff --git a/problem.tex b/problem.tex index cec7b47..668cd57 100644 --- a/problem.tex +++ b/problem.tex @@ -2,7 +2,8 @@ \subsection{Linear Regression and Experimental Design} The theory of experimental design \cite{pukelsheim2006optimal,atkinson2007optimum,chaloner1995bayesian} considers the following formal setting. % studies how an experimenter \E\ should select the parameters of a set of experiments she is about to conduct. In general, the optimality of a particular design depends on the purpose of the experiment, \emph{i.e.}, the quantity \E\ is trying to learn or the hypothesis she is trying to validate. Due to their ubiquity in statistical analysis, a large literature on the subject focuses on learning \emph{linear models}, where \E\ wishes to fit a linear function to the data she has collected. -Putting cost considerations aside, suppose that an experimenter \E\ wishes to conduct $k$ among $n$ possible experiments. Each experiment $i\in\mathcal{N}\defeq \{1,\ldots,n\}$ is associated with a set of parameters (or features) $x_i\in \reals^d$, normalized so that $\|x_i\|_2\leq 1$. Denote by $S\subseteq \mathcal{N}$, where $|S|=k$, the set of experiments selected; upon its execution, experiment $i\in S$ reveals an output variable (the ``measurement'') $y_i$, related to the experiment features $x_i$ through a linear function, \emph{i.e.}, +%Putting cost considerations aside +Suppose that an experimenter \E\ wishes to conduct $k$ among $n$ possible experiments. Each experiment $i\in\mathcal{N}\defeq \{1,\ldots,n\}$ is associated with a set of parameters (or features) $x_i\in \reals^d$, normalized so that $\|x_i\|_2\leq 1$. Denote by $S\subseteq \mathcal{N}$, where $|S|=k$, the set of experiments selected; upon its execution, experiment $i\in S$ reveals an output variable (the ``measurement'') $y_i$, related to the experiment features $x_i$ through a linear function, \emph{i.e.}, \begin{align}\label{model} \forall i\in\mathcal{N},\quad y_i = \T{\beta} x_i + \varepsilon_i \end{align} @@ -71,6 +72,7 @@ Our analysis will focus on the case of a \emph{homotropic} prior, in which the p %$V(S)+V(T)\geq V(S\cup T)+V(S\cap T)$ for all $S,T\subseteq \mathcal{N}$; it is also monotone, \emph{i.e.}, $V(S)\leq V(T)$ for all $S\subset T$. \subsection{Budgeted-Feasible Experimental Design: Full Information Case} + We depart from the above classic experimental design setting by assuming that each experiment is associated with a cost $c_i\in\reals_+$. Moreover, the experimenter $\E$ is limited by a budget $B\in \reals_+$. The cost $c_i$ can capture, \emph{e.g.}, the amount the subject $i$ deems sufficient to incentivize her participation in the experiment. @@ -84,20 +86,26 @@ In the full-information case, the experiment costs are common knowledge; as such \end{align}\label{edp} \end{subequations} \end{center} -\EDP{}, as defined above, is NP-hard; to see this, note that \textsc{Knapsack} reduces to EDP with dimension $d=1$ by mapping the weight of each item, say, $w_i$, to an experiment with $x_i=w_i$. Moreover, \eqref{modified} is submodular, monotone and satisfies $V(\emptyset) = 0$. +\EDP{}, as defined above, is NP-hard; to see this, note that \textsc{Knapsack} +reduces to EDP with dimension $d=1$ by mapping the weight of each item, say, +$w_i$, to an experiment with $x_i^2=w_i$.% Moreover, \eqref{modified} is submodular, monotone and satisfies $V(\emptyset) = 0$. Note that \eqref{modified} is a submodular set function, \emph{i.e.}, -$V(S)+V(T)\geq V(S\cup T)+V(S\cap T)$ for all $S,T\subseteq \mathcal{N}$. It is also monotone, \emph{i.e.}, $V(S)\leq V(T)$ for all $S\subset T$, with $V(\emptyset)=0$. Finally, it is also non-negative, \emph{i.e.}, $V(S)\geq 0$ for all $S\subset\mathcal{N}$, as the matrix $\T{X_S}X_S$ is positive semi-definite for all $S\subseteq \mathcal{N}$. +$V(S)+V(T)\geq V(S\cup T)+V(S\cap T)$ for all $S,T\subseteq \mathcal{N}$. It is +also monotone, \emph{i.e.}, $V(S)\leq V(T)$ for all $S\subset T$, with +$V(\emptyset)=0$. Finally, it is also non-negative, \emph{i.e.}, $V(S)\geq 0$ +for all $S\subseteq\mathcal{N}$, since the matrix $\T{X_S}X_S$ is positive semi-definite for all $S\subseteq \mathcal{N}$. %Finally, we define the strategic version of \EDP{} (denoted by \SEDP) by adding the additional constraints of truthfulness, individual rationality, and normal payments, as described in the previous section. We denote by \begin{equation}\label{eq:non-strategic} OPT = \max_{S\subseteq\mathcal{N}} \Big\{V(S) \;\Big| \; \sum_{i\in S}c_i\leq B\Big\} \end{equation} -for the optimal value achievable in the full-information case. +the optimal value achievable in the full-information case. + \subsection{Experimental Design Problem: Strategic Case} -When the experiment subjects are strategic, the experimental design problem becomes a special case of a \emph{budget feasible reverse auction} -, as introduced by \citeN{singer-mechanisms}. %comprises a set of items $\mathcal{N}=\{1,\ldots,n\}$ as well as a single buyer. Each item has a cost $c_i\in\reals_+$. Moreover, the buyer has a positive value function $V:2^{\mathcal{N}}\to \reals_+$, as well as a budget $B\in \reals_+$. In the \emph{full information case}, the costs $c_i$ are common + +When the experiment subjects are strategic, the experimental design problem becomes a special case of a \emph{budget feasible reverse auction}, as introduced by \citeN{singer-mechanisms}. %comprises a set of items $\mathcal{N}=\{1,\ldots,n\}$ as well as a single buyer. Each item has a cost $c_i\in\reals_+$. Moreover, the buyer has a positive value function $V:2^{\mathcal{N}}\to \reals_+$, as well as a budget $B\in \reals_+$. In the \emph{full information case}, the costs $c_i$ are common %knowledge; the objective of the buyer in this context is to select a set $S$ %maximizing the value $V(S)$ subject to the constraint $\sum_{i\in S} c_i\leq %B$. We write: @@ -117,11 +125,11 @@ $S:\reals_+^n \to 2^\mathcal{N}$ and (b) a \emph{payment function} $p:\reals_+^n\to \reals_+^n$. Given the vector or costs $c=[c_i]_{i\in\mathcal{N}}$, the allocation function $f$ determines the set in $ \mathcal{N}$ of items to be purchased, while the payment function -returns a vector of payments $[p_i]_{i\in \mathcal{N}}$. +returns a vector of payments $[p_i(c)]_{i\in \mathcal{N}}$. Let $s_i(c) = \id_{i\in S(c)}$ be the binary indicator of $i\in f(c)$. Mechanism design in budget feasible reverse auctions seeks mechanisms that have the following properties \cite{singer-mechanisms,chen}: \begin{itemize} \item \emph{Normalization.} Unallocated experiments receive zero payments, \emph{i.e.}, - \begin{align}\notin S(c)\text{ implies }p_i(c)=0.\label{normal}\end{align} + \begin{align}s_i(c)=0\text{ implies }p_i(c)=0.\label{normal}\end{align} \item\emph{Individual Rationality.} Payments for allocated experiments exceed costs, \emph{i.e.}, \begin{align} p_i(c)\geq c_i\cdot s_i(c).\label{ir}\end{align} \item \emph{No Positive Transfers.} Payments are non-negative,\emph{i.e.}, |
