author: Thibaut Horel <thibaut.horel@gmail.com> 2015-04-02 12:59:58 -0400
committer: Thibaut Horel <thibaut.horel@gmail.com> 2015-04-02 12:59:58 -0400
commit: 4ad2f2ed18bacf270db38f4aaf5c35e13615a98e
tree: 57500b3312c9e5297250c4c5c56dd9d2686dfcc1
parent: b07276072fd182c9228e6a6b800f0390672d05f1
download: learn-optimize-4ad2f2ed18bacf270db38f4aaf5c35e13615a98e.tar.gz
Add subset selection application
-rw-r--r-- results.tex 9
1 files changed, 9 insertions, 0 deletions
diff --git a/results.tex b/results.tex
index 8e6a186..ce5368d 100644
--- a/results.tex
+++ b/results.tex
@@ -209,6 +209,15 @@ sets are the sets of size at most one).
This can be written as multivariate concave over modular
(\textbf{TODO:} I think multivariate concave over modular is not
submodular in general, it is for $\log\det$. Understand this better).
+ \item \emph{data subset selection/summarization:} in statistical machine
+ translation, Bilmes used a sum of concave over modular functions:
+ \begin{displaymath}
+ f(S) = \sum_{f} \lambda_f \phi\left(\sum_{e\in S}w_f(e)\right)
+ \end{displaymath}
+ where each $f$ in the sum indexes a feature, $\lambda_f$ is its weight,
+ $w_f(e)$ measures how much of $f$ element $e$ has, and the concave $\phi$
+ gives decreasing marginal gains once $S$ already has a lot of a feature.
+ Facility location functions are also commonly used for subset selection.
\end{itemize}
\section{Passive Optimization}
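The sum of concave over modular objective added in this hunk can be checked numerically on a toy instance. The following is only a minimal sketch under stated assumptions: the function names, the choice $\phi = \sqrt{\cdot}$, the feature names, and all weights are hypothetical and do not come from results.tex or from Bilmes' experiments.

```python
import math


def concave_over_modular(S, weights, lambdas, phi=math.sqrt):
    """Evaluate f(S) = sum_f lambda_f * phi(sum_{e in S} w_f(e)).

    weights[f][e] plays the role of w_f(e); lambdas[f] plays the role of
    lambda_f. phi defaults to the square root, an assumed concave choice.
    """
    total = 0.0
    for feature, w in weights.items():
        # Inner modular part: total amount of this feature present in S.
        mass = sum(w.get(e, 0.0) for e in S)
        total += lambdas[feature] * phi(mass)
    return total


def marginal_gain(e, S, weights, lambdas, phi=math.sqrt):
    """Return f(S + e) - f(S) for an element e outside S."""
    S = set(S)
    return (concave_over_modular(S | {e}, weights, lambdas, phi)
            - concave_over_modular(S, weights, lambdas, phi))


# Hypothetical toy data: two features, three candidate elements.
weights = {
    "feat_a": {"s1": 1.0, "s2": 1.0, "s3": 0.0},
    "feat_b": {"s1": 0.0, "s2": 1.0, "s3": 4.0},
}
lambdas = {"feat_a": 1.0, "feat_b": 1.0}

# Gain of adding s2 to the empty set versus to a set that already
# covers both features.
gain_small = marginal_gain("s2", set(), weights, lambdas)
gain_large = marginal_gain("s2", {"s1", "s3"}, weights, lambdas)
print(gain_small, gain_large)
```

With nonnegative weights and a concave, nondecreasing $\phi$, the gain of adding `s2` shrinks once the set already contains a lot of each feature, which is the diminishing-returns behaviour the item describes.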