summaryrefslogtreecommitdiffstats
path: root/notes.tex
diff options
context:
space:
mode:
Diffstat (limited to 'notes.tex')
-rw-r--r--notes.tex133
1 files changed, 113 insertions, 20 deletions
diff --git a/notes.tex b/notes.tex
index 2d6477d..615f71b 100644
--- a/notes.tex
+++ b/notes.tex
@@ -1,59 +1,152 @@
\documentclass{article}
\usepackage[utf8]{inputenc}
-
+\usepackage{amsmath}
\begin{document}
-\paragraph{General problem} A user is evolving in an environment, understand how to place ads in this environment.
+\section{Introduction}
+\paragraph{General problem} A user is evolving in an environment, understand how
+to place ads in this environment.
-\paragraph{Model} The environment is a weighted directed graph, the vertices are called the \emph{scenes} (aisles in a store, web pages). The outgoing edges of a vertex are the actions the user can do at this vertex. The weight of an edge is the probability of chosing this action. A user \emph{session} is a Markov process over the graph (shopping session, web browsing session).
+\paragraph{Model} The environment is a weighted directed graph, the vertices are
+called the \emph{scenes} (aisles in a store, web pages). The outgoing edges of a
+vertex are the actions the user can do at this vertex. The weight of an edge is
+the probability of choosing this action. A user \emph{session} is a Markov
+process over the graph (shopping session, web browsing session).
-\emph{Could be added later: each scene has a duration, the average time spent by a user on this scene.}
+\emph{Could be added later: each scene has a duration, the average time spent by
+ a user on this scene.}
-On each scene an ad can be displayed. Advertisers want to place their ad on the graph. Each advertiser's request is characterized by the following parameters:
+On each scene an ad can be displayed. Advertisers want to place their ad on the
+graph. Each advertiser's request is characterized by the following parameters:
\begin{itemize}
\item a per-impression value
\item a subset of the nodes where the ad can be displayed
\item the expected number of times the ad will be seen
\item \emph{could be added later: the expected duration of exposure to the user}
-\item \emph{could be added later: a multiplying factor for the per-impression value if the ad is displayed on consecutive scenes}
+\item \emph{could be added later: a multiplying factor for the per-impression
+ value if the ad is displayed on consecutive scenes}
\end{itemize}
Two different types of problem:
\begin{itemize}
-\item there is a flow of request, only one request can be granted at the same time: when a request arrives, it can either be granted or delayed (with a cost). This yields an online scheduling problem.
-\item all the request arrive at the same time, several requests can be granted at the same time, an auction is runned to select the winning advertisers.
+\item there is a flow of request, only one request can be granted at the same
+ time: when a request arrives, it can either be granted or delayed (with a
+ cost). This yields an online scheduling problem.
+\item all the request arrive at the same time, several requests can be granted
+ at the same time, an auction is run to select the winning advertisers.
\end{itemize}
\paragraph{Related work}
\begin{itemize}
-\item \cite{cm} “Slated Cascade model”: the graph is a matrix, the user starts at the top of a column and goes down it, only at the end of the column can he switch to another one. There is a PTAS to compute the assignment which maximizes the social welfare.
-\item \cite{oad} the graph is an infinite path and is a modelisation of a web browsing session: on each page their is a constant probability of continuation. The paper gives a 7-competitive algorithm against the optimal offline solution.
+\item \cite{cm} “Slated Cascade model”: the graph is a matrix, the user starts
+ at the top of a column and goes down it, only at the end of the column can he
+ switch to another one. There is a PTAS to compute the assignment which
+ maximizes the social welfare.
+\item \cite{oad} the graph is an infinite path and is a model of a web browsing
+ session: on each page their is a constant probability of continuation. The
+ paper gives a 7-competitive algorithm against the optimal offline solution.
\end{itemize}
-\paragraph{Restriction}
-The graph is a DAG: if the environment is well designed, the user shouldn't be passing through the same vertex twice.
-Start first with a tree or even path and see if all the advertiser's requirements can be matched.
+\paragraph{Restriction} The graph is a DAG: if the environment is well designed,
+the user shouldn't be passing through the same vertex twice. Start first with a
+tree or even path and see if all the advertiser's requirements can be matched.
Muthu's strategy: \emph{find the first non trivial problem and solve it}.
-\paragraph{Starting problem} Let $T$ be a weighted tree: for each $u\in T$, $C(u)$ is the set of children of $u$. For all $v$ in $C(u)$, $p(v)$ will denote the probability of transition from vertex $u$ to vertex $v$. $T(u)$ is the tree rooted at $u$. $r$ is the root of the tree.
+\section{Static ads placement on a tree}
+
+\subsection{Notations}
+
+Let $T$ be a weighted tree: for each $u\in T$, $C(u)$ is the set of children of
+$u$. For all $v$ in $C(u)$, $p(v)$ will denote the probability of transition
+from vertex $u$ to vertex $v$. $T(u)$ is the tree rooted at $u$. $r$ is the root
+of the tree.
+
+For any function $f$ of the nodes, $f(T)$ will denote the expected return of a
+random walk on the tree starting at $r$, stopping when it reaches a leaf and
+where passing through vertex $v$ adds $f(u)$ to the return:
+\begin{displaymath}
+ f(T)= f(r) + \sum_{v\in C(u)} p_v f(T(v))
+\end{displaymath}
+
+Introducing the cumulative probabilities yields a simple expression for
+$f(T)$. Let $\lambda_u$ denote the probability of walking from the root of the
+tree to node $u$: $\lambda_u=\Pi_{i=1}^k p(u_k)$, with $u_1, \ldots, u_k$ the
+path from the root to $u$, then:
-For any function $f$ of the nodes, $f(T)$ will denote the expected return of a random walk on the tree starting at $r$, stopping when it reaches a leaf and where passing through vertex $v$ adds $f(u)$ to the return:
\begin{displaymath}
-f(T)= f(r) + \sum_{v\in C(u)} p_v f(T(v))
+ f(T)= \sum_{u\in T}\lambda_u f(u)
\end{displaymath}
-There is a set of $n$ advertisers. $b_i$ is the per-impression value of advertiser $i$ and $n_i$ the expected number of times advertiser $i$ would like his ad to be seen.
+\subsection{Problem}
+
+There is a set of $n$ advertisers. $b_i$ is the per-impression value
+of advertiser $i$ and $n_i$ the expected number of times advertiser
+$i$ would like his ad to be seen.
-An assignment of advertisers is a collection of subsets of $T$: $(S_i)_{1\leq i \leq n}$. The elements of $S_i$ are the vertices of $T$ where the ad of advertiser $i$ will be displayed. The overall return of advertiser $i$ is simply $R_i = f_i(T)$ with $f_i:u\mapsto b_i\mathbf{1}_{S_i}(u)$.
+An assignment is a subset $A$ of advertisers and a collection of subsets of $T$:
+$(A_i)_{i\in S}$. The elements of $A_i$ are the vertices of $T$ where the ad of
+advertiser $i$ will be displayed. The overall return of advertiser $i$ is simply
+$R_i = f_i(T)$ with $f_i:u\mapsto b_i\mathbf{1}_{A_i}(u)$.
The goal is to find an assignment which maximizes the social welfare:
+\begin{equation}\label{eq:opt}
+ W = \sum_{i\in A} R_i = \sum_{i\in A} b_i \mathbf{1}_{A_i}(T)
+\end{equation} with respect to the constraint:
+
+\begin{equation}\label{eq:constraint}
+ \forall A_i,\; \mathbf{1}_{A_i}(T)\geq n_i
+\end{equation}
+
+If \eqref{eq:constraint} is the only constraint, then the optimization problem
+is trivial: select the advertiser whose $b_i n_i$ is the greatest and place his
+ad on all the nodes of the tree. However, by doing this $\mathbf{1}_{A_i}(T)$
+could be much greater than $n_i$. There are three ways of further refining the
+problem to get a reasonable solution:
+\begin{enumerate}
+\item\label{over-serve} replacing $\mathbf{1}_{A_i}(T)$ by $n_i$ in
+ \eqref{eq:opt}. In other words, even if the publisher over-serve an
+ advertiser's request, he will not receive more than what the advertiser was
+ initially willing to pay for.
+\item\label{min-screen} adding the constraint $\mathbf{1}_{A_i}(T)\leq n_i +
+ \max_{x\in A_i} \lambda_x$: this simply expresses that removing any screen to
+ the set of screens assigned to advertiser $i$ will lead to under-serving his
+ request.
+\item\label{fractional} allow fractional occupation of a screen by an ad: in
+ addition to choosing a set of screen for advertiser $i$, you can also choose
+ for each screen a fraction of the time this ad will be displayed on the
+ screen. This way, the screen can be share between several ads by doing a
+ simple rotation of the ads. The main advantage of fractional allocation of the
+ screens is that it allows the publisher to meet exactly the advertiser's
+ request.
+\end{enumerate}
+
+Solutions \ref{over-serve}. and \ref{min-screen}. seem to lead to very complex
+combinatorial problems (dynamic programming gives a running time of
+$O(n2^{|T|})$). Solution \ref{fractional}. seem to be more realistic: what is
+the point of having screens if you do not leverage their ability to change their
+content frequently?
+
+Without taking the constraint of the advertisers into account (for each
+advertiser, there is a subset of node where his ad can be displayed), the
+optimization problem is simply choosing a subset of advertisers $A$ that
+maximizes:
\begin{displaymath}
-W = \sum_{i=1}^n R_i = \sum_{i=1}^n b_i \mathbf{1}_{S_i}(T)
+ \sum_{i\in A} b_i n_i
\end{displaymath}
+subject to the constraint:
+\begin{displaymath}
+ \sum_{i\in A} n_i \leq B
+\end{displaymath}
+where $B$ is the number of views' capacity of the tree. This is nothing else
+than a Knapsack Problem. In this case, the placement of ads has no importance:
+the tree is just seen as a pool of available number of views that can be filled
+continuously, and the position of ads has no importance.
+
+
\bibliographystyle{plain}
\bibliography{papers.bib}
-\end{document}
+\end{document} \ No newline at end of file