From 85a1c1d0a819621bdccb256d320f9df40ebee995 Mon Sep 17 00:00:00 2001
From: Thibaut <thibaut.horel@gmail.com>
Date: Wed, 14 Dec 2011 18:49:57 -0800
Subject: More notes.

Simple cases
---
 .gitignore |   6 ++-
 notes.tex  | 133 +++++++++++++++++++++++++++++++++++++++++++++++++++----------
 2 files changed, 117 insertions(+), 22 deletions(-)

diff --git a/.gitignore b/.gitignore
index e10afc6..a2ca3fe 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,3 +1,5 @@
 *~
-
-
+*.log
+*.aux
+*.bbl
+*.blg
diff --git a/notes.tex b/notes.tex
index 2d6477d..615f71b 100644
--- a/notes.tex
+++ b/notes.tex
@@ -1,59 +1,152 @@
 \documentclass{article}
 \usepackage[utf8]{inputenc}
-
+\usepackage{amsmath}
 \begin{document}
 
-\paragraph{General problem} A user is evolving in an environment, understand how to place ads in this environment.
+\section{Introduction}
+\paragraph{General problem} A user is evolving in an environment, understand how
+to place ads in this environment.
 
-\paragraph{Model} The environment is a weighted directed graph, the vertices are called the \emph{scenes} (aisles in a store, web pages). The outgoing edges of a vertex are the actions the user can do at this vertex. The weight of an edge is the probability of chosing this action. A user \emph{session} is a Markov process over the graph (shopping session, web browsing session).
+\paragraph{Model} The environment is a weighted directed graph, the vertices are
+called the \emph{scenes} (aisles in a store, web pages). The outgoing edges of a
+vertex are the actions the user can do at this vertex. The weight of an edge is
+the probability of choosing this action. A user \emph{session} is a Markov
+process over the graph (shopping session, web browsing session).
 
-\emph{Could be added later: each scene has a duration, the average time spent by a user on this scene.}
+\emph{Could be added later: each scene has a duration, the average time spent by
+  a user on this scene.}
 
-On each scene an ad can be displayed. Advertisers want to place their ad on the graph. Each advertiser's request is characterized by the following parameters:
+On each scene an ad can be displayed. Advertisers want to place their ad on the
+graph. Each advertiser's request is characterized by the following parameters:
 \begin{itemize}
 \item a per-impression value
 \item a subset of the nodes where the ad can be displayed
 \item the expected number of times the ad will be seen
 \item \emph{could be added later: the expected duration of exposure to the user}
-\item \emph{could be added later: a multiplying factor for the per-impression value if the ad is displayed on consecutive scenes}
+\item \emph{could be added later: a multiplying factor for the per-impression
+    value if the ad is displayed on consecutive scenes}
 \end{itemize}
 
 Two different types of problem:
 \begin{itemize}
-\item there is a flow of request, only one request can be granted at the same time: when a request arrives, it can either be granted or delayed (with a cost). This yields an online scheduling problem.
-\item all the request arrive at the same time, several requests can be granted at the same time, an auction is runned to select the winning advertisers. 
+\item there is a flow of request, only one request can be granted at the same
+  time: when a request arrives, it can either be granted or delayed (with a
+  cost). This yields an online scheduling problem.
+\item all the request arrive at the same time, several requests can be granted
+  at the same time, an auction is run to select the winning advertisers.
 \end{itemize}
 
 \paragraph{Related work}
 \begin{itemize}
-\item \cite{cm} “Slated Cascade model”: the graph is a matrix, the user starts at the top of a column and goes down it, only at the end of the column can he switch to another one. There is a PTAS to compute the assignment which maximizes the social welfare.
-\item \cite{oad} the graph is an infinite path and is a modelisation of a web browsing session: on each page their is a constant probability of continuation. The paper gives a 7-competitive algorithm against the optimal offline solution.
+\item \cite{cm} “Slated Cascade model”: the graph is a matrix, the user starts
+  at the top of a column and goes down it, only at the end of the column can he
+  switch to another one. There is a PTAS to compute the assignment which
+  maximizes the social welfare.
+\item \cite{oad} the graph is an infinite path and is a model of a web browsing
+  session: on each page their is a constant probability of continuation. The
+  paper gives a 7-competitive algorithm against the optimal offline solution.
 \end{itemize}
 
-\paragraph{Restriction}
-The graph is a DAG: if the environment is well designed, the user shouldn't be passing through the same vertex twice.
-Start first with a tree or even path and see if all the advertiser's requirements can be matched.
+\paragraph{Restriction} The graph is a DAG: if the environment is well designed,
+the user shouldn't be passing through the same vertex twice.  Start first with a
+tree or even path and see if all the advertiser's requirements can be matched.
 
 Muthu's strategy: \emph{find the first non trivial problem and solve it}.
 
-\paragraph{Starting problem} Let $T$ be a weighted tree: for each $u\in T$, $C(u)$ is the set of children of $u$. For all $v$ in $C(u)$, $p(v)$ will denote the probability of transition from vertex $u$ to vertex $v$. $T(u)$ is the tree rooted at $u$. $r$ is the root of the tree.
+\section{Static ads placement on a tree}
+
+\subsection{Notations}
+
+Let $T$ be a weighted tree: for each $u\in T$, $C(u)$ is the set of children of
+$u$. For all $v$ in $C(u)$, $p(v)$ will denote the probability of transition
+from vertex $u$ to vertex $v$. $T(u)$ is the tree rooted at $u$. $r$ is the root
+of the tree.
+
+For any function $f$ of the nodes, $f(T)$ will denote the expected return of a
+random walk on the tree starting at $r$, stopping when it reaches a leaf and
+where passing through vertex $v$ adds $f(u)$ to the return:
+\begin{displaymath}
+  f(T)= f(r) + \sum_{v\in C(u)} p_v f(T(v))
+\end{displaymath}
+
+Introducing the cumulative probabilities yields a simple expression for
+$f(T)$. Let $\lambda_u$ denote the probability of walking from the root of the
+tree to node $u$: $\lambda_u=\Pi_{i=1}^k p(u_k)$, with $u_1, \ldots, u_k$ the
+path from the root to $u$, then:
 
-For any function $f$ of the nodes, $f(T)$ will denote the expected return of a random walk on the tree starting at $r$, stopping when it reaches a leaf and where passing through vertex $v$ adds $f(u)$ to the return:
 \begin{displaymath}
-f(T)= f(r) + \sum_{v\in C(u)} p_v f(T(v))
+  f(T)= \sum_{u\in T}\lambda_u f(u)
 \end{displaymath}
 
-There is a set of $n$ advertisers. $b_i$ is the per-impression value of advertiser $i$ and $n_i$ the expected number of times advertiser $i$ would like his ad to be seen.
+\subsection{Problem}
+
+There is a set of $n$ advertisers. $b_i$ is the per-impression value
+of advertiser $i$ and $n_i$ the expected number of times advertiser
+$i$ would like his ad to be seen.
 
-An assignment of advertisers is a collection of subsets of $T$: $(S_i)_{1\leq i \leq n}$. The elements of $S_i$ are the vertices of $T$ where the ad of advertiser $i$ will be displayed. The overall return of advertiser $i$ is simply $R_i = f_i(T)$ with $f_i:u\mapsto b_i\mathbf{1}_{S_i}(u)$.
+An assignment is a subset $A$ of advertisers and a collection of subsets of $T$:
+$(A_i)_{i\in S}$. The elements of $A_i$ are the vertices of $T$ where the ad of
+advertiser $i$ will be displayed. The overall return of advertiser $i$ is simply
+$R_i = f_i(T)$ with $f_i:u\mapsto b_i\mathbf{1}_{A_i}(u)$.
 
 The goal is to find an assignment which maximizes the social welfare:
+\begin{equation}\label{eq:opt}
+  W = \sum_{i\in A} R_i = \sum_{i\in A} b_i \mathbf{1}_{A_i}(T)
+\end{equation} with respect to the constraint:
+
+\begin{equation}\label{eq:constraint} 
+  \forall A_i,\; \mathbf{1}_{A_i}(T)\geq n_i
+\end{equation}
+
+If \eqref{eq:constraint} is the only constraint, then the optimization problem
+is trivial: select the advertiser whose $b_i n_i$ is the greatest and place his
+ad on all the nodes of the tree. However, by doing this $\mathbf{1}_{A_i}(T)$
+could be much greater than $n_i$. There are three ways of further refining the
+problem to get a reasonable solution:
+\begin{enumerate}
+\item\label{over-serve} replacing $\mathbf{1}_{A_i}(T)$ by $n_i$ in
+  \eqref{eq:opt}. In other words, even if the publisher over-serve an
+  advertiser's request, he will not receive more than what the advertiser was
+  initially willing to pay for.
+\item\label{min-screen} adding the constraint $\mathbf{1}_{A_i}(T)\leq n_i +
+  \max_{x\in A_i} \lambda_x$: this simply expresses that removing any screen to
+  the set of screens assigned to advertiser $i$ will lead to under-serving his
+  request.
+\item\label{fractional} allow fractional occupation of a screen by an ad: in
+  addition to choosing a set of screen for advertiser $i$, you can also choose
+  for each screen a fraction of the time this ad will be displayed on the
+  screen. This way, the screen can be share between several ads by doing a
+  simple rotation of the ads. The main advantage of fractional allocation of the
+  screens is that it allows the publisher to meet exactly the advertiser's
+  request.
+\end{enumerate}
+
+Solutions \ref{over-serve}. and \ref{min-screen}. seem to lead to very complex
+combinatorial problems (dynamic programming gives a running time of
+$O(n2^{|T|})$). Solution \ref{fractional}. seem to be more realistic: what is
+the point of having screens if you do not leverage their ability to change their
+content frequently?
+
+Without taking the constraint of the advertisers into account (for each
+advertiser, there is a subset of node where his ad can be displayed), the
+optimization problem is simply choosing a subset of advertisers $A$ that
+maximizes:
+\begin{displaymath}
+  \sum_{i\in A} b_i n_i
+\end{displaymath}
+subject to the constraint:
 \begin{displaymath}
-W = \sum_{i=1}^n R_i = \sum_{i=1}^n b_i \mathbf{1}_{S_i}(T)
+  \sum_{i\in A} n_i \leq B
 \end{displaymath}
+where $B$ is the number of views' capacity of the tree. This is nothing else
+than a Knapsack Problem. In this case, the placement of ads has no importance:
+the tree is just seen as a pool of available number of views that can be filled
+continuously, and the position of ads has no importance.
+
+
 
 
 \bibliographystyle{plain}
 \bibliography{papers.bib}
 
-\end{document}
+\end{document}
\ No newline at end of file
-- 
cgit v1.2.3-70-g09d2