summaryrefslogtreecommitdiffstats
path: root/notes.tex
blob: 615f71bbbf26b6f4b61eb5343ebf94bc6f7bb9f5 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{amsmath}
\begin{document}

\section{Introduction}
\paragraph{General problem} A user is evolving in an environment, understand how
to place ads in this environment.

\paragraph{Model} The environment is a weighted directed graph, the vertices are
called the \emph{scenes} (aisles in a store, web pages). The outgoing edges of a
vertex are the actions the user can do at this vertex. The weight of an edge is
the probability of choosing this action. A user \emph{session} is a Markov
process over the graph (shopping session, web browsing session).

\emph{Could be added later: each scene has a duration, the average time spent by
  a user on this scene.}

On each scene an ad can be displayed. Advertisers want to place their ad on the
graph. Each advertiser's request is characterized by the following parameters:
\begin{itemize}
\item a per-impression value
\item a subset of the nodes where the ad can be displayed
\item the expected number of times the ad will be seen
\item \emph{could be added later: the expected duration of exposure to the user}
\item \emph{could be added later: a multiplying factor for the per-impression
    value if the ad is displayed on consecutive scenes}
\end{itemize}

Two different types of problem:
\begin{itemize}
\item there is a flow of request, only one request can be granted at the same
  time: when a request arrives, it can either be granted or delayed (with a
  cost). This yields an online scheduling problem.
\item all the request arrive at the same time, several requests can be granted
  at the same time, an auction is run to select the winning advertisers.
\end{itemize}

\paragraph{Related work}
\begin{itemize}
\item \cite{cm} “Slated Cascade model”: the graph is a matrix, the user starts
  at the top of a column and goes down it, only at the end of the column can he
  switch to another one. There is a PTAS to compute the assignment which
  maximizes the social welfare.
\item \cite{oad} the graph is an infinite path and is a model of a web browsing
  session: on each page their is a constant probability of continuation. The
  paper gives a 7-competitive algorithm against the optimal offline solution.
\end{itemize}

\paragraph{Restriction} The graph is a DAG: if the environment is well designed,
the user shouldn't be passing through the same vertex twice.  Start first with a
tree or even path and see if all the advertiser's requirements can be matched.

Muthu's strategy: \emph{find the first non trivial problem and solve it}.

\section{Static ads placement on a tree}

\subsection{Notations}

Let $T$ be a weighted tree: for each $u\in T$, $C(u)$ is the set of children of
$u$. For all $v$ in $C(u)$, $p(v)$ will denote the probability of transition
from vertex $u$ to vertex $v$. $T(u)$ is the tree rooted at $u$. $r$ is the root
of the tree.

For any function $f$ of the nodes, $f(T)$ will denote the expected return of a
random walk on the tree starting at $r$, stopping when it reaches a leaf and
where passing through vertex $v$ adds $f(u)$ to the return:
\begin{displaymath}
  f(T)= f(r) + \sum_{v\in C(u)} p_v f(T(v))
\end{displaymath}

Introducing the cumulative probabilities yields a simple expression for
$f(T)$. Let $\lambda_u$ denote the probability of walking from the root of the
tree to node $u$: $\lambda_u=\Pi_{i=1}^k p(u_k)$, with $u_1, \ldots, u_k$ the
path from the root to $u$, then:

\begin{displaymath}
  f(T)= \sum_{u\in T}\lambda_u f(u)
\end{displaymath}

\subsection{Problem}

There is a set of $n$ advertisers. $b_i$ is the per-impression value
of advertiser $i$ and $n_i$ the expected number of times advertiser
$i$ would like his ad to be seen.

An assignment is a subset $A$ of advertisers and a collection of subsets of $T$:
$(A_i)_{i\in S}$. The elements of $A_i$ are the vertices of $T$ where the ad of
advertiser $i$ will be displayed. The overall return of advertiser $i$ is simply
$R_i = f_i(T)$ with $f_i:u\mapsto b_i\mathbf{1}_{A_i}(u)$.

The goal is to find an assignment which maximizes the social welfare:
\begin{equation}\label{eq:opt}
  W = \sum_{i\in A} R_i = \sum_{i\in A} b_i \mathbf{1}_{A_i}(T)
\end{equation} with respect to the constraint:

\begin{equation}\label{eq:constraint} 
  \forall A_i,\; \mathbf{1}_{A_i}(T)\geq n_i
\end{equation}

If \eqref{eq:constraint} is the only constraint, then the optimization problem
is trivial: select the advertiser whose $b_i n_i$ is the greatest and place his
ad on all the nodes of the tree. However, by doing this $\mathbf{1}_{A_i}(T)$
could be much greater than $n_i$. There are three ways of further refining the
problem to get a reasonable solution:
\begin{enumerate}
\item\label{over-serve} replacing $\mathbf{1}_{A_i}(T)$ by $n_i$ in
  \eqref{eq:opt}. In other words, even if the publisher over-serve an
  advertiser's request, he will not receive more than what the advertiser was
  initially willing to pay for.
\item\label{min-screen} adding the constraint $\mathbf{1}_{A_i}(T)\leq n_i +
  \max_{x\in A_i} \lambda_x$: this simply expresses that removing any screen to
  the set of screens assigned to advertiser $i$ will lead to under-serving his
  request.
\item\label{fractional} allow fractional occupation of a screen by an ad: in
  addition to choosing a set of screen for advertiser $i$, you can also choose
  for each screen a fraction of the time this ad will be displayed on the
  screen. This way, the screen can be share between several ads by doing a
  simple rotation of the ads. The main advantage of fractional allocation of the
  screens is that it allows the publisher to meet exactly the advertiser's
  request.
\end{enumerate}

Solutions \ref{over-serve}. and \ref{min-screen}. seem to lead to very complex
combinatorial problems (dynamic programming gives a running time of
$O(n2^{|T|})$). Solution \ref{fractional}. seem to be more realistic: what is
the point of having screens if you do not leverage their ability to change their
content frequently?

Without taking the constraint of the advertisers into account (for each
advertiser, there is a subset of node where his ad can be displayed), the
optimization problem is simply choosing a subset of advertisers $A$ that
maximizes:
\begin{displaymath}
  \sum_{i\in A} b_i n_i
\end{displaymath}
subject to the constraint:
\begin{displaymath}
  \sum_{i\in A} n_i \leq B
\end{displaymath}
where $B$ is the number of views' capacity of the tree. This is nothing else
than a Knapsack Problem. In this case, the placement of ads has no importance:
the tree is just seen as a pool of available number of views that can be filled
continuously, and the position of ads has no importance.




\bibliographystyle{plain}
\bibliography{papers.bib}

\end{document}