summaryrefslogtreecommitdiffstats
path: root/notes.tex
blob: b8a77d1da380d5995e136e946b577b7c8fcecb7f (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{amsmath,amsthm}
\newtheorem{lemma}{Lemma}
\begin{document}

\section{Introduction}
\paragraph{General problem} A user is moving in an environment, understand how
to place ads in this environment.

\paragraph{Model} The environment is a weighted directed graph, the vertices are
called the \emph{scenes} (aisles in a store, web pages). The outgoing edges of a
vertex are the actions the user can do at this vertex. The weight of an edge is
the probability of choosing this action. A user \emph{session} is a Markov
process over the graph (shopping session, web browsing session).

\emph{Could be added later: each scene has a duration, the average time spent by
  a user on this scene.}

On each scene an ad can be displayed. Advertisers want to place their ad on the
graph. Each advertiser's request is characterized by the following parameters:
\begin{itemize}
\item a per-impression value
\item a subset of the nodes where the ad can be displayed
\item the expected number of times the ad will be seen
\item \emph{could be added later: the expected duration of exposure to the user}
\item \emph{could be added later: a multiplying factor for the per-impression
    value if the ad is displayed on consecutive scenes}
\end{itemize}

Two different types of problem:
\begin{itemize}
\item there is a flow of request, only one request can be granted at the same
  time: when a request arrives, it can either be granted or delayed (with a
  cost). This yields an online scheduling problem.
\item all the request arrive at the same time, several requests can be granted
  at the same time, an auction is run to select the winning advertisers.
\end{itemize}

\paragraph{Related work}
\begin{itemize}
\item \cite{cm} “Slated Cascade model”: the graph is a matrix, the user starts
  at the top of a column and goes down it, only at the end of the column can he
  switch to another one. There is a PTAS to compute the assignment which
  maximizes the social welfare.
\item \cite{oad} the graph is an infinite path and is a model of a web browsing
  session: on each page their is a constant probability of continuation. The
  paper gives a 7-competitive algorithm against the optimal offline solution.
\end{itemize}

\section{Static ads placement on a tree}

\subsection{Notations}

Let $T$ be a weighted tree. For a node $u\in T$:
\begin{itemize}
\item $C(u)$ is the set of children of $u$
\item $p(u,v)$ is the transition probability from $u$ to $v\in C(u)$
\item $r$ is the root of the tree
\end{itemize}

For any function $f$ of the nodes, $f(T)$ will denote the expected return of a
random walk on the tree starting at $r$, stopping when it reaches a
leaf and earning $f(u)$ for any visited node $u$:
\begin{displaymath}
  f(T)= f(r) + \sum_{v\in C(r)} p(r,v) f\big(T(v)\big)
\end{displaymath}

Introducing the cumulative probabilities yields a simple expression for
$f(T)$. Let $\lambda_u$ denote the probability of walking from the root of the
tree to node $u$: $\lambda_u=\Pi_{i=1}^{k-1} p(u_i,u_{i+1})$, with $u_1, \ldots, u_k$ the
path from the root to $u$, then:
\begin{displaymath}
  f(T)= \sum_{u\in T}\lambda_u f(u)
\end{displaymath}
note that the lambdas are decreasing along the branches of $T$. 

For our purpose, $f$ will be the indicator function of a subset of
nodes in $T$. For any $S\subset T$, $\mathbf{1}_S$ is defined by:
\begin{displaymath}
  \mathbf{1}_S(u) = \left\{
    \begin{array}{ll}
      1\quad \mathrm{if}\; u\in S\\
      0\quad\mathrm{otherwise}
    \end{array}
  \right.
\end{displaymath}
$\mathbf{1}_S(T)=\sum_{u\in S}\lambda_u$ is simply the expected number
of views of nodes in $S$ for a user walking in the tree. The lambdas
can then be seen as capacities in terms of number of views. $B$ will
denote the overall capacity of the tree: $\mathbf{1}_T(T)$.

\subsection{Problem}

There is a set of $n$ advertisers. For advertiser $i$:
\begin{itemize}
\item $b_i$ is his per-impression value
\item $n_i$ is his requested number of impressions  
\end{itemize}

An assignment is a subset $A$ of advertisers and a collection of subsets of $T$:
$(A_i)_{i\in S}$. The elements of $A_i$ are the vertices of $T$ where the ad of
advertiser $i$ will be displayed. The overall return of advertiser $i$ is simply
$R_i = \mathbf{1}_{A_i}(T)$.

The goal is to find an assignment which maximizes the social welfare:
\begin{equation}\label{eq:opt}
  W = \sum_{i\in A} R_i = \sum_{i\in A} b_i \mathbf{1}_{A_i}(T)
\end{equation} with respect to the constraints:
\begin{equation}\label{eq:constraint} 
  \forall A_i,\; \mathbf{1}_{A_i}(T)\geq n_i
\end{equation}
\begin{equation}
  \sum_{i\in A} \mathbf{1}_{A_i}(T) \leq B
\end{equation}
without any other constraint, the optimization problem is trivial:
select the advertiser whose $b_i n_i$ is the greatest and place his ad
on all the nodes of the tree. However, by doing this
$\mathbf{1}_{A_i}(T)$ could be much greater than $n_i$. There are
three ways of further refining the problem to get a reasonable
solution:
\begin{enumerate}
\item\label{over-serve} \emph{Free disposal:} replacing $\mathbf{1}_{A_i}(T)$ by $n_i$ in
  \eqref{eq:opt}. In other words, even if the publisher over-serve an
  advertiser's request, he will not receive more than what the advertiser was
  initially willing to pay for.
\item\label{min-screen} \emph{One screen margin:} adding the constraint $\mathbf{1}_{A_i}(T)\leq n_i +
  \max_{x\in A_i} \lambda_x$: this simply expresses that removing any screen to
  the set of screens assigned to advertiser $i$ will lead to under-serving his
  request.
\item\label{fractional} \emph{Fractional allocation:} allow fractional occupation of a screen by an ad: in
  addition to choosing a set of screen for advertiser $i$, you can also choose
  for each screen a fraction of the time this ad will be displayed on the
  screen. This way, the screen can be share between several ads by doing a
  simple rotation of the ads. The main advantage of fractional allocation of the
  screens is that it allows the publisher to meet exactly the advertiser's
  request.
\end{enumerate}

\subsection{Trivial Case}
From now on we will assume \emph{Fractional allocation}. In this case we
can now serve each request exactly, \eqref{eq:constraint} is
automatically true, and we only need to find a subset $A$ of
advertisers that maximizes:
\begin{displaymath}
  \sum_{i\in A} b_i n_i
\end{displaymath}
subject to the constraint:
\begin{displaymath}
  \sum_{i\in A} n_i \leq B
\end{displaymath}
This is nothing else than a knapsack problem where $b_i n_i$ is the
value of object $i$, $n_i$ is its weight and $B$ is the capacity of
the knapsack.

\paragraph{Remarks}

\begin{itemize}
\item the placement of ads has no importance: the tree is simply seen
  as a pool of available number of views that can be filled
  continuously in any order.
\item the topology of the graph has no impact on the problem.
\end{itemize}

\subsection{Contiguous placement}

To make the problem more interesting we add the following two
constraints:
\begin{enumerate}
\item\label{con:cont} if one ad appears on more than one screen in a branch, the screens must be contiguous
\item\label{con:limit} no ad must be seen on more than two screens
\end{enumerate}

\subsubsection{Case of the path}

An allocation will be called \emph{feasible} if it satisfies the two
afore-mentioned constraints.

\begin{lemma}
  Reordering the ads by decreasing order of $n_i$ in a feasible
  allocation only breaks the constraint \ref{con:limit}. by one screen:
  no ad is seen on more than three contiguous screens.
\end{lemma}

\begin{proof}
  Case disjunction and lots of trivial inequalities but it works. The
  fact that the capacities of the nodes are decreasing along a branch is
  important (this is the first time the topology of the graph is used).
\end{proof}

This lemma gives a way to find an approximation of the optimal
solution using dynamic programming: consider the ads by increasing
order of $n_i$ and start packing them from the bottom.

\paragraph{Question:} the $n_i$ are not necessarily integers, we can
do rounding. What is the typical precision needed?

\paragraph{Price of anarchy:} The greedy dynamic programming algorithm
will have a global welfare greater than the optimal feasible
allocation. Looking at how the algorithm proceeds, for each
advertiser, we exceed the optimal solution by at most the capacity of
the node we are currently filling. Does this show that the price of
anarchy is bounded by 2? It seems that this gives a bound on the Price
of Stability. This bound could potentially be proven tight by a simple
example.

\subsubsection{Case of the tree}

Try to find a swapping argument similar to the one used for the path.


\nocite{Johnson:PCKP,DBLP:journals/telsys/ShawCC97}

\bibliographystyle{plain}
\bibliography{papers.bib}

\end{document}