diff --git a/DECSUP00465.tex b/DECSUP00465.tex
new file mode 100644
index 0000000..ab2e99f
--- /dev/null
+++ b/DECSUP00465.tex
@@ -0,0 +1,156 @@
+\documentclass[12pt]{article}
+\usepackage[T1]{fontenc}
+\usepackage[utf8]{inputenc}
+\usepackage[hmargin=1in, vmargin=1in]{geometry}
+
+\title{Review for \emph{Optimal Pricing in E-Commerce Based on Sparse and Noisy Data}}
+\date{}
+\begin{document}
+\maketitle
+
+\paragraph{Problem and results.} In this paper, the authors study the problem
+of determining the optimal price of products in the context of online stores.
+The goal is to use past data (consisting of prices and the associated mean daily
+revenue/profit) to propose new (approximately) optimal prices. The main
+challenges are that the data is sparse (some products have few
+data points) and noisy (mean daily revenue varies a lot based on exogenous
+factors), and that the function to optimize (revenue as a function of the
+chosen price) is both unknown and complex with possibly multiple local optima.
+The proposed method proceeds as follows:
+\begin{itemize}
+ \item use bootstrap estimation to build an estimator of the probability
+ that an observed price in the dataset is optimal.
+ \item use kernel regression to extrapolate this function to all prices:
+ this step gives a way to compute, for any (possibly unobserved) price,
+ the probability that it is optimal.
+ \item use additional data (\emph{e.g.}, products from the same category or
+ competitor prices) to build a prior distribution on the probability
+ that a price is optimal. The models used for this are decision trees.
+ \item use the Metropolis-Hastings algorithm to sample prices from the
+ posterior distribution (a rough sketch of the full pipeline is given
+ just after this list).
+\end{itemize}
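+
+To fix ideas, here is a minimal sketch of how I understand these four steps
+fitting together. The function names, the Gaussian kernel and the random-walk
+proposal are my own illustrative choices rather than the authors'
+implementation, and the tree-based prior of the third step is left as
+a placeholder:
+\begin{verbatim}
+import numpy as np
+
+def boot_prob_above(y, threshold, B=1000, rng=None):
+    # Step 1 (bootstrap): fraction of resampled means reaching the threshold,
+    # used as an estimate of the probability that this price is good enough.
+    rng = rng if rng is not None else np.random.default_rng(0)
+    y = np.asarray(y, dtype=float)
+    means = [rng.choice(y, size=len(y), replace=True).mean() for _ in range(B)]
+    return float(np.mean(np.array(means) >= threshold))
+
+def kernel_smooth(obs_prices, obs_probs, grid, h=1.0):
+    # Step 2 (kernel regression): Nadaraya-Watson smoothing to extend the
+    # estimates from the observed prices to an arbitrary grid of prices.
+    obs_prices, obs_probs = np.asarray(obs_prices), np.asarray(obs_probs)
+    grid = np.asarray(grid)
+    w = np.exp(-0.5 * ((grid[:, None] - obs_prices[None, :]) / h) ** 2)
+    return (w * obs_probs).sum(axis=1) / w.sum(axis=1)
+
+def sample_prices(log_posterior, start, n_samples=5000, step=2.0, rng=None):
+    # Step 4 (Metropolis-Hastings): random-walk sampling from the posterior
+    # obtained by combining the smoothed estimate with the prior of step 3.
+    rng = rng if rng is not None else np.random.default_rng(1)
+    x, samples = float(start), []
+    for _ in range(n_samples):
+        x_new = x + step * rng.standard_normal()
+        if np.log(rng.random()) < log_posterior(x_new) - log_posterior(x):
+            x = x_new
+        samples.append(x)
+    return np.array(samples)
+\end{verbatim}
+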
+After describing their method, the authors present experimental results, both
+from simulations and from a deployment on a large European e-commerce website,
+and show that the new method performs better than baselines and led to
+a significant increase in the revenue of the company where it was deployed.
+
+\paragraph{Comments on the proposed method.} The proposed method uses fairly
+standard ideas from statistics/machine learning which, apparently, have not
+been used so far in the context of optimal pricing in e-commerce. I am not
+knowledgeable about the e-commerce literature, so I am basing this observation
+on the related work section. Each step is properly motivated by the authors and
+tailored to the specific challenges and features of the problem at hand.
+However, my main concerns regarding the method are the following:
+\begin{itemize}
+ \item Instead of directly optimizing the (unknown) revenue/profit function,
+ the authors choose to focus on maximizing the probability that a given
+ price is optimal. However, directly estimating the probability that
+ a price is optimal is infeasible, so instead, the method optimizes the
+ probability that a given price leads to a revenue (or a combination of
+ revenue and profit) above some threshold, as defined in Equation (5).
+ It is really unclear what this achieves: in particular, the algorithm
+ could get stuck selecting suboptimal prices in which it is very
+ confident, and fail to explore prices which could be better but for
+ which the confidence is currently very low (see the toy example after
+ this list, and my comment below about the exploration/exploitation
+ trade-off).
+ \item Related to the previous point, very little indication is given about
+ the choice of the threshold introduced in Equation (5). This is
+ problematic since if it is chosen too low, then the
+ algorithm could get stuck at suboptimal prices. The authors indicate
+ that the threshold can be chosen based on previously observed data
+ points, which I think exacerbates the risk of getting stuck before
+ performing sufficient exploration.
+ \item The paper mentions ``dynamic pricing'' several times, and
+ this is indeed a key feature of the problem at hand. However, the
+ proposed method seems highly static to me: the main algorithm considers
+ all data points available so far and computes prices for the next day
+ (or next time period); this needs to be rerun whenever new prices are
+ needed. It would be nice to show how this procedure can be turned into
+ a feedback loop where new data points are efficiently incorporated. For
+ example, is it the case that the probabilities can be efficiently
+ updated when new data points are observed? Is the approach fully
+ Bayesian in the sense that the computed probabilities are used as the
+ prior distribution for the next step, or does the method always start
+ with the prior distribution given by the decision trees?
+ \item Even if each step is fairly standard and theoretically well
+ understood, the entire procedure combines them in a way which is hard
+ to reason about. As mentioned before, the threshold in Equation (5)
+ depends on the observed data and seems to lead to insufficient
+ exploration. This
+ could be counterbalanced by the proposal distribution in the
+ Metropolis-Hastings algorithm, but it is unclear what the
+ overall procedure converges to (see my comment below about the
+ absence of theoretical results).
+\end{itemize}
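+
+On the first two points, here is a toy example (with made-up numbers, reusing
+the \texttt{boot\_prob\_above} sketch above) of the failure mode I am worried
+about:
+\begin{verbatim}
+# Threshold of 100; price A is well explored and consistently a bit above it,
+# while price B is better on average but has only three noisy observations.
+rev_a = np.repeat([101.0, 103.0, 107.0, 109.0], 50)   # mean 105
+rev_b = np.array([30.0, 60.0, 320.0])                 # mean ~137
+boot_prob_above(rev_a, threshold=100.0)   # = 1.0
+boot_prob_above(rev_b, threshold=100.0)   # roughly 0.70
+# A rule that always posts the price maximizing this probability sticks with
+# price A and never gathers the data that would reveal price B as better.
+\end{verbatim}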
+
+\paragraph{Absence of theoretical results.} Related to the last point above, it
+is regrettable that the proposed method lacks theoretical guarantees. Even
+though it is known that each individual step converges/is optimal in the limit
+of infinitely many data points, the steps are combined in a way which makes it
+hard to be convinced that the entire procedure converges to a good solution in
+the limit.
+
+
+\paragraph{Additional remark on the overall approach.} At a high level, the
+problem is an instance of a more general one: optimizing an
+unknown function based on noisy observations of its values. This is
+a well-studied problem whose main challenge is the so-called
+``exploration-exploitation'' trade-off: methods for this problem need to
+balance exploration (to learn enough about the structure of the objective
+function) and exploitation (to make good decisions with respect to what is
+currently known). Relevant keywords which roughly cover this area of research
+include: bandit algorithms, Bayesian bandits, online
+optimization, regret minimization, Thompson sampling, derivative-free
+optimization under noise, optimization with bandit feedback. Many solutions
+proposed in this line of work share some similarity with the method proposed in
+this paper, but tackle the problem in a more principled way. I think the authors
+should at the very least discuss in depth why a new ad hoc solution is needed and
+how it compares to the above line of work. I would also recommend trying to
+frame their problem using one of the above frameworks: for example, the
+framework of regret minimization seems very well suited to describe the problem
+of dynamic pricing where decisions are made over time based on partial
+observations.
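+
+To make the comparison concrete, the kind of principled baseline I have in mind
+is something like Gaussian Thompson sampling over a discretized price grid.
+A minimal sketch follows; the grid, the priors, the noise level and the
+\texttt{observe\_revenue} callback are placeholder assumptions on my part, not
+a claim about what would work on the authors' data:
+\begin{verbatim}
+import numpy as np
+
+def thompson_pricing(price_grid, observe_revenue, n_days,
+                     noise_var=25.0, prior_mean=0.0, prior_var=1e4, rng=None):
+    # One Gaussian posterior per candidate price over its mean daily revenue.
+    rng = rng if rng is not None else np.random.default_rng(0)
+    k = len(price_grid)
+    mean, var = np.full(k, prior_mean), np.full(k, prior_var)
+    for _ in range(n_days):
+        # Sample a plausible revenue value for every price and act greedily on
+        # the sample: this one step balances exploration and exploitation.
+        i = int(np.argmax(rng.normal(mean, np.sqrt(var))))
+        y = observe_revenue(price_grid[i])          # post price i for one day
+        precision = 1.0 / var[i] + 1.0 / noise_var  # conjugate Gaussian update
+        mean[i] = (mean[i] / var[i] + y / noise_var) / precision
+        var[i] = 1.0 / precision
+    return price_grid[int(np.argmax(mean))]
+\end{verbatim}
+In a regret-minimization framing, the natural quantity to report for such
+a procedure (and for the authors' method) is the cumulative gap between the
+revenue of the posted prices and that of the best fixed price in hindsight.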
+
+\paragraph{Comments on the experimental results.} The experiments seem sound
+and well-designed and the proposed method does seem to perform significantly
+better than baselines. More details should perhaps be given about how, and to
+which values, the various meta-parameters were chosen: in particular, the
+bootstrap parameter $B$, the kernel regression parameter $h$, the threshold in
+Equation (5), the $\lambda$ on page 13, and how the decision trees were trained.
+
+\paragraph{Comments on the writing.} The paper is well written overall with
+a structure which is easy to follow. A few points which could improve the
+exposition:
+\begin{itemize}
+\item the caption of Figure 1 (page 4) should include one sentence explaining that
+ the curve is obtained via extrapolation with kernel regression (and refer
+ to the later section where it is explained).
+\item section 3.4.1 on bootstrap is heavy and not very useful in my opinion.
+ It should be assumed that readers are somewhat familiar with
+ bootstrap estimation. I would reduce this section to one paragraph with
+ maybe one or two sentences on the general principle of bootstrap
+ estimation and then a clear explanation of the form: ``we apply
+ bootstrap estimation on data X to estimate Y, which will then be used to
+ do Z''.
+\item the description of Algorithm 1 could be significantly simplified by
+ removing all uses of the $\hat{F}_{boot}$ distribution: here the
+ algorithm only uses an empirical distribution to estimate
+ a probability. All that is really done is counting the fraction of
+ bootstrap samples whose mean target value is above the threshold
+ $\tilde{y}$ (see the one-line restatement after this list). This can
+ be explained in one sentence in English, but is considerably obfuscated
+ by the way Algorithm 1 is written. Furthermore, I think line 9 of
+ Algorithm 1 contains a typo: it should probably read ``Solve
+ $\hat{q}_n = \hat{F}_{boot}(\tilde{y})$''. But again, I would recommend
+ dropping $\hat{F}_{boot}$ altogether.
+\item section 3.5.2 is similarly heavy: it should be assumed that the reader is
+ familiar with the concept of a prior distribution. For example, I would
+ remove the two paragraphs below Equation (8).
+\end{itemize}
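+
+For reference, the one-line restatement of Algorithm 1 that I have in mind is
+the following (assuming I am reading the algorithm correctly; \texttt{y},
+\texttt{y\_tilde}, \texttt{B} and \texttt{rng} stand for the observed targets,
+the threshold $\tilde{y}$, the number of bootstrap samples and a random
+generator):
+\begin{verbatim}
+q_n = np.mean([rng.choice(y, size=len(y), replace=True).mean() >= y_tilde
+               for _ in range(B)])
+\end{verbatim}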
+
+\paragraph{Conclusion.} While the problem is interesting and the proposed
+method works well experimentally, I cannot recommend the paper for publication
+in its current form: the method is not particularly novel and combines fairly
+standard techniques in an ad hoc way which makes it hard to reason about
+convergence, whereas a well-established area of research already contains many
+tools which could lead to a more principled solution to the problem.
+
+\end{document}