author    Thibaut Horel <thibaut.horel@gmail.com>  2021-03-21 13:21:47 -0400
committer Thibaut Horel <thibaut.horel@gmail.com>  2021-03-21 13:21:47 -0400
commit    66733475b696ad47700a7b36a22c6c2d616d4e0f (patch)
tree      3e354271b28017134fef3a796d52dee5f9c648b4
parent    4de5e8748f82759565c71c6831c7ed123a028553 (diff)
download  reviews-66733475b696ad47700a7b36a22c6c2d616d4e0f.tar.gz
Add JASA review revision
-rw-r--r--  jasa-2019-0653-R1.tex  69
1 files changed, 69 insertions, 0 deletions
diff --git a/jasa-2019-0653-R1.tex b/jasa-2019-0653-R1.tex
new file mode 100644
index 0000000..8591e78
--- /dev/null
+++ b/jasa-2019-0653-R1.tex
@@ -0,0 +1,69 @@
+\documentclass[10pt]{article}
+\usepackage[T1]{fontenc}
+\usepackage[utf8]{inputenc}
+\usepackage[hmargin=1.2in, vmargin=1.2in]{geometry}
+\usepackage{amsmath,amsfonts}
+
+\title{\large Review of \emph{Real-time Regression Analysis of Streaming Clustered Data with Possible Abnormal Data Batches}}
+\date{}
+
+\begin{document}
+
+\maketitle
+
+This is an update of my previous review of the same paper, written after
+reading the authors' revision. Overall, I would like to thank the authors for
+taking my comments and questions into serious consideration and for improving
+the paper accordingly.
+
+\paragraph{1.}
+The main change at the technical level is a clarification of the regime in
+which the number of samples is taken to grow to infinity. There are now two
+distinct regimes: one where the size of each batch is constant and the number
+of batches grows to infinity, and one where the size of the first batch grows
+to infinity (with two sub-regimes depending on whether the sizes of the
+subsequent batches can also grow to infinity).
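+
+Concretely, writing $n_j$ for the size of batch $j$ and $b$ for the number of
+batches, my understanding of the two regimes (in my own notation, which may
+differ slightly from the authors') is:
+\begin{align*}
+&\text{(i)}\quad n_j = n \text{ fixed for all } j, \quad b \to \infty;\\
+&\text{(ii)}\quad n_1 \to \infty, \text{ with sub-regimes }
+\sup_{j \geq 2} n_j = O(1) \text{ or } n_j \to \infty \text{ for } j \geq 2.
+\end{align*}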
+
+The asymptotic analysis of the estimator in the first of these two regimes was
+not covered by the original proof, but in this revision the authors added
+a separate analysis for this case, while also clarifying the proof in the
+other case.
+
+Thanks to this improvement, I have now reached a reasonable level of confidence
+in the correctness of the stated results and believe that the paper is
+technically sound.
+
+\paragraph{} A minor suggestion to improve the argument given on line 17, page
+43 in the appendix, which lacks rigor as currently written: $n$ has not been
+ defined; the argument seems to assume that all batches have the same size,
+ which is not without loss of generality; and it is not clear in which sense
+ the approximation $\simeq$ is to be understood.
+
+ By definition one has $n_j = N_j - N_{j-1}$ hence, defining $N_0=0$:
+ \begin{align*}
+ \sum_{j=1}^{b-1} \frac{n_j}{\sqrt{N_j}} = \sum_{j=1}^{b-1}
+ \frac{N_j-N_{j-1}}{\sqrt{N_j}}
+ \leq
+ \sum_{j=1}^{b-1}
+ \int_{N_{j-1}}^{N_j}
+ \frac{dt}{\sqrt{t}} = \int_{0}^{N_{b-1}}\frac{dt}{\sqrt{t}}
+ = 2\sqrt{N_{b-1}}\,,
+ \end{align*}
+ where the inequality holds since $t\mapsto 1/\sqrt{t}$ is decreasing on each
+ interval $[N_{j-1}, N_j]$, and the integrals telescope into a single
+ integral over $[0, N_{b-1}]$.
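+
+ As a sanity check (my own computation, not taken from the manuscript), this
+ bound is tight up to lower-order terms: with equal batch sizes $n_j = n$, so
+ that $N_j = jn$,
+ \begin{align*}
+ \sum_{j=1}^{b-1} \frac{n_j}{\sqrt{N_j}}
+ = \sqrt{n}\sum_{j=1}^{b-1}\frac{1}{\sqrt{j}}
+ \sim 2\sqrt{n(b-1)} = 2\sqrt{N_{b-1}}
+ \quad\text{as } b\to\infty \text{ (with $n$ fixed)}.
+ \end{align*}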
+
+\paragraph{2.} The authors also clarified the details of how the Newton-Raphson
+method is used, in particular the conditions guaranteeing convergence and the
+convergence criterion used in the numerical experiments.
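+
+For concreteness, this is the generic form of the method (not a claim about
+the authors' exact implementation): writing $\ell$ for the objective at the
+current batch, the iterates are
+\begin{align*}
+\beta^{(k+1)} = \beta^{(k)}
+- \bigl[\nabla^2 \ell(\beta^{(k)})\bigr]^{-1} \nabla \ell(\beta^{(k)}),
+\end{align*}
+stopped at the first $k$ such that $\|\beta^{(k+1)} - \beta^{(k)}\| < \tau$
+for some tolerance $\tau > 0$.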
+
+While I agree that the numerical experiments clearly show that the NR method
+converges quickly in practice, I was not convinced by the authors' explanation
+that there is no need to control the residual error in the theoretical
+analysis, and in particular to ensure that it does not accumulate over the
+iterations of the recursive procedure. The authors claim that this is the
+``conventional practice in the statistical literature'', but my impression is
+that nested procedures (where a subroutine, like NR here, is used in each
+iteration) are becoming increasingly common in online estimation (following
+a similar trend in the fields of stochastic optimization and machine learning)
+and that it is now standard to do an end-to-end analysis of the entire
+procedure, including the error terms accrued at each iteration.
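+
+To make the concern concrete (a schematic sketch of my own, not a claim about
+the authors' estimator): let $\tilde\beta_b$ denote the exact recursive
+estimator after batch $b$ and $\hat\beta_b$ its computed version, and suppose
+each update propagates the previous error with a factor at most $1$ while the
+inexact NR solve adds a residual $\varepsilon_b$, i.e.
+\begin{align*}
+\|\hat\beta_b - \tilde\beta_b\|
+\leq \|\hat\beta_{b-1} - \tilde\beta_{b-1}\| + \varepsilon_b.
+\end{align*}
+Unrolling gives $\|\hat\beta_b - \tilde\beta_b\| \leq
+\sum_{j=1}^{b}\varepsilon_j$, so that, for instance, if the estimator
+converges at the usual $\sqrt{N_b}$ rate, the accumulated residual is
+asymptotically negligible only when $\sum_{j=1}^{b}\varepsilon_j
+= o(N_b^{-1/2})$, a condition on the per-batch tolerance that the analysis
+could make explicit.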
+\end{document}