tcc-2019-232.txt


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246

Overall grade: 5
Confidence level: 3
Technical: 3
Editorial: 1

Summary of the paper
--------------------

This paper considers the problem of building pseudorandom functions (PRF) from
non-adaptive pseudorandom functions (naPRF). The main result rules out the
existence of a fairly general class of one-call fully black box constructions
(in the sense of [RTV04]). Specifically, the authors consider constructions
with the following structure:

1. a (possibly seeded) preprocessing function is applied to the input

2. the preprocessed input is fed to the naPRF, importantly (and this
   is the only real restriction compared to arbitrary one-call constructions)
   the seed used in computing the naPRF is a uniform random variable
   independent of the input, i.e. it is what the naPRF "expects" as a seed

3. a (possibly seeded) post-processing function is applied to the output of the
   naPRF

The proof of the main result uses the standard fact that fully black-bock
constructions relativize: for a one-call construction F of the type described
above, the authors construct an oracle relative to which a naPRF exist but for
which F is not secure against adaptive adversaries. This oracle is similar to
the oracle constructed in (Myers 2004) and can be informally described as
follows:

1. the first part of the oracle is a uniformly random (seeded) oracle O, which
   in particular is an exponentially strong PRF

2. the second part of the oracle, R, acts as an "inverter" for F, but *only* to
   an adversary which is able to prove that it is making adaptive queries to F

The authors then proceed to prove that relative to (O, R):

1. O is a naPRF. This is the most challenging part and requires showing that
   a non-adaptive adversary cannot take advantage of R. This is achieved by
   showing how, to a non-adaptive adversary, R can be (approximately) simulated
   efficiently.

2. F instantiated with O is not a PRF. This part is fairly straightforward
   since R was designed specifically to break F^O.

The above proof is only valid when the preprocessing function satisfies
a "universality" property. In the last part of the paper, the authors show how
this restriction can be removed in specific cases.

Strengths
---------

* The question of building PRFs from naPRFs is important and this paper nicely
  complements a small set of results ruling out natural classes of
  constructions.  In particular it shows that the one-call construction of
  (Berman, Haitner 2015) cannot be proved secure in a black-box manner when the
  naPRF is only secure to PPT adversaries.

* The class of one-call construction which is introduced in this paper is
  interesting. In particular, the authors convincingly show that a subtle
  interplay between the pre-processing and post-processing functions requires
  both a careful analysis and additional assumptions (like c-universality).

Weaknesses
----------

* Simply ruling out one-call constructions is arguably restricted. The authors
  make a convincing case that this is already non-trivial and progress compared
  to what is already known. However, it could have been nice to explain in
  greater detail the main challenges in extending this result to arbitrary
  constant number of calls.

* The paper seem to have been put together rather hastily and would benefit
  from a thorough proofreading pass (more on this below).

Technical inaccuracy
--------------------

1. Although I am quite confident that the results and proof is correct, it seems
that there is an inaccuracy in the proof:

* it starts in Definition 5, p. 13 where the definition of the good seeds
  requires |C(s, X_1)| > l/(c-1) (and similarly for X_2 and X). This is
  inconsistent with the informal description given above the definition which
  requires that there is no c-way collision in C(s, X_1). Note that the absence
  of a c-way collision implies |C(s, X_1) > l/c-1) but the converse it not
  true. 

* the previous point becomes problematic in appendix E.2 (proof of Lemma 12),
  where it is claimed that |C(s, T_1)|> l/(c-1) implies that when considering
  a subset I of l/2 indices, there must be at least l/2(c-1) distinct values in
  |C(s, T_1[I])|. This is not true, since C(s, T_1) could for example comprise:

  * l/(c-1) distinct values in the first l/(c-1) indices
  * the same value repeated in the next l - l/(c-1) indices

  Clearly, the set containing the last l/2 indices will not have the desired
  property (for, say, c greater than 4)

For this reason, I believe that Definition 5 should be fixed to reflect the
original intended meaning (absence of c-way collision). Then, the entire proof
should go through after replacing all the instances (not just in appendix E.2)
of the clause "|C(s, X)|> l/(c-1)" by the clause "there is no c-way collision
in C(S, X)". And only using that "|C(s, X)| > l/(c-1)" when necessary as
a property implied by the absence of a c-way collision. 

In particular, Definition 12 (Badf) and the Proof of Lemma 15 (p. 34) needs to
be fixed in a similar manner.

Comments on the presentation
----------------------------

Although the general structure is good, the paper suffers at times from
inconsistencies and redundancies which interfere with the flow of the paper:

* the security parameter is called lambda in the introduction, but then it
  seems that n plays the role of lambda (or is it n+sigma?)

* the class of one-call constructions considered in the paper is described
  three times, which is at least one too many. I suggest the one at the
  beginning of section 2.1 be removed.

* beta-sparse and beta-good are used quite inconsistently. In particular, in
  section 2.2, an informal description of a beta-good seed is given, but the
  description given turns out to correspond to what is later defined as
  beta-sparse. However, section 2.2 does require the notion of beta-good to be
  correct.

* the choice of name 1/2-good for elements in the co-domain of G is
  questionable since it could be confused with the notion of a beta-good seed.

* the formalism of function families in section 3 is unnecessarily complicated:

  - splitting F into two functions F.Kg and F.Eval is unnecessary since in all
	cases considered in the paper the key distribution is uniformly random.
	I recommend having only F taking as input a uniformly random key. In fact,
	this is how it is done in most places in the proof, and the function F.Kg
	is not used consistently.

  - the functions do not need to take 1^n as an extra input since it can be
	inferred from the key length. In fact, this input is already dropped in
	most places, so I would recommend not introducing it at all.

* the notation H^O for the naPRF in the relativized world seems unnecessary
  since it is just a different name for O

* the overview in Section 2.2 is quite confusing and only became clear after
  reading the formal proof:

  - throughout it says that the oracle R provides requests or challenges. This
	is counterintuitive since we usually think of an oracle as something which
	is queried, not the converse. It could make sense to, already at this
	stage, describe the "three-part" structure of the oracle, whose first two
	parts effectively can be thought of as providing challenges.

  - introducing the three-part structure would also be a good way to explain
	that the oracle is designed to force the adversary to make adaptive queries
	by committing to a value y before receiving the input to the next query. At
	the moment this point is a bit lost in the overview and only described by
	the somewhat cryptic "R issues x_{i+1} only after learning y_i".

* the flow of the proof is a bit hard to follow since it is split in many
  lemmas with many forward pointers. I wonder whether some re-ordering could
  make it easier to follow.
  
* the proof could also be factored better: I believe the goal of Lemma 4 is to
  factor out parts of the proof, but it does not quite achieve this goal. For
  example the proof of Lemma 12 is extremely similar to the proof of Lemma
  4 but does not use it. Also, whenever Lemma 4 is invoked, the preliminary
  work to invoke it is identical, so I suspect that even more could be factored
  out with a slightly more general lemma.

* The quantity 2l/4c which appears frequently should be simplified to l/2c.

* Lemma 11: I believe the factor q is unnecessary since there is at most one
  query where X_2 is sampled. So instead of a union bound, it seems that one
  could the disjoint events "X_2 is sampled at query i" for i in {1, ..., q}.

Typos and minor inaccuracies
----------------------------

The following list is far from being exhaustive:

* Section 1, paragraph "The BH approach", single-line equation: k should be
  replaced by s

* Section 1, paragraph "Our results in detail": "which often requires the input
  to H to (be) fresh" (missing word)

* Section 2.2, paragraph "Security of H revisited": "considering only
  beta-goods" --> "considering only beta-good seeds"

* Section 2.2, paragraph "Security of H revisited", last sentence:
  "beta-sparse" --> "beta-good" (see also above comment on inconsistent use of
  beta-sparse/good)

* Section 4.2, last paragraph: "closely to related to" --> "closely related to"
  (extra "to")

* Section 4.2, last paragraph: "the challenge in the" --> "the challenge"

* Section 4.3, first paragraph: "Putting Propositions 1 and 2 (together)"
  (missing word)

* Section 5.4, first paragraph: "The challenge is (to)" (missing word)

* Appendix D: the set Q^* which is used throughout has not been formally
  defined. In particular, Fig. 4 should show how it is computed.

* Proof of Lemma 11: beta-good should be replaced by beta-sparse throughout

* Proof of Lemma 11: both products should be indexed starting from i=0

* "Proof overview" of Lemma 14: "of size at least l(k+1)/2c(c-1)", the k should
  be replaced by c

* "Proof overview" of Lemma 15, last single-line equation: there is a missing
  2.sigma multiplicative factor in front of the last summand

* Appendix E.2: "it must be that there are at least l/2(k-1)", the k should be
  replaced by c

* Appendix E.2: "of the Z_1[i]'s (Let Z_1[i])" some extra words should be
  removed

* Appendix E.2: "where t = |J_s| \geq l/2c - l/2(c-1)" it should be "l/2(c-1)
  - l/2c"

* Proof of Lemma 14, last line: t= l(c+1)/2c(c-1) (there is a missing (c+1)
  factor)

* Proof of Lemma 15: what is proved is an upper bound  on Pr[Bad_2 and
  j < t | not Bad] (and similarly for j > t + p -1). In other words, the
  condition on j should be put on the left-hand side of the conditioning.

* Appendix G, equation (17): l^* and delta do not exist in this context.
  delta*l^* should be replaced by l/c

Summary of recommendation
-------------------------

The problem and the results are interesting. I believe this paper will be
a valuable contribution to TCC if the above comments are addressed and after
a thorough pass of proofreading.