# General question on approach to showing sample size required is $\Omega(\cdot)$

Problem Detail:

I was not able to find a similar question in the archives, so pardon this relatively elementary question if it has been asked before.

Let $E_n$ be some event that depends on the sample size $n$, and suppose we are looking for a bound on how large $n$ must be to guarantee $Pr(E_n)\geq 1-\varepsilon$ for some $\varepsilon\in(0,1)$. For example, suppose we are flipping a biased coin that lands heads with probability $p>1/2$, and we define the event $E_n = \{\text{# of heads}>\frac{n}{2}\}$. Then we are looking for $n\in O(f(p))$ (for fixed $\varepsilon$), since the required sample size depends on the bias $p$. One approach I have seen is to upper bound the probability of $E_n$ NOT occurring, require this upper bound to be at most $\varepsilon$, and solve for $n$, obtaining $n\geq\text{ something involving } p$. My interpretation is that it is *sufficient* for $n$ to be this large to guarantee $Pr(E_n)\geq 1-\varepsilon$, which is why this gives an upper bound on the required sample size.
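To make the sufficiency argument concrete, here is a small sketch for the coin example (the function names and the particular values $p=0.6$, $\varepsilon=0.05$ are my own choices, not from the question). It uses the Hoeffding-style bound $Pr(\text{# of heads}\leq n/2)\leq e^{-2n(p-1/2)^2}$; setting the right-hand side to at most $\varepsilon$ and solving gives $n\geq\frac{\ln(1/\varepsilon)}{2(p-1/2)^2}$, and the exact binomial probability confirms this $n$ suffices:

```python
from math import comb, log, ceil

def p_majority_heads(n: int, p: float) -> float:
    """Exact Pr(# of heads > n/2) for n tosses of a p-biased coin."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

def hoeffding_n(p: float, eps: float) -> int:
    """Smallest n satisfying the sufficient condition derived from
    Pr(# of heads <= n/2) <= exp(-2n(p - 1/2)^2) <= eps,
    i.e. n >= ln(1/eps) / (2 (p - 1/2)^2)."""
    return ceil(log(1 / eps) / (2 * (p - 0.5) ** 2))

p, eps = 0.6, 0.05          # illustrative values, not from the question
n = hoeffding_n(p, eps)
print(n, p_majority_heads(n, p))  # the exact probability exceeds 1 - eps
```

Note that the bound is only sufficient, not tight: the exact probability at this $n$ is typically well above $1-\varepsilon$, which is exactly why a separate argument is needed for a lower bound on $n$.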

My question then is this: if my interpretation above is correct, is there an analogous approach to showing a lower bound on the required sample size, i.e., that $n\in\Omega(g(p))$ samples are necessary?

###### Answered By : Yuval Filmus

There is a difference between the two questions. Let $p_n$ be the number of heads in $n$ trials divided by $n$. We know that for large $n$, $p_n$ is very close to $p$; qualitatively this is the law of large numbers, and quantitatively this is the central limit theorem. This phenomenon is known as concentration. This is what you use to give an upper bound on $n$. In contrast, to give a lower bound on $n$ you need to show anti-concentration: that for small $n$, $p_n$ has a reasonable probability of being somewhat far from $p$. The techniques for showing anti-concentration are often more delicate.

In your particular example, concentration can be shown using classical inequalities such as Chernoff's inequality. For anti-concentration you can use the Paley–Zygmund inequality, but you won't necessarily get good bounds. To get nearly tight bounds you can use a general-purpose double-sided inequality such as Berry–Esseen, or a specialized one à la de Moivre–Laplace.
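To illustrate the anti-concentration side numerically, here is a sketch of my own (not part of the answer; the bias $\delta = 0.01$ is an arbitrary choice). Computing the exact failure probability $Pr(\text{# of heads}\leq n/2)$ for $p = 1/2 + \delta$ shows it stays bounded away from $0$ until $n$ approaches the $1/\delta^2$ scale, consistent with the $n = \Theta(1/(p-1/2)^2)$ behavior that de Moivre–Laplace / Berry–Esseen estimates predict:

```python
from math import comb

def p_at_most_half(n: int, p: float) -> float:
    """Exact Pr(# of heads <= n/2): the 'failure' event for majority."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(0, n // 2 + 1))

delta = 0.01                # bias; p = 1/2 + delta
p = 0.5 + delta
# All of these n are well below the ~1/delta^2 = 10000 scale,
# and the failure probability remains far above, say, eps = 0.05.
for n in (11, 101, 1001):
    print(n, p_at_most_half(n, p))
```

The failure probability decreases with $n$, but only slowly: even at $n = 1001$ it remains a sizable constant, so no $n = o(1/\delta^2)$ can drive it below a small fixed $\varepsilon$. This is the quantitative content of the lower bound.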