World's most popular travel blog for travel bloggers.

# [Solved]: proving the error bound for a hypothesis

, ,
Problem Detail:

Given a hypothesis $h:X\rightarrow Y$ ($h$ is returned by an Empirical Risk Minimization (ERM) strategy with realizable case i.e. $h$ is consistent with the sample examples) over $X=[0,1]\subseteq R$ where $Y=\{0,1\}$ and $D$ is the uniform distribution over $X$. How can I prove that for the accuracy parameter $0\leq\epsilon\leq 1$ we might get $err_D(h)>\epsilon$? where $err_D(h)$ is the true error of $h$ in the distribution $D$.

This is a homework and all I can do is using probabilities (as its shown in the course slides and other sources) to give some bounds on the error (i.e. $P(err_D(h)> \epsilon)\leq k(1-\epsilon)^m$) where $k$ is the number of bad hypotheses $|H_B|$ s.t. ($\bar{h}\in H_B | err_D(\bar{h})>\epsilon)$ and $m$ is the sample size.

MY question is: is there any other way (beside using probabilities) to prove this? It just seems odd to me when the question asks for a prove and the answer is about probability bound. I am new to the field of learning theory and this might seems like a silly question but I am really trying to understand the intuition behind using probabilities.

Consider a case where the actual label is $1$ for all $x\in [0,1]$. Suppose $H$ consists of two classifiers: one that realizes this (i.e. $\forall x\in [0,1], h(x)=1$), and one that is almost entirely wrong: $g(x)=0$ except for some finite set of numbers $S$.

If the sample is exactly $S$, then both classifiers have the same error on the sample set (0), so you may choose $g$. However, you'll get that $err_D(g)=1>\epsilon$.

As mentioned in the comments, this doesn't really prove anything useful, since the probability that the sample will be exactly $S$ is $0$.