Latest update

# Random Binary Strings, Time until All Differ

2017-12-03 03:08:47

Let $\{X_i\}_1, \{X_i\}_2, \ldots, \{X_i\}_n$ be random binary strings where for all strings $j$ and all positions $i$, independently of all other bits on all strings, $X_{i, j} \sim Ber(1/2)$. Assume that we generate the strings by sequentially drawing the $i$-th bit independently for all strings. In this setting, let $T$ denote the first instance in time at which no two strings are equal. I wish to argue that

P[T \geq 2\log_2{n} + 1] \leq 1/4.

In particular I am interested if Markov's inequality will do the job. To obtain a handle on $E[T]$, let $Y_i$ count the distinct strings pairs at time $i$. Clearly we have reached $T$ iff $Y_T = 0$.

Exploiting linearity of expectation, it is easily shown that $E[Y_{i + 1} | Y_i = k] = k/2$ and hence $E[T] \leq \log_2{y_0}$. Since $y_0 \approx n^2$ this allows however only for

P[T \geq 2\log_2{n} + 1] \leq \frac{E[T]}{2\log_2{n} + 1} \leq \frac{2\log_2{n}}{2\log_2{n} + 1},

\