## Randomized QuickSort

Randomized QuickSort is an efficient sort algorithm requiring only $\Theta(n\log n)$ expected time on any input.

## Randomized Select

Randomized Select is an efficient algorithm to find $i$th smallest element of Array[low…high] with only linear expected time on any input.

## Analysis

### Randomized QuickSort

Let $T(n)$ = the random variable for the running time of randomized QuickSort on an input of size $n$. Assuming those r.v. are independent.
For $k = 0, 1, 2, \cdots, n - 1$, define the indicator r.v.
$X_k = \begin{cases} 1\text{ if RANDOMIZED-PARTITION generates a k:n - k - 1 split,}\newline 0~otherwise \end{cases}$
Then we can simplify the $T(n)$ equations:
\begin{align*} T(n) &= \begin{cases} T(0) + T(n - 1) + \Theta(n)\text{ if 0:n - 1 split},\newline T(1) + T(n - 2) + \Theta(n)\text{ if 1:n - 2 split},\newline \cdots\newline T(n - 1) + T(0) + \Theta(n)\text{ if n - 1:0 split}\newline \end{cases}\newline &= \sum_{k = 0}^{n - 1}X_k\left(T(k) + T(n - k - 1) + \Theta(n)\right) \end{align*}
Take expectations of both sides:
\begin{align*} E[T(n)] &= E\left[\sum_{k = 0}^{n - 1}X_k\left(T(k) + T(n - k - 1) + \Theta(n)\right)\right]\newline &= \sum_{k = 0}^{n - 1}E\left[X_k\left(T(k) + T(n - k - 1) + \Theta(n)\right)\right] \end{align*}
Since independence of $X_k$ from other random choices,
\begin{align*} E[T(n)] &= \sum_{k = 0}^{n - 1}E\left[X_k\right]E\left[T(k) + T(n - k - 1) + \Theta(n)\right]\newline &= \frac{1}{n}\sum_{k = 0}^{n - 1}E\left[T(k) + T(n - k - 1) + \Theta(n)\right]\newline &= \frac{1}{n}\sum_{k = 0}^{n - 1}E\left[T(k)\right] + \frac{1}{n}\sum_{k = 0}^{n - 1}E\left[T(n - k - 1)\right] + \frac{1}{n}\sum_{k = 0}^{n - 1}E\left[\Theta(n)\right]\newline &= \frac{2}{n}\sum_{k = 0}^{n - 1}E\left[T(k)\right] + \Theta(n)\newline \end{align*}
Absorb the $k = 0$ terms in $\Theta(n)$, so we get
$E[T(n)] = \frac{2}{n}\sum_{k = 1}^{n - 1}E\left[T(k)\right] + \Theta(n)$
Prove $E[T(n)] \leq an\log n$ for constant $a > 0$:

Substitue inductive hypothesis

\begin{align*} E[T(n)] &\leq \frac{2}{n}\sum_{k = 1}^{n - 1}ak\log k + \Theta(n)\newline \sum_{k = 1}^{n - 1}ak\log k &< \int_{k = 1}^nak\log k \mathrm dk\newline &= (\frac{1}{2}ak^2\log k - \frac{1}{4}ak^2)|_1^n\newline &= \frac{1}{2}an^2\log n - \frac{1}{4}an^2 \end{align*}
So that,
\begin{align*} E[T(n)] &\leq \frac{2}{n}\left(\frac{1}{2}an^2\log n - \frac{1}{4}an^2\right) + \Theta(n)\newline &= an\log n - (\frac{1}{2}an - \Theta(n))\newline \end{align*}
if $a$ is chosen sufficient large so that $\frac{1}{2}an$ dominates the $\Theta(n)$,
\begin{align*} E[T(n)] &\leq an\log n \end{align*}
Thus, randomized QuickSort has a $\Theta(n\log n)$ expected sorting time.

### Randomized Select Analysis

Let $T(n)$ = the random variable for the running time of randomized select on an input of size $n$. Assuming those r.v. are independent.
For $k = 0, 1, 2, \cdots, n - 1$, define the indicator r.v.
$X_k = \begin{cases} 1\text{ if RANDOMIZED-PARTITION generates a k:n - k - 1 split,}\newline 0~otherwise \end{cases}$
Then we can simplify the $T(n)$ equations:
\begin{align*} T(n) &= \begin{cases} T(max\{0, n - 1\}) + \Theta(n)\text{ if 0:n - 1 split},\newline T(max\{1, n - 2\}) + \Theta(n)\text{ if 1:n - 2 split},\newline \cdots\newline T(max\{n - 1, 0\}) + \Theta(n)\text{ if n - 1:0 split}\newline \end{cases}\newline &= \sum_{k = 0}^{n - 1}X_k\left(T(max\{k, n - k - 1\}) + \Theta(n)\right) \end{align*}
Take expectations of both sides:
\begin{align*} E[T(n)] &= E\left[\sum_{k = 0}^{n - 1}X_k\left(T(max\{k, n - k - 1\}) + \Theta(n)\right)\right]\newline &= \sum_{k = 0}^{n - 1}E\left[X_k\left(T(max\{k, n - k - 1\}) + \Theta(n)\right)\right] \end{align*}
Since independence of $X_k$ from other random choices,
\begin{align*} E[T(n)] &= \sum_{k = 0}^{n - 1}E\left[X_k\right]E\left[T(max\{k, n - k - 1\}) + \Theta(n)\right]\newline &= \frac{1}{n}\sum_{k = 0}^{n - 1}E\left[T(max\{k, n - k - 1\}) + \Theta(n)\right]\newline &= \frac{2}{n}\sum_{k = \lfloor\frac{n}{2}\rfloor}^{n - 1}E\left[T(k)\right] + \Theta(n)\newline \end{align*}
Prove $E[T(n)] \leq cn$ for constant $c > 0$:

Substitute inductive hypothesis

\begin{align*} E[T(n)] &\leq \frac{2}{n}\sum_{k = \lfloor\frac{n}{2}\rfloor}^{n - 1}ck + \Theta(n)\newline &\leq \frac{2}{n}(\frac{3}{8}cn^2) + \Theta(n)\newline &= cn - (\frac{1}{4}cn - \Theta(n)) \end{align*}
if $c$ is chosen sufficient large so that $\frac{1}{4}cn$ dominates the $\Theta(n)$,
\begin{align*} E[T(n)] &\leq cn \end{align*}
Thus, randomized select has a linear expected time.