\documentclass{article}
\usepackage{amsmath}                            % want AMS fonts
\usepackage{amssymb}                            % use AMS symbols
\usepackage[ruled]{algorithm2e}
\usepackage{url}
\input{p}

%=================================================================
% My macros
%=================================================================
\newcommand{\Zstar}[1]{\ensuremath{\mathbb{Z}_{#1}^*}}

\begin{document}
\newcommand{\PRTWOB}{\mathit{PR2B}}
\newcommand{\hyb}{\mathit{Hyb}}
\newcommand{\RandnB}{\mathit{RandnB}}
\newcommand{\PRnB}{\mathit{PRnB}}
\newcommand{\hybTWOB}{\mathit{Hyb2B}}
\newcommand{\hybridTWOB}{\mathit{Hyb2B}}
\newcommand{\RandTWOB}{\mathit{Rand2B}}
\newcommand{\pk}{\mathit{PK}}
\newcommand{\pr}{\textit{Pr}}
\newcommand{\fake}{\mathit{Fake}}
\newcommand{\enc}{\mathit{Enc}}
\newcommand{\randenc}{\mathit{RandEnc}}
\newcommand{\FakeCiphertext}{\mathit{FakeCiphertext}}


\homework{7}{March 7, 2007}{March 14, 2007}{Anna Lysyanskaya}

\problem{Pseudo Random Generator and the Hybrid Argument}

In class we discussed a pseudorandom generator which outputs only 2
bits, $G_{2B}$.  Let $f$ be a one way permutation, and let $H$ be a hardcore bit function for $f$.  On seed $x\in $domain of $f$, the generator will produce:
$H(x), H(f(x))$.  We showed that even when we also include $f(f(x))$,
the output of this generator on a random seed is indistinguishable
from a random 2 bit string, i.e.:

Define the following 2 distributions:
\begin{displaymath}
D_{\PRTWOB} = \{\pk \leftarrow GPK(1^k); x\leftarrow \mbox{dom}(f_{\pk}): (\pk, f(f(x)), H(x), H(f(x)))\}
\end{displaymath} and
\begin{displaymath}
D_{\RandTWOB} = \{\pk \leftarrow GPK(1^k); x\leftarrow \mbox{dom}(f_{\pk}); b_0, b_1 \leftarrow \{0,1\}: (\pk, x, b_0, b_1)\}
\end{displaymath}
We showed that $D_{\PRTWOB} \approx D_{\RandTWOB}$ and thus $G_{2B}$ is a valid 2 bit PRG.  We did this by considering a third distribution:
\begin{displaymath}
D_{\hybTWOB} = \{\pk \leftarrow GPK(1^k); x\leftarrow \mbox{dom}(f_{\pk}); b_0 \leftarrow \{0,1\}: (\pk, f(x), b_0, H(x))\}
\end{displaymath}

We showed that $D_{\PRTWOB} \approx D_{\hybTWOB}$, and $D_{\hybTWOB} \approx D_{\RandTWOB}$, and we argued that that implied that $D_{\PRTWOB} \approx D_{\RandTWOB}$.

For a review of the proof that $D_{\PRTWOB} \approx D_{\hybTWOB}$, see page 3.


Now, what if we want a PRG that outputs more than 2 bits?  We discussed the
solution in class, but lets work out the details.  

Our $n$ bit PRG will take a seed $x\in\mbox{dom}(f)$, and
output $H(x), H(f(x)), \ldots H(f^{n-1}(x))$, where $f^n$ represents the
operation where $f$ is applied $n$ times (we require that $n$ be polynomial in our security parameter $k$).  To show that this is a valid PRG, it is enough to show that the following 2 distributions are indistinguishable:
\begin{displaymath}
D_{\PRnB} = \{\pk \leftarrow GPK(1^k); x\leftarrow \mbox{dom}(f_{\pk}): (\pk, f^n(x), H(x), H(f(x)), \ldots, H(f^{n-1}(x)))\}
\end{displaymath} and
\begin{displaymath}
D_{\RandnB} = \{\pk \leftarrow GPK(1^k); x\leftarrow \mbox{dom}(f_{\pk}); b_0, b_1, \ldots b_n\leftarrow \{0,1\}: (\pk, x, b_0, b_1, \ldots b_n)\}
\end{displaymath}
(Note that this is actually slightly stronger than what we need for pseudorandomness since here we also give the adversary $f^n(x)$).

We will do that by introducing hybrid distributions for $i = 1, \ldots n-1$.  Hybrid $i$ will choose the first $i$ bits of its output at random and then will compute the next $n-i$ bits as $H(x), \ldots H(f^{n-i-1}(x))$.  More formally:
\begin{displaymath}
D_{\hyb i} =  \{\pk; x; b_1, \ldots b_i \leftarrow \{0,1\}: (\pk, f^{n-i}(x),b_1, \ldots b_i, H(x), H(f(x)), \ldots, H(f^{n-i-1}(x)))\}
\end{displaymath}
Note now that $D_{\hyb 0} = D_{\PRnB}$ and $D_{\hyb n} = D_{\RandnB}$.  

As discussed in class, we know that if we can show that there is some negligible $\nu$ such that for all $i$, no adversary can distinguish $D_{\hyb i}$ from $D_{\hyb i+1}$ with advantage greater than $\nu(x)$, and if $n$ is polynomial in $k$, then that means that $D_{\PRnB} \approx D_{\RandnB}$.

Thus all we need to do is prove that hybrids $i$ and $i+1$ are indistinguishable.  Let's prove that now.  (Hint: we will generalize the proof given in the Class Review section.)
\begin{alphabetize}
\item Fill in the blanks in the following proof outline:
We want to show that: \underline{~~~~~~~~} $\implies$ \underline {~~~~~~~}\\
We will do that by showing the contrapositive, i.e. that: $\neg$ \underline {~~~~~~~} $\implies$ $\neg$\underline {~~~~~~~~}.\\

This means we assume we have an adversary $\A$, which takes as input \underline{ ~~~~} and outputs $0$ or $1$.  

Let $p_{\underline{~~}, 0}$ be the probability that $\A$ outputs $0$
when given a random element from distribution \underline{~~~~}, and
let $p_{\underline{~~}, 0}$ be the probability that $\A$ outputs $0$
when given a random element from distribution \underline{~~~~}.  Then
we are assuming that there exists nonnegligible $\epsilon$ such that
$|p_{\underline{~~}, 0} - p_{\underline{~~}, 0}| = \epsilon(k)$.  Assume
WLOG that $p_{\underline{~~}, 0} > p_{\underline{~~},0}$, i.e. that $\A$
is more likely to output a $0$ when given a random element of
distribution \underline{~~~~} than when its is given a random element
of distribution \underline{~~~~}.

Now we want to show that we can bulid an algorithm $\B$ which takes as
input \underline{~~~~~~~~~~~} and tries to output $0$ if it was given
an instance of distribution \underline{~~~~} and $1$ if it was given
an instance of distribution \underline{~~~~} , and succeeds in
distinguishing with nonnegligible advantage, $\epsilon'$.  Basically, we want to show that 
\begin{displaymath}
|\pr[e \leftarrow D_{\underline{~~}}; b\leftarrow \B(e): b=0] - \pr[e\leftarrow D_{\underline{~~}}; b\leftarrow \B(e): b=0]| = \epsilon'(k) 
\end{displaymath}
for nonnegligible $\epsilon'$.

\item What should algorithm $\B$ do?  What arguments should it send to $\A$?  What should it output if $\A$ returns 0?  What should it output if $\A$ returns 1?

\item What is $\B$'s advantage, i.e. show that $\epsilon'$ as defined above is nonnegligible.  



\end{alphabetize}

\problem{Public Key Encryption}

Suppose we use an $f$ which is a trapdoor permutation to build a PRG
using the construction above.  Then we can use this to construct a
public key encryption scheme:
\begin{itemize}
\item Bob chooses $(\pk,\sk)\leftarrow GPK(1^k)$ and publishes $\pk$
\item Alice wants to encrypt an $n$ bit message $m$.  She chooses a random $x \leftarrow $ domain of $f_\pk$.  Let $m_i$ be the $i$th bit of the message $m$.  Alice  computes $C = (C_0=f^n(x), c_1=H(x) \oplus m_1, c_2=H(f(x)) \oplus m_2, \ldots, c_{n-1}=H(f^{n-2}(x)) \oplus m_{n-1}, c_n=H(f^{n-1}(x)) \oplus m_n)$, and sends that to Bob.
\item Bob receives $C = C_0, c_1, \ldots c_n$.  Bob uses his secret
key to compute $f^{-1}(C_0) = f^{n-1}(x), f^{-1}(f^{-1}(C_0)) =
f^{n-2}(x), \ldots (f^{-1})^n(C_0) = x$.  Then Bob uses these values
to compute $H(x), H(f(x)), \ldots H(f^{n-1}(x))$.  Finally, Bob
computes $H(x) \oplus c_1 = m_1, H(f(x)) \oplus c_2 = m_2, \ldots
H(f^{n-1}(x)) \oplus c_n = m_n$ and thus recovers $m$.
\end{itemize}

Now we will prove that this encryption scheme is secure.  Recall our
definition of security for encryption.  We said an encryption scheme
(GPK, Enc, Dec) is secure if there exists an algorithm FakeCiphertext
such that for all $m$, the following two distribution are equivalent:
\begin{displaymath}
D_{\enc}(m)=\{(PK, SK)\leftarrow GPK(1^k); c \leftarrow \enc(PK, m);
(c,m,PK)\}
\end{displaymath} and 
\begin{displaymath}
D_{\fake}(m)\{(PK, SK)\leftarrow GPK(1^k); \hat{c} \leftarrow
\FakeCiphertext(PK, |m|); (\hat{c}, m, PK)\}.
\end{displaymath}  Note that FakeCiphertext is only
given the length of the messsage, and yet it produces something
indistinguishable from the encryption of the message.  This captures
our idea that encryption should hide all information about the message
(besides its length).

\begin{alphabetize}
\item If we want to show that the encryption scheme described above is secure according to this definition, we need to show that there exists an algorithm $\FakeCiphertext$ as described above.  What should $\FakeCiphertext(\pk, l)$ do?

\item Now lets show that this is indistinguishable from a real
encryption.  First let's consider an alternate encryption algorithm $\randenc$
which on input $(\pk, m)$ chooses a random $x\leftarrow$ domain of $f_\pk$ as above, and then chooses random string $b_1, \ldots b_n$.  It outputs $C= (C_0 = x, c_1 = m_1 \oplus b_1, \ldots c_n = m_n \oplus b_n$.  

Show that this distribution is identical to that generated by $\FakeCiphertext$(i.e. that for all $m$, $D_\randenc(m) \approx D_fake(m)$.

\item Show that the above distribution is indistinguishable from that generated by the true encryption algorithm.  
Recall that we proved in problem 1 that $D_{Random} \approx D_{PR}$.  Let's use this to show that for all $m$, $D_\enc(m) \approx D_\randenc(m)$.  
Fill in the blanks:

We want to show that: \underline{~~~~~~~~} $\implies$ \underline {~~~~~~~}\\
We will do that by showing the contrapositive, i.e. that: $\neg$ \underline {~~~~~~~} $\implies$ $\neg$\underline {~~~~~~~~}.\\

This means we assume we have an adversary $\A$, which takes as input \underline{ ~~~~} and outputs $0$ or $1$.  

Let $p_{\underline{~~}, 0}$ be the probability that $\A$ outputs $0$
when given a random element from distribution \underline{~~~~}, and
let $p_{\underline{~~}, 0}$ be the probability that $\A$ outputs $0$
when given a random element from distribution \underline{~~~~}.  Then
we are assuming that there exists nonnegligible $\epsilon$ such that
$|p_{\underline{~~}, 0} - p_{\underline{~~}, 0}| = \epsilon(k)$.  Assume
WLOG that $p_{\underline{~~}, 0} > p_{\underline{~~},0}$, i.e. that $\A$
is more likely to output a $0$ when given a random element of
distribution \underline{~~~~} than when its is given a random element
of distribution \underline{~~~~}.

Now we want to show that we can bulid an algorithm $\B$ which takes as
input \underline{~~~~~~~~~~~} and tries to output $0$ if it was given
an instance of distribution \underline{~~~~} and $1$ if it was given
an instance of distribution \underline{~~~~} , and succeeds in
distinguishing with nonnegligible advantage, $\epsilon'$.  I.e. we want to show that $|\pr[e \leftarrow D_{\underline{~~}}; b\leftarrow \B(e): b=0] - \pr[e\leftarrow D_{\underline{~~}}; b\leftarrow \B(e): b=0]| = \epsilon'(k)$ for nonnegligible $\epsilon'$.

\item What should algorithm $\B$ do?  What arguments should it send to $\A$?  What should it output if $\A$ returns 0?  What should it output if $\A$ returns 1?

\item What is $\B$'s advantage, i.e. show that $\epsilon'$ as defined above is nonnegligible.  

\end{alphabetize}


\section*{Class Review}
Recall the definition we used in class for a hardcore bit: $H$ is a hardcore bit function for $f$ if the following two distributions are indistinguishable:
\begin{displaymath}
D_{0} = \{\pk \leftarrow GPK(1^k); x\leftarrow \mbox{domain of } f_{\pk}: (\
\pk, f(x), H(x))\}
\end{displaymath} and 
\begin{displaymath}
D_{\frac{1}{2}} = \{\pk \leftarrow GPK(1^k); x\leftarrow \mbox{domain of } f_{\pk}; b\leftarrow \{0,1\}: (\pk, x, b)\}.
\end{displaymath}

We want to show that $H(x)$ is hardcore for $f  \implies D_{\PRTWOB} \approx D_{\hybridTWOB}$\\
i.e. that $D_0 \approx D_{\frac{1}{2}}  \implies D_{\PRTWOB} \approx D_{\hybridTWOB}$\\

We will do that by showing the contrapositive, i.e. that:  $\neg  (D_{\PRTWOB} \approx D_{\hybridTWOB}) \implies \neg (D_0 \approx D_{\frac{1}{2}})$. 



This means we assume we have an adversary $\A$, who takes as input
$(\pk,y,b_1, b_2)$ which is either of the form $(\pk, f(f(x)), H(x),
H(f(x)))$ for randomly chosen $x$, or of the form $(\pk, f(x), b, H(x))$
for randomly chosen bit $b$ and randomly chosen $x$.  $\A$ outputs $0$
or $1$.

Let $p_{\PRTWOB, 0}$ be the probability that $\A$ outputs $0$
when given a random element from distribution $D_{\PRTWOB}$, and
let $p_{\hybridTWOB, 0}$ be the probability that $\A$ outputs $0$
when given a random element from distribution $D_{\hybridTWOB}$.  Then
we are assuming that there exists nonnegligible $\epsilon$ such that
$|p_{\PRTWOB, 0} - p_{\hybridTWOB, 0}| = \epsilon(k)$.  Assume
WLOG that $p_{\PRTWOB, 0} > p_{\hybridTWOB,1}$, i.e. that $\A$
is more likely to output a $0$ when given a random element of
distribution $D_{\PRTWOB}$ than when its is given a random element
of distribution $D_{\hybridTWOB}$.

Now we want to show that we can bulid an algorithm $\B$ which takes as
input $(\pk, y, b')$ which is either of the form $(\pk, f(x), H(x))$
for randomly chosen $x$, or of the form $(\pk, x, b)$ for
randomly chosen $x$ and random bit $b$.  $\B$ tries to output
$0$ if it was given an instance of distribution $D_0$ and $1$ if it
was given an instance of distribution $D_{\frac{1}{2}}$ , and we want
to show that it succeeds in distinguishing with nonnegligible
advantage, $\epsilon'$.  I.e. we want to show that $|\pr[e \leftarrow
D_{0}; \hat{b}\leftarrow \B(e): \hat{b}=0] - \pr[e\leftarrow D_{\frac{1}{2}};
\hat{b}\leftarrow \B(e): \hat{b}=0]| = \epsilon'(k)$ for nonnegligible
$\epsilon'$.

Let's define algorithm $\B$ as follows: on input $(\pk, y, b')$, $\B$ computes $f(y)$ and $H(y)$.  It then runs $\A (\pk, f(y), b', H(y))$.  If $\A$ outputs 0, $\B$ outputs 0, and if $\A$ outputs 1, $\B$ outputs 1.  

Note that, if the input $\B$ recieved was chosen from $D_0$, then it
was of the form $(\pk, f(x), H(x))$ for randomly chosen $x$.  That
means that what $\A$ receives was $(\pk, f(y) = f(f(x)), b' = H(x),
H(y) = H(f(x)))$, which is clearly distributed like $D_{\PRTWOB}$.
Thus, $\A$ will output $0$ with probability $p_{\PRTWOB, 0}$.

On the other hand, if the input $\B$ received was chosen from $D_1$, then it was of the form $(\pk, x, b)$ for randomly chosen $x$ and random bit $b$.  That means that what $\A$ received was $(\pk, f(y) = f(x), b' = b, H(y) = H(x))$, which is clearly distributed like $D_{\PRTWOB}$.  Thus, $\A$ will output $0$ with probability $p_{\hybridTWOB, 0}$.

Thus $\B$'s advantage will be:  
\begin{eqnarray*}
&& |\pr[e \leftarrow D_{0}; b\leftarrow \B(e): b=0] - \pr[e\leftarrow D_{\frac{1}{2}}; b\leftarrow \B(e): b=0]|\\
&=& |\pr[(\pk, y, b) \leftarrow D_{0}; b\leftarrow \A(\pk, f(y), b, H(y)): b=0]\\
& &  - \pr[(\pk, y, b)\leftarrow D_{\frac{1}{2}}; b\leftarrow \A(\pk, f(y), b, H(y)): b=0]|\\
&=& |p_{\PRTWOB, 0} - p_{\hybridTWOB,0}|  \mbox{ by our arguments above}\\
&=& \epsilon(k)
\end{eqnarray*}

Thus, $\B$ can distinguish $D_0$ from $D_{\frac{1}{2}}$ with nonnegligible advantage.





 \end{document}
