Preliminaries

  • A Mathematical Theory of Communication - Shannon 1948
  • Convexity (Jensen's inequality): for convex $f$, $f(\sum_ip_ix_i)\leq \sum_ip_if(x_i)$
    • equivalently, $f(E_pX)\leq E_pf(X)$
    • $f$ is convex if $f''(x)\geq 0$
  • $f(x)=-x\log x$ is concave
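
A quick numerical sanity check of Jensen's inequality and of the concavity of $f(x)=-x\log x$; the weights and sample points below are arbitrary choices for illustration:

```python
import numpy as np

# Jensen's inequality for a convex f: f(E[X]) <= E[f(X)].
p = np.array([0.2, 0.5, 0.3])      # arbitrary weights p_i (a distribution)
x = np.array([1.0, 2.0, 4.0])      # arbitrary points x_i
f = lambda t: t**2                 # convex, since f''(t) = 2 >= 0
print(f(p @ x) <= p @ f(x))        # True

# f(x) = -x log x is concave, so the inequality reverses.
g = lambda t: -t * np.log2(t)
y = np.array([0.1, 0.4, 0.7])      # arbitrary points in (0, 1)
print(g(p @ y) >= p @ g(y))        # True
```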

Entropy

Entropy: $H(X)=-\sum_{x\in\mathcal{X}}p(x)\log p(x)=E_{p}\log\frac{1}{p(X)}$

  • $0\log 0\rightarrow 0$
  • $0\leq H(X)\leq \log|\mathcal{X}|$
  • uniform $X$: $H(X)=\log|\mathcal{X}|$
  • $H_b(X)=(\log_ba)H_a(X)$
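
A minimal sketch of the definition above, assuming a hand-rolled `entropy` helper and an arbitrary example distribution; the prints check the bounds, the uniform case, and the base-change identity:

```python
import numpy as np

def entropy(p, base=2):
    """H(X) = -sum p(x) log p(x), using the convention 0 log 0 = 0."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return -np.sum(p[nz] * np.log(p[nz])) / np.log(base)

p = [0.5, 0.25, 0.125, 0.125]                 # arbitrary example distribution
print(entropy(p))                             # 1.75 bits
print(0 <= entropy(p) <= np.log2(len(p)))     # True: 0 <= H(X) <= log|X|
print(entropy([0.25] * 4))                    # uniform: log2(4) = 2.0 bits
print(np.isclose(entropy(p, base=4),
                 (np.log(2) / np.log(4)) * entropy(p, base=2)))
# True: H_b(X) = (log_b a) H_a(X) with a = 2, b = 4
```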

Joint Entropy: $H(X,Y)=-E\log p(X,Y)=-\sum_{x\in\mathcal{X}}\sum_{y\in\mathcal{Y}}p(x,y)\log p(x,y)$

  • $H(X,X) = H(X)$
  • $H(X,Y) = H(Y,X)$
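
A small numerical check of these properties on an arbitrary joint pmf; the `H` helper below is just the entropy definition restated in code:

```python
import numpy as np

def H(p):
    """Entropy in bits of a pmf given as an array (0 log 0 -> 0)."""
    p = np.asarray(p, dtype=float).ravel()
    nz = p > 0
    return -np.sum(p[nz] * np.log2(p[nz]))

# Arbitrary joint pmf p(x, y); rows are values of X, columns values of Y.
pxy = np.array([[1/8, 1/16, 1/16],
                [1/4, 1/4,  1/4 ]])
px = pxy.sum(axis=1)                         # marginal p(x)

print(H(pxy))                                # H(X,Y) = 2.375 bits
print(np.isclose(H(pxy), H(pxy.T)))          # True: H(X,Y) = H(Y,X)
print(np.isclose(H(np.diag(px)), H(px)))     # True: H(X,X) = H(X)
```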

Conditional Entropy: $H(Y|X)=\sum_{x\in\mathcal{X}}p(x)H(Y|X=x)=-\sum_{x\in\mathcal{X}}\sum_{y\in\mathcal{Y}}p(x,y)\log p(y|x)=-E\log p(Y|X)$

  • $H(Y|X)\leq H(Y)$: conditioning reduces entropy
  • remaining uncertainty about $Y$ when $X$ is known
  • $H(X|Y)\neq H(Y|X)$
  • $H(X|Y)+H(Y)=H(Y|X)+H(X)=H(X,Y)$
  • Chain rule: $H(X_1,X_2,\cdots,X_n)=H(X_1)+H(X_2|X_1)+\cdots+H(X_n|X_{n-1},\cdots,X_1)$
  • Zero conditional entropy: $H(Y|X)=0$ iff $Y=f(X)$, i.e. $Y$ is a deterministic function of $X$
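
Continuing with the same style of arbitrary joint pmf, a sketch that computes $H(Y|X)$ directly from the definition and then checks the chain rule, the conditioning inequality, and the asymmetry:

```python
import numpy as np

def H(p):
    """Entropy in bits of a pmf array (0 log 0 -> 0)."""
    p = np.asarray(p, dtype=float).ravel()
    nz = p > 0
    return -np.sum(p[nz] * np.log2(p[nz]))

pxy = np.array([[1/8, 1/16, 1/16],           # arbitrary joint pmf p(x, y)
                [1/4, 1/4,  1/4 ]])
px, py = pxy.sum(axis=1), pxy.sum(axis=0)

# H(Y|X) = sum_x p(x) H(Y | X = x), straight from the definition.
H_Y_given_X = sum(px[i] * H(pxy[i] / px[i]) for i in range(len(px)))

print(np.isclose(H(px) + H_Y_given_X, H(pxy)))   # True: H(X) + H(Y|X) = H(X,Y)
print(H_Y_given_X <= H(py))                      # True: conditioning reduces entropy
H_X_given_Y = H(pxy) - H(py)
print(np.isclose(H_X_given_Y, H_Y_given_X))      # False: H(X|Y) != H(Y|X) in general
```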

Relative Entropy (Kullback-Leibler distance): $D(p|q)=\sum_{x\in\mathcal{X}}p(x)\log\frac{p(x)}{q(x)}=E_p\log\frac{p(X)}{q(X)}=E_p(-\log q(X))-H(p)$

  • $D(p|q)\geq 0$, equality iff $p(x)=q(x)$ for all $x$ (proof via Jensen's inequality / concavity of $\log$)
  • variational distance $V(p,q)=\sum_{x\in\mathcal{X}}|p(x)-q(x)|$
  • Pinsker's inequality: $D(p|q)\geq\frac{1}{2\ln 2}V^2(p,q)$
  • $D(p|u)=\log|\mathcal{X}|-H(X)$, where $u$ is the uniform distribution on $\mathcal{X}$
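
A numerical sketch for relative entropy; the `D` helper and the example pair $(p, u)$ are assumptions made here for illustration, not library calls:

```python
import numpy as np

def D(p, q):
    """Relative entropy D(p|q) in bits; assumes q(x) > 0 wherever p(x) > 0."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    nz = p > 0
    return np.sum(p[nz] * np.log2(p[nz] / q[nz]))

p = np.array([0.5, 0.25, 0.125, 0.125])      # arbitrary distribution
u = np.array([0.25, 0.25, 0.25, 0.25])       # uniform on the same alphabet

print(D(p, u) >= 0, np.isclose(D(p, p), 0))  # True True: D >= 0, zero iff p = q
V = np.sum(np.abs(p - u))                    # variational distance V(p, u)
print(D(p, u) >= V**2 / (2 * np.log(2)))     # True: Pinsker's inequality
H_p = -np.sum(p * np.log2(p))
print(np.isclose(D(p, u), np.log2(len(p)) - H_p))  # True: D(p|u) = log|X| - H(X)
```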

Mutual Information: $I(X;Y)=\sum_x\sum_yp(x,y)\log\frac{p(x,y)}{p(x)p(y)}=D(p(x,y)|p(x)p(y))=E_{p(x,y)}\log\frac{p(X,Y)}{p(X)p(Y)}$

  • $I(X;Y)=0$ iff $X$ and $Y$ are independent
  • $I(X;X)=H(X)$
  • $I(X;Y)=H(X)-H(X|Y)=H(X)+H(Y)-H(X,Y)$
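
A sketch that computes $I(X;Y)$ as $D(p(x,y)|p(x)p(y))$ for an arbitrary joint pmf and checks the identities above; the `H` and `I` helpers are defined locally, not library functions:

```python
import numpy as np

def H(p):
    """Entropy in bits of a pmf array (0 log 0 -> 0)."""
    p = np.asarray(p, dtype=float).ravel()
    nz = p > 0
    return -np.sum(p[nz] * np.log2(p[nz]))

def I(pxy):
    """I(X;Y) = D(p(x,y) | p(x)p(y)) for a joint pmf given as a 2-D array."""
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    prod = np.outer(px, py)
    nz = pxy > 0
    return np.sum(pxy[nz] * np.log2(pxy[nz] / prod[nz]))

pxy = np.array([[1/8, 1/16, 1/16],           # arbitrary joint pmf p(x, y)
                [1/4, 1/4,  1/4 ]])
px, py = pxy.sum(axis=1), pxy.sum(axis=0)

print(np.isclose(I(pxy), H(px) + H(py) - H(pxy)))  # True: I = H(X)+H(Y)-H(X,Y)
print(np.isclose(I(np.outer(px, py)), 0))          # True: independent => I(X;Y) = 0
print(np.isclose(I(np.diag(px)), H(px)))           # True: I(X;X) = H(X)
```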

Conditional Mutual Information: $I(X;Y|Z)=H(X|Z)-H(X|Y,Z)=E_{p(x,y,z)}\log\frac{p(X,Y|Z)}{p(X|Z)p(Y|Z)}$

  • Chain rule: $I(X_1,X_2,\cdots,X_n;Y)=\sum_{i=1}^nI(X_i;Y|X_{i-1},\cdots,X_1)$
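
The chain rule can be checked mechanically by writing every term as a combination of joint entropies; the random pmf over $(X_1,X_2,Y)$ below is an arbitrary example:

```python
import numpy as np

def H(p):
    """Entropy in bits of a pmf array (0 log 0 -> 0)."""
    p = np.asarray(p, dtype=float).ravel()
    nz = p > 0
    return -np.sum(p[nz] * np.log2(p[nz]))

rng = np.random.default_rng(0)
p = rng.random((2, 2, 2))
p /= p.sum()                       # arbitrary joint pmf with axes (X1, X2, Y)

# Each term below is expressed through joint entropies of the relevant marginals.
I_x1x2_y = H(p.sum(axis=2)) + H(p.sum(axis=(0, 1))) - H(p)                  # I(X1,X2;Y)
I_x1_y = H(p.sum(axis=(1, 2))) + H(p.sum(axis=(0, 1))) - H(p.sum(axis=1))   # I(X1;Y)
I_x2_y_given_x1 = (H(p.sum(axis=2)) - H(p.sum(axis=(1, 2)))                 # H(X2|X1)
                   - (H(p) - H(p.sum(axis=1))))                             # - H(X2|X1,Y)
print(np.isclose(I_x1x2_y, I_x1_y + I_x2_y_given_x1))   # True: chain rule
```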

Conditional Relative Entropy: $D(p(y|x)|q(y|x))=E_{p(x,y)}\log \frac{p(Y|X)}{q(Y|X)}$

Inequality

  • $H(X_1,X_2,\cdots,X_n)\leq\sum_{i=1}^nH(X_i)$ equality iff $X_i$ are independent
  • $I(X;Y|Z)$ vs. $I(X;Y)$: neither is larger in general
    • $X\rightarrow Y\rightarrow Z$ (Markov chain): $I(X;Z|Y)=0$
    • $X,Y$ independent uniform bits, $Z=X+Y\bmod 2$: $I(X;Y|Z)=1>0=I(X;Y)$ (checked numerically after this list)
  • Data Processing Inequality: $X\rightarrow Y\rightarrow Z,I(X;Y)\geq I(X;Z)$
    • $X\rightarrow Y\rightarrow Z,I(X;Y)=I(X;Z)+I(X;Y|Z)$
  • $I(X;Y;Z)=I(X;Y)-I(X;Y|Z)$
  • Perfect Secrecy: $H(X)\leq H(Z)$, i.e. the entropy (length) of the message is at most that of the key
  • Fano’s Inequality: relationship between $P_e$ and $H(X|Y)$
    • $X\rightarrow Y\rightarrow \hat X,P_e=P(\hat X\neq X)$
    • $H(P_e)+P_e\log|\mathcal{X}|\geq H(X|\hat X)\geq H(X|Y)$
    • weaken: $P_e\geq\frac{H(X|Y)-1}{\log |\mathcal{X}|}$
  • Log sum inequality: for nonnegative $a_i,b_i$, $\sum_{i=1}^na_i\log\frac{a_i}{b_i}\geq\left(\sum_{i=1}^na_i\right)\log\frac{\sum_{i=1}^na_i}{\sum_{i=1}^nb_i}$, equality iff $\frac{a_i}{b_i}$ is constant
  • $H(p)$ is a concave function of $p$
  • $D(p|q)$ is convex in the pair $(p,q)$
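
A numerical check of the $Z=X+Y\bmod 2$ example from the list above, with $X,Y$ independent uniform bits; every quantity is computed from joint entropies of the constructed pmf:

```python
import numpy as np

def H(p):
    """Entropy in bits of a pmf array (0 log 0 -> 0)."""
    p = np.asarray(p, dtype=float).ravel()
    nz = p > 0
    return -np.sum(p[nz] * np.log2(p[nz]))

# X, Y independent uniform bits, Z = X + Y mod 2; joint pmf p(x, y, z).
p = np.zeros((2, 2, 2))
for x in (0, 1):
    for y in (0, 1):
        p[x, y, (x + y) % 2] = 0.25

I_xy = H(p.sum(axis=(1, 2))) + H(p.sum(axis=(0, 2))) - H(p.sum(axis=2))   # I(X;Y)
# I(X;Y|Z) = H(X|Z) - H(X|Y,Z) = H(X,Z) - H(Z) - H(X,Y,Z) + H(Y,Z)
I_xy_given_z = H(p.sum(axis=1)) - H(p.sum(axis=(0, 1))) - H(p) + H(p.sum(axis=0))
print(I_xy, I_xy_given_z)   # 0.0 1.0 -> conditioning can increase mutual information
```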