Short notes, stray thoughts, and the occasional technical write-up.
KL Divergence Explained
Entropy, KL divergence, and why the choice between forward and reverse KL determines whether a fitted model spreads across every mode (forward) or collapses onto a single one (reverse). First post in a series building up to the ELBO.