EM algorithm for Gaussian mixture model on gridded space

The common setting for GMM is that we try to fit multiple Gaussian models on the feature space directly. However, there are so many applications that we want to fit the models on the gridded space where each grid cell contains a real value. The traditional formulation of EM-GMM does not work on such space, and therefore, a way around it is to resample the distribution obtained from the gridded space. Such approach is not quite computationally efficient in both run-time and memory as we need to resample the original gridded space so many times to guarantee the sample represents the original space well.

I reformulate the likelihood function of such grid data and derive an EM algorithm for such an scenario as seen in the report here. Some preliminary results are also shown here:

Experiment1: 2D grid space

The dataset comprises 5 Gaussian components in 2D-gridded space. We run the GMM for gridded data using 5 components, randomly initialize the posterior for each grid cell. The results are compared to the original dataset below. We found the results well approximate the original dataset.

The log-likelihood increases as the iterations go.

Experiment2: 2D grid space with truncation

We also test the performance of our GMM on the truncated data. It turns out that the GMM cannot capture the truncation well and tries to fit the complete Gaussian model within the gridded space.

Experiment3: 3D grid space

It works for any arbitrary dimension. The algorithm also neglect the truncation in this 3d gridded space too.

Experiment4: Application on image processing

We fit GMM to each channel (RGB) independently of an image.

MATLAB toolbox is made available here.