Releases: parichit/DCEM
DCEM 2.0.3 with vignettes
DCEM - patched for K-means++ initialization
This release fixes bugs in the improved initialization routines and adds an option to retrieve the cluster membership of the data.
DCEM: A new release with a blazing-fast EM* implementation
This release improves the EM* implementation for even faster execution. EM* is motivated by the ideas published in "Using data to build a better EM: EM* for big data", Hasan Kurban, Mark Jenne, Mehmet M. Dalkilic (2016), https://doi.org/10.1007/s41060-017-0062-1.
The package now supports both EM* and the traditional EM algorithm for speed-up comparison. EM* leverages a max-heap structure to reduce execution time manifold compared to conventional EM.
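The core idea behind the heap-based speed-up can be sketched as follows. This is an illustrative Python sketch under simplifying assumptions (1-D data, two components, fixed unit variance, equal priors), not the package's R implementation: after each E-step, points whose assignment is near-certain are frozen and no longer re-scanned, while a min-heap keyed on confidence surfaces the still-ambiguous points for the next pass.

```python
import heapq
import math

def norm_pdf(x, mu, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def em_star_sketch(data, iters=25, tau=0.999):
    """Heap-pruned EM in the spirit of EM* (illustrative only):
    points whose largest responsibility exceeds `tau` stop being
    re-scanned; their last responsibilities still feed the M-step."""
    mu = [min(data), max(data)]        # crude initial means
    resp = [0.5] * len(data)           # responsibility of component 1
    active = list(range(len(data)))    # indices revisited each pass
    for _ in range(iters):
        heap = []
        for i in active:
            p0 = norm_pdf(data[i], mu[0])
            p1 = norm_pdf(data[i], mu[1])
            resp[i] = p1 / (p0 + p1)
            conf = max(resp[i], 1 - resp[i])
            # min-heap on confidence: most ambiguous points surface first
            heapq.heappush(heap, (conf, i))
        # keep only the still-uncertain points for the next E-step
        active = []
        while heap and heap[0][0] < tau:
            conf, i = heapq.heappop(heap)
            active.append(i)
        # M-step over ALL points (frozen responsibilities reused as-is)
        w1 = sum(resp)
        w0 = len(data) - w1
        mu = [sum((1 - r) * x for r, x in zip(resp, data)) / w0,
              sum(r * x for r, x in zip(resp, data)) / w1]
    return sorted(mu)

# Two well-separated 1-D clusters: most points freeze after one pass
means = em_star_sketch([0.0, 0.2, 0.4, 5.0, 5.2, 5.4])
```

On well-separated data most points become confident almost immediately, so later iterations touch only a small active set — which is where the reported speed-up over conventional EM comes from.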
DCEM with EM* Implementation
The DCEM_1.0.0 release brings a faster version of the EM algorithm (the EM* algorithm [1]). EM* leverages a heap structure internally to avoid revisiting the data in the long run, thereby significantly reducing the run time of the conventional EM implementation. For easy accessibility, the function call for EM* stays the same as in previous versions, with the same parameters (the only exception being the function name). For technical details about the algorithm, please see the following:
Reference:
[1] Hasan Kurban, Mark Jenne, Mehmet M. Dalkilic. Using data to build a better EM: EM* for big data. https://doi.org/10.1007/s41060-017-0062-1
DCEM Improved and Faster Initialization
Implements the Expectation Maximisation (EM) algorithm for clustering finite Gaussian mixture models, for both multivariate and univariate datasets. Initialization is done by randomly selecting samples from the dataset as the means of the Gaussians. This version improves parameter initialization on big datasets by using the ideas published in [1], K-means++: The Advantages of Careful Seeding, David Arthur and Sergei Vassilvitskii. The algorithm returns a set of Gaussian parameters: posterior probabilities, means, covariance matrices (for multivariate data) or standard deviations (for univariate data), and priors.
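The K-means++ seeding strategy can be sketched as follows — a minimal 1-D Python illustration, not the package's R code: the first center is chosen uniformly at random, and each subsequent center is sampled with probability proportional to its squared distance from the nearest center already chosen, which spreads the initial means across the data.

```python
import random

def kmeanspp_init(data, k, rng=None):
    """K-means++ seeding (1-D sketch): first center uniform at random,
    each later center sampled with probability proportional to the
    squared distance to its nearest already-chosen center."""
    rng = rng or random.Random(0)
    centers = [rng.choice(data)]
    while len(centers) < k:
        # squared distance of every point to its nearest chosen center
        d2 = [min((x - c) ** 2 for c in centers) for x in data]
        # weighted sampling: walk the cumulative weights until r is covered
        r = rng.uniform(0, sum(d2))
        acc = 0.0
        for x, w in zip(data, d2):
            acc += w
            if acc >= r:
                centers.append(x)
                break
    return centers

# Two well-separated groups: seeding strongly favors one center per group
data = [0.0, 0.1, 0.2, 10.0, 10.1, 10.2]
centers = kmeanspp_init(data, 2)
```

Compared with picking all means uniformly at random, this makes it far less likely that two initial means land in the same cluster, which is what improves EM's starting point on big datasets.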
Reference:
[1] David Arthur and Sergei Vassilvitskii. K-means++: The Advantages of Careful Seeding. http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf
[2] Hasan Kurban, Mark Jenne, Mehmet M. Dalkilic (2016). Using data to build a better EM: EM* for big data. doi:10.1007/s41060-017-0062-1. This work is partially supported by NCI Grant 1R01CA213466-01.