pca-magic: a PPCA library for Python

jopen, 10 years ago

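A PPCA (probabilistic PCA) library for Python. Compared with the implementation in Scikit-Learn, this library handles missing data better and can interpolate based on a separate dataset.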

Install via pip:

pip install ppca

Load in the data, which should be arranged as n_samples by features. As usual, you should make sure your data is stationary (take first differences if possible) and standardized.

from ppca import PPCA
ppca = PPCA(data)
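
The preprocessing recommended above (first differences, then standardization) happens before the data is handed to PPCA. The following is a minimal NumPy sketch of that step, not part of pca-magic itself. The helper name preprocess and the sample array are hypothetical, and encoding missing values as NaN is an assumption about the expected input.

```python
import numpy as np

def preprocess(raw):
    """First-difference each column, then standardize it, ignoring NaNs.

    Hypothetical helper for illustration; not part of the ppca package.
    """
    raw = np.asarray(raw, dtype=float)
    diffed = np.diff(raw, axis=0)        # shape: (n_samples - 1, n_features)
    mean = np.nanmean(diffed, axis=0)    # column means, skipping NaNs
    std = np.nanstd(diffed, axis=0)      # column standard deviations, skipping NaNs
    std[std == 0] = 1.0                  # leave constant columns unscaled
    return (diffed - mean) / std

# Toy data: 5 samples, 2 features, one missing value (assumed to be NaN).
raw = np.array([[1.0, 10.0],
                [2.0, 13.0],
                [4.0, np.nan],
                [7.0, 20.0],
                [11.0, 22.0]])
data = preprocess(raw)
```

Note that differencing turns one missing observation into two missing differences, because both neighbouring differences depend on it.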

Fit the model with parameter d specifying the number of components and verbose printing convergence output if required.

ppca.fit(d=100, verbose=True)
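
To give a rough idea of what fitting a probabilistic PCA model with d components involves, here is a minimal sketch of the textbook EM algorithm for PPCA on complete data. It is not pca-magic's code: the function name ppca_em and its parameters are hypothetical, and the sketch omits the missing-value handling that the library provides.

```python
import numpy as np

def ppca_em(Y, d, n_iter=200, seed=0):
    """Textbook EM for probabilistic PCA on complete data (illustrative only).

    Model: y = W z + mu + noise, with z ~ N(0, I) and noise ~ N(0, sigma2 * I).
    Returns the loading matrix W (n_features x d), sigma2, and mu.
    """
    rng = np.random.default_rng(seed)
    Y = np.asarray(Y, dtype=float)
    N, D = Y.shape
    mu = Y.mean(axis=0)
    Yc = Y - mu                               # centered data
    W = rng.standard_normal((D, d))           # random initial loadings
    sigma2 = 1.0                              # initial noise variance
    for _ in range(n_iter):
        # E-step: posterior moments of the latent variables.
        M_inv = np.linalg.inv(W.T @ W + sigma2 * np.eye(d))
        Ez = Yc @ W @ M_inv                   # rows are E[z_n]
        Ezz = N * sigma2 * M_inv + Ez.T @ Ez  # sum over n of E[z_n z_n^T]
        # M-step: update loadings, then noise variance.
        W_new = (Yc.T @ Ez) @ np.linalg.inv(Ezz)
        sigma2 = (np.sum(Yc ** 2)
                  - 2.0 * np.sum(Ez * (Yc @ W_new))
                  + np.trace(Ezz @ W_new.T @ W_new)) / (N * D)
        W = W_new
    return W, sigma2, mu

# Synthetic example: 2 latent factors, 5 observed features, small noise.
rng = np.random.default_rng(1)
W_true = np.array([[2.0, 0.0], [0.0, 2.0], [1.0, 1.0], [1.0, -1.0], [0.5, 0.0]])
Y = rng.standard_normal((500, 2)) @ W_true.T + 0.1 * rng.standard_normal((500, 5))
W, sigma2, mu = ppca_em(Y, d=2)
```

The columns of W span the principal subspace but are only determined up to a rotation, so they need not match the individual principal axes.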

The model parameters and components will be attached to the ppca object.

variance_explained = ppca.var_exp
components = ppca.X
model_params = ppca.C
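
For orientation, a "variance explained" figure is commonly computed as the cumulative share of total variance captured by the leading components. The sketch below shows that general calculation with NumPy on complete data. Whether ppca.var_exp is computed exactly this way is an assumption, and the function name cumulative_variance_explained is hypothetical.

```python
import numpy as np

def cumulative_variance_explained(data, d):
    """Cumulative fraction of total variance captured by the top d components.

    Generic PCA calculation for illustration; not taken from the ppca package.
    """
    data = np.asarray(data, dtype=float)
    cov = np.cov(data, rowvar=False)           # feature-by-feature covariance
    eigvals = np.linalg.eigvalsh(cov)[::-1]    # eigenvalues, largest first
    return np.cumsum(eigvals[:d]) / eigvals.sum()

rng = np.random.default_rng(0)
sample = rng.standard_normal((200, 4)) * np.array([3.0, 2.0, 1.0, 0.5])
top2 = cumulative_variance_explained(sample, d=2)
full = cumulative_variance_explained(sample, d=4)
```

The result is non-decreasing in the number of components and reaches 1 when every component is kept.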

If you want the principal components, call transform.

component_mat = ppca.transform()

After fitting the model, save it if you want.

ppca.save('mypcamodel')

Load a model after instantiating a PPCA object. This will make fitting/transforming much faster.

ppca.load('mypcamodel.npy')
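
The save call above uses the name 'mypcamodel' while the load call uses 'mypcamodel.npy'. That pattern is consistent with NumPy's np.save, which appends ".npy" when the file name lacks it. That pca-magic stores its model this way is an assumption. The sketch below demonstrates only the NumPy behaviour, using a stand-in array rather than a real fitted model.

```python
import os
import tempfile

import numpy as np

# Stand-in for fitted parameters; a real model would hold different values.
params = np.arange(6, dtype=float).reshape(3, 2)

with tempfile.TemporaryDirectory() as tmp:
    base = os.path.join(tmp, "mypcamodel")
    np.save(base, params)                 # np.save appends ".npy" to the name
    saved_files = os.listdir(tmp)         # file names actually written
    restored = np.load(base + ".npy")     # loading needs the full file name
```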

Project homepage: http://www.open-open.com/lib/view/home/1424920226687