Apache SystemML v0.10.0-incubating released: a machine learning language
jopen, 8 years ago
<p style="text-align: center;"><img alt="" src="https://simg.open-open.com/show/cc4cc96b41b42ef2a55a6e564e52bb6a.png" /></p> <p>SystemML is a flexible, scalable machine learning (ML) language written in Java. It provides three main capabilities: (1) customizable algorithms; (2) multiple execution modes, including single-node, Hadoop batch, and Spark batch; (3) automatic optimization.</p> <p>SystemML's approach to machine learning rests on two main aspects:</p> <ul> <li>The SystemML language, Declarative Machine Learning (DML). It includes linear algebra primitives, statistical functions, and ML-specific constructs that make it easier and more natural to express ML algorithms. Algorithms are expressed in an R-like or Python-like syntax. DML significantly increases the productivity of data scientists by providing flexibility in expressing custom analytics and independence from the underlying input formats and physical data representations.</li> <li>SystemML provides automatic optimization based on data and cluster characteristics to ensure efficiency and scalability. SystemML can run on MapReduce or Spark.</li> </ul> <h2>Changelog</h2> <h3>Different Types of Spark Matrix Blocks</h3> <ul> <li>Supported internal formats: MCSR (default), CSR, COO</li> <li>Automatic MCSR➡CSR on Spark read/caching (for memory efficiency)</li> <li>Automatic MCSR➡CSR on sparse update-in-place (avoid serialization)</li> </ul> <h3>Frame Support for JMLC API/CP</h3> <ul> <li>New frame data type, deeply integrated into compiler and runtime</li> <li>New builtin functions: transformapply, transformencode, transformdecode, transformmeta</li> <li>Supported operations: read/write, left/right indexing, casting, append, transform/transformapply</li> </ul> <h3>Framework Compatibility/Configuration</h3> <ul> <li>[SYSTEMML-418] Version-specific Spark memory budgets (>=1.6, legacy)</li> <li>[SYSTEMML-158] Updated deprecated Hadoop properties</li> <li>[SYSTEMML-476] Version-specific MR configuration handling (MRv2, MRv1)</li> <li>Fixes for backwards compatibility to MRv1 (Guava dependency conflicts, runtime changes such as task handling for multiple output committer)</li> <li>New pass-through mapred/mapreduce configurations through SystemML-config</li> <li>[SYSTEMML-584/585] New thread-local configuration handling (compiler/DML config)</li> </ul> <h3>Deep Learning Support</h3> <ul> <li>[SYSTEMML-618] New DML-script NN library</li> <li>[SYSTEMML-540] New built-in singlenode operations: conv2d, maxpooling, im2col, col2im, rotate</li> <li>New lenet-train DML script</li> </ul> <h3>API/Script 
Usability</h3> <ul> <li>[SYSTEMML-607/604/611] Parser error handling</li> <li>[SYSTEMML-506/508/544/577/649/651] Extended MLContext/JMLC APIs</li> <li>[SYSTEMML-625/626/632] Improved source statement handling (e.g., imports, absolute paths)</li> <li>[SYSTEMML-617/631/654] Improved namespace handling</li> <li>[SYSTEMML-240] Extended stats outputs for Spark collect/broadcast/parallelize</li> <li>[SYSTEMML-495] SystemML configuration handling</li> <li>[SYSTEMML-209] Include algorithms in SystemML jar</li> <li>[SYSTEMML-647/648] Deprecated castAsScalar, ppred</li> <li>[SYSTEMML-477] JSON meta data handling</li> <li>[SYSTEMML-294] Print matrix built-in function</li> <li>[SYSTEMML-296/676/670] Improved PyDML syntax: slicing, rand, cdf, elif</li> <li>[SYSTEMML-675] Support for negative for/parfor loop increments</li> </ul> <h3>New Fused Physical Operators</h3> <ul> <li>[SYSTEMML-488] Fused wdivmm w/ 4 operands</li> <li>[SYSTEMML-510] Fused wdivmm/wcemm w/ eps term</li> </ul> <h3>Various Performance Features</h3> <ul> <li>[SYSTEMML-427/512] Extended IPA (propagate scalar variables)</li> <li>[SYSTEMML-282] Extended update-in-place support for parfor intermediates</li> <li>[SYSTEMML-552/399] Performance parallel binary/text readers (sort sparse/nnz handling)</li> <li>[SYSTEMML-552/641] Cache-conscious operations: sparse-dense wdivmm/wsloss, sparse-dense/sparse-sparse mm, dense-dense skinny rhs mm</li> <li>[SYSTEMML-641] Tuned special cases for block matrix multiplication: e.g., mm w/ skinny rhs, colwise parallelization wide rhs</li> <li>[SYSTEMML-396/400] New/extended multithreaded operations: cumsum/cummin/cummax/cumprod, transpose, and rand</li> <li>[SYSTEMML-510/694] New simplification rewrites: “pushdown unaryagg-transpose”, “simplify transpose-aggbin-binary chains”, “reorder minus-mmult”, “canonicalize matmult-add-scalar”, improved constant folding (all unary)</li> <li>[SYSTEMML-653] Asynchronous bufferpool cleanup of evicted files/nio file eviction</li> <li>MR 
iqm/quantile/median (qsort num reducers, qpick buffer size)</li> </ul> <h3>DML Script Updates</h3> <ul> <li>[SYSTEMML-536] New KNN algorithm (still staging)</li> <li>[SYSTEMML-534] Optional console output univariate statistics</li> <li>[SYSTEMML-494] GLM compiler warnings</li> <li>Robustness input/output handling L2SVM, MSVM, and Naive Bayes</li> <li>Random data generator for ALS</li> </ul> <h3>Various Fixes</h3> <ul> <li>Dozens of fixes for diverse issues, fix pack for 0.9 release</li> </ul> <h3>Build, Documentation, Examples</h3> <ul> <li>[SYSTEMML-551] Enhanced JMLC javadoc</li> <li>[SYSTEMML-484] Build javadoc jar</li> <li>[SYSTEMML-468] Contributing to SystemML doc</li> <li>[SYSTEMML-517/524] DML Language Reference updates</li> <li>[SYSTEMML-498] Troubleshooting guide</li> <li>SystemML Jupyter/Zeppelin Notebook examples</li> </ul> <h2>Downloads</h2> <table> <tbody> <tr> <td>systemml-0.10.0-incubating (tar.gz)</td> <td><a href="/misc/goto?guid=4958991567768594218">tar.gz</a></td> <td><a href="/misc/goto?guid=4958991567887686452">MD5</a></td> <td><a href="/misc/goto?guid=4958991568000486338">ASC</a></td> </tr> <tr> <td>systemml-0.10.0-incubating (zip)</td> <td><a href="/misc/goto?guid=4958991568088688360">zip</a></td> <td><a href="/misc/goto?guid=4958991568191644783">MD5</a></td> <td><a href="/misc/goto?guid=4958991568287537831">ASC</a></td> </tr> <tr> <td>systemml-0.10.0-incubating-standalone (tar.gz)</td> <td><a href="/misc/goto?guid=4958991568369781456">tar.gz</a></td> <td><a href="/misc/goto?guid=4958991568469131911">MD5</a></td> <td><a href="/misc/goto?guid=4958991568550317252">ASC</a></td> </tr> <tr> <td>systemml-0.10.0-incubating-standalone (zip)</td> <td><a href="/misc/goto?guid=4958991568629412190">zip</a></td> <td><a href="/misc/goto?guid=4958991568712511081">MD5</a></td> <td><a href="/misc/goto?guid=4958991568798311741">ASC</a></td> </tr> <tr> <td>systemml-0.10.0-incubating (Source tar.gz)</td> <td><a 
href="/misc/goto?guid=4958991568869591874">tar.gz</a></td> <td><a href="/misc/goto?guid=4958991568943130037">MD5</a></td> <td><a href="/misc/goto?guid=4958991569026828300">ASC</a></td> </tr> <tr> <td>systemml-0.10.0-incubating (Source zip)</td> <td><a href="/misc/goto?guid=4958991569111880909">zip</a></td> <td><a href="/misc/goto?guid=4958991569200174564">MD5</a></td> <td><a href="/misc/goto?guid=4958991569283069716">ASC</a></td> </tr> </tbody> </table>