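KeystoneML is a software framework written in Scala, from the AMPLab at UC Berkeley. Its main goal is to simplify the construction of large-scale, end-to-end machine learning pipelines. It is built on Apache Spark.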
Example code:
    // Load the training split of the newsgroups data set
    val trainData = NewsGroupsDataLoader(sc, trainingDir)

    // Chain text-processing steps, then fit the feature selector and the classifier
    val predictor = Trim.then(LowerCase())
      .then(Tokenizer())
      .then(new NGramsFeaturizer(1 to conf.nGrams))
      .then(TermFrequency(x => 1))
      .thenEstimator(CommonSparseFeatures(conf.commonFeatures))
      .fit(trainData.data)
      .thenLabelEstimator(NaiveBayesEstimator(numClasses))
      .fit(trainData.data, trainData.labels)
      .then(MaxClassifier)

Testing:
    // Apply the fitted pipeline to the held-out test split and evaluate it
    val test = NewsGroupsDataLoader(sc, testingDir)
    val predictions = predictor(test.data)
    val eval = MulticlassClassifierEvaluator(predictions, test.labels, numClasses)

    println(eval.summary(newsgroupsData.classes))

Output:
    Avg Accuracy:    0.980
    Macro Precision: 0.816
    Macro Recall:    0.797
    Macro F1:        0.797
    Total Accuracy:  0.804
    Micro Precision: 0.804
    Micro Recall:    0.804
    Micro F1:        0.804
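In the output, Total Accuracy, Micro Precision, Micro Recall and Micro F1 are all 0.804, while the macro figures differ. This is expected for single-label multiclass classification: every misclassified example counts as one false positive (for the predicted class) and one false negative (for the true class), so the pooled "micro" ratios reduce to overall accuracy. The following is a minimal sketch of that relationship in plain Scala. It is not KeystoneML's evaluator; the object `AveragedMetrics`, its method names, and the small confusion matrix are all hypothetical and chosen only for illustration.

```scala
// Hypothetical helper, not part of KeystoneML: macro- vs micro-averaged
// precision and recall computed from a confusion matrix.
object AveragedMetrics {
  // confusion(i)(j) = number of examples with true class i predicted as class j
  type Confusion = Array[Array[Long]]

  private def ratio(num: Double, den: Double): Double =
    if (den == 0.0) 0.0 else num / den

  // Per-class (truePositives, falsePositives, falseNegatives)
  def counts(c: Confusion): Seq[(Long, Long, Long)] =
    c.indices.map { k =>
      val tp = c(k)(k)
      val fp = c.indices.map(i => c(i)(k)).sum - tp // predicted k, truly another class
      val fn = c(k).sum - tp                        // truly k, predicted another class
      (tp, fp, fn)
    }

  // Macro average: compute the metric per class, then take the unweighted mean
  def macroPrecision(c: Confusion): Double = {
    val perClass = counts(c).map { case (tp, fp, _) => ratio(tp.toDouble, (tp + fp).toDouble) }
    perClass.sum / perClass.size
  }

  def macroRecall(c: Confusion): Double = {
    val perClass = counts(c).map { case (tp, _, fn) => ratio(tp.toDouble, (tp + fn).toDouble) }
    perClass.sum / perClass.size
  }

  // Micro average: pool the counts over all classes, then compute one ratio
  def microPrecision(c: Confusion): Double = {
    val cs = counts(c)
    ratio(cs.map(_._1).sum.toDouble, cs.map(t => t._1 + t._2).sum.toDouble)
  }

  def microRecall(c: Confusion): Double = {
    val cs = counts(c)
    ratio(cs.map(_._1).sum.toDouble, cs.map(t => t._1 + t._3).sum.toDouble)
  }

  // Overall accuracy: correctly classified examples over all examples
  def totalAccuracy(c: Confusion): Double =
    ratio(c.indices.map(k => c(k)(k)).sum.toDouble, c.map(_.sum).sum.toDouble)

  def main(args: Array[String]): Unit = {
    // Made-up 3-class confusion matrix, for illustration only
    val c: Confusion = Array(
      Array(8L, 1L, 1L),
      Array(2L, 6L, 2L),
      Array(0L, 1L, 4L)
    )
    println(f"total accuracy:  ${totalAccuracy(c)}%.3f")
    println(f"micro precision: ${microPrecision(c)}%.3f")
    println(f"micro recall:    ${microRecall(c)}%.3f")
    println(f"macro precision: ${macroPrecision(c)}%.3f")
    println(f"macro recall:    ${macroRecall(c)}%.3f")
  }
}
```

Macro averages weight every class equally, so they drop when small classes are classified poorly; micro averages weight every example equally. The gap between Macro Recall (0.797) and Micro Recall (0.804) in the output is consistent with classes of unequal size or difficulty, though the output alone does not say which.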