Dagli: A Machine Learning Library for Java

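Dagli is a machine learning library for Java (and other JVM languages), open-sourced by LinkedIn. According to its development team, it makes it easy to write model pipelines that are bug-resistant, readable, modifiable, maintainable, and easy to deploy, without incurring technical debt. Dagli takes advantage of modern multicore CPUs and increasingly powerful GPUs to train real-world models efficiently on a single machine.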

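Below is an introductory example of a text classifier implemented as a pipeline. It uses the active leaves of a gradient boosted decision tree model (XGBoost), together with a high-dimensional set of n-grams, as features in a logistic regression classifier: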

Placeholder<String> text = new Placeholder<>();
Placeholder<LabelType> label = new Placeholder<>();
Tokens tokens = new Tokens().withInput(text);

NgramVector unigramFeatures = new NgramVector().withMaxSize(1).withInput(tokens);
Producer<Vector> leafFeatures = new XGBoostClassification<>()
    .withFeaturesInput(unigramFeatures)
    .withLabelInput(label)
    .asLeafFeatures();

NgramVector ngramFeatures = new NgramVector().withMaxSize(3).withInput(tokens);
LiblinearClassification<LabelType> prediction = new LiblinearClassification<LabelType>()
    .withFeaturesInput().fromVectors(ngramFeatures, leafFeatures)
    .withLabelInput(label);

DAG2x1.Prepared<String, LabelType, DiscreteDistribution<LabelType>> trainedModel =
    DAG.withPlaceholders(text, label).withOutput(prediction).prepare(textList, labelList);

LabelType predictedLabel = trainedModel.apply("Some text for which to predict a label", null);
// trainedModel now can be serialized and later loaded on a server, in a CLI app, in a Hive UDF...
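
To make the n-gram features in the example more concrete, here is a minimal plain-Java sketch of n-gram extraction. It is not Dagli code and does not reproduce how Dagli encodes n-grams as a vector. The class name NgramSketch and the method ngrams are hypothetical, and the sketch assumes that a maximum size of 3 means every contiguous run of one to three tokens.

```java
import java.util.ArrayList;
import java.util.List;

public class NgramSketch {
    /**
     * Returns every contiguous n-gram of length 1..maxSize, shortest first.
     * Hypothetical helper for illustration only; not part of Dagli.
     */
    public static List<String> ngrams(List<String> tokens, int maxSize) {
        List<String> result = new ArrayList<>();
        for (int n = 1; n <= maxSize; n++) {
            // Slide a window of n tokens across the token list.
            for (int start = 0; start + n <= tokens.size(); start++) {
                result.add(String.join(" ", tokens.subList(start, start + n)));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("some", "text", "to", "classify");
        System.out.println(ngrams(tokens, 1)); // unigrams only
        System.out.println(ngrams(tokens, 3)); // unigrams, bigrams and trigrams
    }
}
```

With four tokens, a maximum size of 1 yields 4 n-grams and a maximum size of 3 yields 9, which hints at why the n-gram feature set in the pipeline is described as high-dimensional.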
