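IND is an open-source system that handles most data sets made up of independent instances, each described by a fixed-length vector of attribute values. It offers a range of features and styles, with convenience for the casual user and fine-tuning for the advanced user or those interested in research. IND consists of four basic kinds of routines: data manipulation routines, tree generation routines, tree testing routines, and tree display routines.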
IND is applicable to most data sets consisting of independent instances, each described by a fixed-length vector of attribute values. An attribute value may be a number, one of a set of attribute-specific symbols, or omitted. One of the attributes is designated the "target", and IND grows trees to predict the target. Prediction can then be done on new data, or the decision tree can be printed out for inspection.
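The data model described above can be sketched in a few lines of Python. This is only an illustration, not IND's own file format or API: the attribute names, the `ATTRIBUTES` and `TARGET` constants, and the `split_target` helper are all hypothetical, and an omitted value is assumed to be represented as `None`.

```python
from typing import Dict, List, Optional, Tuple, Union

# An attribute value is a number, a symbol (string), or None when omitted.
AttributeValue = Optional[Union[float, str]]

# Hypothetical toy schema: attribute names in a fixed order (not IND's format).
ATTRIBUTES = ["outlook", "temperature", "humidity", "play"]
TARGET = "play"  # the attribute the tree is grown to predict


def split_target(
    instance: List[AttributeValue],
) -> Tuple[Dict[str, AttributeValue], AttributeValue]:
    """Separate one fixed-length instance into (predictor values, target value)."""
    if len(instance) != len(ATTRIBUTES):
        raise ValueError("instance must have exactly one value per attribute")
    target_index = ATTRIBUTES.index(TARGET)
    predictors = {
        name: value
        for i, (name, value) in enumerate(zip(ATTRIBUTES, instance))
        if i != target_index
    }
    return predictors, instance[target_index]


instances = [
    ["sunny", 29.0, 85.0, "no"],
    ["overcast", None, 78.0, "yes"],  # temperature omitted
]

for row in instances:
    predictors, target = split_target(row)
    print(predictors, "->", target)
```

Every instance has the same length, so a missing measurement keeps its slot as `None` rather than shortening the vector.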
IND provides a range of features and styles with convenience for the casual user as well as fine-tuning for the advanced user or those interested in research. IND can be operated in a Breiman/Friedman/Olshen/Stone-like mode (but without regression trees, surrogate splits or multivariate splits), and in a mode like C4.5. Advanced features allow more extensive search, interactive control and display of tree growing, and Bayesian and MML algorithms for tree pruning and smoothing. These often produce more accurate class probability estimates at the leaves.
IND also comes with a comprehensive experimental control suite. IND consists of four basic kinds of routines: data manipulation routines, tree generation routines, tree testing routines, and tree display routines. The data manipulation routines are used to partition a single large data set into smaller training and test sets. The generation routines are used to build classifiers. The test routines are used to evaluate classifiers and to classify data using a classifier. The display routines are used to display classifiers in various formats.
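The partition-then-evaluate workflow described above can be illustrated with a short Python sketch. None of this is IND's code or command set: `partition`, `majority_class`, and `accuracy` are hypothetical names, the split is a plain seeded random holdout, and a majority-class predictor stands in for a real tree generator so that the example stays self-contained.

```python
import random
from collections import Counter


def partition(instances, test_fraction=0.3, seed=0):
    """Shuffle a copy of the data and split it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def majority_class(train, target_index=-1):
    """Stand-in 'classifier': the most common target value in the training set."""
    return Counter(row[target_index] for row in train).most_common(1)[0][0]


def accuracy(prediction, test, target_index=-1):
    """Fraction of test instances whose target equals the constant prediction."""
    if not test:
        raise ValueError("test set is empty")
    return sum(row[target_index] == prediction for row in test) / len(test)


# Toy data: the last element of each row is the target.
data = [[i, "yes" if i % 3 else "no"] for i in range(10)]
train, test = partition(data, test_fraction=0.3, seed=42)
prediction = majority_class(train)
print(len(train), len(test))  # 7 3
print(prediction, accuracy(prediction, test))
```

Fixing the seed keeps the training and test sets identical across runs, which matters when several classifiers are compared on the same split.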
IND is written in K&R C, with controlling scripts in the "csh" shell of UNIX, and extensive UNIX man entries. It is designed to be used on any UNIX system, although it has only been thoroughly tested on SUN platforms. IND comes with a manual giving a guide to tree methods, and pointers to the literature, and several companion documents.