Meta-classifier approach to reliable text classification(9)

Date: 2026-01-21

A problem with automatic classifiers is that there is no way to know if a particular classification is just a guess or a certain answer. Reliable classification is the task of predicting whether a certain instance is correctly classified or not, i.e., a cl

1.2. RELATED WORK

for randomness, computable approximations of algorithmic tests of randomness are used [Proedrou et al., 2001]. In this context we distinguish the typicalness framework and the transduction framework.

The typicalness framework provides reliability estimations on data that is independently and identically distributed (iid). Consider a sequence of instances together with a new instance of an unknown class. The typicalness framework is used to gain a measure of reliability for all possible classes of this new instance using a typicalness function. For each possible class it is examined how likely it is that all instances of the extended sequence, i.e., the sequence with the new instance added, are drawn independently from the same distribution. The more typical the sequence is, the higher the reliability measure.

The typicalness function can be constructed by measuring the “strangeness” of individual instances, using individual strangeness functions [Kukar, 2004]. A drawback of this approach is that the strangeness function depends on the classification algorithm that is used. So far, the only successful applications use Support Vector Machines [Vovk et al., 1999] and the nearest neighbour algorithm [Proedrou et al., 2001]. Another disadvantage of this approach is its computational complexity [Kukar and Kononenko, 2002, Melluish et al., 2001].
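The idea of scoring each candidate class by the typicalness of the extended sequence can be illustrated with a small sketch. The text gives no concrete strangeness function, so the one below is an assumption: a nearest-neighbour ratio (distance to the nearest examples of the same class divided by distance to the nearest examples of other classes), with typicalness taken as the fraction of examples at least as strange as the new one. The names strangeness, typicalness and reliability_per_class are hypothetical and are not taken from the cited papers.

```python
import math


def strangeness(x, y, others, k=1):
    """Assumed strangeness: summed distance to the k nearest same-class
    examples divided by summed distance to the k nearest other-class ones."""
    same = sorted(math.dist(x, xi) for xi, yi in others if yi == y)[:k]
    diff = sorted(math.dist(x, xi) for xi, yi in others if yi != y)[:k]
    if not same:
        return math.inf  # no example shares this label: maximally strange
    if not diff:
        return 0.0       # no competing class at all
    d_diff = sum(diff)
    return sum(same) / d_diff if d_diff > 0 else math.inf


def typicalness(train, x_new, y_candidate, k=1):
    """Fraction of examples in the extended sequence that are at least
    as strange as the new instance under the candidate label."""
    extended = train + [(x_new, y_candidate)]
    alphas = []
    for i, (x, y) in enumerate(extended):
        rest = extended[:i] + extended[i + 1:]
        alphas.append(strangeness(x, y, rest, k))
    alpha_new = alphas[-1]
    return sum(a >= alpha_new for a in alphas) / len(extended)


def reliability_per_class(train, x_new, k=1):
    """Typicalness of the extended sequence for every possible class."""
    labels = sorted({y for _, y in train})
    return {y: typicalness(train, x_new, y, k) for y in labels}
```

Every candidate label requires recomputing the strangeness of all examples in the extended sequence, which gives a feel for the computational cost mentioned above.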

Another statistical framework for reliable classification is the transduction framework. The framework is classifier-independent and it is based on a transductive step during the classification process. First, an instance is classified by a base classifier, and then the instance is added to the training set, together with the classification it received from the base classifier. The classifier is re-trained and the instance is classified again. The reliability of the instance classification is measured as the difference between posterior class probabilities that the instance receives before and after it was added to the training set [Kukar and Kononenko, 2002]. Smirnov et al. [2003b] show that this approach is not only computationally inefficient, but it also requires high precision of the real-number representation when a large amount of data is used with many classes. Only in the case that the training set is small, adding an instance to the training set can change the level of randomness significantly.
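The transductive step described above can be sketched as follows. Since the framework is classifier-independent, the base classifier here is an arbitrary choice: a small multinomial naive Bayes over token lists with add-one smoothing. The difference measure is also an assumption (half the summed absolute difference between the two posterior distributions); the text only says that a difference between posteriors is used. All function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """docs is a list of (tokens, label) pairs; returns naive Bayes counts."""
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def posterior(model, tokens):
    """Posterior class probabilities for a token list (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for t in tokens:
            if t in vocab:  # tokens never seen in training are ignored
                score += math.log(
                    (word_counts[label][t] + 1) / (total_words + len(vocab))
                )
        log_scores[label] = score
    top = max(log_scores.values())  # subtract the maximum for stability
    unnorm = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}


def transductive_change(docs, tokens):
    """Classify, add the instance with its predicted label, re-train,
    classify again, and return (predicted label, posterior difference)."""
    before = posterior(train_nb(docs), tokens)
    predicted = max(before, key=before.get)
    after = posterior(train_nb(docs + [(tokens, predicted)]), tokens)
    # Assumed difference measure: total variation distance, in [0, 1].
    change = 0.5 * sum(abs(before[c] - after[c]) for c in before)
    return predicted, change
```

In this sketch a small difference means that the added instance barely moved the posterior, which is the situation usually read as a reliable classification. Each query triggers a full re-training, and with large training sets the difference becomes very small, which matches the efficiency and precision concerns raised in the paragraph above.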

Kukar [2004] presents an algorithm that joins the transductive framework and the typicalness framework, thereby making the transductive step statistically sound. Experimental results showed that reliability can be estimated quite accurately, but again the high computational costs pose a problem.

1.2.3 Version-Space Support-Vector Machines

A different approach to reliable instance classification is based on version spaces [Smirnov et al., 2005]. We refer to Mitchell [1997] for an introduction to version spaces. The main idea is to construct version spaces that contain the hypotheses of the target concepts to be learned or their close approximations. The unanimous-voting rule is implemented by testing version spaces for collapse. Applying this rule makes it impossible to misclassify instances when there is no noise in the data. Although experimental results are promising, this approach is computationally expensive as well.
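The unanimous-voting rule can be illustrated on a toy problem. The sketch below is not the support-vector construction of Smirnov et al.; it swaps in a small, explicitly enumerated hypothesis class (one-dimensional thresholds) so that the collapse test can be written directly. An instance receives a label only if adding it with every other label empties the version space; otherwise the classifier abstains. The names version_space, classify_unanimous and threshold are hypothetical.

```python
def version_space(hypotheses, data):
    """Hypotheses that agree with every labelled example in data."""
    return [h for h in hypotheses if all(h(x) == y for x, y in data)]


def classify_unanimous(hypotheses, data, x, labels=(0, 1)):
    """Return a label only when all consistent hypotheses agree on x.

    Agreement is checked by a collapse test: the instance is added with
    each candidate label, and a label survives if the resulting version
    space is non-empty. Exactly one surviving label means unanimity.
    """
    if not version_space(hypotheses, data):
        return None  # already collapsed, for example because of noise
    surviving = [
        y for y in labels if version_space(hypotheses, data + [(x, y)])
    ]
    return surviving[0] if len(surviving) == 1 else None


def threshold(t):
    """Toy hypothesis: label 1 for values at or above t, otherwise 0."""
    return lambda x: int(x >= t)
```

With thresholds 0 to 10 and training data [(2, 0), (3, 0), (7, 1), (8, 1)], the consistent thresholds are 4, 5, 6 and 7. The instance 1 is labelled 0 and the instance 9 is labelled 1, while the instance 5 is left unclassified because the consistent hypotheses disagree on it.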
