Meta-classifier approach to reliable text classification(10)
Date: 2026-01-21
A problem with automatic classifiers is that there is no way to know if a particular classification is just a guess or a certain answer. Reliable classification is the task of predicting whether a certain instance is correctly classified or not, i.e., a cl
1.2.4 Meta-Classifier Approach
The meta-classifier approach takes a different line to reliable classification. Given a base classifier, the approach is to learn a meta-classifier that predicts the correctness of each instance classification of the base classifier. The base classifier plus the meta-classifier form one combined classifier. The classification rule of the combined classifier is to assign a class predicted by the base classifier to an instance if the meta-classifier classifies the base classification as Reliable; otherwise the instance classification is rejected [Seewald and Fürnkranz, 2001].
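The classification rule of the combined classifier can be sketched in a few lines of Python. This is a minimal illustration, not the thesis implementation: the names `combined_classify`, `toy_base`, and `toy_meta` are hypothetical, and the toy classifiers are stand-ins chosen only to make the accept/reject behaviour visible.

```python
from typing import Callable, Hashable, Optional, Sequence

Instance = Sequence[float]
# Hypothetical interfaces: a base classifier maps an instance to a class,
# a meta-classifier maps (instance, base class) to True for "Reliable".
BaseClassifier = Callable[[Instance], Hashable]
MetaClassifier = Callable[[Instance, Hashable], bool]


def combined_classify(
    instance: Instance, base: BaseClassifier, meta: MetaClassifier
) -> Optional[Hashable]:
    """Return the base class if the meta-classifier judges it Reliable.

    A return value of None means the instance classification is rejected.
    """
    predicted = base(instance)
    if meta(instance, predicted):
        return predicted
    return None


def toy_base(x: Instance) -> str:
    # Illustrative base classifier: sign of the first attribute.
    return "pos" if x[0] >= 0.0 else "neg"


def toy_meta(x: Instance, predicted: Hashable) -> bool:
    # Illustrative meta-classifier: trust the base classification only
    # when the instance lies far from the decision boundary at 0.
    return abs(x[0]) >= 0.5


accepted = combined_classify([0.9], toy_base, toy_meta)
rejected = combined_classify([0.1], toy_base, toy_meta)
```

Here `accepted` carries the base class, while `rejected` is `None` because the toy meta-classifier labels that base classification as Unreliable.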
The crucial step for the meta-classifier approach is the generation of the metadata that is used to train the meta-classifier. The metadata are represented by four different metadata representations. All metadata representations have the same binary meta class attribute. The meta class of an instance indicates the reliability of the base classification. If the instance is classified correctly by the base classifier the meta class is Reliable, otherwise the meta class is Unreliable. Below we list four metadata representations.
Original instances representation. All attributes in the meta instance, except the class attribute, are the same as in the original instances.
Probability distribution representation. A meta instance contains the posterior probabilities for all classes as given by the base classifier.
Basic statistics representation. Attributes of the meta instance are different characteristics of the base classification, like the base class and the posterior probability of the base class.
Nearest neighbour distances representation. This representation can only be used in combination with a nearest neighbour base classifier. Attributes of the meta instance are calculated distances between nearest (un)like neighbours of a target instance [Cheetham and Price, 2004].
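The metadata generation described above can be illustrated with a short Python sketch. All function names are hypothetical, and several details are assumptions rather than definitions from the text: the basic statistics representation is reduced to the two characteristics the text names, "like" neighbours are taken to be training instances whose class equals the base class, and Euclidean distance is used.

```python
import math
from typing import Dict, List, Sequence, Tuple

RELIABLE, UNRELIABLE = "Reliable", "Unreliable"


def meta_class(base_class: str, true_class: str) -> str:
    """Binary meta class: Reliable iff the base classification is correct."""
    return RELIABLE if base_class == true_class else UNRELIABLE


def probability_distribution_repr(
    posteriors: Dict[str, float], classes: Sequence[str]
) -> List[float]:
    """Posterior probabilities for all classes, in a fixed class order."""
    return [posteriors[c] for c in classes]


def basic_statistics_repr(posteriors: Dict[str, float]) -> list:
    """Base class and its posterior probability (assumed minimal variant)."""
    base_class = max(posteriors, key=posteriors.get)
    return [base_class, posteriors[base_class]]


def nn_distances_repr(
    target: Sequence[float],
    training: Sequence[Tuple[Sequence[float], str]],
    base_class: str,
) -> List[float]:
    """Distances to the nearest like and nearest unlike neighbour.

    Assumption: "like" means same class as the base class; Euclidean metric.
    """
    like = min(
        (math.dist(target, x) for x, y in training if y == base_class),
        default=math.inf,
    )
    unlike = min(
        (math.dist(target, x) for x, y in training if y != base_class),
        default=math.inf,
    )
    return [like, unlike]


# Made-up values for illustration only.
posteriors = {"spam": 0.8, "ham": 0.2}
training = [((3.0, 4.0), "spam"), ((1.0, 0.0), "ham"), ((6.0, 8.0), "spam")]

label = meta_class("spam", "ham")
prob_row = probability_distribution_repr(posteriors, ("ham", "spam"))
stats_row = basic_statistics_repr(posteriors)
nn_row = nn_distances_repr((0.0, 0.0), training, "spam")
```

Each representation yields the attribute part of one meta instance; the value returned by `meta_class` is attached as the shared meta class attribute. The original instances representation needs no code, since it reuses the original attributes unchanged.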
A meta-classifier can be trained on different class levels. A global meta-classifier approach learns one meta-classifier for all classes. A local meta-classifier approach learns one meta-classifier for each base class. Each local meta-classifier classifies only meta instances with one particular base class.
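The difference between the global and the local approach can be made concrete with a deliberately simple meta-learner. The sketch below is an assumption for illustration, not the learner used in the thesis: each meta-classifier is a single threshold on the posterior probability of the base class, and the names `fit_threshold`, `fit_global`, and `fit_local` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (base class, posterior probability of the base class, is_reliable)
MetaInstance = Tuple[str, float, bool]


def fit_threshold(data: List[MetaInstance]) -> float:
    """Pick t maximising training accuracy of 'Reliable iff p >= t'.

    Ties are resolved in favour of the lowest candidate threshold.
    """
    candidates = sorted({p for _, p, _ in data}) + [float("inf")]

    def n_correct(t: float) -> int:
        return sum((p >= t) == ok for _, p, ok in data)

    return max(candidates, key=n_correct)


def fit_global(data: List[MetaInstance]) -> float:
    """Global approach: one meta-classifier for all classes."""
    return fit_threshold(data)


def fit_local(data: List[MetaInstance]) -> Dict[str, float]:
    """Local approach: one meta-classifier per base class."""
    by_class: Dict[str, List[MetaInstance]] = defaultdict(list)
    for inst in data:
        by_class[inst[0]].append(inst)
    return {c: fit_threshold(insts) for c, insts in by_class.items()}


# Made-up meta instances for illustration only.
data = [
    ("spam", 0.9, True), ("spam", 0.8, True), ("spam", 0.6, False),
    ("ham", 0.7, True), ("ham", 0.6, True), ("ham", 0.55, False),
]

global_t = fit_global(data)
local_t = fit_local(data)

# A new base classification "spam" with posterior 0.6:
accepted_globally = 0.6 >= global_t
accepted_locally = 0.6 >= local_t["spam"]
```

On this toy data the global threshold accepts a "spam" classification with posterior 0.6, while the local "spam" meta-classifier rejects it, because each local meta-classifier sees only the meta instances of its own base class.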
The first applications of meta-classifiers are in the context of ensembles. The estimated reliability of a classification is used to choose the classifier(s) that will classify the instance [Seewald and Fürnkranz, 2001]. In these first meta-classifiers the original instances representation is used as metadata representation. In a later version of Seewald [2003], also the probability distribution representation is used. Real-world applications in which the meta-classifier approach is used include automatic text classification [Smirnov et al., 2003a] and
spam filtering [Delany et al., 2004]. The former used the probability distribution metadata representation; the latter used the nearest neighbour distances representation.
The meta-classifier approach is discussed in detail in the third chapter.
1.3 Problem Statement and Research Questions
In this thesis we look for an approach to reliable classification that can be used in real-world practical applications, i.e., we look for an approach that is sta-