Meta-classifier approach to reliable text classification
Date: 2026-01-21
A problem with automatic classifiers is that there is no way to know if a particular classification is just a guess or a certain answer. Reliable classification is the task of predicting whether a certain instance is correctly classified or not, i.e., a cl
Chapter 1
Introduction
This chapter is an introduction to the research topic of reliable classification. We describe the task of reliable classification in section 1.1. Various approaches to solving the problem of reliable classification are discussed in section 1.2. In section 1.3 our choice for the meta-classifier approach that we would like to investigate is motivated. Moreover, the problem statement and research questions are formulated. Finally, in section 1.4 an outline of the remaining part of the thesis is given.
1.1 Task of Reliable Classification
Knowing the limits of one’s own knowledge is a typical feature of human intelligence. When faced with a difficult problem, one can come to the conclusion that his knowledge is not sufficient to solve the problem and one can say “I do not know”. Computers are able to solve different kinds of difficult problems such as playing chess or making huge computations. But in contrast to humans, computers usually do not have the capability to recognize the limits of their knowledge.
Machine learning is concerned with developing classifiers that learn from experience or extract knowledge from examples in a database [Mitchell, 1997]. The classifiers that we study in this thesis learn from labelled instances and are able to classify unseen and unlabelled instances. However, these classifiers usually do not distinguish between lucky guesses and certain answers. They just classify all new instances, so that you can never be really sure if a particular classification is correct. In other words: the classifiers are not aware of the limits of their knowledge. That is the reason why classifiers are of limited use in many real-world applications. The costs of a misclassification are often high, especially in safety-critical domains such as medical diagnosis and air traffic control systems. Classifiers are therefore mostly used in applications where errors do not have far-reaching consequences, for example internet applications such as automatic document indexing, spam filtering, word sense disambiguation, and hierarchical categorization of Web pages [Sebastiani, 2002; Delany et al., 2004].
In this thesis we study the task of reliable classification. We define this task as a task of predicting whether a certain instance has been correctly classified,
1 Throughout the thesis we use ‘he’ or ‘his’ when both he/she and his/her are possible.
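To make the task definition concrete, the following is a minimal sketch of reliable classification on toy one-dimensional data. It is an illustration under stated assumptions and not the method developed in this thesis: the nearest-centroid base classifier, the margin feature, the threshold-based meta-classifier, and all function names and data values are hypothetical choices made only for this example.

```python
from statistics import mean

def train_centroids(X, y):
    """Base classifier (assumed for illustration): one centroid per class."""
    return {c: mean(x for x, lab in zip(X, y) if lab == c) for c in set(y)}

def base_predict(centroids, x):
    """Return the class whose centroid is closest to x."""
    return min(centroids, key=lambda c: abs(x - centroids[c]))

def margin(centroids, x):
    """Gap between the two smallest centroid distances; a small gap means
    the instance lies close to the decision boundary."""
    d = sorted(abs(x - m) for m in centroids.values())
    return d[1] - d[0]

def train_meta(centroids, X_val, y_val):
    """Meta-classifier (assumed for illustration): choose the margin threshold
    that best separates correct from incorrect base predictions on held-out data."""
    feats = [margin(centroids, x) for x in X_val]
    correct = [base_predict(centroids, x) == lab for x, lab in zip(X_val, y_val)]

    def agreement(t):
        # Number of held-out instances where "margin >= t" matches "base was correct".
        return sum((f >= t) == ok for f, ok in zip(feats, correct))

    return max(sorted(set(feats)), key=agreement)

def reliable_predict(centroids, threshold, x):
    """Return (predicted label, whether the prediction is judged reliable)."""
    return base_predict(centroids, x), margin(centroids, x) >= threshold

# Hypothetical toy data: class 'a' clusters near 1, class 'b' near 9.
X_train = [0.0, 1.0, 2.0, 8.0, 9.0, 10.0]
y_train = ['a', 'a', 'a', 'b', 'b', 'b']
X_val = [0.5, 4.5, 5.5, 9.5, 4.0, 6.0]
y_val = ['a', 'b', 'a', 'b', 'a', 'b']

centroids = train_centroids(X_train, y_train)
threshold = train_meta(centroids, X_val, y_val)

far_from_boundary = reliable_predict(centroids, threshold, 0.0)
near_boundary = reliable_predict(centroids, threshold, 5.2)
```

In this sketch the base classifier still labels every instance, while the meta-classifier adds a second prediction that states whether that label should be trusted. An instance far from the decision boundary is flagged as reliable, and one close to the boundary is flagged as a possible guess.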