Clustering using firefly algorithm Performance study(3)

Published: 2021-06-07

Firefly algorithm

166 J. Senthilnath et al. / Swarm and Evolutionary Computation 1 (2011) 164–171

where K is the number of clusters, $x_i$ (i = 1, ..., n) is the location of the ith of the n given patterns, and $c_k$ (k = 1, ..., K) is the kth cluster center, to be found by Eq. (6):

$$c_k = \frac{1}{n_k} \sum_{i \in C_k} x_i \tag{6}$$

where $n_k$ is the number of patterns in the kth cluster.
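As a minimal sketch of Eq. (6), the center of one cluster can be computed as the component-wise mean of its patterns. The function name `cluster_center` and the sample patterns are hypothetical and serve only as illustration.

```python
def cluster_center(patterns):
    """Return the component-wise mean of a non-empty list of patterns (Eq. 6)."""
    n_k = len(patterns)          # number of patterns in the cluster
    dim = len(patterns[0])       # dimension d of each pattern
    return [sum(p[d] for p in patterns) / n_k for d in range(dim)]


# Hypothetical cluster C_k containing three 2-D patterns.
c_k = cluster_center([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
print(c_k)  # [3.0, 4.0]
```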

Cluster analysis assigns a data set to clusters so that patterns are grouped into the same cluster based on some similarity measure [23]. Distance measurement is the most widely used way of evaluating similarities between patterns. The cluster centers are the decision variables, which are obtained by minimizing the sum of Euclidean distances, over all training set instances in the d-dimensional space, between a generic instance $x_i$ and the center of the cluster $c_k$. The cost (objective) function for the pattern i is given by Eq. (7), as in [9,14]:

$$f_i = \frac{1}{D_{Train}} \sum_{j=1}^{D_{Train}} d\left(x_j, p_i^{CL_{known}(x_j)}\right) \tag{7}$$

where $D_{Train}$ is the size of the training data set, which is used to normalize the sum so that any distance falls within [0.0, 1.0], and $p_i^{CL_{known}(x_j)}$ defines the class that the instance belongs to according to the database.

Note that in our FA algorithm, the decision variables are the cluster centers. The objective function in our FA algorithm is given by Eq. (7). In our study, we consider the 13 standard benchmark problems given in [14]. For a given data set, let n be the number of data points, d the dimension, and c the number of classes. A given data point belongs to only one of these c classes. Of the given data set, 75% is randomly selected to obtain the cluster centers using Eq. (7). In this way we obtain the cluster centers for all c classes. The remaining 25% of the data set (called the test data set) is used to obtain the classification error percentage (CEP). An illustrative example of this FA algorithm and its performance measure is given in the next section.
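The cost in Eq. (7) can be sketched as the mean Euclidean distance between each training instance and the candidate center of its known class. This is an illustrative reading of the equation, not the authors' implementation: the function name `cost`, the dictionary layout of `centers`, and the toy data are all assumptions.

```python
import math


def cost(train_x, train_labels, centers):
    """Mean distance from each training instance to the candidate center
    of its known class (Eq. 7). `centers` maps class label -> center."""
    total = 0.0
    for x, label in zip(train_x, train_labels):
        total += math.dist(x, centers[label])   # Euclidean distance d(., .)
    return total / len(train_x)                 # normalize by D_Train


# Hypothetical training set with two classes and one candidate solution.
train_x = [[0.0, 0.0], [0.0, 2.0], [4.0, 0.0]]
train_labels = [0, 0, 1]
centers = {0: [0.0, 1.0], 1: [4.0, 3.0]}
f = cost(train_x, train_labels, centers)
print(f)  # distances 1, 1 and 3 average to 5/3
```

In a firefly search, each firefly would encode one such set of candidate centers, and this value would be the quantity being minimized.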

4. Performance measures and an illustrative example

As discussed in the earlier section, the training data set is used to obtain the cluster centers. Using these cluster centers, the testing data set is classified and the classification performance is analyzed.

4.1. Performance evaluation

The performance of the knowledge extracted by the FA in the form of cluster centers is evaluated using the Classification Error Percentage (CEP) and the classification efficiency. CEP depends only on the test data, whereas the classification efficiency depends on both the training and the testing data.

4.1.1. Classification Error Percentage (CEP)

CEP is obtained using only the test data [9]. For each problem, we report the CEP, which is the percentage of incorrectly classified patterns of the test data sets as given in [9], to make a reliable comparison.

Each pattern is classified by assigning it to the class whose cluster center is closest. The classified output is then compared with the desired output, and if they are not exactly the same, the pattern is marked as misclassified [9]. This procedure is applied to all test data, and the total number of misclassified patterns is expressed as a percentage of the size of the test data set, which is given by

$$\mathrm{CEP} = \frac{\text{number of misclassified samples}}{\text{total size of test data set}} \times 100. \tag{8}$$
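The nearest-center rule and Eq. (8) can be sketched together as follows. The helper names `classify` and `cep`, the centers, and the four test patterns are hypothetical; the centers simply reuse round numbers for illustration and are not results from the paper.

```python
import math


def classify(x, centers):
    """Assign x to the class whose cluster center is closest."""
    return min(centers, key=lambda label: math.dist(x, centers[label]))


def cep(test_x, test_labels, centers):
    """Classification Error Percentage over the test set (Eq. 8)."""
    wrong = sum(1 for x, y in zip(test_x, test_labels)
                if classify(x, centers) != y)
    return 100.0 * wrong / len(test_x)


# Hypothetical centers and test patterns; the third pattern lies closer
# to the center of class 2 although its desired class is 1.
centers = {1: [8.0, 8.0], 2: [16.0, 16.0]}
test_x = [[7.0, 9.0], [15.0, 17.0], [13.0, 13.0], [9.0, 8.0]]
test_labels = [1, 2, 1, 1]
error = cep(test_x, test_labels, centers)
print(error)  # 25.0
```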

Fig. 1. Data distribution. (Scatter plot of the training and testing data for Class 1 and Class 2; axes x and y.)

4.1.2. Classification efficiency

Classification efficiency is obtained using both the training and the test data. The classification matrix is used to obtain the statistical measures for the class-level performance (individual efficiency) and the global performance (average and overall efficiency) of the classifier [24]. The individual efficiency is indicated by the percentage classification, which tells us how many samples belonging to a particular class have been correctly classified. The percentage classification ($\eta_i$) for the class $c_i$ is given by Eq. (9).

$$\eta_i = \frac{q_{ii}}{\sum_{j=1}^{n} q_{ji}} \tag{9}$$

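where $q_{ii}$ is the number of correctly classified samples and n is the number of samples for the class $c_i$ in the data set. The global performance measures are the average ($\eta_a$) and overall ($\eta_o$) classification, which are defined as

$$\eta_a = \frac{1}{n_c} \sum_{i=1}^{n_c} \eta_i \tag{10}$$

$$\eta_o = \frac{1}{N} \sum_{i=1}^{n_c} q_{ii} \tag{11}$$

where $n_c$ is the total number of classes and N is the number of patterns.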
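A rough sketch of Eqs. (9) to (11) from a classification matrix is given below. It assumes an indexing convention the text does not spell out: `q[j][i]` counts samples whose true class is i and whose assigned class is j, so the denominator of Eq. (9) is the total number of samples of class i. The function name `efficiencies` and the matrix values are hypothetical.

```python
def efficiencies(q):
    """Return (individual, average, overall) efficiencies as fractions.

    Assumed convention: q[j][i] = number of samples of true class i
    that were assigned to class j.
    """
    n_c = len(q)                                   # number of classes
    N = sum(sum(row) for row in q)                 # total number of patterns
    eta = [q[i][i] / sum(q[j][i] for j in range(n_c)) for i in range(n_c)]
    eta_a = sum(eta) / n_c                         # Eq. (10)
    eta_o = sum(q[i][i] for i in range(n_c)) / N   # Eq. (11)
    return eta, eta_a, eta_o


# Hypothetical two-class matrix with unequal class sizes (50 and 150).
q = [[45, 20],
     [5, 130]]
eta, eta_a, eta_o = efficiencies(q)
print(eta, eta_a, eta_o)
```

With unequal class sizes the average and overall values differ, because the average weights every class equally while the overall value weights every pattern equally. Multiplying by 100 gives percentages.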

4.2. Illustrative example

We illustrate how the Firefly Algorithm (FA) is used for clustering with the following synthetic data. Although the proposed algorithm can be used for any type of mixture model, we focus on a Gaussian mixture. Let us consider two Gaussian mixtures that have two input features, namely x and y. Here, the mean values $\mu_1 = [8, 8]^T$ and $\mu_2 = [16, 16]^T$ and the covariance matrix (x, y) = {(6, 3); (3, 2)} are assumed, and each class has an equal number of samples. In our experimentation, 100 samples are generated randomly for each class. Of these, 75 data points are used for training and the remaining 25 are used for testing in each class. The synthetic data generated is shown in Fig. 1.
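Data of this kind could be generated roughly as follows. The sketch assumes that {(6, 3); (3, 2)} denotes the 2x2 covariance matrix [[6, 3], [3, 2]] and that it is shared by both classes; the fixed seed, the helper name `sample_gaussian_2d`, and the sampling via a hand-written Cholesky factor are illustration choices, not details taken from the paper.

```python
import math
import random


def sample_gaussian_2d(mean, cov, n, rng):
    """Draw n points from a 2-D Gaussian using the Cholesky factor of cov."""
    a, b, c = cov[0][0], cov[0][1], cov[1][1]
    l11 = math.sqrt(a)                 # lower-triangular factor L, cov = L L^T
    l21 = b / l11
    l22 = math.sqrt(c - l21 * l21)
    points = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        points.append([mean[0] + l11 * z1,
                       mean[1] + l21 * z1 + l22 * z2])
    return points


rng = random.Random(0)                 # fixed seed: an assumption, for repeatability
cov = [[6.0, 3.0], [3.0, 2.0]]         # assumed reading of {(6, 3); (3, 2)}
means = {1: [8.0, 8.0], 2: [16.0, 16.0]}

train, test = {}, {}
for label, mu in means.items():
    pts = sample_gaussian_2d(mu, cov, 100, rng)    # 100 samples per class
    rng.shuffle(pts)
    train[label], test[label] = pts[:75], pts[75:] # 75 train / 25 test

print(len(train[1]), len(test[1]), len(train[2]), len(test[2]))
```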

We use the firefly algorithm on the training data to obtain cluster centers. Let $x_i$ be one of the solutions (cluster centers) and $J_i$ be the objective function value for this cluster center.

We consider a population of 5 fireflies at locations $x_1$, $x_2$, $x_3$, $x_4$ and $x_5$ within the 2d-dimensional search space. Now evaluate the fitness of the population, $J_1$, $J_2$, $J_3$, $J_4$ and $J_5$, using Eq. (7), which is directly proportional to the light intensities $I_1$, $I_2$, $I_3$, $I_4$ and $I_5$. Now compare the intensity values of the fireflies: if ($I_2 < I_1$), then move firefly 2 toward firefly 1 using Eq. (4); similarly compare all the agents
