Recommender Systems: The Netflix Prize-Winning Algorithm (9)

Published: 2021-06-07

The algorithm that won the Netflix recommender system grand prize

representation). The movie representation is still based on the timeSVD++ model. The resulting RMSE is also 0.8661.

Finally, we added k-NN features on top of the timeSVD++ features. That is, for each u-i pair, we found the top 20 movies most similar to i, which were rated by u. We added the movie scores, each multiplied by their respective similarities, as additional features. Similarities here were shrunk Pearson correlations [1]. This slightly reduces the RMSE to 0.8660.

Another usage of GBDT is for solving a regression problem per movie. For each user we computed a 50-D characteristic vector formed by the values of the 50 hidden units of a respective RBM. Then, for each movie we used GBDT for solving the regression problem of linking the 50-D user vectors to the true user ratings of the movie. The result, with RMSE=0.9248, will be denoted as [PQ7] in the following description.
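The k-NN feature construction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the shrinkage form `n / (n + shrinkage)`, the shrinkage value, the use of raw ratings as the "movie scores", the zero padding, and the names `shrunk_pearson`, `knn_features`, and `by_movie` are all assumptions made for the example.

```python
from math import sqrt


def shrunk_pearson(ratings_i, ratings_j, shrinkage=100.0):
    """Pearson correlation over common raters, shrunk toward zero.

    ratings_i, ratings_j: dicts mapping user -> rating for two movies.
    The factor n / (n + shrinkage) is an assumed shrinkage form.
    """
    common = set(ratings_i) & set(ratings_j)
    n = len(common)
    if n < 2:
        return 0.0
    xs = [ratings_i[u] for u in common]
    ys = [ratings_j[u] for u in common]
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0.0 or vy == 0.0:
        return 0.0
    return (n / (n + shrinkage)) * cov / sqrt(vx * vy)


def knn_features(user, movie, by_movie, k=20, shrinkage=100.0):
    """Features for one (user, movie) pair: similarity * score for the
    k movies most similar to `movie` among those rated by `user`,
    padded with zeros when the user rated fewer than k other movies."""
    scored = []
    for other, raters in by_movie.items():
        if other == movie or user not in raters:
            continue
        sim = shrunk_pearson(by_movie[movie], raters, shrinkage)
        scored.append((sim, raters[user]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    feats = [sim * score for sim, score in scored[:k]]
    return feats + [0.0] * (k - len(feats))


# Hypothetical toy data: movie -> {user: rating}.
by_movie = {
    "A": {"u1": 5, "u2": 3, "u3": 1},
    "B": {"u1": 4, "u2": 3, "u3": 2, "u4": 4},
    "C": {"u1": 1, "u2": 3, "u3": 5, "u4": 2},
}
features = knn_features("u4", "A", by_movie, k=2, shrinkage=1.0)
print(features)  # [3.0, -1.5]
```

In the setting described above, these values would be appended to the timeSVD++ features of the same u-i pair before training the GBDT.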

B. List of BellKor's Probe-Qualifying pairs

We list the 24 BellKor predictors which participated in the GBDT blending. Notice that many more of our predictors are in the final blend of Qualifying results (as mentioned earlier in this article). However, only for those listed below we possess corresponding Probe results, which require extra computational resources to fully re-train the model while excluding the Probe set from the training set.

Post Progress Prize 2008 predictors

Those were mentioned earlier in this document:

1) PQ1
2) PQ2
3) PQ3
4) PQ4
5) PQ5
6) PQ6
7) PQ7

Progress Prize 2008 predictors

The following is based on our notation in [3]:

8) SVD++(1) (f=200)
9) Integrated (f=100, k=300)
10) SVD++(3) (f=500)
11) First neighborhood model of Sec. 2.2 of [3] (RMSE=0.9002)
12) A neighborhood model mentioned towards the end of Sec. 2.2 of [3] (RMSE=0.8914)

Progress Prize 2007 predictors

The following is based on our notation in [2]:

13) Predictor #40
14) Predictor #35
15) Predictor #67
16) Predictor #75
17) NNMF (60 factors) with adaptive user factors
18) Predictor #81
19) Predictor #73
20) 100 neighbors User-kNN on residuals of all global effects but the last 4
21) Predictor #85
22) Predictor #45
23) Predictor #83
24) Predictor #106

One last predictor with RMSE=0.8713 is in the final blend. It is based on the blending technique described in page 12 of [3]. The technique was applied to the four predictors indexed above by: 2, 9, 12, and 13.
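To illustrate why each predictor in this section needs both Probe and Qualifying results, here is a minimal sketch of a boosted-tree blend: the blender is fit on Probe predictions, where true ratings are known, and then applied to the matching Qualifying predictions. It is a toy under stated assumptions, not the blender used for the actual submission. It uses depth-1 trees (stumps) with squared loss, and the tree count, learning rate, data, and all function names are hypothetical.

```python
def fit_stump(X, residuals):
    """Best single split (feature, threshold, left_value, right_value)
    by squared error, or None if no feature has two distinct values."""
    best = None
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[f] <= thr]
            right = [r for row, r in zip(X, residuals) if row[f] > thr]
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lv) ** 2 for r in left)
                   + sum((r - rv) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, f, thr, lv, rv)
    return best[1:] if best else None


def fit_gbdt(X, y, n_trees=100, learning_rate=0.1):
    """Gradient boosting for squared loss: each stump fits the
    current residuals and is added with a small step size."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [t - p for t, p in zip(y, pred)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        f, thr, lv, rv = stump
        stumps.append(stump)
        pred = [p + learning_rate * (lv if row[f] <= thr else rv)
                for row, p in zip(X, pred)]
    return base, learning_rate, stumps


def predict_gbdt(model, X):
    base, lr, stumps = model
    out = []
    for row in X:
        p = base
        for f, thr, lv, rv in stumps:
            p += lr * (lv if row[f] <= thr else rv)
        out.append(p)
    return out


def rmse(pred, truth):
    return (sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)) ** 0.5


# Hypothetical data: each row holds two predictors' outputs for one rating.
probe_X = [[3.1, 2.8], [4.2, 4.6], [1.9, 2.4], [4.8, 4.4], [2.6, 3.3], [3.9, 3.5]]
probe_y = [3.0, 5.0, 2.0, 5.0, 3.0, 4.0]       # true Probe ratings
qualifying_X = [[4.0, 4.1], [2.0, 2.5]]        # same predictors, Qualifying set

model = fit_gbdt(probe_X, probe_y)
blend_probe = predict_gbdt(model, probe_X)
blend_qualifying = predict_gbdt(model, qualifying_X)
```

The columns of `probe_X` and `qualifying_X` must come from the same predictors in the same order, which is why only predictors with a complete Probe-Qualifying pair can take part in this kind of blend.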

VIII. CONCLUDING REMARKS

Granting the grand prize celebrates the conclusion of the Netflix Prize competition. Wide participation, extensive press coverage and many publications all reflect the immense success of the competition. Dealing with movies, a subject close to the hearts of many, was definitely a good start. Yet, much could go wrong, but did not, thanks to several enabling factors.

The first success factor is on the organizational side: Netflix. They did a great service to the field by releasing a precious dataset, an act which is so rare, yet courageous and important to the progress of science. Beyond this, both design and conduct of the competition were flawless and non-trivial. For example, the size of the data was right on target: much larger and more representative than comparable datasets, yet small enough to make the competition accessible to anyone with a commodity PC. As another example, I would mention the split of the test set into three parts: Probe, Quiz, and Test, which was essential to ensure the fairness of the competition. Despite being planned well ahead, it proved to be a decisive factor at the very last minute of the competition, three years later.

The second success factor is the wide engagement of many competitors. This created positive buzz, leading to further enrollment of many more. Much was said and written on the collaborative spirit of the competitors, who openly published and discussed their innovations on the web forum and through scientific publications. The feeling was of a big community progressing together, making the experience more enjoyable and efficient for all participants. In fact, this facilitated the nature of the competition, which proceeded like a long marathon, rather than a series of short sprints.

Another helpful factor was some touch of luck. The most prominent one is the choice of the 10% improvement goal. Any small deviation from this number would have made the competition either too easy or impossibly difficult. In addition, the goddess of luck ensured most suspenseful finish lines in both the 2007 Progress Prize and the 2009 Grand Prize, matching the best sports events.

The science of recommender systems is a prime beneficiary of the contest. Many new people became involved in the field and made their contributions. There is a clear spike in related publications, and the Netflix dataset is the direct catalyst to developing some of the better algorithms known in the field. Out of the numerous new algorithmic contributions, I would like to highlight one: those humble baseline predictors (or biases), which capture main effects in the data. While the literature mostly concentrates on the more sophisticated algorithmic aspects, we have learned that an accurate treatment of main effects is probably at least as significant as coming up with modeling breakthroughs.
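As a reminder of what such a baseline predictor looks like, the sketch below estimates a rating as the global mean plus a user bias plus a movie bias, using regularized averages. It is a simplified illustration under assumptions: the regularization constants, the estimation order (movie biases first, then user biases), and the names `fit_baselines` and `predict_baseline` are chosen for the example and are not taken from this section.

```python
from collections import defaultdict


def fit_baselines(ratings, reg_item=25.0, reg_user=10.0):
    """ratings: list of (user, item, rating) triples.
    Returns (mu, b_user, b_item) for the model mu + b_user[u] + b_item[i].
    reg_item and reg_user are assumed shrinkage constants."""
    mu = sum(r for _, _, r in ratings) / len(ratings)

    # Movie bias: shrunk average deviation from the global mean.
    item_sum, item_cnt = defaultdict(float), defaultdict(int)
    for _, i, r in ratings:
        item_sum[i] += r - mu
        item_cnt[i] += 1
    b_item = {i: item_sum[i] / (reg_item + item_cnt[i]) for i in item_sum}

    # User bias: shrunk average of what remains after the movie bias.
    user_sum, user_cnt = defaultdict(float), defaultdict(int)
    for u, i, r in ratings:
        user_sum[u] += r - mu - b_item[i]
        user_cnt[u] += 1
    b_user = {u: user_sum[u] / (reg_user + user_cnt[u]) for u in user_sum}
    return mu, b_user, b_item


def predict_baseline(mu, b_user, b_item, user, item):
    """Unknown users or movies fall back to a zero bias."""
    return mu + b_user.get(user, 0.0) + b_item.get(item, 0.0)


# Hypothetical toy data; regularization switched off to keep numbers simple.
toy = [("u1", "A", 5), ("u1", "B", 3), ("u2", "A", 4), ("u2", "B", 2)]
mu, b_user, b_item = fit_baselines(toy, reg_item=0.0, reg_user=0.0)
print(predict_baseline(mu, b_user, b_item, "u1", "A"))  # 5.0
```

More elaborate models are then typically fit on top of these main effects rather than on raw ratings.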

Finally, we were lucky to win this competition, but recognize the important contributions of the many other contestants,
