Data Mining: Concepts and Techniques, End-of-Chapter Exercise Solutions
Date: 2025-05-01
Data mining, data analysis, machine algorithms
Data Mining: Concepts and Techniques
2nd Edition
Solution Manual
Jiawei Han and Micheline Kamber
The University of Illinois at Urbana-Champaign
© Morgan Kaufmann, 2006
Note: For Instructors' reference only. Do not copy! Do not distribute!
Contents
1 Introduction
1.11 Exercises
2 Data Preprocessing
2.8 Exercises
3 Data Warehouse and OLAP Technology: An Overview
3.7 Exercises
4 Data Cube Computation and Data Generalization
4.5 Exercises
5 Mining Frequent Patterns, Associations, and Correlations
5.7 Exercises
6 Classification and Prediction
6.17 Exercises
7 Cluster Analysis
7.13 Exercises
8 Mining Stream, Time-Series, and Sequence Data
8.6 Exercises
9 Graph Mining, Social Network Analysis, and Multirelational Data Mining
9.5 Exercises
10 Mining Object, Spatial, Multimedia, Text, and Web Data
10.7 Exercises
11 Applications and Trends in Data Mining
11.7 Exercises
Chapter 1
Introduction
1.11 Exercises
1.1. What is data mining? In your answer, address the following:
(a) Is it another hype?
(b) Is it a simple transformation of technology developed from databases, statistics, and machine learning?
(c) Explain how the evolution of database technology led to data mining.
(d) Describe the steps involved in data mining when viewed as a process of knowledge discovery.
Answer:
Data mining refers to the process or method that extracts or "mines" interesting knowledge or patterns from large amounts of data.
(a) Is it another hype?
Data mining is not another hype. Instead, the need for data mining has arisen due to the wide availability of huge amounts of data and the imminent need for turning such data into useful information and knowledge. Thus, data mining can be viewed as the result of the natural evolution of information technology.
(b) Is it a simple transformation of technology developed from databases, statistics, and machine learning?
No. Data mining is more than a simple transformation of technology developed from databases, statistics, and machine learning. Instead, data mining involves an integration, rather than a simple transformation, of techniques from multiple disciplines such as database technology, statistics, machine learning, high-performance computing, pattern recognition, neural networks, data visualization, information retrieval, image and signal processing, and spatial data analysis.
(c) Explain how the evolution of database technology led to data mining.
Database technology began with the development of data collection and database creation mechanisms that led to the development of effective mechanisms for data management including data storage and retrieval, and query and transaction processing. The large number of database systems offering query and transaction processing eventually and naturally led to the need for data analysis and understanding. Hence, data mining began its development out of this necessity.
(d) Describe the steps involved in data mining when viewed as a process of knowledge discovery.
The steps involved in data mining when viewed as a process of knowledge discovery are as follows:
• Data cleaning, a process that removes or transforms noise and inconsistent data
• Data integration, where multiple data sources may be combined
• Data selection, where data relevant to the analysis task are retrieved from the database
• Data transformation, where data are transformed or consolidated into forms appropriate for mining
• Data mining, an essential process where intelligent and efficient methods are applied in order to extract patterns
• Pattern evaluation, a process that identifies the truly interesting patterns representing knowledge based on some interestingness measures
• Knowledge presentation, where visualization and knowledge representation techniques are used to present the mined knowledge to the user
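The knowledge discovery steps listed in this answer can be sketched as a small Python pipeline. This is only an illustration under stated assumptions: the two toy data sources, the field names, the pair-counting "mining" step, and the support threshold are all hypothetical and are not taken from the textbook.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy sources: transaction records from two stores.
store_a = [
    {"id": 1, "items": ["milk", "bread"], "region": "north"},
    {"id": 2, "items": ["milk", None, "eggs"], "region": "north"},  # missing value
]
store_b = [
    {"id": 3, "items": ["bread", "milk"], "region": "south"},
    {"id": 4, "items": ["Milk ", "bread", "eggs"], "region": "north"},  # inconsistent spelling
]

def clean(records):
    """Data cleaning: drop missing items and normalize inconsistent spellings."""
    return [
        {**r, "items": [i.strip().lower() for i in r["items"] if i]}
        for r in records
    ]

def integrate(*sources):
    """Data integration: combine several sources into one collection."""
    return [r for src in sources for r in src]

def select(records, region):
    """Data selection: keep only the records relevant to the analysis task."""
    return [r for r in records if r["region"] == region]

def transform(records):
    """Data transformation: reduce each record to a set of items."""
    return [frozenset(r["items"]) for r in records]

def mine(transactions):
    """Data mining: count how often each pair of items occurs together."""
    counts = Counter()
    for t in transactions:
        counts.update(combinations(sorted(t), 2))
    return counts

def evaluate(counts, n, min_support):
    """Pattern evaluation: keep pairs whose support meets the threshold."""
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}

def present(patterns):
    """Knowledge presentation: format the surviving patterns for the user."""
    return [f"{a} + {b}: support {s:.2f}" for (a, b), s in sorted(patterns.items())]

def run_pipeline(min_support=0.6):
    records = integrate(clean(store_a), clean(store_b))
    records = select(records, "north")
    transactions = transform(records)
    patterns = evaluate(mine(transactions), len(transactions), min_support)
    return present(patterns)

for line in run_pipeline():
    print(line)
```

Each function corresponds to one step of the list, so the order of calls in run_pipeline mirrors the order in which the answer presents the steps. Real systems often iterate between these steps rather than running them once in sequence.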
1.2. Present an example where data mining is crucial to the success of a business. What data mining functions does this business need? Can they be performed alternatively by data query processing or simple statistical analysis?
Answer:
A department store, for example, can use data mining to assist with its target marketing mail campaign. Using data mining functions such as association, the store can use the mined strong association rules to determine which products bought by one group of customers are likely to lead to the buying of certain other products. With this information, the store can then mail marketing materials only to those kinds of customers who exhibit a high likelihood of purchasing additional products. Data query processing is used for data or information retrieval and does not have the means for finding association rules. Similarly, simple statistical analysis cannot handle large amounts of data such as those of customer records in a department store.
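To make the idea of "strong association rules" concrete, the following Python sketch computes support and confidence for single-item rules over a handful of baskets. The basket contents, the thresholds, and the function names are hypothetical choices for illustration; a production system would use a dedicated frequent-pattern algorithm rather than this brute-force enumeration.

```python
from itertools import permutations

# Hypothetical purchase baskets, one set of products per customer visit.
baskets = [
    {"printer", "ink", "paper"},
    {"printer", "ink"},
    {"printer", "paper"},
    {"ink", "paper"},
    {"printer", "ink", "cable"},
]

def count(itemset, baskets):
    """Number of baskets that contain every item in `itemset`."""
    return sum(itemset <= basket for basket in baskets)

def strong_rules(baskets, min_support, min_confidence):
    """Return rules lhs -> rhs (single items) that meet both thresholds."""
    items = sorted(set().union(*baskets))
    rules = []
    for lhs, rhs in permutations(items, 2):
        n_both = count({lhs, rhs}, baskets)
        support = n_both / len(baskets)
        if support < min_support:
            continue
        # Confidence: among baskets containing lhs, the share that also contain rhs.
        confidence = n_both / count({lhs}, baskets)
        if confidence >= min_confidence:
            rules.append((lhs, rhs, support, confidence))
    return rules

rules = strong_rules(baskets, min_support=0.5, min_confidence=0.7)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}  support={support:.2f}  confidence={confidence:.2f}")
```

A rule such as "printer -> ink" that clears both thresholds suggests which customers to target: those who bought the left-hand item but not yet the right-hand one. A plain retrieval query can list those customers once the rule is known, but it does not discover the rule itself.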
1.3. Suppose your task as a software engineer at Big-University is to design a data mining system to examine their university course database, which contains the following information: the name, address, and status (e.g., undergraduate or graduate) of each student, the courses taken, and their cumulative grade point average (GPA). Describe the architecture you would choose. What is the purpose of each component of this architecture?
Answer:
A data mining architecture that can be used for this application would consist of the following major components:
• A database, data warehouse, or other information repository, which consists of the set of databases, data warehouses, spreadsheets, or other kinds of information repositories containing the student and course information.
• A database or data warehouse se ……