Data Mining: Concepts and Techniques, Answers to End-of-Chapter Exercises

Date: 2025-05-01

Data mining, data analysis, machine algorithms

Data Mining: Concepts and Techniques

2nd Edition

Solution Manual

Jiawei Han and Micheline Kamber

The University of Illinois at Urbana-Champaign

© Morgan Kaufmann, 2006

Note: For Instructors’ reference only. Do not copy! Do not distribute!

Contents

1 Introduction
1.11 Exercises

2 Data Preprocessing
2.8 Exercises

3 Data Warehouse and OLAP Technology: An Overview
3.7 Exercises

4 Data Cube Computation and Data Generalization
4.5 Exercises

5 Mining Frequent Patterns, Associations, and Correlations
5.7 Exercises

6 Classification and Prediction
6.17 Exercises

7 Cluster Analysis
7.13 Exercises

8 Mining Stream, Time-Series, and Sequence Data
8.6 Exercises

9 Graph Mining, Social Network Analysis, and Multirelational Data Mining
9.5 Exercises

10 Mining Object, Spatial, Multimedia, Text, and Web Data
10.7 Exercises

11 Applications and Trends in Data Mining
11.7 Exercises

Chapter 1

Introduction

1.11 Exercises

1.1. What is data mining? In your answer, address the following:

(a) Is it another hype?

(b) Is it a simple transformation of technology developed from databases, statistics, and machine learning?

(c) Explain how the evolution of database technology led to data mining.

(d) Describe the steps involved in data mining when viewed as a process of knowledge discovery.

Answer:

Data mining refers to the process or method that extracts or “mines” interesting knowledge or patterns from large amounts of data.

(a) Is it another hype?

Data mining is not another hype. Instead, the need for data mining has arisen due to the wide availability of huge amounts of data and the imminent need for turning such data into useful information and knowledge. Thus, data mining can be viewed as the result of the natural evolution of information technology.

(b) Is it a simple transformation of technology developed from databases, statistics, and machine learning?

No. Data mining is more than a simple transformation of technology developed from databases, statistics, and machine learning. Instead, data mining involves an integration, rather than a simple transformation, of techniques from multiple disciplines such as database technology, statistics, machine learning, high-performance computing, pattern recognition, neural networks, data visualization, information retrieval, image and signal processing, and spatial data analysis.

(c) Explain how the evolution of database technology led to data mining.

Database technology began with the development of data collection and database creation mechanisms that led to the development of effective mechanisms for data management including data storage and retrieval, and query and transaction processing. The large number of database systems offering query and transaction processing eventually and naturally led to the need for data analysis and understanding. Hence, data mining began its development out of this necessity.

(d) Describe the steps involved in data mining when viewed as a process of knowledge discovery.

The steps involved in data mining when viewed as a process of knowledge discovery are as follows:

• Data cleaning, a process that removes or transforms noise and inconsistent data

• Data integration, where multiple data sources may be combined

• Data selection, where data relevant to the analysis task are retrieved from the database

• Data transformation, where data are transformed or consolidated into forms appropriate for mining

• Data mining, an essential process where intelligent and efficient methods are applied in order to extract patterns

• Pattern evaluation, a process that identifies the truly interesting patterns representing knowledge based on some interestingness measures

• Knowledge presentation, where visualization and knowledge representation techniques are used to present the mined knowledge to the user
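
The knowledge discovery steps listed above can be read as stages of a pipeline. The following is a minimal Python sketch of that reading, not a method given in the manual. The purchase records, the function names, and the `min_count` threshold are all hypothetical and were chosen only to make each step visible.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: two "sources" of (customer, item) purchase records.
store_a = [("c1", "bread"), ("c1", "milk"), ("c2", "bread"), ("c2", None)]
store_b = [("c3", "Bread"), ("c3", "Milk"), ("c4", "milk"), ("c4", "eggs")]

def clean(records):
    """Data cleaning: drop records with a missing item, normalize case."""
    return [(cust, item.strip().lower()) for cust, item in records if item]

def integrate(*sources):
    """Data integration: combine several sources into one record list."""
    return [rec for src in sources for rec in src]

def select(records, items_of_interest):
    """Data selection: keep only records relevant to the analysis task."""
    return [rec for rec in records if rec[1] in items_of_interest]

def transform(records):
    """Data transformation: consolidate records into one basket per customer."""
    baskets = {}
    for cust, item in records:
        baskets.setdefault(cust, set()).add(item)
    return baskets

def mine(baskets):
    """Data mining: count how many baskets contain each pair of items."""
    counts = Counter()
    for items in baskets.values():
        counts.update(combinations(sorted(items), 2))
    return counts

def evaluate(pair_counts, min_count):
    """Pattern evaluation: keep pairs that meet a minimum co-occurrence count."""
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}

def present(patterns):
    """Knowledge presentation: render the kept patterns as readable lines."""
    return [f"{a} and {b} bought together by {n} customers"
            for (a, b), n in sorted(patterns.items())]

records = integrate(clean(store_a), clean(store_b))
relevant = select(records, {"bread", "milk", "eggs"})
baskets = transform(relevant)
patterns = evaluate(mine(baskets), min_count=2)
report = present(patterns)
print("\n".join(report))
```

In this sketch the interestingness measure is a plain co-occurrence count. A real system would likely use measures such as support and confidence, and would present results through visualization rather than printed text.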

1.2. Present an example where data mining is crucial to the success of a business. What data mining functions does this business need? Can they be performed alternatively by data query processing or simple statistical analysis?

Answer:

A department store, for example, can use data mining to assist with its target marketing mail campaign. Using data mining functions such as association, the store can use the mined strong association rules to determine which products bought by one group of customers are likely to lead to the buying of certain other products. With this information, the store can then mail marketing materials only to those kinds of customers who exhibit a high likelihood of purchasing additional products. Data query processing is used for data or information retrieval and does not have the means for finding association rules. Similarly, simple statistical analysis cannot handle large amounts of data such as those of customer records in a department store.
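
To make the notion of a strong association rule concrete, here is a minimal Python sketch that computes support and confidence for single-item rules A -> B by brute-force enumeration. The transactions and both thresholds are hypothetical. Brute-force enumeration is used only for illustration and is not an efficient frequent-pattern algorithm such as Apriori.

```python
from itertools import permutations

# Hypothetical transactions for illustration; not data from the text.
transactions = [
    {"printer", "paper", "ink"},
    {"printer", "ink"},
    {"paper", "pens"},
    {"printer", "paper", "ink"},
    {"pens"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def strong_rules(transactions, min_support, min_confidence):
    """Return rules (a, b, support, confidence) for a -> b meeting both thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in permutations(items, 2):
        sup_ab = support({a, b}, transactions)
        sup_a = support({a}, transactions)
        if sup_a == 0 or sup_ab < min_support:
            continue
        # confidence(a -> b) = support(a and b) / support(a)
        conf = sup_ab / sup_a
        if conf >= min_confidence:
            rules.append((a, b, sup_ab, conf))
    return rules

rules = strong_rules(transactions, min_support=0.5, min_confidence=0.9)
for a, b, sup, conf in rules:
    print(f"{a} -> {b}  support={sup:.2f}  confidence={conf:.2f}")
```

A rule such as printer -> ink would let the store mail ink offers only to customers who bought a printer but not yet ink. An ordinary retrieval query can list those customers once the rule is known, but it does not discover the rule itself.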

1.3. Suppose your task as a software engineer at Big-University is to design a data mining system to examine their university course database, which contains the following information: the name, address, and status (e.g., undergraduate or graduate) of each student, the courses taken, and their cumulative grade point average (GPA). Describe the architecture you would choose. What is the purpose of each component of this architecture?

Answer:

A data mining architecture that can be used for this application would consist of the following major components:

• A database, data warehouse, or other information repository, which consists of the set of databases, data warehouses, spreadsheets, or other kinds of information repositories containing the student and course information.

• A database or data warehouse se…
