Estimating the quality of data in relational databases
Published: 2021-06-07
Amihai Motro and Igor Rakov
Department of Information and Software Systems Engineering
George Mason University
Fairfax, VA 22030-4444
{ami,irakov}@gmu.edu
Abstract
With more and more electronic information sources becoming widely available, the issue of the quality of these, often-competing, sources has become germane. We propose a standard for rating information sources with respect to their quality. An important consideration is that the quality of information sources often varies considerably when specific areas within these sources are considered. This implies that the assignment of a single rating of quality to an information source is usually unsatisfactory. Of course, to the user of an information source the overall quality of the source may not be as important as the quality of the specific information that this user is extracting from the source. Therefore, methods must be developed that will derive reliable estimates of the quality of the information provided to users, from the quality specifications that have been assigned to the sources. Our work here bears on all these concerns. We describe an approach that uses dual quality measures that gauge the distance of the information in a database from the truth. We then propose to combine manual verification with statistical methods to arrive at useful estimates of the quality of databases. We consider the variance in quality by isolating areas of databases that are homogeneous with respect to quality, and then estimating the quality of each separate area. These composite estimates may be regarded as quality specifications that will be affixed to each database. Finally, we show how to derive quality estimates for individual queries from such quality specifications.
This work was supported in part by DARPA grants N0014-92-J-4038 and N0060-96-D-3202.