Improving data quality in data warehousing applications.

Li, Lin, Peng, Taoxin and Kennedy, Jessie (2010) Improving data quality in data warehousing applications. In: Proceedings of the 12th International Conference on Enterprise Information Systems. SciTePress, Funchal, Madeira - Portugal, pp. 379-382. ISBN 978-989-8425-04-1

Full text not available from this repository.

Abstract/Description

There is growing awareness that high-quality data is key to today’s business success, and that dirty data existing within data sources is one of the causes of poor data quality. To ensure high quality, enterprises need processes, methodologies and resources to monitor and analyze the quality of their data, as well as methodologies for preventing and/or detecting and repairing dirty data. In practice, however, detecting and cleaning all the dirty data that exists in all data sources is expensive and unrealistic, so most enterprises need to consider the cost of cleaning. An organization that intends to clean its data warehouse therefore faces the problem of how to select the most important data to clean according to its business requirements. In this paper, business rules are used to classify dirty data types based on data quality dimensions. The proposed method helps to solve this problem by allowing users to select the appropriate group of dirty data types according to the priority of their business requirements. It also provides guidelines for measuring data quality with respect to different data quality dimensions, and will be helpful for the development of data cleaning tools.

Item Type: Book Section
ISBN: 978-989-8425-04-1
Uncontrolled Keywords: Data quality; dirty data; data cleaning tools; data warehousing
University Divisions/Research Centres: Faculty of Engineering, Computing and Creative Industries > School of Computing
Dewey Decimal Subjects: 000 Computer science, information & general works > 000 Computer science, knowledge & systems > 005 Computer programming, programs & data
Library of Congress Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Item ID: 3886
Depositing User: Computing Research
Date Deposited: 04 Feb 2011 16:31
Last Modified: 11 Jun 2012 10:22
URI: http://researchrepository.napier.ac.uk/id/eprint/3886
