A framework for data cleaning in data warehouses.

Peng, Taoxin (2008) A framework for data cleaning in data warehouses. Proc. of the 10th International Conference on Enterprise Information Systems (ICEIS). pp. 473-478.

Full text not available from this repository.

Abstract/Description

Achieving high data quality in data warehouses is a persistent challenge, and data cleaning is a crucial task in meeting it. A range of methods and tools has been developed for this purpose. However, at least two questions remain to be answered: how can the efficiency of data cleaning be improved, and how can its degree of automation be increased? This paper addresses these two questions by presenting a novel framework that manages data cleaning in data warehouses by focusing on the use of data quality dimensions and by decoupling the cleaning process into several sub-processes. An initial test run of the processes in the framework demonstrates that the approach is efficient and scalable for data cleaning in data warehouses.

Item Type: Article
Additional Information: Paper presented at the 10th International Conference on Enterprise Information Systems, 12-16 June 2008, Barcelona, Spain
Uncontrolled Keywords: data cleaning; data warehouse; performance efficiency; automation; data quality; decoupling; scalable
University Divisions/Research Centres: Faculty of Engineering, Computing and Creative Industries > School of Computing
Dewey Decimal Subjects: 000 Computer science, information & general works > 000 Computer science, knowledge & systems > 006 Special Computer Methods > 006.3 Artificial intelligence
000 Computer science, information & general works > 000 Computer science, knowledge & systems > 004 Data processing & computer science
000 Computer science, information & general works > 000 Computer science, knowledge & systems > 004 Data processing & computer science > 004.2 Systems analysis, design & performance
Library of Congress Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Item ID: 3467
Depositing User: Computing Research
Date Deposited: 08 Feb 2010 16:58
Last Modified: 04 Mar 2010 15:58
URI: http://researchrepository.napier.ac.uk/id/eprint/3467

