
| Problems | Features | Tools |
|---|---|---|
| Auditing Tools | | |
| Is your data complete and valid? | Data examination: determines the quality of the data, the patterns within it, and the number of different fields used | WizSoft-WizRule; Vality-Integrity |
| How well does the data reflect the business rules? Do you have missing values, illegal values, inconsistent values, or invalid relationships? | Comparison to business rules: assesses data for consistency and completeness against the rules | Prism Solutions, Inc.-Prism Quality Manager; WizSoft-WizRule; Vality-Integrity |
| Are you using sources that do not comply with your business rules? | Data reengineering: examining the data to determine what the business rules are | WizSoft-WizRule; Vality-Integrity |
| Cleansing Tools | | |
| Does your data need to be broken up between source and data warehouse? | Data parsing (elementizing): determines the context and destination of each component of each field | Trillium Software-Parser; i.d. Centric-DataRight |
| Does your data have abbreviations that should be changed to ensure consistency? | Data standardizing: converting data elements to forms that are standard throughout the DW | Trillium Software-Parser; i.d. Centric-DataRight |
| Is your data correct? | Data correction and verification: matches data against known lists (addresses, product lists, customer lists) | Trillium Software-Parser; Trillium Software-GeoCoder; i.d. Centric-ACE, Clear I.D. Library; Group 1-NADIS |
| Is there redundancy in your data? | Record matching: determines whether two records represent data on the same object | Trillium Software-Matcher; Innovative Systems-Match; i.d. Centric-Match/Consolidation; Group 1-Merge/Purge Plus |
| Are there multiple versions of company names in your database? | Record matching: based on user-specified fields such as tax ID | Innovative Systems-Corp.-Match |
| Is your data consistent before entering the data warehouse? | Transform data: “1” for male and “2” for female become “M” and “F”, ensuring consistent mapping between source systems and the data warehouse | Vality-Integrity; i.d. Centric-Match/Consolidation |
| Do you have information in free-form fields that differs between databases? | Data reengineering: examining the data to determine what the business rules are | Vality-Integrity |
| Do you have multiple individuals in the same household who need to be grouped together? | Householding: combining individual records that have the same address | i.d. Centric-Match/Consolidation; Trillium Software-Matcher |
| Does your data contain atypical words, such as industry-specific terms or ethnic or hyphenated names? | Data parsing combined with data verification: comparison to industry-specific lists | i.d. Centric-ACE, Clear I.D. |
| Migration and Other Tools | | |
| Do you have multiple formats to be accessed (relational databases, flat files, etc.)? | Access the data, then map it to the DW schema | Enterprise/Integrator by Carleton |
| Do you have free-form text that needs to be indexed, classified, or otherwise processed? | Text mining: extracts meaning and relevance from large amounts of information | Semio-SemioMap |
| Have the rules established during the data cleansing steps been reflected in the metadata? | Documenting: recording the results of the data cleansing steps in the metadata | |
| Is the data Y2K compliant? | | |
| Is the quality of the data poor, but people don’t care because they have adjusted to it? | | |
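
Two of the cleansing features listed above, transforming coded values and householding, can be illustrated with a short sketch. This is only a minimal illustration under stated assumptions, not how any of the listed products work: the record layout, the `GENDER_MAP` table, and the helper names `transform_gender`, `normalize_address`, and `household` are all hypothetical, and the address normalization is deliberately crude (real tools verify addresses against reference lists).

```python
from collections import defaultdict

# Hypothetical mapping from source-system codes to the warehouse standard.
GENDER_MAP = {"1": "M", "2": "F"}


def transform_gender(code):
    """Return the warehouse code, or None when the source value is unmapped."""
    return GENDER_MAP.get(str(code).strip())


def normalize_address(address):
    """Crude normalization: uppercase, drop punctuation, collapse whitespace."""
    cleaned = "".join(
        ch for ch in address.upper() if ch.isalnum() or ch.isspace()
    )
    return " ".join(cleaned.split())


def household(records):
    """Group names of records whose normalized addresses are identical."""
    groups = defaultdict(list)
    for rec in records:
        groups[normalize_address(rec["address"])].append(rec["name"])
    return dict(groups)


# Made-up sample records, for illustration only.
records = [
    {"name": "Ann Lee", "gender": "2", "address": "12 Oak St."},
    {"name": "Bob Lee", "gender": "1", "address": "12  oak st"},
    {"name": "Cy Roe", "gender": "1", "address": "9 Elm Ave"},
]

genders = [transform_gender(r["gender"]) for r in records]
households = household(records)
```

Returning `None` for an unmapped code, rather than passing the raw value through, keeps illegal values visible so they can be audited before loading.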
