Data Cleansing and Conversion

Magsys uses the rich functionality of ACL™ Analytics as the core engine for the Data Cleansing and Conversion Tasks.
Visual Basic is also used to augment results presentation and provide automation.

Characteristics:

  • Data Cleansing is often carried out when awareness grows of problems due to inaccurate or missing data
  • Most data problems should have been picked up by validation at data entry time
  • Data problems can be a result of batch input processes which did not apply validation rules
  • Problems can emerge when systems are merged ⁄ upgraded with insufficient prior data cleansing
  • Standard functions for standardising names, addresses, phone numbers, checking for missing fields and validating ABNs etc.
  • Specific logic can be written to detect duplicate entities and matching to other tables can be useful

Examples of Data Cleansing and Conversion:

Data Cleansing for a state government Portable Long Service Leave entity.
This system keeps records of working periods for people engaged in the building industry.
Due to the intermittent nature of the work and reliance on data provided by many employers and workers, the quality of the data was very variable.
Standardision of names and addresses was applied to facilitate identifying multiple instances of service for apparently different workers where, in actuality, it was the same person.
This involved fuzzy matching on names, birth dates and addresses.
Having found multiple service records for multiple instances of an individual worker, the next step was to build a service history combining them and link them to an updated worker master record.
Further ACL scripts performed more validation on various data fields and also checked logical database integrity.
This data cleansing effort contributed to a signiificant improvement in the quality and integrity of the worker’s service data.

Rebuilding HR records for a state government rail entity.
The original state government rail entity maintained all its employees in Ellipse HR ⁄ payroll. However it was broken up into a number of separate entities and employees were vested to the new entities.
Each new entity maintained their own instance of Ellipse HR ⁄ Payroll and these employees were set up in them.
Over time these entities were further split up ⁄ sold off or merged, finally a new state government decided to reunite a subset of these entities. Employees were re-vested into the re-united entity.
Some employees did not have consistent employee numbers across the Ellipse systems and many transferred between entities. Also various codes were not consistent and needed harmonising.
Fortunately, all the data from the various Ellipse instances and a proprietary personnel system predating the original split, was available.
This facilitated checking and building a validated service history and other HR tables for the employee.
ACL was a key tool in this significant effort.