Data Integrity in the Child Support Agency
The Child Support Agency (CSA) was established in 1988 to administer Australia's Child Support Scheme. The CSA currently forms part of the Department of Human Services (DHS).
Through its administration of the Child Support (Registration and Collection) Act 1988 and the Child Support (Assessment) Act 1989 the CSA is responsible for the:
- assessment of how much child support is payable, through the application of a formula;
- registration of court orders for child and spousal maintenance, court registered agreements and administrative assessments;
- collection of child support; and
- disbursement of child support.1
Since the introduction of the Child Support Scheme, the CSA has managed a total of 1.4 million child support cases involving almost 4.6 million people. In 2006-07, the CSA reported transferring some $2.68 billion in child support payments between parents. At June 2007, the CSA managed an active caseload of approximately 800 000 child support cases, involving around 1.4 million parents and approximately 1.2 million children.
The primary electronic database supporting the administration of the Child Support Scheme is named Cuba.2 The data in Cuba are organised around a number of core business functions including: case management; customer relationship management; child support assessments; accounting; and administrative support. The design of Cuba incorporates two significant data constructs—case and customer. Each child support case involves a number of customers—usually a payer, a payee and one or more children. Each customer may be associated, in various roles, with more than one child support case.
While not in the largest category of Australian Government databases, Cuba is one of the more complex—given the various relationships that can exist between payers, payees, children, third parties, employers, financial institutions and overseas government agencies, across multiple child support cases.
Audit scope and objective
The objective of the audit was to examine the integrity of electronic records stored on the CSA's database—Cuba—and to report on the effectiveness of CSA's management of the data.
The audit assessed the CSA's electronic case and customer records and data management practices against the following criteria:
- CSA's case and customer records are accurate and complete;
- CSA's case and customer records are reliable and internally consistent;
- CSA has adequate controls and procedures to ensure a high quality of data capture and recording; and
- CSA effectively manages case and customer records.
The audit considered aspects of the CSA's data capture and recording practices, including data exchange with other agencies, along with technical and administrative level controls surrounding the CSA's data entry. A substantial part of the audit focussed on the analysis of data integrity within the various tables3 of Cuba. The data extract from 16 Cuba tables comprised, in total, 142 958 924 lines of data.
Data within these tables were tested to ensure that selected mandatory fields contained valid entries. We also addressed aspects of internal consistency in the database, applying these as measures of the accuracy and completeness of customer records. The analysis included an assessment of the integrity of the primary key4 for both the case and customer tables.
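The two kinds of test described above, valid entries in mandatory fields and a sound primary key, can be sketched as follows. This is an illustration only, not the ANAO's test code; the field names and sample rows are hypothetical.

```python
from collections import Counter

def duplicate_keys(rows, key_field):
    """Return the key values that appear on more than one row."""
    counts = Counter(row[key_field] for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)

def missing_mandatory(rows, mandatory_fields):
    """Return (row_index, field) pairs where a mandatory field is absent or blank."""
    problems = []
    for index, row in enumerate(rows):
        for field in mandatory_fields:
            value = row.get(field)
            if value is None or str(value).strip() == "":
                problems.append((index, field))
    return problems

# Hypothetical sample rows; the field names are illustrative, not Cuba's schema.
cases = [
    {"case_id": 1001, "status": "ACTIVE"},
    {"case_id": 1002, "status": ""},
    {"case_id": 1001, "status": "ENDED"},
]
print(duplicate_keys(cases, "case_id"))
print(missing_mandatory(cases, ["case_id", "status"]))
```

A sound primary key corresponds to `duplicate_keys` returning an empty list.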
A major objective of our analysis was to identify any CSA customers who had been issued with more than one customer identification number and to assess the business risk posed by fragmenting customer information across active CSA records.
Based on analysis of an extensive extract of CSA records, the ANAO concluded that the majority of records in Cuba are sufficiently accurate, complete and reliable to support the effective administration of the Child Support Scheme. Anomalous records that were detected usually accounted for a relatively small proportion of all records in the database.
While relatively few in number compared to the entire record set, the presence of erroneous records in the database indicates a weakness in effective control systems for data entry and recording. The CSA has recently introduced a Data Quality Improvement Programme, including a series of activities designed to test specific aspects of data quality in Cuba. The CSA will benefit from extending this programme to incorporate a comprehensive check of the application of all relevant business rules within the database.
Furthermore, the CSA should draw on the findings of this audit, and the information obtained through its Data Quality Improvement Programme, to identify and address weaknesses in data quality control systems. The inclusion of some controls should be relatively straightforward, such as a technical level control to ensure that an ‘end date’ does not precede a ‘start date’ within a line of data. Other controls, such as enforcing a standardised approach to recording names and addresses, may present more of a challenge, yet are essential to improving the overall quality of customer data in Cuba.
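A technical level control of the simpler kind might look like the following minimal sketch. The function name is hypothetical, and the report does not describe how such a control would be built into Cuba.

```python
from datetime import date

def validate_period(start: date, end: date) -> None:
    """Reject a line of data whose end date precedes its start date."""
    if end < start:
        raise ValueError(f"end date {end} precedes start date {start}")

# Accepted: the period runs forwards in time.
validate_period(date(2007, 1, 1), date(2007, 6, 30))
```

Applying the check at the point of data entry, rather than in a later clean-up, prevents the invalid line from being stored at all.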
Most of the errors and weaknesses identified in this audit pose a minimal risk to the CSA's overall administration of the Child Support Scheme. However, particular errors or anomalies on individual records can result in an inaccurate calculation of child support liability. For the families involved, the effects can be significant. One of the objectives of the Child Support Scheme is that ‘parents share in the cost of supporting their children according to their capacity’. Incorrect child support liability calculations, resulting from errors on customer records, pose a risk to the achievement of this objective.
This report highlights a number of areas in which the CSA could significantly improve the quality and reliability of data in Cuba, by addressing:
- inconsistent and inaccurate recording of dates, names and addresses;
- redundant records and training records in the production environment of the database;
- information fragmented across multiple customer records;
- weaknesses in the accuracy and reliability of Tax File Numbers and Centrelink Reference Numbers; and
- corrupt records arising from a data conversion in 2002.
Key findings by chapter
Data Capture and Recording—Chapter 2
The Cuba database commenced in 2002 with the conversion of customer and case information from an older mainframe computer system. While many data quality issues associated with the conversion—such as corrupt records—have been resolved by the CSA, a small number still persist.
All Child Support Officers (CSOs) undertake a standardised entry level training programme and have access to national guidance material to assist them in their work. The training programme contains a module on Cuba navigation and work management but does not attempt to provide staff with specific skills in data entry, which are generally gained through on the job experience. Data entry quality assurance practices also varied across CSA sites: some team leaders checked 100 per cent of CSOs' work in their first four weeks on the job, while others relied on staff to report instances of incorrect processing.
Case Records—Chapter 3
Testing found that the primary key for the case table—the case identification number—was sound, and contained no duplicate values.
The ANAO's analysis revealed that a relatively small proportion of liability start dates and liability end dates recorded in Cuba may be invalid or unreliable. Some liability start dates precede the commencement of the CSA, while some liability end dates indicate extraordinary apparent case durations.
A relatively small number of records demonstrated a weakness in the integrity of start and end dates pertaining to some case indicators. Further inconsistencies in the use of dates were evident in a small proportion of records in the liability calculation table. In particular, several inconsistencies were noted in the CSA's use of low dates and high dates5 to denote the commencement and end of a period.
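The date checks discussed in this chapter can be sketched as below, using the low date and high date values given in the report's footnote on those terms. The specific rules are assumptions made for illustration: the report does not list the rules it applied, and it gives only the year (1988) in which the CSA was established, so 1 January 1988 is used as a conservative cut-off.

```python
from datetime import date

LOW_DATE = date(1, 1, 1)               # 01/01/0001, the CSA's low date
HIGH_DATE = date(4000, 12, 31)         # 31/12/4000, the CSA's high date
EARLIEST_PLAUSIBLE = date(1988, 1, 1)  # assumption: only the year is given in the report

def is_current(end: date) -> bool:
    """A high date in the end field marks the line of data as current."""
    return end == HIGH_DATE

def period_issues(start: date, end: date) -> list[str]:
    """Return descriptions of any anomalies in a start/end pair (assumed rules)."""
    issues = []
    if start == HIGH_DATE:
        issues.append("high date used as start")
    if end == LOW_DATE:
        issues.append("low date used as end")
    real_start = start not in (LOW_DATE, HIGH_DATE)
    real_end = end not in (LOW_DATE, HIGH_DATE)
    if real_start and start < EARLIEST_PLAUSIBLE:
        issues.append("start precedes 1988")
    if real_start and real_end and end < start:
        issues.append("end precedes start")
    return issues

print(period_issues(date(1975, 3, 1), date(2005, 1, 1)))
```

Treating the two placeholder values separately from real dates keeps a current record (end date of 31/12/4000) from being reported as a case of extraordinary duration.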
The data in Cuba include some records that display an inconsistency between case status and their current case liability determination. Some active cases appear not to be associated with a current liability determination, while some non-active cases show a current liability determination.
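A cross-check of this kind can be expressed as a comparison between two sets of case identifiers. The sketch below assumes a hypothetical status value of "ACTIVE" and a pre-computed set of cases holding a current liability determination; neither reflects Cuba's actual schema.

```python
def status_liability_mismatches(case_status, cases_with_current_liability):
    """Compare case status with the presence of a current liability determination.

    case_status: dict mapping case id to a status string.
    cases_with_current_liability: set of case ids with a current determination.
    Returns (active cases without a determination, non-active cases with one).
    """
    active_without = sorted(
        case_id for case_id, status in case_status.items()
        if status == "ACTIVE" and case_id not in cases_with_current_liability
    )
    inactive_with = sorted(
        case_id for case_id, status in case_status.items()
        if status != "ACTIVE" and case_id in cases_with_current_liability
    )
    return active_without, inactive_with

# Hypothetical data: case 2 is active with no determination, case 3 the reverse.
statuses = {1: "ACTIVE", 2: "ACTIVE", 3: "ENDED", 4: "ENDED"}
print(status_liability_mismatches(statuses, {1, 3}))
```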
The impact of these anomalies on the overall administration of the scheme is likely to be minimal. However, the potential impact on the individual customers involved may be significant, as these data integrity issues may result in incorrect liability determinations. In September 2007, the CSA advised the ANAO that it had commenced corrective action on a number of anomalous case records identified during this audit.
Customer Records—Chapter 4
Testing found that the primary key for the customer table—the child support identification number—was sound and contained no duplicate values.
A relatively small number of records in the customer table exhibited weaknesses in the integrity of date fields used to describe particular periods—with end dates preceding start dates for some customer indicators.
In relation to the accuracy of recording customers' names in Cuba, analysis revealed:
- 105 records where the name fields were blank;
- the existence of over 200 apparent training and other spurious records in the customer table;
- inconsistent application of business rules and acceptable standards for recording customer title, given name, middle name and surname in up to 1 per cent of records in the customer table; and
- 14 per cent of all customer records, current and historical, display the entry ‘UNKNOWN’ as the customer's surname.6
In relation to the accuracy of recording customers' dates of birth and dates of death, analysis revealed a number of weaknesses. According to the data recorded in Cuba:
- the oldest payee is 116 years of age and the youngest payee is 10 years of age;
- the oldest child, associated with a currently active case, is 51 years of age;
- records for nine customers display the same date for the customer's date of birth and date of death; and
- 19 customer records, associated with active cases, display a date of death.
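Plausibility checks on dates of birth and death of the kind listed above could be sketched as follows. The age thresholds and role names are assumptions chosen for illustration and are not taken from the report or from CSA business rules. A flagged record is a candidate for review rather than a confirmed error.

```python
from datetime import date

def age_on(dob: date, as_at: date) -> int:
    """Whole years of age on the given date."""
    years = as_at.year - dob.year
    if (as_at.month, as_at.day) < (dob.month, dob.day):
        years -= 1
    return years

def customer_date_flags(role, dob, dod, on_active_case, as_at,
                        min_parent_age=14, max_age=110, max_child_age=25):
    """Return review flags for one customer record (thresholds are assumptions)."""
    flags = []
    age = age_on(dob, as_at)
    if role in ("payer", "payee") and not (min_parent_age <= age <= max_age):
        flags.append("implausible parent age")
    if role == "child" and on_active_case and age > max_child_age:
        flags.append("implausible child age on active case")
    if dod is not None and dod == dob:
        flags.append("date of death equals date of birth")
    if dod is not None and on_active_case:
        flags.append("date of death on active case")
    return flags

as_at = date(2007, 6, 30)
print(customer_date_flags("payee", date(1891, 1, 1), None, True, as_at))
```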
Up to 12 per cent of customer records, associated with active cases, do not display a current address. In some instances mandatory address fields were left blank or displayed information other than address details—such as telephone numbers, email addresses or passwords.
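A simple screen for address fields that are blank or hold other kinds of information might look like the sketch below. The patterns are rough heuristics assumed for illustration, and they cannot recognise a password or other free text.

```python
import re

# Heuristic patterns only; both are assumptions, not CSA validation rules.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
PHONE_RE = re.compile(r"[\d\s()+-]{8,}")

def address_line_issue(value):
    """Return a description of the problem with an address line, or None."""
    text = (value or "").strip()
    if not text:
        return "blank"
    if EMAIL_RE.search(text):
        return "looks like an email address"
    if PHONE_RE.fullmatch(text):
        return "looks like a telephone number"
    return None

for sample in ["", "jane@example.com", "02 5550 1234", "12 Example Street"]:
    print(repr(sample), address_line_issue(sample))
```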
Tax File Numbers (TFNs) play a central role in identifying CSA customers and facilitating data exchange with the Australian Taxation Office (ATO). Analysis confirmed that no single value for a TFN was recorded more than once in Cuba. However, 58 individual customer records contained invalid TFNs.7 In addition, 913 pairs of customer records displayed duplicate Centrelink Reference Numbers (CRNs).
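The two identifier checks described here can be sketched as below. The report does not reproduce the ATO check digit algorithm; the weights used are those commonly published for nine-digit TFNs and should be treated as an assumption. The function names and the record layout are hypothetical.

```python
from collections import defaultdict

# Commonly published weights for nine-digit TFNs; an assumption, not from the report.
TFN_WEIGHTS = (1, 4, 3, 7, 5, 8, 6, 9, 10)

def tfn_problem(tfn: str):
    """Return a description of why a TFN is invalid, or None if it passes."""
    digits = tfn.replace(" ", "")
    if not (digits.isascii() and digits.isdigit()) or len(digits) != 9:
        return "not nine digits"
    total = sum(int(d) * w for d, w in zip(digits, TFN_WEIGHTS))
    if total % 11 != 0:
        return "fails check digit"
    return None

def shared_crns(records):
    """records: iterable of (customer_id, crn). Return CRNs held by more than one customer."""
    by_crn = defaultdict(set)
    for customer_id, crn in records:
        if crn:
            by_crn[crn].add(customer_id)
    return {crn: sorted(ids) for crn, ids in by_crn.items() if len(ids) > 1}

print(tfn_problem("123 456 782"))
print(tfn_problem("1234567890"))
```

This sketch handles nine-digit numbers only; any other length is reported as invalid.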
Within the individual customer table, 118 676 records are not, and have never been, associated with a CSA case. These include records apparently created during staff training. While they present only a slight business risk to the CSA, they are redundant records and should be removed from the production environment of Cuba.
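Records that have never been associated with a case can be found by subtracting the set of linked customers from the set of all customers. The sketch below assumes a hypothetical list of (case, customer) links that covers historical as well as current cases.

```python
def customers_without_cases(customer_ids, case_links):
    """Return customer ids that appear in no case link.

    case_links: iterable of (case_id, customer_id) pairs, current and historical.
    """
    linked = {customer_id for _case_id, customer_id in case_links}
    return sorted(set(customer_ids) - linked)

# Hypothetical identifiers: C2 has never been linked to a case.
links = [(1, "C1"), (2, "C1"), (2, "C3")]
print(customers_without_cases(["C1", "C2", "C3"], links))
```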
In September 2007, the CSA advised the ANAO that it had commenced a process to correct anomalous customer records identified during this audit.
Employer Records—Chapter 5
The employer table in Cuba contains just under 259 000 records. In approximately 38 000 instances the employer's name appears more than once—constituting multiple records for employers. Almost 21 000 of these have been identified by the CSA as redundant records. Employers play an important role in collecting child support payments by withholding a portion of their employees' wages and remitting this to the CSA. Multiple records for employers create difficulties for CSOs processing these payments and create unnecessary re-work, transferring payments between the multiple accounts. Analysis of the employer table revealed numerous instances of mandatory fields containing blank entries.
This situation points to a weakness in the application of business rules within the employer table.
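One way to find employer names that appear more than once is to group records on a normalised form of the name. The normalisation below (upper case, punctuation removed, whitespace collapsed) is an assumption; the report does not state how names were compared. Groups found this way are candidates for review, since distinct employers can share a name.

```python
import re
from collections import defaultdict

def normalise_name(name: str) -> str:
    """Upper-case the name, drop punctuation and collapse whitespace."""
    text = re.sub(r"[^A-Z0-9 ]", " ", name.upper())
    return " ".join(text.split())

def employer_name_groups(employers):
    """employers: iterable of (employer_id, name). Return names held by more than one record."""
    groups = defaultdict(list)
    for employer_id, name in employers:
        groups[normalise_name(name)].append(employer_id)
    return {name: ids for name, ids in groups.items() if len(ids) > 1}

# Hypothetical employer records.
employers = [(1, "Acme Pty Ltd"), (2, "ACME PTY. LTD."), (3, "Bravo Limited")]
print(employer_name_groups(employers))
```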
Multiple Customer Records—Chapter 6
The CSA uses the child support identification number (CSID) as the primary key for customer records. In a well managed database each individual customer is allocated only one identification number. If a customer has more than one CSID there is a risk of fragmenting that customer's information across two or more unrelated records.8
The CSA has identified up to 18 000 customers with duplicate records and marked these records accordingly.9 The ANAO sought to identify whether any other customers had been issued with more than one CSID, which had not been detected by the CSA. Through a series of internal data matching activities, 27 633 customers in this category were identified. Of this group, 360 customers show currently active cases on each of their CSIDs.
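The report does not describe the matching rules used. As an illustration only, the sketch below groups customer records on surname, given name and date of birth, and reports groups holding more than one CSID. Records with the placeholder surname ‘UNKNOWN’ are skipped, since they would otherwise match one another spuriously. All field names are hypothetical.

```python
from collections import defaultdict

def match_key(record):
    """Build a comparison key from name and date of birth (assumed matching rule)."""
    surname = " ".join(record["surname"].upper().split())
    given = " ".join(record["given_name"].upper().split())
    return (surname, given, record["dob"])

def candidate_multiple_records(records):
    """Return sorted lists of CSIDs that share the same match key."""
    groups = defaultdict(set)
    for record in records:
        key = match_key(record)
        if key[0] in ("", "UNKNOWN"):
            continue  # placeholder surnames cannot support a match
        groups[key].add(record["csid"])
    return sorted(sorted(ids) for ids in groups.values() if len(ids) > 1)

# Hypothetical records: CSIDs 1 and 2 appear to be the same person.
records = [
    {"csid": 1, "surname": "Smith", "given_name": "Jo", "dob": "1970-01-01"},
    {"csid": 2, "surname": " SMITH", "given_name": "jo", "dob": "1970-01-01"},
    {"csid": 3, "surname": "UNKNOWN", "given_name": "Sam", "dob": "2001-02-03"},
    {"csid": 4, "surname": "UNKNOWN", "given_name": "Sam", "dob": "2001-02-03"},
]
print(candidate_multiple_records(records))
```

A match on these fields alone only suggests that two records belong to one person; confirmation would need further evidence, such as a shared TFN or address.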
Within the group of multiple record customers, the ANAO observed:
- different case liability determinations across the two records;
- different income details;
- some of the multiple records were linked to the same CSA case, while other records related to different cases;
- 493 multiple record customers displayed two different TFNs, while 99 displayed two different CRNs;
- 136 customers displayed a date of death on one of their records but not on the other; and
- incompatible customer roles—such as a child on one record and a payer or payee on the other.
Customers with multiple records account for less than 1 per cent of all customer records and, therefore, pose only a slight risk to the CSA's overall administration of the Child Support Scheme. However, for the families involved, the errors in calculating child support liabilities can be significant. For example, the ANAO identified one payer with two customer records, each associated with an active child support case. Against one CSID, the customer's annual taxable income was recorded as $15 000 and an annual case liability of $5600 was displayed. Against the other CSID, the customer's annual taxable income was shown as $130 000, generating a case liability of $24 500.
The results of these analyses were provided to the CSA. The CSA investigated the circumstances surrounding the anomalous records and, in September 2007, advised that, for a relatively small number of customers, child support liability calculations were incorrect. The CSA also advised that it had commenced corrective action on these records.
Implications of Data Integrity Issues—Chapter 7
This audit has highlighted a number of specific opportunities for the CSA to improve the quality of data in Cuba. As well as improving the quality of individual customer records, the CSA could also improve the overall management of data in Cuba through a more consistent enforcement of business rules within the database.
The CSA would also benefit from the introduction of an active programme to regularly test the accuracy and reliability of customer and case data. The CSA is aware that a number of corrupt records exist in Cuba, arising from the 2002 data conversion, and has commenced work to resolve these issues. The CSA will benefit from further developing its Data Quality Improvement Programme.
The value to be gained from these improvements takes on added significance as the CSA moves to implement a new version of Cuba in July 2008.10 A comprehensive cleansing of current data and the consistent application of business rules, within the database, should ensure that the CSA commences operation of the new version of Cuba, in 2008, with the highest possible quality dataset.
The ANAO made five recommendations to improve the accuracy and integrity of individual customer information stored in the CSA's database, and to ensure a more thorough application of business rules and quality assurance procedures.
The Department of Human Services and the Child Support Agency agreed with all recommendations.
Summary of DHS's response
The Department of Human Services (DHS) appreciates the assurance provided by the ANAO's Data Integrity audit report outlining that the majority of records in the Child Support Agency's (CSA) primary database, Cuba, are sufficiently accurate, complete, and reliable to support the effective administration of the Child Support Scheme. However, where anomalies on individual records may potentially affect a customer's child support liability, we will resolve these errors and develop mechanisms and controls to support the continuous improvement of the data holdings in Cuba.
Furthermore, DHS values the recognition provided by the ANAO of the work commenced by the Data Quality Improvement Program within CSA and agrees with the benefit that would be gained with incorporating ongoing testing of data quality and the application of business rules, in Cuba, as part of this program.
1 CSA, What the Child Support Scheme aims to do [Internet]. Available from <http://www.csa.gov.au/agency/facts.aspx> [accessed 2 August 2007].
2 Cuba is not an acronym—the system was named after Cuba, the goddess of children in Roman mythology.
3 A table is a component of the database that stores related records. For example, the case table stores certain information about CSA cases; the case indicators table stores information on a selection of indicators that relate to CSA cases.
4 The primary key is a means of uniquely identifying each record within the database and a mechanism to link data across various elements of the database.
5 The CSA uses 01/01/0001 as a low date and 31/12/4000 as a high date. While not valid dates in themselves, low dates and high dates are used in a variety of circumstances—as a placeholder for a valid date; to indicate that the true date is unknown; or to indicate that a valid date is not able to be recorded at a particular point in processing a record. The high date also indicates that a particular line of data within a table is current.
6 The majority of these are child records. Prior to the conversion of data from the old CSA system to Cuba, in 2002, many child records did not incorporate a surname. These records were migrated to Cuba with the surname of ‘UNKNOWN’. At the time of conducting this audit, there were over 389 000 active child records displaying the surname ‘UNKNOWN’.
7 In 56 cases the recorded TFNs consisted of more than nine digits, and in two cases the recorded TFNs failed the ATO check digit algorithm.
8 Multiple records most often occur when an existing or previous customer of the CSA becomes involved in a subsequent child support case. It is possible for the CSO processing the second child support application to create a new record for the customer, rather than calling up the existing customer record.
9 The ANAO identified some 12 783 records marked in accordance with the Procedural Instruction: A Guide to Duplicate Payers/Payees Records, Version 1.1, 2002. In addition, the ANAO identified another 5000 customer records, distinguishable as duplicate records but not marked in strict accordance with the Procedural Instruction.
10 As part of the introduction of the third stage of reforms of the Child Support Scheme.