Like most Australian Government agencies involved in service delivery in the 21st century, Centrelink relies on large and complex information technology (IT) systems to support its extensive business operations. The heart of Centrelink's IT systems is ISIS—the Income Security Integrated System—Centrelink's main customer database.
In 2004–05, Centrelink's IT systems performed more than 5.2 billion electronic computations and processed some $63 billion of social security payments to over six million customers. Centrelink grants approximately 2.8 million new claims each year. At September 2005, the ISIS database held information on over 23 million customers—recording details of customers' identity, circumstances and eligibility for benefits under various social security programmes. Approximately 6.2 million of the 23 million records relate to customers with a current benefit determination.1
In order to distinguish between customer records, a unique identifier is assigned to each record—the Centrelink Reference Number, or CRN. The information in ISIS is organised around the CRN, which links customer information in various parts of the database. For example, the CRN links information on a customer's circumstances and benefit determinations with that in the payments file.
Customer information is spread across eleven networked computing environments, with each environment essentially servicing a region, state or territory within Australia.2 Centrelink's data holdings are growing at a rate of approximately 30 per cent each year and, at September 2005, the ISIS database held information in over 440 billion fields, with an average of 21 000 fields of information per customer.
The audit examined aspects of the integrity and management of customer data stored on ISIS. In particular, the audit considered measures of data accuracy, completeness and reliability. The scope of the audit also extended to aspects of Centrelink's IT control environment—in particular, controls over data entry.
ANAO considered Centrelink's processes and procedures for entering customer data into ISIS, including the controls surrounding customer registration and the validation of customer data. ANAO also examined Centrelink's existing data integrity error detection and reporting system.
ANAO requested, and Centrelink provided, data extracts from all 23 million ISIS records. ANAO tested the contents of a number of mandatory fields to ensure these conformed to Centrelink's business rules and specifications. ANAO's analysis also included a check of logical relationships between various fields.3 Centrelink customers are required to prove their identity when claiming a pension, benefit, or allowance from Centrelink. ANAO examined details of Proof of Identity (POI) documents recorded on ISIS.
A substantial part of ANAO's analysis involved testing the integrity of the primary key of the database,4 the CRN. ANAO checked for duplicate CRNs (where a given CRN value was associated with more than one customer) and for multiple CRNs (where an individual customer had been assigned more than one CRN).5
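The two primary key checks described above can be sketched as follows. This is an illustration only, not ANAO's actual test: the record layout (`crn`, `customer_id`) is hypothetical, and the sketch assumes each real person can already be recognised by a single `customer_id`, whereas in practice matching would rely on names, dates of birth and other details.

```python
from collections import defaultdict

def find_crn_anomalies(records):
    """Return (duplicate_crns, multiple_crns) for an iterable of records.

    Each record is a dict with the hypothetical keys 'crn' and 'customer_id'.
    """
    customers_by_crn = defaultdict(set)
    crns_by_customer = defaultdict(set)
    for rec in records:
        customers_by_crn[rec["crn"]].add(rec["customer_id"])
        crns_by_customer[rec["customer_id"]].add(rec["crn"])

    # Duplicate CRN: one CRN value associated with more than one customer.
    duplicate_crns = {crn: ids for crn, ids in customers_by_crn.items() if len(ids) > 1}
    # Multiple CRNs: one customer assigned more than one CRN.
    multiple_crns = {cid: crns for cid, crns in crns_by_customer.items() if len(crns) > 1}
    return duplicate_crns, multiple_crns

# Made-up sample data: CRN 100 is shared by two customers; customer C has two CRNs.
sample = [
    {"crn": "100", "customer_id": "A"},
    {"crn": "100", "customer_id": "B"},
    {"crn": "200", "customer_id": "C"},
    {"crn": "300", "customer_id": "C"},
]
duplicates, multiples = find_crn_anomalies(sample)
```

In this sample, `duplicates` reports CRN 100 and `multiples` reports customer C.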
Fieldwork for the audit was primarily undertaken between April 2005 and October 2005. ANAO acquired over 8 million lines of data, extracted from the agency's data integrity error detection system on 12 July 2005. On 13 September 2005, Centrelink provided ANAO with over 23 million lines of data extracted from the main ISIS database, in accordance with ANAO's specifications.
Overall audit conclusion
Centrelink's customer database, ISIS, is one of the largest and most complex Australian Government databases holding information about Australian citizens and residents. Of its more than 23 million records, some 6.2 million support a current benefit determination and, in most cases, payment to a customer by Centrelink.
This audit found that Centrelink could significantly improve the accuracy and integrity of data stored on ISIS. In particular, Centrelink could improve the integrity of the primary key used in ISIS, and reduce the risks associated with fragmenting customer information across multiple records. Centrelink should also remove training records and obsolete customer records from the production environment of its database. ANAO also found that Centrelink should improve the effectiveness of its existing data integrity checking system.
The audit found that up to 30 per cent of customer 'proof of identity' (POI) information recorded on ISIS was insufficient or unreliable in terms of uniquely identifying or substantiating the identity of customers. While much of this information related to historical records, ANAO also found that this information is still relied upon to process new claims associated with those historical records. ANAO noted that Centrelink has tightened some of the controls around POI data entry and that the quality of recently entered POI information appears to be considerably improved.
While this audit has highlighted a number of business risks arising from these data integrity issues, including the risk of duplicate or inappropriate payments to customers, the ANAO also found that Centrelink had in place a number of other controls designed to prevent inappropriate payments. Accordingly, the audit found that, while these risks exist, duplicate payments had only occurred in a small number of cases.
Therefore, given the scale and complexity of Centrelink's IT operations, and considering the information examined in the scope of this audit, ANAO concluded that Centrelink's electronic customer records are, generally, sufficiently accurate and complete to support the effective administration of the range of social security programmes for which Centrelink is responsible.
ANAO also recognises that Centrelink responded promptly to the matters raised during the course of this audit, and commenced a number of initiatives to address specific data integrity issues identified by ANAO and to improve the quality of data in ISIS generally. Key among these initiatives were: projects to analyse and correct the identification of false positive results in the agency's existing data integrity error checking system; the establishment of a Data Quality Team to develop a long-term strategy to improve and maintain data quality; and work to comprehensively describe the effects of data integrity errors. Centrelink also undertook to review the operation of the priority rating system for data integrity errors.
In addition, Centrelink acted quickly to review cases of potential duplicate payment of customers, and to commit to resolving cases of duplicate and multiple CRNs.
Data entry and exchange (Chapter 2)
Centrelink's intention to ensure accurate data and payments was evident to ANAO in its 'Getting it Right' strategy, introduced in 2000, which is founded on four pillars: the right person is paid; under the right programme; at the right rate; for the right date(s).
Centrelink provides training for all Customer Service Officers (CSOs) in relation to registering new customers on its database. CSOs are also provided with considerable guidance in relation to processing claims and recording customer information. However, decisions about whether customers lodging a new application should be issued with a new CRN rely on the judgement of individual CSOs. The ANAO found that, despite the range of administrative level controls in place, up to 3 per cent of Centrelink customers appear to have been registered more than once on ISIS (for a detailed treatment of customers with multiple registrations, see Chapter 6).
Centrelink's IT systems incorporate a number of system level controls, designed to ensure compliance with certain business rules and data entry specifications. However, ANAO's analysis of ISIS data, in particular that detailed in Chapter 4 of the report, indicates that not all data entry business rules have been comprehensively enforced.
Centrelink has introduced post-data-entry quality assurance procedures, such as Quality On-Line6 and a Random Sample Survey programme.7 These are designed to detect inaccurate payments or benefit determinations that may have arisen from inaccurate customer data.
Data integrity error detection and reporting system (Chapter 3)
Centrelink has in place an extensive data integrity error detection and reporting system, incorporating checks of structural integrity and checks against various programme business rules. However, ANAO noted that the system was not widely used by Centrelink programme managers, nor was the information systematically analysed to reveal trends or identify the cause of particular data integrity failures.
Also, ANAO found that the data integrity error detection and reporting system had deteriorated over time and failed to incorporate updated or new business rules, thus producing many false positive results. The system did not discriminate between data integrity errors associated with current Centrelink customers and those associated with historical customer records. Consequently, Centrelink programme managers were not afforded an insight into the true magnitude of particular data integrity errors or the actual level of risk relating to current customers.
Centrelink employs a priority rating system to provide a high-level breakdown of data integrity error statistics. However, the ANAO found that the system did not adequately discriminate between errors, nor did it overtly highlight those areas requiring immediate attention by programme managers, because approximately two-thirds of all errors were classified as Priority 1 or Priority 2. ANAO also found that over 50 per cent of the top 87 error definition tables lacked any description of the effect of the error. As a result, programme managers were not presented with sufficient information to recognise the significance, or easily comprehend the likely impact, of particular data integrity problems.
ANAO noted that, according to the information contained in Centrelink's data integrity error detection and reporting system, the number of data integrity failures has steadily increased over the past two years. The ANAO acknowledges that a large proportion of reported errors arose from the incorrect identification of false positives,8 and over half of all identified errors were associated with historical records. However, Centrelink had made little progress, over the two years preceding this audit, in resolving the errors.
Testing data integrity (Chapter 4)
The results of ANAO's analysis of selected database fields, extracted from the 23 million customer records on ISIS, showed that the production environment of ISIS:
- contained at least 10 000 training records—that is, non-genuine customer records created while training Centrelink staff;
- exhibited a degree of inconsistency in the recording of customers' names, with some entries containing a customer's first name, second name and surname, all in the surname field, while leaving the other two fields blank;
- exhibited a degree of inconsistency in recording customers' address details;
- contained entries in particular fields, which were outside the range of legal values defined for those fields; and
- exhibited some anomalies in the recording of customers' dates of birth and death. For example, ANAO found 42 customer records that displayed the same date for the customer's date of birth and date of death and one record indicating that the customer was born two months after his recorded date of death.
These findings point to a lack of, or failure of, system level controls, which should enforce conformance with Centrelink's documented business rules and data recording specifications.
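A system level control for the date anomalies listed above could take a form like the following. This is a minimal sketch under stated assumptions, not a description of any control in ISIS: the function name and the returned labels are hypothetical, and a same-day result is treated only as a case for review, since it can be genuine.

```python
from datetime import date

def check_birth_death(date_of_birth, date_of_death):
    """Classify the logical relationship between two recorded dates.

    Returns one of the hypothetical labels 'ok', 'same_day' or
    'death_before_birth'. A missing date of death is treated as 'ok'.
    """
    if date_of_death is None:
        return "ok"
    if date_of_death < date_of_birth:
        # Logically impossible: the record shows death before birth.
        return "death_before_birth"
    if date_of_death == date_of_birth:
        # Possible in reality, but worth flagging for manual review.
        return "same_day"
    return "ok"

# Made-up example: birth recorded two months after the recorded date of death.
result = check_birth_death(date(1940, 5, 1), date(1940, 3, 1))
```

Applied at data entry, a check of this kind would reject or queue the record rather than allow it into the production database.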
ANAO's analysis of the data indicated that Centrelink records a false or 'dummy' date of birth when a customer's true date of birth is not known with certainty. The 'dummy' dates used in the ISIS database are 1 January and 1 July in any given year, although the years 1900 and 1901 are regularly used. ANAO considers that this practice could skew any statistical analysis based on customer age, although the effect would be most noticeable for age profiles over 100 years. According to ANAO's analysis, approximately 0.5 per cent of recorded dates of birth for current Centrelink customers are inaccurate to some extent.
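An analyst wanting to exclude or separately report such records might screen for them along the following lines. This is a hedged sketch: the function name and the 'likely' and 'possible' labels are assumptions made for illustration, and because many genuine birthdays fall on 1 January or 1 July, a 'possible' flag only marks a candidate for review.

```python
from datetime import date

DUMMY_DAYS = {(1, 1), (7, 1)}   # (month, day): 1 January and 1 July
DUMMY_YEARS = {1900, 1901}      # years the report says are regularly used

def dummy_dob_flag(dob):
    """Return 'likely', 'possible' or None for a recorded date of birth."""
    if (dob.month, dob.day) not in DUMMY_DAYS:
        return None
    if dob.year in DUMMY_YEARS:
        return "likely"
    return "possible"

flags = [dummy_dob_flag(d) for d in (date(1900, 1, 1), date(1962, 7, 1), date(1975, 3, 14))]
```

Age-based statistics could then be produced both with and without the flagged records to show how much the dummy values affect the result.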
ANAO identified that 1.46 million customer records on ISIS had a date of death recorded for the customer, some of which were many decades in the past. ANAO also found that a relatively small number of these records supported a current benefit determination. That is, the data supplied by Centrelink to ANAO indicated that these customers were current, although not necessarily in payment. Centrelink subsequently advised ANAO that payments had ceased for the majority of these customers, but that the records had been corrupted and continued to display a current benefit determination, when they should no longer do so.
Centrelink also advised ANAO that it was required to maintain some records for deceased customers, where there may be an ongoing debt to the Commonwealth or where the record is associated with a partner record.9 While recognising Centrelink has a valid business reason for maintaining those categories of deceased customer records, ANAO considers that there is little reason to maintain, in the production environment of the database, the large number of records relating to deceased customers that do not fit into these categories. The existence of these records gives rise to an unnecessary risk to the integrity of Centrelink payments.
ANAO found that the data field recording customers' Tax File Number was compromised, in that entries in that field were not unique. Yet, Tax File Numbers are intended to be unique—the one Tax File Number may not be shared by two people. ANAO found that up to 7 000 customer records—3 500 pairs of records—shared the same Tax File Number. ANAO's analysis of Centrelink's data indicated that, in many cases the single Tax File Number was shared in Centrelink's records by a couple, or a parent-child combination, or a sibling combination.
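A uniqueness test of this kind can be sketched as a simple grouping of records by Tax File Number. The field names `crn` and `tfn` are hypothetical, and the sketch assumes that records with no Tax File Number recorded are skipped rather than treated as sharing a value.

```python
from collections import defaultdict

def shared_tfns(records):
    """Map each Tax File Number held by more than one record to those records' CRNs."""
    crns_by_tfn = defaultdict(list)
    for rec in records:
        tfn = rec.get("tfn")
        if tfn:  # ignore records where no TFN has been recorded
            crns_by_tfn[tfn].append(rec["crn"])
    return {tfn: crns for tfn, crns in crns_by_tfn.items() if len(crns) > 1}

# Made-up sample data: two records share the same TFN, one has none recorded.
sample_records = [
    {"crn": "100", "tfn": "111222333"},
    {"crn": "200", "tfn": "111222333"},
    {"crn": "300", "tfn": "444555666"},
    {"crn": "400", "tfn": None},
]
shared = shared_tfns(sample_records)
```

Each entry in `shared` corresponds to one group of records that would need to be examined, for example a couple or a parent and child recorded under the same number.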
Recording customer identity (Chapter 5)
ANAO examined 8.3 million lines of Proof of Identity data to determine the usefulness of the information recorded in ISIS, in substantiating the identity of customers. This involved checking that the POI documents recorded in the database were associated with unique serial numbers or registration numbers.
ANAO's analysis revealed that, as at September 2005, many ISIS records displayed entries inconsistent with Centrelink's policy for recording POI information. Rather than recording valid serial numbers for particular POI documents, thousands of records displayed apparently false serial numbers, such as 99999, 123456 and xxxxx. In addition, many other records displayed a text entry, such as 'Citizenship papers', 'Unknown' or 'Sighted', rather than a valid serial number.
ANAO's analysis showed that only 72.6 per cent of POI records citing Australian Citizenship Certificates contained unique values. In addition, 96.6 per cent of POI records citing Current Australian Passports, and 56.6 per cent of POI records citing Australian Birth Certificates, contained unique values.
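The kind of analysis described in this chapter can be illustrated with two small checks: one that screens for placeholder entries such as those quoted above, and one that computes the share of entries whose value occurs only once. Both are sketches under assumptions. The placeholder rules are illustrative heuristics rather than Centrelink's business rules, and the definition of 'unique' used here (a value that appears exactly once for a document type) is an assumption about how such percentages might be calculated.

```python
from collections import Counter

def looks_like_placeholder(serial):
    """Heuristically flag entries that are unlikely to be real serial numbers."""
    s = serial.strip().lower()
    if not s:
        return True
    if len(set(s)) == 1:                       # one repeated character: 99999, xxxxx
        return True
    if len(s) >= 4 and s in "0123456789":      # ascending run of digits: 123456
        return True
    if not any(ch.isdigit() for ch in s):      # free text: Unknown, Sighted
        return True
    return False

def percent_unique(serials):
    """Percentage of entries whose normalised value occurs exactly once."""
    if not serials:
        return 0.0
    normalised = [s.strip().lower() for s in serials]
    counts = Counter(normalised)
    unique = sum(1 for s in normalised if counts[s] == 1)
    return 100.0 * unique / len(normalised)

# Made-up sample entries for a single POI document type.
entries = ["A1234567", "99999", "99999", "Sighted", "B7654321", "123456"]
flagged = [e for e in entries if looks_like_placeholder(e)]
share = percent_unique(["a1", "a1", "b2", "c3"])
```

A real serial number format that contains no digits would be wrongly flagged by the last rule, so any such screen would need tailoring to each document type.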
Overall, ANAO's analysis of four primary POI documents revealed that up to 30 per cent of the recorded details on ISIS were insufficient or unreliable in terms of uniquely identifying or substantiating the identity of customers.
ANAO also noted that, since September 2001, Centrelink had introduced a range of system level controls and quality assurance procedures designed to improve the quality of POI information recorded in ISIS. ANAO accepts that Centrelink has made a significant improvement in the quality of POI data entered into ISIS over the past two or three years, and that current procedures are superior to those in place prior to 2001. However, ANAO's analysis included all POI data recorded on ISIS as at September 2005—recent and historical—as historical POI data is still used, in many cases, when processing a new claim for a previous Centrelink customer.
Integrity of the primary key (Chapter 6)
Centrelink uses the CRN as the primary key for ISIS. Within any database, the primary key is of great importance. In a well-managed database, each customer is allocated one, and only one, CRN. In addition, no one CRN is shared by two records within the database.
ANAO found that Centrelink's primary key was compromised by the existence of up to 25 000 duplicate CRNs. That is, in 25 000 cases the same CRN had been allocated to two different customers. In addition, ANAO identified up to 500 000 customers with multiple CRNs. That is, those customers had been registered at least twice, under two different CRNs. While the raw numbers appear substantial, they represent approximately 0.2 per cent and 3 per cent, respectively, of all customer records in ISIS.
The existence of duplicate CRNs means that the primary key may not be relied upon to uniquely identify Centrelink's customers within ISIS. The effect of multiple CRNs is that customer information may become fragmented across two or more different records. This situation presents a risk of duplicate benefit payments or an inappropriate combination of benefit payments—one on each of the customer's unrelated records.
ANAO's analysis indicated that up to 1 000 Centrelink customers possessed a current benefit determination on each of two separate records. In many of these cases, one benefit determination appeared to be linked to a payment while the second did not.10 In a minority of cases, the data indicated that a customer was current for the same benefit on two records, or that the two records supported incompatible benefit determinations.11
ANAO provided Centrelink with relevant details and Centrelink investigated the circumstances of these cases. Centrelink then advised ANAO that some of these cases were previously known to exist and that alternative controls were in place to avoid duplicate payments.12
Therefore, ANAO found that, while the fact that up to 500 000 customers have multiple records presents a risk of overpayment, that risk had been realised in only a very small number of cases. Nevertheless, ANAO considers that Centrelink should address the underlying data integrity issues, rather than rely on an incomplete set of alternative controls to mitigate these risks.
Implications of data integrity issues (Chapter 7)
ANAO found that the inconsistent recording of customers' names and addresses creates a number of problems and reduces the integrity of customer data generally. These problems are compounded by the use of dummy values for some date fields, the existence of training records in the production environment and anomalies in the recording of Tax File Numbers. During the course of this audit, ANAO observed the:
- improper use of data fields—all name elements appearing in the surname field, leaving the first and second name fields blank;
- reversal of first and second names across two records;
- data entry errors and variations in spelling, including the use or non-use of hyphens and/or spaces in two-word name elements;
- inconsistencies in recording addresses; and
- use of values outside those defined as legal values in Centrelink's data dictionary.
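Several of the name recording problems listed above (all elements entered in the surname field, reversed first and second names, and hyphen or space variations) can be neutralised for matching purposes by comparing records on a normalised key. The following is a sketch only: the function name and approach are assumptions for illustration, and the key deliberately ignores the order of name elements, which makes it suitable for finding candidate matches but not for confirming identity.

```python
import re

def name_key(first, second, surname):
    """Build an order-insensitive, case-insensitive key from the name fields."""
    combined = " ".join(part for part in (first, second, surname) if part)
    # Split on whitespace and hyphens so 'Smith-Jones' and 'Smith Jones' agree.
    tokens = [t for t in re.split(r"[\s\-]+", combined.lower()) if t]
    return tuple(sorted(tokens))

# Made-up examples of the variations described in the list above.
all_in_surname = name_key("", "", "John Paul Smith")
reversed_names = name_key("Paul", "John", "Smith")
hyphenated = name_key("Mary", None, "Smith-Jones")
spaced = name_key("Mary", "", "Smith Jones")
```

Records sharing a key would still need further evidence, such as date of birth or address, before being treated as the same customer. Spelling variations and two-word elements run together without a space are not handled by this key.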
ANAO considers that inaccurately recording customer details could inhibit Centrelink's ability to effectively analyse its customer data for compliance and fraud detection purposes. Inaccurate data could also reduce the effectiveness of Centrelink's data matching with other agencies and organisations. Obsolete and dummy records could make it difficult for Centrelink to calculate accurate counts of customers or to conduct modelling or data profiling activities that rely on customer age.
ANAO found that poor data integrity in Centrelink's electronic POI records could impact on its capacity to effectively detect and prevent fraud, or to engage in data matching activities where a high degree of confidence in the identity of its customers is required.
ANAO found that fragmenting customer information across two or more records, through the inadvertent allocation of multiple CRNs to individual customers, presented the greatest risk to maintaining the integrity of Centrelink payments. With two or more—unlinked—customer records, a customer may have two current benefit determinations, be identified as deceased on one record but not the other, or display inconsistencies in personal information across those records.
ANAO noted that Centrelink had in place other controls to guard against overpayment in the cases of multiple CRNs known to Centrelink. This audit found that only a small number of overpayments appeared to be associated with multiple CRN customers. However, ANAO found that, with up to 500 000 customers on ISIS having multiple CRNs, Centrelink would be in a stronger position to manage these risks if it were to resolve the underlying data integrity issues.
The ANAO made five recommendations to improve the accuracy and integrity of Centrelink's electronic customer data. The ANAO made two recommendations aimed at improving the usefulness of Centrelink's data integrity error detection and reporting system. The ANAO also recommended that Centrelink continue to monitor and improve the customer POI information held in ISIS, and improve the integrity of the ISIS primary key, the CRN.
Centrelink thanks the Australian National Audit Office for the way in which this audit was conducted. The professionalism of the officers, their evident technical expertise, the working relationship that was fostered and the willingness of all parties to address operational issues throughout the course of this audit have greatly aided Centrelink in quickly implementing continuous improvements to the administration, accuracy, completeness and consistency of its data holdings.
1 Other records include historical records for customers previously in payment, along with records for organisations and children.
2 One of the computing environments stores information on Centrelink customers residing outside Australia.
3 For example, that a customer's recorded date of death did not precede his or her recorded date of birth, or that a customer's marital status (single or partnered) aligned with the payment rate for a benefit that was paid at either a single or partnered rate.
4 The primary key is a means of uniquely identifying each record within the database and a mechanism to link data across various elements of the database.
5 And, therefore, had multiple records in the database.
6 Quality On-Line (QOL) monitors the completeness and correctness of information used in processing customers' claims. QOL was introduced as a quality assurance process to ensure that payments made by Centrelink and the services provided are correct. The QOL system is based on a second person comprehensively checking the correctness of the work of the CSO who initially processed a customer's claim and entered the customer's data into ISIS.
7 The Random Sample Survey (RSS) Programme involves sampling a number of Centrelink customers from each of the main payment streams to verify the accuracy of information provided by those customers. This process occurs each year and also checks the accuracy of calculated payment rates, based on the customer information.
8 False positive results can arise when a data integrity error checking program tests data against obsolete business rules. These data integrity checking programs report errors where they should not.
9 Where the partner is still alive and in receipt of payment.
10 Some benefit determinations are not payment-related. For example, a person may have a current benefit determination to receive a Low Income Health Care Card or for JobSeeker Registration, which allows access to the Job Network.
11 For example, the recipient of a Carer payment may not be in receipt of another income support payment, such as Age Pension, NewStart Allowance or Parenting Payment. However, such a person may be entitled to receive a Carer Allowance or a Family Tax Benefit payment. [More information is available in Centrelink's publication, A Guide to Australian Government Payments].
12 Centrelink had implemented a duplicate payments filter—a control within the payments system—to stop payment on the second record of a known duplicate pair. Centrelink advised the ANAO that at 18 January 2006, 1 283 of the 2 000 records had been investigated. Centrelink identified one case of overpayment and six cases where a customer had been issued a second Low Income Card.