Business Continuity Management and Emergency Management in Centrelink
The audit assessed whether Centrelink has effective Business Continuity Management and/or associated risk management procedures and plans in place that: minimise the likelihood of a significant business outage; and in the event of such an outage, minimise disruption of critical services to customers. The audit also assessed whether Centrelink services satisfy special community demands in times of emergency.
Centrelink delivers the Government's social policy agenda and other programs. In 2001-02, it paid around $55 billion to over 6.3 million customers. Business Continuity Management (BCM) strategies and plans are essential to ensure the agency can continue to deliver these important programs in the event of a crisis.
The Government has required Centrelink to play an increasing role in responding to major emergencies affecting Australians (such as the 2002 Bali terrorist bombings). Centrelink's emergency management (EM) performance is therefore of great importance to the Australian community and the Parliament.
The primary objective of the audit was to assess whether Centrelink has effective BCM and/or associated risk management procedures and plans in place that: minimise the likelihood of a significant business outage; and, in the event of such an outage, minimise disruption of critical services to customers. The audit also assessed whether Centrelink services satisfy special demands in times of emergency.
Accordingly, the ANAO examined Centrelink's frameworks, approaches, strategies, plans, capabilities and recent performance in both BCM and EM.
Key audit findings
Centrelink's BCM framework, elements and main approaches (Chapters 2 and 3)
The ANAO found that Centrelink's BCM framework effectively addresses the main elements of business continuity outlined in the better practice literature, namely crisis response, crisis management, interim processing and business process recovery. For example, Centrelink's crisis management organisational structure is logical, as it is based on a Crisis Command Centre structure, includes appropriate managers from Centrelink's network, specifies appropriate Business Resumption Teams, and clearly defines the roles and responsibilities of the key business continuity participants. The Crisis Command Centre structure and Business Resumption Teams also generally work well in practice.
Centrelink has recently established a Business Continuity and Emergency Management team, and an IT Service Continuity Management team. This new structure should improve the alignment of BCM and EM in Centrelink. The ANAO cautions that, in implementing these changes, Centrelink must clearly distinguish the objectives and operating requirements of BCM and EM. The new structure also potentially allows a unit to have a widely recognised and accepted role in co-ordinating and overseeing BCM in Centrelink. However, as these changes are very recent, and roles have not yet been fully determined, the ANAO emphasises the need for a single unit to have clear and unambiguous oversight responsibility for BCM across the whole organisation, even if this would require the unit to oversee associated information and technology activities undertaken by other units.
The Business Continuity and Emergency Management team provides advice to other parts of Centrelink, and has recently compiled a database of business continuity plans and emergency plans, which it will analyse to improve central oversight and disseminate better practices. However, the ANAO found that Centrelink could enhance its central oversight of BCM and EM, and its guidance on them to many staff and managers in its network. The ANAO also found that Centrelink should develop, and appropriately distribute, an overarching BCM document that clearly and comprehensively outlines its approach to BCM and improves communication of prime BCM objectives, methods and responsibilities.
Most of Centrelink's BCM related plans and processes incorporated a risk management process consistent with that used for broader risk management in Centrelink. The ANAO found risk management and BCM to be well aligned at the operational level, although there is scope for further improvement through enhancing the consistency of business continuity plans and underlying risk management methodologies. There is also scope to upgrade communication between managers with overarching responsibility for BCM and risk management, to further enhance BCM in Centrelink.
Centrelink's BCM framework is underpinned by organisational processes consistent with the ANAO's Better Practice Guide on BCM that focus on: project initiation; identifying critical business processes; designing and implementing treatments; rehearsal; and training.
Identifying and addressing critical business processes (Chapters 4 and 5)
An important element of this audit was to establish whether Centrelink's business continuity strategies and plans covered all critical processes, which would ensure that all services could be recovered within a timeframe that would enable Centrelink to meet its specified business objectives. Centrelink uses two main mechanisms to identify key business processes and undertake Business Impact Analysis, namely Business Criticality Reviews, and continuity elements required to be incorporated into new projects.
The ANAO found that the 2002 Business Criticality Review provided a reasonable approach to identifying key business processes and undertaking a Business Impact Analysis. However, the review should have considered a number of business processes that were omitted, including data and telecommunication systems, and some information and technology (I&T) applications and infrastructure. The review could also have considered the impact of an incident affecting an Area Support Office. Centrelink has advised that a Business Criticality Review to be undertaken in 2003-04 will address these concerns.
The ANAO considers that Centrelink's project management process provides an effective framework for treating both business continuity and risk for new projects. However, the ANAO found that the lack of central recording and oversight of the business continuity elements of new project plans reduced the effectiveness of the project management process in addressing business continuity.
Centrelink's business continuity capability is strongly bolstered by the nature of its business, especially the delivery of services to customers throughout its network. For example, if Centrelink were to lose a Customer Service Centre or Call Centre, it could generally quite readily service customers at alternative sites. This flexibility provides extensive built-in redundancies and makes many business processes less ‘critical’ than they may at first appear. Many of Centrelink's in-house business support processes also sustain its capacity to manage crises, including the existence of substantial capabilities in corporate communication, social workers, people management, and building management. Centrelink is also able to ‘fly-in’ technology and other resources from the National Support Office to the network.
Information technology and telecommunications (Chapter 5)
Centrelink's I&T infrastructure and applications (especially its mainframe processing of customer entitlements) and telecommunications constitute its most critical processes.
At the time of audit fieldwork, the ANAO found that Centrelink's I&T framework had a number of shortcomings but was generally consistent with established BCM practices. The ANAO notes that Centrelink has recently embraced an Information Technology Infrastructure Library (ITIL) framework. IT Service Continuity Management (ITSCM) is a component of ITIL. Fully implementing this ITSCM component should substantially assist Centrelink to improve I&T business continuity management.
The ANAO noted that Centrelink has two data centres, in separate locations, which provide back-up capability for mainframe processing. This capability is likely to be substantially enhanced by the establishment of a proposed single logical data centre, comprising inter-operations of both data centre facilities. At the time of drafting the audit report, Centrelink had begun to formally consider the consequences of simultaneous devastation of both data centres, and its off-site backup storage facility.
Centrelink has a number of risk exposures in its I&T applications, hardware and system software, especially in regard to its mid-range equipment and network environments. The ANAO considers that Centrelink should extend its risk-based analysis of hardware and system software to make it comprehensive and consolidated. Centrelink has undertaken to re-assess the recovery times of its various I&T platforms and to address risks that expose them to system failure.
The ANAO observed that Centrelink's principal applications were not consistently supported by appropriate levels of documentation. Documentation was frequently out-of-date, incomplete and/or relatively inaccessible.
Telecommunications is also a critical dependency for Centrelink, as it enables Centrelink to transmit the data and voice traffic required to conduct business. The ANAO found that Centrelink's data network provides the required resilience, redundancy and flexibility to ensure high availability, and hence business continuity. Similarly, the ANAO found that Centrelink's voice telecommunications network, Centrelink Call, has taken effective steps to ensure business continuity and resumption. However, the ANAO suggests that Centrelink Call more closely examine its interactions with other Centrelink I&T continuity processes and plans, including those of third party suppliers, to ensure that recovery expectations can be met during a major outage.
Adequacy of BCM strategies and plans, including maintenance and rehearsal (Chapters 2, 4, 5, and 6)
Business continuity plans (BCPs) and associated plans are an important element of virtually all BCM strategies. The ANAO found that Centrelink's BCPs and associated plans were consistent with theory and practice outlined in core BCM literature, including the ANAO better practice guide. However, the ANAO identified excessive variation in the structure, contents and coverage of plans used across the network. As well, these plans often do not differentiate between obligations to respond to local community emergencies and the need to respond to a crisis that disrupts Centrelink's own operations. The ANAO also suggests that Centrelink review its I&T continuity plans. Such a review should include those for the mainframe environment, which are narrowly defined.
Centrelink has recognised these limitations and is planning to address them through the BC and EM Framework project currently underway, and the introduction of ITIL, respectively.
Maintenance of plans involves updating them on a regular and timely basis to ensure, for example, that contact details of key personnel are correct and to incorporate improvements. While some parts of Centrelink's network regularly update plans, limitations in the higher-level management of BCM prevented Centrelink from providing a high level of assurance that BCPs are maintained on a systematic and regular basis.
Business continuity should be treated as an ongoing process, rather than a one-off project. Centrelink is aware of the need for rehearsal as part of this continuous process, and in recent years has undertaken two major simulations involving the National Support Office, to rehearse its response to major crises. However, there have been no such exercises in the wider Centrelink network. Moreover, there is little evidence that staff in the Centrelink network regularly rehearse their roles, or that any other form of training for crisis response or business continuity has been undertaken on a formal and on-going basis. Again, Centrelink is addressing this issue through the BC and EM Framework project.
Capability of Centrelink staff to ensure continuity (Chapters 4, 5, and 6)
The primary objective of BCM is to provide a high level of assurance that an organisation can respond to a crisis. While a major focus is often on frameworks and plans, as the ANAO BPG states, ‘people are often overlooked as the most critical resource in ensuring continuity of business’.
Centrelink has recognised the importance of its staff in ensuring business continuity, and has included in its framework and plans: human resource management issues including occupational health and safety; protocols for communication; treatments for any psychological effects on staff; and support for the recovery team.
During fieldwork, the ANAO observed Centrelink's response to the loss of the Warrnambool Customer Service Centre (CSC) due to fire. This event highlighted the ability, commitment, skills and knowledge of Centrelink staff to undertake contingency processing and restore prime processing. The fire also demonstrated the high level of senior management support for such recovery efforts, the capacity of the recovery team members, and the effectiveness of interaction between all levels of Centrelink in a crisis.
The skills noted above are vital to any business continuity response. The ANAO notes the views of managers and staff at the Warrnambool CSC, and at the associated Victoria West Area Support Office, that other Centrelink offices would most likely be as successful as Warrnambool in responding to such an outage. This is due to the consistency of staff skills and commitment throughout the network, as well as other inherent strengths, such as those discussed in the paragraph above.
However, to further improve the capacity of staff to contribute to BC (and EM), the ANAO found that Centrelink needed to implement a more structured process to develop a competency and learning framework to train staff. The ANAO notes that Centrelink is addressing this through a Training and Communications Strategy, involving the Centrelink Virtual College, as part of the BC and EM Framework Project.
Emergency management in Centrelink (Chapter 7)
Centrelink has a legislated obligation to deliver special and emergency services to the Australian community as directed by the Government. The frequency and complexity of these services have been increasing in recent years, requiring Centrelink to more clearly establish its roles and responsibilities, and manage stakeholder expectations.
The ANAO found that Centrelink's current EM framework clearly articulated internal roles and responsibilities. However, the ANAO notes Centrelink's current project to more closely align its EM and BCM roles. The project will also address a number of improvements identified by the audit related to resourcing and equipping the National Crisis Command Centre.
Another important aspect of Centrelink's EM framework is the effectiveness of links between the agency and other EM stakeholders. The ANAO found that Centrelink's Area Support Offices had effective liaison links in place with their State and Territory counterparts. However, the ANAO considers that greater coordination and monitoring of this effort at the national level would ensure consistent coverage of, and knowledge about, Centrelink's emergency response roles and responsibilities among its own staff, as well as among other EM stakeholders across Australia.
Performance of BCM and EM in Centrelink (Chapter 2)
The audit also examined Centrelink's recent performance in BCM and EM, on the basis that good past performance may reflect effective BCM and EM frameworks and supporting processes, which may in turn indicate (but not guarantee) the likelihood of continued good performance, and vice versa.
While the ANAO could not obtain clear and comprehensive performance information, available evidence generally supported Centrelink's claim of an excellent record in BCM, that is, in providing continuous service to customers and in quickly restoring critical business services after an interruption.
Similarly, available evidence, while not comprehensive, indicated that Centrelink has delivered its EM roles and responsibilities to the high standards expected by stakeholders, including Centrelink customers, the Parliament and the Australian community. Stakeholder satisfaction ultimately depends predominantly on Centrelink delivering emergency payments on a timely basis to eligible customers, consistent with relevant policies and legislation.
Overall audit conclusion
Centrelink has comprehensive and detailed BCM and associated risk management frameworks, policies and plans that generally provide appropriate preventive controls to minimise the likelihood of outages to many of its critical business processes. As well, they provide effective corrective treatments to minimise disruptions of services to customers where these business processes are interrupted. It also has skilled staff, committed to the continuity of essential services to customers.
Centrelink has demonstrated its capacity to deliver its critical business processes by maintaining continuity of customer payments. Its BCM capability has proven to be effective in overcoming the loss of network offices, such as CSCs and Call Centres. Notwithstanding this good performance and inherent strengths, Centrelink has a number of continuity risks. In particular:
- some elements of its I&T environment do not have sufficient continuity controls and treatments, and in light of experiences with the ACT firestorm in January 2003, it is apparent that Centrelink has not adequately addressed risks associated with simultaneous catastrophic events to its data centres and off-site backup storage facility;
- the existing framework for BCM provides insufficient assurance as to the state of BCM preparedness throughout its service delivery network; and
- there are inadequacies in plan maintenance, rehearsal and staff training.
Centrelink noted many of these shortcomings during audit fieldwork, and is in the process of implementing strategies and practices to improve its BCM capacity. Importantly, Centrelink's planned implementation of the IT Service Continuity Management component of the ITIL framework should assist the agency to provide a more comprehensive, consistent and coherent approach to I&T BCM.
Centrelink has been able to satisfy increasing requirements to assist victims of community emergencies, despite some limitations in its EM framework and policies. This performance was based on flexible but robust systems to approve, deliver and record payments, mechanisms to liaise with other emergency service providers, and the efforts of skilled and committed staff. Opportunities for improvements identified in this audit, and in Centrelink's BC and EM framework project, should further improve the efficiency and effectiveness of Centrelink's EM response.
The ANAO made 11 recommendations to further improve Centrelink's BCM and EM capacity. Centrelink has agreed to all of the recommendations and, at the time of report tabling, had begun to address all of them.
A proposed report was issued to Centrelink. Centrelink advised the ANAO of its response to the audit as follows:
Centrelink has welcomed this audit and we have taken the opportunity to participate fully, gaining many benefits in the process. We have particularly appreciated the efforts and consultative approach of the ANAO audit team.
Centrelink has a proud record in Business Continuity and Emergency Management reflecting its role as an efficient and flexible agency for delivery of Australian Government services. Centrelink's role in times of crisis for our community, such as the Bali bombings, the Katherine floods and the Ansett collapse, has earned praise throughout our community and furthered our reputation of excellence in emergency response. Coincidentally, during the period the audit was conducted, our Warrnambool office was completely destroyed by fire, without disruption of services to our customers.
The January 2003 Canberra bushfires further highlighted Centrelink's ability in crisis response. The outstanding response to the Canberra community, even when many of our staff were personally affected, is testament to our capability. During the bushfire emergency, our normal services were maintained and we were able to mobilise additional staff and resources to provide extra support to the community through effective co-operation with the Australian Capital Territory government. Despite the exceptional and unusual ferocity of the Canberra fires, Centrelink's I&T infrastructure was able to continue uninterrupted service delivery for our customers throughout Australia. The McLeod Inquiry into the Operational Response to the January 2003 Bushfires indicated that ‘Although it was probably the most severe fire experienced in the region in the last 100 years, the emergence of large destructive fires in the region, from time to time, is by no means unique.’ These events have prompted a review of the risks faced by our data centres and off-site backup storage facilities. As our experience of the bushfires has shown, Centrelink's robust network and dedicated staff have allowed us to continue to provide effective service even when the organisation itself was affected.
Centrelink considers itself a leader in the field of Business Continuity and Emergency Management within the Commonwealth Public Sector. Centrelink is clearly a key player in the whole-of-government response to emergencies in the community with representation on key committees. Centrelink also plays a significant role in the application of Commonwealth remedies for Emergency Management Australia, the Department of the Prime Minister and Cabinet, the Department of Family and Community Services, the Department of Agriculture, Fisheries and Forestry and the Department of Transport and Regional Services. The level of inclusion and consultation from those agencies is a strong indication of our performance and credibility in the Commonwealth emergency management arena.
It is also important to note that the Government's continued confidence in Centrelink has been demonstrated by its significant commitment to Centrelink's IT Refresh program. This program will further our capability to deliver services on behalf of the Australian Government, including bolstering business continuity arrangements. The forthcoming implementation of Centrelink's recently revised Business Continuity and Emergency Management framework will provide improved integration, communication and consistency across our service delivery network. It will also facilitate continuity plan rehearsal, testing and refinement.