Business Continuity Case Study: lessons learned from malware attack


County Council’s IT Systems hit by cyber attack

The attack, triggered when an employee clicked on a malicious attachment in an email, led to a shutdown of council Information Technology (IT) systems as the authority contained and then investigated the malware’s impact.
The malware was later identified as a new variant of the ‘JS.Nemucod’ JavaScript Trojan downloader, designed to encrypt files indiscriminately. Anti-virus vendors have since confirmed it as a variant of “Teslacrypt v3”.
It was detected when users found they were unable to access files on the corporate network. Further technical analysis showed that the files were being encrypted by malware which had been delivered by an email attachment containing a .zip file.
This type of malware is known as ‘ransomware’ because of the financial ransom demanded to unlock the affected files. This new variant generated a so-called ‘zero day’ attack on the first organisation unlucky enough to open it.
Fortunately, council systems and online services were fully restored after being out of action for almost a week. An effective fix for this piece of malware, which the council’s security vendors had not seen before, was developed and tested. The council took swift action to prevent access to systems as a precaution, and there was no evidence of personal data being lost.
The council has products in place to protect its systems in case of a malware attack, but its two separate anti-virus vendors confirmed that this particular variant had not been seen before, making it very difficult to defend against using anti-virus (AV) signature techniques alone.
Investigations suggest that this was a localised incident, involving a relatively small number of users, and there has been no evidence that this was a targeted attack on Lincolnshire County Council. Nonetheless, over 47,300 files were encrypted.
Because of the destructive capabilities of ransomware attacks, the council prevented access to all systems to contain and isolate the malware until it had been safely identified and removed. Following checks, systems were confirmed as being healthy, and there was minimal risk to information.
To reduce the risk of further instances of the malware being activated, the council identified the devices belonging to people who had received the email so that they could be cleansed. Access to all IT services which presented writable file shares to users was shut down.
There is no evidence to suggest that outbound emails matching the known characteristics of the malware left the Council’s email system before it was closed down. The only services directly affected by the attack were internal network file shares used by business units within the Council.

As well as taking fast action to contain the malware, the council activated its business continuity (BC) plans, to make sure any effect on service delivery (especially critical services) was monitored, and contingency plans were activated.
This process was managed by the Business Continuity Incident Management Team (BCIMT), chaired by the Director of Finance & Public Protection in his role as ‘Gold’ (strategic lead). This also assured the council of its continuing ability to respond to civil emergencies.
As part of IT service restoration planning, pre-agreed prioritisation was given to systems and applications supporting safeguarding, child protection, specialist transport, registration services, Fire & Rescue Services (FRS), customer services and urgent payments.
Individual service areas invoked their own BC plans for the loss of IT (or followed policy direction set by the strategic lead), and additional solutions were found in case of longer-term disruption.
There was no evidence to suggest that the confidentiality of any personal information was compromised, nor of any long-term loss of information availability (apart from those files that were created and then encrypted before the next scheduled back-up).
The approach undertaken was verified by the East Midlands Cyber Crime Unit, which also investigated the incident and liaised with Council IMT security leads.
Identifying the lessons
This report to Council on the effectiveness of the corporate response to the malware attack of 26th January 2016 identifies lessons learned and actions required to further improve our response to similar events.
It presents an analysis of the event through ‘management reviews’ of incident management systems (including a specific analysis of staff communication cascades), ‘structured debriefs’ and various incident reports. These have been used to identify both ‘strengths’ in the response to the attack and ‘areas for improvement’, from which a main set of recommendations is formulated for consideration by Council.
Strengths:
• There was a positive response from our communities, our partners (especially the Council’s IT service provider and our security vendors who worked around the clock with our own Information Management Technology team to find a solution) and our own staff, who demonstrated goodwill, initiative and determination to maintain services.
• Although there were delays in reporting the incident, the ability to then quickly detect abnormal file activity and internet traffic was important in ensuring that both human and technology reporting were effective. The ability to close down services to contain the malware was essential.

• The incident reinforces the value of our compulsory Information Governance training in making sure we do not rely solely on technical safeguards, which can never be enough on their own.
• Co-ordinating both the IT investigation and solutions, and business continuity at strategic levels within the council, was invaluable: it gave a sense of urgency and common purpose to our response, setting clear objectives and priorities for business recovery, and redeploying staff where necessary. The invocation of Business Continuity Incident Management processes to oversee the response and recovery was agreed as a key strength in our response, and compares favourably with the response to the previous IT outage incident in 2010.
• Lincolnshire Fire & Rescue also invoked their own (parallel) command & control structures to secure their own resilience (as a separate ‘category 1 responder’ under the Civil Contingencies Act).
• Invocation of BC plans (especially for those ‘most affected’ critical services), additional ‘workarounds’ (in particular urgent coordination between the customer services centre and adult & children’s services), and the coordination provided by the BCIMT, all worked well and ensured all critical issues were managed to their resolution.
• The Chief Information Officer (CIO) performed an invaluable role as the public face (‘talking head’) of the Council, providing regular updates and reassurance to both communities and staff, whilst also correcting a number of inaccuracies in media reporting. She was very well supported in this role by the strategic communications team.
• The incident response highlights the value of the official council Twitter feed, which has become a trusted corporate channel whose followers trust us and re-tweeted our messages to the widest audience. Effective use of social media also helped manage demand on the customer services centre.
• Support from partner organisations (e.g. NHS and Districts) helped us get our message to a wider audience. The Police also immediately supported the council by ensuring civil emergency responses would be effectively coordinated from an alternative to the County Emergency Centre (CEC).
• A pragmatic approach to employee relations issues was taken, with full engagement of the Corporate Management Board (CMB). Managers and staff were proactive, with the time being used to undertake supervision, appraisals, training and filing. Staff demonstrated flexibility in taking annual leave (AL) or time off in lieu (TOIL) to help manage the situation.
Areas for improvement:
• In spite of the growing threat, loss of IT through cyber-attack does not currently feature as a strategic risk. More recent corporate business continuity planning has concentrated on the interdependencies between the loss of premises, or power, and the provision of IT services, but not on the vulnerability of our services to prolonged and complete loss of systems through a malicious act.

• Using anti-virus products which combine detection from a number of vendors might have reduced the risk of relying on a particular vendor to identify and/or protect against new threats. Reconsidering what is blocked or quarantined at the perimeter (for example .zip files) may also reduce risk.
• The frequency and sequencing of the BCIMT and the IMT (IT) partner & vendor meetings might have been better coordinated to aid information sharing and timely decision making. (Applying the ‘IMT’ acronym to these two ‘independent’ meetings caused some confusion and might lead to avoidable failures in governance and the management of responses. In future, and to differentiate, BCIMT will be renamed as Incident Management Group – IMG.)
• An avoidable tension developed between prioritising asset tracking & return and the more urgent focus on restoring systems. Current records of the registration and actual usage of IT assets and accounts proved inaccurate and unreliable, which hampered retrieval and recovery efforts.
• Service areas are entering into contracts to provide services to other authorities and third-party organisations without the knowledge of IMT. This places greater reputational risk on LCC and may also incur contractual penalties.
• The implementation of stand-alone provision of critical IT systems, independent of the network and systems which became unavailable, was not generally possible (outside of full disaster recovery plan invocation) but may have been useful to support continuity of service in a number of business areas. Stand-alone provisions were put in place to support some critical social care functions.
• Earlier representation from Democratic Services within the BCIMT (IMG) may have helped address Members’ concerns that they were not fully communicated with.
• Unrelated incidents which followed the malware attack, such as issues with Wi-Fi provision and Members’ iPads, were incorrectly attributed to it.
• Prior to this incident, the Council’s IT service provider had not delivered on its obligation to liaise with service areas to ensure that BC plans are aligned with current IT capabilities and that all planning assumptions are accurate.
• The incident identified clear gaps in individual service BC plans where there are no apparent contingencies for a total loss of IT beyond relatively short time periods. Business Continuity plans need to be reviewed regularly and checked to ensure they work when tested, with allowance for each phase of expected system loss (1 day loss, 3rd day loss, etc.) and to include the full loss of IT for prolonged periods (rather than partial or limited losses).
• Whilst this malware attack did not provide a real exercise of IT disaster recovery, or of commissioned providers’ business continuity plans, other ‘cyber security’ incidents may require their invocation.
• All BC plans (corporate and service level) should be uploaded to and stored on the County Council’s account on Resilience Direct (RD), the UK Government’s secure, web-based resilience information-sharing platform, in order to provide additional access outside the Council’s network (this secure site can be accessed from any public browser).
• Some internal instructions to staff (especially initial system shutdown and subsequent system recovery procedures) were confusing and ambiguous. Anecdotal evidence suggests local interpretation of policy direction was inconsistent.
• The publication of photocopied, handwritten (later typed) messages around County Offices was undoubtedly effective in disseminating information to some staff. However, the ability to communicate with all staff across diverse locations requires more formal arrangements (including delivery options).
• The need for an effective communication cascade to get key information out regularly was recognised following a previous IT outage incident in 2010. However, it quickly became apparent that proper (and tested) ownership of a ‘formal corporate communication cascade’ system had not been established. It should be noted that communication with staff was identified as a priority during the first BCIMT (IMG) meeting, and that once a final list of managers had been identified, the cascade worked well.
• Managers must be responsible for making sure their own contacts are kept up to date – including mobile phone numbers and office locations – as part of service level BC planning.