In this four-part series, we’ll explore critical infrastructure IT and OT in the modern era, focusing on electric utilities in the United States.

Part 1: Understanding IT and OT

Part 2: Security Landscape

Part 3: Bridging the Gap

Part 4: Looking Ahead

Understanding IT and OT

What are IT and OT?

Information Technology (IT) is the term we use for building, deploying, administering, and eventually archiving the systems and processes that handle business data. At any modern utility, IT has fingers in every department: electronic payroll, dispatching crews in the field, GIS of the utility’s expanded grid network, automated phone systems for taking customer bill payments or outage reports, outage maps, engineering modeling servers, email, and web access. If it touches a computer in any way, the IT group is likely involved. They also handle the underlying administration of the systems that make all of this work: from imaging systems for employee workstations, to the network infrastructure that makes the company’s data available, to the email, phone, and videoconferencing systems that enable collaboration across the enterprise. It’s safe to say that the People, Processes and Technology of any modern utility are irreversibly dependent on the systems and software that the IT team supports.

Operational Technology (OT) is the all-inclusive term we use for the industrial control systems that allow our utilities to function and to scale our operational efforts. At an electric, water, or gas utility, the SCADA (Supervisory Control and Data Acquisition) system is the heartbeat that keeps every device in the field under control and enables remote grid operations at scales unheard of fifty years ago. The colossal amount of data it makes available allows for operational efficiencies we never dreamed of in the past, such as cutting Mean Time To Restoration or identifying (and remediating!) problems long before they have an operational impact. Add to that the modern data-driven concepts that these systems and data acquisition processes enable, like:

  • Fault Location, Isolation, and System Restoration (FLISR), which allows modern electric utilities to isolate faults caused by trees on the line and other risks to electric service delivery, reducing field crews’ exposure to dangerous situations and cutting Mean Time To Restoration on the distribution grid
  • Distributed Energy Resource Management Systems (DERMS), which allow the operation of microgrids using community solar, battery, and wind resources, reducing power generation costs and improving system reliability
  • Advanced analytical tools, which enable Operators to make more informed decisions about grid operations, reduce the impact of system events, and improve the efficiency of crews dispatched in the field
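
To make the FLISR idea concrete, here is a minimal sketch of the isolate-and-restore decision on a single radial feeder. It is an illustration only, not how any vendor's FLISR product works: the segment names, the `SW-n` switch naming, the `FlisrPlan` type, and the assumption that a normally open tie switch at the far end of the feeder can back-feed the healthy downstream segments are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class FlisrPlan:
    open_switches: list[str]         # switches opened to isolate the fault
    restored_from_source: list[str]  # segments re-energized from the substation
    restored_from_tie: list[str]     # segments back-fed through the tie switch
    still_out: list[str]             # segments that need a field crew


def plan_flisr(segments: list[str], faulted: str, has_tie: bool = True) -> FlisrPlan:
    """Plan isolation and restoration for one fault on a radial feeder.

    `segments` is ordered from the substation outward. Hypothetical switch
    `SW-i` sits at the head (substation side) of `segments[i]`.
    """
    i = segments.index(faulted)

    # Isolation: open the switch feeding the faulted segment and, if there is
    # a segment beyond it, the switch at the head of that next segment.
    open_switches = [f"SW-{i}"]
    if i + 1 < len(segments):
        open_switches.append(f"SW-{i + 1}")

    upstream = segments[:i]        # healthy, still reachable from the source
    downstream = segments[i + 1:]  # healthy, but cut off from the source

    return FlisrPlan(
        open_switches=open_switches,
        restored_from_source=upstream,
        restored_from_tie=downstream if has_tie else [],
        still_out=[faulted] + ([] if has_tie else downstream),
    )


if __name__ == "__main__":
    print(plan_flisr(["A", "B", "C", "D"], faulted="B"))
```

A real system also has to check that the alternate source has enough capacity to carry the back-fed load before closing the tie, which this sketch ignores.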

In manufacturing, the industrial control systems (ICS) that operate and coordinate every part of a plant have enabled “dark manufacturing” for the last couple of decades, allowing plants to run with minimal personnel at maximized efficiency, reducing product delivery times and improving throughput.

Why have IT and OT converged?

The reason for the convergence boils down to a simple root cause: companies need access to the data in these traditionally isolated OT systems. Some examples of how modern electric utilities use this information are:

  • Providing live outage restoration information via an outage web map
  • Coordinating grid outage restoration with SCADA, metering system data, and customer outage calls
  • Using field Remote Terminal Unit (RTU) data to monitor the operational health of multimillion-dollar capital assets like transformers or generation systems
  • Enabling modeling of electric grid power flows for system expansion and design efforts
  • Building statistical models to improve power plant scheduling based on weather, fuel costs, operational efficiency of units, live power pooling and peaking rates, etc.

The processes and technology that enable these functions borrow heavily from, or run directly on, business IT hardware and software. Many IT operational concepts like virtualization, automation, virtual private networking, and cybersecurity can carry over into the OT environment to improve operations and reduce toil, bringing efficiencies not present in older air-gapped OT environments. On top of that, modern utilities are employing more and more data analysts, who use this data to build efficiencies that the utility didn’t have access to before. Think of things like scheduling Battery Energy Storage System (BESS) resources to dispatch power during peak demand and to charge when renewable generation is at its peak and power is cheap.
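
As a rough illustration of the BESS scheduling idea, the sketch below uses a naive price-ranking heuristic: charge in the cheapest hours and discharge in the most expensive ones. The function name, the hourly price list, and the hour counts are all hypothetical, and the heuristic deliberately ignores state of charge, round-trip efficiency, and power limits that a real dispatch optimizer would have to respect.

```python
def schedule_bess(prices: list, charge_hours: int, discharge_hours: int) -> list:
    """Return one action per hour: "charge", "discharge", or "idle".

    `prices` is a hypothetical list of hourly energy prices ($/MWh).
    The cheapest `charge_hours` hours are used for charging and the most
    expensive `discharge_hours` hours for discharging.
    """
    if charge_hours < 0 or discharge_hours < 0:
        raise ValueError("hour counts must be non-negative")
    if charge_hours + discharge_hours > len(prices):
        raise ValueError("not enough hours in the price window")

    # Hour indices sorted from cheapest to most expensive.
    ranked = sorted(range(len(prices)), key=lambda h: prices[h])
    charge = set(ranked[:charge_hours])
    discharge = set(ranked[len(prices) - discharge_hours:])

    plan = []
    for hour in range(len(prices)):
        if hour in charge:
            plan.append("charge")
        elif hour in discharge:
            plan.append("discharge")
        else:
            plan.append("idle")
    return plan


if __name__ == "__main__":
    hourly_prices = [30, 20, 25, 60, 90, 40]  # made-up example values
    for hour, action in enumerate(schedule_bess(hourly_prices, 2, 2)):
        print(hour, action)
```

Even this toy version depends on having price and generation data in the same place as the control decision, which is exactly the kind of IT and OT data sharing described above.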

What challenges does the convergence present?

The biggest challenges that modern utilities face with the convergence of IT and OT are the potential cybersecurity, regulatory, and operational impacts. 

Cybersecurity

We’ve all heard of the big industrial cyberattacks by now, like the Stuxnet Advanced Persistent Threat (APT) campaign or the Colonial Pipeline ransomware incident. While they were big news when they happened, most utilities assume it’s always going to happen to “someone else” or “someone bigger”. However, CISA’s recent public release of information surrounding the Volt Typhoon campaign should be front and center in every utility information security program across the country right now. This is a real, credible threat: foreign threat actors have compromised and lived in the networks of critical infrastructure for at least five years (possibly longer, since that is only as far back as the discovered activity reaches), at organizations ranging from companies with as few as five employees to multi-state distribution utilities.

These APTs are compromising IT systems and pivoting into OT systems, which are now linked to IT through firewalls or, even worse, direct routed connections. Once inside, they set up persistence and test utility response times to network events in preparation for larger-scale service interruptions. If you take nothing else away from this post today, take this away: you need to read between the lines of what is publicly released about Volt Typhoon to understand the information those working in critical infrastructure have access to, and how bad this problem really is.

Regulatory

Every part of critical infrastructure in the US has some sort of regulatory body imposing operational and security requirements on it. We’ll be focusing on NERC-CIP and CMMC 2.0 for the electric utility industry.

NERC-CIP is currently under major revision to modernize its security posture, but even as version 5 stands, there is significant additional administrative burden and overhead when your OT network can be reached from, or can reach, your IT network. Implementing tight system baselines and security settings, 24/7 operational data security inspection, and documenting the constant stream of updates and patching efforts is a burden that is foreign to traditional OT operations (if it was ever done at all!). The IT side of the house has dealt with this for years, implementing multiple layers of security tooling to serve their Defense-in-Depth strategy and keep threat actors at bay.

CMMC 2.0 is something else to look at. For utilities that have contracts with government entities, or that supply military bases with power, this is already becoming reality, and if the Department of Defense has its way, every critical infrastructure company will be compliant with the highest level of CMMC 2.0.

Operational

In IT, decisions are made based on preserving the Confidentiality, Integrity and Availability of all company data, commonly called the CIA Triad. In the OT field, business decisions are made based on the Safety, Reliability and Availability of the OT systems. As we all know, the IT and OT motivations are often in direct conflict. Patching against critical vulnerabilities is a huge business risk reduction that could mean the difference between an APT compromising your turbine control systems or not, but how do you balance that against the operational need to run that plant 24/7/365? Is it worth the cost of a shutdown, alternative power purchase, and startup process to mitigate a potential risk? Can systems be designed better to minimize the impact of meeting IT’s CIA goals, while still preserving OT objectives?
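
One simple way to frame the patch-versus-uptime question is as an expected-cost comparison. The sketch below is a toy model under stated assumptions: the function name, the probabilities, and the dollar figures are all hypothetical, and real decisions also involve safety and regulatory factors that do not reduce cleanly to a dollar amount.

```python
def patch_tradeoff(
    p_compromise_unpatched: float,
    p_compromise_patched: float,
    incident_cost: float,
    outage_cost: float,
) -> dict:
    """Compare the expected loss avoided by patching with the cost of the outage.

    Probabilities are hypothetical annual likelihoods of compromise.
    `incident_cost` is the estimated cost of a successful compromise, and
    `outage_cost` bundles shutdown, replacement power purchase, and startup.
    """
    # Expected loss avoided = reduction in probability x cost of an incident.
    risk_reduction = (p_compromise_unpatched - p_compromise_patched) * incident_cost
    return {
        "risk_reduction": risk_reduction,
        "outage_cost": outage_cost,
        "patch_now": risk_reduction > outage_cost,
    }


if __name__ == "__main__":
    # Made-up numbers purely for illustration.
    print(patch_tradeoff(0.05, 0.01, 20_000_000, 500_000))
```

The model is only as good as its inputs, and estimating those probabilities and costs is where IT and OT staff have to work together.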

Bringing IT’s security skill sets into the OT environment, to build processes that meet regulatory requirements while minimizing disruption to operations, is one of the core topics we’ll talk about in the next article.