The markets for computing with big data sets are growing rapidly. Data analysts, biomedical researchers, and a wide variety of other scientists are seeking to run simulations at large scale, on traditional high-performance computing centers as well as in the cloud, using real-world data sets that have been specially curated and packaged. As big data increasingly intersects with big hardware and specialized software, there is a growing need to secure data in use, both to meet regulatory and privacy demands and to preserve the organization’s competitive advantage.
Using traditional high-performance computing (HPC) clusters with sensitive data requires additional security. Traditional HPC systems execute large and complex compute tasks, such as sophisticated simulation and data analysis, using hundreds to thousands of individual computers (“compute nodes”) that work together. HPC clusters typically operate in “batch” mode: a user submits a request for computation time to the “batch system,” which then runs through a series of steps to execute the job. A key feature of this mode of execution is that the user need not be connected when their job is launched or executed on the cluster. A second key feature is that the job (usually) executes with all the permissions and access afforded to the user when they are connected. Finally, many users’ jobs can execute simultaneously, each on a separate set of compute nodes in the cluster.
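The batch-mode features above can be illustrated with a toy model (a sketch only; names like `BatchSystem` and `Job` are hypothetical and not part of any real scheduler). The model captures the two key properties: jobs run later without the submitting user being connected, and each job carries its submitter's identity, with multiple users' jobs running on disjoint node sets.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable

@dataclass
class Job:
    user: str               # the job executes with this user's permissions
    script: Callable        # the work to run
    nodes: int              # compute nodes requested

class BatchSystem:
    """Toy model of batch-mode execution: jobs are queued at submit time
    and launched later by a scheduler pass, without the user connected."""
    def __init__(self, total_nodes: int):
        self.free_nodes = total_nodes
        self.queue = deque()
        self.log = []       # (user, result) records

    def submit(self, job: Job):
        # The user may disconnect immediately after this call.
        self.queue.append(job)

    def tick(self):
        """One scheduler pass: launch queued jobs that fit on free nodes.
        Several users' jobs can run at once on separate node sets."""
        waiting = deque()
        while self.queue:
            job = self.queue.popleft()
            if job.nodes <= self.free_nodes:
                self.free_nodes -= job.nodes
                self.log.append((job.user, job.script()))
                self.free_nodes += job.nodes   # job finished, nodes freed
            else:
                waiting.append(job)
        self.queue = waiting

cluster = BatchSystem(total_nodes=100)
cluster.submit(Job("alice", lambda: "simulation done", nodes=60))
cluster.submit(Job("bob", lambda: "analysis done", nodes=30))
cluster.tick()   # both jobs run, attributed to their submitters
```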
Many data application domains require stringent access control, protection, logging, and auditing for storage and use of sensitive data. The most stringent controls require encryption of data at rest (stored on disk or tape), and in transit (while being transferred over a network). Additional controls may be required wherever data is decrypted or encrypted: wiping of memory, emptying of caches, and secure management of encryption keys.
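A common pattern behind such controls is envelope encryption: each stored object is encrypted under its own data key, and only a wrapped (encrypted) copy of that key, protected by a key manager's master key, is stored alongside the ciphertext. The sketch below illustrates the pattern; it is not LLNL's implementation, and the XOR-with-hash-keystream cipher is a stand-in used only so the example is self-contained (a real system would use a vetted cipher such as AES-GCM).

```python
import hashlib
import hmac
import secrets

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR with a SHA-256 counter keystream.
    Illustrative only -- real systems use vetted ciphers."""
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, out))

# Each object gets a fresh data key; the master key stays in the
# key-management service and never touches the storage system.
master_key = secrets.token_bytes(32)
data_key = secrets.token_bytes(32)
plaintext = b"sensitive records"

stored = {
    "ciphertext": keystream_xor(data_key, plaintext),
    # The data key is stored only in wrapped form.
    "wrapped_key": keystream_xor(master_key, data_key),
    # Integrity tag so tampering with the stored object is detectable.
    "tag": hmac.new(data_key, plaintext, hashlib.sha256).digest(),
}

# Authorized read: unwrap the data key, decrypt, verify integrity.
unwrapped = keystream_xor(master_key, stored["wrapped_key"])
recovered = keystream_xor(unwrapped, stored["ciphertext"])
assert recovered == plaintext
```

Revoking access then reduces to controlling who may unwrap the data key, without re-encrypting the bulk data.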
The traditional way of applying encryption tools to protect data results in only two protection states for a piece of protected data with respect to a specific user: either the data is encrypted and unusable by the user, or it is decrypted and completely usable by the user. This all-or-nothing approach has a number of issues, and they are particularly severe in a typical HPC cluster, which operates as a shared, batch-mode resource providing storage and access to many users simultaneously. Available approaches to using encryption in HPC settings require significant changes to the HPC operational and execution environment, and they only partially address these issues.
LLNL has developed a new method for securely processing protected data on HPC systems with minimal impact on the existing HPC operations and execution environment. It can be used without altering traditional HPC operations and can be managed locally. It is fully compatible with traditional (unencrypted) processing: other jobs, whether or not they use encryption, can run on the cluster simultaneously. The method has been prototyped and is under continued development at LLNL.
Livermore's method is scalable to very large data sets, protects against information leakage between managed information domains, and can be federated (work cooperatively across organization boundaries) with compatible systems. Additional advantages of LLNL's secure data processing method include:
- The requesting user identity, as claimed in a user certificate, is explicitly verified to ensure that the requesting process is executing as assigned by the verified user.
- The trusted components are explicitly identified, including how they are authenticated, what trusted information they have access to, and the specific version executing.
- The user software never has access to the actual decryption keys and does not need modification. It can perform arbitrary local processing on the unencrypted data, but all reading of input and writing of output happens through the LLNL method.
- All accesses to read or write protected data are logged and auditable. The log also provides authenticated provenance for all produced output. Provenance and chain-of-custody tracking are available for derived data objects on HPC clusters.
- Data owners are explicitly identified, explicitly set enforceable policy, control individual access, and can revoke or deny access at any time in the future.
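The mediated-access and provenance properties in the list above can be sketched as a trusted broker that sits between user code and protected storage. This is an illustrative model, not LLNL's design: the `SecureDataBroker` class and the XOR stand-in cipher are hypothetical. The essential points it demonstrates are that keys never leave the broker, and that every read and write lands in an audit log that records which inputs a derived object came from.

```python
import hashlib
import secrets
import time

def _xor(key: bytes, data: bytes) -> bytes:
    """Stand-in cipher; a real system would use a vetted cipher."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

class SecureDataBroker:
    """Trusted broker: user code receives plaintext for local processing
    but never the keys; every access is logged with provenance."""
    def __init__(self):
        self._keys = {}        # private to the broker, never exported
        self.audit_log = []    # append-only access and provenance record

    def store(self, user, obj_id, plaintext, derived_from=()):
        key = self._keys.setdefault(obj_id, secrets.token_bytes(32))
        self._append(user, "write", obj_id, plaintext, list(derived_from))
        return _xor(key, plaintext)            # ciphertext goes to disk

    def read(self, user, obj_id, ciphertext):
        plaintext = _xor(self._keys[obj_id], ciphertext)
        self._append(user, "read", obj_id, plaintext, None)
        return plaintext                       # key never leaves the broker

    def _append(self, user, action, obj_id, plaintext, parents):
        self.audit_log.append({
            "time": time.time(), "user": user, "action": action,
            "object": obj_id,
            "sha256": hashlib.sha256(plaintext).hexdigest(),
            "derived_from": parents,
        })

# A user job: arbitrary local processing, with all I/O mediated.
broker = SecureDataBroker()
ct = broker.store("alice", "input.dat", b"raw records")
data = broker.read("alice", "input.dat", ct)
result = data.upper()                          # arbitrary local processing
broker.store("alice", "output.dat", result, derived_from=["input.dat"])
```

The `derived_from` field in the final log entry is what makes chain-of-custody tracking possible for derived data objects.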
The HPCrypt data protection system can be used to protect and log the storage, transport, and processing of sensitive data on HPC clusters, including health-, financial-, and/or privacy-protected data (HIPAA, FISMA, etc.), proprietary information, critical infrastructure information, and sensitive data from any other domain. Because this methodology is a cybersecurity tool for securely working with big data on high-end computing systems, it is expected to be useful to partners and users in a wide variety of industries: information technology, communications, IoT, manufacturing, health care, banking/finance/insurance, government, and education, among others.
A different application arises when mutually distrustful parties seek to collaborate on specific tasks. LLNL's system enables collaborators to execute a software process of interest to both (such as training a new machine learning model) without revealing each collaborator’s sensitive input data. Once the collaborators codify a mutually acceptable policy requiring the input data to be secured, the system lets the two organizations jointly produce a useful output without revealing the protected inputs to each other.
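One way to picture this collaboration model is a trusted executor that holds each party's encrypted input, checks that the requested computation matches the codified policy (here, a hash of the agreed function's source), and releases only the joint output. This sketch is hypothetical, not the HPCrypt protocol; the `JointComputation` class and the XOR stand-in cipher exist only for illustration.

```python
import hashlib
import secrets

def _xor(key: bytes, data: bytes) -> bytes:
    """Stand-in cipher; a real system would use a vetted cipher."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

class JointComputation:
    """Trusted executor: holds each party's input encrypted, runs only
    the mutually agreed function, and releases only its output."""
    def __init__(self, agreed_fn_digest):
        self.agreed_fn_digest = agreed_fn_digest   # the codified policy
        self._keys = {}
        self._inputs = {}

    def submit(self, party, plaintext):
        key = secrets.token_bytes(32)
        self._keys[party] = key
        self._inputs[party] = _xor(key, plaintext)  # encrypted at rest
        return hashlib.sha256(plaintext).hexdigest()  # receipt for the party

    def run(self, fn, fn_source):
        # Policy check: only the mutually accepted code may touch the data.
        if hashlib.sha256(fn_source.encode()).hexdigest() != self.agreed_fn_digest:
            raise PermissionError("function not covered by the agreed policy")
        decrypted = {p: _xor(self._keys[p], ct)
                     for p, ct in self._inputs.items()}
        return fn(decrypted)     # only the joint output is released

# Both organizations agree on the computation ahead of time.
def joint_size(inputs): return sum(len(v) for v in inputs.values())
FN_SOURCE = "def joint_size(inputs): return sum(len(v) for v in inputs.values())"

comp = JointComputation(hashlib.sha256(FN_SOURCE.encode()).hexdigest())
comp.submit("hospital_a", b"cohort-A records")
comp.submit("hospital_b", b"cohort-B records")
total = comp.run(joint_size, FN_SOURCE)   # joint statistic; inputs stay private
```

Any computation not matching the agreed digest is refused, so neither party can unilaterally extract the other's input.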
LLNL has filed for patent protection and holds copyright on the prototype code.