Supercomputing: A Critical Report on NASA's High-End Computing (HEC) Capabilities
NASA's journey into space is fueled not just by rockets and spacecraft, but also by the enormous processing power of high-end computing (HEC), often known as supercomputing. The NASA Office of Inspector General recently issued an audit report (IG-24-009, March 14, 2024) that examined the management of NASA's HEC capabilities. The report emphasizes the importance of supercomputing in processing large amounts of data and performing complex calculations, which are required for everything from planning Mars missions to understanding the Earth's climate.
NASA has been at the forefront of HEC innovation since its creation, and its two primary facilities, the NASA Center for Climate Simulation and the NASA Advanced Supercomputing Facility, play critical roles in the agency's research and mission preparation. However, the audit indicates that the way NASA's HEC resources are managed may be impeding the agency's objectives. Currently, HEC is managed within the Earth Science Research Program of the Science Mission Directorate (SMD), which may not be the most effective organizational home for an agency-wide resource. This structure, combined with the lack of a thorough management strategy and commitment agreement, makes it difficult to deploy resources efficiently and respond rapidly to technological advances.
One of the most significant challenges raised in the report is the oversubscription of NASA's HEC resources. The demand for computing time far outstrips the available capacity, causing schedule delays and compelling NASA teams to buy their own HEC resources to meet deadlines. The Space Launch System team, for example, spends approximately $250,000 per year to operate its own HEC clusters. This is not an isolated case; other NASA Centers have also resorted to purchasing separate HEC assets to meet specific needs. This fragmented approach to managing HEC resources not only creates inefficiencies, but also raises questions about NASA's overall cybersecurity posture.
The decentralized management of HEC assets has also weakened cybersecurity measures, exposing NASA to new threats. The report identifies instances where OCIO-mandated cybersecurity controls are bypassed or ignored by Mission Directorates, which regard them as unduly stringent. Furthermore, NASA's HEC assets are widely used by external parties and foreign nationals without sufficient monitoring or review of user activity, which exacerbates the cybersecurity challenges. NASA's groundbreaking science and technology projects could be jeopardized unless the agency makes a concerted effort to improve cybersecurity.
To address these issues, the report recommends that NASA's Associate Administrator select executive leadership to restructure the scope, ownership, and organization of HEC within the agency. It also calls for forming a tiger team to work on HEC challenges such as defining stakeholder requirements, identifying technological gaps, and improving HEC asset allocation. The report further underlines the importance of a more integrated HEC strategy that balances on-premises and cloud computing alternatives, as well as a dedicated management approach to cybersecurity.
NASA management agreed with the recommendation to select executive leadership and partially agreed with the other recommendations, demonstrating a commitment to addressing the highlighted concerns. However, the success of these efforts will depend on NASA's ability to foster stakeholder collaboration, allocate resources properly, and implement comprehensive cybersecurity measures.
Ultimately, the report emphasizes the importance of high-end computing in allowing NASA to explore the universe and expand scientific understanding. By refocusing its HEC efforts on improved management, strategic resource allocation, and stronger cybersecurity standards, NASA can ensure that its supercomputing capabilities continue to support its ambitious mission goals. As NASA navigates the complexities of the cosmos, supercomputing will remain an invaluable tool for deciphering our universe's mysteries.