
X. Carol Song

Xiaohui Carol Song's Profile Photo

Senior Research Scientist

Carol Song leads the Scientific Solutions Group in RCAC in developing innovative advanced computing and data cyberinfrastructure solutions to support scientific discovery and learning. Carol has more than 20 years of experience in high-performance computing, distributed computing, and software engineering, connecting domain scientists to advanced cyberinfrastructure technologies. Since 2005, she has been PI or Co-PI on more than 30 sponsored research projects, representing more than $60 million in research funding. Carol is the PI of Anvil, a new large-capacity national computational system funded by NSF in 2020, continuing her leadership in HPC, which has included the TeraGrid and XSEDE 1 & 2 projects since 2006. She has led and collaborated on many data infrastructure projects, including the NSF-funded Data Interoperability, CI-TEAM, SDCI, SI2, DIBBs, and Cybertraining projects and, most recently, a $4.5M NSF CSSI grant to develop GeoEDF, a reusable, plug-n-play data framework that makes large geospatial scientific datasets readily usable by domain scientists. She is the leader of the MyGeoHub.org geospatial science gateway, which supports data-driven collaborative research and learning.

Carol has broad engagement with the science and cyberinfrastructure communities nationwide and internationally. She currently serves as a member of the Strategic Planning Advisory Committee for the Coalition for Academic Scientific Computation (CASC); an External Advisory Committee member of TAMU’s NSF-funded ACES system; a technical program committee member (System Software) of Supercomputing 2021; and co-chair of the CODATA (International Council for Science: Committee on Data for Science and Technology) “FAIR Data for Disaster Risk Research (FAIR-DRR)” Task Group, which focuses on data issues in disaster risk research. Her long history of community service includes serving as the inaugural chair of the XD Service Provider Forum (SPF), 2011–2013; member of the XSEDE Advisory Board, 2012–2015; Program Committee Chair of the 2011 Symposium on Data-Driven Approaches to Drought; chair of workshops and external programs for PEARC 2018; editorial board member of the joint special issue “Deep Learning in Remote Sensing: Sample Datasets, Algorithms and Applications” of Global Change Data Repository (WDS, GCDD and Remote Sensing), 2019–2020; and program committee member and reviewer for many conferences such as TeraGrid and XSEDE, PEARC, Grid Computing Environments workshops, International Conference on Parallel Processing (ICPP), and High-Performance Geo-Computation at Geoinformatics (2009).

Carol has funded and mentored more than 60 graduate, undergraduate, and high school students in research projects. She is a mentor, advisor, and advocate in programs such as Women-In-HPC, PEARC, and Supercomputing, helping to promote and guide the professional growth of early-career professionals and to broaden the participation of underrepresented groups.

Education

  • B.S., Computer Science and Engineering, Tsinghua University, China
  • Ph.D., Computer Science, University of Illinois at Urbana-Champaign

Grants and Awards

  • NSF ACSS: Category I: Anvil - A National Composable Advanced Computational Resource for the Future of Science and Engineering (PI, $22.5M, 10/2020 – 9/2026)
  • NSF HDR Institute: Geospatial Understanding through an Integrative Discovery Environment (I-GUIDE) (Co-PI, $15M, 10/2021 – 9/2026)
  • NSF AccelNet: GLASSNET: Networking Global to Local Analyses to Inform Sustainable Investments in Land and Water Resources (Co-PI, $2M, 1/2021 – 12/2025)
  • NSF CCRI:ENS: Collaborative Research: Open Computer System Usage Repository and Analytics Engine (Co-PI, $1,183,897, 10/2020 – 9/2023)
  • NSF CSSI: Extensible Geospatial Data Framework towards FAIR (Findable, Accessible, Interoperable, Reusable) Science (PI, $4,571,811, 10/2018 – 9/2023)
  • NSF CC* Networking Infrastructure: Integrating Big Data Instrumentation into Campus Cyberinfrastructure (PI, $323,327, 7/2018 – 6/2020)
  • NSF Cyber Training CIU: Cross-disciplinary Training for Findable, Accessible, Interoperable, and Reusable (FAIR) science (Co-PI, $498,148, 9/2018 – 8/2021)
  • NIH: A Generalized Framework and Flexible Software Environment for Power Analysis (Co-PI, $3.7M, 2018-2023)
  • NSF CBET: Process Engineering Models to Physical Input-Output Tables (PIOTs): A Novel Approach to Reproducible, Transparent and Fast Regional PIOT Development Via Collaborative PIOTHub (Co-PI, $292,378, 9/2018 – 8/2021)
  • Data Infrastructure Building Blocks (DIBBs Implementation): Integrating Geospatial Capabilities into HUBzero (NSF, PI, $5.2M, 10/2013 - 9/2018)
  • XSEDE 2.0: Integrating, Enabling and Enhancing National Cyberinfrastructure with Expanding Community Involvement (NSF, PI of Purdue subaward, $1.8M, 2016-2021)
  • CI-NEW: Computer System Failure Data Repository to Enable Data-Driven Dependability (NSF, Co-PI, $763K, 2015 - 2018)
  • Mellon Grand Challenge: From Global to Local to Global: Attaining Long Run Sustainability in US Agriculture (Co-PI, $135K, 2017-2018)
  • Discovery Park Big Idea: From Global to Local Analysis of Systems Sustainability (Co-PI, $300K, 2017-2019)
  • Seed for Success (Excellence in Research) award, Purdue University, 2009-2014, 2017, 2019, 2020
  • 20th Annual Governor’s Award for Environmental Excellence (U2U team), January 2018
  • Purdue College of Agriculture 2015 TEAM award
  • USDA/NIFA 2015 Partnership Award

Selected Publications

  • C. Song, P. Smith, R. Kalyanam, X. Zhu, E. Adams, K. Colby, P. Finnegan, E. Gough, E. Hillery, R. Irvine, A. Maji and J. St. John. Anvil - System Architecture and Experiences from Deployment and Early User Operations. Practice & Experience in Advanced Research Computing Conference, July 10-14, 2022, Boston, MA. DOI: 10.1145/3491418.3530766
  • J. Woo, L. Zhao, C. Song, I. Haqiqi, D. Grogan and R. Lammers. A Collaborative Container-based Model Coupling Framework. Practice & Experience in Advanced Research Computing Conference, July 10-14, 2022, Boston, MA. DOI: 10.1145/3491418.3530298
  • R. Kalyanam, L. Zhao, C. X. Song, V. Merwade, J. Jin, U. Baldos, J. Smith. GeoEDF: An Extensible Geospatial Data Framework for FAIR Science. In Practice and Experience in Advanced Research Computing (PEARC ’20), July 27–31, 2020. ACM, New York, NY, USA. DOI: 10.1145/3311790.3396631
  • J. Shin, L. Zhao, C.X. Song, R. Kalyanam, and J. Jin. SACI - A Cloud Based Real-Time Sensor Data Management and Analysis Platform. Gateways 2020 conference, October 19-21, 2020.
  • L. Zhao, C. X. Song, L. Biehl, V. Merwade, M. Huber, J. Liu, U. Baldos, and I. Shunko. 2020. Building a Gateway Infrastructure for Interactive Cyber Training and Workforce Development. In Practice and Experience in Advanced Research Computing (PEARC ’20), July 27–31, 2020. ACM. DOI: 10.1145/3311790.3396639
  • Kalyanam, R., Zhao, L., Song, C.X., Biehl, L., Kearney, D., Kim, I.L., Shin, J., Villoria, N. and Merwade, V. 2018, October. MyGeoHub - A Sustainable and Evolving Geospatial Science Gateway. Future Generation Computing Systems (FGCS), special issue on science gateways. DOI: 10.1016/j.future.2018.02.005
  • Biehl, L.L., Zhao, L., Song, C.X. and Panza, C.G. Cyberinfrastructure for the Collaborative Development of U2U Decision Support Tools. Journal of Climate Risk Management, 2016. DOI: 10.1016/j.crm.2016.10.003
  • Villoria, N.B., Elliott, J., Müller, C., Shin, J., Zhao, L. and Song, C.X. Web-based download, visualization, and aggregation of future temperature and precipitation during the growing season of major crops. SoftwareX, November 2017. DOI: 10.1016/j.softx.2017.11.004
  • Villoria, NB, Elliott, J, Müller, C, Shin, J, Zhao, L, and Song, C., 2016, January. Rapid aggregation of global gridded crop model outputs to facilitate cross-disciplinary analysis of climate change impacts in agriculture. Environmental Modelling & Software. 75, pp.193-201. DOI: 10.1016/j.envsoft.2015.10.016
  • Rajib, M.A., Merwade, V., Kim, I.L., Zhao, L., Song, C. and Zhe, S., 2016, January. SWATShare–A web platform for collaborative research and education through online sharing, simulation and visualization of SWAT models. Environmental Modelling & Software, 75, pp. 498-512. DOI: 10.1016/j.envsoft.2015.10.032
  • McLennan, M., S. Clark, E. Deelman, M. Rynge, K. Vahi, F. McKenna, D. Kearney, and C.X. Song. “HUBzero and Pegasus: Integrating Scientific Workflows into Science Gateways”, Concurrency and Computation: Practice and Experience, March 2014. DOI: 10.1002/cpe.3257.
  • Kalyanam, R., L. Zhao, C.X. Song, Y. L. Wong, J. Lee and N. Villoria. “iData: A Community Geospatial Data Sharing Environment to Support Data-driven Science”, in Proceedings of Conference on Extreme Science and Engineering Discovery Environment, San Diego, CA, July 2013.
  • Katz, D.S., S. Callaghan, R. Harkness, S. Jha, K. Kurowski, S. Manos, S. Pamidighantam, M. Pierce, B. Plale, C.X. Song, and J. Towns. “Science on the TeraGrid,” Computational Methods in Science and Technology (CMST), Special Issue 2010, 81-97.
  • Kristof, P., B. Benes, C.X. Song, and L. Zhao. A system for large-scale visualization of streaming Doppler data. In Proceedings of 2013 IEEE International Conference on Big Data, pages 33-40, Silicon Valley, CA, October 6-9, 2013.
  • Zhao, L., C.X. Song, C. Thompson, H. Zhang, M. Lakshminarayanan, C. DeLuca, S. Murphy, K. Saint, D. Middleton, N. Wilhelmi, E. Nienhouse and M. Burek. "Developing an integrated end-to-end TeraGrid climate modeling environment", TeraGrid Conference, Salt Lake City, UT, July 2012.
  • Bina, M., P. Wyss, S.A. Lazarus, S.R. Shah, W. Ren, W. Szpankowski, G.E. Crawford, S.P. Park, X.C. Song. “Discovering sequences with potential regulatory characteristics,” Genomics 93 (2009) 314–322.
  • Smith, P., T. J. Hacker and C.X. Song. “Implementing an industrial-strength academic cyberinfrastructure at Purdue University,” IEEE International Symposium on Parallel and Distributed Processing (IPDPS), 2008.
  • Squicciarini, A., W. Lee, E. Bertino and C.X. Song. “End-to-End Accountability for Grid Computing Systems”, 2008 IEEE Asia-Pacific Services Computing Conference (IEEE APSCC 2008), Yilan, Taiwan, December 9-12, 2008.

The Latest News

Additional information and news on Carol's group and projects can be found at RCAC Scientific Solutions Group