When entering the University’s datacenter, it’s natural to wonder about the seemingly infinite sources of information being computed there. Could these systems be performing cutting-edge thermodynamic modeling? Or are they perhaps helping to unravel some of the mysteries of quantum chemistry? Would your curiosity change if the source of this data were closer to home? In fact, it often is: the US Census Bureau or the Social Security Administration.
This last illustration reflects the research regularly conducted at the Maxwell School of Citizenship and Public Affairs. For those not familiar with the Maxwell School, it is Syracuse University’s home for interdisciplinary teaching and research in the social sciences, public policy, public administration, and international relations. Throughout Maxwell and Eggers Halls, faculty and students engage in a wide variety of empirical research on issues affecting us every day. These issues relate to public policy and sound governance at international, national, state, and local levels.
Diving deep into this research reveals just how computationally intensive it can be. Based primarily in the social sciences, this research can include topics such as large-scale general equilibrium modeling of environmental policy, nonparametric estimation and cross-validation applications to economic policy analysis, the role of gene-environment interactions in social behavior, and spatial analysis of geographic, political, and economic data. Suffice it to say, a discussion of any one of those research topics and its use of technology would certainly be swimming in the deep end of the pool!
The Early Years of Research Computing in Maxwell
Maxwell has a long history of technical innovation and a commitment to maintaining an infrastructure capable of supporting research. While not the genesis of the school’s research computing, a 2003 Syracuse University News article about an NSF-funded supercomputer cluster sets the stage:
“…, in 2001, another group of analysts, including Maxwell School of Citizenship and Public Affairs economics professor Jeffrey Racine, used computationally intensive “nonparametric” statistical methods they had recently developed to reassess the efficacy of right heart catheterization, using data obtained from UVA.”
This “Beowulf” computing cluster figured prominently in the research performed in Maxwell’s Center for Policy Research. In its day, this system’s supercomputing power was considered “massive,” powered by 37 dual-node hyperthreaded 3.06 GHz Xeon processors. Fast-forward fourteen years, and while processor speeds are far greater, the methodologies of parallel computing and the creative use of off-the-shelf hardware continue in Syracuse University’s current infrastructure – namely OrangeGrid and SUrge. In fact, the standard desktop PCs used in Maxwell currently contribute over 2000 cores towards OrangeGrid. These resources benefit research efforts across the SU campus, particularly in the physical sciences and engineering, where researchers need reliable, high-throughput computing (HTC).
Computing Power to the Masses
Research computing in Maxwell has seen a tangible shift in recent years. While it was once primarily the realm of professors with the resources to support the required hardware, students in the Maxwell School are now routinely performing sophisticated research using University research computing resources.
Two equally important factors have led to this point. One is the availability of virtual computers: computers that are not physically near the researcher but are accessible across a network through a remote desktop connection. The other factor is proactive communication to the Maxwell community about these resources by Maxwell Information and Computing Technology (ICT). This communication comes through several channels, including face-to-face orientations, regular newsletters, word of mouth, and of course the ICT website.
Stanley Ziemba, Director of IT for Maxwell, further describes computing resources available to faculty and staff researchers alike:
“In addition to department labs, Maxwell researchers have remote access to virtual computers running in the AVHE and high-performance desktops hosted in ICT’s computer room. These virtual and physical high-performance computers offer more computing power and access to higher performance statistical software than a normal lab PC might have. Also, they both provide the convenience of being able to access the resource from virtually anywhere in the world. For instances where access to data must be physically limited, Maxwell features three secure rooms for sensitive data that require isolation from the internet.”
ICT support continues with regular consultation between ICT staff and the researcher. This group typically includes Computer Consultants Michael Fiorentino and Candi Patterson and System Administrator Edward Godwin. Ziemba describes a common support scenario:
“A subset of the Information and Computing Technology Group at Maxwell supports the research for the school. We meet with the faculty or student and discuss their project. We work with them to provide the right computing resources and software to help them make their project a success. We also provide other services and support such as grant writing assistance, backup and encryption support and data custodian support.”
“We discuss space allocation for both the dataset and the output when we meet with the researcher. Data sets range from megabytes to terabytes. Typically, the data come in CSV files, sometimes encrypted depending on the nature of the data. We also discuss the computational requirements they have for the project. For example, if the researcher is running thousands of quantile regressions at a time, and needs the results to process the nested set of regressions, we will provision more CPU cycles as well as a lot of memory to optimize the completion of each run.”
Not all projects have the same resource requirements, and it’s worth noting that Maxwell ICT staff are neither subject matter experts nor experts with the statistical software commonly used by the researchers. This should not be surprising given the wide variety of data historically used. In recent years these data sources have included:
- SSA (Social Security Administration)
- NDI (National Death Index)
- NCHAP (North Carolina Healthcare Administrative Professionals)
- QDR (Qualitative Data Repository)
- LSOG (Longitudinal Study of Generations)
- NIH (National Institutes of Health)
- NHATS (National Health and Aging Trends Study)
- HRS (Health and Retirement Study)
- NELS (National Education Longitudinal Study)
One Student’s Experience
Xiuming (Audrey) Dong is a doctoral student in Maxwell’s Economics department and has been actively using Syracuse University’s Academic Virtual Hosting Environment (AVHE) since September 2016. There she runs STATA MP remotely for eight to twelve hours per day, cleaning and analyzing US Census data from the American Community Survey (ACS). This computing and analysis, which is part of her required second-year Ph.D. field research, will culminate in a paper titled “The Labor Market Effects of Temporary Legalization: Evidence from the Deferred Action for Childhood Arrivals.”
Dong was well versed in STATA, having already learned it during her undergraduate studies at the University of Texas at Austin. However, her personal computer was quickly becoming a bottleneck, hindering her research. She found herself leaving her PC on, running simple analyses for eight hours or more, only to have the machine become unresponsive and lock up – an all-too-common experience for her. Fortunately for Dong, the frustration of working locally lasted only a few weeks before another student shared with her the advantages of working in the AVHE virtual environment.
After an introductory meeting with the Maxwell School’s Information and Computing Technology group, in which Director of IT Stanley Ziemba and a subset of his staff discussed computing needs with Dong, her virtual computer hosted in the AVHE was provisioned and made available to her. While some minor tuning was needed to match the virtual system’s resources to the size of her data sets, these adjustments were performed quickly and easily.
Ziemba explains: “Both the virtual and physical high-performance computers that the faculty and students access remotely offer more computing power and access to higher performance statistical software than a normal lab might have. They also provide the convenience of being able to access the resource from virtually anywhere in the world.”
As a Ph.D. student, Dong recognizes the need for the quantitative analysis that she’s performing. She is also appreciative of the system and support received from the University (and Maxwell in particular) as she transitions from qualitative to quantitative analysis. Dong explains: “People can be convinced to buy your argument by your data. These (data analysis) skills are becoming commonplace in the public realm.” To her, data analysis skills are “necessary and efficient tools,” and all her peers “are using some sort of technology in their analysis.” Behind the scenes, Maxwell ICT is there providing the support she needs. According to Ziemba, the typical support track follows this path: “We meet with the faculty or student and discuss their project. We work with them to provide the right computing resources and software to help them make their project a success. We also provide other services and support such as grant writing assistance, backup and encryption support and data custodian support.”
Reflecting on the school’s use of these resources, Dong hopes her story may help overcome any hesitancy among faculty to contact Maxwell ICT and learn how and where virtual computing can be incorporated into research. She continues: “Faculty should not be worried that moving this direction will take a long time.” In her experience, “The most impressive thing (of the virtual computing experience) was how fast the ICT group responds and how fast they solved my problems – every time!”
To learn more about support for research computing in the Maxwell School, please email firstname.lastname@example.org. And for research computing questions and support for anyone in the greater Syracuse University community, please email email@example.com.