Mercè Crosas is the Chief Data Science and Technology Officer at the Institute for Quantitative Social Science (IQSS) at Harvard University. She has more than 10 years of experience leading the Dataverse project and more than 15 years of experience building data management and analysis systems in industry and academia. She is part of numerous committees and working groups focused on research data management, data citation, and data standards, including serving on BARI’s Advisory Council along with IQSS Executive Director Cris Rothfuss.
Crosas is currently co-PI of the Dataverse Project, with IQSS director Gary King, and supervises the Zelig project for statistical analysis, the Consilience project for text analysis, and the Data Science Services and Data Curation teams at IQSS. She collaborates with a wide range of data-related projects, including the Harvard Privacy Tools project led by Salil Vadhan (http://privacytools.seas.harvard.edu/), the Data Provenance project with Margo Seltzer (http://projects.iq.harvard.edu/dataprovenance), the Structural Biology Grid Data project with Piotrek Sliz (https://data.sbgrid.org/), the Massachusetts Open Cloud (Orran Krieger and Piyanai Saowarattitada at Boston University), and several BARI projects.
Crosas holds a Ph.D. in Astrophysics and a B.S. in Physics. More at http://scholar.harvard.edu/mercecrosas and @mercecrosas
Tell us about the Data Rescue initiative.
February and March have been busy months for the Data Rescue initiative. Boston alone hosted three Data Rescue events at Harvard, MIT, and Northeastern, organized by the Environment Data & Governance Initiative (EDGI, https://envirodatagov.org/) and groups from local universities. The focus of these events, called archive-a-thons, was to identify government websites with public data that might be at risk, and then to access, curate, and eventually archive the data for long-term preservation.
How did you get involved with these events?
The Harvard Library and the Dataverse team at IQSS helped organize the first Boston event at Harvard, together with Harvard graduate students and EDGI members Toly Rinberg, Maya Anjur-Dietrich, Andrew Bergman, and entrepreneur Brendan O’Brien.
How did it go?
The event at Harvard was attended by 80 enthusiastic and motivated students and staff ready to contribute to this effort. Over the course of the day, participants seeded thousands of URLs, and researched and harvested dozens of data sets. A copy of the data collected will be published and archived at a Dataverse repository, and made publicly available to researchers.
There seems to be quite a bit of energy around data saving and sharing efforts. Is this a new trend?
These data rescue efforts have been a success, but they are not unique. For many years, librarians and archivists have been backing up public government data to ensure their preservation and accessibility to researchers. Likewise, making public data easily accessible and reusable is one of the missions of the Dataverse project.