LSST Data Management
I was the Subsystem Scientist responsible for the Data Management System for the Large Synoptic Survey Telescope from January 2012 through September 2017. At present, I serve as the Coordinator of the LSST’s DM System Science Team (SST). The SST’s task is to ensure that LSST’s DM system delivers excellent science-quality data products and services (we’re the Product Owners of the LSST’s Data Management system).
LSST’s team of 80+ researchers and IT experts, distributed among LSST in Tucson, Princeton University, University of Washington, SLAC, NCSA and IPAC, is developing the data processing system for LSST. The work includes the science pipelines (the image processing software to detect and characterize sources on images, detect moving objects, build co-added images, difference images, etc.), the middleware layer that manages robust execution on large computing clusters, the distributed database (called qserv) that stores and serves the resulting catalogs, and the user interfaces that enable access to and analysis of the data.
Given the size and expected quality of the dataset, we’re breaking new ground in virtually all of these areas:
- Image processing: We’re developing a new (general-purpose) O/IR image processing system and researching algorithms capable of controlling measurement systematics to the level required for characterization of Dark Energy. We’re pushing the state of the art in the identification and linking of Solar System objects, to make LSST the most efficient asteroid-finding mission in history.
- Middleware: Our science pipelines must scale from running on a single laptop with a hard drive to a 100,000+ core cluster with a mass storage system (tape library), and must remain maintainable through 8+ years of construction and 10+ years of operations. The transient alert system must reliably difference images and detect transients within 60 seconds.
- Infrastructure: 6.4 GB images will be transmitted within seconds over 10+ gigabit networks from Chile to NCSA in Illinois. LSST is one of the drivers of multi-gigabit long-haul network infrastructure build-up between Chile and North America.
- Database: Our database must store 10+ PB of LSST catalogs, and enable complex analyses and data mining. We are developing a fault-tolerant distributed database product, based on existing well-developed open-source tools, to meet this need.
- User interfaces: We are developing UIs designed to enable querying and visualization of the LSST data set, as well as enable machine-driven access to it.
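To give a rough feel for the "within seconds" figure in the Infrastructure item, here is a minimal back-of-the-envelope sketch. It assumes decimal units (1 GB = 10^9 bytes), a fully utilized link, and no protocol overhead or latency; the function name `transfer_time_seconds` is purely illustrative and not part of any LSST software.

```python
def transfer_time_seconds(image_size_gb: float, link_rate_gbps: float) -> float:
    """Idealized time to send one image over a network link.

    Assumes decimal gigabytes and gigabits, full link utilization,
    and no protocol overhead (all simplifying assumptions).
    """
    if link_rate_gbps <= 0:
        raise ValueError("link_rate_gbps must be positive")
    image_bits = image_size_gb * 1e9 * 8          # bytes -> bits
    link_bits_per_second = link_rate_gbps * 1e9   # Gbit/s -> bit/s
    return image_bits / link_bits_per_second


if __name__ == "__main__":
    # One 6.4 GB image over a 10 gigabit link, the figures quoted above.
    print(f"{transfer_time_seconds(6.4, 10.0):.2f} s")
```

Under these idealized assumptions a single image takes roughly five seconds on a 10 gigabit link; real transfers would be slower once overhead and shared traffic are accounted for.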
Group Members at UW
- Mario Juric
- Melissa Graham
- Colin Slater
- Chris Suberlak
- Grant: NSF/AST-1227061