Article: Savransky, D; Shapiro, J; Bailey, V; De Rosa, R; Wang, J; Ruffio, JB; Nielsen, E; Tallis, M; Perrin, M; “Mining the GPIES database”, Proceedings of SPIE, Adaptive Optics Systems VI, 10703
Abstract: The Gemini Planet Imager Exoplanet Survey (GPIES) is a direct imaging campaign designed to search for new, young, self-luminous, giant exoplanets. To date, GPIES has observed nearly 500 targets and generated over 30,000 individual exposures using its integral field spectrograph (IFS) instrument. The GPIES team has developed a campaign data system that includes a database incorporating all of the metadata collected along with all individual raw data products, including environmental conditions and instrument performance metrics. In addition to the raw data, the same database also indexes metadata associated with multiple levels of reduced data products, including contrast measures for individual images and combined image sequences, which serve as the primary metric of performance for the final science products. Finally, the database is used to track telemetry products from the GPI adaptive optics (AO) subsystem and to associate these with the corresponding IFS data.
Here, we discuss several data exploration and visualization projects enabled by the GPIES database. Of particular interest are any correlations between instrument performance (final contrast) and environmental or operating conditions. We show single- and multiple-parameter fits of single-image and observing-sequence contrast as functions of various seeing measures, and discuss automated outlier rejection and other fitting concerns. We also explore unsupervised learning techniques, self-organizing maps in particular, to produce low-dimensional mappings of the full metadata space and to provide new insights into how instrument performance may correlate with various factors. Supervised learning techniques are then employed to partition the space of raw (single image) to final (full sequence) contrast, so as to better predict the value of the final data set from the first few completed observations. Finally, we discuss the particular features of the database design that aid in performing these analyses, and suggest potential future upgrades and refinements.
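The abstract does not say how the contrast fits or the outlier rejection were implemented. As a rough illustration of the general idea only, the sketch below fits a straight line to log-contrast as a function of a seeing measure and iteratively rejects points whose residuals exceed a robust (median absolute deviation) threshold. Everything here is an assumption made for illustration: the function name `sigma_clipped_linear_fit`, the linear model, the 3-sigma threshold, and the synthetic data, which stand in for real GPIES measurements and carry no scientific meaning.

```python
import numpy as np


def sigma_clipped_linear_fit(x, y, n_sigma=3.0, max_iter=10):
    """Fit y = slope * x + intercept with iterative outlier rejection.

    Hypothetical helper, not taken from the GPIES code base. Points whose
    residual lies more than n_sigma robust standard deviations from the
    median residual are excluded from the next fit.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = np.ones(x.shape, dtype=bool)
    for _ in range(max_iter):
        # Least-squares line through the points currently kept.
        slope, intercept = np.polyfit(x[keep], y[keep], 1)
        resid = y - (slope * x + intercept)
        # Robust scatter estimate from the kept residuals (MAD scaled to sigma).
        center = np.median(resid[keep])
        sigma = 1.4826 * np.median(np.abs(resid[keep] - center))
        new_keep = np.abs(resid - center) <= n_sigma * sigma
        if np.array_equal(new_keep, keep):
            break  # the set of accepted points no longer changes
        keep = new_keep
    return slope, intercept, keep


# Synthetic stand-in data (assumed values, not GPIES measurements).
rng = np.random.default_rng(0)
seeing = rng.uniform(0.4, 1.6, 200)  # seeing measure, arbitrary units
log_contrast = -5.0 + 0.8 * seeing + rng.normal(0.0, 0.05, 200)
log_contrast[:10] += 2.0  # inject ten gross outliers

slope, intercept, keep = sigma_clipped_linear_fit(seeing, log_contrast)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, kept={keep.sum()} of {keep.size}")
```

The boolean mask `keep` identifies which exposures survived the rejection, so the rejected points can be inspected separately rather than silently discarded. A multiple-parameter fit would replace `np.polyfit` with a least-squares solve over several predictor columns while keeping the same clipping loop.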