Computer Science Seminar
Marrying Domain Knowledge and Computational Methods
When: Monday, October 27, 2014
Where: PGH 232
Time: 11:00 AM
Speaker: Dr. Ashish Mahabal, California Institute of Technology (Caltech)
Host: Prof. Ricardo Vilalta
Astronomy datasets have long been large and are getting larger by the day (TB to PB). This necessitates the use of advanced statistics and machine learning for many purposes. However, the datasets are often so large that even small contamination rates imply a large number of wrong results. This makes blind application of methodologies unattractive. Astronomical transients are one area where rapid follow-up observations are required based on very little data.
We show how the use of domain knowledge in the right measure at the right juncture can improve classification performance. We will bring up various computational methods, some established and some less so, that are being used for detecting outliers and choosing the optimal ones for the best science returns. With an eye on the PB-sized datasets coming up soon, we use time-series data (along with auxiliary data) from existing sky surveys like the Catalina Real-Time Transient Survey, which has covered 80% of the sky several tens to a few hundreds of times over the last decade.
We will also bring up an unconnected problem with some parallels: our collaboration with JPL on the search for cancer biomarkers in the Early Detection Research Network (EDRN).
Bio:
Ashish Mahabal is a Senior Research Scientist in Astronomy at Caltech. He is interested in astronomical transients, has worked on several sky surveys, and is the co-chair of the LSST Transients and Variable Stars group. He works on Big Data, data fusion, machine learning, and real-time classification of anomalies. He received his PhD in 1998 from the Inter-University Centre for Astronomy and Astrophysics in Pune, India.