Making Machine Learning Robust to Missing and Dependent Observations

Abstract: Statistical estimation typically assumes access to uncensored and independent observations. In practice, data is commonly censored due to measurement errors, legal restrictions, or data collection and sharing practices. Moreover, observations are often collected on a network, a spatial, or a temporal domain, and may be intricately correlated. We present recent work on statistical estimation from censored and dependent data. We first present a framework for statistical learning under truncated samples. Truncation is a severe form of censoring: samples falling outside some set S are not observed, and even their number relative to the observed samples is unknown. We then provide a statistical learning framework for samples that are weakly dependent.
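To make the truncation setting concrete, the following sketch (my illustration, not code from the work presented) simulates truncated samples from a standard Gaussian with survival set S = [0.5, ∞) and shows that the naive sample mean computed from the observed data alone is biased, since the rejected samples and their count are never seen:

```python
import random
import statistics

random.seed(0)

def truncated_samples(n, lower=0.5):
    # Draw from N(0, 1) but keep only samples falling inside
    # S = [lower, inf). The rejected draws and their count are
    # discarded, mimicking truncation: the learner never knows
    # what fraction of the data fell outside S.
    out = []
    while len(out) < n:
        x = random.gauss(0.0, 1.0)
        if x >= lower:
            out.append(x)
    return out

obs = truncated_samples(5000)
naive_mean = statistics.mean(obs)
# The true mean of the underlying Gaussian is 0, but the naive
# estimate is pulled well above 0 toward E[X | X >= 0.5].
print(f"true mean: 0.0, naive estimate: {naive_mean:.2f}")
```

Recovering the parameters of the *untruncated* distribution from such samples, without observing the truncated fraction, is exactly the estimation problem the abstract refers to.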

Bio: Constantinos Daskalakis is a professor of computer science and electrical engineering at MIT. He holds a diploma in electrical and computer engineering from the National Technical University of Athens, and a Ph.D. in electrical engineering and computer sciences from UC Berkeley. His research interests lie in theoretical computer science and its interface with economics, probability, learning, and statistics. He has been honored with the 2007 Microsoft Graduate Research Fellowship, the 2008 ACM Doctoral Dissertation Award, the Game Theory and Computer Science (Kalai) Prize from the Game Theory Society, the 2010 Sloan Fellowship in Computer Science, the 2011 SIAM Outstanding Paper Prize, the 2011 Ruth and Joel Spira Award for Distinguished Teaching, the 2012 Microsoft Research Faculty Fellowship, the 2015 Research and Development Award by the Vatican Giuseppe Sciacca Foundation, the 2017 Google Faculty Research Award, the 2018 Simons Investigator Award, and the 2018 Rolf Nevanlinna Prize from the International Mathematical Union. He is also a recipient of Best Paper awards at the ACM Conference on Economics and Computation in 2006 and 2013.