Machine Learning Seminar: Polynomial approximation, moment matching and optimal estimation of the unseen
Wednesday, April 6, 2016
3:30 pm - 5:00 pm
Gross Hall 330
Yihong Wu, University of Illinois at Urbana-Champaign
Estimating the support size of a distribution is a classical problem in statistics, dating back to the early work of Fisher and Good-Turing and to the influential work of Efron-Thisted on "how many words did Shakespeare know." This problem has also been investigated by the CS theory community under the umbrella of property testing. In the sublinear regime, where the alphabet cardinality far exceeds the sample size, the challenge lies in extrapolating the number of unseen symbols from what has been observed so far.