• Multiclass Classification with Bandit Feedback Using Adaptive Regularization

    More details on the paper
    • Presentation date: 1392/07/24 (Solar Hijri calendar)
    • Publication date on TPBin: 1392/07/24 (Solar Hijri calendar)
     We present a new multiclass algorithm in the bandit framework, where, after making a prediction, the learning algorithm receives only partial feedback: a single bit indicating whether the predicted label is correct, rather than the true label itself. Our algorithm is based on the second-order perceptron and uses upper-confidence bounds to trade off exploration and exploitation, instead of the random sampling performed by most current algorithms. We analyze this algorithm in a partially adversarial setting, where instances are chosen adversarially, while labels are drawn according to a linear probabilistic model that is also chosen adversarially. We show a regret of $O(\sqrt{T}\log T)$, which improves over the current best bound of $O(T^{2/3})$ in the fully adversarial setting. We evaluate our algorithm on nine real-world text classification problems and on four vowel recognition tasks, often obtaining state-of-the-art results, even compared with non-bandit online algorithms, especially when label noise is introduced.
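
     The abstract's core idea (choose a label by an optimistic score, then learn from a single correct/incorrect bit) can be illustrated with a minimal sketch. The code below is not the paper's second-order perceptron update. It swaps in a simpler LinUCB-style scheme with one ridge-regression estimator per class, and the names UCBBanditClassifier, alpha, and run_toy, as well as the one-hot toy data, are assumptions made purely for illustration.

```python
import math


class UCBBanditClassifier:
    """Illustrative per-class ridge regression with an upper-confidence bonus.

    A simplified stand-in, not the algorithm analyzed in the paper.
    """

    def __init__(self, n_classes, dim, alpha=1.0):
        self.n_classes = n_classes
        self.dim = dim
        self.alpha = alpha  # hypothetical exploration weight
        # Per class: inverse correlation matrix (identity at start) and
        # the sum of instances on which that class was confirmed correct.
        self.A_inv = [[[float(i == j) for j in range(dim)] for i in range(dim)]
                      for _ in range(n_classes)]
        self.b = [[0.0] * dim for _ in range(n_classes)]

    @staticmethod
    def _matvec(M, x):
        return [sum(m * v for m, v in zip(row, x)) for row in M]

    def predict(self, x):
        """Return the class with the highest optimistic score."""
        best, best_score = 0, -math.inf
        for k in range(self.n_classes):
            Ax = self._matvec(self.A_inv[k], x)         # A_k^{-1} x
            w = self._matvec(self.A_inv[k], self.b[k])  # w_k = A_k^{-1} b_k
            mean = sum(wi * xi for wi, xi in zip(w, x))
            width = math.sqrt(max(sum(a * xi for a, xi in zip(Ax, x)), 0.0))
            score = mean + self.alpha * width
            if score > best_score:  # ties go to the lowest class index
                best, best_score = k, score
        return best

    def update(self, x, predicted, correct):
        """Bandit update: only the predicted class and one feedback bit are used."""
        A_inv = self.A_inv[predicted]
        Ax = self._matvec(A_inv, x)
        denom = 1.0 + sum(a * xi for a, xi in zip(Ax, x))
        # Sherman-Morrison rank-one update of (A + x x^T)^{-1}.
        for i in range(self.dim):
            for j in range(self.dim):
                A_inv[i][j] -= Ax[i] * Ax[j] / denom
        if correct:
            for i in range(self.dim):
                self.b[predicted][i] += x[i]


def run_toy(rounds, n_classes=3):
    """Run on a noiseless one-hot stream (toy data) and count mistakes."""
    learner = UCBBanditClassifier(n_classes, dim=n_classes)
    mistakes = 0
    for t in range(rounds):
        label = t % n_classes
        x = [float(i == label) for i in range(n_classes)]
        guess = learner.predict(x)
        correct = guess == label  # the learner never sees `label` itself
        learner.update(x, guess, correct)
        mistakes += not correct
    return mistakes


if __name__ == "__main__":
    print("mistakes:", run_toy(30))
```

     In this sketch the confidence width shrinks for a class each time it is tried on similar instances, so exploration is driven by uncertainty rather than by random sampling, which is the contrast the abstract draws.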
