• Improving Monte Carlo Tree Search by Combining RAVE and Quality-Based Rewards Algorithms

    More details about the paper
    • Presentation date: 1396/11/30
    • Publication date on TPBin: 1396/11/30

    Monte Carlo tree search (MCTS) is a state-of-the-art method for building intelligent agents in games and has been the focus of much research over the past decade. With this method, agents learn to play a game by building a search tree from samples gathered in randomized simulations. In most of this research, the reward from a simulation is a discrete value representing the final state of the game (win, loss, or draw), e.g., r ∈ {-1, 0, 1}. In this paper, we introduce a method that modifies the reward for each playout and then backpropagates it through both the UCT and AMAF values. The RAVE algorithm is used to evaluate nodes more accurately at each level of the tree. We implemented the algorithm together with the last-good-reply, decisive-move, and PoolRAVE heuristics. Finally, we used leaf parallelization to increase the number of samples gathered by simulations. All implementations were evaluated on the game of Hex on a 9 × 9 board. We show that the proposed method improves performance in this domain.
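The abstract does not give the reward formula or the RAVE weighting, so the following Python sketch only illustrates the general idea under stated assumptions. The names `NodeStats`, `shaped_reward`, and `rave_value`, the length-based bonus with scale `a`, and the equivalence parameter `k` are all hypothetical and not taken from the paper. The blend weight uses one commonly cited hand-tuned schedule, beta = sqrt(k / (3n + k)), where n is the move's visit count.

```python
import math
from dataclasses import dataclass


@dataclass
class NodeStats:
    """Per-move statistics kept at a tree node (hypothetical layout)."""
    visits: int = 0          # playouts in which this move was selected at this node
    total: float = 0.0       # sum of rewards over those playouts
    amaf_visits: int = 0     # playouts in which the move was played at any later point
    amaf_total: float = 0.0  # sum of rewards over those playouts

    def update(self, reward: float) -> None:
        """Backpropagate a reward into the ordinary (UCT) statistics."""
        self.visits += 1
        self.total += reward

    def update_amaf(self, reward: float) -> None:
        """Backpropagate a reward into the all-moves-as-first statistics."""
        self.amaf_visits += 1
        self.amaf_total += reward


def shaped_reward(outcome: int, length: int, max_length: int, a: float = 0.25) -> float:
    """Scale a win/loss/draw outcome in {-1, 0, 1} by playout length.

    Illustrative assumption: shorter playouts push the reward further from
    zero. The sign of the outcome is preserved and a draw stays at 0.
    """
    bonus = a * (1.0 - length / max_length)
    return outcome * (1.0 + bonus)


def rave_value(stats: NodeStats, k: float = 1000.0) -> float:
    """Blend the AMAF mean and the ordinary mean with weight beta."""
    q = stats.total / stats.visits if stats.visits else 0.0
    q_amaf = stats.amaf_total / stats.amaf_visits if stats.amaf_visits else 0.0
    # beta is 1 with no direct visits and decays toward 0 as visits grow.
    beta = math.sqrt(k / (3.0 * stats.visits + k))
    return beta * q_amaf + (1.0 - beta) * q
```

On a 9 × 9 Hex board a playout places at most 81 stones, so a call such as `shaped_reward(1, 40, 81)` gives a win found in 40 moves a reward somewhat above 1. The same shaped value would then be passed to both `update` and `update_amaf` during backpropagation.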
