• Spoken Query Based Word Spotting in Digitized Tamil Documents

    Authors:
    Further article details
    • Presentation date: 1392/07/24 (Solar Hijri calendar)
    • Publication date on TPBin: 1392/07/24 (Solar Hijri calendar)
     This paper presents an integrated approach to spotting spoken keywords in digitized Tamil documents by combining word image matching and spoken word recognition techniques. The work involves segmenting document images into words, creating an index of keywords, and constructing a word image hidden Markov model (HMM) and a speech HMM for each keyword. The word image HMMs are built from seven-dimensional profile and statistical moment features and are used to recognize each segmented word image, so that a word matching a keyword can be added to the index. The spoken query word is recognized by selecting the speech HMM with the maximum likelihood, using 39-dimensional mel-frequency cepstral coefficients (MFCCs) derived from speech samples of the keywords. During word spotting, the positional details of the search keyword, obtained from the automatically updated index, are used to retrieve the relevant portion of text from the document. Recall, precision, and F-measure are calculated for 40 test words from four groups of literary documents to illustrate the capability of the proposed scheme and its value in the emerging multilingual information retrieval scenario.
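
The retrieval and evaluation steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Position` record, the function names, the keyword labels, and the log-likelihood values are all hypothetical, and the HMM scoring itself is assumed to have been done elsewhere, so only the maximum-likelihood selection, the index lookup, and the standard recall, precision, and F-measure formulas are shown.

```python
from collections import defaultdict
from typing import NamedTuple


class Position(NamedTuple):
    """Hypothetical location record for one word in a document."""
    page: int
    line: int
    word: int


def build_index(recognized_words):
    """Map each recognized keyword label to the positions where it occurs.

    `recognized_words` is an iterable of (label, Position) pairs, standing in
    for the output of the word image recognizer (an assumption).
    """
    index = defaultdict(list)
    for label, position in recognized_words:
        index[label].append(position)
    return dict(index)


def pick_keyword(log_likelihoods):
    """Return the keyword whose speech model gave the highest log-likelihood."""
    return max(log_likelihoods, key=log_likelihoods.get)


def spot(index, log_likelihoods):
    """Recognize the spoken query, then look up its positions in the index."""
    keyword = pick_keyword(log_likelihoods)
    return keyword, index.get(keyword, [])


def precision_recall_f(retrieved, relevant):
    """Compute precision, recall, and F-measure for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Toy data: placeholder labels and made-up scores, for illustration only.
index = build_index([
    ("kw_a", Position(1, 2, 3)),
    ("kw_b", Position(1, 4, 1)),
    ("kw_a", Position(2, 1, 1)),
])
scores = {"kw_a": -120.5, "kw_b": -310.2}
keyword, positions = spot(index, scores)

# Assumed ground truth: one retrieved position is correct, one occurrence was missed.
ground_truth = [Position(1, 2, 3), Position(3, 1, 1)]
precision, recall, f_measure = precision_recall_f(positions, ground_truth)
```

Keeping the index as a plain mapping from keyword to positions means the spoken query only has to be resolved to a keyword label once; the document lookup is then a single dictionary access.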
