Title | Spoken Language Communication Technology: Development of the SprinTra WFST Speech Decoder |
Author (Japanese) | ディクソン, ポール R.; 堀, 智織; 柏岡, 秀樹 |
Author (English) | Dixon, Paul Richard; Hori, Chiori; Kashioka, Hideki |
Author affiliation (Japanese) | 情報通信研究機構ユニバーサルコミュニケーション研究所音声コミュニケーション研究室(NICT); 情報通信研究機構ユニバーサルコミュニケーション研究所音声コミュニケーション研究室(NICT); 情報通信研究機構ユニバーサルコミュニケーション研究所音声コミュニケーション研究室(NICT) |
Author affiliation (English) | Spoken Language Communication Laboratory, Universal Communication Research Institute, National Institute of Information and Communications Technology (NICT); Spoken Language Communication Laboratory, Universal Communication Research Institute, National Institute of Information and Communications Technology (NICT); Spoken Language Communication Laboratory, Universal Communication Research Institute, National Institute of Information and Communications Technology (NICT) |
Issue date | 2013-01-17 |
Publisher | 情報通信研究機構(NICT) National Institute of Information and Communications Technology (NICT) |
Publication title | 情報通信研究機構英文論文集 Journal of the National Institute of Information and Communications Technology |
Volume | 59 |
Issue | 3-4 |
Start page | 15 |
End page | 20 |
Publication date | 2013-01-17 |
Language | eng |
Abstract | In this paper we describe SprinTra, the NICT speech decoder based on Weighted Finite State Transducers (WFSTs). The paper starts with a brief introduction to WFSTs and the accompanying mathematical notation. This is followed by an introduction to the use of WFSTs in speech recognition: we briefly describe the WFST components used in a typical speech recognition system and explain how they are combined and optimized to yield very efficient decoder search spaces. After these preliminaries we give a high-level description of the features and architecture of SprinTra. Our focus was to design and implement an engine suitable for both research and deployment. To bring state-of-the-art speech recognition technology to as many users as possible, SprinTra runs on many platforms and can additionally be accessed through various programming interfaces and scripting layers. The supporting tools are also designed for portability and usability, which allows users, including non-experts in speech recognition, to easily construct state-of-the-art speech recognition systems. The description of SprinTra's core features includes the on-the-fly algorithm we have proposed for memory-efficient composition of class N-gram models. |
Description | Physical characteristics: Original contains illustrations |
Keywords | WFST; Speech recognition; Decoder; On-the-fly composition |
Document type | Technical Report |
ISSN | 1349-3205 |
NCID | AA12009289 |
SHI-NO | AA0065494003 |
URI | https://repository.exst.jaxa.jp/dspace/handle/a-is/21860 |