
Author Owens, M.
Author OʼBoyle, P.
Author McMahon, J.
Author Ming, J.
Author Smith, F.J.
Issue date 1997
Abstract This paper presents results from a series of missing-word tests, in which a small fragment of text is presented to human subjects who are then asked to suggest a ranked list of completions. The same experiment is repeated with the WA model, an n-gram statistical language model. From the completion data two measures are obtained: (i) verbatim predictability, which indicates the extent to which subjects nominated exactly the missing word, and (ii) grammatical class predictability, which indicates the extent to which subjects nominated words of the same grammatical class as the missing word. The differences in language model performance and human performance are encouragingly small, especially for verbatim predictability. This is especially significant given that the WA model was able, on average, to use at most half the available context. The results highlight human superiority in handling missing content words. Most importantly, the experiments illustrate the detailed information one can obtain about the performance of a language model through using missing-word tests.
Publisher SAGE Publications
Subject language model evaluation
Subject missing-word tests
Subject statistical language modeling
Title A Comparison of Human and Statistical Language Model Performance using Missing-Word Tests
Type Journal Article
DOI 10.1177/002383099704000404
Print ISSN 0023-8309
Journal Language and Speech
Volume 40
First page 377
Last page 389
Affiliation Owens, M., The Queenʼs University of Belfast
Affiliation OʼBoyle, P., The Queenʼs University of Belfast
Affiliation McMahon, J., The Queenʼs University of Belfast
Affiliation Ming, J., The Queenʼs University of Belfast
Affiliation Smith, F.J., The Queenʼs University of Belfast
Issue 4
Reference ABORN, M., RUBENSTEIN, H., & STERLING, T.D. (1959). Sources of contextual constraint upon words in sentences. Journal of Experimental Psychology, 57, 171–180.
Reference BAHL, L.R., JELINEK, F., & MERCER, R.L. (1983). A maximum likelihood approach to continuous speech recognition. IEEE Transactions on Speech and Audio Processing, 1, 77–83.
Reference BROWN, P., DELLA PIETRA, S., DELLA PIETRA, V., LAI, J., & MERCER, R. (1992). An estimate of an upper bound for the entropy of English. Computational Linguistics, 18(1), 31–40.
Reference FILLENBAUM, S., JONES, L.V., & RAPOPORT, A. (1963). The predictability of words and their grammatical classes as a function of rate of deletion from a speech transcript. Journal of Verbal Learning and Verbal Behavior, 2, 186–194.
Reference GUPTA, V., LENNING, M., & MERMELSTEIN, P. (1992). A language model for very large vocabulary speech recognition. Computer Speech and Language, 6, 331–344.
Reference JELINEK, F., MERCER, R.L., BAHL, L.R., & BAKER, J.K. (1977). Perplexity: A measure of difficulty of speech recognition tasks. Journal of the Acoustical Society of America, 62, 63.
Reference JELINEK, F., & MERCER, R.L. (1980). Interpolated estimation of Markov source parameters from sparse data. In E. S. Gelsema & L. N. Kanal (eds.), Pattern Recognition in Practice (pp. 381–397). Amsterdam: North-Holland Publishing Company.
Reference KUČERA, H., & FRANCIS, W.N. (1967). Computational analysis of present-day American English. Providence, Rhode Island: Brown University Press.
Reference McMAHON, J., & SMITH, F.J. (1996). Improving statistical language model performance with automatically generated word hierarchies. Computational Linguistics, 22, 217–247.
Reference OʼBOYLE, P.L., OWENS, M., & SMITH, F.J. (1994). A weighted average n-gram model of natural language. Computer Speech and Language, 8, 337–349.
Reference ROSE, T.G., & LEE, M.J. (1994). Language modeling for large vocabulary speech recognition. IOA Meeting on Large Vocabulary Speech Recognition, Cambridge University, England.
Reference SHANNON, C.E. (1951). Prediction and entropy of printed English. Bell System Technical Journal, 30, 50–64.
Reference SHILLCOCK, R.C., & BARD, E.G. (1993). Modularity and the processing of closed class words. In G. Altmann & R. Shillcock (eds.), Cognitive models of speech processing: The Sperlonga meeting II (pp. 163–185). Hillsdale, NJ: Erlbaum.
Reference SMITH, F.J., & DEVINE, K. (1985). Storing and retrieving word phrases. Information Processing & Management, 21, 215–224.
Reference SUTCLIFFE, R.F.E., OʼSULLIVAN, D., McELLIGOTT, A., & SHEAHAN, L. (1995). Creation of a semantic lexicon by traversal of a machine tractable concept taxonomy. Journal of Quantitative Linguistics, 2, 33–42.
Reference TAYLOR, W.L. (1953). “Cloze procedure”: A new tool for measuring readability. Journalism Quarterly, 30, 415–433.
Reference UEBERLA, J. (1994). Analyzing a simple language model: Some general conclusions for language models for speech recognition. Computer Speech and Language, 8, 153–176.
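
The abstract above describes two measures obtained from completion data: verbatim predictability and grammatical class predictability. The sketch below is a minimal illustration of how such measures could be computed from ranked completion lists; the data layout, field names, and top-n cutoff are assumptions made for the example, and the WA model itself is not reimplemented here.

# Illustrative sketch (not from the paper): scoring ranked completions for a
# missing-word test. Field names and the POS tagging scheme are assumptions.

from dataclasses import dataclass
from typing import List


@dataclass
class MissingWordItem:
    missing_word: str          # the word removed from the text fragment
    missing_word_pos: str      # its grammatical class, e.g. "NOUN"
    completions: List[str]     # ranked completions from a subject or a model
    completion_pos: List[str]  # grammatical class of each proposed completion


def verbatim_predictability(items: List[MissingWordItem], top_n: int = 1) -> float:
    """Fraction of items whose missing word appears among the top-n completions."""
    hits = sum(1 for it in items if it.missing_word in it.completions[:top_n])
    return hits / len(items)


def class_predictability(items: List[MissingWordItem], top_n: int = 1) -> float:
    """Fraction of items where a top-n completion has the missing word's grammatical class."""
    hits = sum(1 for it in items if it.missing_word_pos in it.completion_pos[:top_n])
    return hits / len(items)


if __name__ == "__main__":
    # Toy example: the top guess has the right grammatical class but is not the exact word.
    item = MissingWordItem(
        missing_word="house",
        missing_word_pos="NOUN",
        completions=["building", "house", "garden"],
        completion_pos=["NOUN", "NOUN", "NOUN"],
    )
    print(verbatim_predictability([item]))  # 0.0
    print(class_predictability([item]))     # 1.0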
