Publication information

Authors: Fedotov A.M., Tusupov J.A., Sambetbayeva M.A., Fedotova O.A., Sagnayeva S.K., Bapanov A.A., Tazhibaeva S.Z.
Title: Classification model and morphological analysis in multilingual scientific and educational information systems
Bibliographic reference: Fedotov A.M., Tusupov J.A., Sambetbayeva M.A., Fedotova O.A., Sagnayeva S.K., Bapanov A.A., Tazhibaeva S.Z. Classification model and morphological analysis in multilingual scientific and educational information systems // Journal of Theoretical and Applied Information Technology. - 2016. - Vol. 86. - Iss. 1. - P. 96-111. - ISSN 1992-8645. - EISSN 1817-3195.
External systems: Russian Science Citation Index (РИНЦ): 27004356; Scopus: 2-s2.0-84963617429
Abstract (English): The article addresses the construction of models of documentary and factographic search in multilingual scientific and educational information systems that work with loosely structured documents. A model for classifying information system documents is proposed, based on a tolerance relation and taking into account the possible absence of a priori defined classifiers. Particular attention is paid to the formation of the feature space, taking into account the morphology of the document language. The article contains an overview of morphological text analyzers that can be used to determine the normal form of a word. The features of applying morphological analyzers are described, and their advantages and disadvantages are listed. Rules for normalizing Kazakh words and an algorithm for handling both vocabulary and non-vocabulary (including non-existent) words are developed. A multilingual thesaurus of scientific and technical concepts (terms) on information technology in English, Russian, and Kazakh is developed, and a system of term normalization and interlingual correspondence is implemented for it. © 2005 - 2016 JATIT & LLS. All rights reserved.
Keywords: Tolerance relation; Morphology; Morphological text analyzers; Educational information systems; Document classification; Factographic search
Published: 2016
Physical description: pp. 96-111
Cited references:
1. D. Knuth, “The Art of Computer Programming”, Sorting and Searching, Second Edition, Massachusetts: Addison-Wesley, 1998. ISBN 0-201-89685-0.
2. A.I. Mikhailov, A.I. Chernyi and R.S. Gilyarevskii, “Fundamentals of Informatics”, Moscow: Nauka, 1968.
3. A.I. Mikhailov, A.I. Chernyi and R.S. Gilyarevskii, “Scientific Communications and Informatics”, Moscow: Nauka, 1976.
4. Yu.I. Shokin, A.M. Fedotov and V.B. Barakhnin, “Problems of Information Retrieval”, Novosibirsk: Nauka, 2010.
5. Yu.M. Arskii, R.S. Gilyarevskii, I.S. Turov, and A.I. Chernyi, “Infosphere: Information Structures, Systems, and Processes in Science and Society”, Moscow: VINITI, 1996.
6. A.I. Rakitov, “Encyclopedia of Philosophy”, vol. 5, Moscow: Sovetskaya Entsiklopediya, 1970, p. 298.
7. Kazakh grammar. Phonetics, word formation, morphology, syntax, Astana, 2002.
8. M. Balakaev, Modern Kazakh, Astana, 2006.
9. A.M. Fedotov, O.L. Zhizhimov, O.A. Fedotova, V.B. Barakhnin, “A model of information system to support scientific and educational activities”, Vestnik NSU Series: Information Technologies, vol. 12, no. 1, 2014, pp. 89-101. ISSN 1818-7900.
10. S.R. Ranganathan, Colon Classification, 6th ed., Bombay: Asia, 1963.
11. V.B. Barakhnin, A.M. Fedotov, “Building models of documentary and factographic retrieval in digital libraries”, Automatic Documentation and Mathematical Linguistics, vol. 48, no. 6, 2014, pp. 296-304. ISSN 0005-1055, EISSN 1934-8371.
12. Yu.A. Shreider, Equality, Similarity, Order, Moscow: Nauka, 1971.
13. G. Salton, Dynamic information and library processing. N.J.: Prentice Hall, 1975.
14. M.F. Porter, “An algorithm for suffix stripping”, Program, vol. 14, no. 3, 1980, pp. 130-137.
15. P. Willett, “The Porter stemming algorithm: then and now”, Program: Electronic Library and Information Systems, vol. 40, no. 3, 2006, pp. 219-223.
16. I. Segalovich, “A fast morphological algorithm with unknown word guessing induced by a dictionary for a web search engine”, 2003, pp. 273-280.
17. I.V. Segalovich, M.A. Maslov, “Russian morphological analysis and synthesis with generation of models of inflection for words not described in the dictionary”, Moscow: Dialog, vol. 2, 1998, pp. 547-552.
18. Mystem morphological analyzer of text in Russian [e-resource]; Yandex Company [site], 2003-2013, URL: http://company.yandex.ru/technologies/mystem/
19. Library phpMorphy, URL: http://phpmorphy.sourceforge.net
20. A.A. Rybanov, “Automated determination of quantitative characteristics of text”, Modern scientific research and innovations, vol. 34, no. 2, 2014, p. 5.
21. K.B. Bektaev, Statistical and information typology of Turkic text. Almaty, 1978, p. 183.
22. K.B. Bektaev, R.G. Piotrovsky, “Mathematical methods in linguistics”, Probability theory and simulation of language standard. Almaty: Publishing house of KazSU n.a. Kirov, 1973, 281 p.
23. K. Bektayev, Big Kazakh-Russian and Russian-Kazakh dictionary, Almaty: “Altyn Kazyna”, 1999.
24. A.A. Sharipbayev, G.T. Bekmanova, “Building logical semantics of the words in the Kazakh language”, Knowledge-Ontologies-Theories: Proc. of All-Russian Conf. with int. participation, October 3-5, 2011, Novosibirsk, 2011.
25. U. Tukeev, D.R. Rakhimova, “Augmented attribute grammar in meaning of natural languages sentences”, SCIS-ISIS, The 6th International Conference of Soft Computing and Intelligent Systems, The 13th International Symposium on Advanced Intelligent Systems (November 20-24), Kobe, Japan, 2012, pp. 1080-1084.
26. A.A. Sharipbayev, G.T. Bekmanova, B.Zh. Ergesh, A.K. Buribaeva, M.Kh. Karabalaeva, “Intelligent morphological analyzer based on semantic networks”, Proceedings of the international scientific conference “Open Semantic Technology for Intelligent Systems” (OSTIS-2012), Minsk: BSUIR, February 16-18, 2012, pp. 397-400.
27. Y.I. Shokin, A.M. Fedotov, O.L. Zhizhimov, O.A. Fedotova, “The evolution of information systems: from websites to information resource management systems”, Vestnik NSU Series: Information Technologies, vol. 13, no. 1, 2015, pp. 117-134. ISSN 1818-7900.
28. Y.I. Shokin, A.M. Fedotov, O.L. Zhizhimov, O.A. Fedotova, “Electronic library management system at integrated Distributed Information System of SB RAS”, Infrastructure of scientific information resources and systems: Collection of scientific articles of the Fourth All-Russian Symposium, Moscow: Computing Center of the Russian Academy of Sciences, vol. 1, 2014, pp. 11-39. ISBN: 978-5-19601-103-6.