A Literature Review: Stemming Algorithms for Indian Languages
International Journal of Computer Trends and Technology (IJCTT)
© August Issue 2013 by IJCTT Journal
Volume 4, Issue 8
Year of Publication: 2013
Authors: M. Thangarasu, Dr. R. Manavalan
M. Thangarasu, Dr. R. Manavalan, "A Literature Review: Stemming Algorithms for Indian Languages", International Journal of Computer Trends and Technology (IJCTT), V4(8):2582-2584, August Issue 2013. ISSN 2231-2803. www.ijcttjournal.org. Published by Seventh Sense Research Group.
Abstract: Stemming is the process of extracting the root word from a given inflected word. It plays a significant role in numerous applications of Natural Language Processing (NLP). The stemming problem has been addressed in many contexts and by researchers in many disciplines. This expository paper surveys some of the latest developments in stemming algorithms in data mining and presents some of the solutions proposed for various Indian languages, along with their results.
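To make the definition of stemming concrete, the following is a minimal sketch of a suffix-stripping ("light") stemmer in Python. It is not taken from the paper or from any of the surveyed algorithms. The function name `light_stem`, the English suffix list `SUFFIXES`, and the minimum stem length `MIN_STEM_LEN` are all hypothetical choices made only for illustration; a stemmer for Tamil or another Indian language would need a suffix inventory drawn from that language's morphology.

```python
# Illustrative only: a tiny suffix-stripping stemmer.
# SUFFIXES and MIN_STEM_LEN are assumptions, not values from the paper.
SUFFIXES = ("ations", "ation", "ings", "ing", "ed", "es", "s")  # longest first
MIN_STEM_LEN = 3  # never strip a suffix if the remaining stem would be shorter


def light_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least MIN_STEM_LEN characters."""
    lowered = word.lower()
    for suffix in SUFFIXES:
        stem_len = len(lowered) - len(suffix)
        if lowered.endswith(suffix) and stem_len >= MIN_STEM_LEN:
            return lowered[:stem_len]
    # No suffix applies: the word is returned unchanged (apart from lowercasing).
    return lowered


if __name__ == "__main__":
    for w in ("walking", "Walked", "books", "sing"):
        print(w, "->", light_stem(w))
```

The minimum stem length guards against over-stemming short words: for example, "sing" is left intact instead of being reduced to "s". Rule-based stemmers of this kind trade linguistic accuracy for simplicity, which is one reason language-specific suffix lists matter.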
Keywords: Tamil morphology, Tamil stemmer, Light stemmer, Improved stemmer, Natural Language Processing.