Implementation of a Multi-Objective Genetic Algorithm on Word Segmentation in Modern Greek

Z. Detorakis and G. Tambouratzis (Greece)


Genetic Algorithms, Minimum Description Length, stem, suffix.


This article presents a genetic algorithm (GA) for the automated extraction of morphological information from a corpus, with the ultimate aim of creating a computational model capable of distinguishing the stem of a word from its inflectional suffix. A multi-objective GA (MGA) is introduced, in which different objective functions are used to select each of the parents that participate in the reproduction operation of the GA. The system is trained on one corpus and subsequently used to segment a different test corpus. The effect of various GA parameters on the performance of the system is examined, and conclusions are drawn about their optimal values.
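The parent-selection idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the chromosome encoding (one cut position per word), the two objective functions (`mdl_cost`, a crude description-length proxy, and `stem_count`), the tournament selection, the one-point crossover, and the transliterated toy words are all assumptions made for illustration only.

```python
import random

def lexicons(words, cuts):
    """Split each word at its cut position; return the sets of stems and suffixes."""
    stems, suffixes = set(), set()
    for word, cut in zip(words, cuts):
        stems.add(word[:cut])
        suffixes.add(word[cut:])
    return stems, suffixes

def mdl_cost(words, cuts):
    """Hypothetical objective 1: total characters needed to list all distinct
    stems and suffixes (a rough stand-in for a description-length measure)."""
    stems, suffixes = lexicons(words, cuts)
    return sum(len(s) for s in stems) + sum(len(x) for x in suffixes)

def stem_count(words, cuts):
    """Hypothetical objective 2: number of distinct stems (fewer is better)."""
    return len(lexicons(words, cuts)[0])

def tournament(population, objective, words, rng, k=2):
    """Draw k chromosomes at random and keep the one with the lowest objective."""
    contenders = rng.sample(population, k)
    return min(contenders, key=lambda cuts: objective(words, cuts))

def crossover(parent_a, parent_b, rng):
    """One-point crossover on the list of cut positions."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]

def make_child(population, words, rng):
    """Select each parent under a different objective, then recombine them."""
    parent_a = tournament(population, mdl_cost, words, rng)
    parent_b = tournament(population, stem_count, words, rng)
    return crossover(parent_a, parent_b, rng)

if __name__ == "__main__":
    rng = random.Random(0)
    words = ["grafo", "grafeis", "lego", "legeis"]  # toy data, not from the corpus
    population = [[rng.randint(1, len(w)) for w in words] for _ in range(6)]
    print(make_child(population, words, rng))
```

In this sketch a cut equal to the word length means an empty suffix. Selecting the two parents under different criteria lets a child inherit cut positions that are good under either objective, which is the general motivation for the multi-objective selection scheme; the actual objectives and operators used in the article may differ.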

