


 Automata and Dictionaries
F. Guenthner, D. Maurel
Automata and Dictionaries is aimed at students and specialists in natural language processing and related disciplines where efficient text analysis plays a role. Large linguistic resources, in particular lexica, are now recognized as a fundamental prerequisite for all natural language processing tasks. Specialists in this domain cannot afford to be ignorant of state-of-the-art lexicon-management algorithms. This monograph, which is also intended to be used as an advanced textbook in computational linguistics, fills a gap in the natural language processing literature and complements other publications in this area.
This book is also a source of examples, exercises and problems for software engineering in general. The algorithms presented are excellent examples of nontrivial problems of graph construction, graph handling and graph traversal. Although published in scientific journals, they have not so far been presented to teachers and students in an easily accessible form. These algorithms will also be of interest for the training of software engineers.
Chapter 1 of Automata and Dictionaries provides the application-oriented motivation for solving the problems studied in the rest of the book. It introduces and exemplifies several key notions of lexicon-based natural language processing in a way accessible to any computer science student.
Chapter 2 surveys the main solutions of the problem, using as an example a very small toy lexicon. Chapter 3 defines the underlying mathematical notions, immediately illustrating theory with practical examples, which makes this part quite readable.
Chapters 4 and 5 are dedicated to the two central notions of lexicon construction: the algorithms of determinization and minimization. The standard form of both algorithms is presented, along with their variants and some special cases that occur frequently in practice. The operation of the algorithms is described step by step in examples, introducing the beginner to the world of epsilon-transitions, state heights and reverse automata.
Chapter 6 goes a step further into complexity. It is based on algorithms published since 1998. They are presented here with the same clarity as the preceding, more classical algorithms. This remarkable achievement owes much to the rigorous structure of the chapter. These algorithms have variants for transducers, which are presented in Chapter 7 with the same pedagogical skill.
The last chapter studies the time and space complexity of the algorithms and explains several tricks for speeding up their operation.
16 December 2005
ISBN 190498732X


