No longer maintained, but feel free to have a look.

Maui - Multi-purpose automatic topic indexing

Current repository:
Old repository:

Maui extends the keyphrase indexing algorithm Kea and is a library licensed under the GNU GPL.

It performs the following tasks:
  • keyphrase extraction
  • automatic tagging
  • term assignment with a controlled vocabulary, thesaurus or a taxonomy
  • subject indexing
  • extracting the most relevant concepts and entities from Wikipedia
It can also be used for terminology extraction and semi-automatic topic indexing.

About the name: In Māori mythology, Māui is a culture hero.
He fished out the North Island of New Zealand with a hook
made out of his jaw-bone.

KEA - Keyphrase extraction algorithm

I have extended Kea-3.0, the original version of the keyphrase extraction algorithm (designed for free indexing), into a new version, Kea-4.1 (also known as Kea++), that performs controlled indexing.

Given a document and a thesaurus or controlled vocabulary (Kea accepts any vocabulary in the SKOS format), Kea selects from this vocabulary a list of phrases that describe the document's main topics. (See examples of Kea's performance on different domains.)
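The idea of controlled indexing can be illustrated with a minimal Python sketch. This is not Kea's Java API or its actual ranking model: the vocabulary, the function names and the frequency-only ranking are all assumptions made for illustration. It only shows the general shape of the task, which is matching document phrases against vocabulary labels and returning the preferred terms.

```python
import re
from collections import Counter

# Hypothetical toy vocabulary: any known label -> preferred label.
# A real SKOS vocabulary would supply these via skos:prefLabel / skos:altLabel.
VOCABULARY = {
    "climate change": "climate change",
    "global warming": "climate change",
    "crop yield": "crop yields",
    "crop yields": "crop yields",
    "irrigation": "irrigation",
}


def candidate_ngrams(text, max_len=3):
    """Yield every word n-gram (1..max_len words) of the lowercased text."""
    words = re.findall(r"[a-z]+", text.lower())
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])


def index_document(text, vocabulary, top_k=3):
    """Return up to top_k preferred terms, ranked by how often they match."""
    counts = Counter()
    for phrase in candidate_ngrams(text):
        preferred = vocabulary.get(phrase)
        if preferred is not None:
            counts[preferred] += 1
    return [term for term, _ in counts.most_common(top_k)]


doc = ("Global warming reduces crop yields. Irrigation can offset some "
       "losses, but climate change also affects water supply.")
topics = index_document(doc, VOCABULARY)
print(topics)
```

Here "global warming" and "climate change" both count towards the preferred term "climate change", so it ranks first. Ranking by raw frequency is a deliberate simplification for the sketch.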

ELKB - Electronic Lexical Knowledge Base

A Java package for accessing and exploring Roget's Thesaurus, originally developed by Mario Jarmasz, University of Ottawa. ELKB includes several NLP applications: detecting lexical chains in text, determining the semantic distance between words and phrases, clustering words by meaning, and solving a word quiz.

Other interesting projects: