Maui - Multi-purpose automatic topic indexing
Current repository: github.com/zelandiya/maui
Old repository: maui-indexer.googlecode.com
Maui extends the keyphrase indexing algorithm Kea and is a GNU GPL-licensed library.
It performs the following tasks:
- keyphrase extraction
- automatic tagging
- term assignment with a controlled vocabulary, thesaurus or taxonomy
- subject indexing
- extraction of the most relevant concepts and entities from Wikipedia (only in older versions or in MauiPro, see below)
It can also be used for terminology extraction and semi-automatic topic indexing.
MauiPro for commercial application:
MauiPro is a re-implementation of Maui GPL designed for scalability and easy integration into commercial applications. It can be used as a service, via an API, or via a licensed hosted server, e.g. on Amazon EC2. Get in touch to find out more.
About the name: In Māori mythology, Maui is a culture hero. He fished out the North Island of New Zealand with a hook made out of his jaw-bone.
KEA - Keyphrase extraction algorithm
I have extended the original version of the keyphrase extraction algorithm, Kea-3.0 (designed for free indexing), into a new version, Kea-4.1 (also known as Kea++), that performs controlled indexing.
Given a document and a thesaurus or controlled vocabulary (Kea accepts any vocabulary in the SKOS format), Kea selects a list of phrases from this vocabulary describing the document's main topics. (See examples of Kea's performance on different domains.)
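To make the idea of controlled indexing concrete, here is a minimal Python sketch. It is not Kea's implementation or API: Kea is a Java library with its own ranking model, whereas this toy version simply counts how often each vocabulary term occurs in the document. The function names (normalize, assign_terms), the plain-list vocabulary (instead of a SKOS file) and the sample text are all assumptions made for illustration.

```python
import re
from collections import Counter


def normalize(text):
    """Lowercase the text and reduce it to space-separated ASCII alphanumeric tokens."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def assign_terms(document, vocabulary, top_n=5):
    """Return up to top_n vocabulary terms that occur in the document,
    most frequent first, with ties broken alphabetically.

    Hypothetical helper: raw frequency stands in for a real ranking model.
    """
    doc = normalize(document)
    counts = Counter()
    for term in vocabulary:
        key = normalize(term)
        if not key:
            continue  # skip terms with no alphanumeric content
        # Whole-word (and whole-phrase) matches only.
        hits = len(re.findall(r"\b" + re.escape(key) + r"\b", doc))
        if hits:
            counts[term] = hits
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:top_n]]


if __name__ == "__main__":
    text = ("Soil erosion reduces crop yields. Farmers fight soil erosion "
            "with cover crops and irrigation.")
    vocab = ["soil erosion", "irrigation", "cover crops", "forestry"]
    print(assign_terms(text, vocab, top_n=3))
    # prints ['soil erosion', 'cover crops', 'irrigation']
```

Only phrases that appear in the vocabulary can be returned, which is what separates controlled indexing from free indexing, where any phrase in the document is a candidate.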
ELKB - Electronic Lexical Knowledge Base
A Java package for accessing and exploring Roget's Thesaurus, originally developed by Mario Jarmasz, University of Ottawa. ELKB includes several NLP applications: detecting lexical chains in text, determining semantic distance between words and phrases, clustering words based on their meaning, and solving a word quiz.
Other interesting projects: