Open-source

Maui
- Multi-purpose automatic topic indexing

Current repository: github.com/zelandiya/maui
Old repository: maui-indexer.googlecode.com

Maui extends the keyphrase indexing algorithm Kea and is a library licensed under the GNU GPL.

It performs the following tasks:
  • keyphrase extraction
  • automatic tagging
  • term assignment with a controlled vocabulary, thesaurus or a taxonomy
  • subject indexing
  • extracting most relevant concepts and entities from Wikipedia (only in older versions, or in MauiPro, see below).
It can also be used for terminology extraction and semi-automatic topic indexing. 
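
To make the idea of keyphrase extraction concrete, here is a minimal Python sketch that ranks candidate phrases by TF×IDF alone. It is not Maui's code or API: the function names, the stopword list and the scoring formula are assumptions chosen for illustration, and real systems such as Kea and Maui train a classifier over richer features.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stopword list used only for this illustration.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "for", "to", "is", "with", "on"}


def candidate_phrases(text, max_len=3):
    """Return all 1..max_len word n-grams that neither start nor end with a stopword."""
    words = re.findall(r"[a-z]+", text.lower())
    candidates = []
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            candidates.append(" ".join(gram))
    return candidates


def extract_keyphrases(document, corpus, top_k=3):
    """Rank the document's candidates by TF x IDF computed against `corpus`."""
    doc_counts = Counter(candidate_phrases(document))
    corpus_sets = [set(candidate_phrases(d)) for d in corpus]
    total = sum(doc_counts.values())
    scores = {}
    for phrase, count in doc_counts.items():
        # Document frequency: number of corpus documents containing the phrase.
        df = sum(1 for s in corpus_sets if phrase in s)
        # Smoothed IDF (an assumption of this sketch, not Kea's exact formula).
        idf = math.log((1 + len(corpus)) / (1 + df)) + 1.0
        scores[phrase] = (count / total) * idf
    # Sort by score descending, then alphabetically for deterministic output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [phrase for phrase, _ in ranked[:top_k]]
```

Calling `extract_keyphrases(document, corpus)` returns the highest-scoring phrases of `document`; phrases that are frequent in the document but rare in the background `corpus` rank first.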

MauiPro for commercial applications:

MauiPro is a re-implementation of Maui GPL that ensures scalability and easy integration into commercial applications. It can be used as a service, via an API, or via a licensed hosted server, e.g. on Amazon EC2. Get in touch to find out more.

About the name: In Māori mythology, Māui is a culture hero.
He fished out the North Island of New Zealand with a hook
made out of his jaw-bone (above).


KEA - Keyphrase extraction algorithm
www.nzdl.org/kea

I have extended Kea-3.0, the original version of the keyphrase extraction algorithm (designed for free indexing), into a new version, Kea-4.1 (also known as Kea++), that performs controlled indexing.

Given a document and a thesaurus or controlled vocabulary (Kea accepts any vocabulary in the SKOS format), Kea selects from this vocabulary a list of phrases that describe the document's main topics. (See examples of Kea's performance in different domains.)
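
The controlled-indexing step can be sketched roughly as follows. This is an illustration under stated assumptions, not Kea's implementation: the `vocabulary` dictionary is a hypothetical stand-in for a parsed SKOS file (preferred label mapped to its alternative labels), `crude_stem` is a toy replacement for a real stemmer, and ranking by match count is a simplification.

```python
import re

# Hypothetical stopword list used only for this illustration.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "for", "to"}


def crude_stem(word):
    """Strip a trailing plural 's' (a very rough stand-in for a real stemmer)."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def normalize(phrase):
    """Lowercase, drop stopwords, stem, and sort words so word order is ignored."""
    words = re.findall(r"[a-z]+", phrase.lower())
    stems = sorted(crude_stem(w) for w in words if w not in STOPWORDS)
    return " ".join(stems)


def build_index(vocabulary):
    """Map each normalized label (preferred or alternative) to its preferred label."""
    index = {}
    for pref_label, alt_labels in vocabulary.items():
        for label in [pref_label, *alt_labels]:
            index[normalize(label)] = pref_label
    return index


def assign_terms(document, vocabulary, max_len=3):
    """Return vocabulary terms matched in the document, most frequent first."""
    index = build_index(vocabulary)
    words = re.findall(r"[a-z]+", document.lower())
    counts = {}
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            # Skip n-grams that start or end with a stopword to avoid double counting.
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            key = normalize(" ".join(gram))
            if key and key in index:
                term = index[key]
                counts[term] = counts.get(term, 0) + 1
    # Sort by match count descending, then alphabetically for deterministic output.
    return sorted(counts, key=lambda t: (-counts[t], t))
```

Because both document phrases and vocabulary labels pass through the same `normalize` function, surface variants such as "quality of water" and "water quality" resolve to the same preferred term, and only terms that exist in the vocabulary can ever be assigned.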




ELKB - Electronic Lexical Knowledge Base
www.nzdl.org/elkb

A Java package for accessing and exploring Roget's Thesaurus, originally developed by Mario Jarmasz at the University of Ottawa. ELKB includes several NLP applications: detecting lexical chains in text, determining the semantic distance between words and phrases, clustering words by meaning, and solving a word quiz.


