Open Calais is a free web service from Thomson Reuters that automatically analyzes
your unstructured text documents and converts them into meaningful structured
metadata. We call this "aboutness". You send documents one at a time through the
RESTful HTTP API and receive back the metadata for each document you send. The
metadata is a list of entities, events, relations, facts and topics. You can learn
more by reading here.
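As a rough illustration of the "send one document, get metadata back" flow, here is a minimal Python sketch. The endpoint URL is a hypothetical placeholder, and the header names (access token, content type, output format) are assumptions rather than something this page confirms; take the real values from the Get Started page.

```python
import json
import urllib.request

# Hypothetical placeholder: substitute the endpoint from the Get Started page.
CALAIS_URL = "https://calais.example.invalid/tag"


def build_request(text, api_key, url=CALAIS_URL):
    """Build (but do not send) a POST request carrying one document."""
    headers = {
        # Header names below are assumptions, not confirmed by this page.
        "X-AG-Access-Token": api_key,
        "Content-Type": "text/raw",
        "outputFormat": "application/json",
    }
    return urllib.request.Request(
        url, data=text.encode("utf-8"), headers=headers, method="POST"
    )


def analyze(text, api_key, url=CALAIS_URL):
    """Send one document and return the parsed JSON metadata."""
    with urllib.request.urlopen(build_request(text, api_key, url)) as resp:
        return json.load(resp)
```

Building the request separately from sending it makes it easy to inspect what would go over the wire before spending any of your API quota.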
To get more familiar with Open Calais and how to submit documents please read
To start using Open Calais please read this Get Started page.
There are several code samples in Java, Python, .NET, Ruby and more here.
A few tips on some of the cool capabilities of Open Calais:
For every entity it extracts, Open Calais provides a relevance score: how relevant
the entity is to the document. This is a very important capability that helps you
filter the value out of the huge amount of data out there. For example, many
documents mention Google, but far fewer of them are really about Google. The
relevance score helps you find the documents that are about a particular entity.
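The relevance filter can be sketched in a few lines of Python. The response layout used here (a dictionary of items, each with `_typeGroup`, `name` and `relevance` fields) is an assumption made for illustration, and the sample data and the 0.5 threshold are invented; check them against a real response before relying on them.

```python
def entities_about(metadata, min_relevance=0.5):
    """Return (name, relevance) pairs for entities at or above the threshold,
    most relevant first."""
    hits = []
    for item in metadata.values():
        # Skip anything that is not an extracted entity (assumed field name).
        if not isinstance(item, dict) or item.get("_typeGroup") != "entities":
            continue
        score = item.get("relevance")
        if score is not None and float(score) >= min_relevance:
            hits.append((item.get("name"), float(score)))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Invented sample data in the assumed layout.
sample = {
    "id1": {"_typeGroup": "entities", "name": "Google", "relevance": 0.8},
    "id2": {"_typeGroup": "entities", "name": "Paris", "relevance": 0.1},
    "id3": {"_typeGroup": "socialTag", "name": "Search engines"},
}

print(entities_about(sample))
```

A document that only mentions an entity in passing would carry a low score for it and drop out of the result, which is the "mentions Google" versus "is about Google" distinction described above.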
SocialTags are categories applied to a document that come from the Wikipedia
taxonomy. Very interesting. Each category is relevant to the document (although the term
might not be explicitly mentioned in the document) and these categories are also articles in
Generic Relations helps you find any kind of relationship between entities and other entities or to
Disambiguating entities: companies, geographies and people are disambiguated and resolved
to a normalized name, increasing the accuracy of the extraction and providing consistent
tagging across many different documents.
Open Calais is one of the first Linked Data services out there. You can read more about
Linked Data in Open Calais here. It is linked to Freebase, DBpedia, IMDB and more.
For questions, please email us.
Useful Stuff and Links