Yesterday I had the pleasure of sitting down with Thomas Tague and Krista Thomas from Thomson Reuters. I know, that's a lot of “T-h-o-m”s. They are here in San Jose this week at the Semantic Technology conference to make some announcements about their Open Calais semantic service, including some new publishing partners as well as some new features.
While we have the music and newspaper industries flailing around trying to find relevancy and profits in today's web-driven world, Thomson Reuters gazed into their future a few years back and saw the rise of blogs, social networks and user-generated content potentially starting to erode percentage points off of their bottom line. They listened to their customers, who were showing an interest in all things “web 2.0”, and instead of dismissing them and staying true to the old ways they started looking into ways to give their customers what they wanted. This train of thought led them down the path of acquiring Clearforest, a company providing software solutions for semantic tagging and text processing to enterprises in various industries.
After the acquisition, TR took the basis of the Clearforest tagging offering and unveiled the Open Calais project. TR has some really smart folks in their org, not only in the tech group but also on the business development team. When they launched the semantic service the key word was “Open”. From the get-go it was about giving people an enterprise-level semantic processing engine with zero barriers to testing it out, to see what kinds of integrations and services developers would come up with. For a company that spends billions of dollars a year producing and aggregating content, they knew that to evolve the service they needed to enlist the kind of numbers you only see when you hand over the keys to the car and let developers take it for a spin at no cost.
Today the Open Calais service has over 10,000 developers building services and applications on top of the platform and sees around 3,000,000 calls come in each day. While startups and the VC community burned millions chasing the “semantic web” with homegrown engines and services, TR has commoditized the hell out of the semantic web, providing a great service for FREE that anyone can sign up for and instantly benefit from.
One really interesting project that is making use of the free service is Harvard's Media Cloud research project. The project aims to provide an open service to track all forms of media as they are produced and spread around the globe, from the main news outlets through to blogs and the rest of the long tail of content production.
The free service will cover the needs of a lot of developers and publishers, providing 40,000 API calls a day and up to four transactions per second. If you are interested in giving it a test run, sign up for a key here.
For those developers and enterprises out there that need a little more heft in the transaction department, or need an SLA in place to make the folks who are signing your paychecks a little more comfortable, they also provide a professional service that touts 2,000,000 calls per day at a 20 call per second clip, as well as an on-premise solution that you could bring inside your firewall should you have such constraints due to your data retention/storage needs. Quick side note to clarify: at no time does the Open Calais web service store your data. They process it, hand you back the metadata about it, then throw it away. They state as much in their terms, but I thought I would point that out since I know how much everyone loves reading terms of service pages.
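If you plan to hammer either tier, it helps to throttle on your own side so you stay under both the per-second and the per-day numbers quoted above. Here is a minimal sketch of that idea in Python. To be clear, this is not Calais client code: the `QuotaThrottle` class and `seconds_to_exhaust` helper are hypothetical names of mine, one call is assumed to count as one transaction, and the daily counter simply resets 24 hours after the throttle starts, which may not match how the service actually resets its quota.

```python
import time

SECONDS_PER_DAY = 86_400

# Quota figures quoted in the post; one call is assumed to be one transaction.
FREE_TIER = {"per_second": 4, "per_day": 40_000}
PRO_TIER = {"per_second": 20, "per_day": 2_000_000}


class QuotaThrottle:
    """Hypothetical client-side throttle: spaces calls out and caps them per day."""

    def __init__(self, per_second, per_day, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / per_second   # minimum gap between two calls
        self.per_day = per_day
        self.clock = clock                 # injectable so the logic can be tested
        self.sleep = sleep
        self.day_start = clock()           # assumption: "day" starts when we do
        self.used_today = 0
        self.next_slot = self.day_start    # earliest time the next call may go out

    def acquire(self):
        """Wait for the next free slot. Returns False if today's quota is spent."""
        now = self.clock()
        if now - self.day_start >= SECONDS_PER_DAY:
            # A full day has passed since the counter started: reset it.
            self.day_start = now
            self.used_today = 0
        if self.used_today >= self.per_day:
            return False
        wait = self.next_slot - now
        if wait > 0:
            self.sleep(wait)
            now = self.next_slot
        self.next_slot = now + self.interval
        self.used_today += 1
        return True


def seconds_to_exhaust(per_second, per_day):
    """How long the daily quota lasts when calling flat out at the per-second cap."""
    return per_day / per_second
```

You would call `throttle.acquire()` right before each request and skip the request when it returns `False`. The arithmetic is worth a look too: flat out, the free tier's 40,000 calls last 10,000 seconds, a bit under three hours, while the professional tier's 2,000,000 calls would take 100,000 seconds, which is longer than a day, so on that tier the per-second cap is the limit you would actually hit first.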
So back to the big news they are announcing.
Expect this to be the initial list in a long line of content producers/partners that the project will be announcing this year.
On the technology side, they announced yesterday what will be included in their new 4.1 release as well as in the 4.2 update that will be coming out shortly. With these two updates TR is adding support for Spanish along with what they refer to as ‘social tags’, which give content more natural-language tag results, such as “new movie releases” instead of just “entertainment”.
One other little fun/timely offering they are unveiling is an add-on to their company ontology, in what they call a “fact pack”, that will surface recession-related content such as layoffs, earnings, accounting changes, etc.
If the TR folks keep down the path they have been following, in a couple of years they will be well on their way to being the de facto semantic engine that everyone in the web community turns to in order to tag their content and gain relevancy and insight from it.
Follow my real-time posts from the conference, as well as a recap of the event here later in the week.
CDYKES: Great article and tremendous example of the leverage that can be gained through a ‘web services’ model. With 10,000 developers developing to this service, applications and web businesses are being built that could never have been envisioned by a single company. The resulting business development funnel is fat with prospects for the ultimate end game carrot: the monetized API.