Automated linking with rich text editors
The web is built of links, of pages linking to other resources on the internet. But making those links manually is tedious. This is another area where modern inline editors could do better.
I was using Wikipedia the other day and it occurred to me how primitive it is to have all the internal links to other Wikipedia articles defined manually. Surely these should have been automated by now (i.e., marking a word or two would link you to the relevant article).
And indeed, this is a usability problem that can already be fixed with the Semantic Interaction stack underneath Create.js.
Annotate.js is a VIE widget built by Szaby Grünwald. It works much like a spell checker in a traditional text editor: you write text, and it highlights the potential entities you might want to link to. You can then either accept or decline these link suggestions by clicking them. When there are multiple potential matches, you can also disambiguate between them by selecting from an offered list.
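To make the accept, decline, and disambiguate flow a bit more concrete, here is a minimal sketch of how such suggestions could be modeled. This is not the Annotate.js API: the `Suggestion` and `Candidate` types and the `accept`, `decline`, and `render` functions are all hypothetical names invented for illustration.

```typescript
// Hypothetical model of spell-checker-style link suggestions.
// None of these names come from Annotate.js or VIE.
interface Candidate {
  label: string; // human-readable name shown in the disambiguation list
  uri: string;   // identifier of the entity the text could link to
}

type SuggestionStatus = "pending" | "accepted" | "declined";

interface Suggestion {
  start: number;           // offset of the first highlighted character
  end: number;             // offset just past the last highlighted character
  candidates: Candidate[]; // more than one entry means the user must disambiguate
  status: SuggestionStatus;
  chosen?: Candidate;      // set only once the suggestion is accepted
}

// Accept a suggestion, picking one candidate when several match.
function accept(s: Suggestion, choice: number = 0): Suggestion {
  const chosen = s.candidates[choice];
  if (chosen === undefined) {
    throw new RangeError(`no candidate at index ${choice}`);
  }
  return { ...s, status: "accepted", chosen };
}

// Decline a suggestion so the text stays unlinked.
function decline(s: Suggestion): Suggestion {
  return { start: s.start, end: s.end, candidates: s.candidates, status: "declined" };
}

// Escape a string for use in HTML text and attribute values.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render the text, turning accepted suggestions into links.
// Pending and declined suggestions are left as plain text.
function render(text: string, suggestions: Suggestion[]): string {
  const accepted = suggestions
    .filter((s) => s.status === "accepted")
    .sort((a, b) => a.start - b.start);
  let html = "";
  let cursor = 0;
  for (const s of accepted) {
    // Skip suggestions without a choice or overlapping an earlier link.
    if (s.chosen === undefined || s.start < cursor) continue;
    html += escapeHtml(text.slice(cursor, s.start));
    html += `<a href="${escapeHtml(s.chosen.uri)}">${escapeHtml(text.slice(s.start, s.end))}</a>`;
    cursor = s.end;
  }
  return html + escapeHtml(text.slice(cursor));
}
```

A real widget would keep this state inside the editor and update the highlighted text in place, but the shape of the data is probably similar: a text range, a list of candidates, and the user's decision.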
Here is a quick video of Annotate.js in action:
You can also try it yourself with an online demo.
Connecting to entities
The big question with automatic linking is where the entities come from. There are services like OpenCalais that can provide these suggestions for your content, but most of them are focused only on shared knowledge bases of big companies, famous people, and major cities.
Unless you're running a newspaper, it is unlikely that these are the things your content is about.
Apache Stanbol is an open source engine that can provide the enhancements for you. Out of the box it provides suggestions based on the Wikipedia knowledge repository. But more importantly, you can feed it with your own entities.
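As a rough sketch of what asking Stanbol for enhancements can look like, the snippet below builds a request for the enhancer endpoint. It assumes a Stanbol launcher running locally on port 8080 and the `/enhancer` path with plain-text input, as in Stanbol's documented curl usage. Verify the path and the supported `Accept` types against the version you run. `buildEnhancerRequest` and `EnhancerRequest` are hypothetical names, not part of any library.

```typescript
// Plain description of an HTTP request, so the sketch has no runtime dependencies.
interface EnhancerRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Assumption: a Stanbol launcher running locally on its default port.
const DEFAULT_STANBOL_URL = "http://localhost:8080";

// Build a request that posts plain text to the Stanbol enhancer.
// `accept` selects the RDF serialization of the returned enhancements.
function buildEnhancerRequest(
  text: string,
  baseUrl: string = DEFAULT_STANBOL_URL,
  accept: string = "text/turtle",
): EnhancerRequest {
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: `${baseUrl.replace(/\/+$/, "")}/enhancer`,
    method: "POST",
    headers: {
      "Content-Type": "text/plain",
      Accept: accept,
    },
    body: text,
  };
}
```

The result can be handed to any HTTP client, for example `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. The response is an RDF graph describing the entities Stanbol found in the text.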
This way the enhancements you get can be tuned to be meaningful to your content and your audience. If you write about medicine, they could be about symptoms and diseases. If you write about technology, they could be specific open source projects and their contributors. With Stanbol, the choice is yours.
The current downside of Stanbol is that you'll have to run it yourself, but there may be solutions coming for that as well.
Annotate.js does not merely create links. It also marks up the machine-readable relationship between the linked entities and the HTML content being edited. This can then be used by yet another set of tools, like search engines, to understand and organize the content better.
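One way to express such a relationship is RDFa. The sketch below wraps a range of text in a link that also states that the document references a given entity. I am not claiming this is the exact markup Annotate.js emits: the `linkEntity` helper is hypothetical, and the choice of the Dublin Core `dcterms:references` property is an assumption made for illustration.

```typescript
// Escape a string for use in HTML text and attribute values.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Wrap text[start, end) in a link carrying RDFa attributes.
// `href` is where a human reader goes; `resource` is the entity a machine
// should record. In RDFa 1.1, `resource` takes precedence over `href`
// as the object of the `rel` statement.
function linkEntity(
  text: string,
  start: number,
  end: number,
  entityUri: string,
  pageUrl: string,
): string {
  if (start < 0 || end > text.length || start >= end) {
    throw new RangeError("invalid text range");
  }
  const before = escapeHtml(text.slice(0, start));
  const label = escapeHtml(text.slice(start, end));
  const after = escapeHtml(text.slice(end));
  const link =
    `<a href="${escapeHtml(pageUrl)}" rel="dcterms:references" ` +
    `resource="${escapeHtml(entityUri)}">${label}</a>`;
  return `${before}${link}${after}`;
}
```

An RDFa-aware consumer reading the result can extract the statement "this document references that entity" without having to guess what the link text means.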
It is easy to see Create.js, as Drupal unfortunately did, as just an easy way to add nice inline editing features to your CMS. That is a good first step, but being able to interact with your content on the semantic level can do a lot more. Automated linking is just another demonstration of that.
As the ecosystem around Create.js and VIE matures and ships in more systems, things we can't even imagine now will be built on the stack.
If your CMS is properly decoupled, you can benefit from that immediately.