The Null Device
Posts matching tags 'natural language processing'
Three Israeli computer scientists have developed an algorithm for detecting sarcasm in online comments. Named SASI (Semi-Supervised Algorithm for Sarcasm Identification), the algorithm looks at linguistic features of the sentences to guess whether or not they're sarcastic. It was developed with Amazon product reviews as training data. Potential applications of such algorithms include systems that gauge the positivity or negativity of public opinion about a subject from online discussions.
Travel search engine of the day: Adioso. This is a new natural-language-based flight search system. It differs from sites like Kayak in that, rather than accepting simple queries in a set of fields (origin, destination, dates), it accepts queries as natural-language sentences, and allows a good deal of fuzziness. So, for example, if you want to go from London for a weekend in Barcelona in late November, you can ask for "London to Barcelona weekend late November", or if you just want to get out cheaply, you can ask for "London to anywhere under GBP100".
Well, you can if your destination is supported. The site appears to be Australian, and thus Australia and popular destinations from there (south-east Asia, the UK and US, and places along the "Kangaroo Route" to London) are well supported, while Europe (minus sunny holiday spots) is a bit patchy. The site found no flights from London to either Berlin or Stockholm, and drew a blank altogether at Reykjavík (the closest match it could find was Tel Aviv; I guess that sort of sounds like Reykjavík, if you're shouting across a noisy room or something). Flights across Australia it handles well, though, finding better prices than Kayak. In any case, the site claims to be in beta (though whether it's an old-fashioned beta or a Google-style permanent beta is uncertain), so with any luck, they'll improve it.
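For the curious, here is a minimal sketch of how a free-text query like the two examples above might be broken down into the fields a conventional search form would ask for. To be clear, this is not how Adioso works (I have no idea what they do internally); the `FlightQuery` class, the `parse_query` function and the regular expressions are all made up for illustration, and the sketch assumes a single-word destination and a price written as "under" plus a three-letter currency code.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class FlightQuery:
    """Hypothetical structured form of a free-text flight query."""
    origin: str
    destination: Optional[str]       # None stands for "anywhere"
    max_price: Optional[int] = None
    currency: Optional[str] = None
    when: Optional[str] = None       # left as free text, e.g. "weekend late November"


# Assumed price syntax: "under GBP100" or "under GBP 100".
_PRICE = re.compile(r"\bunder\s+([A-Z]{3})\s*(\d+)\b")
# Assumed route syntax: "<origin> to <everything else>".
_ROUTE = re.compile(r"^\s*(?P<origin>.+?)\s+to\s+(?P<rest>.+)$", re.IGNORECASE)


def parse_query(text: str) -> FlightQuery:
    max_price = currency = None

    # Pull out the price cap first, then remove it from the text.
    price = _PRICE.search(text)
    if price:
        currency, max_price = price.group(1), int(price.group(2))
        text = (text[:price.start()] + text[price.end():]).strip()

    route = _ROUTE.match(text)
    if route is None:
        raise ValueError("expected a query of the form '<origin> to <destination> ...'")

    # Simplifying assumption: the destination is the first word after "to",
    # and whatever follows is a fuzzy date hint.
    rest = route.group("rest").split()
    destination: Optional[str] = rest[0]
    when = " ".join(rest[1:]) or None
    if destination.lower() == "anywhere":
        destination = None

    return FlightQuery(route.group("origin"), destination, max_price, currency, when)


if __name__ == "__main__":
    print(parse_query("London to Barcelona weekend late November"))
    print(parse_query("London to anywhere under GBP100"))
```

A real system would need a gazetteer of place names (so that two-word cities don't fall apart) and a proper date interpreter; the point here is only that the "fuzziness" ends up as optional fields rather than mandatory ones.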
Researchers at MIT are developing software which paraphrases English-language text; for example, their software is able to take the sentence "The surprise bombing injured 20 people, 5 of them seriously," and rewrite it as "Twenty people were wounded in the explosion, among them five in serious condition." The system uses techniques adapted from computational biology to match fragments of sentences to others, skirting the area of semantics altogether; online translation systems like Babelfish/Systran use similar techniques. (via Techdirt)
(This is quite distinct from automatic summarisation software, which takes a text and delivers the "gist" of it. Some years ago, a large chunk of various intelligence agencies' research budgets was spent on this area, in an attempt to more easily cope with the flood of signal intelligence. And, unless the CIA and the like already have such systems in use, chances are it still is.)
Anyway, back to paraphrasing; when I read about this, the first thought that came to me was that it would be very useful to student plagiarists seeking to avoid detection (say, by Google searches on key sentences, or matches against other submitted assignments). Which made me wonder: have any plagiarists ever tried covering their tracks by passing an essay through several passes of Babelfish? (Given some of the grammar I've seen in student essays, I'm sure it wouldn't look too amiss.)
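As for the "techniques adapted from computational biology" mentioned above: the article doesn't say which ones, but the obvious candidate is sequence alignment, the method used to line up DNA or protein sequences. Below is a rough sketch of a textbook global alignment (Needleman-Wunsch) applied to words rather than nucleotides. This is my guess at the general idea, not the researchers' actual method; the `align` function and its scoring values are assumptions chosen for illustration.

```python
from typing import List, Optional, Tuple

Pair = Tuple[Optional[str], Optional[str]]


def align(a: List[str], b: List[str],
          match: int = 1, mismatch: int = -1, gap: int = -1) -> List[Pair]:
    """Globally align two word lists; None marks a gap on that side."""
    n, m = len(a), len(b)

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,   # pair the two words
                              score[i - 1][j] + gap,       # skip a word of a
                              score[i][j - 1] + gap)       # skip a word of b

    # Walk back from the bottom-right corner to recover one optimal alignment.
    pairs: List[Pair] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                pairs.append((a[i - 1], b[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            pairs.append((a[i - 1], None))
            i -= 1
        else:
            pairs.append((None, b[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


if __name__ == "__main__":
    for left, right in align("the cat sat".split(), "the cat quietly sat".split()):
        print(left, "|", right)
```

Words that line up across many sentence pairs describing the same event give you the fixed skeleton, and the places where they differ give you interchangeable fragments, all without the program knowing what any of it means.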
A few bits lifted from Techdirt. Firstly, secretive Stalinist cult-state North Korea has staked its claim to the Internet Age. The rigidly centralised, computer-poor nation claims to have invented the computer drink. Ah, good; we needed one of those.
But what it lacks in utility, it makes up for in entertainment value. The Ectaco Personal Translator proved the perfect icebreaker during a dinner party in rural France. It turned "thank you for the great dinner" into "it was disgusting," and "you are very beautiful" into "how much?" What better way to break the ice with a roomful of total strangers in a foreign country whose language you don't know?
An interesting programming contest: write a program which automatically summarises news items in haiku form. Given the complexity of the task, any satisfactory solutions are going to have to be quite interesting... (via Slashdot)