18 April 2012 by Jim Giles
Magazine issue 2860.
Remember that classic line? (Image: Everett Collection/Rex Features)
“OF ALL the gin joints in all the towns in all the world, she walks into mine.” It’s a classic quote from the film Casablanca, but can a computer grasp the magic of such memorable lines?
“When I started this experiment, people said ‘it’s cultural, a computer can’t catch it’,” says Cristian Danescu-Niculescu-Mizil at Cornell University in Ithaca, New York.
The doubters may have to think again. Danescu-Niculescu-Mizil and colleagues have taught a computer to identify memorable quotes with an accuracy approaching that of humans. It means computers might one day help writers test their latest catchy lines.
The Cornell team amassed quotes from the Internet Movie Database (IMDb), which contains a list of lines flagged by users as memorable. The context in which a line is uttered can make a quote more notable, so as a control, the team paired each notable quote with an ordinary one from the same context. The ordinary line was the same length and was spoken by the same character at around the same point in the film.
The computer analysed the pairs of quotes – around 2200 in total – for language patterns, unusual words, and word combinations. Unusual words were defined as those that rarely cropped up in text taken from news publications.
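One way to picture the "unusual words" measure is as a word-frequency check against a body of news text. The sketch below is illustrative only and is not the Cornell team's code: the function names, the add-one smoothing and the tiny reference corpus in the usage note are all assumptions made for the example. It scores a quote by how improbable its words are under a simple news-based word count, then picks the more unusual line from a pair.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_unigram_counts(reference_texts):
    """Count how often each word appears in the reference (news) texts."""
    counts = Counter()
    for text in reference_texts:
        counts.update(tokenize(text))
    return counts


def unusualness(quote, counts):
    """Average negative log-probability of the quote's words.

    Assumption: an add-one-smoothed unigram model stands in for
    "rarely cropped up in news text". Higher means more unusual.
    """
    tokens = tokenize(quote)
    if not tokens:
        return 0.0
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot for unseen words
    log_probs = (math.log((counts[t] + 1) / (total + vocab)) for t in tokens)
    return -sum(log_probs) / len(tokens)


def pick_more_unusual(quote_a, quote_b, counts):
    """Return whichever quote of a pair uses the more unusual words."""
    if unusualness(quote_a, counts) >= unusualness(quote_b, counts):
        return quote_a
    return quote_b
```

Given a toy reference corpus such as `["the market fell as the bank cut rates", "the bank said the market would recover"]`, calling `pick_more_unusual("the bank cut rates", "gin joints", counts)` favours "gin joints", because neither word appears in the reference text. A real test would need a far larger news corpus.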
The computer managed to identify several characteristics peculiar to the memorable quotes, creating a model that could identify them. “The phrases contain surprising combinations of words, but at the same time they have a syntactic structure that is common, so they are easy to use,” says Danescu-Niculescu-Mizil, who will present the research at a conference on computational linguistics in Jeju, South Korea, in July.
The analysis also showed that memorable lines often have a property the team dubs “generality”: they can be widely used because they don’t contain words that tie them to a specific context. For example, the line from Jaws – “You’re going to need a bigger boat” – might not be judged an all-time classic had Roy Scheider said “the” bigger boat.
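The "a" versus "the" contrast hints at how generality might be turned into a number. The following is a deliberately crude sketch, not the team's measure: it assumes that indefinite articles signal a general line and definite articles a context-bound one, and the names `article_counts` and `generality_score` are invented for this illustration.

```python
import re

# Assumption: articles are the only cue used here; a fuller measure
# would presumably look at other context-tying words as well.
DEFINITE = {"the"}
INDEFINITE = {"a", "an"}


def article_counts(quote):
    """Count definite and indefinite articles in a quote."""
    tokens = re.findall(r"[a-z']+", quote.lower())
    return {
        "definite": sum(t in DEFINITE for t in tokens),
        "indefinite": sum(t in INDEFINITE for t in tokens),
    }


def generality_score(quote):
    """Indefinite minus definite article count; higher suggests more general."""
    counts = article_counts(quote)
    return counts["indefinite"] - counts["definite"]
```

Under this proxy, "You're going to need a bigger boat" scores 1, while the same line with "the bigger boat" scores -1.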
The model was able to distinguish between memorable and non-memorable quotes with 64 per cent accuracy. Humans scored 78 per cent.
The team suggests that political campaigners could use the model to assess their slogans.
Take the test at bit.ly/H2mSV2.