
MIT's new AI: So smart it can predict what happens next from a still image


The AI can add animations to static images, although the researchers acknowledge that it rarely generates the “correct” future.


Image: MIT

Researchers have developed a deep-learning system that can do the very human task of interpreting what’s happening in a photo and guessing what’s likely to happen next.

Better yet, the system, developed by machine-learning researchers at MIT, can express its idea of a plausible future by adding animations to still images, such as waves that would ultimately crash, people who might move in a field, or a train that might roll forward on its tracks.


The work could provide a new direction for exploring computer vision by giving machines the ability to understand how objects move in the real world.

The researchers achieved their objective by training two deep networks on thousands of hours of unlabeled video. As they note, annotating video is expensive, but there is no shortage of unlabeled footage with which to begin teaching machines to read signals about how the world behaves.

Carl Vondrick, a PhD student at MIT, who specializes in machine learning and computer vision, told New Scientist that the ability to predict movements in a scene could ensure tomorrow’s domestic helper robots don’t become a hindrance. You wouldn’t, for example, appreciate a robot pulling a chair out from under you as you’re about to sit down, he said.

The model was trained on two million Flickr videos amounting to 5,000 hours of content, covering four main scene types: golf courses, beaches, train stations, and hospital rooms, the last consisting mostly of footage of babies.

As New Scientist notes, the videos the model produces are grainy and short, lasting about one second, but they do capture the right general movement of a given scene, such as a train moving forward or a baby scrunching up its face.

However, the model still has much to learn about how the world works, for example that a train cannot keep pulling out of a scene indefinitely. Still, it did show that machines can be taught to dream up brief, plausible futures.

The model can also “hallucinate” fictional but reasonable motion for each of the scene categories it was trained on.

The model is based on a machine-learning technique called adversarial learning, in which two deep networks compete against each other: one network generates synthetic video, while the other tries to discriminate between generated and real video.
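
For readers who want a concrete picture of how the two networks are pitted against each other, here is a minimal PyTorch sketch of the adversarial setup. It is not the MIT team’s actual architecture; the clip length, layer sizes, and the toy random data are illustrative assumptions only.

```python
# Minimal adversarial-learning sketch (assumed shapes, not the paper's model):
# a generator maps a still frame to a short clip, and a discriminator tries to
# tell generated clips from real ones.
import torch
import torch.nn as nn

FRAMES, H, W = 8, 64, 64  # assumed clip length and resolution

class Generator(nn.Module):
    """Maps a single RGB frame (B, 3, H, W) to a clip (B, 3, FRAMES, H, W)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3 * FRAMES, 3, padding=1), nn.Tanh(),
        )
    def forward(self, frame):
        out = self.net(frame)                     # (B, 3*FRAMES, H, W)
        return out.view(-1, 3, FRAMES, H, W)      # fold channels into a clip

class Discriminator(nn.Module):
    """Scores a clip as real (1) or generated (0) using 3-D convolutions."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv3d(64, 1, 4, stride=2, padding=1),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
        )
    def forward(self, clip):
        return self.net(clip)                     # (B, 1) logits

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(100):                           # toy loop on random data
    real_clip = torch.rand(4, 3, FRAMES, H, W) * 2 - 1  # stand-in for video
    still = real_clip[:, :, 0]                    # first frame as the input

    # Discriminator step: push real clips toward 1, generated clips toward 0.
    fake_clip = G(still).detach()
    loss_d = bce(D(real_clip), torch.ones(4, 1)) + \
             bce(D(fake_clip), torch.zeros(4, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator step: try to make the discriminator label its clips as real.
    loss_g = bce(D(G(still)), torch.ones(4, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```

As the two networks improve together, the generator is forced to produce motion that looks statistically like the real footage it was trained on, which is the intuition behind the grainy but broadly plausible clips described above.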

Vondrick has previously also trained deep-learning models on hundreds of hours of unlabeled YouTube videos and TV programs such as ‘The Office’ to predict human interactions and gestures, such as a handshake, hug or kiss.

Read more research from MIT

(via PCMag)
