Despite what the media tends to depict, artificial intelligence is being put to better use than winning video games and board games. In fact, two of the world’s leading tech giants have begun using AI to help the blind perceive the world in helpful new ways.
According to the World Health Organization, about 285 million people are visually impaired worldwide. Though the vast majority of visual impairment can be prevented or cured, many people don’t – or can’t – receive treatment and must learn to navigate their surroundings using other senses. Beyond these practical challenges, many also miss out on simple activities like flipping through photos or watching videos. With these limitations in mind, Facebook and Baidu sought to make the lives of the visually impaired a little easier and more enjoyable.
Facebook gives image captions ‘context’
Last week, Facebook unveiled a new AI-powered feature that describes photos for blind or visually impaired individuals. Since user-generated captions don’t always include details about what is actually in the photo, visually impaired people often miss out on both the content of the photo and the context of the caption. Facebook’s new system generates “automatic alternative text,” a description of the objects a photo might contain, which a screen reader then reads aloud in addition to the user-generated caption.
For example, a photo of a couple on a boat dock may have a user-generated caption that reads “Finally landed!” Without seeing the photo, someone might assume the caption refers to an airport. Facebook’s automatic alternative text may now add descriptive language such as “image contains two people, outdoors, ocean, sky, boat.” Prior to this feature, screen readers would only read the name of the person who posted the photo, along with the word “photo.” With the added context of the automatic alternative text, a visually impaired individual can tell both what the photo shows and what the caption refers to.
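To make the idea concrete, here is a minimal Python sketch of how a description like the one above could be assembled from an object recognizer's output. It is an illustration only, not Facebook's implementation: the function name `build_alt_text`, the confidence scores, and the 0.8 cutoff are all assumptions made for the example. The only details taken from the article are the “image contains …” phrasing and the fallback to the word “photo.”

```python
# Hypothetical sketch: turn recognizer output into alternative text.
# The labels, scores, and threshold below are invented for illustration.

def build_alt_text(tags, min_confidence=0.8):
    """Compose alt text from (label, confidence) pairs.

    Labels scoring below min_confidence are dropped, so the description
    only mentions objects the recognizer is fairly sure about.
    """
    kept = [label for label, confidence in tags if confidence >= min_confidence]
    if not kept:
        # Nothing confident enough: fall back to the old behavior.
        return "photo"
    return "image contains " + ", ".join(kept)


# Example: the boat-dock photo captioned "Finally landed!"
dock_photo_tags = [
    ("two people", 0.97),
    ("outdoors", 0.93),
    ("ocean", 0.91),
    ("sky", 0.88),
    ("boat", 0.85),
    ("airplane", 0.12),  # low-confidence guess, filtered out
]

print(build_alt_text(dock_photo_tags))
# prints: image contains two people, outdoors, ocean, sky, boat
```

Filtering on confidence reflects a cautious design choice: for someone who cannot check the photo, a wrong label is arguably worse than a missing one.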
In a press release, the company described its object recognition system as powered by deep neural networks with billions of parameters, trained on millions of examples. For all its value, automatic alternative text is currently available only in English on iOS screen readers, though Facebook says it plans to expand the feature to other languages and operating systems soon.
Baidu helps visually impaired people perceive the real world
Across the Pacific, Chinese search giant Baidu is also developing a service to help the visually impaired perceive the world through sound. But rather than describing the content of an image, Baidu’s DuLight aims to describe the objects one would encounter in the real world. The device attaches to an individual’s ear like a Bluetooth headset. When DuLight is pointed forward, it scans the objects in its vicinity and, using Baidu’s image recognition system, describes them to its wearer.
Baidu claims the device can identify more than just everyday objects – DuLight can apparently recognize product labels, street signs, landmarks, and even the faces of friends.
This isn’t such an outlandish claim, given that Baidu’s deep learning system has previously shown outstanding success in recognizing human faces, besting the likes of Microsoft, Facebook, and Google in an AI competition. Though Baidu did eventually admit it broke some rules of the competition, the feat was nonetheless impressive. That same software now powers DuLight, helping the visually impaired recognize friends, family, and even what’s in the fridge.
Image credit: Getty Images