Researchers have developed two new smartphone-based systems that can accelerate development of driverless cars by identifying a user's location and orientation in places where GPS does not function. The systems can also identify the various components of a road scene in real time using a regular camera or smartphone, performing the same job as sensors costing millions.
Although the systems cannot currently control a driverless car, the ability to make a machine "see" and accurately identify where it is and what it's looking at is a vital part of developing autonomous vehicles and robotics.
"Vision is our most powerful sense and driverless cars will also need to see, but teaching a machine to see is far more difficult than it sounds," said Professor Roberto Cipolla of the University of Cambridge, who led the research.
The first system, called SegNet, can take an image of a street scene it hasn't seen before and classify it in real time, sorting objects into 12 different categories such as roads, street signs, pedestrians, buildings and cyclists.
It can deal with light, shadow and night-time environments, and currently labels more than 90 percent of pixels correctly. "Users can visit the SegNet website and upload an image or search for any city or town in the world, and the system will label all the components of the road scene. The system has been successfully tested on both city roads and motorways," the authors noted.
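To make the "90 percent of pixels" figure concrete, here is a minimal sketch of how per-pixel labelling and pixel accuracy are commonly computed. It is not SegNet's code: the function names, the tiny 2x2 score map and the three-class setup are all hypothetical, and it assumes the network has already produced one score per category for every pixel.

```python
from typing import List, Sequence

def label_pixels(score_map: Sequence[Sequence[Sequence[float]]]) -> List[List[int]]:
    """Give each pixel the index of its highest-scoring category."""
    return [
        [max(range(len(scores)), key=scores.__getitem__) for scores in row]
        for row in score_map
    ]

def pixel_accuracy(predicted: Sequence[Sequence[int]],
                   truth: Sequence[Sequence[int]]) -> float:
    """Fraction of pixels whose predicted label matches the hand-labelled one."""
    total = 0
    correct = 0
    for predicted_row, truth_row in zip(predicted, truth):
        for p, t in zip(predicted_row, truth_row):
            total += 1
            correct += int(p == t)
    return correct / total if total else 0.0

# Hypothetical 2x2 image with 3 categories (the real system uses 12).
scores = [
    [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]],
    [[0.2, 0.2, 0.6], [0.3, 0.4, 0.3]],
]
labels = label_pixels(scores)                 # one category index per pixel
ground_truth = [[1, 0], [2, 2]]               # made-up reference labels
accuracy = pixel_accuracy(labels, ground_truth)
```

A score above 0.9 on this measure would correspond to the "more than 90 percent of pixels" reported for the system.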
"It is remarkably good at recognising things in an image because it has had so much practice," added Alex Kendall, a PhD student. SegNet was primarily trained in highway and urban environments, so it still has some learning to do for rural, snowy or desert environments, although it has performed well in initial tests for these environments.
There are three key technological questions that must be answered to design autonomous vehicles: where am I, what's around me and what do I do next. SegNet addresses the second question while a separate but complementary system answers the first by using images to determine both precise location and orientation.
The second system, which handles localisation, runs on a similar architecture to SegNet and is able to localise a user and determine their orientation from a single colour image in a busy urban scene. The system is far more accurate than GPS and works in places where GPS does not, such as indoors, in tunnels, or in cities where a reliable GPS signal is not available.
The localisation system uses the geometry of a scene to learn its precise location, and is able to determine, for example, whether it is looking at the east or west side of a building, even if the two sides appear identical.
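As an illustration of what "location and orientation" means in practice, the sketch below represents a camera pose as a 3D position plus a unit quaternion and measures how far apart two poses are. The representation, the `Pose` class and the helper names are assumptions made for this example, not details given by the researchers.

```python
import math
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Pose:
    position: Tuple[float, float, float]            # metres, in some fixed map frame
    orientation: Tuple[float, float, float, float]  # unit quaternion (w, x, y, z)

def yaw_quaternion(yaw_degrees: float) -> Tuple[float, float, float, float]:
    """Quaternion for a rotation of yaw_degrees about the vertical (z) axis."""
    half = math.radians(yaw_degrees) / 2.0
    return (math.cos(half), 0.0, 0.0, math.sin(half))

def position_error(a: Pose, b: Pose) -> float:
    """Straight-line distance between the two positions, in metres."""
    return math.dist(a.position, b.position)

def orientation_error_degrees(a: Pose, b: Pose) -> float:
    """Smallest rotation angle that takes one orientation to the other."""
    dot = abs(sum(p * q for p, q in zip(a.orientation, b.orientation)))
    dot = min(1.0, dot)  # guard against rounding slightly above 1
    return math.degrees(2.0 * math.acos(dot))

# Two hypothetical views of the same building from opposite sides:
# a few metres apart and facing in opposite directions.
one_side = Pose(position=(0.0, 0.0, 0.0), orientation=yaw_quaternion(0.0))
other_side = Pose(position=(3.0, 4.0, 0.0), orientation=yaw_quaternion(180.0))

distance_m = position_error(one_side, other_side)
turn_deg = orientation_error_degrees(one_side, other_side)
```

Even if the two facades look alike, the poses differ by a 180-degree turn, which is the kind of distinction the article says the system can make.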
"In the short term, we're more likely to see this sort of system on a domestic robot - such as a robotic vacuum cleaner, for instance," Cipolla added. The researchers recently presented details of the two technologies at the International Conference on Computer Vision in Santiago, Chile.