A robotic vision of the future
In the world of tomorrow, robots and AI will play an increasingly important role. At this year's Global Mobile Internet Conference (GMIC), major industry players set out their vision of the future.

By Xue Hua
In the world of tomorrow, robots and artificial intelligence (AI) will play an increasingly important role. At this year’s Global Mobile Internet Conference (GMIC), the movers and shakers of the industry took to the stage to set out their visions of the future of robotics and AI.
Industry players’ predictions on AI were particularly interesting. Microsoft outlined five AI concepts: artificial intelligence, collective intelligence, adaptive intelligence, invisible intelligence, and human-machine interfaces. The software giant sees a future where we will be able to leverage AI to mine, systematize, and adapt big data on human behavior to create cognitive capabilities, strengthening our own cognition and empowering people through deep learning. Microsoft touted the example of using AI to help a visually impaired engineer “hear” the world around him through a pair of special glasses. The device combines computer vision and natural language processing to “see” the outside world and describe it to the user in natural language.
The audience at GMIC saw demonstrations of a number of similarly revolutionary robotics and drone applications. SuitX, an affordable and accessible exoskeleton, was one such application. SuitX CEO Homayoon Kazerooni explained that the product was born of a desire to build an affordable, consumer-friendly tool that uses as little hardware as possible to help people such as paramedics, nurses, and high-risk assembly line workers.
SuitX exoskeleton
More lifelike robots suit scenarios involving direct interaction with humans. As the technology develops, AI will become more capable and intelligent, learning to live among us more naturally, which is one of the goals of combining AI with robotics.
David Hanson, CEO and founder of Hanson Robotics, gave a live demonstration of the humanoid Sophia, who told the audience in Mandarin: “I hope everyone has a nice day. I’m a robot but I look like a real person. I’ve just come to say hello. Goodbye.” Sophia incorporates a range of technologies that enable it to express human facial emotions, including flexible facial skin, components that allow it to mimic facial movements, voice recognition, and cameras that can track people’s movements.
Qualcomm also showed off a three-wheel robot that can navigate obstacles and return to its starting point, and is working on using smartphone technology in robots.
Another major theme at GMIC was commercial drones. 3D Robotics CEO Chris Anderson predicts that the commercial drone market will exceed US$20 billion within five years, overtaking the consumer market. Commercial drones will be used in agriculture, construction, insurance, and the energy sector to carry out tasks such as surveying and inspecting equipment. “This year the United States will approve the commercial use of drones without the need for a pilot’s license. This is good news. It shows that we have already entered the era of the commercial drone,” said Anderson.
Exciting innovations to come
Another hot topic at GMIC was visual recognition. Intelligent recognition technologies, visual recognition in particular, provide crucial underlying technology and platform support. Professor Tang Xiaoou from the Chinese University of Hong Kong and SenseTime CEO Xu Li both spoke at length about their predictions for computer vision.
Looking at the whole vision chain, they believe computer vision can be divided into three steps: imaging, perception, and understanding. Imaging is the capture of photographs, where techniques such as noise suppression help to better represent the content of an image. Perception means acquiring the content of images through sensors and turning it into perceptual input via algorithms. Understanding is what we usually call visual recognition, and includes face detection, face recognition, and the analysis of facial attributes to determine things like age and gender.
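The three steps above can be sketched as a toy pipeline. All function names and logic here are illustrative stand-ins for real imaging, perception, and recognition components, not anything SenseTime or the speakers described:

```python
# Toy sketch of the vision chain: imaging -> perception -> understanding.
# Images are flattened lists of pixel brightness values (0-255).

def imaging(raw_frames):
    """Suppress sensor noise by averaging several raw exposures of the same scene."""
    n = len(raw_frames)
    return [sum(pixels) / n for pixels in zip(*raw_frames)]

def perception(image):
    """Extract simple perceptual features (stand-ins for real sensor/algorithm output)."""
    return {
        "brightness": sum(image) / len(image),
        "contrast": max(image) - min(image),
    }

def understanding(features):
    """Map features to a coarse label -- a stand-in for face/attribute analysis."""
    return "bright scene" if features["brightness"] > 128 else "dark scene"

# Three noisy exposures of the same four-pixel scene:
frames = [[200, 190, 210, 60], [198, 192, 208, 62], [202, 188, 212, 58]]
image = imaging(frames)
label = understanding(perception(image))
print(label)  # -> bright scene
```

The point of the structure is that each stage only consumes the previous stage's output, which is what lets the steps in the chain be improved, or opened up, independently.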
Research on vision goes back many decades. Today, trends have moved on from specialized systems to being purely data driven. Once the whole vision chain has been completely opened up, it will be possible in the future to produce more powerful visual services and products that will quietly and steadily change our lives.
What this makes clear is that we’re currently in a stage of huge innovation, one that will gather momentum as nations enter the Augmented Innovation stage, enabled by broadband, data centers, cloud computing, big data, and the Internet of Things. This maturing digital infrastructure is crucial because tasks that seem normal to us are in fact extremely hard for conventional robots. For example, it took Pieter Abbeel’s team at UC Berkeley years to teach the robot BRETT to fold a pile of laundry, largely because each pile is different and there’s so much going on that can confuse a robot, like a crumpled sock. Even grasping an object requires a huge amount of processing and preprogrammed information, meaning that sophisticated systems like humanoid robots need very powerful computing capabilities.
But cloud robotics has the potential to make massive gains, with research in this area happening across the globe. Access to a cloud computing infrastructure would give robots the massive processing power and data they need to perform complex, compute-intensive tasks, and let them offload things like image processing and voice recognition. More excitingly, it would make downloading new skills feasible, much like sci-fi fans witnessed in The Matrix, where the abilities to do kung fu and fly a helicopter are simply downloaded. As nations move into the stage of Augmented Innovation through greater ICT maturity, such capabilities will no longer exist only in cool sci-fi movies: robots will really start making life better at the individual level, living up to what the name of Berkeley’s task-learning robot BRETT stands for: the Berkeley Robot for the Elimination of Tedious Tasks.
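The cloud-robotics pattern described above can be sketched in a few lines. The "cloud" here is simulated by a local dictionary of services, and every name and method is hypothetical; a real system would call a remote cluster over a network protocol such as HTTP or RPC:

```python
# Minimal sketch of cloud robotics: a thin on-board client offloads
# compute-heavy tasks to remote services and can "download" new skills.
# CLOUD_SERVICES is a stand-in for a real remote compute cluster.

CLOUD_SERVICES = {
    "image_processing": lambda img: f"objects detected in {img}",
    "voice_recognition": lambda audio: f"transcript of {audio}",
}

class CloudRobot:
    def __init__(self, services):
        self.services = services  # stand-in for a network connection to the cloud
        self.skills = {}          # capabilities fetched at runtime

    def offload(self, task, payload):
        """Send a heavy task to the cloud instead of computing it on-board."""
        return self.services[task](payload)

    def download_skill(self, name, skill_fn):
        """Acquire a new capability at runtime, Matrix-style."""
        self.skills[name] = skill_fn

robot = CloudRobot(CLOUD_SERVICES)
print(robot.offload("image_processing", "frame_001"))

robot.download_skill("fold_laundry", lambda item: f"{item} folded")
print(robot.skills["fold_laundry"]("shirt"))
```

The design choice the sketch illustrates is the one the article argues for: the robot's on-board hardware stays cheap and simple, while the expensive computation and the ever-growing library of skills live in shared infrastructure.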
The world of tomorrow is poised to be an exciting place, augmented by robot bodies and AI brains that will make life better in a hyper-connected world.