With the impending confluence of 5G, AI, and AR/VR, text, voice, images, and video streams will act as seemingly magical channels that connect the physical and digital worlds. But the barriers to entry for AI are high – as is its cost.
In Making Up the Mind, the neuroscientist Chris Frith describes how our perception of the world is not direct, but instead relies on "unconscious reasoning". Before we can perceive an object, the brain must infer what the object is based on the information that reaches our senses. This inference underpins one of humans' most important abilities – the ability to predict and handle unexpected events. In this process, sight is the fastest and most accurate channel for acquiring information – we capture 80 percent of our information about the outside world through our eyes. Computer vision – or vision AI – is to AI what sight is to humans.
We want to harness computer vision as a window into a currently unknown world, and thus provide industry equipment with intelligent eyes. Equipment will be able to simultaneously capture details, such as color, light, shape, and distance; understand and predict; and also self-learn to better understand this "unknown world". In turn, people will be freed from vast amounts of repetitive labor as machines do more work for us.
Vision AI is bringing intelligence to a wealth of industries, including the automotive, public safety, entertainment, logistics, healthcare, and transportation domains, transforming the traditional industry landscape while creating new, large-scale use cases. These include drones, autonomous driving, facial recognition gates, AR/VR, sorting logistics parcels, diagnosing cancer from pathology slides, and smart cities, among other "intelligent twins".
However, in the face of the massive demand for new vision AI use cases, we can't rely on algorithms alone.
Suppose we wanted to develop an intelligent surveillance camera that could be used in, for example, security check areas or shopping malls, or to read car license plates. This kind of camera would eliminate a huge amount of simple, repetitive manual labor and achieve massive cost savings. Naturally, it would require an intelligent system with a level of capability approaching human intelligence to, say, recognize license plates and rapidly respond to the inferred results.
For vendors, the biggest concern is developing the product to get the greatest value at the lowest possible cost. In terms of cost, this would include the cost of developing the intelligent system, the cost of the hardware and software that support the system, and maintenance and management costs. Vendors would also need to focus on data security.
If developers want to deploy models to edge devices, they also need to take into account equipment specifications, including specifications for chips (CPUs/GPUs), memory, network bandwidth, and stability. It’s extremely challenging to efficiently develop high-quality inference cameras on chips with extremely limited performance. First, development involves the storage and pre-processing of raw data collected by end devices. If massive amounts of data uploaded to the cloud consume a lot of bandwidth, service responses suffer serious latency.
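To illustrate the bandwidth pressure described above, a rough back-of-the-envelope comparison helps: streaming raw 1080p video to the cloud versus uploading only on-device inference results. All figures below are illustrative assumptions, not HiLens specifications.

```python
# Illustrative bandwidth comparison: raw video upload vs. edge inference.
# All numbers are assumptions for the sake of the sketch, not device specs.

FRAME_W, FRAME_H, BYTES_PER_PIXEL = 1920, 1080, 3  # raw 1080p RGB frames
FPS = 25

# Uploading raw frames to the cloud:
raw_bytes_per_sec = FRAME_W * FRAME_H * BYTES_PER_PIXEL * FPS

# Uploading only inference results (e.g. a few detected plates as JSON):
result_bytes_per_frame = 512  # assumed size of one JSON result
result_bytes_per_sec = result_bytes_per_frame * FPS

ratio = raw_bytes_per_sec / result_bytes_per_sec
print(f"raw video:    {raw_bytes_per_sec / 1e6:.0f} MB/s")
print(f"results only: {result_bytes_per_sec / 1e3:.1f} KB/s")
print(f"edge inference cuts upstream traffic by roughly {ratio:,.0f}x")
```

Even with generous compression of the raw stream, the gap remains several orders of magnitude, which is why running inference on the device and uploading only results is the natural architecture for such cameras.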
Second, model training not only involves algorithms and strong computing power supported by hardware, but also the development of underlying components that support the deep learning framework and that handle system platform compatibility issues, which are both very complex.
Third, it requires a focus on technology as well as services. For example, how can we manage and maintain massive amounts of equipment and skills for using this equipment? How should we deal with individual privacy and security issues?
This applies to AI application development in various industry scenarios. For individuals, groups, or enterprises, developing an intelligent system is expensive and has high barriers to entry. This has become the main challenge for quickly developing and implementing vision AI applications.
Huawei’s AI, IoT, and engineering capabilities have culminated in HiLens – a complete and reliable one-stop vision AI application development, deployment, and managed service platform for developers, enterprises, and hardware manufacturers. HiLens focuses on solving current development problems, offering a variety of development methods that integrate devices, chips, and the cloud, as well as managed services with device-cloud synergy for vision hardware.
It reduces the entry barriers to development and helps users implement customized vision AI application development in multiple fields and scenarios, including shopping malls, campuses, homes, connected vehicles, transportation, healthcare, and construction sites.
Huawei has also released the HiLens camera with AI inference capabilities, which functions like a "smart eye" and is designed to let developers build vision applications that can be deployed on devices and the cloud. The HiLens visual device integrates the Ascend 310 chip, which can process 100 frames per second and detect faces in milliseconds. In addition, the built-in lightweight containers minimize the use of resources and network bandwidth, and can be quickly downloaded and started. Skills generated on the HiLens development platform can be deployed directly to the HiLens visual device, making testing much easier.
The HiLens platform offers a Skill Market that provides a range of preset skills for developers and enterprises, covering scenarios such as connected vehicles and construction sites, and applications like object detection, motion recognition, and speech recognition. Developers can also release the skills they’ve developed in the Skill Market for others to buy and use.
Huawei HiLens is a customizable and efficient vision application development platform. In combination with the powerful HiLens camera, it will lower entry barriers to development through coordinated device-cloud management and integrated software and hardware development solutions, and connect upstream and downstream enterprises through the HiLens Skill Market.
As a general-purpose computer vision AI development platform, HiLens also incorporates Huawei Cloud's other services, which are fully geared toward the developer experience in terms of ease of use, reliability, development efficiency, privacy, hardware adaptation, system compatibility, and personalized services. With HiLens, developers can maximize development efficiency, model accuracy, and hardware management efficiency.
Developers only need to take four simple steps to customize a model that meets the needs of a specific scenario.
The customized model can be deployed to any registered device in as little as 10 minutes. Developers can later upgrade or uninstall the skills of devices with one click via the HiLens Skill management platform.
For model development, HiLens adopts a development approach that integrates chips, devices, and the cloud. On the cloud side, ModelArts provides comprehensive AI computing infrastructure – from the underlying hardware to a development environment that supports multiple languages and deep learning frameworks. As a result, developers can use any development language and deep learning framework during development.
On the device side, HiLens provides a unified set of Skill Framework Python APIs that support models built in different deep learning frameworks, for example, TensorFlow, MXNet, and Caffe.
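The actual HiLens SDK is not reproduced here; as a rough illustration of the capture → preprocess → infer → postprocess loop that such a skill framework wraps, the following pure-Python sketch uses hypothetical class and method names of our own (Skill, run_once, etc.), not real HiLens APIs.

```python
# A pure-Python sketch of the preprocess -> infer -> postprocess pipeline
# that a device-side skill framework typically wraps. All class and method
# names are hypothetical illustrations, not the actual HiLens APIs.

class Skill:
    """Base class: subclasses plug in a framework-specific model."""

    def preprocess(self, frame):
        # e.g. resize/normalize the frame to the model's input shape
        return frame

    def infer(self, tensor):
        raise NotImplementedError  # provided by the concrete skill

    def postprocess(self, output):
        # e.g. turn raw model output into labels or bounding boxes
        return output

    def run_once(self, frame):
        return self.postprocess(self.infer(self.preprocess(frame)))


class PlateDetectionSkill(Skill):
    """Toy example: 'detects' a plate string embedded in the frame dict."""

    def infer(self, tensor):
        return tensor.get("plate", None)

    def postprocess(self, output):
        return {"plate": output, "found": output is not None}


skill = PlateDetectionSkill()
print(skill.run_once({"plate": "ABC-123"}))
```

The point of such an abstraction is that the framework-specific model (TensorFlow, MXNet, Caffe) only appears inside `infer`, so the surrounding capture and output logic stays identical across frameworks.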
HiLens also encapsulates an underlying system platform adaptation layer and automatically adapts to the underlying chip through the HiAI Engine. This enables seamless integration with the device, so that developers don't have to worry about underlying hardware adaptation or system platform compatibility. Development is thus easier and more efficient.
For model training and inference, Huawei HiLens uses a unique inference model that can be deployed on devices and the cloud. The model performs preliminary data processing and inference using the deployment environment and raw data unique to each device. On the cloud side, it carries out online training based on the personalized needs of the device. This enables automated online learning, updates the device, and boosts model accuracy, making the camera smarter.
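One common way to realize such a device-cloud loop is confidence-based hard-example mining: the device acts immediately on high-confidence inferences and forwards only low-confidence samples to the cloud for retraining. The sketch below illustrates that routing under our own assumptions (threshold, sample format); it is not a description of HiLens internals.

```python
# Sketch of confidence-based hard-example routing for device-cloud training.
# Threshold and sample data are illustrative assumptions, not HiLens internals.

CONFIDENCE_THRESHOLD = 0.8  # assumed cutoff for "the device is sure"

def route_samples(inferences):
    """Split device inferences into local results and cloud-bound samples."""
    local_results, cloud_upload = [], []
    for sample in inferences:
        if sample["confidence"] >= CONFIDENCE_THRESHOLD:
            local_results.append(sample)   # act on it immediately on-device
        else:
            cloud_upload.append(sample)    # hard example: retrain on it
    return local_results, cloud_upload

inferences = [
    {"label": "plate", "confidence": 0.97},
    {"label": "plate", "confidence": 0.42},  # ambiguous: send to cloud
    {"label": "face",  "confidence": 0.88},
]
local, upload = route_samples(inferences)
print(f"{len(local)} handled on device, {len(upload)} sent for retraining")
```

Because only the ambiguous samples travel upstream, the cloud sees exactly the data most likely to improve the model, and the updated model pushed back to the device closes the loop.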
HiLens also provides communications components with persistent and reliable device-cloud connections. These give the device decision-making capabilities in scenarios where connections to the cloud are intermittent or absent, so that it can resynchronize once the network connection is restored.
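The disconnection-tolerant behavior described above can be sketched as a store-and-forward buffer: decisions are made locally, queued while the cloud is unreachable, and flushed when the connection returns. The class below uses hypothetical names of our own and is not the actual HiLens communications component.

```python
# Store-and-forward sketch: the device keeps deciding while offline and
# resynchronizes queued results when the cloud link returns. Hypothetical
# names; this is not the actual HiLens communications component.

from collections import deque

class DeviceLink:
    def __init__(self):
        self.connected = False
        self.pending = deque()   # results awaiting upload
        self.synced = []         # results the cloud has acknowledged

    def report(self, result):
        """Record a local decision; upload now or buffer for later."""
        if self.connected:
            self.synced.append(result)
        else:
            self.pending.append(result)

    def reconnect(self):
        """Connection restored: flush everything buffered while offline."""
        self.connected = True
        while self.pending:
            self.synced.append(self.pending.popleft())

link = DeviceLink()
link.report("plate ABC-123 admitted")   # offline: buffered locally
link.report("plate XYZ-789 rejected")   # offline: buffered locally
link.reconnect()                        # back online: buffer is flushed
print(len(link.synced))                 # both results reach the cloud
```

The key property is that `report` never blocks on the network: the camera keeps admitting or rejecting plates at full speed regardless of connectivity, and the cloud eventually receives a complete record.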
The device side supports the HiAI mobile computing architecture with a built-in dedicated NPU hardware acceleration unit, which offers about 50 times more energy efficiency and 25 times better performance than CPU processing. This means the open source development board can deliver strong AI computing power, hardware acceleration support, and powerful performance.
The device side is pre-installed with a highly efficient, high-performance, and optimized inference engine for deep learning, delivering an extra performance boost for model training and inference.
The future belongs to intelligence. Huawei will continue to explore AI and foster a robust AI ecosystem. Huawei hopes that HiLens will spark innovation and inspire developers to develop imaginative skills that provide whole new experiences to help shape a fully connected, intelligent world.