Read this to find out about:
- The types of generic AI-based applications for which off-the-shelf solutions exist today
- How NXP enables local speech recognition with no cloud connection in its comprehensive reference design solutions
- Two low-power implementations of people detection and counting which run on Lattice FPGAs
The conditions are ripe for embedded developers to create their own Artificial Intelligence (AI) applications:
- Component technology, even a 32-bit microcontroller, now supports neural network inferencing at the edge. Tools introduced recently by component manufacturers provide for efficient targeting of trained machine learning models to their hardware.
- A broad range of model training frameworks is available for embedded developers to use
- Third parties can provide large, labelled generic data sets such as image collections, or OEMs can use tools and hardware to collect and curate their own custom data sets
But in such a new and complex field, it is likely that OEMs’ engineering teams will need to undergo an intensive process of education before building an AI-based product. Figure 1 shows the many elements in the process of developing an AI application. A full-custom AI development project starting today could be expected to take a minimum of two years before a finished product gets to market. This time could be considerably longer for an OEM with no previous experience in machine learning techniques and technologies.
Microprocessor, microcontroller and FPGA manufacturers are now introducing sophisticated toolchains to support the development of inference engines and their compilation on their products, in an effort to ease and accelerate the AI development process.
Fig. 1: The process for developing a new machine learning application to run on embedded hardware. (Image credit: NXP Semiconductors)
But in fact it is possible to embed machine learning capability in a new production-ready design within weeks rather than years, provided your product needs to perform one of a small number of common, generic AI functions.
This is possible because semiconductor suppliers have recognized that many OEMs share a common requirement for AI-enabled applications such as speech recognition, image recognition, and people detection and counting. They have responded to this need by providing ready-made, off-the-shelf reference designs for these applications. As we shall see, some of these designs are production-ready systems that can be dropped into existing product designs with little or no modification.
Machines which hear speech – a hit with consumers
The adoption of technologies such as Amazon’s Alexa Voice Service, Apple’s Siri® voice recognition software and the Google Assistant™ virtual personal assistant shows that consumers are comfortable with speaking their commands to a machine. Speech recognition is a classic field for AI, since it involves distinguishing common patterns of sound that are masked by numerous variations in the pitch and volume of the voice, accent, and enunciation, while filtering out extraneous audible noise.
The conventional development pathway for this application would involve the collation and curation of a large set of voice samples, and then using it to train, validate and test a bespoke learning model.
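As a rough illustration of the train, validate and test step in that conventional pathway, the sketch below shuffles a list of samples and divides it into three non-overlapping subsets. The function name `split_dataset` and the 70/15/15 split are assumptions chosen for the example and are not taken from NXP or from any particular training framework.

```python
import random


def split_dataset(samples, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle samples and split them into train, validation and test lists.

    The fractions are illustrative defaults (an assumption), not a
    recommendation from any vendor or framework.
    """
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave a non-empty share for testing")

    shuffled = list(samples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains is held out for testing
    return train, val, test


if __name__ == "__main__":
    # Hypothetical stand-ins for recorded voice samples.
    clips = [f"clip_{i:03d}.wav" for i in range(100)]
    train, val, test = split_dataset(clips)
    print(len(train), len(val), len(test))
```

Keeping the test subset untouched until the end is what allows it to give an unbiased view of how the trained model behaves on voices it has never heard.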
It would be much easier and quicker to embed a speech-recognition system already developed by a third party. This is exactly what NXP Semiconductors enables with its speech-recognition reference design, the SLN-LOCAL-IOT, which is featured in this issue and shown in Figure 2. NXP also provides a similar system, the SLN-ALEXA-IOT, for implementing Amazon’s Alexa Voice Service technology. Each reference design consists of a production-ready i.MX Voice Solution Board, backed by software for audio signal capture and processing, and for speech recognition, all running on a low-cost i.MX RT1060 family crossover microcontroller.
It enables OEMs to easily and cheaply add local voice control to any end product, with no connection to the internet required. With this NXP reference design, OEMs can quickly add voice controls to home thermostats, washing machines, fridge-freezers, light switches and many other types of device. NXP will support the implementation of custom wake words and commands.
Fig. 2: NXP’s i.MX RT106x Voice Solution Board
The i.MX Voice Solution Board itself is small, and because it requires no SRAM, eMMC storage or Power Management IC (PMIC), it also has a reasonable bill-of-materials cost. According to NXP, the cost is some $10 lower than that of a typical speech recognition system based on an applications processor.
People detection is a different application for machine learning but, like voice control, it requires the recognition of a common pattern: the image of a human body in countless variations. Like NXP, Lattice Semiconductor has succeeded in implementing a complex AI application on a highly constrained piece of hardware: a small, ultra-low power iCE40 FPGA.
Lattice provides the reference design as a complete hardware/software kit. The hardware platform is a Himax HM01B0 UPduino Shield, as shown in Figure 3. It is based on the UPduino 2.0 board, a rapid prototyping development board in the Arduino form factor offering the performance and I/O capabilities of the iCE40 UltraPlus FPGA: 5,280 Look-Up Tables (LUTs), 1Mbit of embedded memory, 120kbits of block RAM and eight multiply-accumulate blocks. It also includes the Himax HM01B0 low-power image sensor module and two I2S microphones, supporting AI applications that use either visual or audio inputs or both.
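To give a feel for how constrained this hardware is, the short calculation below estimates how many neural network weights could be held in the 1Mbit of embedded memory quoted above. It is a back-of-envelope sketch only: it assumes that 1Mbit means 1,000,000 bits and that the whole memory is available for weights, which overstates the real budget because intermediate results also need storage. The function name and the weight precisions are illustrative assumptions, not Lattice specifications.

```python
# Assumption: the "1Mbit" quoted in the text is taken as 10**6 bits.
EMBEDDED_MEMORY_BITS = 1_000_000


def weights_that_fit(memory_bits, bits_per_weight):
    """Return how many weights of a given precision fit in the memory."""
    if bits_per_weight <= 0:
        raise ValueError("bits_per_weight must be positive")
    return memory_bits // bits_per_weight


if __name__ == "__main__":
    # Hypothetical weight precisions, from full 16-bit down to binary.
    for bits in (16, 8, 1):
        count = weights_that_fit(EMBEDDED_MEMORY_BITS, bits)
        print(f"{bits:>2}-bit weights: about {count:,}")
```

Under these assumptions the device has room for on the order of a hundred thousand 8-bit weights, which is why compact, low-precision networks suit an FPGA of this size.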
The reference designs are fully supported in the latest version 2.0 of Lattice’s SensAI™ development environment. SensAI provides project files and documentation for human presence detection using Compact Convolutional Neural Network (CNN) IP for the Lattice FPGA.
The performance of the iCE40-based people detection application is impressive, especially given that it consumes as little as 1mW of power when sampling at a frequency of one or two frames per second. It can detect a person as far as 5m away from the camera, and even if the person’s image occupies as little as 10% of the total frame area.
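The 1mW figure can be put in context with some simple arithmetic. The sketch below converts an average power draw into energy per frame and into an idealised battery runtime. The 225mAh, 3V battery is a hypothetical example chosen for illustration, and the calculation ignores every other consumer in the system as well as battery self-discharge and regulator losses, so real-world runtime would be shorter.

```python
def energy_per_frame_mj(power_mw, frames_per_second):
    """Average energy spent per frame, in millijoules (mW divided by frames/s)."""
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    return power_mw / frames_per_second


def runtime_hours(battery_mah, battery_volts, power_mw):
    """Idealised runtime: battery energy in mWh divided by the average draw in mW."""
    if power_mw <= 0:
        raise ValueError("power_mw must be positive")
    return (battery_mah * battery_volts) / power_mw


if __name__ == "__main__":
    power_mw = 1.0  # figure quoted in the text for one or two frames per second
    for fps in (1.0, 2.0):
        print(f"{fps:.0f} fps: {energy_per_frame_mj(power_mw, fps):.2f} mJ per frame")

    # Hypothetical 225mAh, 3V battery (an assumption, not from the article).
    hours = runtime_hours(225, 3.0, power_mw)
    print(f"Idealised runtime: {hours:.0f} hours ({hours / 24:.0f} days)")
```

On these assumptions, a 1mW average draw corresponds to 1mJ per frame at one frame per second, and to several weeks of operation from a small battery.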
Helpfully, Lattice supplies with the reference design software its training data set and the input files that it uploaded to the model training framework. This means that the reference design can be used not only as an off-the-shelf solution for people detection, but as the basis for an OEM’s own, custom people detection system: developers can take the Lattice data set and run their own model training process to change the speed, accuracy, range or hardware footprint of the inference engine in the iCE40.
Lattice supplies the same kind of production-ready hardware and software for people counting, an application that it runs on its larger ECP5-85 FPGA. This FPGA offers much greater hardware capabilities than the iCE40, with 85,000 LUTs and 3.7Mbits of block RAM. The people-counting reference design is hosted on Lattice’s Video Interface Platform, a system which consumes less than 1W and which provides multiple video interfaces such as MIPI CSI-2, eDP, HDMI, GigE Vision and USB 3.0.
Fig. 3: The Himax HM01B0 UPduino Shield hosts Lattice’s people detection AI application
Lattice’s people counting application can detect and count multiple people in a frame. It can detect the image of a body as small as six pixels, and can detect people as far as 8m away from the camera at various orientations.
As with the people detection application on the iCE40, this people counting application is a production-ready design, supplied with the training data set and the input files to the machine learning framework.
A growing range of ready-made solutions
The NXP i.MX RT voice control reference design could be of interest to manufacturers of home appliances, home automation equipment, consumer electronics devices such as set-top boxes and wireless access points, lighting equipment and many other device types.
Likewise, the people detection and counting applications from Lattice could be useful in building automation and control systems, access control, and security and surveillance.
But these are not the only AI designs that can be applied broadly, and electronics manufacturers can expect to see the emergence of more ready-made implementations of machine learning technology.
For example, demonstrations provided by Lattice for its iCE40 and ECP5 FPGAs include applications for hand gesture recognition, face detection, face tracking, and speed sign detection. And NXP has released a reference design for face recognition in end products such as home appliances, the SLN-VIZN-IOT, which is featured in this issue of FTM.
Running on an i.MX RT1060 family crossover microcontroller, the SLN-VIZN-IOT offers an inference time of less than 750ms and can recognise more than ten different users’ faces. It is supplied with production-grade face recognition algorithms.
Fastest route to AI implementation
While much of the literature about AI in the embedded world shows the developer how to master the complex process of acquiring training data sets, training a model and implementing the model in an inference engine, some OEMs might choose to completely bypass the long AI development workflow and take advantage of the designs that NXP, Lattice and others have already developed.
The availability of these reference designs is a reminder that the implementation of AI does not have to be difficult, risky or time-consuming.