Mipsology’s Zebra looks like a winner

Mipsology is a 5-year-old company, based in France and California, with a differentiated product that solves a real problem for some customers. The company's product, Zebra, is a deep learning compute engine for neural network inference. While such engines are not uncommon, Zebra unlocks a potentially important platform for inference: the field-programmable gate array (FPGA). There are two parts to this story, and having to tell both at once is one of the challenges Mipsology faces.

Inference — the phase where deep learning goes to work

Deep learning has two phases: training and inference. In training, the engine learns to do the task for which it is designed. In inference, the operational half of deep learning, the engine performs the task, such as identifying a picture or detecting a computer threat or fraudulent transaction. The training phase can be expensive, but once the engine is trained it performs the inference operation many times, so optimizing inference is critical for containing costs in using deep learning. Inference can be performed in the cloud, in data centers or at the edge. The edge, however, is where there is the greatest growth because the edge is where data is gathered, and the sooner that data can be analyzed and acted upon, the lower the cost in data transmission and storage.
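The two phases can be made concrete with a minimal sketch (illustrative only, not Mipsology's or anyone's production code): training iteratively adjusts a model's parameters, while inference applies the frozen, trained parameters over and over, which is why inference cost dominates in deployment.

```python
import numpy as np

# Toy model: learn y = 2x with a single linear weight.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 2.0 * x                              # ground-truth labels

# --- Training phase: repeatedly adjust the weight to reduce error ---
w = 0.0
for _ in range(200):
    pred = w * x
    grad = 2 * np.mean((pred - y) * x)   # gradient of mean squared error
    w -= 0.1 * grad                      # gradient-descent update

# --- Inference phase: the trained weight is applied cheaply, many times ---
def infer(new_x):
    return w * new_x

print(round(infer(3.0), 2))              # close to 6.0 once trained
```

Training here runs 200 update steps once; inference is a single multiply that a deployed system would execute millions of times, which is the economic argument for optimizing it.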

Specialized AI chips are hot, but the mature FPGA is a player too

For both training and inference, specialized processors are emerging that reduce the cost of using deep learning. The most popular deep learning processor is the graphics processing unit (GPU), principally Nvidia’s GPUs. GPUs rose to prominence because Nvidia, seeing the computational potential of its video cards, created a software platform, CUDA, that made it easy for developers and data scientists to use the company’s GPUs in deep learning applications. The GPU is better suited to training than inference, but Nvidia has been enhancing its GPUs’ inference capabilities. Other specialized processors for deep learning inference include Google’s Tensor Processing Unit (TPU) and FPGAs.

FPGAs have been around since the 1980s. They are chips that can be programmed so the desired tasks are implemented in electronic logic, allowing very efficient repetitive execution, which is ideal for some deep learning inference tasks. Mipsology lists several advantages of FPGAs over GPUs for inference, including a lower cost of implementation, a lower cost of ownership and greater durability. While FPGAs have been used in some implementations, including on Microsoft’s Azure platform, these chips have not received the attention that GPUs have.    

Zebra is where inference meets FPGAs

Mipsology's Zebra compute engine makes it easy for deep learning developers to use FPGAs for inference. Zebra is a software package that provides the interface between the deep learning application and the FPGA, so that specialized FPGA developers do not have to be brought in to exploit the benefits of the processors. Zebra is analogous to Nvidia's CUDA software; it removes a barrier to implementation.

Bringing together the puzzle pieces

FPGAs are mature and powerful potential solutions that lower the cost of inference, a key to expanding the role of deep learning. However, the programming of FPGAs is often a barrier to their adoption. Zebra is an enabling technology that lowers that barrier. In the world of specialized solutions based on broadly applicable technologies such as deep learning, there are opportunities for products and services to make it easier to assemble the pieces and lower the cost of development. Zebra is exploiting one of these opportunities.

AI chips: Explosive growth of deep learning is leading to rapid evolution of diverse, dedicated processors

Artificial intelligence (AI) utilization has been accelerating rapidly for more than 10 years, as decreases in memory, storage and computation cost have made an increasing number of applications cost-effective. The technique of deep learning has emerged as the most useful. Large public websites such as Facebook (Nasdaq: FB) and Amazon (Nasdaq: AMZN), with enormous stores of data on user behavior and a clear benefit from influencing user behavior, were among the earliest adopters and continue to expand such techniques. Publicly visible applications include speech recognition, natural language processing and image recognition. Other high-value applications include network threat detection, credit fraud detection and pharmaceutical research.

Deep learning techniques are based on neural networks, inspired by animal brain structure. Neural networks perform successive computations on large amounts of data. Each iteration operates on the results of the prior computation, which is why the process is called "deep." Deep learning relies on large amounts of computation. The techniques themselves are well established; the recent growth is driven by decreasing costs of data acquisition, data transmission, data storage and computation. The new processors all aim to lower the cost of computation.
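The "each iteration operates on the results of the prior computation" point can be sketched in a few lines (a toy illustration with random placeholder weights, not a real trained network): each layer's output becomes the next layer's input.

```python
import numpy as np

# Three stacked layers; the weights are random placeholders for illustration.
rng = np.random.default_rng(42)
layers = [rng.standard_normal((4, 4)) * 0.5 for _ in range(3)]

def forward(x):
    for W in layers:                 # successive computations...
        x = np.maximum(W @ x, 0.0)   # ...each on the prior layer's output
    return x                         # (matrix multiply + ReLU nonlinearity)

out = forward(np.ones(4))
print(out.shape)
```

Stacking more layers makes the network "deeper"; every added layer is another pass of the same kind of tensor arithmetic.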

The new chips are less costly than CPUs for running deep learning workloads

Each individual computation is simple and tends to require relatively low precision, needing fewer bits than typical CPU operations use. Deep learning computations are mostly tensor operations, predominantly matrix multiplication, and parallel tensor processing is the heart of many specialized AI chips. Traditional CPUs are relatively inefficient at this kind of work: they cannot process many operations at the same time, and they deliver precision and capacity for complex computations that deep learning does not need.
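The low-precision point can be demonstrated with a short sketch (illustrative; the scale factor and matrix sizes are arbitrary choices, not any vendor's scheme): quantize float32 matrices to 8-bit integers, as many inference accelerators do, multiply them, and compare against the full-precision result.

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.uniform(-1, 1, (8, 8)).astype(np.float32)
b = rng.uniform(-1, 1, (8, 8)).astype(np.float32)

# Quantize to 8-bit integers with a simple symmetric scale (illustrative).
scale = 127.0
a8 = np.round(a * scale).astype(np.int8)
b8 = np.round(b * scale).astype(np.int8)

# Multiply in integer arithmetic (accumulate in int32 to avoid overflow),
# then rescale back to floating point.
approx = (a8.astype(np.int32) @ b8.astype(np.int32)) / (scale * scale)
exact = a @ b

err = np.max(np.abs(approx - exact))
print(err < 0.1)   # 8-bit result stays close to the float32 result
```

The 8-bit version needs a quarter of the memory bandwidth of float32 and maps onto much simpler arithmetic units, which is exactly the trade-off AI chips, including FPGAs, exploit for inference.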

Nvidia (Nasdaq: NVDA) GPUs led the wave of new processors. In 2012, Google announced that its Google Brain deep learning project to recognize images of cats was powered by Nvidia GPUs, resulting in a hundredfold improvement in performance over conventional CPUs. With this kind of endorsement and with the widespread acceptance of the importance of deep learning, many companies, large and small, are following the money and investing in new types of processors. It is not certain that the GPU will be a long-term winner; successful applications of FPGAs and TPUs are plentiful.