Bitmain's AI chip officially debuts, a stunning shot from the "mining machine" overlord

The long-rumored Bitmain artificial intelligence chip has finally been released.
At the AI WORLD 2017 World Artificial Intelligence Conference held yesterday, Bitmain CEO Zhan Ketuan announced the company's artificial intelligence brand SOPHON (Chinese name Suanfeng, said to mean "computing the universe and enriching cognitive intelligence") and unveiled the world's first tensor-computing acceleration chip, the BM1680, along with heavyweight products such as the SC1/SC1+ accelerator boards and the SS1 intelligent video analysis server. The brand name, borrowed from the science fiction novel "The Three-Body Problem", reflects Bitmain's ambition for its AI chips.
From mining machine leader to artificial intelligence

In Liu Cixin's novel "The Three-Body Problem", the Sophon is an intelligent, proton-scale supercomputer created by the Trisolarans to lock down Earth's science and technology. With artificial intelligence booming today, the meaning behind Bitmain naming its new AI brand SOPHON is self-evident.
Over the past two years, the sudden explosion of artificial intelligence has drawn many chip makers into the field in different forms, with Nvidia's GPUs and Xilinx's FPGAs the most representative examples. At the same time, a number of ASICs designed specifically for AI have been launched, and global competition over AI chips has turned white-hot. A consistent advocate of ASICs, Zhan Ketuan believes an ASIC is simpler to design than a CPU or GPU and better suited to implementing deep learning algorithms, which is why Bitmain chose ASICs to power its push into artificial intelligence. The SOPHON TPU chip BM1680 is the company's new attempt in this field.

[Figure: BM1680 chip architecture diagram]

According to the introduction, the BM1680 is a custom chip dedicated to accelerating tensor computation for deep learning, suitable for both inference and training of deep neural networks such as CNNs, RNNs, and DNNs. The chip consists of 64 NPUs, and a specially designed NPU scheduling engine provides high data throughput, feeding input data into the neuron processor cores. The BM1680 uses an improved systolic array structure. A single chip delivers 2 TFlops of single-precision compute, the 32MB of on-chip SRAM offers high bandwidth, and an external DDR4 memory interface lets a single chip support up to 16GB of DDR memory. These are the headline specifications of the BM1680.
Bitmain has also integrated its highly customized BMDNN Chip Link technology into the chip, providing a stable, flexible, low-latency link over high-speed SerDes that lets multiple BM1680 chips work together as a unified system with higher processing capability. That the BM1680 became a dark horse in the AI chip field so quickly after its debut is inseparable from Bitmain's background.
Founded in 2013, Bitmain focuses on the research and development of ultra-high-performance computing, and has successfully developed and mass-produced a number of custom ASIC chips and complete systems. Bitcoin mining machines are one of its most important products: according to reports, 80% or even 90% of the world's mining machines come from Bitmain. Over the past few years the company has kept iterating its mining ASICs to meet demand. Official figures describe the fifth-generation chip, the in-house BM1387, as the world's lowest-power, highest-performance compute acceleration chip, with production volume in the billions of units. It was during this development process that Bitmain accumulated its ASIC design and development experience, making the launch of AI chips a natural next step.
In addition to the chip, Bitmain also introduced two deep learning accelerator cards and an intelligent video analysis server, the Sophon SS1, to accelerate the adoption of artificial intelligence.
According to the official introduction, Bitmain offers two deep learning accelerator boards, the Sophon SC1 and SC1+. The SC1 carries a single high-performance BM1680 chip, while the SC1+ uses a dual-BM1680 cascade architecture in which the chips are interconnected over high-speed SerDes Chip Link, bringing a new acceleration experience to deep learning computing.
The SC1 and SC1+ share a similar architecture, and both connect to the host system over the PCIe bus. They deliver up to 2 TFlops / 4 TFlops (single precision) of compute per card, with up to 32MB of on-chip SRAM per chip; the large SRAM is suited to loading an entire neural network model on-chip. Each board also carries 16GB or 32GB of DDR4 memory, whose larger capacity is suited to storing bigger neural network models.

"On this card we can run classic networks such as GoogLeNet and VGG. With the card installed in a server, we can implement face detection, pedestrian detection, attribute analysis, face recognition, and more," Zhan Ketuan emphasized.
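As a rough illustration of why 32MB of on-chip SRAM can hold some classic networks entirely while larger ones must spill into the board's 16GB/32GB DDR4, the sketch below sizes the fp32 weights of GoogLeNet and VGG-16. The parameter counts (roughly 6.8M and 138M) are widely cited approximations assumed here, not figures from the article:

```python
def fp32_weight_mb(params_millions: float) -> float:
    """Approximate size of a model's fp32 weights in MB (4 bytes per parameter)."""
    return params_millions * 1e6 * 4 / (1024 ** 2)

googlenet_mb = fp32_weight_mb(6.8)   # ~26 MB: fits within the 32MB on-chip SRAM
vgg16_mb = fp32_weight_mb(138)       # ~526 MB: must live in the board's DDR4

print(f"GoogLeNet ~ {googlenet_mb:.0f} MB, VGG-16 ~ {vgg16_mb:.0f} MB")
```

This back-of-the-envelope arithmetic matches the article's framing: the SRAM is for networks that fit on-chip, the DDR4 for larger ones.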

The SOPHON SS1 is another new product from Bitmain: a deep learning server built on the latest SOPHON SC1/SC1+ accelerator cards and a deep understanding of image recognition algorithms, designed to provide powerful deep learning acceleration for application scenarios such as video surveillance and internet image processing.
Bitmain revealed that the SOPHON SS1 provides a complete deep learning solution for video and image recognition. The core of the system is a pair of SOPHON SC1 (or SC1+) accelerator cards connected to the application system over PCIe. The application system runs on an x86 CPU, which handles boot, storage management, and coordination of the deep learning SDK. The entire SS1 is condensed into a 4-rack-unit (4U) chassis integrating power, cooling, networking, multi-system interconnect, and the file system. Customers can carry out rapid secondary development or system integration on this basis, making the deep learning system as easy to use as possible.

The server ships with demos including face/body detection and structured video analysis of pedestrians, motor vehicles, and non-motorized vehicles, showcasing industry-oriented capabilities and solutions for video analytics and security, such as person detection and vehicle detection. Bitmain says it will also iterate quickly on its structured-video-analysis APIs.
It is worth mentioning that SOPHON has full-stack software and hardware capabilities, with toolchains at every level: hardware, drivers, the instruction set, a linear-algebra acceleration math library, a runtime library, the BM Deploy inference deployment tool, an FFT acceleration library, and deep learning frameworks (Caffe, Darknet, TensorFlow, MXNet, etc.). This enables genuine hardware-software co-design and integrated optimization, extracting the best performance from deep learning applications on the hardware.

According to Bitmain's roadmap, its second-generation chip, the BM1682, will be released next month, also on a 16-nanometer process, with power consumption of about 30 watts and about 3 TFlops of compute. The third-generation chip will follow next September on a 12-nanometer process, with power still around 30 watts and 6 TFlops of compute. "For that chip we will also support more data precisions, adding 16-bit and 8-bit," Zhan Ketuan added.

From mining chips to AI chips

Moving from mining chips into an AI chip market already crowded with players, Zhan Ketuan says the decision rested on the following considerations. In Bitmain's view, after its success in Bitcoin the company felt like it was holding a hammer, with nails everywhere it looked. In the search for nails, it found deep learning to be a nail very well suited to its hammer, and so set out to build a deep learning compute chip.
On chip development and architecture choices, Bitmain believes the biggest challenge in deep learning computing is still power consumption, both of individual chips and of large-scale clusters; another major challenge is the memory wall. The company is clear that deep learning is fundamentally multi-dimensional matrix computation, and argues that cloud deep learning should move toward tensor processing.

After weighing how to perform various computations on multi-dimensional matrices, and tensor computations in particular, they concluded that the traditional CPU architecture was no longer suitable, and that high-performance chips for cloud deep learning would need to move gradually toward a tensor architecture. This is why they chose a systolic array architecture for their chips: it implements multi-dimensional data movement and compute-task scheduling in hardware, achieves very high performance, and is well suited to acceleration in the cloud. It is worth noting that this is the same architectural approach as Google's TPU.
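A systolic array of the kind described is easiest to see in a small simulation. The sketch below is an illustrative model of the general technique, not Bitmain's actual design: it simulates an output-stationary array computing C = A x B, where input operands are skewed so that A[i, s] and B[s, j] meet at processing element (i, j) on cycle t = i + j + s and the product is accumulated in place:

```python
import numpy as np

def systolic_matmul(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Cycle-by-cycle simulation of an output-stationary systolic array.

    PE (i, j) holds accumulator C[i, j]. Rows of A flow in from the left and
    columns of B from the top, each skewed by one cycle per row/column, so
    the operands of the s-th partial product meet at PE (i, j) on cycle
    t = i + j + s.
    """
    n, k = A.shape
    k2, m = B.shape
    assert k == k2, "inner dimensions must match"
    C = np.zeros((n, m))
    total_cycles = n + m + k - 2  # last operand drains after the full skew
    for t in range(total_cycles):
        for i in range(n):
            for j in range(m):
                s = t - i - j  # which partial product arrives at PE (i, j) now
                if 0 <= s < k:
                    C[i, j] += A[i, s] * B[s, j]
    return C
```

The point of the structure is that data moves only between neighboring PEs each cycle, so the hardware handles operand movement and scheduling without going back to memory for every multiply, which is what makes the approach attractive for dense tensor workloads.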
Having shared his views on cloud chips, Zhan Ketuan also spoke about deep learning on end devices, which he believes is even harder: such architectures are constrained by single-chip power budgets and cannot be made too large. Generally such a chip can hardly exceed 10 watts, so designing this kind of architecture is a very challenging task.
Finally, Zhan Ketuan pointed out that Bitmain's mission in deep learning and AI is the same as in digital currency: through continuous iteration of chips and products, bit by bit and generation after generation, to make its products as good as they can be and to serve the users and applications that need deep learning acceleration.
He also stressed that the AI industry should cooperate: in today's era, cooperation in business matters more than competition. In artificial intelligence especially there is endless virgin territory waiting to be developed; the industry should work together to grow the pie, cooperate more deeply with partners, including through open source, and gradually build an ecosystem.
By Li Shoupeng, Semiconductor Industry Observer
