Quadric Accelerator Takes On AI, Computer Vision

Silicon Valley startup Quadric has built an accelerator designed to speed up both AI and standard computer vision workloads for edge devices such as robots, factory automation and medical imaging. The company's hardware architecture is a novel hybrid.

Vineyard robots

Quadric's three co-founders, Veerbhan Kheterpal, Daniel Firu and Nigel Drego, previously founded 21, a bitcoin mining company that was sold to Coinbase. Quadric, based in Burlingame, Calif., did not start out designing chips. Instead, it originally built agricultural robots that would travel up and down Napa Valley vineyards, inspecting the vines and sending alerts when they spotted irrigation leaks or pests.

"When we were building it, we realized that it wasn't going to be a viable product built from the drone supply chain at $5,000 to $10,000," said Kheterpal. "It would have to be built from the tractor supply chain, costing $50,000, and carry large PCs with GPUs on it, with tons of cameras. That's when we looked under the hood of all that robotics software and determined what was fundamentally causing this power demand on platforms like Nvidia and Intel."

The company pivoted to building an accelerator chip: "the chip we wished we had," according to Firu.

A seed funding round in 2022 was followed by a Series A round that raised $13 million from prospective customers including Quadric's lead investor, Japanese automotive Tier 1 supplier Denso. Quadric's total funding is $18 million.

Turing complete

Quadric employs an instruction-driven architecture that takes elements from …

"You can load into one side and then store from the perpendicular side," Firu said. "That allows for some pretty interesting stuff to happen at the software level. You can also start to do things like data re-mappings and rotations of images using this paradigm."
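The idea Firu describes, writing data in along one axis and reading it back out along the perpendicular axis, can be illustrated in plain Python. This is only a conceptual sketch: the function names are hypothetical, and it models the data movement on ordinary lists rather than Quadric's hardware or API.

```python
def load_rows_store_columns(image):
    """Write rows in, read columns out: the result is the transpose."""
    if not image:
        return []
    width = len(image[0])
    # Reading column c across every row yields one row of the transposed image.
    return [[row[c] for row in image] for c in range(width)]


def rotate_90_clockwise(image):
    """A 90-degree clockwise rotation is a transpose with each row reversed."""
    return [list(reversed(row)) for row in load_rows_store_columns(image)]


tile = [
    [1, 2, 3],
    [4, 5, 6],
]
print(load_rows_store_columns(tile))  # [[1, 4], [2, 5], [3, 6]]
print(rotate_90_clockwise(tile))      # [[4, 1], [5, 2], [6, 3]]
```

In software, both operations cost a full pass over the data; the appeal of doing the re-mapping in the load/store path is that the reordering would come along with a transfer that has to happen anyway.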

Meanwhile, software-controlled on-chip static memories (not caches) offer space for large data structures. Quadric provides API access to these memories so developers can define arbitrary data structures in them. In the Q16 chip, the memories total 8 MB, enough to fit "two or three frame buffers at HD in there, or perhaps an entire neural network of weights," said Firu.
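Firu's frame-buffer estimate can be sanity-checked with simple arithmetic. The sketch below assumes 8 MiB of on-chip memory and a few illustrative HD formats; the capacity constant, resolutions and pixel depths are assumptions chosen for illustration, not Quadric specifications.

```python
def frame_buffer_bytes(width, height, bytes_per_pixel):
    """Size of one uncompressed frame buffer in bytes."""
    return width * height * bytes_per_pixel


def buffers_that_fit(capacity_bytes, width, height, bytes_per_pixel):
    """How many whole frame buffers fit in the given capacity."""
    return capacity_bytes // frame_buffer_bytes(width, height, bytes_per_pixel)


# Assumed on-chip capacity for this sketch: 8 MiB.
ON_CHIP_BYTES = 8 * 1024 * 1024

# 720p RGB, 3 bytes per pixel (assumed format).
print(buffers_that_fit(ON_CHIP_BYTES, 1280, 720, 3))
# 1080p grayscale, 1 byte per pixel (assumed format).
print(buffers_that_fit(ON_CHIP_BYTES, 1920, 1080, 1))
# 1080p RGB, 3 bytes per pixel (assumed format).
print(buffers_that_fit(ON_CHIP_BYTES, 1920, 1080, 3))
```

Under these assumptions the count lands between one and four buffers depending on resolution and pixel depth, which is the same order as the "two or three" quoted.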

Software stack

Quadric built its software stack before its silicon. Customers have been using it with the company's architecture simulator, or with FPGAs, for a year, Kheterpal said. The stack abstracts away the architecture and instruction set with an LLVM-based compiler, with a C++ API on top.

Source Mode supports different …

A future update to the stack will offer a no-code Graph Mode, which will support TensorFlow or ONNX versions of neural networks. It will incorporate a TVM-based deep neural network (DNN) compiler that automatically generates code.

"We're combining the power of no-code with the flexibility to have your own custom code, and mix them in interesting ways to build the application," said Kheterpal. "Most platforms will only offer an AI-specific architecture with some type of DNN compiler, but what about customization? What about a DNN that isn't supported? What about operators that are not supported? We do not have those restrictions since this is a Turing complete core; the cores can perform any operation. The code flexibility gives developers the ability to write whatever algorithm they need."

Chip roadmap

Quadric's Q16 chip, featuring 256 Vortex cores in a 16 x 16 array in 16 nm silicon, offers 4 INT8 DNN TOPS. It can run ResNet-50 at 200 inferences per second (for INT8 parameters at 224 x 224 image size), typically consuming 2 W.
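A few derived figures help put these numbers in context. The sketch below is simple arithmetic on the quoted figures, treating the 4 TOPS as a peak rating and the typical 2 W as a constant draw; both of those readings are assumptions, and the function names are illustrative.

```python
PEAK_TOPS = 4.0              # quoted INT8 DNN throughput
CORES = 16 * 16              # 16 x 16 array of cores
INFERENCES_PER_SECOND = 200  # quoted ResNet-50 rate
POWER_WATTS = 2.0            # quoted typical power


def tops_per_watt(tops, watts):
    """Throughput delivered per watt of power."""
    return tops / watts


def gops_per_core(tops, cores):
    """Share of total throughput per core, in giga-operations per second."""
    return tops * 1000 / cores


def millijoules_per_inference(watts, inferences_per_second):
    """Energy spent on a single inference, in millijoules."""
    return 1000 * watts / inferences_per_second


print(tops_per_watt(PEAK_TOPS, POWER_WATTS))                          # 2.0
print(gops_per_core(PEAK_TOPS, CORES))                                # 15.625
print(millijoules_per_inference(POWER_WATTS, INFERENCES_PER_SECOND))  # 10.0
```

On these assumptions the chip works out to 2 TOPS per watt, about 15.6 GOPS per core, and roughly 10 mJ per ResNet-50 inference.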

Quadric's roadmap includes a second generation of the architecture, plus a tapeout of a Q32 chip (an array of 1,000 cores), "probably in 7 nm," said Firu. While the Q16 is purely an accelerator (it would sit alongside a system host processor), the Q32 under development could also include Arm or RISC-V cores to act as host.

An M.2-format developer kit, with a Q16 processor alongside 4 GB of external memory mapped directly to the Q16's global memory, is available now.