Three UC Riverside engineers have received a $1.2 million grant from the National Science Foundation to develop a new generation of energy-efficient, energy-elastic, and real-time-aware GPUs suitable for use in resource-constrained environments such as emerging embedded and autonomous systems, including aerial drones and autonomous vehicles.
The three Marlan and Rosemary Bourns College of Engineering faculty members receiving the award are Daniel Wong, Hyoseung Kim, and Nael Abu-Ghazaleh, all of whom are professors of electrical and computer engineering.
To operate correctly and safely, autonomous systems, such as aerial drones and autonomous vehicles, need to perform millions of simultaneous calculations at speeds that even state-of-the-art computing can’t quite match.
Graphics processing units, or GPUs, commonly provide the computational power necessary to enable these autonomous systems. GPUs are widely used in supercomputing and cloud computing to dramatically speed up applications, such as image processing, deep learning, and other computationally intensive workloads. The added speed comes at a cost: GPUs used in parallel computing consume large amounts of energy, which limits their use in self-contained, often battery-operated environments like vehicles and drones.
Effective GPUs for autonomous systems need to be energy efficient and able to execute workloads in real time. For example, to navigate the road safely, an autonomous vehicle has to process data from various sensors, such as cameras and lidar, and make a decision within milliseconds to keep the vehicle from crashing.
However, modern embedded GPUs have several limitations when used in autonomous system settings. GPUs tend to be energy inefficient, which, under a tight power budget, leaves too little computational power and limits what the autonomous system can do. GPU hardware and software also need to be timing aware to successfully perform real-time operations and meet the workload deadlines required for correct and safe operation.
To address these issues, the UC Riverside project will develop solutions spanning both software and hardware to enable real-time embedded GPUs for autonomous systems.
“Current GPUs consume almost the same amount of power when actively processing a workload and when idle, wasting energy,” Wong said.
The UC Riverside team will design “energy-elastic” hardware, which lets the GPU consume power based on the amount of work it has to do.
“If it is doing little work, it will consume less power; if it needs to do more work, it will consume more power,” Wong added.
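The difference Wong describes can be illustrated with a simple linear power model. The sketch below is only an illustration: the function names, the 10-watt peak, and the idle fractions (0.9 for a GPU whose idle draw is close to its peak, 0.1 for an "energy-elastic" one) are hypothetical values chosen for the example, not figures from the project.

```python
def gpu_power(utilization, peak_watts, idle_fraction):
    """Linear power model: idle draw plus a share that scales with load.

    idle_fraction is the idle power as a fraction of peak power
    (hypothetical parameter, used here only for illustration).
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    idle_watts = peak_watts * idle_fraction
    return idle_watts + (peak_watts - idle_watts) * utilization


def energy_joules(utilization_trace, peak_watts, idle_fraction, step_seconds=1.0):
    """Total energy over a trace of per-step utilization samples."""
    return sum(
        gpu_power(u, peak_watts, idle_fraction) * step_seconds
        for u in utilization_trace
    )


# A mostly idle workload: busy for one second out of four.
trace = [0.0, 0.0, 1.0, 0.0]

# Idle draw close to peak (assumed 90 percent of peak).
conventional = energy_joules(trace, peak_watts=10.0, idle_fraction=0.9)

# Energy-elastic behavior (assumed 10 percent of peak when idle).
elastic = energy_joules(trace, peak_watts=10.0, idle_fraction=0.1)

print(f"conventional: {conventional:.1f} J, elastic: {elastic:.1f} J")
```

Under these assumed numbers, both GPUs draw the same power while busy, so the entire saving comes from the idle periods, which is where battery-operated systems spend much of their time.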
GPU hardware consists of many schedulers, which are unaware of the timing requirements of workloads. Therefore, if multiple workloads are running on the GPU, some may miss their deadlines because they compete for hardware resources. The UCR group will create timing-aware hardware and software, allowing the various hardware schedulers to prioritize workloads to ensure deadlines are met.
The researchers will design real-time scheduling software that coordinates with hardware schedulers. By having hardware that keeps the software updated on the workload’s progress, the software can make better scheduling decisions and improve real-time operations.
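To see why deadline awareness matters, consider a small simulation. The article does not name a scheduling policy, so the sketch below substitutes earliest-deadline-first (EDF), a classic real-time policy, as a stand-in. The workload names, run times, and deadlines are hypothetical, and the model is deliberately simplified: one workload runs at a time, and all are ready at time zero.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    run_ms: float       # time the workload needs on the GPU
    deadline_ms: float  # latest acceptable finish time


def missed_deadlines(order):
    """Run workloads back to back and return the names that finish late."""
    clock = 0.0
    missed = []
    for w in order:
        clock += w.run_ms
        if clock > w.deadline_ms:
            missed.append(w.name)
    return missed


def edf_order(workloads):
    """Earliest-deadline-first: most urgent workload runs first."""
    return sorted(workloads, key=lambda w: w.deadline_ms)


# Hypothetical workloads, listed in arrival order.
workloads = [
    Workload("map_update", run_ms=40, deadline_ms=100),
    Workload("obstacle_detection", run_ms=10, deadline_ms=20),
    Workload("lane_tracking", run_ms=15, deadline_ms=50),
]

# A timing-unaware scheduler simply runs workloads in arrival order.
fifo_missed = missed_deadlines(workloads)

# A deadline-aware scheduler reorders them by urgency.
edf_missed = missed_deadlines(edf_order(workloads))

print("arrival order misses:", fifo_missed)
print("EDF misses:", edf_missed)
```

In this example, arrival order lets the long map update delay the two urgent workloads past their deadlines, while deadline ordering meets all three. A real GPU scheduler would also have to handle workloads that run concurrently and share hardware resources, which this sketch ignores.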
“Today’s software and hardware do not coordinate to enforce workload deadlines,” Kim said. “This project would enable multiple workloads to run safely together and increase the capabilities provided by embedded GPUs in autonomous systems.”