Pointer: An Energy-Efficient ReRAM-based Point Cloud Recognition Accelerator with Inter-layer and Intra-layer Optimizations

Qijun Zhang, Zhiyao Xie

TL;DR

Pointer is an energy-efficient Resistive Random Access Memory (ReRAM)-based point cloud recognition accelerator that combines three techniques: a ReRAM-based architecture that accelerates the MLP in feature computation, inter-layer coordination that reduces DRAM access, and topology-aware intra-layer reordering that improves the execution order for better data locality.

Abstract

The point cloud is an important data structure for a wide range of applications, including robotics, AR/VR, and autonomous driving. Many deep-learning-based point cloud recognition algorithms have been proposed to process it. However, applications such as autonomous driving require fast inference, which makes accelerators necessary at the inference stage. Existing point cloud accelerators remain inefficient because of two challenges. First, the multi-layer perceptron (MLP) in feature computation is the performance bottleneck. Second, the feature vector fetching operation incurs heavy DRAM access. In this paper, we propose Pointer, an efficient Resistive Random Access Memory (ReRAM)-based point cloud recognition accelerator with inter- and intra-layer optimizations. It introduces three techniques for point cloud acceleration. First, Pointer adopts a ReRAM-based architecture to significantly accelerate the MLP in feature computation. Second, to reduce DRAM access, Pointer proposes inter-layer coordination: it schedules the next layer to fetch the results of the previous layer as soon as they are available, which allows on-chip fetching and thus reduces DRAM access. Third, Pointer proposes topology-aware intra-layer reordering, which improves the execution order for better data locality. Pointer achieves 40x to 393x speedup and 22x to 163x better energy efficiency over prior accelerators without any accuracy loss.

Paper Structure

This paper contains 19 sections, 2 equations, 10 figures, 1 table, 1 algorithm.

Figures (10)

  • Figure 1: The workflow of PointNet++, which consists of two major stages: point mapping and feature processing. The point mapping stage includes farthest point sampling (FPS) and neighbor search. The feature processing stage includes aggregation, feature computation, and reduction. In the aggregation step, for each sampled point $P_i$ with feature vector $F_i$, the feature vectors $F_j$ of its neighboring points $P_j$ are also fetched, and their difference $\mathcal{D}(F_i, F_j)$ is computed. An MLP $\mathcal{M}$ then performs feature computation, generating $\mathcal{M}(\mathcal{D}(F_i, F_j))$ for each $P_j$. Finally, all results are reduced by computing the maximum of each column.
  • Figure 2: (a) Multiply-accumulate operation with ReRAM. (b) The ReRAM array used as vector-matrix multiplier.
  • Figure 3: Inter-layer coordination and intra-layer reordering, using the example in Fig. \ref{setabstractlayer}. The upper sub-figures (i)(ii)(iii) illustrate the point execution order in each layer. The number within each circle is the index of point $P_i$, and the $O_i$ outside the circle is the processing order within each layer. The bottom sub-figures (a)(b)(c) illustrate how inter-layer coordination and intra-layer reordering improve the on-chip buffer hit rate and thus performance. The buffer content shows what is available at the start of each time step. (a) Basic ReRAM-based accelerator. It simply schedules the execution in index order. (b) Accelerator with inter-layer coordination. It schedules the execution order in layer 1 based on the receptive field of points in layer 2. (c) Accelerator with both inter-layer coordination and intra-layer reordering (i.e., Pointer). The intra-layer reordering determines the execution order of layer 2. The inter-layer coordination still determines the execution order of layer 1.
  • Figure 4: An example of the pyramid-shaped receptive field. This example is consistent with Fig. \ref{setabstractlayer} and Fig. \ref{workflow}.
  • Figure 5: An obvious overlap between the receptive fields of two neighboring points in the last layer. (a) Original point cloud. (b) Green points are the output points of the last set-abstraction layer; the red and blue points are two neighboring points. (c) Green points are the original point cloud; the red and blue points are the receptive fields of the red and blue points in (b), respectively. There is a large overlap between these two fields.
  • ...and 5 more figures
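
The feature processing stage described in the Figure 1 caption (aggregation, feature computation, reduction) can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the paper's implementation. The function names, the toy feature values, and the weights are hypothetical. The MLP $\mathcal{M}$ is stood in for by a single fully connected layer with ReLU, and $\mathcal{D}(F_i, F_j)$ is assumed to be the element-wise difference $F_j - F_i$; the caption does not specify either detail.

```python
# Illustrative sketch of the feature processing stage in the Figure 1 caption.
# All names, toy values, and the single-layer ReLU "MLP" are assumptions.

def difference(f_i, f_j):
    """D(F_i, F_j), assumed here to be the element-wise difference F_j - F_i."""
    return [b - a for a, b in zip(f_i, f_j)]


def mlp(x, weights, bias):
    """One fully connected layer with ReLU, standing in for the MLP M.

    weights[row][col] maps input element `row` to output element `col`.
    Each output is a multiply-accumulate over one weight column.
    """
    out = []
    for col in range(len(bias)):
        acc = bias[col]
        for row, value in enumerate(x):
            acc += value * weights[row][col]
        out.append(max(0.0, acc))
    return out


def feature_processing(center, neighbors, features, weights, bias):
    """Aggregate neighbor features, apply the MLP, then reduce by column-wise max."""
    # Aggregation + feature computation: one MLP output row per neighbor P_j.
    rows = [
        mlp(difference(features[center], features[j]), weights, bias)
        for j in neighbors
    ]
    # Reduction: the maximum of each column across all neighbor rows.
    return [max(column) for column in zip(*rows)]


# Toy data: three points with 2-element features, and a 2x3 weight matrix.
features = {0: [1.0, 2.0], 1: [2.0, 4.0], 2: [0.0, 5.0]}
weights = [[1.0, 0.0, -1.0],
           [0.0, 1.0, 1.0]]
bias = [0.0, 0.0, 0.0]

result = feature_processing(0, [1, 2], features, weights, bias)
print(result)  # [1.0, 3.0, 4.0]
```

In this toy run, point 0 has neighbors 1 and 2. Their MLP outputs are `[1.0, 2.0, 1.0]` and `[0.0, 3.0, 4.0]`, and the column-wise maximum gives the single output feature vector for point 0. The inner multiply-accumulate loop in `mlp` is the part the abstract identifies as the performance bottleneck.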