Non-interfering On-line and In-field SoC Testing

Tobias Strauch

TL;DR

The paper tackles the challenge of robust in-field and on-line testing for aging and SEU-induced faults in safety-critical SoCs. It presents System Hyper Pipelining (SHP), a hybrid of Barrel CPU interleaving and C-slow retiming that enables non-interfering multi-threaded operation, paired with ultra-fast SEU detection and recovery. An RTL ATPG-based GIF framework provides 100% coverage of testable stuck-at faults with negligible area overhead, and an integrated EDA workflow enables automated test generation and coverage analysis. The proposed architecture and methodology offer a practical path to continuous health monitoring and fault recovery in production environments while maintaining ISO 26262 safety compliance.

Abstract

With the increasing aging problems of advanced technologies, in-field testing becomes an inevitable challenge, on top of already demanding requirements such as ISO 26262 for automotive safety. SoCs used in space, automotive, or military applications are the worst affected, as in-field failures in these applications can be life-threatening. We focus on on-line and in-field testing for Single Event Upsets (SEUs, each caused by a single ionizing particle) and aging defects (such as delay variation and stuck-at faults) that may appear during normal operation of the device. Interrupting normal operation to test for aging defects is a major challenge for the OS. Additionally, checkpointing with rollback-recovery can be costly, and mission-critical data can be lost in case of an SEU event. We eliminate many of these problems with our non-interfering in-field testing and recovery solution. We apply a hardware performance improvement technique called System Hyper Pipelining (SHP), which combines the well-known context switching (Barrel CPU) and C-slow retiming techniques. The SoC is enhanced with an SEU detection and ultra-fast recovery mechanism. We also use an RTL ATPG framework that enables the generation of software-based self-tests to achieve 100% coverage of all testable stuck-at faults. The paper finishes with very promising performance-per-area and test-cycles-per-net results. We argue that our robust system architecture and EDA solution, designed and developed primarily for in-field testing of SoCs, can also be used for production and on-line testing as well as other applications.
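
To make the C-slow retiming idea mentioned above more concrete, here is a minimal behavioral sketch in Python. It is not the paper's SHP implementation. It assumes a toy single-clock design (a 16-bit accumulator, chosen purely for illustration) and models only the interleaving effect: replacing the one feedback register with a chain of C registers lets C independent threads share the same logic, one thread per clock cycle. The names `step`, `run_original`, and `run_c_slow` are hypothetical. The timing benefit of distributing those registers through the logic is not modeled.

```python
from collections import deque


def step(state, x):
    """Combinational logic of a toy design (assumption): 16-bit accumulator."""
    return (state + x) & 0xFFFF


def run_original(inputs, init=0):
    """Reference design: one register in the feedback loop, one thread."""
    state = init
    for x in inputs:
        state = step(state, x)
    return state


def run_c_slow(thread_inputs, init=0):
    """C-slow version: the single register becomes a chain of C registers.

    Each clock cycle, the value at the head of the chain passes through the
    logic and re-enters at the tail, so C independent threads take turns.
    """
    c = len(thread_inputs)
    n = len(thread_inputs[0])
    assert all(len(t) == n for t in thread_inputs), "equal-length inputs assumed"

    regs = deque([init] * c)              # regs[0] reaches the logic this cycle
    for cycle in range(c * n):
        tid = cycle % c                   # thread that owns the logic this cycle
        x = thread_inputs[tid][cycle // c]
        regs.append(step(regs.popleft(), x))
    return list(regs)                     # after full rotations: thread 0 first


threads = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
results = run_c_slow(threads)
```

In this sketch each interleaved thread ends in the same state it would reach running alone on the original design, which is the property that lets the threads execute without interfering with one another.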

Paper Structure (30 sections, 3 figures, 2 tables)

This paper contains 30 sections, 3 figures, 2 tables.

Figures (3)

  • Figure 1: a) Simplified single-clock design. b) Applying the barrel technique. c) Applying C-slow retiming. d) Applying System Hyper Pipelining (SHP). e) SEU detection and recovery based on C-slow retiming and applied to SHP.
  • Figure 2: Average thread performance (Favg) in different scenarios: a) original design, b) design with the barrel technique, c) C-slow retiming, and d-f) SHP technique.
  • Figure 3: Snapshot of the coverage viewer GUI.