TPE-Net: Track Point Extraction and Association Network for Rail Path Proposal Generation

Jungwon Kang, Mohammadjavad Ghorbanalivakili, Gunho Sohn, David Beach, Veronica Marin

TL;DR

This work introduces TPE-Net, an end-to-end track point extraction network for rail path proposal generation in autonomous trains. By jointly performing rail-area segmentation and regression to locate center points and left-right rail distances, it produces pixel-level triplets that are spatially clustered into track segments and then assembled into a path tree of all feasible ego-paths. The method achieves strong, real-time performance on RailSem19 with TP-rail pixel and path-level metrics around or above 0.92–0.95, while avoiding reliance on camera parameters or 3D data. Although state-of-the-art methods on private data may exceed these results, TPE-Net offers an end-to-end, geometry-free approach with practical applicability for real-time rail-path reasoning and risk assessment.

Abstract

One essential feature of an autonomous train is minimizing the risk of collision with third-party objects. To estimate this risk, the control system must identify the topology of all rail routes ahead on which the train can possibly move, especially where rails merge or diverge. The train can then determine the status of potential obstacles with respect to its route and make a timely decision. Numerous studies have successfully extracted all rail tracks as a whole from forward-looking images, but without distinguishing individual track instances. Some image-based methods apply hard-coded prior knowledge of railway geometry to 3D data to associate left and right rails and generate rail route instances. In contrast, we propose a rail path extraction pipeline in which the left and right rail pixels of each rail route instance are extracted and associated by a fully convolutional encoder-decoder architecture called TPE-Net. We propose two regression branch designs for TPE-Net, which regress the locations of the center points of each rail route along with their corresponding left and right rail pixels. The extracted rail pixels are then spatially clustered to generate topological information for all possible train routes (ego-paths), discarding non-ego-path ones. Experimental results on a challenging, publicly released benchmark show true-positive-pixel-level average precision and recall of 0.9207 and 0.8721, respectively, at about 12 frames per second. Although our evaluation results do not exceed the state of the art (SOTA), the proposed regression pipeline performs remarkably well at extracting the correspondences in a single pass over the image. It generates strong rail route hypotheses without relying on camera parameters, 3D data, or geometrical constraints.
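To make the idea of regressing center points together with their left and right rail pixels more concrete, the following is a minimal sketch of how per-pixel outputs might be decoded into track-point triplets. It is not the authors' implementation: the array names `center_map`, `left_dist`, and `right_dist`, the row-wise peak picking, and the threshold of 0.5 are all assumptions made for illustration.

```python
import numpy as np

def decode_triplets(center_map, left_dist, right_dist, thresh=0.5):
    """Decode (y, x_left, x_center, x_right) triplets from per-pixel maps.

    All three inputs are hypothetical H x W arrays:
      center_map  - score that a pixel is a track center point
      left_dist   - horizontal distance (pixels) from the center to the left rail
      right_dist  - horizontal distance (pixels) from the center to the right rail
    """
    center_map = np.asarray(center_map, dtype=float)
    # Pad the columns so border pixels can be compared with their neighbours.
    padded = np.pad(center_map, ((0, 0), (1, 1)), constant_values=-np.inf)
    left_nb = padded[:, :-2]   # value of the pixel to the left
    right_nb = padded[:, 2:]   # value of the pixel to the right
    # Keep pixels above the threshold that are row-wise local maxima.
    is_peak = (center_map >= thresh) & (center_map > left_nb) & (center_map >= right_nb)
    ys, xs = np.nonzero(is_peak)
    triplets = []
    for y, x in zip(ys, xs):
        x_left = float(x - left_dist[y, x])
        x_right = float(x + right_dist[y, x])
        triplets.append((int(y), x_left, int(x), x_right))
    return triplets

# Toy example: one straight track centered at column 3, rails 2 pixels away.
center = np.zeros((3, 7))
center[:, 3] = 0.9
center[:, [2, 4]] = 0.4
dist = np.full((3, 7), 2.0)
toy_triplets = decode_triplets(center, dist, dist)
```

Because each triplet carries its own left and right offsets, the association between the two rails of a track comes directly from the regression output rather than from camera geometry, which is the property the abstract emphasizes.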

Paper Structure

This paper contains 11 sections, 5 equations, 6 figures, and 1 table.

Figures (6)

  • Figure 1: The top image shows a sample input image captured by a forward-looking camera. The bottom images are the corresponding detected possible ego-paths. Other routes seen in the input image are discarded from the outcome as they are not possible ego-paths.
  • Figure 2: Overall diagram of our proposed rail path extraction algorithm. In the first stage, left, right, and center rail pixels are detected and associated through a fully convolutional network. Next, track segments are generated by linking the extracted track points in each sub-region. Finally, a path graph is created in the shape of a tree, covering all the possible ego-paths. In the path tree, S stands for start node, SW stands for switch node, and E stands for end node of each detected ego-path. After filtering the detected paths, polynomial fitting is performed on the extracted rail pixels to visualize extracted rails better.
  • Figure 3: Outputs of the first version of the network for its regression tasks, computed at every pixel of the input image. Each output is a 1-channel heatmap; darker colors indicate larger heatmap values.
  • Figure 4: In the switch region, pixels inside the shared rail area (annotated in green) have more than one left/right distance. Here, the extracted track point shown by a red dot corresponds to the rightmost rail track.
  • Figure 5: A detailed structure of the proposed fully convolutional TPE-Net, which outputs rail segmentation and regressed triplet coordinates for 2D images of rail scenes. The network has two alternative designs for the regression branch.
  • ...and 1 more figure
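The path tree described in the Figure 2 caption, with a start node (S), switch nodes (SW), and end nodes (E), can be illustrated with a short sketch. It assumes the tree is stored as a plain dictionary mapping each node to its children and that every root-to-leaf walk is one candidate ego-path. The node names, the helper functions `enumerate_ego_paths` and `fit_rail`, and the choice to fit x as a polynomial in y are hypothetical and not taken from the paper.

```python
import numpy as np

def enumerate_ego_paths(tree, root="S"):
    """Return every root-to-leaf node sequence in `tree` (dict: node -> children)."""
    paths = []
    stack = [(root, [root])]
    while stack:
        node, path = stack.pop()
        children = tree.get(node, [])
        if not children:
            # A node without children is an end node, so the path is complete.
            paths.append(path)
            continue
        # Push in reverse so that children are visited in their listed order.
        for child in reversed(children):
            stack.append((child, path + [child]))
    return paths

def fit_rail(points, degree=2):
    """Fit x = f(y) to (y, x) rail pixels and return the polynomial coefficients.

    Fitting x against y is an assumption: rails in a forward-looking image
    run mostly top to bottom, so x is usually single-valued in y.
    """
    pts = np.asarray(points, dtype=float)
    return np.polyfit(pts[:, 0], pts[:, 1], degree)

# Toy tree: one start node, two switches, three end nodes.
toy_tree = {"S": ["SW1"], "SW1": ["E1", "SW2"], "SW2": ["E2", "E3"]}
toy_paths = enumerate_ego_paths(toy_tree)

# Toy rail pixels lying exactly on the line x = 2*y + 1.
toy_coeffs = fit_rail([(0, 1), (1, 3), (2, 5), (3, 7)], degree=1)
```

In this toy tree, each switch node multiplies the number of candidate ego-paths, so two switches yield three paths ending at E1, E2, and E3.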