Rotation Invariance in Floor Plan Digitization using Zernike Moments

Marius Graumann, Jan Marius Stürmer, Tobias Koch

TL;DR

The paper addresses extracting structured floor-plan information from rotated or noisy raster scans by proposing an end-to-end pipeline that converts pre-processed images into a Region Adjacency Graph (RAG) and classifies indoor elements using a Graph Neural Network. A key contribution is per-polygon normalization and an invariant ratio that ensures Zernike moments capture the full polygon shape within a circular support, improving rotation invariance ($F_P = \sqrt{\frac{A}{A_P}}$, $A = c\, r^2 \pi$, $c \le \frac{\lambda(M)}{R_M^2 \pi}$). The work also introduces a wall-splitting post-processing step to partition walls into room-associated segments, enabling downstream tasks such as room connectivity and 3D reconstruction. Experiments on CubiCasa and CVC datasets show substantial gains in F1 and IoU on rotated data, with improved robustness when using smaller invariant ratios, while highlighting generalization challenges on more complex RAGs.
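The role of the invariant ratio can be checked numerically. The sketch below is a minimal illustration, not the authors' implementation. It assumes that $A_P$ is the polygon area, that $R_M$ is the largest distance from a chosen center to a polygon vertex, and that the polygon is scaled about that center; the helper names `polygon_area`, `max_radius`, `invariant_ratio`, `scale_factor`, and `fits_in_circle` are hypothetical.

```python
import math

def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0

def max_radius(vertices, center):
    """Largest distance from the center to a vertex (assumed meaning of R_M)."""
    cx, cy = center
    return max(math.hypot(x - cx, y - cy) for x, y in vertices)

def invariant_ratio(vertices, center):
    """Polygon area divided by the area of its enclosing circle around the center."""
    return polygon_area(vertices) / (max_radius(vertices, center) ** 2 * math.pi)

def scale_factor(vertices, r=1.0, c=0.5):
    """F_P = sqrt(A / A_P) with target area A = c * r^2 * pi."""
    target_area = c * r ** 2 * math.pi
    return math.sqrt(target_area / polygon_area(vertices))

def fits_in_circle(vertices, center, r=1.0, c=0.5):
    """True if the scaled polygon lies inside the circle of radius r."""
    return scale_factor(vertices, r, c) * max_radius(vertices, center) <= r

# Two example polygons centered at the origin (illustrative data only).
origin = (0.0, 0.0)
square = [(-1, -1), (1, -1), (1, 1), (-1, 1)]
strip = [(-2, -0.5), (2, -0.5), (2, 0.5), (-2, 0.5)]

for name, poly in (("square", square), ("strip", strip)):
    print(name, round(invariant_ratio(poly, origin), 3), fits_in_circle(poly, origin))
```

Under these assumptions, the square (invariant ratio $2/\pi \approx 0.64$) fits into the unit circle after scaling, whereas the elongated rectangle (ratio about $0.30$) does not, which agrees with the condition $c \le \frac{\lambda(M)}{R_M^2 \pi}$ for $c = 0.5$.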

Abstract

Many old floor plans exist in printed form or are stored as scanned raster images, and slight rotations or shifts may occur during scanning. Converting such floor plans into a machine-readable form for further use remains a problem. Therefore, we propose an end-to-end pipeline that pre-processes the image and leverages a novel approach to create a region adjacency graph (RAG) from the pre-processed image and predict its nodes. By incorporating normalization steps into the RAG feature extraction, we significantly improve the rotation invariance of the RAG feature calculation. Moreover, applying our method leads to an improved F1 score and IoU on rotated data. Furthermore, we propose a wall-splitting algorithm for partitioning walls into segments associated with the corresponding rooms.

Paper Structure

This paper contains 21 sections, 2 theorems, 6 equations, 5 figures, 2 tables, 2 algorithms.

Key Result

Lemma

Let $r, F \in \mathbb{R}^+$ and let $M \subset \mathbb{R}^n$ be closed and Lebesgue measurable with $\lambda(M) > 0$. It holds that

Figures (5)

  • Figure 1: Filtering of the building: (a) input image after text removal and dilation; (b) detected contours in red and the largest contour in green; (c) refined polygon of the largest contour; (d) filtered building.
  • Figure 2: RAG of a floor plan. Every polygon has a unique color and is represented by a blue node in the graph. The node is placed at the center of mass of the polygon. Two nodes are connected if the corresponding polygons are adjacent.
  • Figure 3: Top: rectangles with different side lengths $a$, $b$. Bottom: the polygons scaled with $F_P$. The polygons with an invariant ratio greater than $c=0.5$ lie inside the circle of radius $r$ after scaling.
  • Figure 4: (a) RAG with labels; (b) room connectivity graph.
  • Figure 5: Demonstration from input image to split walls: (a) input image; (b) wall polygon; (c) interior and exterior linear rings of the polygon; (d) added separation lines; (e) crossing lines removed; (f) polygons created from the remaining lines.
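The graph construction described for Figure 2 (one node per region at its center of mass, one edge per pair of adjacent regions) can be sketched as follows. This is only an illustration under simplifying assumptions: the paper builds its RAG from polygons, whereas this sketch uses a small labeled grid and treats two regions as adjacent when they share a cell edge. The function name `build_rag` and the example grid are hypothetical.

```python
from collections import defaultdict

def build_rag(labels):
    """Build a region adjacency graph from a 2D grid of integer region ids.

    Returns (nodes, edges): nodes maps region id -> (x, y) center of mass,
    edges is a set of (smaller_id, larger_id) pairs for touching regions.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # per region: sum of x, sum of y, cell count
    edges = set()
    rows, cols = len(labels), len(labels[0])
    for y in range(rows):
        for x in range(cols):
            a = labels[y][x]
            acc = sums[a]
            acc[0] += x
            acc[1] += y
            acc[2] += 1
            # Checking the lower and right neighbours visits every 4-neighbour pair once.
            for ny, nx in ((y + 1, x), (y, x + 1)):
                if ny < rows and nx < cols:
                    b = labels[ny][nx]
                    if a != b:
                        edges.add((min(a, b), max(a, b)))
    nodes = {k: (sx / n, sy / n) for k, (sx, sy, n) in sums.items()}
    return nodes, edges

# Illustrative grid: two regions side by side on top of a third one.
grid = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 3, 3],
]
nodes, edges = build_rag(grid)
print(nodes)
print(sorted(edges))
```

In this toy example every region touches the other two, so the graph is a triangle. A polygon-based variant would replace the neighbour test with a geometric adjacency test between polygons.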

Theorems & Definitions (5)

  • Definition
  • Lemma
  • Proof
  • Lemma
  • Proof