Table of Contents
Fetching ...

Deterministic Preprocessing and Interpretable Fuzzy Banding for Cost-per-Student Reporting from Extracted Records

Shane Lee, Stella Ng

TL;DR

A deterministic, rule-governed, file-based workflow that provides a reproducible calculation of a band assignment from the reported anchors and ratios and a worked example provides a reproducible calculation of a band assignment.

Abstract

Administrative extracts are often exchanged as spreadsheets and may be read as reports in their own right during budgeting, workload review, and governance discussions. When an exported workbook becomes the reference snapshot for such decisions, the transformation can be checked by recomputation against a clearly identified input. A deterministic, rule-governed, file-based workflow is implemented in cad_processor.py. The script ingests a Casual Academic Database (CAD) export workbook and aggregates inclusive on-costs and student counts into subject-year and school-year totals, from which it derives cost-per-student ratios. It writes a processed workbook with four sheets: Processing Summary (run record and counters), Trend Analysis (schoolyear cost-per-student matrix), Report (wide subject-level table), and Fuzzy Bands (per-year anchors, membership weights, and band labels). The run record includes a SHA-256 hash of the input workbook bytes to support snapshot-matched recomputation. For within-year interpretation, the workflow adds a simple fuzzy banding layer that labels finite, positive school-year cost-per-student values as Low, Medium, or High. The per-year anchors are the minimum, median, and maximum of the finite, positive ratios. Membership weights are computed using left-shoulder, triangular, and right-shoulder functions, with deterministic tie-breaking in a fixed priority order (Medium, then Low, then High). These weights are treated as decision-support signals rather than probabilities. A worked example provides a reproducible calculation of a band assignment from the reported anchors and ratios. Supplementary material includes a claim-to-evidence matrix, a reproducibility note, and a short glossary that links selected statements to code and workbook artefacts.

Deterministic Preprocessing and Interpretable Fuzzy Banding for Cost-per-Student Reporting from Extracted Records

TL;DR

A deterministic, rule-governed, file-based workflow that provides a reproducible calculation of a band assignment from the reported anchors and ratios and a worked example provides a reproducible calculation of a band assignment.

Abstract

Administrative extracts are often exchanged as spreadsheets and may be read as reports in their own right during budgeting, workload review, and governance discussions. When an exported workbook becomes the reference snapshot for such decisions, the transformation can be checked by recomputation against a clearly identified input. A deterministic, rule-governed, file-based workflow is implemented in cad_processor.py. The script ingests a Casual Academic Database (CAD) export workbook and aggregates inclusive on-costs and student counts into subject-year and school-year totals, from which it derives cost-per-student ratios. It writes a processed workbook with four sheets: Processing Summary (run record and counters), Trend Analysis (schoolyear cost-per-student matrix), Report (wide subject-level table), and Fuzzy Bands (per-year anchors, membership weights, and band labels). The run record includes a SHA-256 hash of the input workbook bytes to support snapshot-matched recomputation. For within-year interpretation, the workflow adds a simple fuzzy banding layer that labels finite, positive school-year cost-per-student values as Low, Medium, or High. The per-year anchors are the minimum, median, and maximum of the finite, positive ratios. Membership weights are computed using left-shoulder, triangular, and right-shoulder functions, with deterministic tie-breaking in a fixed priority order (Medium, then Low, then High). These weights are treated as decision-support signals rather than probabilities. A worked example provides a reproducible calculation of a band assignment from the reported anchors and ratios. Supplementary material includes a claim-to-evidence matrix, a reproducibility note, and a short glossary that links selected statements to code and workbook artefacts.
Paper Structure (35 sections, 4 equations, 2 figures, 4 tables)

This paper contains 35 sections, 4 equations, 2 figures, 4 tables.

Figures (2)

  • Figure 1: Excerpt from the run entry point (CADProcessorFuzzy.run). The processor logs the resolved input path, computes the input workbook SHA-256 hash over the file bytes, writes the output workbook, and appends a tab-separated execution log line (status, input name, output name, input SHA-256).
  • Figure 2: Excerpt from the fuzzy band assignment. Membership weights are computed and ties are broken deterministically with the priority Medium, then Low, then High. The selected label is mapped to a numeric band score (Low 0.0, Medium 0.5, High 1.0).