Information Theory
Covers theoretical and experimental aspects of information theory and coding. Includes source coding, channel coding, data compression, and cryptographic protocols.
Covers theoretical and experimental aspects of information theory and coding. Includes source coding, channel coding, data compression, and cryptographic protocols.
2601.04193We propose a discrete transport equation on graphs which connects distributions on both vertices and edges. We then derive a discrete analogue of the Benamou-Brenier formulation for Wasserstein-$1$ distance on a graph and as a result classify all $W_1$ geodesics on graphs.
Grant-free cell-free massive multiple-input multiple-output (GF-CF-MaMIMO) systems are anticipated to be a key enabling technology for next-generation Internet-of-Things (IoT) networks, as they support massive connectivity without explicit scheduling. However, the large amount of connected devices prevents the use of orthogonal pilot sequences, resulting in severe pilot contamination (PC) that degrades channel estimation and data detection performance. Furthermore, scalable GF-CF-MaMIMO networks inherently rely on distributed signal processing. In this work, we consider the uplink of a GF-CF-MaMIMO system and propose two novel distributed algorithms for joint activity detection, channel estimation, and data detection (JACD) based on expectation propagation (EP). The first algorithm, denoted as JACD-EP, uses Gaussian approximations for the channel variables, whereas the second, referred to as JACD-EP-BG, models them as Bernoulli-Gaussian (BG) random variables. To integrate the BG distribution into the EP framework, we derive its exponential family representation and develop the two algorithms as efficient message passing over a factor graph constructed from the a posteriori probability (APP) distribution. The proposed framework is inherently scalable with respect to both the number of access points (APs) and user equipments (UEs). Simulation results show the efficient mitigation of PC by the proposed distributed algorithms and their superior detection accuracy compared to (genie-aided) centralized linear detectors.
2601.04041A $t$-all-symbol PIR code and a $t$-all-symbol batch code of dimension $k$ consist of $n$ servers storing linear combinations of $k$ linearly independent information symbols with the following recovery property: any symbol stored by a server can be recovered from $t$ pairwise disjoint subsets of servers. In the batch setting, we further require that any multiset of size $t$ of stored symbols can be recovered from $t$ disjoint subsets of servers. This framework unifies and extends several well-known code families, including one-step majority-logic decodable codes, (functional) PIR codes, and (functional) batch codes. In this paper, we determine the minimum code length for some small values of $k$ and $t$, characterize structural properties of codes attaining this optimum, and derive bounds that show the trade-offs between length, dimension, minimum distance, and $t$. In addition, we study MDS codes and the simplex code, demonstrating how these classical families fit within our framework, and establish new cases of an open conjecture from \cite{YAAKOBI2020} concerning the minimal $t$ for which the simplex code is a $t$-functional batch code.
Low-altitude wireless networks (LAWNs) are expected to play a central role in future 6G infrastructures, yet uplink transmissions of uncrewed aerial vehicles (UAVs) remain vulnerable to eavesdropping due to their limited transmit power, constrained antenna resources, and highly exposed air-ground propagation conditions. To address this fundamental bottleneck, we propose a flexible-duplex cell-free (CF) architecture in which each distributed access point (AP) can dynamically operate either as a receive AP for UAV uplink collection or as a transmit AP that generates cooperative artificial noise (AN) for secrecy enhancement. Such AP-level duplex flexibility introduces an additional spatial degree of freedom that enables distributed and adaptive protection against wiretapping in LAWNs. Building upon this architecture, we formulate a max-min secrecy-rate problem that jointly optimizes AP mode selection, receive combining, and AN covariance design. This tightly coupled and nonconvex optimization is tackled by first deriving the optimal receive combiners in closed form, followed by developing a penalty dual decomposition (PDD) algorithm with guaranteed convergence to a stationary solution. To further reduce computational burden, we propose a low-complexity sequential scheme that determines AP modes via a heuristic metric and then updates the AN covariance matrices through closed-form iterations embedded in the PDD framework. Simulation results show that the proposed flexible-duplex architecture yields substantial secrecy-rate gains over CF systems with fixed AP roles. The joint optimization method attains the highest secrecy performance, while the low-complexity approach achieves over 90% of the optimal performance with an order-of-magnitude lower computational complexity, offering a practical solution for secure uplink communications in LAWNs.
2601.03982Error-correcting codes are combinatorial objects designed to cope with the problem of reliable transmission of information on a noisy channel. A fundamental problem in coding theory and practice is to efficiently decode the received word with errors to obtain the transmitted codeword. In this paper, we consider the decoding problem of Hyperderivative Reed-Solomon (HRS) codes with respect to the NRT metric. Specifically, we propose a Welch-Berlekamp algorithm for the unique decoding of NRT HRS codes.
Reconfigurable intelligent surfaces (RISs) enable programmable control of the wireless propagation environment and are key enablers for future networks. Beyond-diagonal RIS (BD-RIS) architectures enhance conventional RIS by interconnecting elements through tunable impedance components, offering greater flexibility with higher circuit complexity. However, excessive interconnections between BD-RIS elements require multi-layer printed circuit board (PCB) designs, increasing fabrication difficulty. In this letter, we use graph theory to characterize the BD-RIS architectures that can be realized on double-layer PCBs, denoted as planar-connected RISs. Among the possible planar-connected RISs, we identify the ones with the most degrees of freedom, expected to achieve the best performance under practical constraints.
2601.03492This paper introduces a class of Hermitian LCD $2$-quasi-abelian codes over finite fields and presents a comprehensive enumeration of these codes in which relative minimum weights are small. We show that such codes are asymptotically good over finite fields. Furthermore, we extend our analysis to finite chain rings by characterizing $2$-quasi-abelian codes in this setting and proving the existence of asymptotically good Hermitian LCD $2$-quasi-abelian codes over finite chain rings as well.
2601.03489A subspace code is a nonempty collection of subspaces of the vector space $\mathbb{F}_q^{n}$. A pair of linear codes is called a linear complementary pair (in short LCP) of codes if their intersection is trivial and the sum of their dimensions equals the dimension of the ambient space. Equivalently, the two codes form an LCP if the direct sum of these two codes is equal to the entire space. In this paper, we introduce the concept of LCPs of subspace codes. We first provide a characterization of subspace codes that form an LCP. Furthermore, we present a sufficient condition for the existence of an LCP of subspace codes based on a complement function on a subspace code. In addition, we give several constructions of LCPs for subspace codes using various techniques and provide an application to insertion error correction.
We provide new insights into an open problem recently posed by Yuan-Sun [ISIT 2025], concerning the minimum individual key rate required in the vector linear secure aggregation problem. Consider a distributed system with $K$ users, where each user $k\in [K]$ holds a data stream $W_k$ and an individual key $Z_k$. A server aims to compute a linear function $\mathbf{F}[W_1;\ldots;W_K]$ without learning any information about another linear function $\mathbf{G}[W_1;\ldots;W_K]$, where $[W_1;\ldots;W_K]$ denotes the row stack of $W_1,\ldots,W_K$. The open problem is to determine the minimum required length of $Z_k$, denoted as $R_k$, $k\in [K]$. In this paper, we characterize a new achievable region for the rate tuple $(R_1,\ldots,R_K)$. The region is polyhedral, with vertices characterized by a binary rate assignment $(R_1,\ldots,R_K) = (\mathbf{1}(1 \in \mathcal{I}),\ldots,\mathbf{1}(K\in \mathcal{I}))$, where $\mathcal{I}\subseteq [K]$ satisfies the \textit{rank-increment condition}: $\mathrm{rank}\left(\bigl[\mathbf{F}_{\mathcal{I}};\mathbf{G}_{\mathcal{I}}\bigr]\right) =\mathrm{rank}\bigl(\mathbf{F}_{\mathcal{I}}\bigr)+N$. Here, $\mathbf{F}_\mathcal{I}$ and $\mathbf{G}_\mathcal{I}$ are the submatrices formed by the columns indexed by $\mathcal{I}$. Our results uncover the novel fact that it is not necessary for every user to hold a key, thereby strictly enlarging the best-known achievable region in the literature. Furthermore, we provide a converse analysis to demonstrate its optimality when minimizing the number of users that hold keys.
2601.03165In this article, we determine the minimum distance of the Euclidean dual of the cyclic code $\mathcal{C}_n$ generated by the $n$th cyclotomic polynomial $Q_n(x)$ over $\mathbb{F}_q$, for every positive integer $n$ co-prime to $q$. In particular, we prove that the minimum distance of $\mathcal{C}_{n}^{\perp}$ is a function of $n$, namely $2^{ω(n)}$. This was precisely the conjecture posed by us in \cite{BHAGAT2025}.
2601.03126The choice of an isomorphism, a duality, between a finite abelian group $A$ and its character group allows one to define dual codes of additive codes over $A$. Properties of dualities and dual codes are studied, continuing work of Delsarte from 1973 and more recent work of Dougherty and his collaborators.
Linear queries, as the basis of broad analysis tasks, are often released through privacy mechanisms based on differential privacy (DP), the most popular framework for privacy protection. However, DP adopts a context-free definition that operates independently of the data-generating distribution. In this paper, we revisit the privacy analysis of the Laplace mechanism through the lens of pointwise maximal leakage (PML). We demonstrate that the distribution-agnostic definition of the DP framework often mandates excessive noise. To address this, we incorporate an assumption about the prior distribution by lower-bounding the probability of any single record belonging to any specific class. With this assumption, we derive a tight, context-aware leakage bound for general linear queries, and prove that our derived bound is strictly tighter than the standard DP guarantee and converges to the DP guarantee as this probability lower bound approaches zero. Numerical evaluations demonstrate that by exploiting this prior knowledge, the required noise scale can be reduced while maintaining privacy guarantees.
The task of jointly communicating a message and reconstructing a common estimate of the channel state is examined for a fading Gaussian model with additive state interference. The state is an independent and identically distributed Gaussian sequence known noncausally at the transmitter, and the instantaneous fading coefficient is perfectly known at both the transmitter and the receiver. The receiver is required to decode the transmitted message and, in addition, reconstruct the state under a common reconstruction constraint ensuring that its estimate coincides with that at the transmitter. A complete characterization of the optimal rate distortion tradeoff region for this setting is the main result of our work. The analytical results are also validated through numerical examples illustrating the rate distortion and power distortion tradeoffs.
2601.02608In the 1960s, MacWilliams proved that the Hamming weight enumerator of a linear code over a finite field completely determines, and is determined by, the Hamming weight enumerator of its dual code. In particular, if two linear codes have the same Hamming weight enumerator, then their dual codes have the same Hamming weight enumerator. In contrast, there is a wide class of weights on finite fields whose weight enumerators have the opposite behavior: there exist two linear codes having the same weight enumerator, but their dual codes have different weight enumerators.
We study exact decoding for the toric code and for planar and rotated surface codes under the standard independent \(X/Z\) noise model, focusing on Separate Minimum Weight (SMW) decoding and Separate Most Likely Coset (SMLC) decoding. For the SMW decoding problem, we show that an \(O(n^{3/2}\log n)\)-time decoder is achievable for surface and toric codes, improving over the \(O(n^{3}\log n)\) worst-case time of the standard approach based on complete decoding graphs. Our approach is based on a local reduction of SMW decoding to the minimum weight perfect matching problem using Fisher gadgets, which preserves planarity for planar and rotated surface codes and genus~\(1\) for the toric code. This reduction enables the use of Lipton--Tarjan planar separator methods and implies that SMW decoding lies in \(\mathrm{NC}\). For SMLC decoding, we show that the planar surface code admits an exact decoder with \(O(n^{3/2})\) algebraic complexity and that the problem lies in \(\mathrm{NC}\), improving over the \(O(n^{2})\) algebraic complexity of Bravyi \emph{et al.} Our approach proceeds via a dual-cycle formulation of coset probabilities and an explicit reduction to planar Pfaffian evaluation using Fisher--Kasteleyn--Temperley constructions. The same complexity measures apply to SMLC decoding of the rotated surface code. For the toric code, we obtain an exact polynomial-time SMLC decoder with \(O(n^{3})\) algebraic complexity. In addition, while the SMLC formulation is motivated by connections to statistical mechanics, we provide a purely algebraic derivation of the underlying duality based on MacWilliams duality and Fourier analysis. Finally, we discuss extensions of the framework to the depolarizing noise model and identify resulting open problems.
The deployment of large-scale neural networks within the Open Radio Access Network (O-RAN) architecture is pivotal for enabling native edge intelligence. However, this paradigm faces two critical bottlenecks: the prohibitive memory footprint required for local training on resource-constrained gNBs, and the saturation of bandwidth-limited backhaul links during the global aggregation of high-dimensional model updates. To address these challenges, we propose CoCo-Fed, a novel Compression and Combination-based Federated learning framework that unifies local memory efficiency and global communication reduction. Locally, CoCo-Fed breaks the memory wall by performing a double-dimension down-projection of gradients, adapting the optimizer to operate on low-rank structures without introducing additional inference parameters/latency. Globally, we introduce a transmission protocol based on orthogonal subspace superposition, where layer-wise updates are projected and superimposed into a single consolidated matrix per gNB, drastically reducing the backhaul traffic. Beyond empirical designs, we establish a rigorous theoretical foundation, proving the convergence of CoCo-Fed even under unsupervised learning conditions suitable for wireless sensing tasks. Extensive simulations on an angle-of-arrival estimation task demonstrate that CoCo-Fed significantly outperforms state-of-the-art baselines in both memory and communication efficiency while maintaining robust convergence under non-IID settings.
As wireless communication applications evolve from traditional multipath environments to high-mobility scenarios like unmanned aerial vehicles, multiplexing techniques have advanced accordingly. Traditional single-carrier frequency-domain equalization (SC-FDE) and orthogonal frequency-division multiplexing (OFDM) have given way to emerging orthogonal time-frequency space (OTFS) and affine frequency-division multiplexing (AFDM). These approaches exploit specific channel structures to diagonalize or sparsify the effective channel, thereby enabling low-complexity detection. However, their reliance on these structures significantly limits their robustness in dynamic, real-world environments. To address these challenges, this paper studies a random multiplexing technique that is decoupled from the physical channels, enabling its application to arbitrary norm-bounded and spectrally convergent channel matrices. Random multiplexing achieves statistical fading-channel ergodicity for transmitted signals by constructing an equivalent input-isotropic channel matrix in the random transform domain. It guarantees the asymptotic replica MAP bit-error rate (BER) optimality of AMP-type detectors for linear systems with arbitrary norm-bounded, spectrally convergent channel matrices and signaling configurations, under the unique fixed point assumption. A low-complexity cross-domain memory AMP (CD-MAMP) detector is considered, leveraging the sparsity of the time-domain channel and the randomness of the equivalent channel. Optimal power allocations are derived to minimize the replica MAP BER and maximize the replica constrained capacity of random multiplexing systems. The optimal coding principle and replica constrained-capacity optimality of CD-MAMP detector are investigated for random multiplexing systems. Additionally, the versatility of random multiplexing in diverse wireless applications is explored.
In the age of data revolution, a modern storage~or transmission system typically requires different levels of protection. For example, the coding technique used to fortify data in a modern storage system when the device is fresh cannot be the same as that used when the device ages. Therefore, providing reconfigurable coding schemes and devising an effective way to perform this reconfiguration are key to extending the device lifetime. We focus on constrained coding schemes for the emerging two-dimensional magnetic recording (TDMR) technology. Recently, we have designed efficient lexicographically-ordered constrained (LOCO) coding schemes for various stages of the TDMR device lifetime, focusing on the elimination of isolation patterns, and demonstrated remarkable gains by using them. LOCO codes are naturally reconfigurable, and we exploit this feature in our work. Reconfiguration based on predetermined time stamps, which is what the industry adopts, neglects the actual device status. Instead, we propose offline and online learning methods to perform this task based on the device status. In offline learning, training data is assumed to be available throughout the time span of interest, while in online learning, we only use training data at specific time intervals to make consequential decisions. We fit the training data to polynomial equations that give the bit error rate in terms of TD density, then design an optimization problem in order to reach the optimal reconfiguration decisions to switch from a coding scheme to another. The objective is to maximize the storage capacity and/or minimize the decoding complexity. The problem reduces to a linear programming problem. We show that our solution is the global optimal based on problem characteristics, and we offer various experimental results that demonstrate the effectiveness of our approach in TDMR systems.
We make modifications to the unscented Kalman filter (UKF) which bestow almost complete practical identifiability upon a lumped-parameter cardiovascular model with 10 parameters and 4 output observables - a highly non-linear, stiff problem of clinical significance. The modifications overcome the challenging problems of rank deficiency when applying the UKF to parameter estimation. Rank deficiency usually means only a small subset of parameters can be estimated. Traditionally, pragmatic compromises are made, such as selecting an optimal subset of parameters for estimation and fixing non-influential parameters. Kalman filters are typically used for dynamical state tracking, to facilitate the control u at every time step. However, for the purpose of parameter estimation, this constraint no longer applies. Our modification has transformed the utility of UKF for the parameter estimation purpose, including minimally influential parameters, with excellent robustness (i.e., under severe noise corruption, challenging patho-physiology, and no prior knowledge of parameter distributions). The modified UKF algorithm is robust in recovering almost all parameters to over 98% accuracy, over 90% of the time, with a challenging target data set of 50, 10-parameter samples. We compare this to the original implementation of the UKF algorithm for parameter estimation and demonstrate a significant improvement.
Federated edge learning (FEEL) enables wireless devices to collaboratively train a centralised model without sharing raw data, but repeated uplink transmission of model updates makes communication the dominant bottleneck. Over-the-air (OTA) aggregation alleviates this by exploiting the superposition property of the wireless channel, enabling simultaneous transmission and merging communication with computation. Digital OTA schemes extend this principle by incorporating the robustness of conventional digital communication, but current designs remain limited in low signal-to-noise ratio (SNR) regimes. This work proposes a learned digital OTA framework that improves recovery accuracy, convergence behaviour, and robustness to challenging SNR conditions while maintaining the same uplink overhead as state-of-the-art methods. The design integrates an unsourced random access (URA) codebook with vector quantisation and AMP-DA-Net, an unrolled approximate message passing (AMP)-style decoder trained end-to-end with the digital codebook and parameter server local training statistics. The proposed design extends OTA aggregation beyond averaging to a broad class of symmetric functions, including trimmed means and majority-based rules. Experiments on highly heterogeneous device datasets and varying numbers of active devices show that the proposed design extends reliable digital OTA operation by more than 10 dB into low SNR regimes while matching or improving performance across the full SNR range. The learned decoder remains effective under message corruption and nonlinear aggregation, highlighting the broader potential of end-to-end learned design for digital OTA communication in FEEL.