Policy-Driven RF Denoising for Adaptive Geolocation FFT-Domain Filtering

We propose a policy-driven RF denoising framework in which reinforcement learning (RL)
adaptively controls FFT-domain filters to minimize timing and correlation errors in
passive geolocation. Unlike static low-pass or notch filters, the policy selects
denoising actions in real time based on the residual time-difference-of-arrival (TDoA)
error and the correlation entropy, providing a feedback loop that directly targets
physical error metrics. Experiments on synthetic RF sequences, with and without
narrowband jammers, demonstrate that the learned policies converge rapidly and
consistently outperform fixed filtering strategies, yielding a 28.6% reduction in TDoA
residuals and a 45% improvement under jamming conditions across SNR sweeps. An ablation
on the entropy weight λ confirms its role in balancing timing fidelity against spectral
purity, with optimal performance at λ = 0.5. Before/after spectrograms illustrate the
qualitative suppression of jammer tones and the restoration of signal structure. By
framing reinforcement learning as a controller for adaptive denoising, this work extends
classical signal processing with data-driven adaptability while retaining
interpretability, deployability, and tight alignment with RF timing accuracy.
Index Terms—RF signal processing, adaptive denoising, reinforcement learning, time-difference-of-arrival, geolocation, FFT filtering, jammer suppression
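
For concreteness, the sketch below illustrates one way the feedback signal described in the abstract could be formed: a per-step reward that penalizes the residual TDoA error plus λ times the correlation entropy, with an FFT-domain notch as one representative denoising action. The function names, the notch action, and the exact reward form are illustrative assumptions for exposition, not the paper's implementation.

```python
import numpy as np

def correlation_entropy(corr):
    """Shannon entropy of the normalized cross-correlation magnitude.

    A sharply peaked correlation (clean TDoA estimate) has low entropy;
    a flat, noisy correlation has high entropy.
    """
    p = np.abs(corr)
    p = p / (p.sum() + 1e-12)
    return -np.sum(p * np.log(p + 1e-12))

def apply_fft_notch(x, bins_to_zero):
    """One candidate denoising action: zero selected FFT bins (a notch).

    `bins_to_zero` is an index array chosen by the policy; this is an
    assumed action parameterization, not the paper's action space.
    """
    X = np.fft.fft(x)
    X[bins_to_zero] = 0.0
    return np.fft.ifft(X)

def step_reward(tdoa_residual, corr, lam=0.5):
    """Hypothetical reward trading timing fidelity against spectral purity.

    lam corresponds to the entropy weight λ ablated in the abstract
    (optimum reported at λ = 0.5).
    """
    return -(abs(tdoa_residual) + lam * correlation_entropy(corr))
```

A policy trained against such a reward would, at each step, observe the current TDoA residual and correlation entropy, pick filter parameters (e.g., which bins to notch), and receive `step_reward` as feedback, which is the control-loop framing the abstract describes.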