RL-Driven RF Neuromodulation

We train a DQN over a discretized action space of RF power, frequency,
phase, and angle to maximize a target-state proxy while penalizing
specific absorption rate (SAR). Compared to a hand-tuned schedule
baseline, the agent improves evaluation return by 25% (median episode
return 100) and reduces state-reconstruction error to 0.05. Plots and
captions auto-sync from training logs.
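A minimal sketch of the setup described above: the four continuous
controls are discretized into a joint action set the DQN indexes into,
and the reward combines the target-state proxy with a SAR penalty. The
grid bounds, units, and `sar_weight` coefficient are illustrative
assumptions, not values from the project.

```python
import itertools
import numpy as np

# Hypothetical discretization of the 4-D control space.
# Bounds and units are placeholders for illustration only.
POWERS = np.linspace(0.1, 1.0, 4)    # normalized RF power
FREQS = np.linspace(1.0, 10.0, 4)    # frequency (arbitrary units)
PHASES = np.linspace(0.0, np.pi, 4)  # phase offset, radians
ANGLES = np.linspace(0.0, 90.0, 4)   # steering angle, degrees

# Each discrete DQN action indexes one (power, freq, phase, angle) tuple,
# giving a 4**4 = 256-way discrete action head.
ACTIONS = list(itertools.product(POWERS, FREQS, PHASES, ANGLES))


def reward(state_proxy: float, sar: float, sar_weight: float = 0.5) -> float:
    """Target-state proxy minus a weighted SAR penalty (assumed linear)."""
    return state_proxy - sar_weight * sar
```

With this shaping, the agent is pushed toward parameter settings that
raise the target-state proxy without driving SAR up proportionally; the
penalty weight trades off stimulation efficacy against tissue-heating
safety margins.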