Grouped Query Attention for Subscriber Routing in Message-Oriented Middleware

We study GroupedSubscriberManager, a subscriber-routing design
inspired by grouped-query attention (GQA). Subscribers are
partitioned into groups; the set of groups for each topic is
cached, by analogy with a KV cache; groups are ordered by measured
performance (fastest first); and subscribers within a group are
ordered by priority. We quantify cache-hit ratios, group
prioritization accuracy, and end-to-end throughput under synthetic
workloads. The approach achieves cache-hit ratios of up to 90%
while preserving priority semantics and adapting group order to
real-time performance feedback.
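
The four mechanisms named above (grouping, per-topic caching of group sets, performance-ordered groups, priority-ordered subscribers) can be sketched as follows. This is a minimal illustration, not the implementation that was measured. Only the class name GroupedSubscriberManager comes from the text. The method names (`add_subscriber`, `bind`, `record_latency`, `route`), the exponential moving average used as the performance measure, the "larger priority value first" convention, and the invalidate-on-feedback cache policy are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Subscriber:
    name: str
    priority: int  # assumption: a larger value is delivered earlier


@dataclass
class Group:
    name: str
    subscribers: list = field(default_factory=list)
    avg_latency: float = 0.0  # moving average of observed latency, in seconds
    samples: int = 0


class GroupedSubscriberManager:
    """Hypothetical sketch: topics map to groups, groups hold subscribers."""

    def __init__(self, alpha=0.2):
        self.alpha = alpha                    # smoothing factor (assumed)
        self.groups = {}                      # group name -> Group
        self.topic_groups = defaultdict(set)  # topic -> set of group names
        self.cache = {}                       # topic -> ordered group names
        self.hits = 0
        self.misses = 0

    def add_subscriber(self, group_name, subscriber):
        group = self.groups.setdefault(group_name, Group(group_name))
        group.subscribers.append(subscriber)
        # Keep each group sorted by descending priority (stable for ties).
        group.subscribers.sort(key=lambda s: -s.priority)

    def bind(self, topic, group_name):
        self.groups.setdefault(group_name, Group(group_name))
        self.topic_groups[topic].add(group_name)
        self.cache.pop(topic, None)  # the group set for this topic changed

    def record_latency(self, group_name, seconds):
        g = self.groups[group_name]
        if g.samples == 0:
            g.avg_latency = seconds
        else:
            g.avg_latency = (1 - self.alpha) * g.avg_latency + self.alpha * seconds
        g.samples += 1
        # Drop cached orderings that contain this group so they are rebuilt.
        for topic, names in self.topic_groups.items():
            if group_name in names:
                self.cache.pop(topic, None)

    def route(self, topic):
        """Return subscriber names in delivery order for one topic."""
        order = self.cache.get(topic)
        if order is None:
            self.misses += 1
            # Faster groups first; the name breaks ties deterministically.
            order = sorted(
                self.topic_groups.get(topic, ()),
                key=lambda n: (self.groups[n].avg_latency, n),
            )
            self.cache[topic] = order
        else:
            self.hits += 1
        return [s.name for n in order for s in self.groups[n].subscribers]

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

In this sketch every latency report invalidates the affected topics, so the hit ratio depends on how often feedback arrives relative to how often messages are routed. A design that re-sorts groups on a timer or only when the ordering would actually change would trade freshness for a higher hit ratio; which policy the measured system uses is not stated here.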
