{"id":5259,"date":"2026-03-28T17:18:59","date_gmt":"2026-03-28T17:18:59","guid":{"rendered":"https:\/\/arapt.us\/?p=5259"},"modified":"2026-03-28T17:18:59","modified_gmt":"2026-03-28T17:18:59","slug":"nerfengine-stage-8-protocol-intelligence-memory-compression-and-the-intelligence-loop","status":"publish","type":"post","link":"https:\/\/172-234-197-23.ip.linodeusercontent.com\/?p=5259","title":{"rendered":"NerfEngine Stage 8: Protocol Intelligence, Memory Compression, and the Intelligence Loop"},"content":{"rendered":"\n<p># There&#8217;s a specific moment in a system&#8217;s life where the question stops being &#8220;does it work?&#8221; and starts being &#8220;does it think?&#8221; \u00a0Stage 8 is that moment for SCYTHE.<\/p>\n\n\n\n<p>This release didn&#8217;t add more sensors. It didn&#8217;t add more endpoints. It made the system aware of *what the network is supposed to be doing* \u2014 and therefore hyper-sensitive to everything it isn&#8217;t.<\/p>\n\n\n\n<p>&#8212;<\/p>\n\n\n\n<p>## The Problem We Were Really Solving<\/p>\n\n\n\n<p>Stage 7 left us with a speculative graph that could recognise semantic similarity between entities, promote edges that accumulated enough evidence, and stream live intelligence to the operator dashboard. That&#8217;s good. But there was a structural gap.<\/p>\n\n\n\n<p>The system was observing **transport behavior** without understanding **protocol intent**.<\/p>\n\n\n\n<p>nDPI was scanning 655+ sessions per PCAP and coming back with `{&#8216;dns_names&#8217;: 2, &#8216;tls_snis&#8217;: 0, &#8216;http_hosts&#8217;: 0}`. Nearly blind. When you see port 443 traffic with no TLS SNI, that&#8217;s not a nDPI failure \u2014 it&#8217;s a **signal**. 
When port 53 traffic has average frame sizes above 300 bytes, that&#8217;s not noise \u2014 it&#8217;s a DNS tunnel.<\/p>\n\n\n\n<p>The difference between a system that drops that information and a system that uses it is the difference between an IDS and an intelligence platform.<\/p>\n\n\n\n<p>Three more cracks were showing alongside this:<\/p>\n\n\n\n<p>1. **Similarity search was tied to FAISS rebuild cycles.** Every streaming insert that needed neighbour lookup was competing with index state. There was no true online path.<\/p>\n\n\n\n<p>2. **The attention engine was scoring edges on four components**, but none of them knew whether the underlying traffic was violating protocol expectations. A port-443 session with no TLS SNI had the same anomaly weight as clean HTTPS.<\/p>\n\n\n\n<p>3. **SSE stream clients had no gap detection.** If a client missed five promotions during a reconnect, it silently fell behind. The dashboard looked live but wasn&#8217;t.<\/p>\n\n\n\n<p>&#8212;<\/p>\n\n\n\n<p>## Protocol Expectation Intelligence<\/p>\n\n\n\n<p>The core insight from this stage: **IANA port assignments are a behavioral contract**. Port 443 promises TLS. Port 53 promises small DNS frames. Port 123 promises 76-byte NTP packets. 
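<\/p>

<p>As a minimal sketch (hypothetical names; the real `protocol_intel.py` table is richer), that contract reads as a small per-port expectation map:<\/p>

```python
# Hypothetical sketch of an IANA "behavioral contract" table: each entry
# states what well-behaved traffic on that port is supposed to look like.
PORT_CONTRACTS = {
    443: {"transport": "tcp", "expects_tls": True},
    53:  {"transport": "udp", "max_avg_frame": 300},  # DNS frames stay small
    123: {"transport": "udp", "max_avg_frame": 76},   # NTP packets are fixed-size
}

def violates_contract(port: int, transport: str, avg_frame: float, has_tls: bool) -> bool:
    """Return True if an observed session breaks its port's contract."""
    contract = PORT_CONTRACTS.get(port)
    if contract is None:
        return False  # no expectation registered for this port
    if contract["transport"] != transport:
        return True   # e.g. TCP where only UDP is expected (wrong_transport)
    if contract.get("expects_tls") and not has_tls:
        return True   # e.g. port 443 with no TLS handshake (missing_tls)
    if "max_avg_frame" in contract and avg_frame > contract["max_avg_frame"]:
        return True   # e.g. oversized DNS or NTP frames (dns_tunnel, oversized_ntp)
    return False
```

<p>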
When a session breaks that contract, the violation *is* the signal \u2014 no signature matching required.<\/p>\n\n\n\n<p>We implemented `protocol_intel.py` \u2014 a stateless, data-oblivious IANA violation scorer.<\/p>\n\n\n\n<p>### What it does<\/p>\n\n\n\n<p>Nine violation detectors, each tuned to a specific class of contract breach:<\/p>\n\n\n\n<p>| Violation | Trigger | Score |<\/p>\n\n\n\n<p>|&#8212;|&#8212;|&#8212;|<\/p>\n\n\n\n<p>| `missing_tls` | Port 443\/993\/995 with no TLS SNI observed | 0.35 |<\/p>\n\n\n\n<p>| `dns_tunnel` | Port 53 avg frame &gt; 300B | 0.55 |<\/p>\n\n\n\n<p>| `oversized_ntp` | Port 123 avg frame &gt; 76B | 0.50 |<\/p>\n\n\n\n<p>| `wrong_transport` | TCP on UDP-only port or vice versa | 0.45 |<\/p>\n\n\n\n<p>| `unexpected_dns` | DNS payload on a non-DNS port | 0.40 |<\/p>\n\n\n\n<p>| `constant_size_c2` | Coefficient of variation &lt; 5% on any variable-size port | 0.40 |<\/p>\n\n\n\n<p>| `tcp_syn_only` | SYN with no ACK or RST (half-open probe \/ scan) | 0.30 |<\/p>\n\n\n\n<p>| `tcp_rst_flood` | RST flood without SYN\/ACK context | 0.35 |<\/p>\n\n\n\n<p>| `risk_port` | Known-malicious ports: 4444, 31337, 6667, 9001, 5555 | dynamic |<\/p>\n\n\n\n<p>The scorer accepts both `pcap_ingest.SessionData` objects and plain dicts, so it works at every layer of the pipeline \u2014 PCAP ingest, live stream events, and shadow edge metadata.<\/p>\n\n\n\n<p>### What this unlocks<\/p>\n\n\n\n<p>This is the detection class that signature-based systems miss entirely:<\/p>\n\n\n\n<p>&#8220;`<\/p>\n\n\n\n<p>Port 443 TCP<\/p>\n\n\n\n<p>\u2193<\/p>\n\n\n\n<p>No TLS ClientHello observed<\/p>\n\n\n\n<p>\u2193<\/p>\n\n\n\n<p>Score: 0.35 (missing_tls)<\/p>\n\n\n\n<p>Same actor, port 53 UDP<\/p>\n\n\n\n<p>\u2193<\/p>\n\n\n\n<p>Avg frame: 412 bytes<\/p>\n\n\n\n<p>\u2193<\/p>\n\n\n\n<p>Score: 0.55 (dns_tunnel)<\/p>\n\n\n\n<p>Combined: 0.90 \u2192 HOT tier immediately<\/p>\n\n\n\n<p>&#8220;`<\/p>\n\n\n\n<p>Neither of those sessions looks malicious by payload. 
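<\/p>

<p>Those two detectors condense to a few lines (a hypothetical condensation, not the real `protocol_intel.py` code; scores match the table above):<\/p>

```python
# Hypothetical condensation of two of the nine violation detectors.
# Scores follow the table: missing_tls = 0.35, dns_tunnel = 0.55.
def score_session(session: dict) -> tuple[float, list[str]]:
    """Return (combined anomaly score, violation names) for one session."""
    score, violations = 0.0, []
    port = session.get("dst_port")
    if port in (443, 993, 995) and not session.get("tls_sni"):
        score += 0.35
        violations.append("missing_tls")
    if port == 53 and session.get("avg_frame_bytes", 0) > 300:
        score += 0.55
        violations.append("dns_tunnel")
    return min(score, 1.0), violations
```

<p>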
You can&#8217;t fingerprint them. But the *behavior* violates protocol expectations \u2014 and now the system knows.<\/p>\n\n\n\n<p>### Wired everywhere<\/p>\n\n\n\n<p>`protocol_anomaly_score` and `protocol_violations` are now stamped onto:<\/p>\n\n\n\n<p>&#8211; Every session node in the hypergraph (via `pcap_ingest.emit_session()`)<\/p>\n\n\n\n<p>&#8211; Every speculative edge from the live ingest worker<\/p>\n\n\n\n<p>&#8211; The attention score for every edge in ShadowGraph<\/p>\n\n\n\n<p>&#8212;<\/p>\n\n\n\n<p>## Attention Engine: Five-Component Scoring<\/p>\n\n\n\n<p>The original attention formula had four weights summing to 1.0:<\/p>\n\n\n\n<p>&#8220;`<\/p>\n\n\n\n<p>attention = conf(0.40) + evidence(0.30) + recency(0.20) + proxy_anomaly(0.10)<\/p>\n\n\n\n<p>&#8220;`<\/p>\n\n\n\n<p>The `proxy_anomaly` term was an approximation \u2014 it used observation count and unmet requires as a stand-in for true anomaly signal.<\/p>\n\n\n\n<p>Now there are five:<\/p>\n\n\n\n<p>&#8220;`<\/p>\n\n\n\n<p>attention = conf(0.37) + evidence(0.28) + recency(0.18) + proxy_anomaly(0.07) + protocol_anomaly(0.10)<\/p>\n\n\n\n<p>&#8220;`<\/p>\n\n\n\n<p>`W_PROTO = 0.10` pulls directly from `ProtocolIntel` violation scores. `AttentionResult` now carries a `protocol_anomaly` field, and the engine looks for the score in three dict paths \u2014 the session node labels, the edge context, and the top-level edge dict \u2014 so it fires regardless of which pipeline stage produced the edge.<\/p>\n\n\n\n<p>The effect: a port-443 session with no TLS and constant-size packets that was previously `WARM` tier is now `HOT` on first observation. No waiting for observation accumulation.<\/p>\n\n\n\n<p>&#8212;<\/p>\n\n\n\n<p>## TurboQuant: Eliminating the Index Rebuild Problem<\/p>\n\n\n\n<p>The similarity search architecture had a fundamental mismatch with streaming ingest. FAISS `IndexFlatL2` is excellent for batch workloads \u2014 you build an index, you query it. 
For streaming, you&#8217;re constantly inserting while querying. Every new entity is a potential search target immediately, but FAISS has no true online insert path; each `add()` modifies state that concurrent searches read under a shared lock.<\/p>\n\n\n\n<p>We evaluated the paper *&#8221;TurboQuant: Near-Optimal Vector Quantization&#8221;* (arXiv 2504.19874) and found it maps precisely to this workload.<\/p>\n\n\n\n<p>### What TurboQuant does<\/p>\n\n\n\n<p>Three steps, done once per vector:<\/p>\n\n\n\n<p>1. **Random rotation** \u2014 statistically independent coordinates after QR decomposition<\/p>\n\n\n\n<p>2. **Scalar quantization** \u2014 optimal per-coordinate quantization against a pre-computed Beta distribution codebook<\/p>\n\n\n\n<p>3. **QJL correction** (TurboQuantIP) \u2014 1-bit residual quantization that removes inner product bias introduced by Stage 1<\/p>\n\n\n\n<p>The result: inner-product-optimal compression at 3 bits per dimension with near-zero indexing time.<\/p>\n\n\n\n<p>### TurboQuantStore<\/p>\n\n\n\n<p>We built `turbo_quant_store.py` as a thread-safe streaming vector store wrapping TurboQuantIP:<\/p>\n\n\n\n<p>&#8220;`<\/p>\n\n\n\n<p>add(entity_id, vec) &nbsp; &nbsp;\u2192 encode \u2192 append to fp16 dense cache &nbsp; &nbsp;O(dim)<\/p>\n\n\n\n<p>search(query, k=10) &nbsp; &nbsp;\u2192 normalize \u2192 fp16 matmul vs dense cache &nbsp;O(N\u00b7dim\/16)<\/p>\n\n\n\n<p>&#8220;`<\/p>\n\n\n\n<p>The fp16 dense matrix is the search primitive \u2014 no graph structure, no index rebuild, no lock contention. 
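<\/p>

<p>Sketched with NumPy for portability (the real `TurboQuantStore` keeps the cache as a torch fp16 tensor and searches with `torch.mm`; the names here are hypothetical):<\/p>

```python
import numpy as np

# Minimal sketch of a streaming dense-cache search: append unit vectors to a
# fp16 matrix, search with one matrix product. No index build, no rebuild.
class DenseFp16Cache:
    def __init__(self, dim: int):
        self.ids: list[str] = []
        self.mat = np.empty((0, dim), dtype=np.float16)  # the search primitive

    def add(self, entity_id: str, vec: np.ndarray) -> None:
        v = vec.astype(np.float32)
        v /= np.linalg.norm(v)                            # unit-normalize once
        self.mat = np.vstack([self.mat, v.astype(np.float16)])
        self.ids.append(entity_id)

    def search(self, query: np.ndarray, k: int = 10) -> list[tuple[str, float]]:
        q = query.astype(np.float32)
        q /= np.linalg.norm(q)
        sims = self.mat.astype(np.float32) @ q            # cosine sims in one matmul
        order = np.argsort(-sims)[: min(k, len(self.ids))]
        return [(self.ids[i], float(sims[i])) for i in order]
```

<p>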
Pure `torch.mm`.<\/p>\n\n\n\n<p>**Benchmarks on this hardware:**<\/p>\n\n\n\n<p>| Metric | Value |<\/p>\n\n\n\n<p>|&#8212;|&#8212;|<\/p>\n\n\n\n<p>| Encode 10,000 vectors | 198ms (one-time, not per-query) |<\/p>\n\n\n\n<p>| Search top-20 over 10,000 vecs | 1.54ms |<\/p>\n\n\n\n<p>| Search top-10 over 100 vecs | 0.03ms |<\/p>\n\n\n\n<p>| Memory per vector (fp16 active) | 1.5KB vs 3KB fp32 (2\u00d7 compression) |<\/p>\n\n\n\n<p>| Compression vs fp32 codes | 7.5MB for 10k vecs vs 30MB fp32 |<\/p>\n\n\n\n<p>The numpy 2.x compatibility shim (`np.trapz \u2192 np.trapezoid`) is applied at import time so downstream code is unaffected.<\/p>\n\n\n\n<p>### Wired into SemanticShadow<\/p>\n\n\n\n<p>TurboQuantStore is now the **primary similarity search backend** in `semantic_shadow.py`:<\/p>\n\n\n\n<p>&#8220;`<\/p>\n\n\n\n<p>_embed_with_delta()<\/p>\n\n\n\n<p>&nbsp; \u2192 blend alpha=0.80 (identity continuity)<\/p>\n\n\n\n<p>&nbsp; \u2192 _tq_store.add(entity_id, blended_vec) &nbsp; \u2190 mirrors into fp16 cache<\/p>\n\n\n\n<p>process_entity()<\/p>\n\n\n\n<p>&nbsp; \u2192 if len(_tq_store) &gt; 1:<\/p>\n\n\n\n<p>&nbsp; &nbsp; &nbsp; &nbsp; results = _tq_store.search(vec, k=15) &nbsp; \u2190 TQ primary<\/p>\n\n\n\n<p>&nbsp; &nbsp; else:<\/p>\n\n\n\n<p>&nbsp; &nbsp; &nbsp; &nbsp; results = ee.search_similar(&#8230;) &nbsp; &nbsp; &nbsp; &nbsp; \u2190 FAISS fallback<\/p>\n\n\n\n<p>get_pca_coords()<\/p>\n\n\n\n<p>&nbsp; \u2192 reads from _entity_vecs directly (fp32, always most current)<\/p>\n\n\n\n<p>&nbsp; \u2192 no longer loops through FAISS reconstruct()<\/p>\n\n\n\n<p>&#8220;`<\/p>\n\n\n\n<p>FAISS remains as the cold-start fallback and persistence layer. TurboQuant handles all hot-path similarity.<\/p>\n\n\n\n<p>&#8212;<\/p>\n\n\n\n<p>## Delta Embeddings + Identity Continuity<\/p>\n\n\n\n<p>One of the deeper problems in network intelligence: the same actor rotates IPs, changes ports, cycles ASNs. Treated as independent entities, they look like noise. 
Treated as a single evolving identity, they&#8217;re a pattern.<\/p>\n\n\n\n<p>`_embed_with_delta()` applies exponential identity blending:<\/p>\n\n\n\n<p>&#8220;`python<\/p>\n\n\n\n<p>blended = 0.80 * old_vec + 0.20 * fresh_embed<\/p>\n\n\n\n<p>blended \/= norm(blended) &nbsp; # re-normalise to unit sphere<\/p>\n\n\n\n<p>&#8220;`<\/p>\n\n\n\n<p>Alpha=0.80 means each new observation shifts identity by only 20%. An entity&#8217;s embedding is a *weighted average of its entire observation history*, decaying toward the present. This is the LLM KV-cache insight applied to graph entities: you don&#8217;t discard context, you blend it.<\/p>\n\n\n\n<p>Combined with TurboQuantStore, this gives us:<\/p>\n\n\n\n<p>&#8211; Rotating IPs that share behavioral context \u2192 near-identical blended vecs \u2192 high cosine similarity \u2192 speculative edge<\/p>\n\n\n\n<p>&#8211; Different actors on the same port \u2192 diverging blended vecs \u2192 no spurious edge<\/p>\n\n\n\n<p>&#8212;<\/p>\n\n\n\n<p>## SSE Hardening: Sequence Numbers and Backpressure<\/p>\n\n\n\n<p>The `\/stream\/speculative` SSE endpoint now has production-grade delivery guarantees:<\/p>\n\n\n\n<p>**Sequence numbers.** Every event carries `id: {seq}`. On reconnect, the client sends `Last-Event-ID: N` and the server checks the gap. If `current_seq - last_seen &gt; 1`, a synthetic `_event: resync` fires immediately and the client re-bootstraps from `\/api\/shadow\/edges`.<\/p>\n\n\n\n<p>**Bounded subscriber queues.** Each connected client gets an `_SseSubscriber` with a `queue.Queue(maxsize=500)`. Slow clients can&#8217;t block fast ones. `drop_count` is incremented on overflow and reported in heartbeat frames.<\/p>\n\n\n\n<p>**Deadlock-safe sequence counters.** The sequence counter uses a dedicated `_seq_lock` separate from the graph `_lock`. 
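<\/p>

<p>A schematic sketch of the two-lock split (not the actual ShadowGraph code; names are illustrative):<\/p>

```python
import threading

# Schematic two-lock pattern: graph mutations take _lock, while the SSE
# sequence counter is guarded only by its own _seq_lock, so stamping an
# event from inside the graph lock never re-acquires _lock.
class ShadowGraphSketch:
    def __init__(self) -> None:
        self._lock = threading.Lock()      # guards graph state
        self._seq_lock = threading.Lock()  # guards only the event sequence
        self._seq = 0
        self.edges: list = []

    def _notify_delta(self) -> int:
        with self._seq_lock:               # independent of _lock
            self._seq += 1
            return self._seq

    def push(self, edge) -> int:
        with self._lock:                   # holding the graph lock...
            self.edges.append(edge)
            return self._notify_delta()    # ...is safe: only _seq_lock is taken
```

<p>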
This avoids the deadlock where `push()` holds `_lock` while `_notify_delta()` tries to acquire it to increment `_seq`.<\/p>\n\n\n\n<p>**Browser-side gap handling.** The frontend&#8217;s `_watchPromotions()` handles `resync` (full re-bootstrap from REST) and `heartbeat` (drop_count warning banner if &gt; 0). Reconnect delay is advertised in `retry: 3000`.<\/p>\n\n\n\n<p>&#8212;<\/p>\n\n\n\n<p>## MMR Neighbor Selection<\/p>\n\n\n\n<p>The speculative graph was accumulating redundant cluster links. A CDN subnet with 50 near-identical IPs was generating O(n\u00b2) cross-links \u2014 semantically useless, computationally expensive.<\/p>\n\n\n\n<p>**Maximal Marginal Relevance (MMR)** selects diverse-yet-relevant neighbors from the candidate pool:<\/p>\n\n\n\n<p>&#8220;`<\/p>\n\n\n\n<p>MMR = \u03bb \u00d7 similarity_to_query<\/p>\n\n\n\n<p>&nbsp; &nbsp; \u2212 (1\u2212\u03bb) \u00d7 max_similarity_to_already_selected<\/p>\n\n\n\n<p>&#8220;`<\/p>\n\n\n\n<p>With \u03bb=0.55, the algorithm is slightly relevance-biased but actively avoids selecting candidates that are redundant with those already chosen. MAX_NEIGHBORS=5 per entity instead of O(n). CDN subnets that previously flooded the graph with 50-node clusters now contribute 5 structurally diverse speculative edges.<\/p>\n\n\n\n<p>&#8212;<\/p>\n\n\n\n<p>## Ollama Remote: Embedding at Network Scale<\/p>\n\n\n\n<p>The embedding model (`nomic-embed-text`, 768-dim) now runs on `neurosphere` \u2014 an AlmaLinux 9 VM with an RTX 3060 (12GB VRAM, CUDA compute capability 8.6). Ollama is bound to `0.0.0.0:11434` via systemd override and reachable across the LAN at `192.168.1.185`.<\/p>\n\n\n\n<p>`scythe_orchestrator.py` accepts `--ollama-url` and propagates `OLLAMA_URL` to all spawned subprocesses. Cold-load time for `nomic-embed-text` from GPU eviction: ~8 seconds. The embedding throughput is now network I\/O bound, not compute bound.<\/p>\n\n\n\n<p>FAISS AVX2 runs locally (Python process on WSL2). 
The separation is correct: GPU embeds, CPU indexes and searches.<\/p>\n\n\n\n<p>&#8212;<\/p>\n\n\n\n<p>## BehavioralFingerprint: The Foundation for Cross-Domain Tracking<\/p>\n\n\n\n<p>`protocol_intel.py` includes `BehavioralFingerprint` \u2014 a 22-dimensional statistical feature vector computed per session:<\/p>\n\n\n\n<p>&#8220;`<\/p>\n\n\n\n<p>[avg_pkt_bytes, std_pkt_bytes, CV, median_pkt_bytes,<\/p>\n\n\n\n<p>&nbsp;avg_iat_ms, std_iat_ms, pkt_count, total_bytes,<\/p>\n\n\n\n<p>&nbsp;duration_sec, bytes_per_sec, dst_port, src_port,<\/p>\n\n\n\n<p>&nbsp;proto_tcp, proto_udp, proto_icmp,<\/p>\n\n\n\n<p>&nbsp;has_tls, has_dns, has_http,<\/p>\n\n\n\n<p>&nbsp;tcp_flag_syn, tcp_flag_rst, tcp_flag_fin,<\/p>\n\n\n\n<p>&nbsp;anomaly_score]<\/p>\n\n\n\n<p>&#8220;`<\/p>\n\n\n\n<p>This vector is normalized to [0, 1] per dimension and designed for future fusion into delta embeddings:<\/p>\n\n\n\n<p>&#8220;`python<\/p>\n\n\n\n<p>fused = concat(text_embedding_768, fingerprint_22)<\/p>\n\n\n\n<p>&#8220;`<\/p>\n\n\n\n<p>When that fusion is complete, cosine similarity will measure *behavioral identity* \u2014 not just semantic text proximity. 
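<\/p>

<p>The fusion step itself is simple; a hedged sketch (hypothetical helper, assuming both parts are unit-normalized before concatenation):<\/p>

```python
import numpy as np

# Hypothetical sketch of the planned fusion path: normalize each part,
# concatenate, re-normalize, so cosine similarity weighs both signals.
def fuse(text_embedding: np.ndarray, fingerprint: np.ndarray) -> np.ndarray:
    t = text_embedding / np.linalg.norm(text_embedding)
    f = fingerprint / np.linalg.norm(fingerprint)
    fused = np.concatenate([t, f])        # 768 + 22 = 790 dims
    return fused / np.linalg.norm(fused)  # back to the unit sphere
```

<p>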
The same actor across different IPs, different ports, and different protocols will cluster in the fused space because their *statistical behavior* is similar.<\/p>\n\n\n\n<p>This is the &#8220;same actor, different IP&#8221; detection problem solved at the vector level.<\/p>\n\n\n\n<p>&#8212;<\/p>\n\n\n\n<p>## Deck.gl Visualization: Dual-Layer Intelligence<\/p>\n\n\n\n<p>The frontend now renders two semantically distinct flow layers simultaneously:<\/p>\n\n\n\n<p>&#8211; **Validated flows** (cyan, `conf \u2265 0.75`) \u2014 promoted edges with full evidence<\/p>\n\n\n\n<p>&#8211; **Speculative flows** (amber, pulsing alpha) \u2014 shadow graph edges still accumulating evidence<\/p>\n\n\n\n<p>&#8211; **Semantic ScatterplotLayer** (magenta ghost nodes) \u2014 PCA-projected entity embeddings anchored to geography<\/p>\n\n\n\n<p>&#8211; **Promotion flash** \u2014 white expanding ring animation when an edge crosses the promotion threshold<\/p>\n\n\n\n<p>The `\ud83d\udd00 FUSION` mode toggle cycles between validated-only, speculative-only, and both layers. 
This lets the operator distinguish confirmed intelligence from hypothesis-level signal.<\/p>\n\n\n\n<p>&#8212;<\/p>\n\n\n\n<p>## Architecture State<\/p>\n\n\n\n<p>&#8220;`<\/p>\n\n\n\n<p>Raw PCAP \/ Live Stream<\/p>\n\n\n\n<p>&nbsp; &nbsp; &nbsp; &nbsp; \u2193<\/p>\n\n\n\n<p>ProtocolIntel &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;\u2190 IANA violation scoring (NEW)<\/p>\n\n\n\n<p>BehavioralFingerprint &nbsp; &nbsp; &nbsp; &nbsp;\u2190 Statistical identity vector (NEW)<\/p>\n\n\n\n<p>&nbsp; &nbsp; &nbsp; &nbsp; \u2193<\/p>\n\n\n\n<p>pcap_ingest.emit_session() &nbsp; \u2190 Stamps protocol_anomaly_score on hypergraph nodes (NEW)<\/p>\n\n\n\n<p>&nbsp; &nbsp; &nbsp; &nbsp; \u2193<\/p>\n\n\n\n<p>EmbeddingEngine (Ollama) &nbsp; &nbsp; \u2190 nomic-embed-text 768-dim on neurosphere RTX 3060<\/p>\n\n\n\n<p>_embed_with_delta() &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;\u2190 \u03b1=0.80 identity blending (NEW)<\/p>\n\n\n\n<p>&nbsp; &nbsp; &nbsp; &nbsp; \u2193<\/p>\n\n\n\n<p>TurboQuantStore &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;\u2190 3-bit fp16 streaming index, 0.03ms search (NEW)<\/p>\n\n\n\n<p>&nbsp; &nbsp; &nbsp; &nbsp; \u2193<\/p>\n\n\n\n<p>SemanticShadow.process_entity()<\/p>\n\n\n\n<p>&nbsp; \u2192 TurboQuant search &nbsp; &nbsp; &nbsp; &nbsp;\u2190 Primary (NEW)<\/p>\n\n\n\n<p>&nbsp; \u2192 FAISS search &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; \u2190 Fallback<\/p>\n\n\n\n<p>&nbsp; \u2192 MMR selection &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;\u2190 Diversity filter (NEW)<\/p>\n\n\n\n<p>&nbsp; \u2192 Temporal decay &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; \u2190 Age-weighted bumps (NEW)<\/p>\n\n\n\n<p>&nbsp; &nbsp; &nbsp; &nbsp; \u2193<\/p>\n\n\n\n<p>ShadowGraph &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;\u2190 Speculative edge store<\/p>\n\n\n\n<p>&nbsp; \u2192 AttentionEngine &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;\u2190 5-component scoring incl. 
W_PROTO (NEW)<\/p>\n\n\n\n<p>&nbsp; \u2192 SSE stream &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; \u2190 Seq numbers + resync + backpressure (NEW)<\/p>\n\n\n\n<p>&nbsp; &nbsp; &nbsp; &nbsp; \u2193<\/p>\n\n\n\n<p>Deck.gl Frontend &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; \u2190 Dual-layer validated\/speculative rendering (NEW)<\/p>\n\n\n\n<p>&#8220;`<\/p>\n\n\n\n<p>&#8212;<\/p>\n\n\n\n<p>## What&#8217;s Next<\/p>\n\n\n\n<p>The `BehavioralFingerprint` 22-dim vector is ready. The TurboQuantStore `fingerprint_store()` singleton is waiting. The missing step is the fusion path:<\/p>\n\n\n\n<p>&#8220;`python<\/p>\n\n\n\n<p>fused = concat(embed_768, fingerprint_22) &nbsp;# 790-dim<\/p>\n\n\n\n<p>_tq_store.add(entity_id, fused)<\/p>\n\n\n\n<p>&#8220;`<\/p>\n\n\n\n<p>Once that&#8217;s wired, the similarity search finds actors by *what they do*, not just *what they&#8217;re described as*. That closes the &#8220;same actor, different IP&#8221; loop.<\/p>\n\n\n\n<p>The Android ScytheCommandApp is the other frontier \u2014 all of these intelligence layers need a mobile tactical interface that can run disconnected from the LAN and resync over Tailscale. The project scaffold, MainActivity, and mDNS discovery layer are next.<\/p>\n\n\n\n<p>The system now has:<\/p>\n\n\n\n<p>&#8211; A memory that forgets slowly (delta embeddings)<\/p>\n\n\n\n<p>&#8211; A sensor that detects intent violations (protocol_intel)<\/p>\n\n\n\n<p>&#8211; A compressor that thinks at cache speed (TurboQuant)<\/p>\n\n\n\n<p>&#8211; An attention system that knows what matters (AttentionEngine)<\/p>\n\n\n\n<p>The next stage is giving it a hand.<\/p>\n\n\n\n<p>&#8212;<\/p>\n\n\n\n<p>*NerfEngine is a local-first, RF-aware tactical intelligence platform. All inference runs on local hardware. 
No data leaves the edge.*<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img data-opt-id=378798246  fetchpriority=\"high\" decoding=\"async\" width=\"920\" height=\"811\" src=\"https:\/\/ml6vmqguit1n.i.optimole.com\/w:auto\/h:auto\/q:mauto\/f:best\/https:\/\/arapt.us\/wp-content\/uploads\/2026\/03\/image-8.png\" alt=\"\" class=\"wp-image-5260\" srcset=\"https:\/\/ml6vmqguit1n.i.optimole.com\/w:920\/h:811\/q:mauto\/f:best\/https:\/\/172-234-197-23.ip.linodeusercontent.com\/wp-content\/uploads\/2026\/03\/image-8.png 920w, https:\/\/ml6vmqguit1n.i.optimole.com\/w:300\/h:264\/q:mauto\/f:best\/https:\/\/172-234-197-23.ip.linodeusercontent.com\/wp-content\/uploads\/2026\/03\/image-8.png 300w, https:\/\/ml6vmqguit1n.i.optimole.com\/w:768\/h:677\/q:mauto\/f:best\/https:\/\/172-234-197-23.ip.linodeusercontent.com\/wp-content\/uploads\/2026\/03\/image-8.png 768w\" sizes=\"(max-width: 920px) 100vw, 920px\" \/><\/figure>\n","protected":false},"excerpt":{"rendered":"<p># There&#8217;s a specific moment in a system&#8217;s life where the question stops being &#8220;does it work?&#8221; and starts being &#8220;does it think?&#8221; \u00a0Stage 8 is that moment for SCYTHE. This release didn&#8217;t add more sensors. It didn&#8217;t add more endpoints. 
It made the system aware of *what the network is supposed to be doing*&hellip;&nbsp;<a href=\"https:\/\/172-234-197-23.ip.linodeusercontent.com\/?p=5259\" rel=\"bookmark\"><span class=\"screen-reader-text\">NerfEngine Stage 8: Protocol Intelligence, Memory Compression, and the Intelligence Loop<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":5260,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"neve_meta_sidebar":"","neve_meta_container":"","neve_meta_enable_content_width":"","neve_meta_content_width":0,"neve_meta_title_alignment":"","neve_meta_author_avatar":"","neve_post_elements_order":"","neve_meta_disable_header":"","neve_meta_disable_footer":"","neve_meta_disable_title":"","footnotes":""},"categories":[10,7],"tags":[],"class_list":["post-5259","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-signal_scythe","category-the-truben-show"],"_links":{"self":[{"href":"https:\/\/172-234-197-23.ip.linodeusercontent.com\/index.php?rest_route=\/wp\/v2\/posts\/5259","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/172-234-197-23.ip.linodeusercontent.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/172-234-197-23.ip.linodeusercontent.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/172-234-197-23.ip.linodeusercontent.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/172-234-197-23.ip.linodeusercontent.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=5259"}],"version-history":[{"count":1,"href":"https:\/\/172-234-197-23.ip.linodeusercontent.com\/index.php?rest_route=\/wp\/v2\/posts\/5259\/revisions"}],"predecessor-version":[{"id":5261,"href":"https:\/\/172-234-197-23.ip.linodeusercontent.com\/index.php?rest_route=\/wp\/v2\/posts\/5259\/revisions\/5261"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/172-234-1
97-23.ip.linodeusercontent.com\/index.php?rest_route=\/wp\/v2\/media\/5260"}],"wp:attachment":[{"href":"https:\/\/172-234-197-23.ip.linodeusercontent.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=5259"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/172-234-197-23.ip.linodeusercontent.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=5259"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/172-234-197-23.ip.linodeusercontent.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=5259"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}