# RF Quantum Scythe: UiPath Integration Guide
This document provides guidance on integrating the RF Quantum Scythe system with UiPath for Robotic Process Automation (RPA) workflows.
## Overview
The RF Quantum Scythe system integrates with UiPath to automate signal intelligence workflows, voice analysis, and reporting processes. The integration is built on our RPA Glue API, which exposes RESTful endpoints that UiPath robots call to reach the RF Quantum Scythe's machine learning capabilities. The design follows the paper "LMRPA: Large Language Model-Driven Efficient Robotic Process Automation for OCR" (Osama Hosam Abdellatif, Abdelrahman Nader Hassan, and Ali Hamdi; Faculty of Computer Science, MSA University, Cairo, Egypt), whose **LMRPA / UiPath RPA** pattern plugs straight into what is already built (multi-subspace FAISS + goal-aware sparsity + voice clone guard):
* **RPA as a force-multiplier for your pipelines**: robots watch folders, APIs, inboxes, and S3 buckets → hand files to your RF/voice services → collect JSON → push results to sheets/databases → generate reports. The paper's LMRPA pattern (watch → OCR → LLM structuring → report) is exactly that loop, benchmarked to beat vanilla UiPath/Automation Anywhere on high-volume OCR tasks by wide margins (e.g., ~9.8–12.7 s vs ~18–22 s per batch in their tests).
* **LLM post-processing after OCR**: let the RPA bot run Tesseract/DocTR, then send the raw text to your LLM service to normalize it to your schema (the paper's core trick). That pairs well with your **FeatureGate**: you can apply *goal-aware masks* over document embeddings (e.g., keep only invoice fields or call-detail fields) before indexing/search.
* **Throughput + explainability**: your **multi-subspace FAISS** gives instant "find-similar" with routing explanations; RPA glues it into business workflows (triage queues, exception handling, report generation).
## Getting Started
### Prerequisites
1. UiPath Studio installed (version 2023.4 or later recommended)
2. UiPath Python Activities package installed in your UiPath project
3. RF Quantum Scythe system set up and running
4. RPA Glue API service deployed
### Installation
1. Clone the UiPath Python integration package:
```bash
gh repo clone UiPath/uipath-python
```
2. Set up the RPA Glue service:
```bash
# Copy the RPA glue code to your deployment location
cp -r /home/bgilbert/editable_files/rpa_glue /path/to/deployment/
# Create a Python virtual environment
python -m venv /path/to/deployment/venv
source /path/to/deployment/venv/bin/activate
# Install dependencies
pip install fastapi uvicorn jinja2 numpy
pip install -e /path/to/RF_QUANTUM_SCYTHE
```
3. Configure environment variables:
```bash
export BANK_PATH="/path/to/ms_faiss_bank"
export SI_PATH="/path/to/SignalIntelligence"
export REPORT_OUTDIR="/path/to/reports"
export GOAL_TASK="rf_geo" # or other task-specific name
```
4. Start the RPA Glue API service:
```bash
uvicorn api.main:app --host 0.0.0.0 --port 8000
```
## UiPath Integration Points
### 1. Signal Intelligence Bank Operations
#### Adding Records to the Signal Bank
```python
# UiPath Python script activity
import requests

def add_records_to_bank(records):
    response = requests.post(
        "http://localhost:8000/bank/add",
        json={"records": records},
        headers={"Content-Type": "application/json"},
    )
    return response.json()

# Example usage
records = [
    {
        "id": "sample_123",
        "data": {"frequency": 915.25, "bandwidth": 2.0},
        "metadata": {"location": "site_alpha", "timestamp": "2025-08-19T14:30:00Z"},
    }
]
result = add_records_to_bank(records)
print(f"Added {result['added']} records to {result['bank_path']}")
```
#### Searching the Signal Bank
```python
# UiPath Python script activity
import requests

def search_bank(query, k=5):
    response = requests.post(
        "http://localhost:8000/bank/search",
        json={"query": query, "k": k},
        headers={"Content-Type": "application/json"},
    )
    return response.json()

# Example usage
query = {
    "frequency": 915.0,
    "bandwidth": 2.5,
}
result = search_bank(query, k=3)
print(f"Search completed in {result['latency_s']:.4f} seconds")
for hit in result["hits"]:
    print(f"ID: {hit['id']}, Score: {hit['score']:.4f}, Subspace: {hit['subspace']}")
```
### 2. Voice Analysis Integration
```python
# UiPath Python script activity
import requests

def analyze_voice(audio_path, ref_real=None, ref_fake=None):
    response = requests.post(
        "http://localhost:8000/voice/score",
        json={"audio_path": audio_path, "ref_real": ref_real, "ref_fake": ref_fake},
        headers={"Content-Type": "application/json"},
    )
    return response.json()

# Example usage
audio_path = "/path/to/audio_sample.wav"
ref_real = ["/path/to/known_real1.wav", "/path/to/known_real2.wav"]
ref_fake = ["/path/to/known_fake1.wav"]
result = analyze_voice(audio_path, ref_real, ref_fake)
if "result" in result:
    print(f"Fake probability: {result['result']['fake_prob']:.4f}")
    print(f"Confidence score: {result['result']['confidence']:.4f}")
else:
    print("Voice analysis not available")
```
### 3. Generating Reports
```python
# UiPath Python script activity
import requests

def generate_rf_report(record, neighbors, explain, title="RF Similarity Report"):
    response = requests.post(
        "http://localhost:8000/report/rf",
        json={"record": record, "neighbors": neighbors, "explain": explain, "title": title},
        headers={"Content-Type": "application/json"},
    )
    return response.json()

# Example usage - after performing a search
search_result = search_bank(query, k=5)
report = generate_rf_report(
    record={"query": query},
    neighbors=search_result["hits"],
    explain=search_result["explain"],
    title="Automated RF Analysis Report",
)
print(f"Report generated: {report['report_path']}")
```
## UiPath Workflow Examples
### 1. RF SCYTHE ops (sweeps → index → report)
**Goal:** every sweep auto-refreshes the bank, surfaces lookalikes, and ships a report without a human tap.
**RPA flow pattern:**
watch `sweep_reports/` → parse new `*_summary.json` → POST to **/bank/add** → run **/bank/search** for the top-K per hit → format a PDF/Word and a CSV → drop in `outbox/` (and/or email). A headless Python sketch of this loop follows the robot action list below.
```bash
# service bootstrap (Ubuntu)
source ~/rf_quantum_env/bin/activate
export PYTHONPATH=~/NerfEngine/RF_QUANTUM_SCYTHE:$PYTHONPATH
uvicorn api.main:app --host 0.0.0.0 --port 8000
```
**UiPath robot actions:**
1. File trigger on `sweep_reports/`
2. HTTP Request → `POST http://localhost:8000/bank/add` (new records)
3. Loop each record → `POST /bank/search` with `"k": 5` in the JSON body → collect explanations (subspace, responsibilities, whitening on/off)
4. Generate report with charts using UiPath Document Processing activities
5. Save to `outbox/` and/or email stakeholders
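For a dry run without UiPath, a minimal Python sketch of the same loop is below. It is illustrative only: it assumes the `/bank/add`, `/bank/search`, and `/report/rf` endpoints described above on port 8000, a local `sweep_reports/` folder, and polling in place of a real file trigger.
```python
# watch_sweeps.py - headless stand-in for the UiPath file trigger (illustrative sketch)
import glob
import json
import time

import requests

API = "http://localhost:8000"
seen = set()

while True:
    for path in glob.glob("sweep_reports/*_summary.json"):
        if path in seen:
            continue
        seen.add(path)
        with open(path) as f:
            records = json.load(f)
        # Step 2: add the new records to the bank
        requests.post(f"{API}/bank/add", json={"records": records}).raise_for_status()
        # Steps 3-4: top-K lookalikes per record, then render a report
        for rec in records:
            res = requests.post(f"{API}/bank/search", json={"query": rec, "k": 5}).json()
            requests.post(f"{API}/report/rf", json={
                "record": rec,
                "neighbors": res["hits"],
                "explain": res["explain"],
                "title": f"Sweep report for {rec.get('id', 'unknown')}",
            })
    time.sleep(5)  # poll interval; UiPath's file trigger replaces this loop
```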
### 2. Voice Authentication Workflow
This workflow:
1. Receives voice samples from a user interface or watched folder
2. Analyzes the voice using RF Quantum Scythe's voice clone detection
3. Applies a decision tree based on confidence scores (sketched below)
4. Triggers appropriate business processes based on authentication results
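A sketch of steps 3-4, assuming the glue service's voice endpoint on port 8000; the thresholds (0.8 and 0.5) and routing labels are hypothetical and would be tuned per deployment.
```python
# voice_auth_decision.py - illustrative decision tree over the fused clone score
import requests

def authenticate(audio_path: str) -> str:
    res = requests.post(
        "http://localhost:8000/voice/score",
        json={"audio_path": audio_path},
    ).json()
    if "result" not in res:
        return "manual_review"  # voice pipeline unavailable: punt to a human
    fake_prob = res["result"]["fake_prob"]
    if fake_prob >= 0.8:       # hypothetical threshold
        return "reject"        # likely clone: block and alert the fraud queue
    if fake_prob >= 0.5:       # hypothetical threshold
        return "step_up_auth"  # ambiguous: trigger secondary verification
    return "accept"            # low clone probability: continue the process

print(authenticate("/path/to/audio_sample.wav"))
```
The returned label maps directly onto a UiPath Switch activity, so the business-process branch stays in the workflow while the scoring stays in the service.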
## Goal-Aware Sparsity Integration
The RF Quantum Scythe system supports goal-aware sparsity, which can be configured through the RPA Glue API. This allows UiPath workflows to leverage task-specific feature masking for improved performance and accuracy.
```python
# UiPath Python script activity
import requests
import numpy as np

def learn_task_mask(task_name, X, y, mode="soft"):
    # X: feature matrix, y: labels (0/1)
    response = requests.post(
        "http://localhost:8000/bank/learn_mask",
        json={
            "task": task_name,
            "features": X.tolist(),
            "labels": y.tolist(),
            "mode": mode,
            "config": {
                "C": 0.5,
                "top_frac": 0.3,
                "min_keep": 32,
            },
        },
        headers={"Content-Type": "application/json"},
    )
    return response.json()

# Example usage
X = np.load("/path/to/features.npy")
y = np.load("/path/to/labels.npy")
result = learn_task_mask("rf_geo", X, y, mode="soft")
print(f"Learned mask with {result['kept_dims']} active dimensions")
```
## Best Practices
1. **Error Handling**: Always include proper error handling in your UiPath workflows when interacting with the API (see the sketch after this list).
2. **Authentication**: For production deployments, implement proper authentication for the RPA Glue API.
3. **Logging**: Configure logging for both UiPath workflows and the RPA Glue API for troubleshooting.
4. **Resource Management**: Monitor resource usage, especially when processing large signal datasets.
5. **Parallel Processing**: Use UiPath's parallel activity for processing multiple signals simultaneously.
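As a sketch of practice 1, the snippet below hardens an API call with retries and timeouts; the retry counts, backoff, and timeout values are illustrative defaults, not requirements.
```python
# resilient_client.py - illustrative error handling for UiPath Python activities
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
retries = Retry(total=3, backoff_factor=0.5, status_forcelist=[502, 503, 504])
session.mount("http://", HTTPAdapter(max_retries=retries))

def safe_search(query: dict, k: int = 5) -> dict:
    try:
        resp = session.post(
            "http://localhost:8000/bank/search",
            json={"query": query, "k": k},
            timeout=(3.0, 30.0),  # (connect, read) seconds
        )
        resp.raise_for_status()
        return resp.json()
    except requests.RequestException as exc:
        # Return a structured error so the UiPath workflow can branch on it
        return {"error": str(exc), "hits": []}
```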
## Advanced Configuration
### Environment Variables
| Variable | Description | Default |
|----------|-------------|---------|
| BANK_PATH | Path to the multi-subspace FAISS index | /tmp/ms_faiss_bank |
| SI_PATH | Path to SignalIntelligence module | "" |
| N_SUBSPACES | Number of subspaces for the index | 3 |
| METHOD | Clustering method (bgmm, kmeans) | bgmm |
| WARMUP_MIN_POINTS | Minimum points for warmup | 30 |
| TOP_M_SUBSPACES | Top subspaces to query | 2 |
| WHITEN_ENABLE | Enable whitening | true |
| GOAL_SPARSE_ENABLE | Enable goal-aware sparsity | true |
| GOAL_TASK | Task name for goal-aware sparsity | rf_geo |
| REPORT_OUTDIR | Output directory for reports | /tmp/reports |
## Troubleshooting
### Common Issues and Solutions
1. **API Connection Issues**
- Ensure the RPA Glue API service is running
- Check network connectivity between UiPath robot and API server
- Verify port is not blocked by firewall
2. **Python Environment Problems**
- Ensure all dependencies are installed
- Check Python version compatibility (3.9+ recommended)
3. **Performance Bottlenecks**
- Consider enabling goal-aware sparsity for faster processing
- Split large batch operations into smaller chunks
- Monitor memory usage when processing large datasets
## Support and Resources
- RF Quantum Scythe Documentation: [Link]
- UiPath Documentation: [https://docs.uipath.com/](https://docs.uipath.com/)
- UiPath Python Activities: [https://docs.uipath.com/activities/docs/python-scope](https://docs.uipath.com/activities/docs/python-scope)
## Additional Workflow Patterns
The same LMRPA loop extends to two more pipelines beyond the workflows above.
> Why these patterns will be fast: the LMRPA results show that RPA plus a specialized post-processor removes overhead and roughly halves wall-clock time versus generic UiPath flows on OCR-heavy pipelines; your RF path is even lighter (no OCR), so the same design pattern wins on latency.
### Voice-clone guard (ingest → chunk → detect → evidence pack)
**Goal:** you drop an audio file; the bot returns a **fused deepfake score**, a chunk timeline, and nearest-neighbor evidence (which manifold, which neighbors).
**RPA flow:**
watch `incoming_audio/` → call your `detect_voice_clone.py` (or the FastAPI endpoint) → persist JSON → generate timeline PNGs plus an analyst-ready PDF (chunks, gating events, exemplar IDs). A minimal ingest sketch follows.
**Why RPA:** schedules, retries, routing by source, and templated reports, right in the bot. LMRPA's "OCR → LLM → Excel/Word" becomes "Audio → Embeddings → kNN+GP → PDF/CSV". Same loop, new modality.
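A minimal ingest sketch, assuming the drop-in repo's `/voice/score` endpoint (port 8088, defined below) and local `incoming_audio/` and `evidence/` folders:
```python
# clone_guard_watch.py - illustrative batch ingest for the voice-clone guard
import json
import pathlib

import requests

evidence_dir = pathlib.Path("evidence")
evidence_dir.mkdir(exist_ok=True)

for wav in sorted(pathlib.Path("incoming_audio").glob("*.wav")):
    evidence = evidence_dir / f"{wav.stem}.json"
    if evidence.exists():
        continue  # already scored
    res = requests.post(
        "http://localhost:8088/voice/score",
        json={"audio_path": str(wav.resolve())},
    )
    # HTTP 501 means the optional voice stack is not installed: route to a human queue
    evidence.write_text(json.dumps(res.json(), indent=2))
```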
### Document/OSINT ingestion (anti-scam angle)
The paper's focus is **invoice OCR + LLM structuring**; swap "invoice" with **exchange receipts, KYC forms, Telegram screenshots, domain WHOIS, blockchain explorer PDFs**. RPA gathers the documents, OCRs them, LLM-normalizes the fields, then pushes records into your **FAISS (goal-aware masked) index** to connect entities across cases (a sketch follows). Benchmarks in the paper show the loop's throughput edge for OCR-heavy batches.
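A sketch of that hop, with loud assumptions: `pytesseract` stands in for your OCR engine, and `/llm/normalize` is a placeholder name for your own LLM structuring service, not an endpoint the glue repo defines.
```python
# osint_ingest.py - illustrative OCR -> LLM-normalize -> index hop
import pytesseract
import requests
from PIL import Image

# OCR step (Tesseract via pytesseract; DocTR would slot in the same way)
raw_text = pytesseract.image_to_string(Image.open("receipt.png"))

# LLM structuring step: /llm/normalize is a hypothetical endpoint for your own service
fields = requests.post(
    "http://localhost:8088/llm/normalize",
    json={"text": raw_text, "schema": "exchange_receipt"},
).json()

# Push the normalized record into the goal-aware masked FAISS bank
requests.post("http://localhost:8088/bank/add", json={"records": [fields]})
```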
---
## Drop-in glue (so you can run this today)
### A) Minimal RPA ↔ service contract (HTTP)
**Your service endpoints (example):**
```bash
# Add exemplars (RF/voice)
curl -X POST http://localhost:8088/bank/add -H "Content-Type: application/json" -d @records.json
# Search similar (k goes in the JSON request body)
curl -X POST http://localhost:8088/bank/search -H "Content-Type: application/json" -d '{
  "query": {"delta_f_hz":10,"snr_db":20,"q_ms":50,"metadata":{}},
  "k": 5
}'
```
UiPath/Automation Anywhere can call these with built-in HTTP activities; the paper’s LMRPA loop uses the same pattern (watch, process, structure, export). 
### B) Headless runners you can call from RPA
```bash
# (1) Rebuild the bank on demand
source ~/rf_quantum_env/bin/activate
export PYTHONPATH=~/NerfEngine/RF_QUANTUM_SCYTHE:$PYTHONPATH
python - <<'PY'
import glob
import json

from SignalIntelligence.faiss_exemplar_index import RFExemplarFeaturizer
from SignalIntelligence.multi_subspace_faiss import MultiSubspaceFaissIndex

fe = RFExemplarFeaturizer(256)
ms = MultiSubspaceFaissIndex(fe, method="bgmm", warmup_min_points=30,
                             top_m_subspaces=2, whiten_enable=True,
                             goal_sparse_enable=True, goal_task="rf_geo")
recs = []
for p in glob.glob("/home/bgilbert/editable_files/sweep_results/*.json"):
    try:
        with open(p) as f:
            recs.extend(json.load(f))
    except Exception:
        pass
for r in recs:
    r.setdefault("metadata", {})
ms.add_records(recs)
ms.save("/home/bgilbert/editable_files/ms_faiss_bank")
print("bank rebuilt:", sum(len(s.ids) for s in ms.subspaces.values()))
PY
# (2) One-shot search for a new sweep row (RPA supplies JSON)
python - <<'PY'
import json

from SignalIntelligence.faiss_exemplar_index import RFExemplarFeaturizer
from SignalIntelligence.multi_subspace_faiss import MultiSubspaceFaissIndex

fe = RFExemplarFeaturizer(256)
ms = MultiSubspaceFaissIndex(fe)
ms.load("/home/bgilbert/editable_files/ms_faiss_bank")
q = {"delta_f_hz": 10, "snr_db": 20, "q_ms": 50, "metadata": {}}
print(json.dumps({"explain": ms.explain(q), "hits": ms.search(q, top_k=5)}, indent=2))
PY
```
---
## Turn on **goal-aware sparsity** for task-specific bots
In your bank loader (once per task/bot):
```python
from SignalIntelligence.faiss_exemplar_index import RFExemplarFeaturizer
from SignalIntelligence.multi_subspace_faiss import MultiSubspaceFaissIndex
from goal_sparse_utils import auto_set_mask, AutoMaskConfig
import json
import numpy as np

fe = RFExemplarFeaturizer(256)
ms = MultiSubspaceFaissIndex(fe, method="bgmm", warmup_min_points=30,
                             top_m_subspaces=2, whiten_enable=True,
                             goal_sparse_enable=True, goal_task="rf_geo")
ms.load("/home/bgilbert/editable_files/ms_faiss_bank")

# Learn a SOFT mask from a handful of labeled RF examples (positives = "geo-relevant")
X = np.load("/home/bgilbert/editable_files/labeled_rf_X.npy")
y = np.load("/home/bgilbert/editable_files/labeled_rf_y.npy")
info = auto_set_mask(ms.fgate, "rf_geo", X, y, mode="soft", cfg=AutoMaskConfig(C=0.5, top_frac=0.3))
print(json.dumps(info, indent=2))
ms.save("/home/bgilbert/editable_files/ms_faiss_bank")
```
Now your RPA “Geo-Link” bot queries a **sparser, faster, task-tuned** index.
---
## KPIs to track (so you can prove the win)
* **End-to-end latency**: `file_arrival → report_out` (RPA loop time).
* **Top-K retrieval quality**: R@K before/after whitening + goal-sparse (a sketch for computing it follows this list).
* **Exception rate**: % of files punted to a human (parse errors, low confidence).
* **Cost per 1k docs/audio**: CPU-only FAISS + sklearn is cheap; LMRPA shows throughput edges on OCR-heavy flows you can cite in bids.
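For the retrieval-quality KPI, a sketch of computing R@K over a labeled query set; the data layout (ranked hit IDs plus a ground-truth set per query) is an assumption to adapt to your own labels.
```python
# recall_at_k.py - illustrative R@K for before/after comparisons
def recall_at_k(results: list[list[str]], relevant: list[set[str]], k: int = 5) -> float:
    """results[i] holds ranked hit IDs for query i; relevant[i] is its ground-truth set."""
    scores = []
    for hits, rel in zip(results, relevant):
        if rel:
            scores.append(len(set(hits[:k]) & rel) / len(rel))
    return sum(scores) / len(scores) if scores else 0.0

# Run once with goal-sparse/whitening off, once on, over the same query set
print(f"R@3 = {recall_at_k([['a', 'b', 'c']], [{'a', 'd'}], k=3):.2f}")  # 0.50
```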
---
## “Follow the money” (where to sell this)
* **Shared-services finance** (invoice, receipt, expense audit): OCR-heavy, RPA-savvy orgs. LMRPA’s reported speedups vs UiPath/AA on invoices give you an easy **“2× throughput”** message if you bring your own post-processor. 
* **Telecom SOCs & LE/OSINT**: RF anomaly triage and voice-clone fraud queues automated via bots; evidence packs (neighbors, subspace, responsibilities, timeline plots) are auditor-friendly.
* **Compliance/KYC**: document normalization + exemplar matching to flag risky entities (goal-sparse mask for “KYC-salient” fields).
Procurement angle: start as **on-prem add-on** (“Mode-Aware Similarity for RPA Workflows”), priced per bot/runtime with optional **SLA on report latency**.
---
## Why this maps 1:1 to the paper
* Their loop: **monitor → OCR → LLM structuring → Excel/Word**, with measured performance deltas vs mainstream RPA. Your loop: **watch → (RF/Audio/OCR) → FAISS/GP/LLM → CSV/PDF**. Same topology, so you inherit the throughput and automation lessons directly.
The remainder of this guide is that packaging: a self-contained **`rpa_glue`** repo (FastAPI endpoints, report templates, sample UiPath sequence, bash runners) you can drop onto the box, wire to UiPath (or any RPA tool), and use to start clocking end-to-end latency with your own data.
Below is everything: repo layout, FastAPI service, report templates, bash runners, and a minimal UiPath project (project.json plus a Sequence XAML) that watches your sweep folder and calls the service.
---
## 📦 Repo layout
```
rpa_glue/
├─ README.md
├─ requirements.txt
├─ .env.example
├─ api/
│ ├─ main.py
│ └─ models.py
├─ services/
│ ├─ bank.py
│ ├─ voice.py
│ └─ reporting.py
├─ templates/
│ ├─ rf_report.html.j2
│ └─ voice_report.html.j2
├─ scripts/
│ ├─ start_api.sh
│ ├─ rebuild_bank.sh
│ ├─ process_sweep_folder.sh
│ └─ generate_rf_report.sh
├─ uipath/
│ ├─ README.md
│ ├─ project.json
│ └─ Sequence.Main.xaml
└─ examples/
├─ records.sample.json
└─ query.sample.json
```
> Assumes your code lives at `~/NerfEngine/RF_QUANTUM_SCYTHE/SignalIntelligence` and the saved FAISS bank at `~/editable_files/ms_faiss_bank`. Adjust env vars if different.
---
### 🔧 requirements.txt
```txt
fastapi==0.115.0
uvicorn[standard]==0.30.6
pydantic==2.8.2
jinja2==3.1.4
numpy==1.26.4
faiss-cpu==1.12.0
scikit-learn==1.5.1
# Optional (voice):
# torch torchaudio librosa transformers
```
---
### 🔐 .env.example
```env
# Where your MultiSubspaceFaissIndex is persisted
BANK_PATH=/home/bgilbert/editable_files/ms_faiss_bank
# Where finished HTML/PDF reports go
REPORT_OUTDIR=/home/bgilbert/editable_files/reports
# Glob for sweep result JSONs to rebuild the bank
SWEEP_GLOB=/home/bgilbert/editable_files/sweep_results/*.json
# Your SignalIntelligence path so imports work
SI_PATH=/home/bgilbert/NerfEngine/RF_QUANTUM_SCYTHE
# Index init knobs (optional)
N_SUBSPACES=3
METHOD=bgmm
WARMUP_MIN_POINTS=30
TOP_M_SUBSPACES=2
WHITEN_ENABLE=true
GOAL_SPARSE_ENABLE=true
GOAL_TASK=rf_geo
```
---
### 🧠 api/models.py
```python
# rpa_glue/api/models.py
from pydantic import BaseModel, Field
from typing import Any, Dict, List, Optional

class RecordsIn(BaseModel):
    records: List[Dict[str, Any]]

class SearchIn(BaseModel):
    query: Dict[str, Any]
    k: int = Field(5, ge=1, le=100)

class RebuildIn(BaseModel):
    glob: Optional[str] = None

class ReportIn(BaseModel):
    record: Dict[str, Any]
    neighbors: List[Dict[str, Any]]  # [{id, score, subspace}]
    explain: Dict[str, Any]
    title: str = "RF Similarity Report"

class VoiceIn(BaseModel):
    # If using bytes, you could Base64; for now we pass a path
    audio_path: str
    ref_real: Optional[List[str]] = None
    ref_fake: Optional[List[str]] = None
```
---
### 🧩 services/bank.py
```python
# rpa_glue/services/bank.py
import glob
import json
import os
import time
from typing import Any, Dict, List

from SignalIntelligence.faiss_exemplar_index import RFExemplarFeaturizer
from SignalIntelligence.multi_subspace_faiss import MultiSubspaceFaissIndex

def _bool(s: str, default=False):
    if s is None:
        return default
    return s.lower() in ("1", "true", "yes", "y", "on")

def _make_index(featurizer):
    # Shared constructor so __init__ and rebuild_from_glob stay in sync
    return MultiSubspaceFaissIndex(
        featurizer=featurizer,
        n_subspaces=int(os.environ.get("N_SUBSPACES", "3")),
        method=os.environ.get("METHOD", "bgmm"),
        warmup_min_points=int(os.environ.get("WARMUP_MIN_POINTS", "30")),
        top_m_subspaces=int(os.environ.get("TOP_M_SUBSPACES", "2")),
        whiten_enable=_bool(os.environ.get("WHITEN_ENABLE", "true")),
        goal_sparse_enable=_bool(os.environ.get("GOAL_SPARSE_ENABLE", "true")),
        goal_task=os.environ.get("GOAL_TASK", "rf_geo"),
    )

class BankManager:
    def __init__(self):
        self.bank_path = os.environ.get("BANK_PATH", "/tmp/ms_faiss_bank")
        self.si_path = os.environ.get("SI_PATH", "")
        self.feat = RFExemplarFeaturizer(256)
        self.index = _make_index(self.feat)
        self.metrics = {"records_added": 0, "searches": 0, "last_rebuild_s": None}
        # Lazy-load when first needed
        self._loaded = False

    def _ensure_loaded(self):
        if not self._loaded and os.path.isdir(self.bank_path):
            self.index.load(self.bank_path)
            self._loaded = True

    def save(self):
        os.makedirs(self.bank_path, exist_ok=True)
        self.index.save(self.bank_path)

    def add_records(self, records: List[Dict[str, Any]]) -> int:
        self._ensure_loaded()
        # guard for metadata
        for r in records:
            r.setdefault("metadata", {})
        self.index.add_records(records)
        self.metrics["records_added"] += len(records)
        self.save()
        return len(records)

    def search(self, query: Dict[str, Any], k: int = 5):
        self._ensure_loaded()
        t0 = time.monotonic()
        hits = self.index.search(query, top_k=k)
        self.metrics["searches"] += 1
        dt = time.monotonic() - t0
        return hits, dt

    def explain(self, query: Dict[str, Any]):
        self._ensure_loaded()
        return self.index.explain(query)

    def rebuild_from_glob(self, g: str) -> int:
        recs = []
        for p in glob.glob(g):
            try:
                with open(p) as f:
                    obj = json.load(f)
                if isinstance(obj, list):
                    recs.extend(obj)
                elif isinstance(obj, dict) and "results" in obj:
                    recs.extend(obj["results"])
            except Exception:
                continue
        for r in recs:
            r.setdefault("metadata", {})
        # Fresh index: reinit to avoid accumulating old state
        self.index = _make_index(self.feat)
        if recs:
            self.index.add_records(recs)
        self.save()
        self._loaded = True  # in-memory index now matches what is on disk
        self.metrics["last_rebuild_s"] = int(time.time())
        return len(recs)

    def metrics_json(self):
        return {
            "records_added": self.metrics["records_added"],
            "searches": self.metrics["searches"],
            "last_rebuild_s": self.metrics["last_rebuild_s"],
            "bank_path": self.bank_path,
            "subspaces": {k: len(v.ids) for k, v in self.index.subspaces.items()},
            "method": self.index.method,
            "goal_sparse": {
                "enabled": self.index.goal_sparse_enable,
                "task": self.index.goal_task,
            },
            "whitening": bool(self.index.whiten_enable),
        }
```
---
### 🎙 services/voice.py (optional)
```python
# rpa_glue/services/voice.py
import os

try:
    # Your previously built enhanced detector (optional)
    from voice_clone_guard_plus import VoiceDeepfakeDetectorPlus
    from voice_clone_guard_ext import XLSREmbedderChunked, EmbedConfig
    HAVE_VOICE = True
except Exception:
    HAVE_VOICE = False

class VoiceService:
    def __init__(self):
        self.enabled = HAVE_VOICE
        if self.enabled:
            self.embedder = XLSREmbedderChunked(cfg=EmbedConfig())
            self.detector = VoiceDeepfakeDetectorPlus(gp_length_scale=1.5, k=7)

    def score(self, audio_path: str, ref_real=None, ref_fake=None):
        if not self.enabled:
            return {"enabled": False, "message": "Voice pipeline not installed"}
        ex = []
        for p in ref_real or []:
            V = self.embedder.embed_file(p)
            if V is not None:
                ex.append((f"real::{os.path.basename(p)}", V.mean(axis=0), 0, {"path": p}))
        for p in ref_fake or []:
            V = self.embedder.embed_file(p)
            if V is not None:
                ex.append((f"fake::{os.path.basename(p)}", V.mean(axis=0), 1, {"path": p}))
        if ex:
            self.detector.add_exemplars(ex)
        Vt = self.embedder.embed_file(audio_path)
        out = self.detector.score_chunks(Vt)
        return {"enabled": True, "result": out}
```
---
### 🧾 services/reporting.py
```python
# rpa_glue/services/reporting.py
import datetime
import os
from typing import Any, Dict, List

from jinja2 import Environment, FileSystemLoader, select_autoescape

class Reporter:
    def __init__(self, template_dir: str, outdir: str):
        self.env = Environment(
            loader=FileSystemLoader(template_dir),
            autoescape=select_autoescape(["html", "xml"]),
        )
        self.outdir = outdir
        os.makedirs(outdir, exist_ok=True)

    @staticmethod
    def _timestamp() -> str:
        return datetime.datetime.utcnow().strftime("%Y%m%d_%H%M%S")

    def render_rf(self, record: Dict[str, Any], neighbors: List[Dict[str, Any]],
                  explain: Dict[str, Any], title="RF Similarity Report"):
        tpl = self.env.get_template("rf_report.html.j2")
        html = tpl.render(
            title=title,
            now=datetime.datetime.utcnow().isoformat(),
            record=record,
            neighbors=neighbors,
            explain=explain,
        )
        path = os.path.join(self.outdir, f"rf_report_{self._timestamp()}.html")
        with open(path, "w") as f:
            f.write(html)
        return path

    def render_voice(self, audio_path: str, result: Dict[str, Any], title="Voice Clone Report"):
        tpl = self.env.get_template("voice_report.html.j2")
        html = tpl.render(title=title, now=datetime.datetime.utcnow().isoformat(),
                          audio_path=audio_path, result=result)
        path = os.path.join(self.outdir, f"voice_report_{self._timestamp()}.html")
        with open(path, "w") as f:
            f.write(html)
        return path
```
---
### 🌐 api/main.py
```python
# rpa_glue/api/main.py
import os
import time

from fastapi import FastAPI
from fastapi.responses import JSONResponse

from api.models import RecordsIn, SearchIn, RebuildIn, ReportIn, VoiceIn
from services.bank import BankManager
from services.reporting import Reporter
from services.voice import VoiceService

BANK = BankManager()
REPORTER = Reporter(
    template_dir=os.path.join(os.path.dirname(__file__), "..", "templates"),
    outdir=os.environ.get("REPORT_OUTDIR", "/tmp/reports"),
)
VOICE = VoiceService()

app = FastAPI(title="RPA Glue API", version="1.0.0")

@app.get("/health")
def health():
    return {"ok": True, "time": int(time.time())}

@app.get("/metrics")
def metrics():
    return BANK.metrics_json()

@app.post("/bank/add")
def bank_add(payload: RecordsIn):
    n = BANK.add_records(payload.records)
    return {"added": n, "bank_path": BANK.bank_path}

@app.post("/bank/search")
def bank_search(payload: SearchIn):
    hits, dt = BANK.search(payload.query, payload.k)
    # normalize to plain dict list for reporting
    out = [{"id": sid, "score": score, "subspace": subk} for (sid, score, subk) in hits]
    explain = BANK.explain(payload.query)
    return {"latency_s": dt, "hits": out, "explain": explain}

@app.post("/bank/save")
def bank_save():
    BANK.save()
    return {"saved_to": BANK.bank_path}

@app.post("/bank/load")
def bank_load():
    BANK._ensure_loaded()
    return {"loaded_from": BANK.bank_path}

@app.post("/rf/rebuild")
def rf_rebuild(payload: RebuildIn):
    g = payload.glob or os.environ.get("SWEEP_GLOB", "/tmp/*.json")
    n = BANK.rebuild_from_glob(g)
    return {"reindexed": n, "bank_path": BANK.bank_path}

@app.post("/report/rf")
def report_rf(payload: ReportIn):
    path = REPORTER.render_rf(payload.record, payload.neighbors, payload.explain, payload.title)
    return {"report_path": path}

@app.post("/voice/score")
def voice_score(payload: VoiceIn):
    res = VOICE.score(payload.audio_path, payload.ref_real, payload.ref_fake)
    if not res.get("enabled", False):
        return JSONResponse(res, status_code=501)
    path = REPORTER.render_voice(payload.audio_path, res["result"])
    return {"report_path": path, "result": res["result"]}
```
---
### 🖼 templates/rf_report.html.j2
```html
<!doctype html>
<html>
<head>
  <meta charset="utf-8"/>
  <title>{{ title }}</title>
  <style>
    body { font-family: ui-sans-serif, system-ui, -apple-system; margin: 24px; }
    code, pre { background:#f6f6f6; padding:4px 6px; border-radius:6px; }
    table { border-collapse: collapse; width: 100%; margin-top: 12px; }
    th, td { border: 1px solid #ddd; padding: 8px; font-size: 14px; }
    th { background:#fafafa; text-align: left; }
    .pill { display:inline-block; padding:2px 8px; border-radius:999px; background:#eef; }
  </style>
</head>
<body>
  <h1>{{ title }}</h1>
  <div class="pill">Generated: {{ now }}</div>
  <h2>Query Record</h2>
  <pre>{{ record | tojson(indent=2) }}</pre>
  <h2>Routing &amp; Explanation</h2>
  <pre>{{ explain | tojson(indent=2) }}</pre>
  <h2>Top Neighbors</h2>
  <table>
    <thead><tr><th>#</th><th>ID</th><th>Score</th><th>Subspace</th></tr></thead>
    <tbody>
    {# Jinja2 has no built-in enumerate; use loop.index instead #}
    {% for n in neighbors %}
      <tr><td>{{ loop.index }}</td><td>{{ n.id }}</td><td>{{ "%.4f"|format(n.score) }}</td><td>{{ n.subspace }}</td></tr>
    {% endfor %}
    </tbody>
  </table>
</body>
</html>
```
---
### 🖼 templates/voice_report.html.j2
```html
<!doctype html>
<html>
<head>
  <meta charset="utf-8"/>
  <title>{{ title }}</title>
  <style>
    body { font-family: ui-sans-serif, system-ui, -apple-system; margin: 24px; }
    code, pre { background:#f6f6f6; padding:4px 6px; border-radius:6px; }
  </style>
</head>
<body>
  <h1>{{ title }}</h1>
  <p><b>Audio:</b> {{ audio_path }}</p>
  <pre>{{ result | tojson(indent=2) }}</pre>
</body>
</html>
```
---
### 🧪 examples/records.sample.json
```json
[
  {"id": "ex1", "delta_f_hz": 10.0, "snr_db": 20.0, "q_ms": 50.0, "am_depth_pct": 0.0, "fm_dev_hz": 0.0, "metadata": {}},
  {"id": "ex2", "delta_f_hz": 15.0, "snr_db": 15.0, "q_ms": 20.0, "am_depth_pct": 0.0, "fm_dev_hz": 0.0, "metadata": {}}
]
```
### 🧪 examples/query.sample.json
```json
{
  "query": {"delta_f_hz": 10.0, "snr_db": 20.0, "q_ms": 50.0, "am_depth_pct": 0.0, "fm_dev_hz": 0.0, "metadata": {}},
  "k": 5
}
```
---
### 🖥 scripts/start_api.sh
```bash
#!/usr/bin/env bash
set -euo pipefail
cd "$(dirname "$0")/.."
# Load env
if [ -f .env ]; then set -a; source .env; set +a; fi
# PYTHONPATH for SignalIntelligence
export PYTHONPATH="${SI_PATH:-$HOME/NerfEngine/RF_QUANTUM_SCYTHE}:$PYTHONPATH"
# Venv (edit if needed)
source ~/rf_quantum_env/bin/activate
# Install (first run)
pip -q install -r requirements.txt
# Start API
uvicorn api.main:app --host 0.0.0.0 --port 8088 --workers 2
```
### 🖥 scripts/rebuild_bank.sh
```bash
#!/usr/bin/env bash
set -euo pipefail
cd "$(dirname "$0")/.."
[ -f .env ] && set -a && source .env && set +a
curl -s -X POST "http://localhost:8088/rf/rebuild" \
-H "Content-Type: application/json" \
-d "{\"glob\":\"${SWEEP_GLOB:-/tmp/*.json}\"}" | jq .
```
### 🖥 scripts/process_sweep_folder.sh
```bash
#!/usr/bin/env bash
set -euo pipefail
cd "$(dirname "$0")/.."
[ -f .env ] && set -a && source .env && set +a
# Add all new records in one shot (RPA would call per-file; this is a batch example)
FILE="${1:-examples/records.sample.json}"
curl -s -X POST "http://localhost:8088/bank/add" \
  -H "Content-Type: application/json" \
  -d @"${FILE}" | jq .
```
### 🖥 scripts/generate_rf_report.sh
```bash
#!/usr/bin/env bash
set -euo pipefail
cd "$(dirname "$0")/.."
[ -f .env ] && set -a && source .env && set +a
QUERY_FILE="${1:-examples/query.sample.json}"
# Search
RES=$(curl -s -X POST "http://localhost:8088/bank/search" -H "Content-Type: application/json" -d @"${QUERY_FILE}")
HITS=$(echo "$RES" | jq '.hits')
EXPL=$(echo "$RES" | jq '.explain')
REC=$(jq '.query' "${QUERY_FILE}")
# Report
curl -s -X POST "http://localhost:8088/report/rf" \
-H "Content-Type: application/json" \
-d "{\"record\":${REC},\"neighbors\":${HITS},\"explain\":${EXPL},\"title\":\"RF Search Report\"}" | jq .
```
Make scripts executable:
```bash
chmod +x scripts/*.sh
```
---
### ▶️ Run it
```bash
cd rpa_glue
cp .env.example .env # edit paths if needed
./scripts/start_api.sh &
./scripts/rebuild_bank.sh
./scripts/process_sweep_folder.sh
./scripts/generate_rf_report.sh
# open the printed report_path in your browser
```
---
## 🤖 UiPath (drop-in)
### uipath/project.json (minimal)
```json
{
  "name": "RF_RPA_Glue",
  "description": "Watches sweep folder, calls RPA Glue API, writes reports",
  "main": "Sequence.Main.xaml",
  "dependencies": {
    "UiPath.System.Activities": "[23.10.6]",
    "UiPath.WebAPI.Activities": "[1.16.3]",
    "UiPath.Excel.Activities": "[2.22.3]"
  },
  "schemas": ["https://schemas.uipath.com/workflow/Project.json"]
}
}
```
### uipath/Sequence.Main.xaml (core activities)
```xml
<Activity x:Class="Sequence_Main"
          xmlns="http://schemas.microsoft.com/netfx/2009/xaml/activities"
          xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
          xmlns:ui="http://schemas.uipath.com/workflow/activities"
          xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006">
  <Sequence DisplayName="RF RPA Glue">
    <!-- 1) File Change Trigger on the sweep_reports folder -->
    <ui:FileChangeTrigger DisplayName="Watch Sweep Folder" Path="C:\sweeps" ChangeType="Created">
      <ui:FileChangeTrigger.Body>
        <Sequence>
          <!-- 2) HTTP POST /bank/add with the new file's contents -->
          <ui:HTTP RequestMethod="POST" Endpoint="http://localhost:8088/bank/add"
                   Headers="[New Dictionary(Of String, String) From {{""Content-Type"", ""application/json""}}]"
                   Body="[System.IO.File.ReadAllText(triggerFile)]"
                   Result="[apiAddResult]" />
          <!-- 3) HTTP POST /bank/search using a composed query (example) -->
          <Assign>
            <Assign.To>
              <OutArgument x:TypeArguments="x:String">queryJson</OutArgument>
            </Assign.To>
            <Assign.Value>
              <InArgument x:TypeArguments="x:String">
                {"query":{"delta_f_hz":10.0,"snr_db":20.0,"q_ms":50.0,"metadata":{}},"k":5}
              </InArgument>
            </Assign.Value>
          </Assign>
          <ui:HTTP RequestMethod="POST" Endpoint="http://localhost:8088/bank/search"
                   Headers="[New Dictionary(Of String, String) From {{""Content-Type"", ""application/json""}}]"
                   Body="[queryJson]" Result="[apiSearchResult]" />
          <!-- 4) Call /report/rf with the hits (or write them to CSV instead).
               Quote escaping here is schematic; in a real project, compose the
               payload with an Assign (e.g., a JObject) rather than String.Format. -->
          <ui:HTTP RequestMethod="POST" Endpoint="http://localhost:8088/report/rf"
                   Headers="[New Dictionary(Of String, String) From {{""Content-Type"", ""application/json""}}]"
                   Body="[String.Format(""{{\""record\"":{{\""delta_f_hz\"":10.0,\""snr_db\"":20.0,\""q_ms\"":50.0,\""metadata\"":{{}}}},\""neighbors\"":{0},\""explain\"":{1},\""title\"":\""RF Report from UiPath\""}}"",
                         apiSearchResult.SelectToken(""$.hits"").ToString(),
                         apiSearchResult.SelectToken(""$.explain"").ToString())]"
                   Result="[apiReportResult]" />
        </Sequence>
      </ui:FileChangeTrigger.Body>
    </ui:FileChangeTrigger>
  </Sequence>
</Activity>
```
> If you prefer JSON-only, skip the HTML and use `/bank/search` → write CSV via Excel activities. The XAML above uses UiPath **HTTP Request** and **File Change Trigger** activities.
---
## 📈 Start clocking latency
* **Service metrics:** `GET /metrics` returns record counts, method, whitening, and subspace populations.
* **Search latency:** `/bank/search` returns `latency_s`.
* **RPA timing:** add `Log Message` around each HTTP call and write durations to an Excel log (a quick Python harness follows this list). Aim to compare:
* single-space vs multi-subspace+whitening,
* goal-sparse off vs on,
* batch vs per-file ingestion.
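A quick harness for those comparisons, assuming the repo's port 8088 and `examples/query.sample.json`; run it once per configuration and diff the numbers.
```python
# clock_search.py - rough latency sampler for A/B runs (goal-sparse on/off, etc.)
import json
import statistics
import time

import requests

with open("examples/query.sample.json") as f:
    payload = json.load(f)

wall, service = [], []
for _ in range(50):
    t0 = time.monotonic()
    res = requests.post("http://localhost:8088/bank/search", json=payload).json()
    wall.append(time.monotonic() - t0)       # end-to-end as the robot sees it
    service.append(res["latency_s"])          # index time as the service reports it

p95 = sorted(wall)[int(0.95 * len(wall))]
print(f"wall   p50={statistics.median(wall) * 1e3:.1f} ms  p95={p95 * 1e3:.1f} ms")
print(f"search p50={statistics.median(service) * 1e3:.1f} ms")
```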
---
## ✅ What you get out of the box
* A clean FastAPI surface your robots can call for **add/search/report**.
* **Explainable** output (responsibilities, gating, whitening, goal-sparse kept dims).
* Lightweight HTML reports (RPA can attach to emails or push to SharePoint).
* Bash runners so you can test without UiPath.
* A minimal UiPath project to prove the loop.
To add a **CSV/Excel report** or a **PDF export** (wkhtmltopdf hook), extend `reporting.py` with one more render function; a sketch is below.
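A sketch of that hook, assuming the `wkhtmltopdf` binary is installed and on PATH (error handling kept minimal):
```python
# Possible addition to rpa_glue/services/reporting.py (illustrative)
# Method to paste into the Reporter class.
import subprocess

def render_pdf(self, html_path: str) -> str:
    """Convert a rendered HTML report to PDF with the wkhtmltopdf CLI."""
    pdf_path = html_path.rsplit(".", 1)[0] + ".pdf"
    subprocess.run(["wkhtmltopdf", html_path, pdf_path], check=True)
    return pdf_path
```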