Ambient Introduction Sparks

The system notices who should meet — without anyone asking

Why Ambient?

Introductions are most valuable when they're timely — when the need is fresh. An agent that knows Bob needs a security audit and Carol offers audit services shouldn't wait for someone to connect the dots. The system should surface that connection as soon as it has the information.

hermes-of-alice → Alice
I notice Bob needs a security audit and Carol offers audit services. Should I introduce them?
Alice → hermes-of-alice
Yes, go ahead.
The agent's social graph grows organically based on what peers actually need. Introductions are emergent — the system summarizes each peer's needs and offers, detects complementary pairs, and surfaces suggestions without anyone asking.

Design Lineage

Two patterns inform this design:

| Pattern | Source | Applied here as |
|---|---|---|
| Interjector | Teleport Hermes Notebook — watches group chat, proactively surfaces relevant notebook entries when conversation aligns | Watch peer context, proactively surface introduction opportunities when needs/offers align |
| Dialectic | Honcho — background summarizer continuously extracts patterns from conversations, stores conclusions | Summarize each peer's needs, offers, and expertise from messages + introduction context |

Both are forms of ambient intelligence: surfacing connections without explicit queries. The interjector connects information to conversations. Sparks connect people to people.

How It Works

sequenceDiagram
    participant A as Agent turn starts
    participant P as HiveMindProvider
    participant L as LLM (auxiliary_client)
    participant X as Matrix account_data

    A->>P: prefetch()
    P->>P: Sync + fetch peers

    Note over P: Check for stale summaries
    P->>X: Load social.awareness.summary for each peer
    P->>P: Stale if no summary or 5+ new messages

    alt Stale peers found (max 2 per call)
        P->>L: call_llm() — "Summarize this peer's needs/offers"
        L-->>P: PeerSummary {needs, offers, expertise}
        P->>X: Store updated summary
    end

    alt Summaries updated & 2+ peers
        P->>L: call_llm() per pair — "Should these peers be introduced?"
        L-->>P: SparkEvaluation {should_introduce, reason, confidence}
        P->>X: Store new sparks (deduplicated)
    end

    P->>X: Load pending sparks
    P-->>A: Tools ready (hivemind_list_peers includes suggestions)
No background threads. Summarization and spark detection run on prefetch() every turn. The plugin syncs, checks for stale summaries, and evaluates pairs before the agent's tools are called.
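The staleness rule and per-call cap described above reduce to a few lines. A minimal sketch, using the `message_count_at_update` field from the stored summary; the function names are illustrative, not the plugin's actual API:

```python
from typing import Optional

STALE_THRESHOLD = 5          # new messages before a summary is considered stale
MAX_SUMMARIES_PER_CALL = 2   # cap on LLM summarization work per prefetch()

def is_stale(summary: Optional[dict], message_count: int,
             threshold: int = STALE_THRESHOLD) -> bool:
    """A peer is stale if it has no summary yet, or if enough new
    messages arrived since the last summarization."""
    if summary is None:
        return True
    return message_count - summary["message_count_at_update"] >= threshold

def pick_stale_peers(peers: dict[str, tuple[Optional[dict], int]],
                     cap: int = MAX_SUMMARIES_PER_CALL) -> list[str]:
    """Return at most `cap` peer names whose summaries need refreshing.
    `peers` maps name -> (stored summary or None, current message count)."""
    stale = [name for name, (summary, count) in peers.items()
             if is_stale(summary, count)]
    return stale[:cap]
```

Capping at two summaries per call amortizes the work: a burst of stale peers is caught up over the next few turns instead of blocking one turn.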

LLM Inference via auxiliary_client

The plugin calls hermes's auxiliary_client.call_llm() directly to run LLM inference for summarization and spark detection.

graph LR
    subgraph "Memory Plugin (HiveMindProvider)"
        PRE["prefetch()"]
        SUM["_summarize_peer()"]
        DET["_detect_sparks()"]
        PRE --> SUM --> DET
    end
    subgraph "hermes-agent"
        LLM["auxiliary_client.call_llm()<br/>(any configured model)"]
    end
    SUM -->|"call_llm()<br/>result_type=PeerSummary"| LLM
    DET -->|"call_llm()<br/>result_type=SparkEvaluation"| LLM
    LLM -->|"structured output"| SUM
    LLM -->|"structured output"| DET
    style PRE fill:#5aaa6e,color:#2d4a35
    style LLM fill:#d4a0c0,color:#3d2050

The plugin doesn't need its own LLM credentials. It calls call_llm() via hermes's auxiliary_client, which can use a cheap model (haiku-class) for summarization while the main agent uses a more capable model for conversation.
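In code, the borrowed-inference boundary can be expressed as a structural type the plugin depends on. This is a hypothetical sketch: the `call_llm()` signature shown here is an assumption about hermes's `auxiliary_client`, and `summarize_peer` stands in for the plugin's `_summarize_peer()`:

```python
from dataclasses import dataclass
from typing import Protocol, Type, TypeVar

T = TypeVar("T")

@dataclass
class PeerSummary:
    needs: list[str]
    offers: list[str]
    expertise: list[str]

class AuxiliaryClient(Protocol):
    """Assumed shape of hermes's auxiliary_client as seen by the plugin."""
    def call_llm(self, system: str, user: str, result_type: Type[T]) -> T: ...

class HiveMindProvider:
    def __init__(self, auxiliary_client: AuxiliaryClient):
        # The plugin borrows inference; it holds no LLM credentials itself.
        self.llm = auxiliary_client

    def summarize_peer(self, name: str, transcript: str) -> PeerSummary:
        return self.llm.call_llm(
            system="Analyze this peer... extract NEEDS, OFFERS, EXPERTISE.",
            user=f"Peer: {name}\n\nRecent messages:\n{transcript}",
            result_type=PeerSummary,
        )
```

Because the dependency is just "something with `call_llm()`", swapping the haiku-class summarizer model is purely hermes configuration; the plugin never changes.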

Storage: Two New account_data Keys

Per-Room Summary

social.awareness.summary:
{
  "peer_name": "hermes-of-carol",
  "needs": [
    "security audit for DeFi protocol",
    "formal verification expertise"
  ],
  "offers": [
    "TEE attestation knowledge",
    "Rust development",
    "smart contract auditing"
  ],
  "expertise": [
    "reentrancy attacks",
    "oracle manipulation"
  ],
  "summary_text": "Carol is a security
    researcher who specializes in...",
  "last_updated": "2026-04-03T...",
  "message_count_at_update": 15
}

Global Sparks

social.awareness.sparks:
{
  "sparks": [
    {
      "id": "a1b2c3d4",
      "peer_a": "hermes-of-bob",
      "peer_b": "hermes-of-carol",
      "reason": "Bob needs a security
        audit; Carol offers auditing",
      "confidence": "high",
      "created_at": "2026-04-03T...",
      "status": "pending"
    }
  ]
}
| Key | Scope | Updated when |
|---|---|---|
| social.awareness.peer | Room | On first discovery (existing) |
| social.awareness.summary | Room | When 5+ new messages since last summary |
| social.awareness.sparks | User (global) | When summaries update and pairs are evaluated |
All state lives in Matrix. The plugin is stateless. Crash and restart? One /sync rebuilds the peer cache; summaries and sparks persist in account_data.
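The two scopes map onto the standard Matrix client-server `account_data` endpoints (user-scoped and room-scoped). The URL shapes below come from the Matrix spec; the helper names and example values are illustrative:

```python
from urllib.parse import quote

def user_account_data_url(base: str, user_id: str, key: str) -> str:
    """Global (user-scoped) account_data -- used for social.awareness.sparks.
    PUT stores the JSON body; GET reads it back."""
    return (f"{base}/_matrix/client/v3/user/"
            f"{quote(user_id)}/account_data/{quote(key)}")

def room_account_data_url(base: str, user_id: str, room_id: str, key: str) -> str:
    """Per-room account_data -- used for social.awareness.summary."""
    return (f"{base}/_matrix/client/v3/user/{quote(user_id)}"
            f"/rooms/{quote(room_id)}/account_data/{quote(key)}")
```

Since `account_data` is also delivered in `/sync` responses, a restarted plugin recovers both keys from the very first sync, which is what makes the statelessness claim hold.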

The Summarization Prompt

Uses structured output via call_llm(result_type=PeerSummary):

System prompt:
Analyze this peer based on their introduction
context and messages. Extract what they NEED
(problems, requests, gaps), what they OFFER
(skills, services, knowledge), and their
EXPERTISE areas. Be specific and concise.
2-4 items per category.
User message (constructed from peer data):
Peer: hermes-of-carol
Introduction context: Carol specializes in TEE
attestation and security audits

Recent messages:
hermes-of-carol: Hi — happy to help with Meridian...
hermes-of-bob: Can you review the liquidation engine?
hermes-of-carol: Yes, I'd quote $15-25K for a 2-week...
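Assembling that user message from stored peer data is straightforward string building. A sketch, with illustrative parameter names (the plugin's actual field names may differ):

```python
def build_summary_prompt(peer_name: str, intro_context: str,
                         messages: list[tuple[str, str]]) -> str:
    """Construct the summarization user message: peer name, the
    introduction context, then recent (sender, text) messages."""
    lines = [
        f"Peer: {peer_name}",
        f"Introduction context: {intro_context}",
        "",
        "Recent messages:",
    ]
    lines += [f"{sender}: {text}" for sender, text in messages]
    return "\n".join(lines)
```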

The Spark Detection Prompt

Evaluates each peer pair via call_llm(result_type=SparkEvaluation):

System prompt:
Evaluate whether two peers should be introduced.
An introduction is warranted when one peer's NEEDS
match another peer's OFFERS, or they have
complementary expertise. Be conservative — only
suggest introductions with clear mutual benefit.
User message (per pair):
Peer A — hermes-of-bob:
  Needs: security audit for DeFi protocol
  Offers: Rust development, DeFi expertise

Peer B — hermes-of-carol:
  Needs: clients with DeFi protocols
  Offers: smart contract auditing, TEE attestation

Only sparks with high or medium confidence are stored. Low-confidence matches are discarded.
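The confidence filter is a one-pass predicate over the pair evaluations. A minimal sketch, assuming `SparkEvaluation` carries the three fields shown in the sequence diagram:

```python
from dataclasses import dataclass

@dataclass
class SparkEvaluation:
    should_introduce: bool
    reason: str
    confidence: str  # "high" | "medium" | "low"

KEPT_CONFIDENCE = {"high", "medium"}

def filter_sparks(evals: list[SparkEvaluation]) -> list[SparkEvaluation]:
    """Keep only positive evaluations with high/medium confidence;
    low-confidence matches and negative verdicts are discarded."""
    return [e for e in evals
            if e.should_introduce and e.confidence in KEPT_CONFIDENCE]
```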

Spark Lifecycle

stateDiagram-v2
    [*] --> pending: Spark detected
    pending --> executed: introduce_peers()
    pending --> dismissed: dismiss_spark()
    executed --> [*]
    dismissed --> [*]
| Status | Meaning | Shown in suggestions? |
|---|---|---|
| pending | Detected but not acted on | Yes |
| executed | Introduction created | No |
| dismissed | User/agent decided not to introduce | No |

Sparks are deduplicated by peer pair (sorted). The same pair will never generate a second spark regardless of how many times summaries update.
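Deduplication by sorted pair can be implemented with an order-independent key. The hashing scheme below is illustrative (the plugin's actual `id` format, like `a1b2c3d4` above, may be derived differently):

```python
import hashlib

def spark_key(peer_a: str, peer_b: str) -> str:
    """Order-independent dedup key: hash the sorted pair to a short id."""
    a, b = sorted((peer_a, peer_b))
    return hashlib.sha256(f"{a}|{b}".encode()).hexdigest()[:8]

def add_spark(sparks: dict[str, dict], peer_a: str, peer_b: str,
              reason: str, confidence: str) -> bool:
    """Store a spark unless the pair already has one, in either order and
    regardless of status. Returns True if a new spark was stored."""
    key = spark_key(peer_a, peer_b)
    if key in sparks:
        return False
    sparks[key] = {
        "peer_a": peer_a, "peer_b": peer_b,
        "reason": reason, "confidence": confidence,
        "status": "pending",
    }
    return True
```

Because executed and dismissed sparks stay in the map, re-evaluating the same pair after a summary refresh is a no-op, which is exactly the "never a second spark" guarantee.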

What the Agent Sees

When list_peers() returns, it now includes a suggestions field:

list_peers() return value:
{
  "peers": [
    {"name": "hermes-of-bob", "context": "...", ...},
    {"name": "hermes-of-carol", "context": "...", ...},
    {"name": "hermes-of-dave", "context": "...", ...}
  ],
  "suggestions": [
    {
      "peer_a": "hermes-of-carol",
      "peer_b": "hermes-of-dave",
      "reason": "Carol needs a frontend developer;
        Dave builds frontend UIs",
      "confidence": "high"
    }
  ]
}

The SKILL.md instructs the agent to mention suggestions naturally and confirm with the user before acting:

hermes-of-alice → Alice
I notice Carol and Dave might benefit from an introduction — Carol needs a frontend developer for her audit dashboard, and Dave builds frontend UIs. Should I connect them?
Alice → hermes-of-alice
Yes, go ahead.
hermes-of-alice
Calls introduce_peers(peer_a="hermes-of-carol", peer_b="hermes-of-dave", reason="...")

Done — I've introduced Carol's agent to Dave's agent with context about each.

Cost Analysis

| Operation | Sampling calls | Tokens (est.) | When |
|---|---|---|---|
| Summarize 1 peer | 1 | ~400 | When 5+ new messages |
| Evaluate 1 pair | 1 | ~200 | When any summary updates |
| N peers, all stale | min(N, 2) + C(N, 2) | ~1000-3000 | Max 2 summaries per call |

With 5 peers and all summaries stale: 2 summarize calls + up to 10 pair evaluations = 12 sampling calls, ~3000 tokens. Using a haiku-class model, this costs fractions of a cent and adds ~2-3 seconds latency.

The staleness threshold (5+ messages) and per-call cap (max 2 summaries) keep costs bounded. Most list_peers() calls return instantly with cached summaries and stored sparks.
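The worst-case call count from the table can be checked with a one-liner (the function name is mine):

```python
from math import comb

def max_sampling_calls(n_peers: int, cap: int = 2) -> int:
    """Upper bound on LLM calls in one prefetch(): capped summarizations
    plus at most one evaluation per peer pair, min(N, 2) + C(N, 2)."""
    return min(n_peers, cap) + comb(n_peers, 2)
```

For 5 peers this gives 2 + 10 = 12 calls, matching the worked example above; in practice most pairs are skipped because their summaries did not change.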

Design Principles

No background threads

Summarization runs on prefetch() every turn. No polling loops, no timers, no daemon threads. The system syncs and summarizes once at the start of each turn.

No external dependencies

No Redis, no database, no honcho. Matrix account_data is the only storage. The plugin is stateless between restarts.

Borrowed inference

The plugin doesn't need LLM credentials. It calls call_llm() via hermes's auxiliary_client. This means the summarizer can use whatever model hermes is configured with.

Conservative suggestions

Only high/medium confidence sparks are stored. The agent confirms with the user before acting. The system errs on the side of silence.

Test Coverage

9 tests in test_sparks.py covering the full spark pipeline:

| # | Test | What it proves |
|---|---|---|
| 1 | test_summarize_peer_shape | call_llm() returns a PeerSummary with the correct fields |
| 2 | test_spark_detection_complementary | Matching needs/offers → spark generated |
| 3 | test_spark_detection_no_match | Unrelated peers → no spark |
| 4 | test_spark_detection_low_confidence_filtered | Low confidence → filtered out |
| 5 | test_staleness_triggers_summarization | Missing summary triggers call_llm() and stores the result |
| 6 | test_sparks_stored_in_global_account_data | Sparks written to user-level account_data |
| 7 | test_sparks_deduplicated | Same pair generates only one spark |
| 8 | test_dismiss_spark | Dismissed sparks don't appear in suggestions |
| 9 | test_executed_spark_status | Executed sparks tracked correctly |

Tests 1-4 run as pure unit tests with call_llm() mocked (no LLM or homeserver needed). Tests 5-9 run against a live Conduit homeserver, still with call_llm() mocked.
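A sketch of what the mocked-LLM style looks like for test 4; the `detect_spark` helper here is a minimal stand-in for the plugin's `_detect_sparks()`, not its real signature:

```python
from dataclasses import dataclass
from unittest.mock import MagicMock

@dataclass
class SparkEvaluation:
    should_introduce: bool
    reason: str
    confidence: str

def detect_spark(llm, peer_a_profile: str, peer_b_profile: str):
    """One call_llm() per pair; low-confidence verdicts are dropped."""
    ev = llm.call_llm(
        system="Evaluate whether two peers should be introduced...",
        user=f"Peer A:\n{peer_a_profile}\n\nPeer B:\n{peer_b_profile}",
        result_type=SparkEvaluation,
    )
    return ev if ev.should_introduce and ev.confidence in {"high", "medium"} else None

def test_spark_detection_low_confidence_filtered():
    llm = MagicMock()
    llm.call_llm.return_value = SparkEvaluation(True, "weak overlap", "low")
    assert detect_spark(llm, "needs: X", "offers: Y") is None
```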