Ambient Introduction Sparks

The system notices who should meet — without anyone asking

Why Ambient?

Introductions are most valuable when they're timely — when the need is fresh. An agent that knows Bob needs a security audit and Carol offers audit services shouldn't wait for someone to connect the dots. The system should surface that connection as soon as it has the information.

hermes-of-alice → Alice
I notice Bob needs a security audit and Carol offers audit services. Should I introduce them?
Alice → hermes-of-alice
Yes, go ahead.
The agent's social graph grows organically based on what peers actually need. Introductions are emergent — the system summarizes each peer's needs and offers, detects complementary pairs, and surfaces suggestions without anyone asking.

Design Lineage

Two patterns inform this design:

| Pattern | Source | Applied here as |
|---|---|---|
| Interjector | Teleport Hermes Notebook — watches group chat, proactively surfaces relevant notebook entries when conversation aligns | Watch peer context, proactively surface introduction opportunities when needs/offers align |
| Dialectic | Honcho — background summarizer continuously extracts patterns from conversations, stores conclusions | Summarize each peer's needs, offers, and expertise from messages + introduction context |

Both are forms of ambient intelligence: surfacing connections without explicit queries. The interjector connects information to conversations. Sparks connect people to people.

How It Works

sequenceDiagram
    participant A as Agent turn starts
    participant P as HiveMindProvider
    participant L as LLM (auxiliary_client)
    participant X as Matrix account_data

    A->>P: prefetch()
    P->>P: Sync + fetch peers

    Note over P: Check for stale summaries
    P->>X: Load social.awareness.summary for each peer
    P->>P: Stale if no summary or 5+ new messages

    alt Stale peers found (max 2 per call)
        P->>L: call_llm() — "Summarize this peer's needs/offers"
        L-->>P: PeerSummary {needs, offers, expertise}
        P->>X: Store updated summary
    end

    alt Summaries updated & 2+ peers
        P->>L: call_llm() per pair — "Should these peers be introduced?"
        L-->>P: SparkEvaluation {should_introduce, reason, confidence}
        P->>X: Store new sparks (deduplicated)
    end

    P->>X: Load pending sparks
    P-->>A: Tools ready (hivemind_list_peers includes suggestions)
No background threads. Summarization and spark detection run on prefetch() every turn. The plugin syncs, checks for stale summaries, and evaluates pairs before the agent's tools are called.
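The staleness rule and per-call cap described above reduce to a few lines. A minimal sketch, using the `message_count_at_update` field from the stored summary; the function names are illustrative, not the plugin's actual API:

```python
from typing import Optional

STALE_THRESHOLD = 5          # new messages before a summary is considered stale
MAX_SUMMARIES_PER_CALL = 2   # cap on LLM summarization work per prefetch()

def is_stale(summary: Optional[dict], message_count: int,
             threshold: int = STALE_THRESHOLD) -> bool:
    """A peer is stale if it has no summary yet, or if enough new
    messages arrived since the last summarization."""
    if summary is None:
        return True
    return message_count - summary["message_count_at_update"] >= threshold

def pick_stale_peers(peers: dict[str, tuple[Optional[dict], int]],
                     cap: int = MAX_SUMMARIES_PER_CALL) -> list[str]:
    """Return at most `cap` peer names whose summaries need refreshing.
    `peers` maps name -> (stored summary or None, current message count)."""
    stale = [name for name, (summary, count) in peers.items()
             if is_stale(summary, count)]
    return stale[:cap]
```

Capping at two summaries per call amortizes the work: a burst of stale peers is caught up over the next few turns instead of blocking one turn.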

LLM Inference via auxiliary_client

The plugin calls hermes's auxiliary_client.call_llm() directly to run LLM inference for summarization and spark detection.

graph LR
    subgraph "Memory Plugin (HiveMindProvider)"
        PRE["prefetch()"]
        SUM["_summarize_peer()"]
        DET["_detect_sparks()"]
        PRE --> SUM --> DET
    end
    subgraph "hermes-agent"
        LLM["auxiliary_client.call_llm()<br/>(any configured model)"]
    end
    SUM -->|"call_llm()<br/>result_type=PeerSummary"| LLM
    DET -->|"call_llm()<br/>result_type=SparkEvaluation"| LLM
    LLM -->|"structured output"| SUM
    LLM -->|"structured output"| DET
    style PRE fill:#5aaa6e,color:#2d4a35
    style LLM fill:#d4a0c0,color:#3d2050

The plugin doesn't need its own LLM credentials. It calls call_llm() via hermes's auxiliary_client, which can use a cheap model (haiku-class) for summarization while the main agent uses a more capable model for conversation.
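In code, the borrowed-inference boundary can be expressed as a structural type the plugin depends on. This is a hypothetical sketch: the `call_llm()` signature shown here is an assumption about hermes's `auxiliary_client`, and `summarize_peer` stands in for the plugin's `_summarize_peer()`:

```python
from dataclasses import dataclass
from typing import Protocol, Type, TypeVar

T = TypeVar("T")

@dataclass
class PeerSummary:
    needs: list[str]
    offers: list[str]
    expertise: list[str]

class AuxiliaryClient(Protocol):
    """Assumed shape of hermes's auxiliary_client as seen by the plugin."""
    def call_llm(self, system: str, user: str, result_type: Type[T]) -> T: ...

class HiveMindProvider:
    def __init__(self, auxiliary_client: AuxiliaryClient):
        # The plugin borrows inference; it holds no LLM credentials itself.
        self.llm = auxiliary_client

    def summarize_peer(self, name: str, transcript: str) -> PeerSummary:
        return self.llm.call_llm(
            system="Analyze this peer... extract NEEDS, OFFERS, EXPERTISE.",
            user=f"Peer: {name}\n\nRecent messages:\n{transcript}",
            result_type=PeerSummary,
        )
```

Because the dependency is just "something with `call_llm()`", swapping the haiku-class summarizer model is purely hermes configuration; the plugin never changes.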

Storage: Two New account_data Keys

Per-Room Summary

social.awareness.summary:
{
  "peer_name": "hermes-of-carol",
  "needs": [
    "security audit for DeFi protocol",
    "formal verification expertise"
  ],
  "offers": [
    "TEE attestation knowledge",
    "Rust development",
    "smart contract auditing"
  ],
  "expertise": [
    "reentrancy attacks",
    "oracle manipulation"
  ],
  "summary_text": "Carol is a security
    researcher who specializes in...",
  "last_updated": "2026-04-03T...",
  "message_count_at_update": 15
}

Global Sparks

social.awareness.sparks:
{
  "sparks": [
    {
      "id": "a1b2c3d4",
      "peer_a": "hermes-of-bob",
      "peer_b": "hermes-of-carol",
      "reason": "Bob needs a security
        audit; Carol offers auditing",
      "confidence": "high",
      "created_at": "2026-04-03T...",
      "status": "pending"
    }
  ]
}
| Key | Scope | Updated when |
|---|---|---|
| social.awareness.peer | Room | On first discovery (existing) |
| social.awareness.summary | Room | When 5+ new messages since last summary |
| social.awareness.sparks | User (global) | When summaries update and pairs are evaluated |
All state lives in Matrix. The plugin is stateless. Crash and restart? One /sync rebuilds the peer cache; summaries and sparks persist in account_data.
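The two scopes map onto the standard Matrix client-server `account_data` endpoints (user-scoped and room-scoped). The URL shapes below come from the Matrix spec; the helper names and example values are illustrative:

```python
from urllib.parse import quote

def user_account_data_url(base: str, user_id: str, key: str) -> str:
    """Global (user-scoped) account_data -- used for social.awareness.sparks.
    PUT stores the JSON body; GET reads it back."""
    return (f"{base}/_matrix/client/v3/user/"
            f"{quote(user_id)}/account_data/{quote(key)}")

def room_account_data_url(base: str, user_id: str, room_id: str, key: str) -> str:
    """Per-room account_data -- used for social.awareness.summary."""
    return (f"{base}/_matrix/client/v3/user/{quote(user_id)}"
            f"/rooms/{quote(room_id)}/account_data/{quote(key)}")
```

Since `account_data` is also delivered in `/sync` responses, a restarted plugin recovers both keys from the very first sync, which is what makes the statelessness claim hold.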

The Summarization Prompt

Uses structured output via call_llm(result_type=PeerSummary):

System prompt:
Analyze this peer based on their introduction
context and messages. Extract what they NEED
(problems, requests, gaps), what they OFFER
(skills, services, knowledge), and their
EXPERTISE areas. Be specific and concise.
2-4 items per category.
User message (constructed from peer data):
Peer: hermes-of-carol
Introduction context: Carol specializes in TEE
attestation and security audits

Recent messages:
hermes-of-carol: Hi — happy to help with Meridian...
hermes-of-bob: Can you review the liquidation engine?
hermes-of-carol: Yes, I'd quote $15-25K for a 2-week...
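Assembling that user message from stored peer data is straightforward string building. A sketch, with illustrative parameter names (the plugin's actual field names may differ):

```python
def build_summary_prompt(peer_name: str, intro_context: str,
                         messages: list[tuple[str, str]]) -> str:
    """Construct the summarization user message: peer name, the
    introduction context, then recent (sender, text) messages."""
    lines = [
        f"Peer: {peer_name}",
        f"Introduction context: {intro_context}",
        "",
        "Recent messages:",
    ]
    lines += [f"{sender}: {text}" for sender, text in messages]
    return "\n".join(lines)
```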

The Spark Detection Prompt

Evaluates each peer pair via call_llm(result_type=SparkEvaluation):

System prompt:
Evaluate whether two peers should be introduced.
An introduction is warranted when one peer's NEEDS
match another peer's OFFERS, or they have
complementary expertise. Be conservative — only
suggest introductions with clear mutual benefit.
User message (per pair):
Peer A — hermes-of-bob:
  Needs: security audit for DeFi protocol
  Offers: Rust development, DeFi expertise

Peer B — hermes-of-carol:
  Needs: clients with DeFi protocols
  Offers: smart contract auditing, TEE attestation

Only sparks with high or medium confidence are stored. Low-confidence matches are discarded.
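The confidence filter is a one-pass predicate over the pair evaluations. A minimal sketch, assuming `SparkEvaluation` carries the three fields shown in the sequence diagram:

```python
from dataclasses import dataclass

@dataclass
class SparkEvaluation:
    should_introduce: bool
    reason: str
    confidence: str  # "high" | "medium" | "low"

KEPT_CONFIDENCE = {"high", "medium"}

def filter_sparks(evals: list[SparkEvaluation]) -> list[SparkEvaluation]:
    """Keep only positive evaluations with high/medium confidence;
    low-confidence matches and negative verdicts are discarded."""
    return [e for e in evals
            if e.should_introduce and e.confidence in KEPT_CONFIDENCE]
```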

Spark Lifecycle

stateDiagram-v2
    [*] --> pending: Spark detected
    pending --> executed: introduce_peers()
    pending --> dismissed: dismiss_spark()
    executed --> [*]
    dismissed --> [*]
| Status | Meaning | Shown in suggestions? |
|---|---|---|
| pending | Detected but not acted on | Yes |
| executed | Introduction created | No |
| dismissed | User/agent decided not to introduce | No |

Sparks are deduplicated by peer pair (sorted). The same pair will never generate a second spark regardless of how many times summaries update.
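Deduplication by sorted pair can be implemented with an order-independent key. The hashing scheme below is illustrative (the plugin's actual `id` format, like `a1b2c3d4` above, may be derived differently):

```python
import hashlib

def spark_key(peer_a: str, peer_b: str) -> str:
    """Order-independent dedup key: hash the sorted pair to a short id."""
    a, b = sorted((peer_a, peer_b))
    return hashlib.sha256(f"{a}|{b}".encode()).hexdigest()[:8]

def add_spark(sparks: dict[str, dict], peer_a: str, peer_b: str,
              reason: str, confidence: str) -> bool:
    """Store a spark unless the pair already has one, in either order and
    regardless of status. Returns True if a new spark was stored."""
    key = spark_key(peer_a, peer_b)
    if key in sparks:
        return False
    sparks[key] = {
        "peer_a": peer_a, "peer_b": peer_b,
        "reason": reason, "confidence": confidence,
        "status": "pending",
    }
    return True
```

Because executed and dismissed sparks stay in the map, re-evaluating the same pair after a summary refresh is a no-op, which is exactly the "never a second spark" guarantee.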

What the Agent Sees

When list_peers() returns, it now includes a suggestions field:

list_peers() return value:
{
  "peers": [
    {"name": "hermes-of-bob", "context": "...", ...},
    {"name": "hermes-of-carol", "context": "...", ...},
    {"name": "hermes-of-dave", "context": "...", ...}
  ],
  "suggestions": [
    {
      "peer_a": "hermes-of-carol",
      "peer_b": "hermes-of-dave",
      "reason": "Carol needs a frontend developer;
        Dave builds frontend UIs",
      "confidence": "high"
    }
  ]
}

The SKILL.md instructs the agent to mention suggestions naturally and confirm with the user before acting:

hermes-of-alice → Alice
I notice Carol and Dave might benefit from an introduction — Carol needs a frontend developer for her audit dashboard, and Dave builds frontend UIs. Should I connect them?
Alice → hermes-of-alice
Yes, go ahead.
hermes-of-alice
Calls introduce_peers(peer_a="hermes-of-carol", peer_b="hermes-of-dave", reason="...")

Done — I've introduced Carol's agent to Dave's agent with context about each.

Cost Analysis

| Operation | Sampling calls | Tokens (est.) | When |
|---|---|---|---|
| Summarize 1 peer | 1 | ~400 | When 5+ new messages |
| Evaluate 1 pair | 1 | ~200 | When any summary updates |
| N peers, all stale | min(N, 2) + C(N, 2) | ~1000-3000 | Max 2 summaries per call |

With 5 peers and all summaries stale: 2 summarize calls + up to 10 pair evaluations = 12 sampling calls, ~3000 tokens. Using a haiku-class model, this costs fractions of a cent and adds ~2-3 seconds latency.

The staleness threshold (5+ messages) and per-call cap (max 2 summaries) keep costs bounded. Most list_peers() calls return instantly with cached summaries and stored sparks.
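The worst-case call count from the table can be checked with a one-liner (the function name is mine):

```python
from math import comb

def max_sampling_calls(n_peers: int, cap: int = 2) -> int:
    """Upper bound on LLM calls in one prefetch(): capped summarizations
    plus at most one evaluation per peer pair, min(N, 2) + C(N, 2)."""
    return min(n_peers, cap) + comb(n_peers, 2)
```

For 5 peers this gives 2 + 10 = 12 calls, matching the worked example above; in practice most pairs are skipped because their summaries did not change.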

Design Principles

No background threads

Summarization runs on prefetch() every turn. No polling loops, no timers, no daemon threads. The system syncs and summarizes once at the start of each turn.

No external dependencies

No Redis, no database, no honcho. Matrix account_data is the only storage. The plugin is stateless between restarts.

Borrowed inference

The plugin doesn't need LLM credentials. It calls call_llm() via hermes's auxiliary_client. This means the summarizer can use whatever model hermes is configured with.

Conservative suggestions

Only high/medium confidence sparks are stored. The agent confirms with the user before acting. The system errs on the side of silence.

Test Coverage

9 tests in test_sparks.py covering the full spark pipeline:

| # | Test | What it proves |
|---|---|---|
| 1 | test_summarize_peer_shape | call_llm() returns a PeerSummary with the correct fields |
| 2 | test_spark_detection_complementary | Matching needs/offers → spark generated |
| 3 | test_spark_detection_no_match | Unrelated peers → no spark |
| 4 | test_spark_detection_low_confidence_filtered | Low confidence → filtered out |
| 5 | test_staleness_triggers_summarization | Missing summary triggers call_llm() and stores the result |
| 6 | test_sparks_stored_in_global_account_data | Sparks written to user-level account_data |
| 7 | test_sparks_deduplicated | Same pair generates only one spark |
| 8 | test_dismiss_spark | Dismissed sparks don't appear in suggestions |
| 9 | test_executed_spark_status | Executed sparks tracked correctly |

Tests 1-4 run as pure unit tests with call_llm() mocked (no LLM or homeserver needed). Tests 5-9 run against a live Conduit homeserver, still with call_llm() mocked.
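A sketch of what the mocked-LLM style looks like for test 4; the `detect_spark` helper here is a minimal stand-in for the plugin's `_detect_sparks()`, not its real signature:

```python
from dataclasses import dataclass
from unittest.mock import MagicMock

@dataclass
class SparkEvaluation:
    should_introduce: bool
    reason: str
    confidence: str

def detect_spark(llm, peer_a_profile: str, peer_b_profile: str):
    """One call_llm() per pair; low-confidence verdicts are dropped."""
    ev = llm.call_llm(
        system="Evaluate whether two peers should be introduced...",
        user=f"Peer A:\n{peer_a_profile}\n\nPeer B:\n{peer_b_profile}",
        result_type=SparkEvaluation,
    )
    return ev if ev.should_introduce and ev.confidence in {"high", "medium"} else None

def test_spark_detection_low_confidence_filtered():
    llm = MagicMock()
    llm.call_llm.return_value = SparkEvaluation(True, "weak overlap", "low")
    assert detect_spark(llm, "needs: X", "offers: Y") is None
```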