Test Harness

Two testing layers: fast integration tests (50 tests, no LLM) plus live agent tests with real model inference.

Testing Architecture

graph LR
    subgraph "Layer 1: Integration Tests (no LLM)"
        T["pytest
4 test files, 50 tests
~8 seconds"]
        T --> MB["MatrixBackend
(tested directly)"]
        MB --> C1["Conduit :6167"]
    end
    subgraph "Layer 2: Live Agent Tests (with LLM)"
        H1["hermes-of-bob
container"]
        H2["hermes-of-carol
container"]
        H1 --> MP1["Memory Plugin"]
        H2 --> MP2["Memory Plugin"]
        MP1 --> C2["Conduit :6167"]
        MP2 --> C2
        H1 -.->|GLM-4.7| ZAI["ZAI API"]
        H2 -.->|GLM-4.7| ZAI
    end
    style T fill:#5aaa6e,color:#2d4a35
    style H1 fill:#d4a0c0,color:#3d2050
    style H2 fill:#d4a0c0,color:#3d2050
    style C1 fill:#fdd5d8,color:#3d2b2b
    style C2 fill:#fdd5d8,color:#3d2b2b

Layer 1: Integration Tests

test_social_awareness.py — 13 tests against live Conduit, no LLM needed.

| # | Test | What it proves |
|---|------|----------------|
| 1 | test_auto_join_and_discover_peer | Sync detects invite, auto-joins, extracts peer |
| 2 | test_peer_context_extracted | "About @peer: ..." message parsed into context field |
| 3 | test_introduced_by_is_alice | Room creator recorded as introducer |
| 4 | test_peer_id_format | No @ prefix or : separator in peer ID |
| 5 | test_check_messages_from_peer | Messages readable via peer name lookup |
| 6 | test_no_matrix_ids_in_messages | Message "from" uses names, not Matrix IDs |
| 7 | test_multiple_introductions | Two intros = two separate peers |
| 8 | test_get_peer_info | Full peer details returned |
| 9 | test_send_to_peer | Bob sends message, Carol sees it |
| 10 | test_introduce_peers_creates_room | Bob introduces Carol↔Dave, room created with both invited |
| 11 | test_introduce_peers_posts_context | "About" messages posted in the new introduction room |
| 12 | test_global_account_data | User-level account_data round-trips correctly |
| 13 | test_bidirectional_conversation | Multiple messages in both directions |
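The peer-ID invariant behind tests 4 and 6 can be sketched as a pure function. This is a minimal sketch: `normalize_peer_id` is an illustrative name, not the backend's actual API.

```python
def normalize_peer_id(matrix_user_id: str) -> str:
    """Strip the @ prefix and :server suffix from a Matrix user ID."""
    return matrix_user_id.lstrip("@").split(":", 1)[0]

def test_peer_id_format():
    peer = normalize_peer_id("@hermes-of-carol:localhost")
    assert "@" not in peer and ":" not in peer
    assert peer == "hermes-of-carol"
```

The point of the invariant is that agents address each other by friendly names; raw Matrix IDs stay an implementation detail of the backend.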

test_sparks.py — 9 tests with mocked call_llm(), against live Conduit.

| # | Test | What it proves |
|---|------|----------------|
| 1 | test_summarize_peer_shape | call_llm() returns PeerSummary with correct fields |
| 2 | test_spark_detection_complementary | Matching needs/offers generates a spark |
| 3 | test_spark_detection_no_match | Unrelated peers generate no spark |
| 4 | test_spark_detection_low_confidence_filtered | Low-confidence sparks filtered out |
| 5 | test_staleness_triggers_summarization | Missing summary triggers call_llm() and stores result |
| 6 | test_sparks_stored_in_global_account_data | Sparks written to user-level account_data |
| 7 | test_sparks_deduplicated | Same peer pair only generates one spark |
| 8 | test_dismiss_spark | Dismissed sparks don't appear in suggestions |
| 9 | test_executed_spark_status | Executed sparks tracked correctly |
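The deduplication rule from test 7 is easy to illustrate: keying sparks by the unordered peer pair makes repeated detection of the same pair a no-op. A sketch with an assumed data shape, not the plugin's real schema:

```python
def spark_key(peer_a: str, peer_b: str) -> tuple:
    """Order-independent key for a peer pair."""
    return tuple(sorted((peer_a, peer_b)))

sparks = {}
# Detecting (bob, carol) and later (carol, bob) hits the same key.
for a, b in [("hermes-of-bob", "hermes-of-carol"),
             ("hermes-of-carol", "hermes-of-bob")]:
    sparks.setdefault(spark_key(a, b), {"pair": (a, b), "status": "suggested"})

print(len(sparks))  # 1
```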

test_hivemind_plugin.py — 14 tests verifying the HiveMindProvider plugin interface.

| # | Test | What it proves |
|---|------|----------------|
| 1 | test_plugin_loads | HiveMindProvider instantiates and registers tools |
| 2 | test_prefetch_syncs | prefetch() triggers /sync and populates peer cache |
| 3 | test_hivemind_list_peers | hivemind_list_peers returns discovered peers |
| 4 | test_hivemind_check_messages | hivemind_check_messages returns messages from peer |
| 5 | test_hivemind_send_to_peer | hivemind_send_to_peer delivers message via Matrix |
| 6 | test_hivemind_get_peer_info | hivemind_get_peer_info returns full peer details |
| 7 | test_hivemind_introduce_peers | hivemind_introduce_peers creates room with both invited |
| 8 | test_hivemind_dismiss_spark | hivemind_dismiss_spark marks spark as dismissed |
| 9 | test_prefetch_summarizes_stale | prefetch() calls call_llm() for stale peer summaries |
| 10 | test_prefetch_detects_sparks | prefetch() evaluates peer pairs for introduction sparks |
| 11 | test_tools_registered | All 6 hivemind_* tools appear in provider.tools() |
| 12 | test_plugin_yaml_valid | plugin.yaml loads and declares correct provider |
| 13 | test_no_matrix_ids_exposed | Tool outputs use peer names, not @user:server IDs |
| 14 | test_prefetch_idempotent | Multiple prefetch() calls don't duplicate peers |
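The mocked-LLM pattern behind tests 9 and 14 can be sketched with an injected stub. `FakeProvider` and its method names are illustrative, not the HiveMindProvider API; the point is that summarization goes through a replaceable function, so no real model is needed.

```python
class FakeProvider:
    """Toy stand-in: summarizes peers via an injected llm callable."""
    def __init__(self, llm):
        self.llm = llm
        self.summaries = {}

    def prefetch(self, peers):
        for peer in peers:
            if peer not in self.summaries:          # stale: no cached summary
                self.summaries[peer] = self.llm(f"Summarize {peer}")

calls = []
def stub_llm(prompt):
    calls.append(prompt)
    return {"needs": [], "offers": []}

p = FakeProvider(stub_llm)
p.prefetch(["hermes-of-carol"])
p.prefetch(["hermes-of-carol"])   # idempotent: cached, no second LLM call
print(len(calls))  # 1
```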
Run:
docker compose up conduit -d
python3 setup_users.py
pytest test_introducer.py test_social_awareness.py test_sparks.py test_hivemind_plugin.py -v

# 50 passed in ~8s
These tests verify the Matrix backend abstraction, ambient spark pipeline, and memory plugin interface without any LLM involvement. Spark and plugin tests mock call_llm() to test summarization, detection, deduplication, and lifecycle without needing a real model. Tests need hermes-agent on PYTHONPATH to import the MemoryProvider base class.

Layer 2: Live Agent Tests

Full hermes-agent containers with GLM-4.7 (via ZAI) using the memory plugin against Conduit.

Container Architecture

graph TB
    subgraph "docker-compose.agent-test.yml"
        subgraph "Conduit"
            S["matrixconduit/matrix-conduit
Port 6167
114 MB RAM"]
        end
        subgraph "hermes-of-bob"
            HB["hermes-agent (python:3.11-slim)
+ hivemind plugin
+ matrix_backend.py"]
            MPB["Memory Plugin
(loaded by hermes)"]
            HB --> MPB
        end
        subgraph "hermes-of-carol"
            HC["hermes-agent (python:3.11-slim)
+ hivemind plugin
+ matrix_backend.py"]
            MPC["Memory Plugin
(loaded by hermes)"]
            HC --> MPC
        end
        MPB -->|"HTTP"| S
        MPC -->|"HTTP"| S
    end
    ZAI["ZAI API
GLM-4.7"]
    HB -.->|"HTTPS"| ZAI
    HC -.->|"HTTPS"| ZAI
    style S fill:#fdd5d8,color:#3d2b2b
    style HB fill:#d4a0c0,color:#3d2050
    style HC fill:#d4a0c0,color:#3d2050

Image

| Component | Base | Size |
|-----------|------|------|
| Agent image | python:3.11-slim + hermes-agent + aiohttp | ~1 GB |
| Conduit | matrixconduit/matrix-conduit | ~150 MB |

The agent image is large because of the hermes-agent [all] extras; installing with [core] instead would trim it to roughly 400 MB.

Setup Flow

Complete setup from scratch:
# 1. Build agent images
docker compose -f docker-compose.agent-test.yml build

# 2. Start Matrix server
docker compose -f docker-compose.agent-test.yml up conduit -d

# 3. Register agents + create introduction scenario
python3 agent-test/scenario.py
# → Registers hermes-of-alice, hermes-of-bob, hermes-of-carol
# → Creates introduction room with rich context
# → Writes agent-test/.env.agents with credentials

# 4. Start agent containers
docker compose -f docker-compose.agent-test.yml \
  --env-file agent-test/.env.agents up hermes-of-bob hermes-of-carol -d

# 5. Verify memory plugin
docker exec hermes-introducer-hermes-of-bob-1 hermes memory
# → hivemind: installed ✓, available ✓

# 6. Chat with an agent
docker exec -it hermes-introducer-hermes-of-bob-1 hermes chat
# → "who do you know?" → calls hivemind_list_peers → "I know hermes-of-carol..."
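Step 3's credentials file might be generated along these lines. A hedged sketch: `env_lines` is an illustrative helper, and the real scenario.py may use a different variable-naming scheme for .env.agents.

```python
def env_lines(agents: dict) -> list:
    """Render per-agent Matrix credentials as KEY=value lines."""
    lines = []
    for name, creds in agents.items():
        prefix = name.upper().replace("-", "_")
        lines.append(f"{prefix}_USER_ID=@{name}:localhost")
        lines.append(f"{prefix}_ACCESS_TOKEN={creds['access_token']}")
    return lines

print(env_lines({"hermes-of-bob": {"access_token": "syt_example"}}))
```

docker compose then maps these values onto each container's MATRIX_USER_ID and MATRIX_ACCESS_TOKEN via `--env-file`.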

What Each Container Gets

Each agent container is identical except for Matrix credentials:

Dockerfile (agent-test/Dockerfile):
FROM python:3.11-slim
RUN git clone --branch v2026.4.3 hermes-agent && pip install ".[all]" aiohttp matrix-nio

# Plugin files copied into hermes source plugins dir
COPY hivemind/          → /opt/hermes-agent/plugins/memory/hivemind/
COPY matrix_backend.py  → /opt/hermes-agent/matrix_backend.py

# Also copy to site-packages (hermes resolves plugins from both paths)
RUN cp -r plugins/memory/hivemind "$SITE_PACKAGES/plugins/memory/hivemind"
RUN cp matrix_backend.py "$SITE_PACKAGES/matrix_backend.py"

COPY skills/            → /opt/hermes-agent/skills/introducer/
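Copying the plugin to both locations makes sense if the loader probes several roots and takes the first match. This is a hedged guess at hermes' resolution behavior, sketched with illustrative paths, not its real code:

```python
import tempfile
from pathlib import Path

def resolve_plugin(name, roots):
    """Return the first root containing plugins/memory/<name>, else None."""
    for root in roots:
        candidate = Path(root) / "plugins" / "memory" / name
        if candidate.is_dir():
            return candidate
    return None

with tempfile.TemporaryDirectory() as tmp:
    site = Path(tmp) / "site-packages"
    (site / "plugins" / "memory" / "hivemind").mkdir(parents=True)
    # Source tree lacks the plugin here; the site-packages copy is found.
    found = resolve_plugin("hivemind", [Path(tmp) / "src", site])
    print(found is not None)  # True
```

With copies in both the source tree and site-packages, the plugin resolves no matter which root hermes consults first.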
Entrypoint (agent-test/agent-entrypoint.sh):
1. Bootstrap $HERMES_HOME (skills, config, SOUL.md)
2. Inject GLM_API_KEY into hermes .env
3. Patch config.yaml:
   - model: zai/glm-4.7
   - memory:
       provider: hivemind
       env: MATRIX_HOMESERVER, MATRIX_USER_ID, MATRIX_ACCESS_TOKEN
4. exec sleep infinity (ready for `hermes chat`)

Environment Variables

| Variable | Source | Purpose |
|----------|--------|---------|
| HERMES_MODEL | docker-compose | zai/glm-4.7 |
| GLM_API_KEY | .env.agents | ZAI API authentication |
| GLM_BASE_URL | docker-compose | https://api.z.ai/api/coding/paas/v4 |
| MATRIX_HOMESERVER | docker-compose | http://conduit:6167 (internal Docker DNS) |
| MATRIX_USER_ID | .env.agents | Per-agent: @hermes-of-bob:localhost |
| MATRIX_ACCESS_TOKEN | .env.agents | Per-agent: generated by scenario.py |
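A minimal sketch of how matrix_backend.py might consume the three MATRIX_* variables. `backend_config` and the dict shape are assumptions; the homeserver fallback simply mirrors the docker-compose default above.

```python
import os

def backend_config(env=None):
    """Collect Matrix connection settings from the environment."""
    env = os.environ if env is None else env
    return {
        "homeserver": env.get("MATRIX_HOMESERVER", "http://conduit:6167"),
        "user_id": env["MATRIX_USER_ID"],        # required, per-agent
        "access_token": env["MATRIX_ACCESS_TOKEN"],  # required, per-agent
    }

cfg = backend_config({
    "MATRIX_USER_ID": "@hermes-of-bob:localhost",
    "MATRIX_ACCESS_TOKEN": "syt_example",
})
print(cfg["homeserver"])  # http://conduit:6167
```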

Test Coverage Summary

| Layer | Tests | What's tested | LLM needed? |
|-------|-------|---------------|-------------|
| Matrix protocol | 14 (test_introducer.py) | Room lifecycle, membership, messaging, kicks, bans | No |
| Peer abstraction | 13 (test_social_awareness.py) | Auto-join, context extraction, messaging, introductions | No |
| Ambient sparks | 9 (test_sparks.py) | Summarization, spark detection, dedup, lifecycle | No (mocked call_llm) |
| Memory plugin | 14 (test_hivemind_plugin.py) | Plugin lifecycle, prefetch, all hivemind_* tools, idempotency | No (mocked call_llm) |
| Full agent pipeline | Live scenario | hermes → GLM-4.7 → plugin → Matrix → response | Yes (ZAI) |
50 automated tests (no LLM, ~8 seconds total) plus live agent scenarios with real model inference.

Verified: Live Agent Round Trip

The full pipeline was tested end-to-end on 2026-04-04 with GLM-4.7 via ZAI:

| Step | Agent | Tool called | Result |
|------|-------|-------------|--------|
| 1. Discover peer | hermes-of-bob | hivemind_list_peers | Found hermes-of-carol (introduced by hermes-of-alice) |
| 2. Send message | hermes-of-bob | hivemind_send_to_peer | Audit request delivered to Carol |
| 3. Read & respond | hermes-of-carol | hivemind_list_peers + hivemind_check_messages + hivemind_send_to_peer | Read Bob's request, sent proposal ($15-25K, 2 weeks) |
| 4. Read response | hermes-of-bob | hivemind_check_messages | Summarized Carol's proposal accurately |

All tool calls completed in under one second. Messages were encrypted by matrix-nio (Olm/Megolm) against the Conduit homeserver.