HouYi is a lightweight, extensible, production-grade multi-agent framework that ships with SOTA built-in agents (Deep Research, Chatbox, Memory Inbox). One Agent class, one SDK: define, orchestrate, evaluate, and ship agents from prototype to production without changing your API surface.
Why HouYi
- Full-lifecycle harness: not just execution, but definition → orchestration → context engineering → evaluation → observability → governance. Every layer is pluggable, and every extension point is documented for community and enterprise customization.
- Context engineering as first-class: token budgeting, persistent memory with emphasis-aware recall, RAG, context compression, and Reminders injection at the Transformer attention sweet spot, all built into the SDK rather than bolted on as afterthoughts.
- Neuro-symbolic verification: the Z3 SMT solver validates LLM outputs against business constraints, separating probabilistic reasoning from deterministic correctness for production reliability.
- Ships with SOTA agents: Deep Research (plan → multi-round search → conflict resolution → citation-verified report with RACE/FACT scoring), Chatbox (multi-turn with tool calling and memory), Memory Inbox (LLM-powered extraction with a review workflow). Use them directly or study their source as reference implementations.
```
┌─────────────────────────────────────────────────────────────────┐
│                  HouYi Studio (Ideas Foundry)                   │
│    Graph Orchestration · Chatbox · Agent Hub · Deep Research    │
├─────────────────────────────────────────────────────────────────┤
│                  Studio Server (FastAPI + SSE)                  │
│      Chat API · Research API · Memory API · Knowledge API       │
├─────────────────────────────────────────────────────────────────┤
│                        HouYi SDK (Core)                         │
│                                                                 │
│   Agent Runner · AgentTeam Manager · Team Task · DAG Engine     │
│                               │                                 │
│            Orchestrator: Delegate · Autonomous · DAG            │
│                               │                                 │
│  Context Engineering Layer                          (pluggable) │
│    Token Budget · Tools · Memory · RAG · State Checkpoints      │
│  Capabilities Layer                                 (pluggable) │
│    SimpleSkill · Web Search · Shell Exec · A2A · Self-Evolver   │
│  Quality & Governance Layer                         (pluggable) │
│    Evaluators · Z3 Verification · Sandbox · Cost Control        │
│    OTEL Tracing · Error Policy · Conflict Resolution            │
│  Adapters Layer                                     (pluggable) │
│    OpenAI · Anthropic · Gemini · more...                        │
│    Memory Store · Embedding Provider · Persistence Backend      │
└─────────────────────────────────────────────────────────────────┘
```
HouYi is designed for community contribution and enterprise customization. Every pluggable layer exposes well-defined extension points:
| Extension Point | Protocol / Base Class | Implementations |
|---|---|---|
| LLM Adapter | `LLMAdapter` | OpenAI, Anthropic, Gemini, Ollama, vLLM |
| Memory Backend | `MemoryStore` | SQLite, Redis, QMD |
| Embedding Provider | `EmbeddingProvider` | FastEmbed, OpenAI, HuggingFace |
| Search Provider | `WebSearchService` | Bocha, DuckDuckGo, Tavily, Serper |
| Skill / Tool | `@tool` / `SkillSpec` | Any Python function → auto-schema |
| Context Source | `ContextSource` | RAG, Memory, MCP server, custom retriever |
| Evaluator | `Evaluator` | 19+ built-in evaluators, extensible via the strategy pattern |
| Observability Exporter | OTEL `SpanExporter` | Jaeger, Zipkin, Datadog, Prometheus |
| Message Bus Backend | `AgentMessageBus` | In-process queue, NATS, Kafka, RocketMQ |
| Orchestration Mode | `AgentOrchestrator` | Delegate, Autonomous, DAG, custom |
| Error / Conflict Policy | `ErrorPolicy` / `ConflictResolver` | Retry, fallback, source voting, LLM arbiter |
| Verification Backend | Z3 Solver | SMT constraints, custom verifier |
| State / Persistence | `StateStore` | SQLite, filesystem, Redis |
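To make the Evaluator extension point concrete, here is a hypothetical custom evaluator. The `Evaluator` stand-in below only mimics the shape of such a plugin; HouYi's real base-class signature may differ.

```python
# Hypothetical custom evaluator plugin. The Evaluator class here is a
# stand-in that mimics a plugin base class; HouYi's actual API may differ.
class Evaluator:
    """Stand-in for the framework's Evaluator base class."""
    name: str

    def evaluate(self, output: str, expected: str = "") -> float:
        raise NotImplementedError

class KeywordCoverageEvaluator(Evaluator):
    """Score = fraction of required keywords that appear in the output."""
    name = "keyword_coverage"

    def __init__(self, keywords: list[str]):
        self.keywords = [k.lower() for k in keywords]

    def evaluate(self, output: str, expected: str = "") -> float:
        text = output.lower()
        hits = sum(1 for k in self.keywords if k in text)
        return hits / len(self.keywords) if self.keywords else 1.0

ev = KeywordCoverageEvaluator(["agent", "dag"])
print(ev.evaluate("HouYi runs agents on a DAG engine"))  # 1.0: both keywords found
```

Because evaluators are plain classes behind a common interface, a custom metric like this can sit alongside the built-in quality, safety, RAG, and performance evaluators.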
| Category | Feature | Highlight |
|---|---|---|
| Orchestration | Lightweight Pydantic Core | Declarative agents, tasks, and workflows as Python classes with automatic validation and serialization: "code as configuration" |
| | Unified Multi-Agent Engine | Same `Agent` class, same SDK: tool-loop, `mode="delegate"` (supervisor dispatches sub-agents), `mode="autonomous"` (shared state + message bus). Scale from a single chatbot to a multi-agent research team without API fragmentation |
| | Async DAG Execution | Built on `asyncio` with DAG-based task orchestration: parallel execution, dynamic graph evolution, and non-blocking I/O for high-concurrency scenarios |
| Context | Context Engineering Pipeline | Dynamic token budgeting, RAG integration, persistent Memory with hybrid retrieval (full-text + embedding), emphasis-aware recall that prioritizes user-stressed instructions, and context compression with Reminders injection at the Transformer attention sweet spot |
| | SimpleSkill Specification | Cross-platform skill model with built-in governance, evaluation hooks, and host-portable capability negotiation. Any Python function becomes a governed, evaluable capability unit |
| Quality | Neuro-Symbolic Verification | Z3 SMT solver formally verifies LLM outputs against business constraints, separating probabilistic reasoning from deterministic correctness for production reliability |
| | Extensible Evaluator Framework | 19+ evaluators across 4 categories: Quality (accuracy, completeness, relevance, coherence, factuality), Safety (toxicity, bias, hallucination), RAG (groundedness, faithfulness, context precision/recall), Performance (cost, latency). Add custom evaluators via the `Evaluator` base class |
| | Cost-Aware Governance | Token budget control with dynamic model routing enables automatic cost optimization while maintaining quality through intelligent provider fallback |
| Infrastructure | A2A Pub/Sub Protocol | Native agent-to-agent messaging (P2P, Pub/Sub, Broadcast) aligned with the A2A Pub/Sub draft. Pluggable transport: in-process queues for dev, NATS/Kafka/RocketMQ for distributed production |
| | Zero-Config Observability | OpenTelemetry auto-instruments every agent execution with distributed tracing across LLM calls, tool invocations, and state transitions: <3% overhead, no manual setup |
| | Persistent State & Workflows | Automatic execution snapshots support pause/resume, external event handling, and human-in-the-loop workflows: agents wait for async callbacks and resume exactly where they left off |
| | Secure Sandbox Execution | Isolated execution environment with permission controls prevents LLM-generated code from accessing unauthorized resources, ensuring enterprise-grade security |
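The token-budgeting step in the context pipeline can be sketched roughly as follows. This is illustrative, not HouYi's internal algorithm, and the ~4-characters-per-token estimate is an assumption:

```python
# Illustrative token budgeting, not HouYi's internal algorithm.
# The ~4-characters-per-token estimate is a rough assumption.
def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def fit_to_budget(messages: list[str], budget: int) -> list[str]:
    """Keep the most recent messages whose combined estimate fits the budget."""
    kept: list[str] = []
    used = 0
    for msg in reversed(messages):      # walk from newest to oldest
        cost = estimate_tokens(msg)
        if used + cost > budget:
            break                       # budget exhausted: drop older history
        kept.append(msg)
        used += cost
    return list(reversed(kept))         # restore chronological order

history = ["old question " * 3, "old answer " * 3, "latest question"]
print(fit_to_budget(history, 12))
```

A production pipeline would layer compression and RAG on top of a trim like this, but the invariant is the same: the newest context survives first.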
```bash
git clone https://github.com/YiLabsAI/HouYiAgent.git
cd HouYiAgent
uv sync --extra dev
```

HouYi Studio is a full-featured web IDE with Chatbox, Agent Hub, Deep Research, and Memory Inbox. Start it locally with one command:

```bash
cp .env.example .env   # configure your LLM and search API keys
./scripts/dev.sh       # launches backend (FastAPI) + frontend (Vite) via tmux
```

Open http://localhost:3000 to access the Studio.
Define a tool and run a single agent:

```python
from houyi import Agent, tool
from houyi.llm import OpenAIAdapter

@tool
def search(query: str) -> list[str]:
    """Search the web for information."""
    return [f"Result for {query}"]

agent = Agent(
    role="Researcher",
    skills=[search],
    llm=OpenAIAdapter(model="gpt-4o-mini"),
)
result = agent.run("What is HouYi?")
```

Compose agents into a team with sequential tasks:

```python
from houyi import Agent, Task, Team

researcher = Agent(role="Researcher", skills=[search], llm=llm)
analyst = Agent(role="Analyst", skills=[analyze], llm=llm)

team = Team(
    agents=[researcher, analyst],
    tasks=[
        Task("Research AI trends", agent=researcher),
        Task("Analyze findings", agent=analyst, context=[0]),
    ],
)
result = team.run()
```

Run a supervisor that delegates to sub-agents:

```python
from houyi import Agent, AgentTeamConfig

supervisor = Agent(
    role="Research Supervisor",
    llm=llm,
    tools=[web_search],
    sub_agents=[
        AgentTeamConfig(role="Searcher", skills=["web_search"]),
        AgentTeamConfig(role="Analyst", skills=["code_execute"]),
    ],
    mode="delegate",
)
result = supervisor.run("Deep research on AI agent architectures")
```

Persist and recall memory (inside an async context):

```python
from houyi.adapters.memory.engine import MemoryEngine
from houyi.adapters.memory.store import MemoryStore

store = MemoryStore(data_dir="./memory_data")
engine = MemoryEngine(store)

await engine.add("User prefers Python over JavaScript", tags=["preference"])
memories = await engine.recall("programming language preference?", top_k=5)
context = await engine.build_context("coding question", max_tokens=500)
```

Inject Reminders into the conversation:

```python
from houyi.application.context.reminders import ReminderInjector, CITATION_REMINDER

injector = ReminderInjector([CITATION_REMINDER])
messages = injector.inject(conversation_messages)
# Critical instructions injected at the context tail (the Transformer attention sweet spot)
```

Evaluate an agent against test cases:

```python
from houyi import evaluate

results = evaluate(
    agent=agent,
    test_cases=[{"input": "What is AI?", "expected_output": "..."}],
    evaluators=["accuracy", "completeness", "relevance"],
)
print(results.summary())
```

HouYi ships with production-ready agent applications built on top of the SDK:
| Agent | Description |
|---|---|
| Deep Research | Automated research: plan decomposition → multi-round web search → source aggregation → intermediate analysis → conflict resolution → citation-verified report with RACE/FACT quality scoring |
| Chatbox | Multi-turn conversational AI with streaming, tool calling, memory integration, and full context engineering pipeline |
| Memory Inbox | LLM-powered memory extraction from conversations with human-in-the-loop review/approve/reject workflow |
Each is a production-grade application that exercises every layer of the SDK. Study their source as reference implementations for building your own agents.
| Guide | Description |
|---|---|
| Getting Started | Installation, quick start, core concepts |
| API Reference | Complete API documentation |
| Advanced Features | Observability, multi-LLM, DAG execution, context engineering |
| Evaluation | Evaluator framework and all built-in evaluators |
| Development Guide | Coding standards and engineering practices |
| Examples | Runnable code examples |
We welcome contributions! See our Contributing Guide.
HouYi is built on and contributes to open standards:
| Standard | Role in HouYi |
|---|---|
| OpenTelemetry | Zero-config distributed tracing across LLM calls, tools, and agent state transitions |
| SimpleSkill | HouYi's native skill specification: cross-platform, governable, evaluable capability units (originated in this project) |
| MCP | Model Context Protocol integration for external context sources |
| A2A | Agent-to-Agent protocol with native Pub/Sub messaging for distributed multi-agent communication |
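The A2A transport abstraction (in-process queues for development, NATS/Kafka/RocketMQ for production) can be illustrated with a minimal in-process pub/sub bus. This is a hypothetical sketch and does not reflect HouYi's actual `AgentMessageBus` interface:

```python
# Minimal in-process pub/sub bus illustrating the A2A transport idea.
# Hypothetical sketch; HouYi's AgentMessageBus interface may differ.
import asyncio
from collections import defaultdict

class InProcessBus:
    def __init__(self) -> None:
        self._topics: dict[str, list[asyncio.Queue]] = defaultdict(list)

    def subscribe(self, topic: str) -> asyncio.Queue:
        """Register a subscriber and return its private inbox queue."""
        inbox: asyncio.Queue = asyncio.Queue()
        self._topics[topic].append(inbox)
        return inbox

    async def publish(self, topic: str, message: dict) -> None:
        for inbox in self._topics[topic]:   # fan out to every subscriber
            await inbox.put(message)

async def demo() -> dict:
    bus = InProcessBus()
    inbox = bus.subscribe("research.results")
    await bus.publish("research.results", {"from": "Searcher", "data": "findings"})
    return await inbox.get()

print(asyncio.run(demo()))
```

Because publishers only see a topic name, swapping this queue-backed bus for a NATS- or Kafka-backed one changes the transport without touching agent code.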