Why Pori Dropped LangChain: Building LLM Wrappers From Scratch

December 25, 2025

Aloy Sathekge

LangChain is the default choice for LLM applications. It's got integrations for everything, a massive community, and you can ship a prototype in an afternoon. We started Pori with it.

Then we ripped it out.

This isn't a LangChain-is-bad post. It's a post about what happens when your agent framework needs to control every layer of the LLM call — structured output, tool calling, message formatting — and you realize you're fighting the abstraction more than using it. If you want to skip ahead to the code, Pori is open source on GitHub.

The Problem

Pori is an agent framework built around a specific loop: Plan → Act → Reflect → Evaluate. Every step talks to an LLM. The agent plans with structured output (Pydantic models), executes tools, pauses for human approval at HITL gates, reflects on results, and evaluates whether to continue.

That means the LLM layer needs to do two things well:

  1. Text completions — for open-ended reasoning
  2. Structured output — for typed decisions the agent loop can actually parse
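For the structured side, the typed decisions are ordinary Pydantic models. As an illustration only (the field names here are assumptions, not Pori's actual schema), a planning result might look like:

```python
from pydantic import BaseModel


class PlanResult(BaseModel):
    # Illustrative fields — the real Pori models may differ
    goal: str
    steps: list[str]
    needs_approval: bool = False


# Validation happens at the boundary: raw LLM output in, typed object out
plan = PlanResult.model_validate({
    "goal": "Summarize the report",
    "steps": ["fetch document", "extract key points"],
})
```

The agent loop then branches on typed fields (`plan.needs_approval`) instead of parsing free text.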

LangChain handles both. But when you're building an agent that needs to work identically across Anthropic, OpenAI, and any compatible endpoint, you start hitting sharp edges:

  • Anthropic puts the system message in a separate system parameter. OpenAI puts it in the messages array. LangChain abstracts this, but when something goes wrong, you're debugging through three layers of abstraction to find a one-line SDK difference.
  • Structured output works differently per provider — Anthropic uses tool calling, OpenAI uses response_format with JSON schema. LangChain's with_structured_output() handles this, but the error messages when it fails are opaque.
  • We needed full control over the request/response cycle for HITL logging. Every LLM call gets tracked — what went in, what came out, which agent step triggered it. Wrapping LangChain's internals to extract this was more code than just calling the SDK directly.

The dependency tree was the final straw. LangChain pulls in a lot. For a framework that prides itself on being lightweight, having langchain-core, langchain-anthropic, langchain-openai, and their transitive dependencies felt wrong.

The Architecture

We replaced everything with a pori/llm/ module — about 200 lines of actual logic across five files:

pori/llm/
├── messages.py      # Universal message types
├── base.py          # BaseChatModel protocol
├── anthropic.py     # Anthropic implementation
├── openai.py        # OpenAI implementation
└── __init__.py      # Exports

Universal Messages

Every provider has its own message format. We defined one:

SystemMessage(content="You are helpful")
UserMessage(content="Hello!")
AssistantMessage(content="Hi there!")

Simple dataclasses. Each provider converts from this format to whatever its SDK expects. The conversion happens inside the provider, not in shared middleware.
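A minimal sketch of what those dataclasses could look like — Pori's actual field set may differ, but the shape is this simple:

```python
from dataclasses import dataclass


@dataclass
class BaseMessage:
    content: str
    role: str = "user"


@dataclass
class SystemMessage(BaseMessage):
    role: str = "system"


@dataclass
class UserMessage(BaseMessage):
    role: str = "user"


@dataclass
class AssistantMessage(BaseMessage):
    role: str = "assistant"
```

Because each subclass only overrides the `role` default, provider code can treat everything as a `BaseMessage` and branch on type or role when converting.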

The Protocol

Instead of an abstract base class with inheritance, we used Python's Protocol — structural typing:

from typing import Any, Protocol, TypeVar

from pydantic import BaseModel

from pori.llm.messages import BaseMessage

T = TypeVar("T", bound=BaseModel)


class BaseChatModel(Protocol):
    model: str

    async def ainvoke(
        self,
        messages: list[BaseMessage],
        output_format: type[T] | None = None,
    ) -> str | T:
        """Returns text OR parsed Pydantic model"""
        ...

    def with_structured_output(
        self,
        output_model: type[T],
        include_raw: bool = False,
    ) -> Any:
        """Returns wrapper for structured output"""
        ...

Any class that implements ainvoke() and with_structured_output() is a valid LLM provider. No inheritance needed. No registration. Just implement the methods and it works.

We kept with_structured_output() deliberately — it mirrors LangChain's interface. If someone's migrating from LangChain, the calling code barely changes. The difference is what's underneath.

Two Paths, One Method

Every ainvoke() call takes an optional output_format. If it's None, you get text back. If it's a Pydantic model, you get a parsed instance:

# Text output
response = await llm.ainvoke(messages)
# response: str
 
# Structured output
response = await llm.ainvoke(messages, output_format=PlanResult)
# response: PlanResult (validated Pydantic model)

Inside the provider, this is a simple branch:

if output_format is None:
    response = await self._client.messages.create(**request)
    return response.content[0].text
else:
    schema = output_format.model_json_schema()
    # Provider-specific structured output call
    response = await self._client.messages.create(
        ...,
        structured_output_params=...
    )
    return output_format.model_validate(...)
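To make the structured branch concrete, here's a hedged sketch of how the Anthropic path can build its request: the Pydantic schema becomes a tool definition, and the model is forced to "call" it, so the `tool_use` input is the structured payload. The helper and tool names (`build_structured_request`, `emit_result`) are illustrative, not Pori's actual code:

```python
from pydantic import BaseModel


class PlanResult(BaseModel):
    goal: str
    steps: list[str]


def build_structured_request(output_format: type[BaseModel], messages_list: list[dict]) -> dict:
    # The JSON schema of the Pydantic model becomes a forced tool call;
    # the model's tool_use input is later validated back into the model.
    tool = {
        "name": "emit_result",
        "description": "Return the structured result.",
        "input_schema": output_format.model_json_schema(),
    }
    return {
        "messages": messages_list,
        "tools": [tool],
        "tool_choice": {"type": "tool", "name": "emit_result"},
    }


request = build_structured_request(PlanResult, [{"role": "user", "content": "Plan this task."}])
```

The actual network call then just splats this dict into the SDK's `messages.create(**request)` alongside the model name.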

Provider Differences

This is the part that matters. Each provider is different, and we handle it explicitly instead of hiding it behind five layers of abstraction:

Provider     System Message            Structured Output
─────────    ────────────────────      ─────────────────────────────
Anthropic    Separate `system` param   Tool calling (`tool_use` block)
OpenAI       In `messages` array       `response_format` JSON schema
Google       In `messages` array       `response_mime_type` + schema

For Anthropic, the system message gets extracted and passed separately:

system_prompt = None
messages_list = []
 
for msg in messages:
    if isinstance(msg, SystemMessage):
        system_prompt = msg.content
    else:
        messages_list.append({
            "role": msg.role,
            "content": msg.content
        })

For OpenAI, it just stays in the array. That's a three-line difference. In LangChain, that three-line difference is spread across multiple files, mixins, and adapters.
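For reference, the OpenAI side of the same conversion is a pass-through, sketched here with a stand-in message type so the snippet runs on its own:

```python
from dataclasses import dataclass


@dataclass
class Msg:  # stand-in for the universal message types
    role: str
    content: str


messages = [Msg("system", "You are helpful"), Msg("user", "Hello!")]

# No extraction pass — the system message rides along in the array
messages_list = [{"role": m.role, "content": m.content} for m in messages]
```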

What We Gained

Debuggability. When a structured output call fails, the stack trace is: your code → provider wrapper → SDK. Three frames. You can read the exact API request that was sent. No guessing what LangChain did to your messages before sending them.

Dependency count. Pori's LLM layer depends on anthropic, openai, and pydantic. That's it. No langchain-core (which alone pulls in tenacity, jsonpatch, packaging, and more).

Extensibility. Adding a new provider is trivial: copy anthropic.py, change the SDK calls, export it. The checklist is four steps:

  1. Create provider.py with a ChatProvider class
  2. Implement ainvoke() — convert messages, handle text/structured
  3. Implement with_structured_output() — return a StructuredWrapper
  4. Export in __init__.py and add to the config factory

We wrote the Google provider in under an hour.

Full control for HITL. Every ainvoke() call is a single async function we own. Wrapping it with logging, cost tracking, or approval gates is just a decorator — not a callback hook into someone else's chain.
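That decorator pattern might look like the sketch below — the decorator name and the log fields are assumptions, but the shape is the point: one async function we own, wrapped once:

```python
import functools
import logging
import time


def track_llm_call(step_name: str):
    """Sketch of an observability wrapper around ainvoke (names illustrative)."""

    def decorator(fn):
        @functools.wraps(fn)
        async def wrapper(self, messages, output_format=None):
            start = time.perf_counter()
            result = await fn(self, messages, output_format)
            logging.info(
                "llm_call step=%s model=%s in_msgs=%d took=%.2fs",
                step_name, self.model, len(messages), time.perf_counter() - start,
            )
            return result

        return wrapper

    return decorator
```

Cost tracking or an approval gate slots into the same wrapper: inspect `messages` before the call, inspect `result` after, and raise or pause as needed.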

What We Lost

Convenience features. Retry logic, rate limiting, token counting, streaming — LangChain handles all of these. We implemented what we needed (retries, basic token tracking) and left the rest for later.
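The retries we kept don't need a framework either. A minimal sketch, assuming exponential backoff with jitter is enough for your failure modes (the helper name is ours, not a library's):

```python
import asyncio
import random


async def retry_async(call, attempts: int = 3, base_delay: float = 0.5):
    """Retry an async LLM call with exponential backoff and jitter (sketch)."""
    for attempt in range(attempts):
        try:
            return await call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the real error
            # Exponential backoff plus a little jitter to avoid thundering herds
            await asyncio.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

A production version would narrow the `except` to the SDK's transient error types (rate limits, timeouts) instead of catching everything.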

For Pori, the tradeoff was clear. We're an agent framework — the LLM layer is foundational, not peripheral. We need to own it.

The Takeaway

If you're building a simple RAG app or a chatbot, use LangChain. Seriously. It'll save you time and the abstractions won't get in your way.

If you're building a framework where the LLM call is load-bearing infrastructure — where you need to control message formatting, structured output, error handling, and observability at every layer — consider going direct. The SDKs from Anthropic and OpenAI are good. Pydantic handles validation. Python's Protocol gives you clean interfaces without inheritance.

Two hundred lines replaced a dependency tree. Sometimes that's the right trade.


Pori is open source under the MIT license. Star the repo if this was useful, or open an issue if you have questions about the wrapper architecture.