Serena: Teaching LLMs to Navigate Code Like an IDE, Not a Text Editor

Hook

Your AI coding assistant is burning through 100K tokens reading entire files just to rename a function. Meanwhile, your IDE does it instantly with zero context—because it understands symbols, not strings.

Context

The explosion of LLM-powered coding agents has revealed an uncomfortable truth: these systems are embarrassingly inefficient at understanding code structure. Most agents treat codebases as collections of text files, using grep-style searches and reading entire files into context windows to perform simple operations. Need to find where a class is used? Read every file. Want to rename a method? Parse the whole codebase. This text-centric approach doesn’t just waste tokens—it fundamentally limits how large a codebase an agent can effectively work with.

Human developers solved this problem decades ago with IDEs that understand code semantically. Language servers parse your codebase into abstract syntax trees, track symbol definitions and references, and enable instant navigation across millions of lines of code. But this IDE-level intelligence has been largely absent from AI coding tools, which instead rely on RAG systems, file embeddings, or brute-force context stuffing. Serena bridges this gap by exposing IDE-quality semantic analysis directly to LLMs through a clean, language-agnostic API. It’s not another coding agent—it’s the missing infrastructure layer that lets any agent understand code structure the way your IDE does.

Technical Insight

Serena’s architecture cleverly separates three concerns: the LLM frontend, the semantic analysis backend, and the tool interface that connects them. At its core, Serena exposes a set of symbol-level operations—find_symbol, find_referencing_symbols, get_symbol_definition, insert_after_symbol, replace_symbol—that abstract away language-specific implementation details. This abstraction is crucial because it lets agents interact with Python, TypeScript, Rust, or any other language using the same mental model.

The backend can run in two modes. The LSP (Language Server Protocol) backend leverages existing language servers like typescript-language-server, rust-analyzer, or pyright. These are the same tools that power VSCode’s IntelliSense and other IDE features. When you ask Serena to find a symbol, it’s using the same battle-tested analysis that developers rely on daily. The JetBrains plugin backend goes deeper, integrating directly with IntelliJ IDEA’s PSI (Program Structure Interface) for even more robust semantic understanding—at the cost of requiring a running JetBrains IDE.
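Under the hood, LSP is just JSON-RPC messages exchanged over the language server's stdio, each one framed with a Content-Length header. A minimal sketch (illustrative, not Serena's actual code) of the kind of request a find-references call ultimately translates into:

```python
import json

def frame_lsp_message(payload: dict) -> bytes:
    """Frame a JSON-RPC payload with the Content-Length header LSP requires."""
    body = json.dumps(payload).encode("utf-8")
    return b"Content-Length: %d\r\n\r\n" % len(body) + body

# A textDocument/references request, as defined by the LSP specification
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "textDocument/references",
    "params": {
        "textDocument": {"uri": "file:///workspace/src/api/users.py"},
        "position": {"line": 41, "character": 8},  # 0-based, per the spec
        "context": {"includeDeclaration": True},
    },
}

message = frame_lsp_message(request)
# These framed bytes are what gets written to the language server's stdin;
# the server replies with a framed list of locations on stdout.
```

Any LSP-compatible server, whether pyright or rust-analyzer, speaks this same wire format, which is what makes a language-agnostic abstraction like Serena's feasible.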

Here’s what this looks like in practice. Instead of an agent reading an entire 500-line file to find a function, it can do this:

# Traditional agent approach: expensive and imprecise
with open('src/api/users.py') as f:
    content = f.read()  # 500 lines → ~15K tokens
    # Now parse with regex or hope the LLM figures it out

# Serena approach: precise and token-efficient
result = serena.find_symbol(
    query='UserService.create_user',
    project_path='/workspace'
)
# Returns: exact location, signature, just the symbol definition
# Token cost: ~100 tokens for the result

The real power emerges when you chain these operations. An agent can find a symbol definition, locate all its references, understand the call hierarchy, and make surgical edits—all without reading unrelated code. This is how you might refactor a function signature:

# 1. Find the function definition
func = serena.find_symbol('create_user', 'src/api')

# 2. Find all call sites
refs = serena.find_referencing_symbols(
    symbol_name='create_user',
    file_path=func['file_path'],
    line=func['line']
)

# 3. Update signature at definition
serena.replace_symbol(
    file_path=func['file_path'],
    start_line=func['line'],
    end_line=func['line'] + 3,
    new_content='def create_user(email: str, name: str, role: str = "user"):'
)

# 4. Update each call site (agent decides how based on context)
for ref in refs:
    # Agent reads just the call site context, not entire file
    context = serena.get_symbol_context(ref)
    # Makes targeted update
    serena.replace_symbol(...)

Serena exposes these capabilities through multiple interfaces. The Model Context Protocol (MCP) server is the primary integration point for Claude and other MCP-compatible clients. You can also convert it to an OpenAPI server using mcpo for REST-based access, or import it directly as a Python library for custom agent frameworks. This flexibility means Serena can be the semantic layer for nearly any LLM-powered coding tool.
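As a rough illustration of the REST route, here is how a client might call an mcpo-wrapped tool. The one-POST-route-per-tool path and JSON body shape are assumptions based on how mcpo typically exposes MCP tools, not Serena's documented API:

```python
import json
import urllib.request

def build_tool_request(base_url: str, tool: str, arguments: dict) -> urllib.request.Request:
    """Build a POST request for an mcpo-style REST endpoint.
    NOTE: the /{tool} route and body shape are assumptions for illustration."""
    return urllib.request.Request(
        url=f"{base_url}/{tool}",
        data=json.dumps(arguments).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_tool_request(
    "http://localhost:8000",  # wherever mcpo is serving
    "find_symbol",            # tool name exposed by the MCP server
    {"query": "UserService.create_user", "project_path": "/workspace"},
)
# urllib.request.urlopen(req) would dispatch it; omitted here since it
# requires a running server.
```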

The framework also handles the messy reality of language server management. It automatically spawns and manages LSP processes, handles initialization sequences, and normalizes responses across different servers (which, despite the protocol’s name, often have quirks). The JetBrains backend sidesteps this by delegating to the IDE, but requires a running instance—a reasonable tradeoff for teams already living in IntelliJ.
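One concrete example of those quirks: the LSP spec allows a definition result to be a single Location, an array of Locations, or an array of LocationLinks, and different servers pick different shapes. A normalizer in the spirit of what such a backend must do (illustrative sketch, not Serena's actual code):

```python
def normalize_locations(result) -> list[dict]:
    """Collapse the three legal shapes of a textDocument/definition result
    (Location, Location[], LocationLink[]) into a uniform list of
    {"uri": ..., "range": ...} dicts."""
    if result is None:
        return []
    items = result if isinstance(result, list) else [result]
    normalized = []
    for item in items:
        if "targetUri" in item:  # LocationLink variant
            normalized.append({"uri": item["targetUri"],
                               "range": item["targetSelectionRange"]})
        else:                    # plain Location variant
            normalized.append({"uri": item["uri"], "range": item["range"]})
    return normalized

# Two servers, two response shapes, one normalized answer:
loc = {"uri": "file:///a.py",
       "range": {"start": {"line": 1, "character": 0},
                 "end": {"line": 1, "character": 4}}}
link = {"targetUri": "file:///a.py",
        "targetRange": loc["range"],
        "targetSelectionRange": loc["range"]}
```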

What makes this architecture particularly elegant is that it transforms coding agents from “text processors with context windows” into “semantic code editors with language understanding.” The agent no longer needs to understand how Python scoping works or how TypeScript resolves imports—the language server already knows. The agent just needs to make strategic decisions about what to search for, what to read, and what to modify.

Gotcha

The dual-backend architecture is both Serena’s strength and its complexity. The LSP backend requires installing and configuring language servers for each language you work with. While popular languages like Python (pyright), TypeScript (typescript-language-server), and Rust (rust-analyzer) have excellent LSP implementations, others are spottier. Groovy has only partial support, and some niche languages lack quality language servers entirely. Each language server also has its own resource requirements—rust-analyzer can be memory-hungry on large projects, and some servers are slower to initialize than others. You’re essentially depending on the maturity of each language’s LSP ecosystem.

The JetBrains plugin backend solves quality issues but introduces new constraints. You need a running JetBrains IDE, which means either keeping IntelliJ IDEA open while your agent works (viable for interactive workflows) or running it headlessly (more complex to set up). Rider isn’t currently supported, limiting .NET developers. There’s also the question of licensing—JetBrains IDEs require paid licenses for commercial use, though the Community Edition of IntelliJ IDEA works for many languages. The plugin approach is fantastic if you’re already a JetBrains shop, but it’s a heavier dependency than the LSP route.

Beyond backend concerns, Serena is still maturing. The project is under active development, which means APIs might evolve and edge cases haven’t all been discovered yet. Integration with specific agent frameworks may require some glue code. The documentation is comprehensive but assumes familiarity with concepts like language servers and MCP—if you’re new to these, expect a learning curve. And while Serena dramatically improves token efficiency compared to reading full files, it’s still making network calls (or IPC calls) to language servers, which adds latency compared to pure text operations.

Verdict

Use if: You’re building or customizing coding agents that work with codebases larger than a few dozen files, you need to support multiple programming languages without reinventing semantic analysis for each, you’re using Claude or other MCP-enabled tools and want to extend their code understanding, or you’re hitting token limits and need to make your agent’s code reading more surgical. It’s especially compelling if you’re already in the JetBrains ecosystem and can leverage the plugin backend for maximum IDE integration.

Skip if: You’re working with small scripts or single-file programs where reading everything into context is viable, your agent framework already has deep semantic code analysis built-in (rare but exists), you need enterprise-grade stability and can’t tolerate a rapidly evolving tool, or you’re building for languages with poor LSP support where Serena can’t add much value over text-based approaches. Also skip if you need offline-first operation with no external dependencies—the language server requirement means you’re always coordinating with another process.