Architecture · Agentic AI · MCP · FastAPI · March 2025 · Sophilia Engineering

How Sophilia Uses MCP and FastAPI to Power Agentic SDLC Intelligence

Most engineering dashboards are passive — they show you data after the fact. We built Sophilia to be active: an agentic AI platform where your AI assistant can query live engineering data as tool context, and autonomous agents continuously monitor, enforce, and resolve issues across your SDLC.

The Problem: Engineering Data Is Siloed and Stale

Engineering leaders face a fragmentation crisis. DORA metrics live in one tool. Security vulnerabilities in another. Sprint data in Jira. Cloud costs in AWS Cost Explorer. GitHub analytics require custom scripts. And none of it is accessible to your AI assistant when you need an answer fast.

The typical response is to build yet another internal dashboard. But dashboards are passive — they wait for you to look at them. We needed something different: a platform that not only aggregates data but also exposes it as live, queryable context for AI agents and assistants.

Our Architecture: FastAPI + MCP + LangGraph

Sophilia's backend is a Python FastAPI application that serves a dual purpose: it's both a traditional REST API for the React dashboard and an MCP (Model Context Protocol) server that exposes engineering data as AI tool context via Server-Sent Events (SSE).

┌─────────────────────────────────────────────────────────┐
│                    Sophilia Platform                    │
│                                                         │
│  ┌───────────────┐     ┌──────────────────────────┐     │
│  │ React + Vite  │────▶│     FastAPI Backend      │     │
│  │  Dashboard    │     │      (Python 3.12)       │     │
│  │  Azure SWA    │     │                          │     │
│  └───────────────┘     │  ┌────────────────────┐  │     │
│                        │  │  REST API Routes   │  │     │
│  ┌───────────────┐     │  │  /api/engineering  │  │     │
│  │ AI Assistants │     │  │  /api/security     │  │     │
│  │ Claude/Cursor │────▶│  │  /api/finops       │  │     │
│  │ (MCP client)  │ SSE │  └────────────────────┘  │     │
│  └───────────────┘     │                          │     │
│                        │  ┌────────────────────┐  │     │
│  ┌───────────────┐     │  │  MCP Server /sse   │  │     │
│  │   LangGraph   │────▶│  │  Tools: DORA,      │  │     │
│  │   AI Agents   │     │  │  incidents, repos  │  │     │
│  └───────────────┘     │  └────────────────────┘  │     │
│                        │                          │     │
│                        │  ┌────────────────────┐  │     │
│                        │  │ PostgreSQL (Neon)  │  │     │
│                        │  │ + Keycloak RBAC    │  │     │
│                        │  └────────────────────┘  │     │
│                        └──────────────────────────┘     │
└─────────────────────────────────────────────────────────┘

What is the Model Context Protocol (MCP)?

MCP is an open standard introduced by Anthropic that allows AI assistants (Claude, Cursor, Continue.dev) to connect to external data sources and tools in a standardised way. Instead of writing custom integrations for every AI tool, you expose your data once via an MCP server, and any MCP-compatible client can consume it.

An MCP server declares a set of tools (functions the AI can call) and resources (data the AI can read). The protocol communicates via stdio or SSE (Server-Sent Events over HTTP).
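To make the wire format concrete, here is an illustrative sketch of the JSON-RPC exchange behind a tool call, written as plain Python dicts rather than SDK objects. The `tools/list` and `tools/call` method names come from the MCP specification; the specific tool entry and arguments are examples modelled on Sophilia's catalog.

```python
# Illustrative MCP JSON-RPC shapes (plain dicts, not real SDK objects).
# Method names come from the MCP spec; the tool entry is an example.

# 1. The client discovers which tools the server exposes:
tools_list = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

# 2. The server answers with one catalog entry per declared tool,
#    including a JSON Schema describing its arguments:
tools_list_result = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {
                "name": "get_dora_metrics",
                "description": "DORA metrics for a team and time window.",
                "inputSchema": {
                    "type": "object",
                    "properties": {
                        "team": {"type": "string"},
                        "days": {"type": "integer"},
                    },
                },
            }
        ]
    },
}

# 3. The assistant invokes a tool by name with JSON arguments:
tools_call = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "get_dora_metrics",
        "arguments": {"team": "platform", "days": 30},
    },
}
```

Because every MCP client speaks these same shapes, the server-side tool definitions below work unchanged in Claude Desktop, Cursor, or any other compatible client.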

Key insight: By making Sophilia an MCP server, we enable any engineer on the team to ask their AI assistant "What is our current MTTR?" or "Which repos have critical CVEs?" and get live answers — without building a custom plugin for each AI tool.

Sophilia's MCP Tool Catalog (excerpt)

# core/mcp_server.py (simplified)

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("Sophilia SDLC Intelligence")

@mcp.tool()
async def get_dora_metrics(team: str = "all", days: int = 30) -> dict:
    """
    Returns DORA metrics for the specified team and time window.
    Includes Deployment Frequency, Lead Time, MTTR, and Change Failure Rate.
    """
    return await engineering_service.get_dora_metrics(team=team, days=days)

@mcp.tool()
async def list_open_incidents(severity: str = "all") -> list:
    """
    Returns all open SRE incidents, optionally filtered by severity.
    Each incident includes AI-generated root cause summary.
    """
    return await sre_service.get_open_incidents(severity=severity)

@mcp.tool()
async def get_security_posture(repo: str | None = None) -> dict:
    """
    Returns security posture score and open CVE count for a repo (or all repos).
    """
    return await security_service.get_posture(repo=repo)

@mcp.tool()
async def run_governance_check(framework: str) -> dict:
    """
    Runs an agentic compliance check against the specified framework.
    Supported: GDPR, SOC2, HIPAA, PCI-DSS.
    """
    return await compliance_agent.run(framework=framework)

The FastAPI Backend

The backend is a single FastAPI application that mounts the MCP server alongside standard REST routes. This means the same Python process serves both the dashboard's API calls and the MCP SSE endpoint — no separate infrastructure needed.

# app.py
from fastapi import FastAPI, Request
from fastapi.middleware.cors import CORSMiddleware
from mcp.server.sse import SseServerTransport

from core.mcp_server import mcp
from routers import engineering, security, finops, sre, agents

app = FastAPI(title="Sophilia Backend", version="2.0.0")

# ── Standard REST API ──────────────────────────────────
app.include_router(engineering.router, prefix="/api/engineering")
app.include_router(security.router,    prefix="/api/security")
app.include_router(finops.router,      prefix="/api/finops")
app.include_router(sre.router,         prefix="/api/sre")
app.include_router(agents.router,      prefix="/api/agents")

# ── MCP SSE endpoint ───────────────────────────────────
sse = SseServerTransport("/mcp/messages/")

# Clients POST their JSON-RPC messages back on this path
app.mount("/mcp/messages/", app=sse.handle_post_message)

@app.get("/mcp/sse")
async def mcp_sse_endpoint(request: Request):
    """MCP SSE stream — compatible with Claude Desktop, Cursor, Continue.dev"""
    async with sse.connect_sse(request.scope, request.receive, request._send) as streams:
        # FastMCP wraps a low-level Server; run it over the SSE streams
        await mcp._mcp_server.run(
            streams[0], streams[1],
            mcp._mcp_server.create_initialization_options(),
        )

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8011)

Agentic Workflows with LangGraph

Beyond exposing data, Sophilia runs active LangGraph agents that continuously monitor the engineering environment. These agents are defined as directed graphs where each node is a function, tools are MCP-backed calls, and state is persisted across invocations via PostgreSQL checkpoints.
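The node/edge/state idea can be sketched without the library. This is a minimal, dependency-free illustration of what a compiled graph does; LangGraph adds conditional edges, checkpointing, and streaming on top, and the node names and fake metric values here are purely illustrative.

```python
# Dependency-free sketch of the node/edge/state idea behind LangGraph.
# Each node is a function from state-dict to state-dict; edges fix the order.

def check(state: dict) -> dict:
    # In the real agent this would call the MCP-backed metrics tools.
    return {**state, "metrics": {"mttr_minutes": 42}}

def analyse(state: dict) -> dict:
    anomaly = state["metrics"]["mttr_minutes"] > 30
    return {**state, "anomaly": anomaly}

def respond(state: dict) -> dict:
    return {**state, "incident_created": state["anomaly"]}

NODES = {"check": check, "analyse": analyse, "respond": respond}
EDGES = {"check": "analyse", "analyse": "respond", "respond": None}

def run(entry: str, state: dict) -> dict:
    node = entry
    while node is not None:
        state = NODES[node](state)
        node = EDGES[node]  # a real graph engine also persists state here
    return state

result = run("check", {})
# result["incident_created"] is True for this fake 42-minute MTTR
```

The payoff of the graph formulation is that state flows through explicitly, so any node's output can be checkpointed to PostgreSQL and the run resumed mid-graph.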

Example: The SRE Sentinel Agent

The SRE Sentinel is a LangGraph agent that runs on a schedule, calls the monitoring tools, analyses anomalies against historical baselines, and — if it detects an issue — automatically creates an incident record, generates a root-cause hypothesis, and notifies on-call via GitHub Actions.

# core/agents/sentinel.py (simplified)

from langgraph.graph import StateGraph, END
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash", temperature=0)

async def check_metrics(state):
    # The MCP tools are async, so this node awaits them (run via ainvoke)
    metrics = await get_dora_metrics(days=1)
    incidents = await list_open_incidents(severity="critical")
    return {**state, "metrics": metrics, "incidents": incidents}

def analyse_anomalies(state):
    prompt = f"""
    You are an SRE AI assistant. Analyse these metrics and incidents:
    Metrics: {state['metrics']}
    Active incidents: {state['incidents']}
    Identify anomalies vs 30-day baseline and assess blast radius.
    """
    analysis = llm.invoke(prompt)
    return {**state, "analysis": analysis.content}

def create_incident_if_needed(state):
    if "critical" in state["analysis"].lower():
        incident = sre_service.create_incident(
            title="AI Sentinel: Anomaly Detected",
            root_cause=state["analysis"],
            severity="P1"
        )
        return {**state, "incident_created": True, "incident_id": incident.id}
    return {**state, "incident_created": False}

graph = StateGraph(dict)
graph.add_node("check", check_metrics)
graph.add_node("analyse", analyse_anomalies)
graph.add_node("respond", create_incident_if_needed)
graph.set_entry_point("check")
graph.add_edge("check", "analyse")
graph.add_edge("analyse", "respond")
graph.add_edge("respond", END)
sentinel = graph.compile()

Authentication & Multi-Tenancy with Keycloak

All API endpoints are protected by Keycloak JWT validation. The FastAPI backend validates tokens via the Keycloak public key, extracts realm roles, and enforces RBAC at the route level. This means the same platform can serve multiple teams with different visibility scopes.
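The role check itself is a small function over the decoded token claims. Here is a sketch assuming the standard Keycloak token layout, where realm-level roles live under `realm_access.roles`; the role names and the `require_role` helper are illustrative, not Sophilia's actual code, and the token is assumed to have already passed signature validation.

```python
# Sketch of realm-role RBAC over decoded Keycloak JWT claims.
# Assumes the token's signature was already verified against the realm's
# public key; the role names used here are illustrative.

def realm_roles(claims: dict) -> set[str]:
    # Keycloak places realm-level roles under realm_access.roles
    return set(claims.get("realm_access", {}).get("roles", []))

def require_role(claims: dict, role: str) -> None:
    if role not in realm_roles(claims):
        # In a FastAPI dependency this would raise HTTPException(403)
        raise PermissionError(f"missing realm role: {role}")

# Shape of a decoded Keycloak access token (truncated to relevant claims):
claims = {
    "sub": "1234",
    "preferred_username": "alice",
    "realm_access": {"roles": ["sdlc-viewer", "finops-admin"]},
}

require_role(claims, "sdlc-viewer")   # passes silently
# require_role(claims, "sre-admin")   # would raise PermissionError
```

Wrapping `require_role` in a FastAPI dependency lets each router declare the role it needs, which is what makes per-team visibility scopes cheap to add.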

For the MCP endpoint, we support both authenticated SSE (for internal use) and an optional unauthenticated demo mode, allowing AI assistants to connect with pre-scoped read-only access.

Observability: LangSmith Tracing

Every agent invocation is traced with LangSmith, giving the engineering team full visibility into which tools were called, what the LLM reasoned, and where latency was introduced. This is critical for debugging agentic workflows where a single user query might trigger 10+ tool calls across multiple microservices.
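Enabling the tracing is mostly configuration: LangSmith picks up standard environment variables at process start. A sketch of the relevant settings (the key is a placeholder and the project name is illustrative):

```shell
# Enable LangSmith tracing for all LangChain/LangGraph invocations
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY="<your-langsmith-api-key>"
export LANGCHAIN_PROJECT="sophilia-agents"   # project name is illustrative
```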

Result: Using this architecture, Sophilia's SRE Sentinel reduced mean time to incident detection from 23 minutes (human-monitored alert thresholds) to under 90 seconds for the anomaly classes it was trained to detect.

Connecting Your AI Assistant to Sophilia

Any MCP-compatible AI client can connect to Sophilia's MCP endpoint. Sophilia's built-in MCP Config Generator produces the exact JSON snippet you need:

// Claude Desktop — claude_desktop_config.json
{
  "mcpServers": {
    "sophilia": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "https://devhub.sophilia.ai/mcp/sse"],
      "env": {
        "AUTH_TOKEN": "your-keycloak-token"
      }
    }
  }
}

Once connected, Claude can answer questions like:

- "What is our current MTTR?"
- "Which repos have critical CVEs?"
- "Run a SOC2 compliance check."

What's Next

We're actively expanding Sophilia's agentic capabilities.
