MCP Client Agent: Architecture and Implementation

Learn how to build a custom MCP client agent that connects to MCP servers programmatically and understand the end-to-end request flow in the process.

By Venkata Rakesh Buddhiraju · Jun. 20, 25 · Presentation
In this post, we’ll go deeper into the overall MCP architecture and client flow, and we’ll also implement an MCP client agent.

The goal is to clarify what happens when you submit a request to an MCP setup powered by LLMs, breaking down what is actually going on behind the scenes.

There are plenty of articles out there about building MCP servers. For reference, here is an official example from the MCP website. In this post, though, we’ll focus only on implementing an MCP client agent that can programmatically connect to MCP servers.

High-Level MCP Architecture

MCP Components

Host: AI applications (such as Claude Desktop or the Cursor code editor) that users interact with directly, serving as the main interface and system manager.

Clients: Intermediaries that maintain connections between hosts and MCP servers, handling communication protocols and data flow.

Servers: Components that provide specific functionalities, data sources, and tools to AI models through standardized interfaces.

Without further delay, let's get to the core of this article.

What are MCP Client Agents?

Custom MCP Clients: Programmatically Invoking MCP Servers

Most of the use cases we've seen so far involve using MCP within an AI-powered IDE. In these setups, users configure MCP servers inside the IDE and interact with them through a chat interface. In this case, the chat interface acts as the MCP client or host.

But what if you want to invoke MCP servers programmatically from your own services? That’s where the real strength of MCP comes in. It provides a standardized way to supply context and tools to your LLMs. Instead of writing custom code to integrate with every external API, resource, or file, you can focus on packaging the right context and capabilities, then hand them off to the LLM to reason over and act on.

MCP Client Agent Workflow With Multiple MCP Servers


The diagram illustrates how MCP Custom Clients/AI agents process user requests through MCP servers. Below is a step-by-step breakdown of this interaction flow:

Step 1: User Initiates Request

  • The user asks a question or submits a request through an IDE, a browser, or a terminal.
  • The query is received by the Custom MCP Client/Agent interface.

Step 2: MCP Client and Server Connection

  • The MCP Client connects to one or more MCP Servers and requests the list of tools each one supports.
  • Each server sends back its list of supported tools and functions.

Step 3: AI Processing

  • Both the user query and the tools list are sent to the LLM (e.g., OpenAI).
  • The LLM analyzes the request, suggests an appropriate tool and input parameters, and sends its response back to the MCP Client.

Step 4: Function Execution

  • The MCP Client calls the selected function on the MCP Server with the suggested parameters.
  • The MCP Server that owns the selected tool receives the function call and processes the request. Make sure tool names are unique across your MCP servers to avoid LLM hallucination and non-deterministic responses.
  • The server may interact with databases, external APIs, or file systems to process the request.

Step 5: (Optional) Improve Response Using LLM

  • The MCP Server returns the function execution response to the MCP Client.
  • Optionally, the MCP Client forwards that response to the LLM for refinement.
  • The LLM converts the technical response into natural language or creates a summary.

Step 6: Respond to User

  • Final processed response is sent back to the user through the client interface.
  • User receives the answer to their original query.
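The six steps above can be traced in a small, self-contained sketch. Everything here is a stand-in chosen for illustration: `FakeServer`, `fake_llm`, and the `add`/`echo` tools are hypothetical and do not use the MCP SDK or a real LLM. The point is only to show the order of operations.

```python
import asyncio
import json


class FakeServer:
    """Hypothetical stand-in for an MCP server session."""

    def __init__(self, tools):
        self.tools = tools  # maps tool name -> plain Python callable

    async def list_tools(self):
        return list(self.tools)

    async def call_tool(self, name, arguments):
        return self.tools[name](**arguments)


def fake_llm(query, tool_names):
    """Hypothetical stand-in for the LLM: picks a tool by keyword."""
    if "add" in query and "add" in tool_names:
        return {"tool": "add", "arguments": json.dumps({"a": 2, "b": 3})}
    return {"tool": None, "arguments": "{}"}


async def handle_request(query, servers):
    # Step 2: ask every server for its tools and remember which server owns each
    owner = {}
    for server in servers:
        for name in await server.list_tools():
            owner[name] = server

    # Step 3: send the query and the combined tool list to the LLM
    suggestion = fake_llm(query, list(owner))
    if suggestion["tool"] is None:
        return "No tool was needed."

    # Step 4: call the suggested tool on the server that owns it
    arguments = json.loads(suggestion["arguments"])
    result = await owner[suggestion["tool"]].call_tool(suggestion["tool"], arguments)

    # Steps 5 and 6: optionally refine the result, then respond to the user
    return f"Result: {result}"


servers = [
    FakeServer({"add": lambda a, b: a + b}),
    FakeServer({"echo": lambda text: text}),
]
answer = asyncio.run(handle_request("please add 2 and 3", servers))
```

In a real client, `fake_llm` would be replaced by a chat completion call and `FakeServer` by MCP client sessions, as in the implementation section below.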

Custom MCP Client Implementation / Source Code

Connecting to MCP Servers:
As discussed earlier, an MCP client can connect to multiple MCP servers. This behavior can be simulated in a custom MCP client implementation.

Note: To reduce hallucinations and ensure consistent results, it’s recommended to avoid tool name collisions across multiple MCP servers.
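One way to enforce this is to fail fast while building the tool-to-session lookup. The following is a minimal sketch; the helper name `build_tool_session_map` and its input shape (pairs of a session object and the tool names it exposes) are assumptions made for illustration, not part of the MCP SDK.

```python
from typing import Any


def build_tool_session_map(sessions_and_tools: list[tuple[Any, list[str]]]) -> dict[str, Any]:
    """Map each tool name to the session that serves it, rejecting duplicates."""
    tool_session_map: dict[str, Any] = {}
    for session, tool_names in sessions_and_tools:
        for name in tool_names:
            if name in tool_session_map:
                raise ValueError(
                    f"Tool name collision: {name!r} is exposed by more than one server"
                )
            tool_session_map[name] = session
    return tool_session_map
```

A map built this way can be passed as the `tool_session_map` argument used later when routing tool calls.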

MCP Server Transport Options:
MCP servers support two types of transport mechanisms:

  • STDIO – for local process communication
  • SSE (Server-Sent Events) – for HTTP-based requests to remote servers
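A client that accepts either kind of server needs to decide which transport to use for a given target. The sketch below assumes a simple convention: an `http(s)` URL means SSE, and a local `.py` or `.js` script path means STDIO. The function name `pick_transport` is hypothetical.

```python
from urllib.parse import urlparse


def pick_transport(target: str) -> str:
    """Return "sse" for http(s) URLs and "stdio" for local .py/.js scripts."""
    if urlparse(target).scheme in ("http", "https"):
        return "sse"
    if target.endswith((".py", ".js")):
        return "stdio"
    raise ValueError(f"Cannot infer a transport for {target!r}")
```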

Connecting to STDIO Transport

Python
 
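# Requires the MCP Python SDK. This is a method of the client class;
# self.exit_stack is assumed to be a contextlib.AsyncExitStack.
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def connect_to_stdio_server(self, server_script_path: str):
    """Connect to an MCP stdio server"""
    is_python = server_script_path.endswith('.py')
    is_js = server_script_path.endswith('.js')
    if not (is_python or is_js):
        raise ValueError("Server script must be a .py or .js file")
    # Launch the server script as a child process with the matching runtime
    command = "python" if is_python else "node"
    server_params = StdioServerParameters(
        command=command,
        args=[server_script_path],
        env=None
    )
    # The exit stack keeps the transport and session open until it is closed
    stdio_transport = await self.exit_stack.enter_async_context(stdio_client(server_params))
    self.stdio, self.write = stdio_transport
    self.session = await self.exit_stack.enter_async_context(ClientSession(self.stdio, self.write))
    await self.session.initialize()
    print("Initialized stdio...")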


Connecting to SSE Transport

Python
 
# Requires the MCP Python SDK. This is a method of the client class.
from mcp import ClientSession
from mcp.client.sse import sse_client

async def connect_to_sse_server(self, server_url: str):
    """Connect to an MCP server running with SSE transport"""
    # Store the context managers so they stay alive
    self._streams_context = sse_client(url=server_url)
    streams = await self._streams_context.__aenter__()
    self._session_context = ClientSession(*streams)
    self.session: ClientSession = await self._session_context.__aenter__()
    await self.session.initialize()
    print("Initialized SSE...")
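Because the SSE connection enters its context managers by hand, the client should also exit them by hand when it shuts down. The sketch below shows one possible cleanup order using recording stand-ins instead of the real `sse_client` and `ClientSession`; `RecordingContext`, `SseClientSketch`, and `cleanup` are hypothetical names used only for this illustration.

```python
import asyncio


class RecordingContext:
    """Hypothetical stand-in for sse_client(...) or ClientSession(...)."""

    def __init__(self, name, log):
        self.name, self.log = name, log

    async def __aenter__(self):
        self.log.append(f"enter {self.name}")
        return self

    async def __aexit__(self, exc_type, exc, tb):
        self.log.append(f"exit {self.name}")
        return False


class SseClientSketch:
    def __init__(self, log):
        self._streams_context = RecordingContext("streams", log)
        self._session_context = RecordingContext("session", log)

    async def connect(self):
        await self._streams_context.__aenter__()
        await self._session_context.__aenter__()

    async def cleanup(self):
        # Exit in reverse order of entry: session first, then the streams
        await self._session_context.__aexit__(None, None, None)
        await self._streams_context.__aexit__(None, None, None)


async def demo():
    log = []
    client = SseClientSketch(log)
    await client.connect()
    try:
        pass  # use the session here
    finally:
        await client.cleanup()
    return log


events = asyncio.run(demo())
```

An alternative design is to enter both context managers through an `AsyncExitStack`, as the STDIO connection does, so that a single close call unwinds them in the right order.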


Get Tools and Process User Request With LLM and MCP Servers

Once the servers are initialized, we can fetch the tools from all available servers and process the user query. Processing the query follows the steps described above:

Python
 
stdio_tools = await std_server.list_tools() 
sse_tools = await sse_server.list_tools()
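Before these tools can be passed to an OpenAI-style chat completion call, they need to be in the function-tool format that API expects. The sketch below assumes each MCP tool object exposes `name`, `description`, and `inputSchema` attributes, as the tool entries returned by the MCP Python SDK do; the helper name `to_openai_tools` and the `get_weather` example tool are hypothetical.

```python
from types import SimpleNamespace
from typing import Any


def to_openai_tools(mcp_tools: list[Any]) -> list[dict[str, Any]]:
    """Convert MCP tool definitions into OpenAI function-tool dictionaries."""
    return [
        {
            "type": "function",
            "function": {
                "name": tool.name,
                "description": tool.description or "",
                "parameters": tool.inputSchema,
            },
        }
        for tool in mcp_tools
    ]


# Stand-in for one tool entry from a list_tools() result (hypothetical tool)
example_tool = SimpleNamespace(
    name="get_weather",
    description="Look up the weather for a city",
    inputSchema={
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
)
available_tools = to_openai_tools([example_tool])
```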


Process User Request:

Python
 
import json
from typing import Any

from openai import AzureOpenAI

async def process_user_query(self, available_tools: list[dict[str, Any]], user_query: str, tool_session_map: dict[str, Any]) -> str:
    """
    Process the user query and return the response.
    """
    model_name = "gpt-35-turbo"  # name of your Azure OpenAI deployment
    # Assumption: tool calling needs a newer API version than the original
    # "2022-12-01-preview"; adjust to a version your resource supports.
    api_version = "2024-02-01"

    # Start the conversation with the user query
    self.messages = [
        {
            "role": "user",
            "content": user_query
        }
    ]

    # Initialize your LLM - e.g., Azure OpenAI client
    openai_client = AzureOpenAI(
        api_version=api_version,
        azure_endpoint="<OPENAI_ENDPOINT>",  # replace with your endpoint
        api_key="<API_KEY>",  # replace with your key
    )

    # Send the user query to the LLM along with the available tools from MCP Servers
    response = openai_client.chat.completions.create(
        messages=self.messages,
        model=model_name,
        tools=available_tools,
        tool_choice="auto"
    )
    llm_response = response.choices[0].message

    # Append the LLM response (the user query is already in the history)
    self.messages.append(llm_response)

    # Process the response and handle tool calls
    if llm_response.tool_calls:
        # Every suggested tool call needs a matching "tool" message
        for tool_call in llm_response.tool_calls:
            # Route the call to the MCP session that owns the suggested tool
            result = await tool_session_map[tool_call.function.name].call_tool(
                tool_call.function.name,
                json.loads(tool_call.function.arguments)
            )

            # Append the tool output to messages
            self.messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": result.content[0].text
            })

        # Optionally send the tool output back to the LLM to summarize
        llm_response = openai_client.chat.completions.create(
            messages=self.messages,
            model=model_name,
            tools=available_tools,
            tool_choice="auto"
        ).choices[0].message

    return llm_response.content or ""
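The code above reads `result.content[0].text`, which assumes the first content item of the tool result is text. A slightly more defensive option, sketched here under the assumption that each content item carries a `type` field and that text items carry a `text` field, is to join every text item and skip the rest. The helper name `tool_result_text` is hypothetical.

```python
from types import SimpleNamespace
from typing import Any


def tool_result_text(content_items: list[Any]) -> str:
    """Join the text of every text item; ignore other content types."""
    return "\n".join(
        item.text for item in content_items if getattr(item, "type", None) == "text"
    )


# Stand-ins for the content items of a tool result
items = [
    SimpleNamespace(type="text", text="first line"),
    SimpleNamespace(type="image", data="...", mimeType="image/png"),
    SimpleNamespace(type="text", text="second line"),
]
joined = tool_result_text(items)
```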


Hopefully, this gave you a solid starting point for implementing MCP clients. In future posts, we'll explore how to host MCP servers for remote access using tools like Kubernetes and Docker.

If you’d like to dive deeper right away, check out this sample source code, which includes both an MCP client agent and server implementation.


Published at DZone with permission of Venkata Rakesh Buddhiraju. See the original article here.
