[
]

[10.13.2025]
TL;DR:
50-tool API wrappers don't work - agents can't reason through bloated context. Ship 5-10 tools, one workflow, intent-driven.
API gateways don't see MCP tool invocations, resource access, or what returns to agent context. You need protocol-aware audit trails.
Prompt injection is inevitable. Your tools return data into ChatGPT's context - attackers will inject malicious instructions. Sanitize outputs or fail compliance.
Apps SDK opens submissions in ~3 months.

I've been talking about the agentic internet since February. It's finally happening.
Back then, I said agents needed their own access to the internet - not UIs designed for humans, not OAuth flows for autonomous agents, but protocols optimized for machine-to-machine interaction. We're not there yet on auth (OAuth stays for now), but agent-native apps are here
This week, OpenAI made it mainstream. Apps SDK + MCP is the protocol layer that lets agents access any software natively. No browser automation. No pretending to be human. Just agents invoking tools through a standardized interface.
I think this is what the agentic internet looks like. MCP is the protocol. Apps SDK is the distribution platform. ChatGPT users are the market.
And if you're a VP of Engineering at a B2C or B2B internet company and you're not building an MCP server right now, you're already behind.
Why MCP + Apps SDK Is the Distribution Play
MCP is the protocol that defines how agents communicate with your software - tools, resources, and data. Think of it like HTTP for agents: it's the enabling layer that lets you build agent-native applications.
Right now, OpenAI’s Apps SDK is using MCP to give developers access to 800 million users. That's the unlock: a standardized protocol (MCP) meeting a massive platform (ChatGPT).
OpenAI's Apps SDK builds on the Model Context Protocol, extending it so developers can design both the logic and interface of their apps Introducing apps in ChatGPT and the new Apps SDK | OpenAI. Apps built with the Apps SDK can reach over 800 million ChatGPT users at just the right time.
But here's what most people miss: this is not "wrap your REST API and call it MCP." That's the fastest way to build something that doesn't work.
MCP defines the interface. You build the application layer on top. And if you're building for Apps SDK specifically, you're building for conversational, intent-driven workflows where ChatGPT decides when to invoke your tools based on user context.
For any internet company - B2C SaaS, B2B platforms, developer tools - building an MCP server that works with Apps SDK is the new distribution channel.
The Window Is Closing Faster Than You Think
Apps SDK is available in preview today for developers to begin building and testing their apps, with OpenAI opening for app submission later this year. That's roughly three months to build a production-ready server, get user feedback, iterate, and be ready when submissions open.
Pilot partners including Booking.com, Canva, Coursera, Figma, Expedia, Spotify and Zillow are already available. They launched day one. They're already learning what works, training ChatGPT's discovery algorithms, and building user expectations.
When the marketplace opens and 500 companies rush to launch "book a flight" or "search documents," who do you think wins? The one that's been live for three months or the one launching into a flood?
Here's what you should be doing right now:
Build your first MCP server focused on ONE workflow
Get it in front of early users - employees, beta customers, design partners
Learn what actually gets used vs what you thought would get used
Iterate on tool descriptions, structured outputs, error handling
Build your security and audit infrastructure from day one
If you wait until "best practices emerge" or "the marketplace matures," you'll be competing for discovery against companies that have three months of data on what works.
How to Build MCP Servers That Actually Work
The biggest mistake I'm seeing: companies treating MCP like an API wrapper. They take their OpenAPI spec, generate 50 tools, deploy it, and wonder why nobody uses it.
Why that fails?
Agents get confused with too many tools. When you fill the context window with 50 tool descriptions, there's no room left for the actual user intent. The agent can't reason effectively. Discovery breaks. The user experience degrades.
The pattern that works:
Focus on intent-based, single-workflow servers. What problem are you solving? What does the user journey look like? Build ONE thing perfectly.
Example: Don't build "access our entire API." Build "book a restaurant reservation" or "search company knowledge base" or "check deployment status." One workflow. Tight. Purposeful.
Why? Because MCP defines the protocol - you're building the application layer on top. MCP servers expose tools that a model can call during conversation and return results according to tool contracts, with results including extra metadata that the Apps SDK uses to render rich UI components alongside assistant messages. Every tool needs to be designed for conversational invocation and structured output.
What we've learned building production MCP servers:
Start with 5-10 tools maximum for your first server
Focus on one clear workflow with defined entry and exit points
Get feedback from real users - what they actually use will surprise you
Iterate on tool descriptions; discovery lives or dies on metadata quality
Apps are most valuable when they help people accomplish meaningful tasks directly within ChatGPT, without breaking conversational flow, with good use cases including booking rides, ordering food, checking availability, or tracking delivery App design guidelines
A note on state management:
ChatGPT's current MCP implementation doesn't persist state between tool invocations, so design your tools to be stateless. Each call should be independent and self-contained. This is a ChatGPT client behavior, not a protocol requirement - but it's the reality of building for Apps SDK today.
Security Cannot Be an Afterthought
Here's the part that keeps me up at night: most companies think MCP security is "just use our API gateway."
No. MCP defines the protocol for tool invocation, resource access, and prompt exchange. You need to instrument at the MCP layer, not just your API layer.
Why your existing security stack isn't enough:
MCP servers expose 6 primitives and 3 of those are mostly used: tools, resources, prompts. When you only look at API logs, you're missing the full picture.
The MCP protocol includes session metadata and request tracking - but if you're only looking at your downstream API logs, you can't see:
Which MCP prompt was invoked (no API call happens)
Which tool orchestrated multiple backend API calls (you see three API requests; you don't know they were one tool)
Who accessed what resource through the agent
What data was returned to the agent's context (and whether it contained PII, prompt injection payloads, or sensitive information)
The audit gap is real. If you're relying on API gateway logs, you don't have end-to-end audit trails. You can't answer: "Which user accessed customer PII through which MCP resource at what time?" That's a compliance failure waiting to happen.
Then there's prompt injection.
Apps SDK documentation emphasizes defense in depth: assume prompt injection and malicious inputs will reach your server, validate everything, and keep audit logs Security & Privacy. OpenAI's own docs tell you: assume it will happen.
Here's the attack vector: Your MCP tool returns data to the agent. That data goes into ChatGPT's context. If an attacker controls what your tool returns - through indirect prompt injection in upstream data sources - they can manipulate the agent's subsequent behavior.
Example:
The agent reads this, interprets the embedded instruction, and if you haven't sanitized outputs or implemented proper security controls, it might actually execute that action.
What you need to do:
Sanitize outputs: Strip PII, validate against schema, detect injection patterns in string fields
Structured outputs + semantic validation: Use strict JSON schemas, but also validate the semantic content - schemas alone don't neutralize natural language injection in string fields
Threat model from day one: What's the worst thing an attacker could do if they control tool responses?
MCP-specific audit logs: Log every tool call, every resource access, every parameter, every response payload
Protocol-aware observability: Traditional WAFs don't understand MCP semantics; you need a layer that sees tool invocations, resource access, and prompt usage—not just HTTP requests
What to Do Monday Morning
If you're a VP of Engineering reading this, here's the action plan:
Decide
Identify ONE workflow in your product that's conversational, task-oriented, and valuable
Ask: "Would ChatGPT users want to do this without leaving the chat?"
If yes, you have your use case
Build
Apps SDK supports any server implementing the MCP specification, with official Python SDK and TypeScript SDK being the fastest way to get started Set up your server
Build 3-7 tools focused on that single workflow
Implement OAuth 2.1 flow with your authorization server (Auth0, Okta, Cognito), with ChatGPT acting as the client on behalf of users Authentication
Test in developer mode with internal users
Secure
Implement structured outputs and output sanitization - validate both schema AND semantic content
Add MCP-specific logging: tool calls, parameters, OAuth scopes, response payloads
Threat model: what's the worst-case attack scenario?
Review with your security team
Iterate
Get feedback from beta users
Measure: discovery success rate, completion rate, latency
Refine tool descriptions and metadata
Optimize for the workflows people actually use (not what you thought they'd use)
When submissions open: You're ready. You've learned what works. You have user data. You're not guessing.
The Companies That Will Win
MCP is the protocol layer that enables agent-native applications. Apps SDK is the distribution channel that brings it to 800 million users. The companies that win will be the ones who:
Build NOW while the marketplace is empty
Focus on intent-based, single-workflow servers (not 50-tool API wrappers)
Instrument at the MCP layer (not just API logs)
Assume prompt injection will happen and build defenses accordingly
Learn fast from real user feedback
The companies that lose will be the ones who wait for "best practices to emerge," build general-purpose wrappers, and treat security as an afterthought.
If you're not building your MCP server today, what are you doing?
Ready to Build Your MCP Server?
At Golf.dev, we're working with companies shipping MCP servers in production. We handle the security middleware - audit trails, prompt injection detection, protocol-aware observability - so your team can focus on building the best agent-native experience for your product.
The agent-native shift has started.
We’re onboarding early teams now - first come, first served. Build the future before it builds around you.