Introduction
AI agents are becoming the primary 'reader' of web content — and they prefer Markdown over HTML. This article is for web developers and DevOps engineers who want to make their sites agent-friendly without depending on a specific hosting provider. It walks through two implementation approaches — application-specific and proxy-based — with a concrete .NET code example and a sample Nginx configuration.
Motivation
In a blog post titled 'Introducing Markdown for Agents', Cloudflare — as one of the major hosting providers — introduces a new feature for websites that reflects a clear reality: more and more web content is no longer fetched by humans, but by AI agents (aka bots). The starting point is the fact that parsing HTML is relatively expensive (read: token-heavy) for AI agents, because a web page naturally contains many elements that are about layout rather than content. Markdown, by contrast, can be parsed with significantly fewer tokens — and therefore at lower cost — since the focus is squarely on the content. The core idea in the article: serve Markdown instead of HTML when the HTTP request header Accept contains text/markdown. For example:
curl https://developers.cloudflare.com/fundamentals/reference/markdown-for-agents/ \
-H "Accept: text/markdown"
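The same request can be built from .NET. The following is a minimal sketch, assuming a hypothetical helper class named MarkdownRequestSketch; the q value of 0.8 for the HTML fallback is an illustrative choice and not part of the Cloudflare proposal.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical helper, not part of any library.
public static class MarkdownRequestSketch
{
    // Builds a GET request that prefers Markdown but still accepts HTML as a fallback.
    public static HttpRequestMessage Build(String url)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, url);
        // No explicit q value means q=1.0 (most preferred).
        request.Headers.Accept.Add(new MediaTypeWithQualityHeaderValue("text/markdown"));
        // Illustrative fallback preference for servers without Markdown support.
        request.Headers.Accept.Add(new MediaTypeWithQualityHeaderValue("text/html", 0.8));
        return request;
    }
}
```

The request can then be sent with HttpClient.SendAsync. The Content-Type of the response shows whether the server actually answered with text/markdown or fell back to text/html.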
For sites hosted on Cloudflare, this feature is available automatically according to the documentation. The idea is compelling: in the future, content will increasingly be researched by AI agents through a chat interface rather than found manually via Google. That raises the question of how sites that are not hosted on Cloudflare can be given the same capability. It matters because, going forward, a site's 'AI-friendliness' rather than its 'Google ranking' may well decide the reach of its content. The [content-signal](https://contentsignals.org/) response header plays a role here too: it declares the site owner's preferences for how the content may be used, with separate signals for AI training (ai-train), search (search), and use as AI input (ai-input):
content-signal: ai-train=yes, search=yes, ai-input=yes
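To illustrate how a consuming agent might read such a header value, here is a minimal sketch. The class ContentSignalSketch and its parsing rules (ignore malformed entries, accept only yes/no) are assumptions for illustration, not a reference parser for the Content Signals policy.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of any library.
public static class ContentSignalSketch
{
    // Parses "ai-train=yes, search=yes, ai-input=no" into a signal -> allowed map.
    // Entries without '=' or with a value other than yes/no are ignored.
    public static Dictionary<String, Boolean> Parse(String headerValue)
    {
        var signals = new Dictionary<String, Boolean>(StringComparer.OrdinalIgnoreCase);
        var entries = headerValue.Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries);
        foreach (var entry in entries)
        {
            var separator = entry.IndexOf('=');
            if (separator <= 0)
            {
                continue;
            }
            var name = entry[..separator].Trim();
            var value = entry[(separator + 1)..].Trim();
            if (value.Equals("yes", StringComparison.OrdinalIgnoreCase))
            {
                signals[name] = true;
            }
            else if (value.Equals("no", StringComparison.OrdinalIgnoreCase))
            {
                signals[name] = false;
            }
        }
        return signals;
    }
}
```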
Implementation approaches
The following sections look at the options for implementing a comparable feature without relying on the hosting provider.
Application-specific implementation
This option is particularly interesting when much of the content is already in Markdown form and can be served directly — without any conversion step — when the Accept: text/markdown header is detected. This is the case, for example, when the content-heavy pages of a site are generated from Markdown content held in a headless CMS such as Directus. Dynamic site content can then be provided either 'on the fly' or pre-generated in Markdown form.
Advantages
- High content fidelity is achievable, because you have full site-specific control over the conversion process.
Disadvantages
- A dedicated implementation is required per site (though with a uniform tech stack this can be absorbed by your own shared libraries).
Sample implementation
Here is an example for a recently built website that sources its content from a Directus headless CMS, implemented as .NET middleware. Content available in the CMS is loaded directly from the CMS (TryResolveCmsContent); other pages are converted on the fly with the ReverseMarkdown library after the page has been rendered. Note that this is a sample implementation, not production-ready code.
///-------------------------------------------------------------------------------------------------
/// <summary> Middleware that serves page content as Markdown when the request includes
/// Accept: text/markdown. Follows the Cloudflare "Markdown for Agents" proposal. </summary>
///-------------------------------------------------------------------------------------------------
internal sealed class MarkdownForAgentsMiddleware(
RequestDelegate next,
MarkdownForAgentsOptions options,
ILogger<MarkdownForAgentsMiddleware> logger)
{
///-------------------------------------------------------------------------------------------------
/// <summary> The next middleware in the pipeline. </summary>
///-------------------------------------------------------------------------------------------------
private readonly RequestDelegate mNext = next;
///-------------------------------------------------------------------------------------------------
/// <summary> The middleware configuration options. </summary>
///-------------------------------------------------------------------------------------------------
private readonly MarkdownForAgentsOptions mOptions = options;
///-------------------------------------------------------------------------------------------------
/// <summary> The logger instance. </summary>
///-------------------------------------------------------------------------------------------------
private readonly ILogger<MarkdownForAgentsMiddleware> mLogger = logger;
///-------------------------------------------------------------------------------------------------
/// <summary> The ReverseMarkdown converter instance (thread-safe singleton). </summary>
///-------------------------------------------------------------------------------------------------
private readonly Converter mConverter = new(new Config
{
UnknownTags = Config.UnknownTagsOption.Bypass,
RemoveComments = true,
GithubFlavored = true,
SmartHrefHandling = true
});
///-------------------------------------------------------------------------------------------------
/// <summary> Known CMS content languages mapped from URL path segments. </summary>
///-------------------------------------------------------------------------------------------------
private static readonly FrozenDictionary<String, String> sCmsLanguages =
new Dictionary<String, String>(StringComparer.OrdinalIgnoreCase)
{
{ "de", "Deutsch" },
{ "en", "English" }
}.ToFrozenDictionary(StringComparer.OrdinalIgnoreCase);
///-------------------------------------------------------------------------------------------------
/// <summary> Processes an HTTP request, returning Markdown content for AI agents. </summary>
///
/// <param name="ctx"> The HTTP context. </param>
///
/// <returns> A task representing the asynchronous operation. </returns>
///-------------------------------------------------------------------------------------------------
public async Task InvokeAsync(HttpContext ctx)
{ // check requirements
ArgumentNullException.ThrowIfNull(ctx);
// Quick exits - pass through to normal pipeline
if (!HttpMethods.IsGet(ctx.Request.Method) && !HttpMethods.IsHead(ctx.Request.Method))
{
await mNext(ctx).ConfigureAwait(false);
return;
}
if (!ClientWantsMarkdown(ctx))
{ // Client doesn't want markdown - skip processing
await mNext(ctx).ConfigureAwait(false);
return;
}
if (ShouldSkipPath(ctx.Request.Path))
{ // Path which should be skipped - pass through to normal pipeline
await mNext(ctx).ConfigureAwait(false);
return;
}
// Phase 1: Try CMS content (short-circuit, no Razor execution needed)
var (language, slug) = TryResolveCmsContent(ctx.Request.Path);
if (language != null && slug != null)
{ // get your content service here
var cacheService = ctx.RequestServices.GetService<IPageCacheService>();
if (cacheService != null)
{
try
{ // Attempt to get page content from cache or other backing source
var page = await cacheService.GetOrSetPageAsync(language, slug, ctx.RequestAborted).ConfigureAwait(false);
if (!String.IsNullOrWhiteSpace(page?.Text))
{
mLogger.LogServingCmsMarkdown(language, slug);
await WriteMarkdownResponse(ctx, page.Text).ConfigureAwait(false);
return;
}
}
catch (Exception ex)
{
mLogger.LogCmsMarkdownFailed(ex, language, slug, ex.Message);
}
}
}
// Phase 2: HTML fallback - let Razor render, then convert
var originalBody = ctx.Response.Body;
using var buffer = new MemoryStream();
ctx.Response.Body = buffer;
try
{
await mNext(ctx).ConfigureAwait(false);
// Only convert 200 OK HTML responses
if (ctx.Response.StatusCode == 200
&& ctx.Response.ContentType?.Contains("text/html", StringComparison.OrdinalIgnoreCase) == true)
{
buffer.Seek(0, SeekOrigin.Begin);
using var reader = new StreamReader(buffer, Encoding.UTF8, leaveOpen: true);
var html = await reader.ReadToEndAsync(ctx.RequestAborted).ConfigureAwait(false);
var content = ExtractMainContent(html);
if (!String.IsNullOrWhiteSpace(content))
{
var markdown = mConverter.Convert(content);
ctx.Response.Body = originalBody;
// Clear Razor-set headers before writing markdown response
ctx.Response.Headers.ContentType = default;
ctx.Response.Headers.ContentLength = default;
mLogger.LogServingHtmlFallbackMarkdown(ctx.Request.Path);
await WriteMarkdownResponse(ctx, markdown).ConfigureAwait(false);
return;
}
}
// Non-200 or non-HTML or extraction failed: copy buffer to original
buffer.Seek(0, SeekOrigin.Begin);
ctx.Response.Body = originalBody;
await buffer.CopyToAsync(originalBody, ctx.RequestAborted).ConfigureAwait(false);
}
catch (Exception ex)
{ // Fail open: restore original stream, log warning
mLogger.LogMarkdownFallbackFailed(ex, ex.Message);
ctx.Response.Body = originalBody;
if (!ctx.Response.HasStarted)
{ // Try to copy whatever was buffered
buffer.Seek(0, SeekOrigin.Begin);
await buffer.CopyToAsync(originalBody, ctx.RequestAborted).ConfigureAwait(false);
}
}
}
///-------------------------------------------------------------------------------------------------
/// <summary> Determines whether the client is requesting Markdown via the Accept header. </summary>
///
/// <param name="ctx"> The HTTP context. </param>
///
/// <returns> True if the client accepts text/markdown with sufficient quality. </returns>
///-------------------------------------------------------------------------------------------------
private Boolean ClientWantsMarkdown(HttpContext ctx)
{ // Simplified: basic Accept check (not a full RFC-compliant parser)
var acceptHeader = ctx.Request.Headers.Accept.ToString();
if (String.IsNullOrWhiteSpace(acceptHeader))
{
return false;
}
// Check for text/markdown
if (acceptHeader.Contains("text/markdown", StringComparison.OrdinalIgnoreCase))
{
return ParseQValue(acceptHeader, "text/markdown") > 0;
}
// Check for legacy text/x-markdown
if (mOptions.AcceptLegacyTextXMarkdown
&& acceptHeader.Contains("text/x-markdown", StringComparison.OrdinalIgnoreCase))
{
return ParseQValue(acceptHeader, "text/x-markdown") > 0;
}
return false;
}
///-------------------------------------------------------------------------------------------------
/// <summary> Parses the quality value for a specific media type from an Accept header.
/// Uses ReadOnlySpan/MemoryExtensions.Split to avoid allocations on the hot path. </summary>
///
/// <param name="acceptHeader"> The full Accept header value. </param>
/// <param name="mediaType"> The media type to find the quality value for. </param>
///
/// <returns> The quality value (0.0 to 1.0), defaulting to 1.0 if not specified. </returns>
///-------------------------------------------------------------------------------------------------
private static Double ParseQValue(ReadOnlySpan<Char> acceptHeader, ReadOnlySpan<Char> mediaType)
{
foreach (var range in acceptHeader.Split(','))
{
var part = acceptHeader[range].Trim();
if (!part.StartsWith(mediaType, StringComparison.OrdinalIgnoreCase))
{
continue;
}
// Look for a q parameter among the ';'-separated parameters (tolerates "; q=0.5")
foreach (var paramRange in part.Split(';'))
{
var param = part[paramRange].Trim();
if (param.StartsWith("q=", StringComparison.OrdinalIgnoreCase)
&& Double.TryParse(param[2..], NumberStyles.Float, CultureInfo.InvariantCulture, out var q))
{
return q;
}
}
return 1.0; // Default quality is 1.0 when q is missing or unparsable
}
return 0.0;
}
///-------------------------------------------------------------------------------------------------
/// <summary> Determines whether the given path should be skipped by the middleware. </summary>
///
/// <param name="path"> The request path. </param>
///
/// <returns> True if the path should be skipped. </returns>
///-------------------------------------------------------------------------------------------------
private Boolean ShouldSkipPath(PathString path)
{
var p = path.Value;
if (String.IsNullOrEmpty(p))
{
return false;
}
// Skip known prefixes
if (mOptions.SkipPathPrefixes.Any(prefix => p.StartsWith(prefix, StringComparison.OrdinalIgnoreCase)))
{
return true;
}
// Skip file extensions (static files)
var lastSegment = p.Split('/').LastOrDefault();
if (lastSegment?.Contains('.', StringComparison.Ordinal) == true)
{
return true;
}
return false;
}
///-------------------------------------------------------------------------------------------------
/// <summary> Attempts to resolve a CMS content language and slug from the request path. </summary>
///
/// <param name="path"> The request path. </param>
///
/// <returns> A tuple of (language, slug), both null if the path does not match CMS content. </returns>
///-------------------------------------------------------------------------------------------------
private static (String? Language, String? Slug) TryResolveCmsContent(PathString path)
{
var segments = path.Value?.Split('/', StringSplitOptions.RemoveEmptyEntries);
if (segments is not { Length: 2 })
{
return (null, null);
}
if (!sCmsLanguages.TryGetValue(segments[0], out var language))
{
return (null, null);
}
var slug = segments[1];
if (slug.Contains('.', StringComparison.Ordinal) || slug.Contains('\\', StringComparison.Ordinal))
{
return (null, null);
}
return (language, slug);
}
///-------------------------------------------------------------------------------------------------
/// <summary> Extracts the main page content between the content markers in rendered HTML. </summary>
///
/// <param name="html"> The full HTML page content. </param>
///
/// <returns> The extracted content between markers, or null if markers not found. </returns>
///-------------------------------------------------------------------------------------------------
private String? ExtractMainContent(String html)
{
var startIdx = html.IndexOf(mOptions.ContentStartMarker, StringComparison.Ordinal);
var endIdx = html.IndexOf(mOptions.ContentEndMarker, StringComparison.Ordinal);
if (startIdx < 0 || endIdx < 0 || endIdx <= startIdx)
{
return null;
}
return html.Substring(startIdx + mOptions.ContentStartMarker.Length, endIdx - startIdx - mOptions.ContentStartMarker.Length).Trim();
}
///-------------------------------------------------------------------------------------------------
/// <summary> Writes a Markdown response with appropriate headers. </summary>
///
/// <param name="ctx"> The HTTP context. </param>
/// <param name="markdown"> The Markdown content to write. </param>
///
/// <returns> A task representing the asynchronous operation. </returns>
///-------------------------------------------------------------------------------------------------
private async Task WriteMarkdownResponse(HttpContext ctx, String markdown)
{
// ETag + conditional 304
if (mOptions.SetWeakETag)
{
var etag = CreateWeakETag(markdown);
ctx.Response.Headers[HeaderNames.ETag] = etag;
if (ctx.Request.Headers.IfNoneMatch.ToString() == etag)
{ // Simplified: exact match only (does not handle lists, weak tags, or "*")
SetCommonHeaders(ctx);
ctx.Response.StatusCode = StatusCodes.Status304NotModified;
return;
}
}
SetCommonHeaders(ctx);
ctx.Response.StatusCode = StatusCodes.Status200OK;
ctx.Response.ContentType = "text/markdown; charset=utf-8";
if (HttpMethods.IsHead(ctx.Request.Method))
{
return;
}
await ctx.Response.WriteAsync(markdown, Encoding.UTF8, ctx.RequestAborted).ConfigureAwait(false);
}
///-------------------------------------------------------------------------------------------------
/// <summary> Sets common response headers for Markdown responses. </summary>
///
/// <param name="ctx"> The HTTP context. </param>
///-------------------------------------------------------------------------------------------------
private void SetCommonHeaders(HttpContext ctx)
{
ctx.Response.Headers["Content-Signal"] = mOptions.ContentSignalHeaderValue;
MergeVary(ctx, "Accept");
if (mOptions.SetNoTransform)
{
ctx.Response.Headers[HeaderNames.CacheControl] = "no-transform";
}
}
///-------------------------------------------------------------------------------------------------
/// <summary> Merges a value into the Vary response header without duplicating. </summary>
///
/// <param name="ctx"> The HTTP context. </param>
/// <param name="value"> The value to add to the Vary header. </param>
///-------------------------------------------------------------------------------------------------
private static void MergeVary(HttpContext ctx, String value)
{
var existing = ctx.Response.Headers[HeaderNames.Vary].ToString();
if (String.IsNullOrEmpty(existing))
{
ctx.Response.Headers[HeaderNames.Vary] = value;
return;
}
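// Compare whole tokens: a substring check would treat "Accept-Encoding" as already containing "Accept"
if (existing.Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
.Contains(value, StringComparer.OrdinalIgnoreCase))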
{
return;
}
ctx.Response.Headers[HeaderNames.Vary] = $"{existing}, {value}";
}
///-------------------------------------------------------------------------------------------------
/// <summary> Creates a weak ETag from the SHA256 hash of the content. </summary>
///
/// <param name="content"> The content to hash. </param>
///
/// <returns> A weak ETag string in the format W/"base64hash". </returns>
///-------------------------------------------------------------------------------------------------
private static String CreateWeakETag(String content)
{
var bytes = SHA256.HashData(Encoding.UTF8.GetBytes(content));
var base64 = System.Convert.ToBase64String(bytes);
return $"W/\"{base64}\"";
}
}
#region LoggerMessages
internal static partial class MarkdownForAgentsLoggerMessages
{
[LoggerMessage(EventId = 200, Level = LogLevel.Debug, Message = "Serving CMS markdown for {Language}/{Slug}")]
public static partial void LogServingCmsMarkdown(this ILogger logger, String language, String slug);
[LoggerMessage(EventId = 201, Level = LogLevel.Warning, Message = "CMS markdown retrieval failed for {Language}/{Slug}: {Error}")]
public static partial void LogCmsMarkdownFailed(this ILogger logger, Exception ex, String language, String slug, String error);
[LoggerMessage(EventId = 202, Level = LogLevel.Debug, Message = "Serving HTML-to-Markdown fallback for {Path}")]
public static partial void LogServingHtmlFallbackMarkdown(this ILogger logger, String path);
[LoggerMessage(EventId = 203, Level = LogLevel.Warning, Message = "Markdown-for-agents fallback failed, serving HTML: {Error}")]
public static partial void LogMarkdownFallbackFailed(this ILogger logger, Exception ex, String error);
}
#endregion
And the code to register the middleware:
///-------------------------------------------------------------------------------------------------
/// <summary> Registers the Markdown-for-Agents middleware. Place before MapRazorPages(). </summary>
///
/// <param name="app"> The web application. </param>
/// <param name="configure"> Optional configuration callback. </param>
///
/// <returns> The web application for chaining. </returns>
///-------------------------------------------------------------------------------------------------
public static WebApplication UseMarkdownForAgents(
this WebApplication app,
Action<MarkdownForAgentsOptions>? configure = null)
{ // check requirements
ArgumentNullException.ThrowIfNull(app);
var options = new MarkdownForAgentsOptions();
configure?.Invoke(options);
app.UseMiddleware<MarkdownForAgentsMiddleware>(options);
return app;
}
The configuration class (adapt to your own needs):
///-------------------------------------------------------------------------------------------------
/// <summary> Configuration options for the Markdown-for-Agents middleware. </summary>
///-------------------------------------------------------------------------------------------------
internal sealed class MarkdownForAgentsOptions
{
///-------------------------------------------------------------------------------------------------
/// <summary> Gets or sets the Content-Signal header value. </summary>
///-------------------------------------------------------------------------------------------------
public String ContentSignalHeaderValue { get; set; } = "ai-train=yes, search=yes, ai-input=yes";
///-------------------------------------------------------------------------------------------------
/// <summary> Gets or sets a value indicating whether to accept the legacy text/x-markdown type. </summary>
///-------------------------------------------------------------------------------------------------
public Boolean AcceptLegacyTextXMarkdown { get; set; } = true;
///-------------------------------------------------------------------------------------------------
/// <summary> Gets or sets a value indicating whether to set Cache-Control: no-transform. </summary>
///-------------------------------------------------------------------------------------------------
public Boolean SetNoTransform { get; set; } = true;
///-------------------------------------------------------------------------------------------------
/// <summary> Gets or sets a value indicating whether to set a weak ETag header. </summary>
///-------------------------------------------------------------------------------------------------
public Boolean SetWeakETag { get; set; } = true;
///-------------------------------------------------------------------------------------------------
/// <summary> Gets or sets the HTML comment marker that indicates the start of page content. </summary>
///-------------------------------------------------------------------------------------------------
public String ContentStartMarker { get; set; } = "<!-- CONTENT_START -->";
///-------------------------------------------------------------------------------------------------
/// <summary> Gets or sets the HTML comment marker that indicates the end of page content. </summary>
///-------------------------------------------------------------------------------------------------
public String ContentEndMarker { get; set; } = "<!-- CONTENT_END -->";
///-------------------------------------------------------------------------------------------------
/// <summary> Gets or sets the URL path prefixes that should be skipped by the middleware. </summary>
///-------------------------------------------------------------------------------------------------
public List<String> SkipPathPrefixes { get; set; } =
[
"/api/",
"/auth/",
"/cart/",
"/status",
"/error",
"/chathub",
"/assets/"
];
}
A short note on three details worth flagging.
Quality value in the Accept header
The Accept HTTP header can optionally contain multiple desired formats, each with a 'quality' value:
Accept: text/html, application/json;q=0.9, text/plain;q=0.5, application/xml;q=0
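This indicates how strongly each format is preferred: q ranges from 0 to 1, defaults to 1 when omitted, and q=0 means 'not acceptable'. The code above serves Markdown whenever the q value for text/markdown is greater than zero; it does not compare that value against the q value of text/html.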
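A stricter variant would serve Markdown only when it is preferred at least as much as HTML. The following is a minimal sketch under stated assumptions: the class AcceptNegotiationSketch and its methods are hypothetical, only exact media type matches are considered, and wildcards such as */* or text/* are deliberately ignored.

```csharp
using System;
using System.Net.Http.Headers;

// Hypothetical helper, not part of any library.
public static class AcceptNegotiationSketch
{
    // Returns the q value of an exactly matching media type, or 0 if it is not listed.
    public static Double QualityOf(String acceptHeader, String mediaType)
    {
        var parts = acceptHeader.Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries);
        foreach (var part in parts)
        {
            if (MediaTypeWithQualityHeaderValue.TryParse(part, out var parsed)
                && String.Equals(parsed.MediaType, mediaType, StringComparison.OrdinalIgnoreCase))
            {
                // A missing q parameter means the default quality of 1.0.
                return parsed.Quality ?? 1.0;
            }
        }
        return 0.0;
    }

    // True if Markdown is acceptable and ranked at least as high as HTML.
    public static Boolean PrefersMarkdown(String acceptHeader)
    {
        var markdown = QualityOf(acceptHeader, "text/markdown");
        var html = QualityOf(acceptHeader, "text/html");
        return markdown > 0 && markdown >= html;
    }
}
```

With this rule, a request with Accept: text/html, text/markdown;q=0.1 keeps receiving HTML, while a request that lists only text/markdown receives Markdown.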
ETag header
The ETag (Entity Tag) response header is an ID for a specific version of a resource — for example an HTML page or, as in our case, a Markdown result. This enables caching of the resource. A client that has cached the resource sends the original ETag back to the server; if nothing has changed, the server responds with HTTP status 304 (Not Modified) — otherwise with the changed resource and a new ETag. This avoids re-transmitting unchanged content. Another use case is handling colliding edit operations — see the documentation for more. The code above emits an ETag so that caching is possible.
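The sample middleware compares If-None-Match by exact string equality only. A somewhat more tolerant check could look like the following sketch. The class IfNoneMatchSketch is hypothetical, and splitting on commas assumes that the entity tags themselves contain no commas, which holds for the Base64 hashes used above.

```csharp
#nullable enable
using System;

// Hypothetical helper, not part of any library.
public static class IfNoneMatchSketch
{
    // True if any entity tag in the If-None-Match value matches the current ETag.
    // Uses weak comparison: a W/ prefix is ignored on both sides. "*" matches anything.
    public static Boolean Matches(String? ifNoneMatch, String currentETag)
    {
        if (String.IsNullOrWhiteSpace(ifNoneMatch))
        {
            return false;
        }
        var current = StripWeakPrefix(currentETag);
        var candidates = ifNoneMatch.Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries);
        foreach (var candidate in candidates)
        {
            if (candidate == "*" || StripWeakPrefix(candidate) == current)
            {
                return true;
            }
        }
        return false;
    }

    // Removes the weak indicator so that W/"abc" and "abc" compare as equal.
    private static String StripWeakPrefix(String tag) =>
        tag.StartsWith("W/", StringComparison.Ordinal) ? tag[2..] : tag;
}
```

When Matches returns true, the server can answer with 304 Not Modified and an empty body.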
Deployment note: depending on configuration, reverse proxies and CDNs build their own cache keys. Make sure the negotiated representation (HTML vs. Markdown, as selected by the Accept header) is part of that key and that each representation carries its own ETag. Otherwise you risk 'cross-content' cache hits: Markdown served to a browser, or HTML served to an agent.
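One way to read this note: rather than putting the raw Accept header into the cache key, which would fragment the cache across many header spellings, the key can carry a normalized variant marker. A minimal sketch, assuming a hypothetical VariantCacheKeySketch class and the simplified 'contains text/markdown' check used elsewhere in this article:

```csharp
#nullable enable
using System;

// Hypothetical helper, not part of any library.
public static class VariantCacheKeySketch
{
    // Builds a cache key that separates the Markdown and HTML variants of the same path.
    public static String Build(String path, String? acceptHeader)
    {
        // Simplified detection: ignores q values and wildcards.
        var wantsMarkdown = acceptHeader is not null
            && acceptHeader.Contains("text/markdown", StringComparison.OrdinalIgnoreCase);
        var variant = wantsMarkdown ? "md" : "html";
        return $"{variant}:{path}";
    }
}
```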
Vary header
The Vary response header tells caches and clients which request headers were used to select the response. Different values of those request headers can therefore produce a different response for the same URL.
Vary: Accept
In other words: the response served depends on the value of the Accept request header, so a cache must not reuse a stored HTML response for a request that asks for Markdown.
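When a Vary header is already present (for example Vary: Accept-Encoding set by response compression), the new value has to be merged in rather than overwritten. The following sketch shows a token-based merge; the class VarySketch is hypothetical. Comparing whole tokens matters here, because a plain substring check would consider Accept to be already present in Accept-Encoding.

```csharp
#nullable enable
using System;
using System.Linq;

// Hypothetical helper, not part of any library.
public static class VarySketch
{
    // Returns the Vary value after adding 'value', without duplicating existing tokens.
    public static String Merge(String? existing, String value)
    {
        if (String.IsNullOrWhiteSpace(existing))
        {
            return value;
        }
        var tokens = existing.Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries);
        // "*" already means the response varies on everything; header names are case-insensitive.
        if (tokens.Contains("*") || tokens.Contains(value, StringComparer.OrdinalIgnoreCase))
        {
            return existing;
        }
        return $"{existing}, {value}";
    }
}
```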
Application-independent implementation
This option is particularly interesting for hosting scenarios in which a reverse proxy (Nginx, HAProxy, Traefik, YARP, ...) or an edge proxy (Cloudflare, AWS CloudFront, Fastly, Akamai, Vercel's Edge Network) sits in front of the actual website. The idea here is to detect the Accept: text/markdown header and either
- route to a service that fetches the HTML markup of the site on the fly and converts it to Markdown, or
- use a cache such as Redis that has been pre-populated with the corresponding Markdown content.
The two approaches can of course be combined — generated Markdown can be stored in the cache.
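The combination of both options is essentially a cache-aside pattern. The sketch below illustrates it under stated assumptions: the class MarkdownCacheSketch is hypothetical, an in-memory ConcurrentDictionary stands in for Redis, and the renderMarkdown delegate stands in for the conversion service, which has to be provided separately.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical helper, not part of any library.
public sealed class MarkdownCacheSketch
{
    // In-memory stand-in for an external cache such as Redis.
    private readonly ConcurrentDictionary<String, String> mCache = new(StringComparer.Ordinal);
    // Stand-in for the service that fetches the HTML and converts it to Markdown.
    private readonly Func<String, Task<String>> mRenderMarkdown;

    public MarkdownCacheSketch(Func<String, Task<String>> renderMarkdown)
    {
        mRenderMarkdown = renderMarkdown ?? throw new ArgumentNullException(nameof(renderMarkdown));
    }

    // Returns cached Markdown if present; otherwise renders it once and stores the result.
    public async Task<String> GetAsync(String path)
    {
        if (mCache.TryGetValue(path, out var cached))
        {
            return cached;
        }
        var markdown = await mRenderMarkdown(path).ConfigureAwait(false);
        mCache[path] = markdown;
        return markdown;
    }

    // Removes an entry, for example after the underlying page has changed.
    public void Invalidate(String path) => mCache.TryRemove(path, out _);
}
```

Concurrent first requests for the same path may each trigger a render in this sketch; a production setup would also need expiry and an invalidation strategy tied to content changes.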
Advantages
- A uniform implementation across many heterogeneous applications.
- One solution for several websites and hosting environments.
- Application code doesn't need to be touched.
Disadvantages
- The quality of HTML-to-Markdown conversion depends heavily on how complex the source HTML is (headers, footers, sidebars, ads, etc.).
- An extra network hop, since the conversion service sits between the proxy and the main application.
- If content is pre-generated, a corresponding generation pipeline or background job is required. The frequency of change and the invalidation of generated content also need to be handled.
Implementation
How this approach is implemented depends primarily on the following factors:
- Required throughput: ideas would include a Node.js Express solution with the Turndown library, or a fast Go implementation.
- Caching support.
- Quality of boilerplate removal: possibly use a library such as @mozilla/readability + jsdom.
Sample Nginx configuration
An Nginx configuration for wiring in a Markdown conversion service might look like this.
Note: the md_renderer upstream referenced here is a placeholder — the actual conversion service (e.g. based on Turndown or a comparable library) has to be implemented and provided separately.
# /etc/nginx/conf.d/site.conf
# Decide upstream based on Accept header (safe to use 'map').
map $http_accept $wants_markdown {
default 0;
~*text/markdown 1;
~*text/x-markdown 1;
}
# Pick the upstream name based on $wants_markdown.
map $wants_markdown $backend_upstream {
0 "html_origin";
1 "md_renderer";
}
upstream html_origin {
# Your normal site/app (example)
server 10.0.10.25:8080;
keepalive 64;
}
upstream md_renderer {
# Your renderer service (example)
server 10.0.20.15:3000;
keepalive 32;
}
server {
listen 80;
server_name www.example.com;
# Optional: if you also terminate TLS, add the 443 server block and redirect HTTP->HTTPS.
location / {
# Route to either html_origin or md_renderer based on Accept.
proxy_pass http://$backend_upstream;
# Forward standard proxy headers
proxy_set_header Host $host;
proxy_set_header X-Forwarded-Host $host;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
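proxy_set_header X-Real-IP $remote_addr;
# Use HTTP/1.1 and clear the Connection header so the upstream 'keepalive' pools are actually used
proxy_http_version 1.1;
proxy_set_header Connection "";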
# Tell renderer what was originally requested (path + query)
# For HTML origin this is harmless.
proxy_set_header X-Original-URI $request_uri;
# Important for caching correctness (if you enable proxy_cache later):
# The response varies by Accept.
# If you have any intermediate caches, ensure they respect Vary: Accept.
# (NGINX will forward the header; the renderer/app should set it.)
}
}
With TLS support added:
server {
listen 443 ssl http2;
server_name www.example.com;
ssl_certificate /etc/letsencrypt/live/www.example.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/www.example.com/privkey.pem;
location / {
proxy_pass http://$backend_upstream;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-Host $host;
proxy_set_header X-Forwarded-Proto https;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
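proxy_set_header X-Real-IP $remote_addr;
# Use HTTP/1.1 and clear the Connection header so the upstream 'keepalive' pools are actually used
proxy_http_version 1.1;
proxy_set_header Connection "";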
proxy_set_header X-Original-URI $request_uri;
}
}
server {
listen 80;
server_name www.example.com;
return 301 https://$host$request_uri;
}
Checklist
The following points matter for both approaches above:
- Correct cache handling (when caching is used): the cache should be consistent and current (see also the ETag section above).
- Content parity: the Markdown should reflect the same primary content as the HTML — not a 'thinner' version.
- Boilerplate control: when converting from HTML, nav, footers and sidebars should be removed (Readability-style extraction helps here).
- Security: the Markdown service must not fetch arbitrary URLs (SSRF). Only your own origins should be allowed.
- Observability: track metrics such as Markdown hits, conversion time, cache hit rate, and origin fetch errors.
- Graceful fallback: if the conversion fails, you have a few options: return HTML (not ideal), return 406 Not Acceptable, or return a minimally extracted Markdown (plain text).
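The security item in the checklist can be illustrated with a simple allow-list check that a conversion service might run before fetching any URL. This is a minimal sketch: the class OriginAllowListSketch is hypothetical and www.example.com is a placeholder host. It does not cover redirects or DNS-based tricks, so a real service would additionally have to validate every redirect target.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of any library.
public static class OriginAllowListSketch
{
    // Placeholder host; replace with the origins the service is allowed to fetch.
    private static readonly HashSet<String> sAllowedHosts = new(StringComparer.OrdinalIgnoreCase)
    {
        "www.example.com"
    };

    // True only for absolute http(s) URLs whose host is on the allow-list.
    public static Boolean IsAllowed(String url)
    {
        return Uri.TryCreate(url, UriKind.Absolute, out var uri)
            && (uri.Scheme == Uri.UriSchemeHttps || uri.Scheme == Uri.UriSchemeHttp)
            && sAllowedHosts.Contains(uri.Host);
    }
}
```

Comparing the parsed host instead of searching the URL string is deliberate: a URL such as https://www.example.com@evil.test/ contains the allowed name but resolves to a different host.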
Alternative approaches
Google recently introduced WebMCP (Web Model Context Protocol) — a new JavaScript interface intended to give AI agents a standardised way to communicate with websites. Unlike Markdown-for-Agents, the focus here is not on scraping the website's content but on targeted interaction with the website — for example to fill in forms or place orders. The two approaches are complementary: Markdown-for-Agents optimises the passive reading of content, WebMCP enables active interaction. Implementing WebMCP requires more development effort, however, since it means working with new JavaScript APIs.
Conclusion
Serving Markdown content to AI agents is not a Cloudflare privilege — with manageable effort, this feature can be integrated into any existing web infrastructure. Whether application-specific via middleware or proxy-based via Nginx and friends, what matters is that your content is optimised for the next generation of 'readers'. Designing your site to be AI-friendly today secures the reach of tomorrow — because the Accept header of the future is, more and more often, going to be text/markdown.