The Cloudflare Layer
The features you asked about that live in Cloudflare, not Astro: deploy hooks (rebuild on WordPress publish), AI-crawler controls, Markdown-for-agents, and cache purging. Verify exact dashboard paths against current Cloudflare docs — their AI/agent features move fast.
Deploy hooks
A secret URL that triggers a fresh Pages build & deploy.
Create one in Cloudflare Pages
- Pages project → Settings → Builds & deployments → Deploy hooks.
- Add a hook: give it a name (e.g. wp-content-publish) and pick the branch to build (main).
- Cloudflare gives you a unique POST URL containing a secret token. Treat it like a password.
- Anyone who POSTs to that URL triggers a build — no auth header needed, the token is the auth.
# Triggering a build is a single POST
curl -X POST "https://api.cloudflare.com/client/v4/pages/webhooks/deploy_hooks/<TOKEN>"
Why it matters: Your headless site is static, so new WordPress content only goes live after a rebuild. This URL is the trigger. Store it as a secret in the WP backend; never commit it to either repo.
Trigger from WordPress
Fire the deploy hook automatically when content changes.
Hook into the publish lifecycle
Add this to the rbi-api plugin (keeps the deploy concern with your other custom backend code, not in the theme). Use transition_post_status so you catch publish, update, and unpublish.
// rbi-api: rebuild the static site on content change
add_action("transition_post_status", function ($new, $old, $post) {
    // only for public post types we render on the Astro site
    if (!in_array($post->post_type, ["post", "rbi_projects"], true)) return;
    // only when going to/from published (skip autosaves, drafts)
    if ($new !== "publish" && $old !== "publish") return;
    // read the hook URL from a constant or the environment, never from source
    $hook = defined("CF_DEPLOY_HOOK_URL") ? CF_DEPLOY_HOOK_URL : getenv("CF_DEPLOY_HOOK_URL");
    if (!$hook) return;
    // fire-and-forget: don't make the editor wait on Cloudflare
    wp_remote_post($hook, [ "blocking" => false, "timeout" => 5 ]);
}, 10, 3);
Store the URL safely: define CF_DEPLOY_HOOK_URL in wp-config.php or a WP Engine environment variable, not in the plugin source. Mirror the cross-repo-contract pattern you already use for the GF endpoint.
Why it matters: Closes the loop — marketer clicks Publish → WP pings Cloudflare → Astro rebuilds → live in a couple minutes. The client experiences a normal CMS; you keep a static, fast, cheap site.
Debouncing builds
Don't rebuild on every keystroke
A naive hook can fire many builds during a heavy editing session (each update = a build = build minutes + a few minutes of stale lag while it runs). Mitigations, simplest first:
| Approach | How |
|---|---|
| Publish-only | The status guard above already skips drafts/autosaves — covers most cases. |
| WP transient lock | Set a 60–120s set_transient() flag; skip POST if it's set. Coalesces a burst of edits into one build. |
| Scheduled batch | For very chatty sites: flag "dirty" on change, run a WP-Cron job every N minutes that fires the hook once if dirty. |
| Manual button | An admin-bar "Publish site changes" button for editorial teams who batch their work. |
Why it matters: Cloudflare Pages has monthly build limits on lower plans, and a flood of overlapping builds just delays the one that matters. Debouncing keeps deploys predictable and cheap.
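The transient-lock row can be modeled as a small piece of pure logic. This is a sketch in TypeScript, not WordPress code: `LockStore`, `MemoryLockStore`, `shouldFireHook`, and the `deploy_hook_lock` key are hypothetical names, and in the plugin the store would be get_transient()/set_transient() with a 60–120s expiry.

```typescript
// Model of a time-based lock, standing in for a WordPress transient.
interface LockStore {
  get(key: string, nowMs: number): boolean;
  set(key: string, ttlMs: number, nowMs: number): void;
}

// In-memory store for illustration only (hypothetical; not persistent).
class MemoryLockStore implements LockStore {
  private expiry = new Map<string, number>();

  get(key: string, nowMs: number): boolean {
    const until = this.expiry.get(key);
    return until !== undefined && nowMs < until;
  }

  set(key: string, ttlMs: number, nowMs: number): void {
    this.expiry.set(key, nowMs + ttlMs);
  }
}

// Returns true if the deploy hook should be POSTed for this edit,
// and takes the lock so later edits inside the window are skipped.
function shouldFireHook(store: LockStore, nowMs: number, windowMs = 90000): boolean {
  const key = "deploy_hook_lock"; // hypothetical key name
  if (store.get(key, nowMs)) return false;
  store.set(key, windowMs, nowMs);
  return true;
}

// Five edits at 0s, 5s, 30s, 89.999s and 90s with a 90s window.
const store = new MemoryLockStore();
const burst = [0, 5000, 30000, 89999, 90000].map((t) => shouldFireHook(store, t));
console.log(burst); // [ true, false, false, false, true ]
```

One design consequence: a lock like this builds on the first edit of a burst, so edits made later inside the window only go live with the next build. If that matters for a client, the scheduled-batch row is the safer choice.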
Markdown for agents
Serving clean Markdown to AI assistants and answer engines.
Two ways to deliver it
| Mechanism | What happens |
|---|---|
| Content negotiation | A request with Accept: text/markdown (or a known agent UA) gets a Markdown version of the page instead of HTML — nav/chrome/scripts stripped. |
| .md twin URLs | Every /page has a parallel /page.md that returns just the content as Markdown. |
Cloudflare can perform this transformation at the edge (its AI/agent and "AI Crawl"-adjacent features can return Markdown representations), so agents get clean content without you maintaining a second render path. You can also do the .md twins yourself in Astro (endpoint per route — see the Astro reference) and let them sit behind Cloudflare's cache.
Decide where it lives: If Cloudflare handles negotiation at the edge, you get it for every page with no code. If you generate .md twins in Astro, you control the exact Markdown and it's portable to any host. For a content-heavy SaaS blog, the Astro-generated twin gives you the cleanest, most predictable output.
Why it matters: AI answer engines increasingly fetch pages to cite/summarize. Clean Markdown means they ingest the client's actual content accurately — better representation in AI answers, and dramatically cheaper for agents to parse than full HTML.
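If you handle negotiation yourself rather than at Cloudflare's edge, the decision comes down to two small helpers. This is a minimal sketch under stated assumptions: `parseAccept`, `wantsMarkdown`, and `markdownTwinPath` are hypothetical names, wildcards such as `*/*` are ignored, Markdown wins only when the client ranks it at least as high as HTML, and mapping `/` to `/index.md` is an assumption, not something the twin convention defines.

```typescript
// Parse an Accept header into media type -> q value (q defaults to 1).
function parseAccept(header: string): Map<string, number> {
  const out = new Map<string, number>();
  for (const part of header.split(",")) {
    const [type, ...params] = part.trim().split(";");
    if (!type) continue;
    let q = 1;
    for (const p of params) {
      const [k, v] = p.trim().split("=");
      if (k === "q" && v !== undefined) {
        const parsed = Number(v);
        if (!Number.isNaN(parsed)) q = parsed;
      }
    }
    out.set(type.trim().toLowerCase(), q);
  }
  return out;
}

// True when the client explicitly asks for Markdown and does not prefer HTML.
function wantsMarkdown(acceptHeader: string | null): boolean {
  if (!acceptHeader) return false;
  const accept = parseAccept(acceptHeader);
  const md = accept.get("text/markdown") ?? 0;
  const html = accept.get("text/html") ?? 0;
  return md > 0 && md >= html;
}

// Map a page path to its .md twin: "/blog/the-slug/" -> "/blog/the-slug.md".
function markdownTwinPath(pathname: string): string {
  const trimmed = pathname.replace(/\/+$/, "");
  return trimmed === "" ? "/index.md" : `${trimmed}.md`; // root mapping is an assumption
}

console.log(wantsMarkdown("text/markdown"));        // true
console.log(wantsMarkdown("text/html,*/*;q=0.8"));  // false
console.log(markdownTwinPath("/blog/the-slug/"));   // /blog/the-slug.md
```

A browser's default Accept header never lists text/markdown, so human visitors keep getting HTML; only clients that ask for Markdown are routed to the twin.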
AI Crawl Control
Decide which AI bots may access a client's content — at the edge.
Allow, block, or meter AI crawlers
Cloudflare's bot/AI-crawler controls (in the dashboard under the AI/bot management area) let you set policy per known AI crawler, independent of robots.txt (which is advisory and ignorable). Enforcement happens at Cloudflare's edge, so it actually blocks rather than politely requests.
| Crawler | Used for |
|---|---|
| GPTBot | OpenAI training |
| OAI-SearchBot | OpenAI search/answers |
| ClaudeBot | Anthropic training |
| PerplexityBot | Perplexity answers |
| Google-Extended | Google AI training (separate from Googlebot) |
| Bytespider / others | Various; often blocked |
Why it matters: Clients will ask "do I want AI training on my site? Do I want to show up in ChatGPT/Perplexity answers?" Those are often different answers (block training, allow answer-engine indexing). Knowing the levers — and that they're enforceable at Cloudflare, not just in robots.txt — makes you the advisor, not the order-taker.
robots.txt & llms.txt
The advisory layer
These are conventions bots may honor — pair them with Cloudflare's enforcement for anything you actually need guaranteed. Both are static files you can ship from Astro's public/ or generate via an endpoint.
# public/robots.txt — block AI training, allow search
User-agent: GPTBot
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: *
Allow: /
Sitemap: https://redbridgenet.com/sitemap-index.xml
Why it matters: robots.txt is the polite signal; Cloudflare AI Crawl Control is the enforcement. llms.txt (covered in the Astro reference) is the opposite move: actively inviting good representation by handing agents a curated map. Most clients want some mix of all three.
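If you generate robots.txt from an endpoint instead of shipping a static file, the per-client AI policy can live in one list. A minimal sketch, assuming a hypothetical `buildRobotsTxt` helper whose output you would return as the endpoint's text/plain body; the agent names and sitemap URL are the ones used above.

```typescript
// Build a robots.txt that disallows the listed crawlers and allows everyone else.
function buildRobotsTxt(blockedAgents: string[], sitemapUrl: string): string {
  const lines: string[] = [];
  for (const agent of blockedAgents) {
    // One group per blocked crawler, separated by a blank line.
    lines.push(`User-agent: ${agent}`, "Disallow: /", "");
  }
  lines.push("User-agent: *", "Allow: /", "", `Sitemap: ${sitemapUrl}`);
  return lines.join("\n") + "\n";
}

const robots = buildRobotsTxt(
  ["GPTBot", "Google-Extended"],
  "https://redbridgenet.com/sitemap-index.xml",
);
console.log(robots);
```

Keeping the blocked list as data makes it easy to keep robots.txt in step with whatever you configure in AI Crawl Control for the same client.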
Cache purging
When you don't rebuild — purge instead
If a site grows large enough that full rebuilds get slow and you move blog routes to SSR + edge cache (the escalation path in the Astro reference), you swap "rebuild on publish" for "purge cache on publish." Cloudflare's purge API clears specific URLs.
# purge just the changed post + the blog index
curl -X POST "https://api.cloudflare.com/client/v4/zones/<ZONE_ID>/purge_cache" \
-H "Authorization: Bearer <API_TOKEN>" \
-H "Content-Type: application/json" \
-d '{"files":["https://site.com/blog/the-slug/","https://site.com/blog/"]}'
Why it matters: Purge is near-instant and surgical, where a rebuild is whole-site and minutes-long. For a static site, stick with deploy hooks; reach for purge only once you've moved a route to SSR. Don't add this complexity preemptively.
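When you do reach that point, the same request is easy to build programmatically from a post slug. This sketch only constructs the request and does not send it; `buildPurgeRequest` and `PurgeRequest` are hypothetical names, and the `/blog/<slug>/` URL shape is assumed from the curl example above.

```typescript
interface PurgeRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the purge call for one changed post plus the blog index.
function buildPurgeRequest(
  zoneId: string,
  apiToken: string,
  siteOrigin: string,
  slug: string,
): PurgeRequest {
  const origin = siteOrigin.replace(/\/+$/, ""); // tolerate a trailing slash
  const files = [`${origin}/blog/${slug}/`, `${origin}/blog/`];
  return {
    url: `https://api.cloudflare.com/client/v4/zones/${zoneId}/purge_cache`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiToken}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ files }),
  };
}

const req = buildPurgeRequest("<ZONE_ID>", "<API_TOKEN>", "https://site.com", "the-slug");
console.log(req.body);
// {"files":["https://site.com/blog/the-slug/","https://site.com/blog/"]}

// To send it: await fetch(req.url, { method: req.method, headers: req.headers, body: req.body });
```

Separating "which URLs changed" from "send the request" keeps the list of purged URLs easy to test as the set of affected routes grows (category pages, feeds, and so on).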
Headers & rules
_headers, _redirects & Rules
Cloudflare Pages reads _headers and _redirects files from the root of your build output; in an Astro project, keep them in public/ so they are copied there at build time. Use them for security headers and edge redirects without touching the dashboard. (Astro's config redirects can also emit these.)
# public/_headers
/*
  X-Content-Type-Options: nosniff
  Referrer-Policy: strict-origin-when-cross-origin
  Strict-Transport-Security: max-age=31536000; includeSubDomains
/blog/*.md
  Content-Type: text/markdown; charset=utf-8
  Cache-Control: public, max-age=3600
Why it matters: Security headers are a common audit/checklist item for SaaS clients (SOC 2-adjacent expectations). Shipping them as a versioned file in the repo is cleaner and more reviewable than dashboard clicks.
Astro vs. Cloudflare — who owns what
| Concern | Lives in |
|---|---|
| Static HTML generation | Astro (build) |
| Rebuild on content change | Cloudflare deploy hook ← WordPress |
| Markdown twin content | Astro endpoint (or Cloudflare edge transform) |
| Serving Markdown to the right UA | Cloudflare (content negotiation at edge) |
| Blocking/allowing AI crawlers | Cloudflare AI Crawl Control (enforced) |
| robots.txt / llms.txt | Astro public/ (advisory) |
| Security headers | Cloudflare _headers (in repo) |
| URL redirects (301s) | Astro config → emitted as Cloudflare _redirects |
| CDN cache / purge | Cloudflare |
| SSR / edge functions / middleware | Astro code, run on Cloudflare adapter |
Per-client launch checklist
- Create the Pages deploy hook; store its URL as a WP secret.
- Wire the WP transition_post_status trigger with the publish-only guard + a debounce transient.
- Decide AI policy with the client: training vs. answer-engine indexing → set AI Crawl Control + matching robots.txt.
- Ship llms.txt and (if selling AI-discoverability) Markdown twins for key content.
- Add _headers with security headers; enable HSTS only after HTTPS is solid.
- Confirm _redirects covers the old site's URLs (link equity).
- Verify the cache strategy: deploy-hook rebuild now; document the SSR + purge escalation path for later scale.