Architecture, the client brain & direction · team reference · 2026-05-30 · repos: ilayda-app (the GTM App) · client-portal · aicro-os
This is the shared map of what we're building: an AI engine that runs B2B outbound end-to-end (the GTM App), the client-facing product it feeds (the Client Portal), and the knowledge layer underneath it all (the client brain). It covers how we run GTM today (manually), what the app automates, how we ship it, and where it's going — a vertical GTM agent for proptech.
A note on names: the app in the ilayda-app repo is what we'll call "the GTM App" here (Ilayda calls it "Sarah"). It is not the same thing as "Cortex"; that name refers to the new multi-schema Supabase database (§13). Companion read: Doug's Step-1 / scheduler overview.
1 · How we run GTM today (manual)
2 · The three-app map
3 · The GTM App — stack
4 · How a change reaches production
5 · The pipeline the app automates (1–6)
6 · Grading & routing
7 · The tooling — data providers
8 · The client brain (Step-1 brief)
9 · What feeds the brain
10 · The scheduler (which campaign next)
11 · The system-wide learning loop
12 · The Client Portal (client-facing)
13 · Data layer, the migration & the DAL
14 · The Sarah terminal + open question
15 · Where we're going
16 · Roadmap
Before any app, this is how a GTME executes a campaign by hand today — our core operating procedure. Every campaign follows this sequence; skipping or reordering steps creates downstream problems (bad data in sending platforms, client-trust erosion, wasted enrichment credits). This 35-step playbook is exactly what the GTM App is automating — each phase below maps to a step in the app's pipeline (§5).
This manual sequence is the spec. Phase 8's monitoring (what's working, replies, deliverability) is also the raw material for the learning loop (§11): today a human reads it; the target is that the brain reads it.
AICRO's product is three applications over one shared data layer. They serve different audiences and deploy independently, and they converge on a single Supabase "client brain."
portal.aicro.co

The GTM App (the ilayda-app repo) is one repo with three deployables plus an orchestration layer. Nothing calls anything directly; every layer coordinates through Supabase.

- web/ · Next.js
- web/src/trigger/ · retries, cost predictors
- modal_app/ · grading, copy, QA, the learning loop
- sarah_worker/ · warm container running Claude Code

This is the one thing blocking the team right now. The GTM App is deployed by hand from a local machine via CLI: there's no auto-deploy from GitHub (no .github/workflows, no Git connection on Vercel). The consequence: code can be live in production that was never pushed to GitHub, so the repo doesn't reflect what's running, and no one else can review, reproduce, or build on it safely. (The Client Portal already deploys the right way, Vercel git-connected, which is the model.)
| Surface | Deployed by |
|---|---|
| Vercel (UI) | vercel --prod |
| Modal (compute) | modal deploy |
| Trigger.dev | trigger.dev deploy |
| Railway (worker) | railway up |
All from one laptop. GitHub is a delayed backup, not the source of truth.
main, deploy is automatic like the Portal

| Surface | Auto-deploy from GitHub |
|---|---|
| Vercel | Connect Repo (native) · root web/ |
| Railway | Connect Repo (native) · root sarah_worker/ |
| Modal | GitHub Action runs modal deploy on merge |
| Trigger.dev | GitHub Action (or native) on merge |
One push to main → all four redeploy. Every branch gets a preview URL.
main

You still code locally and can use the CLI for your own previews. The only change is that production ships from a merge to main, never a laptop.
The GTM App turns the 35-step manual playbook (§1) into six operator-driven steps. Each writes its output to Supabase; the UI shows progress as a stepper. ~4,000 prospects per active campaign.
campaign_prospect_rows

Grading is the heart of the operation and one of the most valuable things the app produces. Get it wrong and we write great copy to the wrong people. It runs inside Step 4 and is two LLM grades against the campaign brief's criteria, then a routing decision. Everything is graded A/B/C/D by ICP fit, with the reasoning stored per prospect.
Graded by ICP fit for the decision-makers the client is after (target function, ideal titles, decision-maker profile, excluded roles from the brief). Actually three grades per person:
icp_label (ICP1/ICP2/…) is assigned after grading for routing,
but the A/B/C/D grade is identical across ICPs. Clients chasing multiple ICPs need
different grading per ICP, all trackable — that needs a schema change (grades keyed by
ICP), not just a prompt tweak.

File data vs. LLM research: grading is hybrid. It enriches first (crawl/search the domain, scrape LinkedIn), then the LLM grades the enriched
fields against the brief. Enrichment is cached cross-client (180 days); the grade is always re-run
per campaign because the brief differs. Grades + reasoning are stored per prospect in
campaign_prospect_rows; which grades convert is exactly what the learning loop
should feed back (§11).
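One way to picture the "grades keyed by ICP" schema change is a per-prospect record that holds one grade and one reasoning string per ICP label. This is a minimal sketch under stated assumptions: `ProspectGrades`, `by_icp`, and `best_icp` are hypothetical names for illustration, not the current `campaign_prospect_rows` columns.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

GRADES = ("A", "B", "C", "D")  # strongest to weakest, as in the text


@dataclass
class ProspectGrades:
    """Hypothetical per-prospect record holding one grade per ICP."""
    prospect_id: str
    # icp_label -> (grade, reasoning). Today a single grade is shared by all ICPs.
    by_icp: Dict[str, Tuple[str, str]] = field(default_factory=dict)

    def set_grade(self, icp_label: str, grade: str, reasoning: str) -> None:
        if grade not in GRADES:
            raise ValueError(f"grade must be one of {GRADES}, got {grade!r}")
        self.by_icp[icp_label] = (grade, reasoning)

    def best_icp(self) -> Optional[str]:
        """Return the ICP label with the strongest grade, or None if ungraded."""
        if not self.by_icp:
            return None
        return min(self.by_icp, key=lambda icp: GRADES.index(self.by_icp[icp][0]))
```

With a shape like this, routing can pick the ICP where the prospect grades best, and the reasoning stays trackable per ICP rather than per prospect.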
Enrichment + grading (Step 4) is where the external data providers are called. Most of the work is domain crawl → LLM grade → LinkedIn scrape, with search/AI fallbacks when a crawl fails. Per-domain results are cached cross-client for 180 days, so the same company isn't re-crawled per campaign.
| Provider | What it gets | When it fires | ~Cost |
|---|---|---|---|
| RevenueBase (Snowflake) | Prospect list pulls (companies + people) from the lead database | Step 3 sourcing — manual today | — |
| Spider | Domain homepage + ~4 subpages crawl; LinkedIn company page | Step 4 — primary enrichment, per unique domain | $0.0006–0.012 / page |
| Serper | Google search — recover a domain or LinkedIn URL | Step 4 — fallback when Spider returns <300 chars / fails | $0.001 / query |
| Perplexity Sonar | AI company summary (HQ, specialties, LinkedIn about) | Step 4 — last-resort when Spider + Serper both fail | ~$0.001 |
| Apify (LinkedIn scraper) | LinkedIn profile: headline, positions, followers | Step 4 — for every kept A/B row with a LinkedIn URL (batches of 50) | $0.005 / URL |
| OpenRouter (gpt-4.1-mini / gpt-4o-mini / gpt-5-mini / claude) | The LLM for grading, message strategy, copy, QA | Steps 4 (grade), 4B (strategy), 5 (write), 6 (QA) | per-token |
| EmailBison | Email sending platform | Send — post-pipeline, manual; creds via Airtable | — |
| HeyReach | LinkedIn sending platform | Send — post-pipeline, manual; creds via Airtable | — |
| Airtable | CRM / client config + per-client platform credentials | Config lookup across steps; CRM source for the Portal | — |
| n8n | Workflow automation (metrics sync, follow-ups) | Daily monitoring (Phase 8) + Portal triggers | — |
Provider mix all-time (~$53 metered): OpenRouter $40 · Apify $7 · Perplexity $3 · Spider / Serper cents. (LinkUp / Apollo / Hunter / DataForSEO are in the legacy pipeline, not the current app.) The real cost is fixed subscriptions, not metered API — see §12.
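The enrichment order described above (cache first, then Spider, then Serper, then Perplexity Sonar) can be sketched as a small fallback chain. This is a simplified illustration, not the app's code: each provider is modeled as a plain function returning text, whereas the real Serper step recovers a domain or LinkedIn URL; `enrich_domain`, the `Provider` tuple shape, and the in-memory cache are all assumptions.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

CACHE_TTL_SECONDS = 180 * 24 * 3600  # 180-day cross-client cache window from the text

# (provider name, fetch function, minimum usable characters). Hypothetical shape.
Provider = Tuple[str, Callable[[str], Optional[str]], int]
# domain -> (stored_at, provider name, text). An in-memory stand-in for the real cache.
Cache = Dict[str, Tuple[float, str, str]]


def enrich_domain(domain: str, providers: List[Provider], cache: Cache,
                  now: Optional[float] = None) -> Tuple[Optional[str], Optional[str]]:
    """Return (provider, text), using the cache first and then each provider in order."""
    now = time.time() if now is None else now
    hit = cache.get(domain)
    if hit is not None and now - hit[0] < CACHE_TTL_SECONDS:
        return hit[1], hit[2]
    for name, fetch, min_chars in providers:
        try:
            text = fetch(domain)
        except Exception:
            text = None  # a provider error counts as a miss; fall through to the next
        if text is not None and len(text) >= min_chars:
            cache[domain] = (now, name, text)
            return name, text
    return None, None  # every provider missed; nothing is cached
```

A caller would pass something like `[("spider", crawl, 300), ("serper", search, 1), ("sonar", summarize, 1)]`, where the 300-character floor mirrors the Spider threshold in the table and the other two floors are placeholders.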
The brief is the document that teaches the AI how to write for a client: voice rules, ICP, proof points, disqualifiers, banned phrases. Every downstream agent reads it first, so one brief shapes thousands of emails. In Doug's Step-1 design this is framed sharply:
"Doctrine is the product, not a feature."
A good brief is the whole game. The brief carries machine-readable validators (voice checks, banned-phrase counts, service-clarity hard-fails, the §38b hard-fail) so quality is auditable, not vibes — which is also what makes the system trustworthy enough to put in front of clients.
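To make "auditable, not vibes" concrete, here is a minimal sketch of one such validator, a banned-phrase count with a hard-fail threshold. The function names, the result shape, and the zero-tolerance default are assumptions for illustration; the actual validators in the brief are not shown in this document.

```python
import re
from typing import Dict, List


def banned_phrase_counts(copy: str, banned: List[str]) -> Dict[str, int]:
    """Count case-insensitive, whole-phrase occurrences of each banned phrase."""
    counts: Dict[str, int] = {}
    for phrase in banned:
        # \b keeps "circling back" from matching inside a longer word.
        pattern = r"\b" + re.escape(phrase) + r"\b"
        counts[phrase] = len(re.findall(pattern, copy, flags=re.IGNORECASE))
    return counts


def validate_copy(copy: str, banned: List[str], max_banned: int = 0) -> Dict[str, object]:
    """Hard-fail when the total banned-phrase count exceeds max_banned (assumed 0)."""
    counts = banned_phrase_counts(copy, banned)
    total = sum(counts.values())
    return {"passed": total <= max_banned, "banned_total": total, "counts": counts}
```

Because the result carries per-phrase counts and not just a pass/fail flag, a failed email can be traced back to the exact rule it broke.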
Client context is authored as markdown in the aicro-os repo and
one-way migrated into Supabase (a manual seed: aicro-os/01_clients/{slug}/*.md
→ client_doctrine, client_proof_inventory, client_outcome_anchors,
client_service_clarity, client_briefs). At compose time, the app reads the
brief entirely from Supabase — so the read side is already DB-backed. The gaps:
client_slug with no shared ID, so a naming mismatch can
silently orphan a client's context. Coverage is still thin — only a handful of clients are fully
seeded; the rest fall back to generic copy.

aicro-os stays the GTM team's repo for authoring, queries, and dashboards; it isn't going away. The shift is that it becomes one source that feeds the brain, not the master copy the app silently derives from.
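Until a shared ID exists, the slug-mismatch risk mentioned above can at least be made visible with a consistency check between the authored folders and the seeded tables. A minimal sketch, assuming the slugs have already been listed from `aicro-os/01_clients/` and from each table; `find_orphans` and its report shape are hypothetical.

```python
from typing import Dict, Iterable, Set


def find_orphans(authored_slugs: Iterable[str],
                 slugs_by_table: Dict[str, Iterable[str]]) -> Dict[str, Dict[str, Set[str]]]:
    """Report, per table, slugs that are authored but unseeded, or seeded but unknown."""
    authored = set(authored_slugs)
    report: Dict[str, Dict[str, Set[str]]] = {}
    for table, slugs in slugs_by_table.items():
        seeded = set(slugs)
        missing = authored - seeded  # authored but never seeded: falls back to generic copy
        unknown = seeded - authored  # seeded under a slug no folder matches: likely a mismatch
        if missing or unknown:
            report[table] = {"missing": missing, "unknown": unknown}
    return report
```

A pair like `beta-co` (missing) and `beta_co` (unknown) in the same table is the typical signature of a naming mismatch rather than a client that was never seeded.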
Today, "what we know about a client" is assembled manually into aicro-os:
a client fills an onboarding form, and we turn it into research reports, ICP profiles, and
profile.md / icp.md / client_summary_brief.md files — which then
seed Supabase. That's a thin, one-time, human-built slice of the truth.
The target is a living client brain fed by every signal we have about a client, with the app doing ingestion + synthesis instead of a person assembling markdown:
profile.md

One canonical client brain in Supabase (versioned, multi-source, always current) that the app writes to via ingest workers + the learning loop, and that everything downstream reads from. The brief stops being a static file and becomes a live profile of everything we know about the client.
Once we know a client (Step 1), the next question is which specific campaign to run for them this week (Step 2). That's the scheduler — Josh's product, today an internal tool in the GTME dashboard, designed to live eventually in the Portal. It's the propose-then-promote surface for what to build next.
internal.aicro.co/gtme.

POST /scheduler/accept → creates the Airtable Campaign Creation Request that kicks off the brief-writing work.

/gtme propose-then-promote surface, not a parallel system.

"The moat is the loop, not the model."
This is bigger than the GTM builder app. The brain only gets smarter if every signal about what's actually working flows back into client context — and most of those signals originate outside the builder (in the campaigns, the inboxes, the scheduler, and the Client Portal). The loop spans all three apps and lands in one place: the client brain.
Every one of these is a vote on what the brain believes about a client. Today they're scattered (some in the app, some in the Portal, some in Airtable/CRM, some only in a person's head). The target: they all write back into client context, so the next campaign is composed from a brain that already learned from the last one. Critically, several of these — bad-ICP, client edits — we are not yet feeding back; we need to be.
The app already has the feedback engine that does part of this (the "Sarah" brain,
modal_app/sarah/) — it watches post-send telemetry and improves prompts. Much is
live in code; the cross-app pieces (bad-ICP feedback, client edits, full CRM) and the
brief-level write-back are the gaps. Honest status:
LIVE: in code · PARTIAL: wired but a step is stubbed · DESIGN: in Doug's spec, not yet code
| Stage | What it does | Where | Status |
|---|---|---|---|
| Reflect | Scans telemetry for anomalies; an LLM produces root-cause + doctrine_update_proposals | sarah/reflect.py → brain_reflections | LIVE |
| Classify replies | Categorizes inbound replies + emits learning signals | sarah/reply_classifier.py | LIVE |
| Score / critique copy | Self-critique scores artifacts; generates prompt variants | self_critique.py · variant_generator.py | LIVE |
| A/B promote | z-test on variant performance; auto-promotes the winner into the live prompt | variant_promoter.py | LIVE |
| Propose → promote (briefs) | Proposals surface for a human to review + approve doctrine changes | /sarah/reflections + …/apply | PARTIAL — apply is a stub |
| Bad-ICP / client edits / CRM → brain | Pull Portal + CRM feedback into client context | cross-app wiring | TO BUILD |
So the copy-level loop genuinely runs in code (reply → reflect → critique → variant → promote). The gaps are the cross-app, brief-level ones: pulling bad-ICP feedback + client edits (Portal) and closed-won (CRM) into client context, and making "apply this proposed change" actually write back to the brain. That's the next build — wiring existing signals into one brain, not from scratch.
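For readers unfamiliar with the A/B promote stage, the following is a sketch of a standard pooled two-proportion z-test, the usual form of a z-test on variant performance. It is not the code in `variant_promoter.py`: the metric (positive replies per send), the one-sided threshold of 1.645, and the minimum sample size are all assumptions.

```python
import math
from typing import Tuple

Arm = Tuple[int, int]  # (successes, sends), e.g. positive replies out of emails sent


def two_proportion_z(control: Arm, variant: Arm) -> float:
    """Pooled two-proportion z statistic; positive when the variant rate is higher."""
    (x_c, n_c), (x_v, n_v) = control, variant
    p_c, p_v = x_c / n_c, x_v / n_v
    pooled = (x_c + x_v) / (n_c + n_v)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_c + 1.0 / n_v))
    return 0.0 if se == 0.0 else (p_v - p_c) / se


def should_promote(control: Arm, variant: Arm,
                   z_threshold: float = 1.645, min_sends: int = 200) -> bool:
    """Promote only when both arms have enough sends and z clears the threshold."""
    if control[1] < min_sends or variant[1] < min_sends:
        return False
    return two_proportion_z(control, variant) > z_threshold
```

The `min_sends` guard matters as much as the threshold: with small arms, a few lucky replies can produce a large z and promote a variant that is not actually better.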
The Portal (client-portal repo) is the surface clients and the team log into. It turns the
raw campaign/reply data into dashboards and action surfaces. This is where the work becomes
client-facing — and increasingly, where the app's internal surfaces (scheduler, client brain)
will land.
The Portal is where the client brain becomes visible and editable. Instead of
"what we know about a client" living in profile.md files only we can see, the Portal
surfaces it as a "what we know about you" view:
Client input becomes another high-quality source feeding the brain — and the transparency ("here's exactly what the agent knows and why") is itself a selling point. The scheduler likely lands here too as it graduates from the internal GTME dashboard.
Built on Next.js + Supabase + Airtable + n8n; Vercel git-connected;
multi-tenant (workspaces/workspace_members + RLS, clients see only their own).
The seam to the GTM App is Supabase — as Supabase becomes the canonical brain, the
Portal reads more from it and less from Airtable.
Supabase is the real center of gravity, and it's mid-reorganization. Three things matter here.
Database calls are inline everywhere: ~519 .from()
calls across 153 files in the web app (≈69 distinct tables), plus per-file REST
helpers across the Modal workers. No single place owns table names.
The new "Cortex" Supabase project reorganizes the flat ~177-table database into
domain schemas (agent, knowledge, intelligence,
events, monitor) and renames tables (campaign_runs →
agent.runs). Actively migrating but referenced nowhere in app code yet.
Without an access layer, the cutover is a 150+-file, error-prone sweep. Introduce a data-access layer once (table names + schema in one module) and both the cutover and the client-brain work become a one-place change. This is the unlock.
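On the Modal workers side, the "table names + schema in one module" idea could look like the registry below. This is a sketch under assumptions: `TableRegistry` and the `use_cortex` flag are hypothetical, the only rename taken from this document is `campaign_runs` → `agent.runs`, and placing the flat tables in the `public` schema is assumed.

```python
from typing import Dict, Tuple

# Logical name -> (schema, table) in the current flat database (schema assumed "public").
LEGACY: Dict[str, Tuple[str, str]] = {
    "campaign_runs": ("public", "campaign_runs"),
    "campaign_prospect_rows": ("public", "campaign_prospect_rows"),
}
# Overrides for tables that have moved. Only this one rename is from the text.
CORTEX: Dict[str, Tuple[str, str]] = {
    "campaign_runs": ("agent", "runs"),
}


class TableRegistry:
    """Single owner of physical table names; callers only use logical names."""

    def __init__(self, use_cortex: bool = False) -> None:
        self.use_cortex = use_cortex

    def resolve(self, logical: str) -> Tuple[str, str]:
        if self.use_cortex and logical in CORTEX:
            return CORTEX[logical]
        if logical not in LEGACY:
            raise KeyError(f"unknown table {logical!r}; register it in the registry first")
        return LEGACY[logical]

    def qualified(self, logical: str) -> str:
        schema, table = self.resolve(logical)
        return f"{schema}.{table}"
```

With call sites going through `resolve`, the cutover becomes a flag flip plus new `CORTEX` entries, and tables not yet migrated keep resolving to their legacy names.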
web/src/lib/data/* for TS; a centralized REST module for the Modal workers).

P1b: make Supabase canonical with write-through editing; collapse the 3-copies brief problem; add a stable client_id (keep slug as a handle).

P1c: inject the DB copy rules into the prompt so editing the brain actually changes output.

P2: repoint to the new schema through the DAL, regression-test every step.

Note: the GTM App and the Portal currently point at different Supabase configs; part of this work is deciding the canonical project and aligning both.
The app's /sarah pages are browser windows into a warm Railway container that spawns
the Claude Code CLI in a tmux session to run an agent step, streamed back live. It
exists to dodge two real constraints: serverless cold starts, and Anthropic API rate limits —
authenticating with a Claude subscription instead of the per-token API sidesteps
both at a flat rate.
It's a smart workaround and also the most bespoke, hardest-to-own piece — a hand-built terminal handshake on a container that depends on a subscription token. The open question: should the steps run through an interactive terminal at all, or a headless job queue? The queue trades flat-rate billing + live-watching for something far easier to own, scale, and sell to clients. Worth deciding deliberately as we productize.
sarah/crm_sync.py, reply_classifier.py, performance_aggregator.py).

profile.md.

/gtme surface the scheduler uses.

From Doug's Step-1 design, the shape we're converging on:
main so GitHub == live; connect Vercel + Railway (native); add a GitHub Action for Modal + Trigger.dev.

Shared team reference. The manual order-of-operations is AICRO's current GTME playbook; pipeline, grading, deploy, scheduler, and data-layer facts reflect the current ilayda-app and client-portal repos plus a live inspection on 2026-05-30. Counts (client coverage, cost, tables, call-sites) move as we build. Items marked DESIGN / TO BUILD are from Doug's Step-1 overview and not yet in code. Questions or corrections → Josh.