The AICRO GTM App

Architecture, the client brain & direction · team reference · 2026-06-09 (corrected; first published 2026-05-30) · repos ilayda-app (the GTM App) · client-portal · aicro-os

This is the shared map of what we're building: an AI engine that runs B2B outbound end-to-end (the GTM App), the client-facing product it feeds (the Client Portal), and the knowledge layer underneath it all (the client brain). It covers how we run GTM today (manually), what the app automates, how we ship it, and where it's going — a vertical GTM agent for proptech.

A note on names: the app in the ilayda-app repo is what we'll call "the GTM App" here (Ilayda calls it "Sarah"). It is not the same thing as "Cortex," which refers to the new multi-schema Supabase database (§13). Heads-up: "Sarah" gets used for three different things in this doc: the GTM App itself (Ilayda's name for it), the Sarah worker and its terminal (§14, the Railway container that runs Claude Code), and the Sarah learning loop (§11, modal_app/sarah/). Same name, three layers. Companion read: Doug's Step-1 / scheduler overview.

What this doc aligns the team on:

How we run GTM today — the 35-step manual Critical Order of Operations. This is the core operation the GTM App is automating, step by step.
The three apps & how they connect — the GTM App (the engine), the Client Portal (client-facing), and aicro-os (the GTM team's knowledge repo), all over one Supabase.
How we ship — moving the GTM App from hand-deploys to GitHub as the single source of truth, the way the Portal already works, so anyone can contribute.
The product engine — the campaign pipeline, grading, the client brain, the scheduler, and a system-wide learning loop where everything that's working flows back into client context. "The moat is the loop, not the model."
The data direction — Supabase becomes the canonical client brain via a data-access layer + the new multi-schema Cortex database; surfaces converge into the Portal, some client-facing.

1 · How we run GTM today (manual)
2 · The three-app map
3 · The GTM App — stack
4 · How a change reaches production
5 · The pipeline the app automates (1–6)
6 · Grading & routing
7 · The tooling — data providers
8 · The client brain (Step-1 brief)
9 · What feeds the brain
10 · The scheduler (which campaign next)
11 · The system-wide learning loop
12 · The Client Portal (client-facing)
13 · Data layer, the migration & the DAL
14 · The Sarah terminal + open question
15 · Where we're going
16 · Roadmap

1 · How we run GTM today — the Critical Order of Operations (the core operation)

Before any app, this is how a GTME executes a campaign by hand today — our core operating procedure. Every campaign follows this sequence; skipping or reordering steps creates downstream problems (bad data in sending platforms, client-trust erosion, wasted enrichment credits). This 35-step playbook is exactly what the GTM App is automating — each phase below maps to a step in the app's pipeline (§5).

Phase 1 Client Context & Campaign Brief

ICP definition + client context — target audience, verticals, titles, signals, exclusions; grounded in client context.
Campaign brief (external) — the client-facing doc that guides targeting. No separate internal brief; execution notes live on the Airtable task.
Copy & custom variables — 3-step email sequence + A/B variants; LinkedIn connect → msg 1 → msg 2; custom variables defined here.
Client approval — client signs off on targeting + copy. The only client approval checkpoint.

Phase 2 List Building — Company Level

RevenueBase filters — translate ICP into filter criteria.
Search & download — multiple lists, segmented; the critical-thinking step.
Dedupe companies — remove duplicate domains across lists.
Domain enrichment — verify company websites (needed before blocklist).
Blocklist check (domain) — check domains against DNC.
Company grading — grade A/B/C/D by ICP fit (employee count, revenue, vertical fit).
Clean company name — normalize names if used in copy.
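The dedupe step above comes down to normalizing each domain and keeping the first row per domain across all downloaded lists. A minimal Python sketch, assuming each company row is a dict with a `domain` key (a hypothetical shape; the real export may name the column differently) and that stripping the scheme, `www.`, port, and path is an acceptable normalization:

```python
from urllib.parse import urlparse


def normalize_domain(raw: str) -> str:
    """Reduce a URL or bare domain to a lowercase host without 'www.'."""
    value = raw.strip().lower()
    # urlparse only fills .netloc when a scheme is present, so add one if missing.
    if "://" not in value:
        value = "http://" + value
    host = urlparse(value).netloc
    host = host.split(":")[0]  # drop a port if one is present
    if host.startswith("www."):
        host = host[4:]
    return host


def dedupe_companies(lists: list[list[dict]]) -> list[dict]:
    """Merge several company lists, keeping the first row seen per domain."""
    seen: set[str] = set()
    unique: list[dict] = []
    for rows in lists:
        for row in rows:
            domain = normalize_domain(row["domain"])
            if not domain or domain in seen:
                continue  # blank or already-seen domain
            seen.add(domain)
            unique.append({**row, "domain": domain})
    return unique
```

Keeping the first occurrence means list order decides which row wins, so the most trusted list should be passed first.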

Phase 3 People Enrichment

Pull people from ICP-confirmed companies — top 3 per company (cluster by geo for big companies).
Clean first name.
Blocklist check (email + LinkedIn).
Recent-outreach check (90 days).
People grading — A/B/C/D by ICP fit for the decision-makers the client is after (title tier, authority, company size). Routing: A+B → Email, A only → LinkedIn.
Enrich custom attributes from the copy.

Phase 4 Email Path

Find + validate email (re-find on bounce).
Blocklist check (new emails).
ESP provider check (optional) — exclude/segment by ESP.
Upload Grade A+B to EmailBison.

Phase 5 LinkedIn Path

LinkedIn activity check (180 days).
Upload Grade A + active to HeyReach.

Phase 6 Internal QA & Campaign Configuration

Copy review — verify brief copy maps to platform; test emails + merge vars.
Targeting review — leads align with ICP + brief.
Custom-variable check — all mapped + populated.
Blocklist compliance — final check across every stage.
Platform configuration — schedules, daily limits, sender accounts, signatures.

Phase 7 Launch

Link campaigns in Airtable (store Clay table, brief, notes).
Launch notification → client Slack channel.
Activate campaigns in EmailBison + HeyReach.
Close out the Airtable task with all artifacts.

Phase 8 Monitoring & Iteration

Daily — n8n syncs metrics to Airtable; monitor bounce (<5%), reply, deliverability.
Weekly — positive-reply rate vs. benchmarks (Email ≥25% PRR, LinkedIn ≥40% PRR); iterate copy + targeting.
Ongoing — monitor replies; pause/kill underperformers.

Hard rules (never skip): no list pull without approved ICP + brief · no blocklist check without domain enrichment first · no people enrichment without the company-grading gate · no platform upload without blocklist clearance at every stage · no campaign live without the Airtable task storing full context · no launch without the client Slack notification · no copy ships without stop-slop QA.

This manual sequence is the spec. Phase 8's monitoring (what's working, replies, deliverability) is also the raw material for the learning loop (§11). Today a human reads it; the target is that the brain reads it.

How the 35 manual steps map to the 6 app steps

Manual phase (§1)	App step (§5)
Phase 1, client context, brief, copy	Steps 1 (Brief), 2 (Campaign), 5 (Write)
Phases 2 and 3, company + people list build	Step 3 (import) + Step 4A (grade & route)
Phase 6, internal QA	Step 4A routing + Step 6 (QA)
Phase 7, launch	Outside the app (EmailBison / HeyReach upload, still manual)
Phase 8, monitor + iterate	Outside the app (n8n metrics sync, the learning loop §11)

2 · The three-app map

AICRO's product is three applications over one shared data layer. They serve different audiences and deploy independently, and all three converge on a single Supabase "client brain."

⬢ The GTM App repo: ilayda-app

The AI engine (internal) · aka "Sarah"

Automates the 35-step playbook above: grades prospects, writes copy in each client's voice, runs quality gates, learns from replies. Next.js + Modal + Trigger.dev + a Railway worker, on Supabase.

▲ Client Portal repo: client-portal

The client-facing product · portal.aicro.co

Where clients (and our team) see campaign performance, replies, follow-ups, onboarding — and, increasingly, the client brain. Next.js + Supabase + Airtable + n8n. Multi-tenant. Live.

◳ aicro-os repo: aicro-os

The GTM team's knowledge repo + a local run surface

Where the GTM team (Josh, Doug, David) authors client context (ICP, profiles), runs queries, and builds dashboards. Today it's the upstream source that seeds the client brain. It is also where the team runs the campaign pipeline locally via Claude Code (the campaign-builder skill), so it is an execution surface too, distinct from the app's own Sarah-worker runs.

◳ aicro-os (authored context) → seeds → ⛁ Supabase (the client brain)
⬢ GTM App (runs · grades · replies · learning) → writes → ⛁ Supabase
⛁ Supabase → read by → ▲ Client Portal → shows → clients & team

All three read/write one Supabase — the convergence point.

3 · The GTM App — stack

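The GTM App (the ilayda-app repo) is one repo with three deployables plus an orchestration layer. The layers share state through Supabase rather than handing results to each other: each step writes its output there and the next one reads it. The direct hops are triggers only (the API route starts a Trigger.dev task, which POSTs to Modal; see §5).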

▲ Vercel

The UI / dashboard

web/ · Next.js

◆ Trigger.dev

Job orchestration

web/src/trigger/ · retries, cost predictors

⬢ Modal

Heavy compute

modal_app/ · grading, copy, QA, the learning loop

▶ Railway

"Sarah" worker + terminal

sarah_worker/ · warm container running Claude Code

4 · How a change reaches production (the immediate fix)

This is the one thing blocking the team right now. The GTM App is deployed by hand from a local machine via CLI — there's no auto-deploy from GitHub (no .github/workflows, no Git connection on Vercel). The consequence: code can be live in production that was never pushed to GitHub, so the repo doesn't reflect what's running, and no one else can review, reproduce, or build on it safely. (The Client Portal already deploys the right way — Vercel git-connected — which is the model.)

Today — manual CLI (drifts)

Surface	Deployed by
Vercel (UI)	`vercel --prod`
Modal (compute)	`modal deploy`
Trigger.dev	`trigger.dev deploy`
Railway (worker)	`railway up`

All from one laptop. GitHub is a delayed backup, not the source of truth.

Target — push to `main`, deploy is automatic (like the Portal)

Surface	Auto-deploy from GitHub
Vercel	Connect Repo (native) · root `web/`
Railway	Connect Repo (native) · root `sarah_worker/`
Modal	GitHub Action runs `modal deploy` on merge
Trigger.dev	GitHub Action (or native) on merge

One push to main → all four redeploy. Every branch gets a preview URL.

code locally (your terminal, unchanged) → push branch, open PR (auto preview URL · safe) → review + merge to main → Vercel · Railway · Modal · Trigger redeploy automatically (no manual step)

You still code locally and can use the CLI for your own previews — the only change is that production ships from a merge to main, never a laptop.

5 · The pipeline the app automates — Steps 1–6

The GTM App turns the 35-step manual playbook (§1) into six operator-driven steps. Each writes its output to Supabase; the UI shows progress as a stepper. ~4,000 prospects per active campaign.

Step 1 · Brief: the client's strategic brief (voice, ICP, value prop, proof, exclusions). Manual Phase 1. The foundation every later step reads from. See §8.

Step 2 · Campaign: the specific campaign brief (offer, angle, targeting). Manual Phase 1. "Which campaign this week" is the scheduler's job (§10).

Step 3 · List import (company + people): bind an already-built list (a Google Sheet or CSV upload), then clean, classify columns, dedupe, enrich, and blocklist at both the company and people level. Sourcing the list itself (RevenueBase pulls) still happens manually, upstream of the app. Manual Phases 2 and 3.

Step 4A · Grade & route: grade company (A/B/C/D) and person (A/B/C/D) by ICP fit, then route A+B → Email, A → LinkedIn. Runs via Trigger.dev → Modal. Manual Phases 2, 3, 6. See §6; grading is the highest-leverage step.

Step 4B · Writing strategy: compile the per-segment message strategy and templates (hooks, proof, CTA, hesitations) from the briefs plus the 4A snapshot. Runs as a Claude Code agent on the Sarah worker (§14). This is where the standard build lands today.

Step 5 · Write: compose the email/LinkedIn copy in the client's voice from the brief's hooks, proof, CTA. Manual Phase 1 copy, applied per prospect.

Step 6 · QA: run quality gates over the copy before export to EmailBison/HeyReach. Manual Phase 6.

What "Run Step 5" actually does

Browser → Vercel UI → Next.js API route (api/campaigns/[id]/step5) → Trigger.dev task (cost predict · poll · retry) → HTTP POST → Modal (aicro-pipeline; fans out to N containers · calls the LLM) → writes campaign_prospect_rows

The browser reads Supabase realtime for live progress.

Reading run state, a few gotchas (operator note)

The campaign_runs.status field is not a reliable "is this step done" signal. The real source of truth is whether the step's output exists in campaign_step_outputs or the relevant columns on campaign_prospect_rows are populated.
Step-1 completion is tracked on the client_step1_complete flag, not on a per-run output. A run with no Step-1 output is not necessarily missing Step 1; check the flag.
Duplicate and test runs accumulate per campaign (the same campaign can show many run IDs). There is no automatic cleanup yet, so filter to the real run when reading the data.
step1-backfill runs were a one-time retroactive seed of Step-1 briefs for existing clients, not part of the normal pipeline.
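The first two gotchas above can be sketched as a small check that looks at outputs instead of `campaign_runs.status`. This is an illustrative Python sketch over plain dicts, not the app's code: the `run_id` and `step` keys, the `row_columns` parameter, and the idea that the Step-1 flag is read from a client record are all assumptions made for the example.

```python
def is_step_done(step, run_id, step_outputs, prospect_rows, row_columns=()):
    """Return True when the step's output exists for this run.

    Deliberately ignores campaign_runs.status, which is not a reliable signal.
    """
    # 1) An output record for this run and step (rows of campaign_step_outputs).
    if any(o.get("run_id") == run_id and o.get("step") == step for o in step_outputs):
        return True
    # 2) Otherwise, the step's columns populated on every prospect row of the run
    #    (rows of campaign_prospect_rows).
    rows = [r for r in prospect_rows if r.get("run_id") == run_id]
    if not rows or not row_columns:
        return False
    return all(r.get(col) is not None for r in rows for col in row_columns)


def is_step1_done(client):
    """Step 1 is tracked on the client_step1_complete flag, not on a per-run output."""
    return bool(client.get("client_step1_complete"))
```

Because duplicate and test runs accumulate, the caller still has to pass the real run's ID; the sketch does not try to guess it.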

6 · Grading & routing (the highest-leverage step)

Grading is the heart of the operation and one of the most valuable things the app produces. Get it wrong and we write great copy to the wrong people. It runs inside Step 4A and is two LLM grades against the campaign brief's criteria, then a routing decision. Everything is graded A/B/C/D by ICP fit, with the reasoning stored per prospect.

Company grade — A/B/C/D (step 07)

A = ideal ICP (structural fit + a positive signal) · B = good fit · C = marginal/adjacent · D = hard disqualifier.
Graded by an LLM from crawled site content (about, services, industry, B2B/B2C, size signal, social proof) against the brief's company criteria + disqualifiers.
Gate: D → skipped; only A/B/C companies get people pulled (a hard rule).

Person grade — A/B/C/D (step 08)

Graded by ICP fit for the decision-makers the client is after (target function, ideal titles, decision-maker profile, excluded roles from the brief). Actually three grades per person:

Person grade (overall fit) · Authority grade (buying power) · Operational-relevance grade (closeness to the problem).
Gate: D → skipped.

How the grade becomes a channel

Company A/B/C/D (step 07 · D = skip) → Person + Authority + Op-Relevance (step 08 · D = skip) → Route (step 11 · channel + route code)

Clean data + authority decide the channel:

Email: domain clean
LinkedIn: LI clean + authority A/B
Both: domain + LI + authority A/B
None: no clean path
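The channel decision above fits in one small function. A hedged Python sketch: the function name, argument names, and return strings are hypothetical, it assumes the person-level skip is driven by the overall person grade, and it does not produce the route code that step 11 also assigns.

```python
def route(company_grade, person_grade, authority_grade, domain_clean, linkedin_clean):
    """Map grades and data cleanliness to a channel: skip, both, email, linkedin, none."""
    # D at either gate drops the prospect before routing.
    if company_grade == "D" or person_grade == "D":
        return "skip"
    # LinkedIn needs a clean LinkedIn path and authority A or B.
    linkedin_ok = linkedin_clean and authority_grade in ("A", "B")
    if domain_clean and linkedin_ok:
        return "both"
    if domain_clean:
        return "email"
    if linkedin_ok:
        return "linkedin"
    return "none"
```

For example, a B-grade person with authority C, a clean domain, and a clean LinkedIn URL routes to email only, because the LinkedIn path requires authority A or B.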

Two things to fix — and they matter a lot

Multi-ICP grading isn't supported yet (gap). A prospect gets one company grade and one person grade, evaluated against the campaign's whole ICP union. An icp_label (ICP1/ICP2/…) is assigned after grading for routing, but the A/B/C/D grade is identical across ICPs. Clients chasing multiple ICPs need different grading per ICP, all trackable; that needs a schema change (grades keyed by ICP), not just a prompt tweak.
Criteria live only in the brief. Grading reads the campaign brief's Section 5 (company) + Section 4 (people) JSON from Supabase — there's no per-client grading config table and no operator UI, so tuning grading = editing the brief. No Clay benchmark yet.
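To make "grades keyed by ICP" concrete, here is one possible shape, sketched in Python rather than as a migration. Everything in it is hypothetical: the `IcpGrade` record, its field names, and the tie-break used by `best_icp` are illustrations of the idea, not a proposed schema.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class IcpGrade:
    prospect_id: str
    icp_label: str       # e.g. "ICP1", "ICP2"
    company_grade: str   # "A".."D"
    person_grade: str    # "A".."D"
    reasoning: str


def index_grades(grades: List[IcpGrade]) -> Dict[Tuple[str, str], IcpGrade]:
    """Key every grade by (prospect_id, icp_label) so each ICP is tracked separately."""
    return {(g.prospect_id, g.icp_label): g for g in grades}


def best_icp(prospect_id: str, grades: List[IcpGrade]) -> Optional[IcpGrade]:
    """Pick the ICP under which this prospect grades best (illustrative tie-break)."""
    candidates = [g for g in grades if g.prospect_id == prospect_id]
    if not candidates:
        return None
    # "A" < "B" < "C" < "D" as strings, so min() prefers the better grade.
    return min(candidates, key=lambda g: (g.person_grade, g.company_grade))
```

The point of the composite key is that the same prospect can be an A for one ICP and a C for another, and both results stay queryable.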

File data vs. LLM research: grading is hybrid — it enriches first (crawl/search the domain, scrape LinkedIn), then the LLM grades the enriched fields against the brief. Enrichment is cached cross-client (180 days); the grade is always re-run per campaign because the brief differs. Grades + reasoning are stored per prospect in campaign_prospect_rows; which grades convert is exactly what the learning loop should feed back (§11).
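The split described above (enrichment cached for 180 days, grade always re-run) can be sketched as follows. This is an in-memory Python illustration under stated assumptions: `EnrichmentCache`, `enrich_then_grade`, and the injected `enrich` / `grade` callables are hypothetical names, and the real cache lives in the database rather than in a dict.

```python
from datetime import datetime, timedelta, timezone

CACHE_TTL = timedelta(days=180)


class EnrichmentCache:
    """In-memory stand-in for the cross-client enrichment cache, keyed by domain."""

    def __init__(self):
        self._entries = {}  # domain -> (stored_at, fields)

    def get(self, domain, now):
        entry = self._entries.get(domain)
        if entry is None:
            return None
        stored_at, fields = entry
        if now - stored_at > CACHE_TTL:
            return None  # older than 180 days: treat as a miss
        return fields

    def put(self, domain, fields, now):
        self._entries[domain] = (now, fields)


def enrich_then_grade(domain, brief, cache, enrich, grade, now=None):
    """Reuse cached enrichment when fresh; always re-run the grade for this brief."""
    now = now or datetime.now(timezone.utc)
    fields = cache.get(domain, now)
    if fields is None:
        fields = enrich(domain)
        cache.put(domain, fields, now)
    return grade(fields, brief)
```

Two campaigns hitting the same domain within the window pay for enrichment once but are graded twice, once against each brief.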

7 · The tooling — data providers & when each fires (enrichment stack)

Enrichment + grading (Step 4) is where the external data providers are called. Most of the work is domain crawl → LLM grade → LinkedIn scrape, with search/AI fallbacks when a crawl fails. Per-domain results are cached cross-client for 180 days, so the same company isn't re-crawled per campaign.

Provider	What it gets	When it fires	~Cost
RevenueBase (Snowflake)	Prospect list pulls (companies + people) from the lead database	Step 3 sourcing — manual today	—
Spider	Domain homepage + ~4 subpages crawl; LinkedIn company page	Step 4 — primary enrichment, per unique domain	$0.0006–0.012 / page
Serper	Google search — recover a domain or LinkedIn URL	Step 4 — fallback when Spider returns <300 chars / fails	$0.001 / query
Perplexity Sonar	AI company summary (HQ, specialties, LinkedIn about)	Step 4 — last-resort when Spider + Serper both fail	~$0.001
Apify (LinkedIn scraper)	LinkedIn profile: headline, positions, followers	Step 4 — for every kept A/B row with a LinkedIn URL (batches of 50)	$0.005 / URL
OpenRouter (gpt-4.1-mini / gpt-4o-mini / gpt-5-mini / claude)	The LLM for grading, message strategy, copy, QA	Steps 4 (grade), 4B (strategy), 5 (write), 6 (QA)	per-token
EmailBison	Email sending platform	Send — post-pipeline, manual; creds via Airtable	—
HeyReach	LinkedIn sending platform	Send — post-pipeline, manual; creds via Airtable	—
Airtable	CRM / client config + per-client platform credentials	Config lookup across steps; CRM source for the Portal	—
n8n	Workflow automation (metrics sync, follow-ups)	Daily monitoring (Phase 8) + Portal triggers	—

Provider mix all-time as of 2026-05-30 (~$53 metered): OpenRouter $40 · Apify $7 · Perplexity $3 · Spider / Serper cents. (LinkUp / Apollo / Hunter / DataForSEO are in the legacy pipeline, not the current app.) The real cost is fixed subscriptions, not metered API; see §14.

When each provider fires inside Step 4 (enrichment → grade)

prospect row (domain / name / LinkedIn) → Spider crawl (homepage + subpages) → if fails → Serper (search recovery) → if fails → Sonar (AI summary)
↓ enriched fields (cached 180d per domain)
OpenRouter LLM (company grade A/B/C/D) → A/B/C → Apify (LinkedIn profile) → OpenRouter LLM (person grade A/B/C/D)

D-grades drop out at each gate. Survivors get routed (§6) and written (Step 5).
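The Spider → Serper → Sonar order above is a plain fallback chain. A minimal Python sketch, assuming each provider is wrapped as a callable that takes a domain and returns text or `None` (hypothetical wrappers, not the vendors' SDKs). It simplifies the Serper step: whatever the real pipeline does with a recovered URL is not modelled here.

```python
MIN_CRAWL_CHARS = 300  # below this, a Spider result counts as a failed crawl


def _try(provider, domain):
    """Call a provider, treating any exception as 'no result'."""
    try:
        return provider(domain)
    except Exception:
        return None


def enrich_company(domain, spider, serper, sonar):
    """Walk the fallback chain and report which provider produced the text."""
    text = _try(spider, domain)
    if text and len(text) >= MIN_CRAWL_CHARS:
        return {"source": "spider", "text": text}
    text = _try(serper, domain)
    if text:
        return {"source": "serper", "text": text}
    text = _try(sonar, domain)
    if text:
        return {"source": "sonar", "text": text}
    return {"source": None, "text": ""}
```

Recording the `source` alongside the text makes it possible to see later how often the cheaper primary path succeeded.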

The app is trying to do a lot in two steps. "List import" and "Grade & route" each pack in many of the 35 manual sub-steps (source → dedupe → domain-enrich → blocklist → company-grade → people-pull → people-enrich → people-grade → route). It is worth decomposing these into finer, individually re-runnable sub-steps in the UI so an operator can see, retry, and trust each one, rather than two big opaque steps.

8 · The client brain — the Step-1 brief (the foundation)

The brief is the document that teaches the AI how to write for a client: voice rules, ICP, proof points, disqualifiers, banned phrases. Every downstream agent reads it first, so one brief shapes thousands of emails. In Doug's Step-1 design this is framed sharply:

"Doctrine is the product, not a feature."

A good brief is the whole game. The brief carries machine-readable validators (voice checks, banned-phrase counts, service-clarity hard-fails, and the Step-1 hard-fail that blocks output when a client's Voice & Identity and outcome translation are both missing) so quality is auditable, not vibes, which is also what makes the system trustworthy enough to put in front of clients.
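Two of the validators named above are simple enough to sketch. The Python below is illustrative only: the function names and the `voice_identity` / `outcome_translation` keys are hypothetical stand-ins for however the brief actually stores those sections.

```python
import re


def count_banned_phrases(copy, banned):
    """Count case-insensitive occurrences of each banned phrase found in the copy."""
    lowered = copy.lower()
    counts = {}
    for phrase in banned:
        hits = len(re.findall(re.escape(phrase.lower()), lowered))
        if hits:
            counts[phrase] = hits
    return counts


def step1_hard_fail(brief):
    """True when Voice & Identity and outcome translation are both missing."""
    return not brief.get("voice_identity") and not brief.get("outcome_translation")
```

A gate built on these would block output when `step1_hard_fail` is true and report the banned-phrase counts otherwise, which is what makes the check auditable rather than a judgment call.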

Where the brief lives today — and the catch

Client context is authored as markdown in the aicro-os repo and one-way migrated into Supabase (a manual seed: aicro-os/01_clients/{slug}/*.md → client_doctrine, client_proof_inventory, client_outcome_anchors, client_service_clarity, client_briefs). At compose time, the app reads the brief entirely from Supabase — so the read side is already DB-backed. The gaps:

The same brief effectively lives in three unreconciled places (the markdown, a per-campaign step output the UI edits, and a brief record).
UI edits don't flow back to the markdown, and re-running the seed can overwrite them.
The global copy rules are currently hard-coded in the compose prompt, so editing rules in the database doesn't yet change output (a known fix on the roadmap).
Records join on a text client_slug with no shared ID, so a naming mismatch can silently orphan a client's context.
Coverage is still thin: only a handful of clients are fully seeded; the rest fall back to generic copy.

aicro-os stays the GTM team's repo for authoring, queries, and dashboards — it isn't going away. The shift is that it becomes one source that feeds the brain, not the master copy the app silently derives from.

9 · What feeds the brain — today vs. where it's going

Today, "what we know about a client" is assembled manually into aicro-os: a client fills an onboarding form, and we turn it into research reports, ICP profiles, and profile.md / icp.md / client_summary_brief.md files — which then seed Supabase. That's a thin, one-time, human-built slice of the truth.

The target is a living client brain fed by every signal we have about a client, with the app doing ingestion + synthesis instead of a person assembling markdown:

Sources we create / ingest today

Onboarding form / intake
Hand-built research reports, ICP profiles, profile.md
Grading (company + person A/B/C/D by ICP fit) — we generate this
Reply outcomes (interested / not)
Campaign & copy performance

Sources we're adding

Sales / client calls — the calls our clients run with their prospects (transcripts → voice, objections, proof, won-language)
Client CRM data — deals, stages, closed-won, account notes
Bad-ICP flags — client feedback via the Portal (see §11)
Slack / team signals · market shifts · ongoing research refreshes

What it becomes

One canonical client brain in Supabase — versioned, multi-source, always current — that the app writes to via ingest workers + the learning loop, and that everything downstream reads from. The brief stops being a static file and becomes a live profile of everything we know about the client.

sales / client calls · client CRM · onboarding + research · grading · replies · copy performance · bad-ICP feedback
↓ GTM App ingest workers + learning loop (synthesize, not hand-assemble)
⛁ Supabase, the client brain (one living, versioned profile per client)
↓
GTM App agents (write better copy from richer context)
▲ Client Portal (displays "what we know about you" → client + team see & edit)

10 · The scheduler — which campaign to run next (internal, in flight)

Once we know a client (Step 1), the next question is which specific campaign to run for them this week (Step 2). That's the scheduler — Josh's product, today an internal tool in the GTME dashboard, designed to live eventually in the Portal. It's the propose-then-promote surface for what to build next.

How it works

A nightly cron scores per-client opportunities across ~9 weighted factors (dormant lead pool, quarter-end promo, hiring signals, conference windows, …).
Each factor reads real data from Supabase / Airtable.
Operators see high-scoring proposal cards at internal.aicro.co/gtme.
Click Accept → fires POST /scheduler/accept → creates the Airtable Campaign Creation Request that kicks off the brief-writing work.
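The scoring step in the list above is a weighted sum per client. A hedged Python sketch: the four factor names come from the examples given, but the weights, the 0 to 1 signal scale, and the 0.6 threshold are invented for illustration, and the real scheduler uses about nine factors.

```python
# Hypothetical weights; the production scheduler's factors and weights differ.
WEIGHTS = {
    "dormant_lead_pool": 0.40,
    "quarter_end_promo": 0.20,
    "hiring_signals": 0.25,
    "conference_window": 0.15,
}


def score_opportunity(signals, weights=WEIGHTS):
    """Weighted sum of factor signals, each clamped to the range 0..1."""
    total = 0.0
    for factor, weight in weights.items():
        value = min(max(signals.get(factor, 0.0), 0.0), 1.0)
        total += weight * value
    return total


def proposals(clients, threshold=0.6):
    """Return (client_slug, score) pairs at or above the threshold, best first."""
    scored = [(slug, score_opportunity(signals)) for slug, signals in clients.items()]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

Only the clients returned by `proposals` would surface as cards; Accept is still a human click.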

Where it fits

The Step-1 brief is an input the scheduler reads when scoring — change a client's voice/ICP and next night's proposals shift.
One pattern, not two: the brief's own refresh proposals route through the same /gtme propose-then-promote surface — not a parallel system.
Status: architecture shipped, internal testing stage — not yet daily-driver. The natural home as it matures is the Portal.

⛁ Supabase / Airtable (real per-client signals) → nightly cron (~9-factor weighted score) → /gtme proposal cards (operator reviews) → Accept → Airtable Campaign Creation Request (kicks off Step 1/2 brief)

11 · The system-wide learning loop (the moat)

"The moat is the loop, not the model."

This is bigger than the GTM builder app. The brain only gets smarter if every signal about what's actually working flows back into client context — and most of those signals originate outside the builder (in the campaigns, the inboxes, the scheduler, and the Client Portal). The loop spans all three apps and lands in one place: the client brain.

The signals that must feed the brain

Campaign performance — which campaigns are working (open/reply/positive rates), surfaced in the Portal's trends.
How copy is resonating — which copy / variants win, per slot.
Reply outcomes — interested vs. not, and the language of the wins.

The bad-ICP table — client feedback via the Portal: "these aren't our people." A direct, high-value correction to the ICP in the brain.
Client edits — corrections clients make to "what we know about you" (§12).
Closed-won / CRM outcomes — what actually converted downstream.

Every one of these is a vote on what the brain believes about a client. Today they're scattered (some in the app, some in the Portal, some in Airtable/CRM, some only in a person's head). The target: they all write back into client context, so the next campaign is composed from a brain that already learned from the last one. Critically, several of these — bad-ICP, client edits — we are not yet feeding back; we need to be.

campaign performance · how copy resonates · reply outcomes · bad-ICP feedback (Portal) · client edits · closed-won / CRM
↓ feedback from across all three apps, not just the GTM builder
⛁ client context / the brain → next campaign starts smarter

What runs this today, the "Sarah" engine

The app already has the feedback engine that does part of this (the "Sarah" brain, modal_app/sarah/). It watches post-send telemetry and improves prompts. Much is live in code; the cross-app pieces (bad-ICP feedback, client edits, full CRM) and the brief-level write-back are the gaps. Honest status:

Live status (2026-06-09): these components are real, deployed code and invocable on demand, but their scheduled crons were disabled (commented out) on 2026-05-19 and 2026-05-20 to stop a cost runaway, so they are not auto-firing right now. The only live recurring job is sync_sarah_top_emails_weekly (Mondays). Read "LIVE" below as "real code that runs when invoked," not "running on a live schedule."

Status key: LIVE = real code, runs when invoked (schedules off) · PARTIAL = wired but a step is stubbed · DESIGN = in Doug's spec, not yet code

Stage	What it does	Where	Status
Reflect	Scans telemetry for anomalies; an LLM produces root-cause + `doctrine_update_proposals`	`sarah/reflect.py` → `brain_reflections`	LIVE
Classify replies	Categorizes inbound replies + emits learning signals	`sarah/reply_classifier.py`	LIVE
Score / critique copy	Self-critique scores artifacts; generates prompt variants	`self_critique.py` · `variant_generator.py`	LIVE
A/B promote	z-test on variant performance; auto-promotes the winner into the live prompt	`variant_promoter.py`	LIVE
Propose → promote (briefs)	Proposals surface for a human to review + approve doctrine changes	`/sarah/reflections` + `…/apply`	PARTIAL — apply is a stub
Bad-ICP / client edits / CRM → brain	Pull Portal + CRM feedback into client context	cross-app wiring	TO BUILD

So the copy-level loop genuinely runs in code (reply → reflect → critique → variant → promote). The gaps are the cross-app, brief-level ones: pulling bad-ICP feedback + client edits (Portal) and closed-won (CRM) into client context, and making "apply this proposed change" actually write back to the brain. That's the next build — wiring existing signals into one brain, not from scratch.
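For readers unfamiliar with the A/B promote stage, a z-test on variant performance can look like the following. This is a generic two-proportion z-test in Python, offered as an assumption about the kind of test involved; it is not a transcription of `variant_promoter.py`, and the one-sided 5% threshold and minimum sample size are illustrative choices.

```python
import math


def two_proportion_z(wins_a, n_a, wins_b, n_b):
    """z statistic for rate_b - rate_a using the pooled standard error."""
    pooled = (wins_a + wins_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0  # both rates are 0% or 100%: no evidence either way
    return (wins_b / n_b - wins_a / n_a) / se


def should_promote(control, variant, z_crit=1.645, min_sends=100):
    """True when the variant beats control at roughly the one-sided 5% level."""
    (wins_a, n_a), (wins_b, n_b) = control, variant
    if min(n_a, n_b) < min_sends:
        return False  # too little data to trust the comparison
    return two_proportion_z(wins_a, n_a, wins_b, n_b) > z_crit
```

With 20 positive replies from 1,000 sends on control and 40 from 1,000 on the variant, the statistic is about 2.6, so the variant would be promoted under these settings.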

12 · The Client Portal — the client-facing product (live: portal.aicro.co)

The Portal (client-portal repo) is the surface clients and the team log into. It turns the raw campaign/reply data into dashboards and action surfaces. This is where the work becomes client-facing — and increasingly, where the app's internal surfaces (scheduler, client brain) will land.

What it does today

Campaigns dashboard — trends, campaign list, bad-ICP flags, 7/30/90-day rollups.
Follow-ups — escalation queue; send/dismiss; per-client cadence config.
Reply categorization — positive/negative, by channel.
Settings — profile, 2FA, team invites, reply-agent setup.
Admin onboarding tracker — client setup stage-gates (internal).
Roadmap / What's New — a living changelog clients can read.

What it becomes — the client-context surface (the big one)

The Portal is where the client brain becomes visible and editable. Instead of "what we know about a client" living in profile.md files only we can see, the Portal surfaces it as a "what we know about you" view:

Clients can see it — their ICP, voice, proof points, what the agent learned.
Clients can edit / correct it — feeding the brain directly.
We can all see & edit it too — one shared, versioned source.

Client input becomes another high-quality source feeding the brain — and the transparency ("here's exactly what the agent knows and why") is itself a selling point. The scheduler likely lands here too as it graduates from the internal GTME dashboard.

GTM App / pipeline → writes → ⛁ Supabase (replies · leads · follow-ups · workspaces) · Airtable (CRM today)
↓ Portal reads Supabase + Airtable
▲ Client Portal (client sees campaigns · replies · follow-ups · brain)
↔ n8n (trend refresh · follow-up send)

Built on Next.js + Supabase + Airtable + n8n; Vercel git-connected; multi-tenant (workspaces/workspace_members + RLS, clients see only their own). The seam to the GTM App is Supabase — as Supabase becomes the canonical brain, the Portal reads more from it and less from Airtable.

13 · The data layer — two databases, the migration & the DAL (architecture debt → priority)

Supabase is the real center of gravity, and it's mid-reorganization. Three things matter here.

① No data-access layer

Database calls are inline everywhere: ~519 .from() calls across 153 files in the web app (≈69 distinct tables), plus per-file REST helpers across the Modal workers. No single place owns table names.

② A new multi-schema DB ("Cortex")

The new "Cortex" Supabase project reorganizes the flat ~177-table database into domain schemas (agent, knowledge, intelligence, events, monitor) and renames tables (campaign_runs → agent.runs). Actively migrating but referenced nowhere in app code yet.

③ Why DAL-first

Without an access layer, the cutover is a 150+-file, error-prone sweep. Introduce a data-access layer once (table names + schema in one module) and both the cutover and the client-brain work become a one-place change. This is the unlock.
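The "table names + schema in one module" idea can be sketched in a few lines. This Python example is an assumption-heavy illustration for the Modal side: `TABLES`, `DataAccess`, and the logical names are hypothetical, the legacy schema is assumed to be `public`, and only the `campaign_runs` → `agent.runs` rename is taken from this doc. Unmapped tables raise instead of guessing.

```python
# Logical name -> (schema, table) per target database.
# Only campaign_runs -> agent.runs is a rename stated in this doc;
# other Cortex names stay None until they are confirmed.
TABLES = {
    "runs": {"legacy": ("public", "campaign_runs"), "cortex": ("agent", "runs")},
    "prospect_rows": {"legacy": ("public", "campaign_prospect_rows"), "cortex": None},
}


class DataAccess:
    """Single place that knows physical table names for the chosen target."""

    def __init__(self, target="legacy"):
        if target not in ("legacy", "cortex"):
            raise ValueError(f"unknown target: {target!r}")
        self.target = target

    def resolve(self, name):
        """Return (schema, table) for a logical name, or raise if unmapped."""
        location = TABLES[name][self.target]
        if location is None:
            raise LookupError(f"{name!r} has no mapping for {self.target!r} yet")
        return location
```

Callers ask for `resolve("runs")` and never spell a physical table name, so the cutover becomes a change to `target` plus filling in the mapping.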

The plan of record:

P1a — add the DAL (web/src/lib/data/* for TS; a centralized REST module for the Modal workers).
P1b — make Supabase canonical with write-through editing; collapse the 3-copies brief problem; add a stable client_id (keep slug as a handle).
P1c — inject the DB copy rules into the prompt so editing the brain actually changes output.
P2 — repoint to the new schema through the DAL, regression-test every step.

Note: the canonical Supabase project is still being settled across the app and the Portal (legacy vs. the new "Cortex" project), so part of this work is locking one project and aligning both. (The GTM App itself currently uses a single Supabase config; the open item is app-vs-Portal alignment, to confirm with Anderson.)

14 · The "Sarah" terminal — what it is & the open question (clever, fragile)

What it is / why

The app's /sarah pages are browser windows into a warm Railway container that spawns the Claude Code CLI in a tmux session to run an agent step, streamed back live. It exists to dodge two real constraints: serverless cold starts, and Anthropic API rate limits — authenticating with a Claude subscription instead of the per-token API sidesteps both at a flat rate.

The decision to make

It's a smart workaround and also the most bespoke, hardest-to-own piece — a hand-built terminal handshake on a container that depends on a subscription token. The open question: should the steps run through an interactive terminal at all, or a headless job queue? The queue trades flat-rate billing + live-watching for something far easier to own, scale, and sell to clients. Worth deciding deliberately as we productize.

15 · Where we're going (the company's leverage)

The thesis: a vertical GTM agent is only as good as the client brain it writes from. The moat isn't the pipeline plumbing — it's the accumulated, versioned, multi-source context per client that lets the agent write like that client's best rep, and the loop that keeps it learning. "Doctrine is the product; the moat is the loop." Plus: vertical depth beats horizontal width, and internal-first, externally-ready — prove it on AICRO's own clients, then expose the right surfaces to clients.

The target model

Supabase is the one client brain — authoritative, versioned, edited directly.
Many sources feed it — sales/client calls, CRM, grading, replies, copy performance, bad-ICP, onboarding/ICP research — via ingest workers (bones exist in sarah/crm_sync.py, reply_classifier.py, performance_aggregator.py).
The app synthesizes, not humans — the brain is assembled + refreshed by agents, instead of someone hand-writing profile.md.
Surfaced & editable in the Portal — clients + team see "what we know about you" and correct it; client input becomes another source.
Agents propose, humans approve — finish the propose-then-promote loop so brief refreshes apply with a click, on the same /gtme surface the scheduler uses.

Two surfaces, one boundary

From Doug's Step-1 design — the shape we're converging on:

The Sarah IDE stays the developer environment for building agents, but stops being the launch path.
A GTM/operator surface (and ultimately the Client Portal) is where non-engineers run agents, refresh briefs, and review proposals.
Both run the same agent logic server-side — one pattern, not two parallel systems.
Approach: "internal-first, externally-ready."

The destination: a vertical-specific GTM agent for proptech, built on a living client brain, delivered through the Portal as a multi-tenant product — where clients see the agent's work, its proof, and its learning.

16 · Roadmap — where we are → where we're going

P0 — Get the GTM App on one codebase now

Push everything to main so GitHub == live; connect Vercel + Railway (native); add a GitHub Action for Modal + Trigger.dev.
Outcome: one push redeploys all four; anyone can open a branch and get a preview — the way the Portal already works.

P1 — The client brain

Data-access layer → write-through editing → inject DB copy rules into the prompt → stable client ID → non-engineer edit surface → finish the learning loop (apply proposals; wire bad-ICP, client edits, CRM) → multi-source ingestion (calls, CRM).

P2 — Cutover, convergence & productization

Repoint to the new multi-schema "Cortex" database (one-place change via the DAL); converge the scheduler + client-brain surfaces into the Portal; expose the right surfaces to clients; build toward the sellable proptech-vertical product.

Shared team reference. The manual order-of-operations is AICRO's current GTME playbook; pipeline, grading, deploy, scheduler, and data-layer facts reflect the current ilayda-app and client-portal repos plus a live inspection on 2026-05-30, with a code re-validation and corrections pass on 2026-06-09 (Steps 3 / 4A / 4B, the learning-loop cron status, and the Step-1 hard-fail). Counts (client coverage, cost, tables, call-sites) move as we build. Items marked DESIGN / TO BUILD are from Doug's Step-1 overview and not yet in code. Questions or corrections → Josh.