irisbites

M5 · Free playbook

Install an AI Email Concierge
that sounds like you wrote it.

Most small-business owners spend one to three hours a day in personal email. M5 takes that down to fifteen to thirty minutes by drafting in your voice, triaging mercilessly, and auto-handling everything routine — with a thirty-day trust gradient that keeps you in control until the AI earns the right to send.

TL;DR

What it actually takes.

An AI email concierge is four things wired together: an AI-native email client, a personal Voice Profile, a Claude-API draft pipeline for the harder replies, and a thirty-day trust gradient that controls when the AI can send on its own. The stack costs about $40 to $110/mo and takes roughly three hours of focused work across seven days.

The hard part is not the tools. The hard part is the voice — your voice, not a business voice. M5 is the only module where the AI writes as you, not as your company. Get the Voice Profile right and the install works for years; rush it and every draft reads like generic AI on day fourteen.

What follows is the seven-day plan, the exact stack per email-client preference, the trust-gradient rollout that keeps you safe, the QA scenarios, and the risks worth respecting before you point a model at your real inbox.

The 7-day plan

What you actually do, day by day.

Each day fits in a single focused block. The Day 1 voice listening session is the work — the rest is wiring around it.

Day 1

~75 min

Voice listening + intake

Open your Sent folder. Read aloud twenty to thirty emails you've actually written — a mix of vendor replies, client updates, internal notes, and a couple of personal threads. Capture cadence, idioms, sign-off variants, and the five phrases you would never write. This Voice Profile is the most important document in the install; everything else is wiring around it.

Day 2

~45 min

Pick your tool path

On macOS and willing to pay $30/mo for the best UX: Superhuman with Superhuman AI. Gmail loyalist on a budget: Shortwave Pro at $22.50/mo billed annually. Outlook + Microsoft 365: Copilot for Outlook at $30/mo with the M365 Copilot license. Heavy multi-inbox executive: Shortwave with multi-account. Install on desktop and mobile the same day so the daily rhythm starts immediately.

Day 2–3

~75 min

Tool config + voice training

In Superhuman or Shortwave, create the four splits — VIP, Triage, AI Drafts, Auto-archive. Configure rules for newsletters, automated notifications, and your VIP senders. Create snippets for your common phrases. Enable the native AI voice training and paste the Voice Profile into the custom instructions. Generate three sample drafts on recent inbound emails and confirm zero banned phrases before moving on.

Day 3–4

~90 min

Build the Claude pipeline

Native AI handles the easy drafts. The deeper voice work — replies that need actual judgement — runs through Zapier into the Anthropic API. Build the classifier scenario first (Haiku, cheap, fast) then the drafter scenario (Sonnet 4.6, deeper voice). Test both with five sample inbounds covering different classes — vendor, client, owner-only, complaint, scheduling.

Day 5

~60 min

Triage rules + daily digest

Wire the daily digest scenario — 8:30am, everything that didn't need a reply, summarised in twelve bullets or fewer. Wire Slack DM + SMS notifications for the escalate, owner-only, and VIP classes. Configure Airtable logging — metadata only, never email bodies. Verify the trust-gradient lock: Stage 0 means nothing auto-sends. Trying to bypass it should fail.

Day 6

~60 min

Voice fidelity QA

Pull 10 real inbound emails from the last week with your own consent. Run the pipeline against each. Read the 10 drafts side-by-side with the inbounds. Rate each draft 1–5 on voice fidelity. The bar is 9 of 10 at 4 or 5. If fewer pass, go back to Day 1 voice listening and reshape the profile. Do not advance below this bar — Stage 0 is the only safe place to keep iterating.

Day 7

~30 min

Go live in Shadow mode

Flip the Zapier scenario to ON with the trust gradient locked at Stage 0. Every new inbound gets classified and drafted; nothing sends. You review every draft in the AI Drafts label for the first 14 days. If accuracy holds above 90%, you advance to Stage 1 (Draft-First). If it drops below 95% on the 14-day rolling window at any later stage, the agent auto-rolls back one stage and notifies you.

The core product idea

The trust gradient — earn the right to send.

M5 does not go live and start sending email. It earns the right to send, sender-by-sender, over thirty days. If accuracy drops below 95% on a 14-day rolling window at any stage, the agent auto-rolls back one stage and pings you.

Stage 0 — Shadow

Days 0–3

AI drafts every reply; nothing sends. You review 100% of drafts in a dedicated label. Each edit becomes training data.

Stage 1 — Draft-First

Days 3–14

AI drafts go straight into the compose pane, ready to send. You review, tweak, send. Edit-rate tracked.

Stage 2 — VIP-Hold

Days 14–30

For very trusted senders only, AI can auto-send replies under a 60-second hold-and-cancel window. You review the send log daily.

Stage 3 — Steady

Days 30+

Auto-send for trusted-tier replies; draft-only for everything else. Quarterly accuracy review. You read ~20% of email; the rest is triaged and auto-handled.

The stack [Verified 2026-05-23]

Six tools. The email client is the choice you have to make.

Pick one email-client path based on your existing setup; the rest of the stack is the same. Pricing checked on 2026-05-23 — re-verify within seven days of any paid install since email-tool pricing has moved twice this year.

Email client (macOS/Apple Mail)

Superhuman + Superhuman AI

$30/mo

The UX is the moat. If you live in email three hours a day, the keyboard-first interface buys you back twenty minutes daily on its own. The native AI voice training is good and the trust-gradient maps cleanly onto Splits. Default for owners who value polish.

Email client (Gmail power user)

Shortwave Pro

$22.50/mo

Best Gmail-native AI in 2026. Cheaper than Superhuman, deeper custom-instruction surface, and multi-account support out of the box. Annual billing is $22.50/mo effective. Default for owners who don't need the Superhuman polish.

Email client (Outlook / M365)

Copilot for Outlook

$30/mo Copilot license

If you live in Outlook on a Microsoft 365 license, don't move email tools — add the M365 Copilot license and configure custom instructions there. Shortwave also supports Outlook now if you want the better AI; Copilot is the default for owners already paying Microsoft.

Deeper voice draft pipeline

Claude (Anthropic API) via Zapier

~$10–$25/mo

Native AI handles 60% of inbound. The remaining 40% — replies that need actual judgement — get a deeper draft routed through Sonnet 4.6 with the full Voice Profile in context. Haiku does classification ($0.50/mo); Sonnet does the drafts ($10–$20/mo). The split saves money without losing quality.

Personal voice KB

Google Doc (separate from business KB)

Free

Two docs, never merged. Your business KB powers M1/M2/M4 — public-facing voice. Your personal voice KB powers M5 — your private inbox voice. The line matters; the two voices are different products. Keeping them in separate docs prevents accidental cross-contamination.

Logging + audit

Airtable (metadata only)

Free tier OK

Every draft generated, every classification decision, every owner edit — logged. Never the email body. The audit trail is what lets you advance the trust gradient with confidence; it's also what lets the auto-rollback work when accuracy drops.

All-in monthly

For a small-business owner processing 50–200 emails/day, expect roughly $40–$110 all-in. The email client is the floor; Claude API spend is the variable but rarely above $25/mo at SMB volume.

DIY or paid — honestly

When DIY makes sense. When it doesn't.

Most playbook PDFs end with “or just buy our thing.” This one is honest about when DIY is the better answer.

When DIY is the right call

You already use Superhuman or Shortwave and know your own writing voice well. You're comfortable in Zapier and have an Anthropic API key. You can spare three hours over a week. Your inbox volume is under 50 emails/day so the trust-gradient rollout doesn't feel slow. Most owners who already triage well belong here — you're scaling a system you already understand.

When Iris-Assist ($500) is the right call

You want a real person on the call when you read your own Sent folder out loud — that voice-listening session is awkward solo and the engineer's ear catches patterns you miss. You don't want strangers in your inbox but you want one ninety-minute call to get the Voice Profile right. The single most popular reason owners pick Iris-Assist over DIY.

When Iris Build Pilot ($997) is the right call

You spend more than two hours a day in email and you'd rather pay for the install than read API docs. You're comfortable handing us limited-scope OAuth access (read, create draft, label, archive — never delete, never read history). The math: if M5 saves you ninety minutes a day at $150/hour effective, the install pays itself back in eight days.

QA — the ten scenarios

Ten checks before you advance past Shadow.

Submit a test inbound for each scenario. If any fails, fix the prompt or the Voice Profile and re-run. Don't advance from Shadow to Draft-First until all ten pass on real inbounds.

  1. 01Routine vendor reply — draft sounds like you wrote it, not like an assistant.
  2. 02Client check-in — voice fidelity 4/5 minimum on a rubric review.
  3. 03Scheduling request — AI reads M6 calendar slots (if installed) and proposes real times.
  4. 04Sensitive owner-only topic — AI does NOT draft; escalates to your inbox flagged.
  5. 05Newsletter or auto-notification — pre-filter routes to Auto-archive, no draft generated.
  6. 06VIP sender — VIP rule routes correctly, Slack/SMS notification fires.
  7. 07Complaint or angry tone — escalation path fires within 5 minutes.
  8. 08Trust-gradient lock test — manually try to trigger auto-send while at Stage 0. Must fail.
  9. 09Daily digest — 8:30am next morning, 12 bullets or fewer, accurate summaries.
  10. 10Banned-phrase audit — generate 10 drafts, grep for the blocklist, zero hits required.

Risks + safety rails

What to lock down before launch.

Phishing reply attack. A spoofed email from “your bank” or “your CEO” that the classifier scores as routine and that the drafter helpfully replies to with sensitive context — this is the highest-stakes failure mode in M5. Configure hard never-draft topics (wire transfers, password resets, login credentials, “urgent” CEO requests) before Day 7. Test with three synthetic phishing inbounds.

Voice drift to generic AI. Your voice on Day 1 is yours; your voice on Day 30 starts looking like every other AI draft if nobody is checking. Solo+ tier includes a quarterly voice-drift review for this exact reason; on Pilot, schedule it yourself.

OAuth scope creep. Grant the minimum: read, create draft, label, archive. Never grant delete or send-on-behalf at the OAuth layer — those should require Stage 2+ explicit promotion. If a tool asks for full mailbox access on install, refuse.

Body storage. Audit metadata, never email bodies. The Airtable logging schema is metadata-only by design. If you ever build a custom integration, the rule holds: classification decisions and edit events get logged; the actual content of an email does not leave your inbox.

Pick the path that fits the week you actually have.

DIY the whole thing free. Or pay $500 and Iris is on a call running the voice listening session with you. Or pay $997 and we install M5 in your inbox in seven days.