Module 4: Workflow Automation & Building Skills
⚡ The 30-second version. Use it for: automating the same repetitive tasks you do every week. The method: automate the rule-bound grind, keep a human in control of anything that matters, and climb the "automation ladder" one rung at a time, starting with no code. Start here: the 15-minute starter at the bottom. (New to AI terms like "MCP" or "agent"? See the glossary.)
This is the "I want to build something" module: reusable skills, connected workflows, dashboards, agents, and integrations (one firm wired together project management, calendar, email, and lead intake; another used the developer tools to replace an outsourced dev team). It's the highest-leverage module, and the one where it's easiest to over-build or quietly create a data and control problem.
The one-sentence method: Automate the repetitive and rule-bound; keep judgment and outward actions human. Climb the ladder one rung at a time, and only as far as the task actually needs.
The respondent who said "I need to think smaller, my ideas are large, but I'm missing the daily things that save us time" named the whole game. Leverage comes from automating one annoying recurring task well, not from building an "AI employee" on day one.
The automation ladder: go up only as far as you need
Each rung adds leverage and adds setup, cost, and risk. Most accountants get enormous value from rungs 1–2 and never need 4–5.
| Rung | What it is | Code? | Good for |
|---|---|---|---|
| 1 · Saved prompt / skill | A reusable instruction set (like the tax-research skill) | No | Any task you re-explain often |
| 2 · Project / custom GPT | Persistent context + your templates & knowledge | No | A recurring role: research, drafting, review |
| 3 · Connected tools (MCP / integrations) | AI that can read your email, calendar, PM, drive | Light | Pulling context together; status, triage |
| 4 · No-code automation (Zapier/Make/n8n) | Trigger → steps → action pipelines | Low | Multi-step recurring processes |
| 5 · API / agents | Custom-built automation | Yes | High-volume, bespoke firm workflows |
Worked examples you already have: the tax-research skill is rung 1; the redactor tool is a small rung-5 utility. Both started as "this one task is annoying," which is the right origin for every automation.
What to automate: the candidate test
Before automating anything, score it. More yeses = better candidate:
- Frequent: you do it weekly/monthly, not once a year.
- Rule-bound: it follows steps, not case-by-case professional judgment.
- Structured inputs: the inputs are predictable (a report, a file, a form).
- Reversible / low cost of error: a mistake is caught and fixed cheaply.
Low-frequency, high-judgment, or irreversible tasks (signing an opinion, taking a filing position, sending money) stay manual or human-approved. Automate the grind around the judgment, not the judgment.
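If you'd rather score candidates in a script than in a chat, the four-question test can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the `Task` class, the `score` and `rank` helpers, the equal weighting of the four questions, and the sample tasks are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    frequent: bool            # done weekly/monthly, not once a year
    rule_bound: bool          # follows steps, not professional judgment
    structured_inputs: bool   # predictable inputs (a report, a file, a form)
    reversible: bool          # a mistake is caught and fixed cheaply


def score(task: Task) -> int:
    """Count the yeses on the candidate test (assumes equal weights)."""
    return sum([task.frequent, task.rule_bound,
                task.structured_inputs, task.reversible])


def rank(tasks: list[Task]) -> list[tuple[str, int]]:
    """Highest score first; ties keep the order they were listed in."""
    scored = [(t.name, score(t)) for t in tasks]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Illustrative tasks only; substitute your own week.
tasks = [
    Task("Monthly close status checklist", True, True, True, True),
    Task("Taking a filing position", False, False, False, False),
    Task("Morning inbox triage", True, True, False, True),
]

ranked = rank(tasks)
for name, points in ranked:
    print(f"{points}/4  {name}")
```

A score is only a ranking aid. Anything irreversible or judgment-heavy stays manual or human-approved no matter how often it recurs.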
The build approach: one rung at a time
- Find the candidate. Audit a typical week for repetitive, rule-bound tasks (prompt below).
- Start at the lowest rung that works. Can a saved prompt/skill do it? Do that first. Don't build a pipeline for something a Project solves.
- Design the human gate. Decide up front where a person reviews before anything goes out or becomes irreversible (see guardrails). Draft-only automations are low-risk; acting automations need a gate.
- Build small, test on safe/anonymized data, then expand. Verify each step before chaining the next, because errors compound in pipelines.
- Document it so it's a firm asset, not tribal knowledge, and review it periodically.
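The "verify each step before chaining the next" habit can be sketched as a tiny pipeline runner that stops as soon as a step's output fails its check. Treat it as an illustration under assumptions: `run_pipeline`, `GateFailed`, the stage names, and the control total are hypothetical, and the data is dummy data.

```python
from decimal import Decimal
from typing import Any, Callable


class GateFailed(Exception):
    """Raised when a stage's output fails its check."""


# Each stage is (name, step function, check on that step's output).
Stage = tuple[str, Callable[[Any], Any], Callable[[Any], bool]]


def run_pipeline(data: Any, stages: list[Stage]) -> Any:
    for name, step, check in stages:
        data = step(data)
        # Stop here rather than hand a bad result to the next stage.
        if not check(data):
            raise GateFailed(f"stage '{name}' failed its check")
    return data


raw = ["100.00", "250.50", "49.50"]   # dummy amounts, not client data
CONTROL_TOTAL = Decimal("400.00")     # hypothetical figure from the source document

stages: list[Stage] = [
    # Parse every row, then confirm no row was dropped.
    ("parse", lambda rows: [Decimal(r) for r in rows],
     lambda out: len(out) == len(raw)),
    # Total the amounts, then tie the total to the control figure.
    ("total", lambda amounts: sum(amounts, Decimal("0")),
     lambda total: total == CONTROL_TOTAL),
]

result = run_pipeline(raw, stages)
print(result)
```

The design choice is that a failed check raises instead of logging and continuing, so a wrong number never becomes the input to the next step.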
Prompt & build library
1. Find your automation candidates
Act as an operations analyst. Here's how I spend a typical week: [list recurring tasks].
For each, score it on: frequency, how rule-based vs. judgment it is, how structured the
inputs are, and reversibility. Rank them as automation candidates and tell me the lowest-
effort way to automate the top 3 (saved prompt, project, connected tool, or pipeline).
2. Turn a repetitive task into a reusable skill/SOP
I do this task repeatedly: [describe the steps and the inputs/outputs]. Write it as a reusable
instruction set I can save and rerun: the goal, the inputs it needs, the steps, the rules/
guardrails, and the output format. Flag where a human must review before anything is finalized.
3. Scope a connected workflow safely
I want to connect [tools, e.g., email + calendar + PM] so AI can [goal]. Before building:
list what data each connection exposes, which steps should be read-only vs. able to take action,
where a human approval gate belongs, and the smallest version worth building first.
4. Design the human-in-the-loop gate
Here's an automation I'm considering: [describe]. Identify every step that sends, files, pays,
or touches client data, and propose where I should require human review/approval so nothing
irreversible or client-facing happens without a person signing off.
Tool picks (by rung)
- Claude Projects & skills (rungs 1–2, no code): start here. Save your context once, reuse it.
- Claude Code (rungs 1, 5): for building skills and small tools, the way we built the tax-research skill and the redactor (it can write the script and the guardrails).
- MCP connectors (rung 3): let AI read your calendar/email/drive/PM. Powerful, and exactly where the data-exposure guardrail below bites.
- Zapier / Make / n8n (rung 4, low-code): trigger-based pipelines for recurring processes.
- The developer API (rung 5): for high-volume, bespoke firm workflows; the "replaced our outsourced dev team" tier.
Three worked examples (from simplest to most built)
Concrete starting points, each with the human gate and verification built in.
A. Monthly close checklist: a reusable skill (rung 1, no code)
You run the same close steps every month. Turn them into a saved skill once: it takes your rough inputs (which accounts are reconciled, what's outstanding) and returns a clean status checklist with flags for anything missing. Why it's safe: no client identifiers needed (use account types, not names), and you review the output. Build it with: Prompt 2 above. Describe the steps once, save the result as a Project/skill, and rerun it monthly.
B. Morning inbox triage: a connected tool (rung 3, MCP / integration, read-only)
Connect your email read-only so the AI reads the morning's client messages and returns: reply today / quick acknowledgment / can wait or delegate, with a one-line draft for the urgent ones. Why it's safe: the connection is read-only, so the AI never sends anything; you send every reply. The catch: the moment you connect a real inbox, the connector is a service provider touching client data. Vet it and bind it under your WISP (see guardrails).
C. Bank statement → draft journal entry (rung 4–5, no-code flow or small script)
A flow takes a statement export, categorizes transactions against your rules, and outputs a draft journal entry. Why it's safe: it produces a draft you review, you apply the Module 5 verification (foot it, tie it to the source) before anything posts, and you run real client data only in a firm-approved tool. Don't let it post on its own.
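As a small-script version of example C, here is a minimal sketch that reads a statement export, applies keyword rules, builds draft lines, and ties the draft back to the source. Everything specific in it is an assumption for illustration: the CSV columns (`date`, `description`, `amount`), the `RULES` keywords, the account names, and the made-up sample rows. A real bank export will look different, and nothing here posts to a ledger.

```python
import csv
import io
from decimal import Decimal

# Hypothetical rules: keyword found in the description -> account.
RULES = {"PAYROLL": "Wages Expense", "RENT": "Rent Expense", "STRIPE": "Sales Revenue"}
SUSPENSE = "Suspense (needs review)"

# Made-up sample export; dummy data only.
SAMPLE = """date,description,amount
2024-01-03,STRIPE PAYOUT,1200.00
2024-01-05,RENT JAN,-800.00
2024-01-09,UNKNOWN VENDOR,-45.10
"""


def categorize(description: str) -> str:
    for keyword, account in RULES.items():
        if keyword in description.upper():
            return account
    return SUSPENSE  # no rule matched: flag it, don't guess


def draft_entry(export_text: str) -> list[dict]:
    lines = []
    for row in csv.DictReader(io.StringIO(export_text)):
        amount = Decimal(row["amount"])
        account = categorize(row["description"])
        # Deposit: debit Cash, credit the account. Withdrawal: the reverse.
        debit, credit = ("Cash", account) if amount > 0 else (account, "Cash")
        lines.append({"date": row["date"], "debit": debit, "credit": credit,
                      "amount": abs(amount), "status": "DRAFT",
                      "needs_review": account == SUSPENSE})
    return lines


def verify(export_text: str, lines: list[dict]) -> None:
    """Foot the draft and tie its net cash movement to the source export."""
    rows = csv.DictReader(io.StringIO(export_text))
    source_net = sum((Decimal(r["amount"]) for r in rows), Decimal("0"))
    cash_net = sum((ln["amount"] if ln["debit"] == "Cash" else -ln["amount"]
                    for ln in lines), Decimal("0"))
    if cash_net != source_net:
        raise ValueError(f"draft nets to {cash_net}, source nets to {source_net}")


draft = draft_entry(SAMPLE)
verify(SAMPLE, draft)  # raises if the draft does not tie to the source
for line in draft:
    print(line)
```

Unmatched transactions land in a suspense account with `needs_review` set, so the reviewer sees exactly which lines the rules could not handle.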
Pattern across all three: start from one annoying recurring task, keep the AI's output a draft, and put a human between the AI and anything irreversible.
Agentic AI in a small firm: safe boundaries
"Agents" (AI that takes a sequence of actions on its own) are the most-asked-about and the most hazardous. You can use them, inside fences:
- Read-only by default. An agent that reads and drafts is low-risk. One that acts (sends, files, pays, edits the books) needs a human approval gate on every such step.
- Sandbox first. Build and test on anonymized or dummy data before it ever touches a real client file.
- Log what it does. Keep a record of the agent's actions so you can review and, if needed, unwind them.
- One task, well-bounded. A narrow agent ("triage these emails," "draft these entries") is controllable; an open-ended "run my practice" agent is not. Start narrow.
- You're still the reviewer of record. Nothing an agent produces is final until you've reviewed and adopted it (SSTS §1.4).
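Two of these fences, the approval gate on acting steps and the action log, can be sketched together. This is a hedged illustration, not any agent framework's API: `GatedAgent`, `attempt`, and the `ACTING_VERBS` set are hypothetical names, and the reviewer here is a stand-in function where a real person would decide.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Optional

# Assumption: the verbs that count as "acting" and so need a human gate.
ACTING_VERBS = {"send", "file", "pay", "edit_books"}


@dataclass
class GatedAgent:
    approve: Callable[[str, str], bool]       # the human decision: (verb, detail) -> yes/no
    log: list = field(default_factory=list)   # record of everything attempted

    def attempt(self, verb: str, detail: str,
                action: Callable[[], str]) -> Optional[str]:
        needs_gate = verb in ACTING_VERBS
        approved = self.approve(verb, detail) if needs_gate else True
        result = action() if approved else None  # blocked actions never run
        self.log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "verb": verb,
            "detail": detail,
            "needed_approval": needs_gate,
            "approved": approved,
        })
        return result


# Stand-in reviewer that declines everything; in practice this is a person.
agent = GatedAgent(approve=lambda verb, detail: False)

draft = agent.attempt("draft", "reply to a client question", lambda: "Draft reply text")
sent = agent.attempt("send", "reply to a client question", lambda: "sent")

for entry in agent.log:
    print(entry["verb"], "approved" if entry["approved"] else "blocked")
```

Drafting passes straight through, sending is blocked until a person says yes, and both attempts are logged either way, so a declined action leaves a record too.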
Script vs. API vs. connected tool: which is which
A quick rule of thumb several of you asked for:
- Saved prompt / skill: you run it occasionally, by hand. Start here.
- No-code flow (Zapier/Make/n8n): it should run on a schedule or trigger, across a couple of apps.
- Script (e.g., Claude Code): repetitive, local, file-heavy work you'd otherwise do by hand (like the redactor tool). Cheap to run, easy to inspect.
- API / agent: high-volume or programmatic work that needs to run without you, or act across systems. Most powerful, most oversight required, and where the data and control guardrails matter most.
Guardrails
Automation is where small mistakes scale and where data exposure quietly multiplies. These matter more here than anywhere else in the library.
- Every connection is a service provider, and a new data exposure. Connecting email, PM, or a drive lets AI (and the connector vendor) reach real client data across systems, at scale. Each connector/MCP server/integration must be vetted and contractually bound like any service provider under your WISP. Watch the downstream-tools problem too: a connector that calls another provider is a further disclosure (IRC §7216 / FTC Safeguards / AICPA Confidentiality Rule, see Regulatory Foundation and the AI Acceptable-Use Policy).
- Keep a human gate on anything irreversible or outward-facing. AI may draft; a person reviews before it sends, files, pays, or goes to a client or authority. An automation that acts on its own is the single biggest risk in this module. You remain the reviewer of record.
- Verification gates in pipelines. Errors and hallucinations compound. Validate each step's output before the next consumes it, especially anything involving numbers or authority (reuse the Module 2 verify discipline).
- Secure the credentials. Connected tools mean stored API keys and tokens; protect them under your WISP (access control, no keys in shared chats or repos).
- Start small; resist over-building. The smallest automation that saves real time beats an ambitious agent you can't trust or maintain.
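For the credentials guardrail, one common pattern is to keep the key out of the script entirely and read it from the environment at run time. The sketch below assumes that pattern. `load_token`, `redact`, and the variable name `CONNECTOR_API_TOKEN` are hypothetical, and the token shown is a placeholder, not a real key.

```python
import os
from typing import Mapping, Optional


class MissingCredential(RuntimeError):
    """Raised when the expected environment variable is absent or empty."""


def load_token(var_name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Read a token from the environment instead of hard-coding it in the script."""
    source = os.environ if env is None else env
    token = source.get(var_name, "").strip()
    if not token:
        raise MissingCredential(f"{var_name} is not set")
    return token


def redact(token: str) -> str:
    """Show only the last four characters, for logs and screenshots."""
    return f"...{token[-4:]}" if len(token) > 8 else "[redacted]"


# A fake environment is passed in so the example runs without a real key.
fake_env = {"CONNECTOR_API_TOKEN": "example-not-a-real-token"}
token = load_token("CONNECTOR_API_TOKEN", env=fake_env)
print(redact(token))
```

Because the script never contains the key, it can be shared or committed to a repo without exposing the credential; where the variable is set and who can read it is still governed by your WISP.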
Your 15-minute starter
- List every recurring task from your last week.
- Run Prompt 1 and let it score them on the candidate test.
- Take the #1 candidate and run Prompt 2 to turn it into a saved skill/SOP, staying on rung 1. Don't connect anything yet.
- Use it next week. If it earns its keep and you keep hitting the same data by hand, then consider climbing to rung 2–3, with the human gate designed in.
Win condition: you automate one real recurring task this week without a line of code, and you have a repeatable way to spot, scope, and safely build the next one.
Next module: Data Cleanup & Extraction.