AI 2026 to (?) All models □ chat □ agent
BDD002
Dad's Workshop Manual
Photo-realistic cutaway figure: a warm, smiling British dad holding a mug of tea in a chromed robotic hand, machine arms fitted, two spare robotic arms shown exploded either side, and a small dotted heart outline with a question mark over his heart.
©BDD Mark Bunce

v2 · July 2026 · the original v1 is kept here

AI doesn't replace you. It amplifies everything except the part only you have.

Section 01 · Operating principleThesis

AI doesn't replace you. It amplifies the parts of you that are reproducible. The part only you have, intent, is still almost entirely yours.

The whole piece builds toward one equation:

Agent+Knowledge+Brain+Memory+Senses+Interface+Tools=Capability

Intent+Capability=Work getting done

Humans have always been this stack. AI is a new stack with the same shape. Combine them and work that used to take weeks takes hours. But the thing that starts any of it, wanting something done, is still yours.

The same framework describes one chat window and a team of agents running a business. What changes is the number of agents, the depth of each block, and how they're connected. The shape holds.

Section 02 · Parts catalogueThe framework: eight blocks

Each block has: human version, AI version, a one-line "why this matters," and a drill-down for the curious.

Plate 01 · The two agents Rev. 2026-07 · Scale: as fitted
A British dad figure labelled with the eight blocks of the framework, rendered as a luminous schematic. A humanoid AI figure labelled with the same eight blocks, its dotted empty heart marking the absent Intent.
Plate 01 · The two agents, exploded view. Same eight components throughout. Note absence of component 1 in right-hand figure.
Plate 02 · Intent

1. Intent

The thing that starts everything. Before the tools, before the models, before the brainpower: the wanting.

Human
you, deciding you want something
AI
not yet (*), a contested and occasionally alarming open question
Why it matters
every block below amplifies intent. No intent, no amplification. "Use AI more" is useless advice. "I want to get X done, faster" is where it starts.

Intent is the one block you cannot outsource. You can outsource knowledge (tell the AI what you know). You can outsource brainpower (ask the AI to think). You can outsource memory (let the AI remember for you). You can even outsource deciding (ask for a recommendation and take it). What you cannot outsource is wanting in the first place. Every block below exists to serve yours. Without it, none of them run.

* Terminator caveat acknowledged. See you in 2029.

Drill-down

The practical version of this: when an AI session goes sideways, it's almost always because the intent wasn't clear. Vague inputs produce vague outputs. "Help me with my emails" is a wish. "Draft three replies to the ones from the school, polite but firm because the headmaster is being unreasonable again" is an intent.

The better you get at articulating intent, the less time you'll spend wrangling unusable AI responses. This is the single most-underrated skill in the field, and nobody teaches it, because it isn't really AI. It's communication. The dad who shows a picture and says "number 2 on the sides, scissor-trim on top, leave the fringe" gets the haircut he wanted. The dad who says "just a little off" comes home looking like a banker from 1987. Same principle.

Academic literature calls this "prompt engineering" but don't be put off by the grandness of the term. It's just asking clearly.

Plate 03 · Agent

2. Agent

The thing that does the work. When people say "AI", this is usually what they mean.

Human
you, your colleagues, the people you hire
AI
a growing family of specialists, including chat models, image generators, video generators, voice models, code models, and increasingly, autonomous agents that string these together
Why it matters
"AI" is a family, not a single thing. Picking the right agent for the task is half the win. You don't call a plumber for a haircut.
Drill-down

Think of AI agents like kitchen appliances. Within the family of "blenders" there are Nutribullets, Vitamixes, and Magic Bullets. They're all blenders, but you'd choose differently depending on what you're making. AI works the same way: a family of underlying kinds, each with specific products.

Language models (LLMs). The workhorse family. They process and generate text. Underneath ChatGPT is a model called GPT (made by OpenAI). Underneath Claude is Claude (Anthropic). Underneath Gemini is Gemini (Google). The product is the shop window; the model is what's behind the counter. Products: ChatGPT, Claude, Gemini, Perplexity, Copilot. Use for: thinking, writing, summarising, explaining, deciding. If in doubt, start here.

Image models. A family (mostly built on a technique called diffusion) that produces images from text descriptions or reference pictures. Products: ChatGPT's built-in image maker (currently the best at images with readable text, and you already have it), Midjourney (v8), Nano Banana (Google's Gemini image model), Flux, Ideogram, Adobe Firefly. Use for: illustrations, logos, concept art, Christmas cards.

Video models. A family that has grown up fast: clips now run from a few seconds to a couple of minutes, with sound and lip-sync generated alongside the picture. Products: Veo, Kling, Runway, Luma. Use for: social content, adverts, short montages. Feature-length is still a stretch; your daughter's birthday montage is not.

Voice and audio models. Three flavours: transcription (Whisper), voice generation (ElevenLabs), music generation (Suno, Udio).

Design and app-builder models. A newer family (often called vibe design) that turns a plain-English description, or a rough sketch, into a working interface: screens, layouts, and increasingly the actual code sitting behind them. Some hand you a picture to refine; some hand you a running app. Products: Google Stitch, Figma Make, v0, Lovable, Bolt, Framer, Uizard. Use for: mocking up an app or a website, a landing page for the padel tournament, turning a napkin sketch into something you can actually click.

Decision-making systems (reinforcement learning). A quieter family that learns by trial and error rather than by reading text. Famously: AlphaGo. Also used in robotics, self-driving cars, and game-playing AI. You're unlikely to use one directly, but worth knowing exists.

Scientific models. Trained to respect specific laws, not just spot patterns in data. Example: PINNs (Physics-Informed Neural Networks), used in fluid dynamics, engineering simulations, climate modelling. Not consumer-facing, but you'll hear more as industries build their own.

Rule-based systems. The oldest family. No learning, just explicit logic. Expert systems, spam filters, the decision tree in your car's fault diagnostics. Technically still AI, and in the right context, still the right answer.

Model, product, agent: three things often confused

  • A model is the underlying AI engine. Think GPT-5.5, Claude Sonnet, Midjourney v8. You don't usually interact with models directly.
  • A product is the branded app wrapping a model. ChatGPT, Claude.ai, the Midjourney Discord bot. This is what most people mean when they say "AI."
  • An agent is a model (or product) given a role, knowledge, tools, and a goal, and let loose to pursue it semi-autonomously. Agents are what the composed setups later in this manual use. A chat window is not an agent; an agent is a chat window with intent, memory, and hands.

This manual uses "Agent" as the friendly shorthand for "the thing doing the work," because the word has caught on generally. But when someone in the group says "I built an agent," they probably mean the third thing, not the first two.

Persona files: giving each agent a "who am I"

Every serious agent carries a persistent set of instructions defining who it is, what it does, what it doesn't do, and how it writes. Different tools and frameworks call this by different names, but it's the same idea:

  • CLAUDE.md: Claude Code projects
  • soul.md: Hermes and other character-first frameworks
  • AGENTS.md: emerging convention in several open-source stacks
  • .cursorrules: the Cursor IDE
  • .clinerules / .windsurfrules: Cline, Windsurf, similar tools
  • Custom Instructions: the field in ChatGPT Projects and Claude Projects
  • System prompt: the generic name that sits underneath all of the above

The file (or field) is where you write things like "You are a research analyst. Respond concisely. Never invent citations. Flag uncertainty explicitly." A 200-word persona file well done is the difference between a competent agent and a generic assistant. Done badly or not at all, the agent defaults to a vaguely helpful, verbose, uncertain non-specialist.

Parts availability. Models get repriced, restricted and discontinued with little notice; the group has watched one launch and vanish inside a single week. Treat any specific model like a part number: check it is still made before building around it, and keep your setup portable enough to swap it.

Foundation models and where this is heading. The newest concept in this space is foundation model: a single large model trained so broadly that it can be adapted to many tasks (text, images, audio, reasoning) rather than specialised for one. GPT-5.5, Gemini, Claude are all foundation models now. The taxonomy above is still useful, but expect the families to bleed into each other as one model learns to handle everything.

Plate 04 · Knowledge

3. Knowledge

What the agent already knows before you ask. The equivalent of everything you learned at school, plus every book you've read, plus every job you've had.

Human
school, university, work experience, reading, apprenticeship, years of doing
AI
pre-training (read most of the internet, learned language and world), fine-tuning (apprenticed in a speciality), RLHF (learned from human feedback), safety training (learned what not to do)
Why it matters
an AI knows an enormous amount in general, and absolutely nothing about you. Your company, your kids, your life, your business: not in its training. You'll have to supply that knowledge separately.
Drill-down

There are actually three kinds of knowledge in play, and confusing them is the source of most AI frustration.

Parametric knowledge. What's baked into the model during training. Fixed. Hard to change. This is why you can't "teach ChatGPT your company's jargon" by telling it once in a chat. The chat ends, and the knowledge evaporates.

Contextual knowledge. What's in the current conversation. Everything you've pasted in, uploaded, or told the model since the session started. Powerful, but limited by the context window (see Memory) and gone when the session ends.

Retrieved knowledge (RAG, for "retrieval-augmented generation"). External knowledge the agent can look up on demand, usually from a vector database you've set up. This is how serious business setups give agents access to CRM data, documents, and policies without retraining anything.

Most of us will lean on the first two. A serious personal setup eventually grows into the third.

The honest limitation: no matter how clever the AI, if the knowledge it needs isn't in one of these three buckets, it will either guess (plausibly, sometimes wrongly) or tell you it doesn't know. Noticing this distinction is half of working with AI well.

Plate 05 · Brain

4. Brain

The processing engine. What the thinking actually runs on.

Human
one biological brain, roughly 86 billion neurons, runs on sandwiches
AI
neural networks running on specialised chips (GPUs, TPUs). Bigger models mean more capable, but more expensive and slower.
Why it matters
not all AIs are the same class of mind. A small model running locally on your phone is not the same thing as GPT-5.5 on a cloud server. When answers feel flat, you might be on the wrong brain.
Drill-down

Brains come in tiers. You don't need to care about the technical details, but you should care that the differences are real.

Tier 1: consumer defaults. The free or cheapest paid tier of any major model. Fine for 80% of what most dads will ever ask. ChatGPT free, Claude free, Gemini free.

Tier 2: capable paid tiers. Where serious work happens. ChatGPT Plus, Claude Pro, Google AI Pro (the subscription Google used to call Gemini Advanced). Around £15–20 a month each. The models are bigger, smarter, and more tolerant of long or messy inputs.

Tier 3: pro and team tiers. For heavy users. Larger context windows, priority access, usually some agent-building tools bundled in. £80–250 a month.

Open-weight models, local and hosted. A separate family worth knowing about. You can run free open-weight models (Llama, Kimi, Qwen, DeepSeek, Mistral, GLM) on your own hardware via Ollama, LM Studio, and similar, or rent the same models on managed cloud via Ollama Cloud, Together, Groq, Fireworks, and Replicate. Two years ago these were fine for dabbling, not serious work. Today they are genuinely competitive for most tasks: summarisation, extraction, routing, a lot of reasoning, substantial writing. The frontier proprietary models (Claude, GPT-5.5, Gemini) still lead for the hardest reasoning, for agents that juggle many tools without dropping one, and for keeping very long documents straight, but the gap is narrower than it was, and closing. What changed: the open-weight model quality caught up, and managed hosting made running them practical at a fraction of frontier prices.

One useful rule: if a task matters and a free-tier answer feels weak, try the same prompt on a paid tier before concluding AI can't do it. Most "AI is useless" stories are actually "I used a small brain for a big problem" stories.

Hybrid routing: pick the cheapest brain that does the job. The operating pattern for anyone running multiple AI tasks is to match brain to task. Bulk work (summarising, categorising, triage, routing) goes to a fast cheap model, often open-weight on managed cloud. Client-facing writing, tricky reasoning, or analysis goes to a capable expensive one, usually frontier proprietary. Specialised jobs (vision, code, legal) go to a specialist. Paying top-tier prices to sort an inbox is the same mistake as hiring a partner to do the filing.

Plate 06 · Memory

5. Memory

What the agent holds onto, and what it forgets.

Human
working memory (what's in your head right now), long-term memory (years of accumulated experience), external memory (your notebook, Obsidian vault, that shoebox of receipts)
AI
context window (the current chat, wiped when it ends), parametric weights (baked in from training, almost impossible to update), external stores (RAG, vector databases, auto-memory systems that remember you across sessions)
Why it matters
memory is where AI becomes personal. A fresh chat forgets you every time. A properly configured one remembers. That's the difference between a tool and an assistant.
Drill-down

Working memory matters more than people realise. Every AI chat has a context window: the amount of text the model can hold in mind at once. A year ago that was a long novel; today's frontier models hold a stack of them, and the biggest open models claim a whole shelf. The wall is further away, but it's still a wall: hit the limit and the earliest parts of the conversation drop off, silently. If the AI suddenly "forgets" something you told it an hour ago, that's what happened.

Long-term memory has gone from party trick to standard kit. ChatGPT, Claude, and most serious agents now remember things about you across sessions: your preferences, your household context, your work, your name. Set this up once and every future conversation starts with context you'd otherwise have to re-explain. The thirty seconds you spend configuring memory pays back every time you open the app.

One quirk worth knowing: no amount of memory teaches the machine what day it is. Models default to the date their training ended; tell it the date, or give it a tool that can check.

External memory is where serious setups live. A personal knowledge base (Obsidian, Notion, Tana) that the AI can read into is the dad equivalent of having a second brain, and one that a well-built agent can query. A business-grade setup is this at scale: shared knowledge bases that every agent reads from, so nobody is ever the only one who knows something.

Plate 07 · Senses

6. Senses

How the agent perceives what you show it.

Human
sight, hearing, touch, smell, taste
AI
reads text, sees images and PDFs, hears audio, processes video. When a model can do several of these at once it's called multimodal.
Why it matters
you are not limited to typing. Take a photo of a receipt. Record a voice note. Drop in a PDF of a school letter. The modern AIs absorb all of it.
Drill-down

The unlock here is that modern chat apps (ChatGPT, Claude, Gemini) accept far more than text. A few things worth knowing:

Photos. Point your phone at something and ask about it. A label in Arabic, a broken fuse box, a receipt to expense, a maths question from your kid's homework. Often faster than typing a description.

Documents. Drop a PDF, a Word document, a scanned school letter. The AI reads it. Good for summaries, finding specific clauses, translating jargon, preparing questions before a meeting.

Voice. All major apps now accept voice input. For dads who are better talkers than typers (or driving, or doing the school run), this is the unlock. Voice in, structured text out.

Video and audio files. Less common in daily use but widely supported: meeting recordings, voice notes, dashcam clips. Upload, ask for a summary, get one in seconds.

A quiet revolution: if you're still only typing to AI, you're using a fraction of what it can do. The single most useful habit a dad can build this year is show it, don't describe it.

Plate 08 · Interface

7. Interface

How you reach the agent, and how it reaches back into your world.

Human
phone, keyboard, microphone, voice, eye contact
AI
chat apps, voice mode, code editors, browser agents, desktop agents, robots, raw APIs
Why it matters
choosing the right interface is half the battle. Chat for conversations. Voice for hands-free. Browser for web tasks. IDE for code. Matching the interface to the task is an underrated skill.
Drill-down

The interface is where AI meets your actual life. A few that matter for BDDs:

Chat apps (web and mobile). The default. ChatGPT, Claude, Gemini. Great for anything that fits a conversation. Almost certainly where you're starting.

Voice mode. The mobile app of any major chat tool, with voice turned on. Genuinely useful for the school run, the commute, a walk. Ask questions, dictate emails, get summaries hands-free.

Browser agents. Claude in Chrome, ChatGPT Atlas, Perplexity Comet. The AI takes control of a browser tab and uses the web for you: books flights, fills forms, finds things. No longer a novelty, still occasionally confused. Give it bounded jobs and watch it work.

Desktop agents. Cowork (the app you may be reading this in), Claude Code. The AI works with files on your computer. For anyone with folders of documents, photos, or spreadsheets to wrangle.

Code editors. Cursor, Claude Code, GitHub Copilot. The AI sits in the editor with you, reading and writing code. If you code, or want to start.

Agent-building platforms. If you want to build your own agent rather than rent a pre-made one, platforms like Hermes, Gumloop, Make, n8n, and Zapier stitch together models, tools, and triggers without coding. Make and Zapier are workflow-automation veterans that recently grew AI teeth. n8n is open-source and self-hostable. Gumloop and Hermes are AI-native. The learning curve is steeper than a chat app, lower than writing code.

APIs. The plumbing underneath all of the above. Relevant if you're building something, irrelevant otherwise.

Where agents live. An always-on agent needs an always-on home: anything from an old laptop in the study to a tiny single-board computer works. One house rule from the group's hard experience: while you are figuring it out, do not run it on your primary computer.

The single mistake most dads make is staying in the chat-app interface for things better done in a voice or browser interface. If you find yourself typing out what you could say, switch to voice. If you find yourself copy-pasting between the AI and a website, try a browser agent.

Plate 09 · Tools

8. Tools

What the agent can reach for to actually do the work, beyond talking.

Human
hammers, spreadsheets, search engines, email, your team
AI
web search, code execution, file handling, Gmail, Calendar, Slack, Linear, your Drive, image generators, video generators, and increasingly, other agents
Why it matters
a smart agent without tools can only talk. An agent with tools can actually do things: send the email, book the flight, draft the spreadsheet, file the receipt.
Drill-down

Tools are the difference between advice and action. A chat model with no tools can tell you what to say in an email. A chat model with tool access to your Gmail can draft it in your drafts folder for you to review and send. Same brain, different reach.

A useful rough taxonomy:

Information tools. Web search, database queries, knowledge-base retrieval. How the AI finds things.

Action tools. Send email, book calendar, write file, post to Slack, create a task in Linear. How the AI does things.

Creation tools. Generate image, generate video, generate code, generate audio. How the AI makes things.

Meta tools. Call another agent. The thing that turns one helpful AI into a multi-agent workspace.

For most dads, the highest-value tools to wire up first are these: web search (real-time answers), Calendar (scheduling help), Gmail or equivalent (inbox triage), and a file-handling interface (let it work with your documents). Each of those, once connected, compounds what the agent can do.

A word you'll see on the box: these connectors now run on an industry-wide plumbing standard called MCP (Model Context Protocol). Every major AI company has adopted it, and there are tens of thousands of connectors built on it. You don't need to know how it works, any more than you need to know how a USB plug works. If an app says it has an MCP connector, your AI can probably reach it.

One caution before you hand an agent your inbox: anything that reads messages from strangers can be tricked by a cleverly worded message (the trade calls it prompt injection). The manual's companion piece on when AI is the wrong tool covers it; the short version is, keep a human between the AI and anything that matters. See the other half of the manual below.

The group's AI tools list is maintained collectively and kept current. Worth bookmarking.

DIY or hire? Wiring up search, calendar and mail is genuinely a settings-page job. Building a private assistant over your company's documents is a project: doable yourself with patience, or a fair thing to pay someone for. The line sits roughly where the last mile begins.

Specific products named in the drill-downs are current as of July 2026. A reference point, not a forever list.

Section 03 · AssemblyHow the blocks compose

Once you see the eight blocks, the next thing to notice is that they compose.

A single agent with its own knowledge, memory, senses, interface, and tools is already useful. That's a chat window with ChatGPT or Claude. Powerful on its own.

But nothing stops you from running several agents in parallel. One handles email. One handles calendar. One handles research. Each has its own role (intent delegated from you), its own knowledge, its own tools. They share a memory layer, a knowledge base everyone reads from. They can call each other when they need to. The whole network runs toward a larger intent: "keep my week on track," or "find, qualify, and follow up on new business leads."

That's what agentic workflows are. Not a different thing: the same eight blocks, plural and connected. The industry has already made the move from prompts (one ask, one answer) to workflows (ongoing, self-directed, multi-step). That's what people mean by "AI agents": the blocks composed and pointed at a standing goal.

This matters for the reader in two ways:

Section 04 · Operating altitudesWorked examples: three altitudes

Instead of three parallel tasks, three altitudes of the same framework. Same blocks throughout. What changes is how many instances of each, and how deeply they're used.

Where are you flying today?

Altitude 1: One agent, one chat

"I want to understand this article someone sent me."

Someone in the group chat drops a link. It's a 3,000-word piece from a newspaper you don't usually read, about a subject you don't usually think about, and you don't have the time or energy to read it properly. You want the shape of it in two minutes so you can respond without pretending.

Block by block
  1. Intent. You, wanting: "I want this article in plain English, shorter, and I want to know if it's worth reading properly."
  2. Agent. One general language model. ChatGPT, Claude, or Gemini: it genuinely doesn't matter which one. The default app on your phone is fine.
  3. Knowledge. Nothing extra. The model's general training is enough to understand the article.
  4. Brain. Whichever tier you're on. The free tier handles this without breaking a sweat.
  5. Memory. Just the current chat.
  6. Senses. You paste the article text in. Or, if it's a screenshot or a PDF, you drop that in.
  7. Interface. The phone app. On the sofa, one-handed.
  8. Tools. None. Pure conversation.

Result: a two-paragraph summary with the key claims, a note on what's well-argued and what's thin, and a recommendation on whether the full piece is worth your time. Thirty seconds from paste to answer.

The point: this is AI at its most useful and least impressive. One chat window, one conversation, one article. No setup, no configuration, no agents. This is enough. The majority of us will get enormous value out of nothing more than this, forever.

Altitude 2: One agent, real work

"Plan a family weekend in Ras Al Khaimah."

It's April. The kids are climbing the walls. You've half-decided to take them somewhere within a three-hour drive for a long weekend, but the three-browser-tab research phase you'd normally go through has not happened. You want something sketched out before dinner.

Block by block
  1. Intent. You, with a real goal: "Two adults, two kids, three nights, leaving Friday morning. Budget around AED 3,000 for the hotel portion. Kids need a pool they'd actually swim in. One adult-only dinner."
  2. Agent. Still one general language model, but now a capable tier.
  3. Knowledge. The model knows RAK in general: hotels, beaches, the main resorts, how long the drive is. It does not know your budget, your kids' ages, your dates, or that you promised someone a decent spa. You'll add that context.
  4. Brain. Cloud model, standard paid tier.
  5. Memory. Set up household context once in the app's memory (family size, ages, allergies, preferences). Every future trip-planning chat starts knowing this.
  6. Senses. Upload photos from last year's trip to somewhere you loved. Drop in a PDF of a brochure that caught your eye.
  7. Interface. Start on the laptop. Switch to voice on the drive to pick up the kids, to refine the shortlist out loud.
  8. Tools. Web search for current rates and availability. Calendar integration to block the dates. Booking handoff at the end if you want to go straight to reservation.

Result: a shortlist of three resorts with quick pros and cons, a suggested itinerary for the weekend, and a booking link for your top pick. Twenty minutes of back-and-forth where two hours of tabs used to live.

The point: this is where most of us should be within a month of reading this manual. One capable agent, properly configured, with the right tools available. No multi-agent systems, no code, no CRM. AI as a competent personal fixer.

Altitude 3: Many agents, composed

"Run the top of my sales funnel."

You've got a business. Leads come in from a website form, LinkedIn, referrals, and the occasional speaking event. Someone (you, or an assistant, or nobody at all) has to qualify them, research the company, send a first response, schedule a call, take notes on the call, send a follow-up with a proposal, and remind you about it all when the time comes. This is a meaningful chunk of someone's week.

Block by block
  1. Intent. Yours, at the business level: "New leads get qualified and routed within an hour. Research gets done before my first call. Follow-ups don't drop. I see a weekly summary on Friday."
  2. Agents. Several, each with a specific role:
    • A lead-qualifier reads each inbound form submission, scores it against your ideal-customer profile, and routes high-value leads to you immediately.
    • A research agent pulls public information on the prospect's company: size, sector, recent news, who else you know there.
    • A content drafter writes a first response and a tailored follow-up, matching your tone.
    • A call analyst transcribes and summarises the conversation, flagging next actions.
    • A reporting agent pulls it all into a Friday digest.
    • An orchestrator decides which agent runs when, and passes information between them.
  3. Knowledge. Shared knowledge base with your CRM data, past deals, product docs, and tone-of-voice guidelines. Each agent reads what's relevant to its role.
  4. Brain. A mix. A cheap fast model for triage and routing. A capable model for writing. A specialised model for call analysis.
  5. Memory. Shared long-term store so agents don't re-learn things. Per-agent working memory for each task.
  6. Senses. Inbound emails, call transcripts, uploaded documents, signals from your CRM, LinkedIn activity if you wire it in.
  7. Interface. Mostly invisible. The agents work in the background. You see summaries in Slack, a dashboard on your laptop, the Friday report in your inbox.
  8. Tools. CRM write access, email send, calendar booking, web search, document generation, and connections to each other (agents calling agents).

Result: a working setup one of us runs today, roughly. Qualification goes from "hours" to "instant." Research is done before every first call. Follow-ups stop dropping. The Friday report arrives without you doing anything. The work didn't vanish; it just isn't all yours anymore.

The point: this is the shape of where the industry is going, and it's less exotic than it looks. The framework hasn't changed. There are just more instances of each block, networked. If you understand the eight blocks, you understand what's happening here, even if you never build one yourself.

A note on altitudes. None of these three is "better" than the others. Altitude 1 is the right altitude for "I want to understand this article." Altitude 3 would be absurd overkill for the same task. The skill isn't always flying higher; it's flying at the right altitude for the task in front of you.

Section 05 · CautionsThe other half of the manual

Everything above is relentlessly upbeat, on purpose. It exists to get you off the fence and using the thing. But a manual that only tells you when to reach for a tool, and never when to put it down, is selling you a hammer and calling every problem a nail. So here is the other half, in two parts, both grown from questions raised in the group.

When AI is the wrong tool

What you do with AI, and when you shouldn't. Grown from a question raised in the group.

A language model doesn't look things up and doesn't calculate. It predicts what sounds right, very fast, and it is confident in exactly the same tone whether it's right or wrong. So before you hand it a task, ask one question: would I accept a confident guess here? If yes (a draft, a summary, a brainstorm), crack on. If no, slow down.

The five places to put it down

When the answer has to be exactly right, and checkable. School-fees totals, VAT, medicine doses by weight. That's arithmetic wearing a suit. Use the boring tool: a spreadsheet, a formula, a calculator. Ask the AI to set the formula up, not to produce the final number nobody rechecks.

When someone is accountable for the decision. Medical, legal, financial, hiring. Use AI to prepare. Keep the decision, and the accountability, human. "The chatbot said so" is not a defence you want to be running anywhere.

When there is no undo. The email to the whole client list, the transfer, the legal letter. On no-take-backs actions the review step is not optional, and it is not the AI's job. It's yours.

When something is trying to trick it. An agent reading messages from strangers can be talked into doing something daft by a cleverly worded message (prompt injection). Anywhere the input is hostile, keep a human between the AI and anything that matters.

When you need the same answer every time, and an explanation. Who qualifies for the discount, how the penalty is calculated. Ask a model twice and you can get two confident, different answers. That's what rules are for. You don't bring a chainsaw to prune the roses, however much you enjoy the chainsaw.

AI defence: the flip side

What AI does to you. Grown from a question put to the group.

Every one of us is now both a user of AI and a target of it. Fluent, personalised, and endless is the new default for anything unsolicited, so judge less by polish and more by provenance: do I know where this came from, and did I go looking for it? You don't need to become paranoid. You need to be slightly harder to fool than the average target, which is most of the game.

The habits that cover most of it

Treat anything unsolicited as unproven. If "your bank" writes, go to the bank the way you always do, never through their link. The oldest advice in the book, it just matters more now that the bait is well written.

Agree a family code word. The scam that will reach a dad in this group is the urgent voice: a call or voice note, your kid or a colleague, distressed, needing money or a code right now. The voice may genuinely be cloned. If it can't give the code word, it isn't them, however much it sounds like them. Hang up, call back on the number you already have. Anything engineered to rush you is telling you to slow down.

Guard what you paste. Nothing goes into a chatbot you'd mind seeing on a billboard: no passwords, card numbers, client secrets, passport scans. Find the training-data setting in every app you use and set it the way you actually want. Treat an AI asking to connect to your email the way you'd treat a stranger asking for your keys: fine for the ones you trust, revocable when you change your mind.

The workshop shelf

Eight companion guides, each a ten-minute read, each grown from what this group actually hit:

If you got from a Commodore 64 to here, this won't break you. The parts are fewer than you think.

Another way to slice this: the Six Verbs

A parallel framework. When the eight-block view feels abstract, start from what you're trying to do.

Understand.
Absorb and simplify information. Articles, documents, jargon, voice notes, whole books. The "make this clearer" use case.
Create.
Generate new things. Text, images, code, music, a first draft of anything. The "I need a starting point" use case.
Visualize.
Make ideas visible. Diagrams, mock-ups, logos, charts, scenes. The "let me see it" use case.
Build.
Assemble working things. Websites, scripts, automations, templates, agents. The "make it real" use case.
Automate.
Put work on autopilot. Recurring tasks, triggers, workflows, inbox rules, multi-agent loops. The "I only want to do this once" use case.
Decide.
Narrow options, weigh trade-offs, recommend. Comparisons, risk analyses, prioritisation, second opinions. The "help me choose" use case.

The full Six Verbs companion is on the shelf. Pick the verb that matches what you're trying to do, and come back to the blocks when you're ready to build.