The premise
Most AI training fails the same way: one generic course is rolled out to the whole company, everyone watches the same overview of prompts and policies, and a month later almost nothing has changed in how work actually gets done. The content was not wrong. It was just not aimed at anyone in particular.
AI training for employees works when it is built around the job, not around the tool. A sales rep, an HR partner, a support agent, and a backend engineer all touch AI, but the realistic goal, the risks that matter, and the proof that training landed are different for each. This article walks through what good looks like role by role, and why measuring a competency baseline before you start is the part most programs skip.
Why generic AI training underperforms
A single company-wide course optimizes for coverage, not for change. It teaches the average employee, who does not exist.
Generic training tends to stop at awareness. People learn what a large language model is, see a few prompt examples, and hear a list of things not to paste into a chatbot. That is useful context, but it rarely survives contact with a real workflow. The moment someone has to apply it to a quote, a candidate shortlist, or a flaky test, the gap between watching a demo and doing the work becomes obvious.
The deeper issue is relevance. The risks a finance leader needs to weigh (model error in a board number, data residency, vendor lock-in) have almost nothing in common with the risks a support agent faces (confidently wrong answers sent to a customer, leaked account data in a transcript). When the same slides try to cover both, each audience gets a diluted version that is too abstract to act on and too generic to remember.
Role-based training flips the design. You start from a small number of jobs people actually do every week, decide what competent use of AI looks like in those jobs, and teach to that. Coverage drops on paper, but the share of people who change how they work goes up, which is the only number that matters.

Measure the baseline before you teach anything
Before the first session, capture where each role actually stands. A short, practical baseline (not a quiz on definitions) tells you what people can already do: can a marketer get a usable first draft and spot where it is wrong, can an analyst check an AI answer against a source, can an engineer review AI-generated code rather than paste it blind. Without this, you cannot tell whether the training worked or whether the people who look competent afterwards were already good before it started.
Keep the baseline grounded in tasks. Ask people to complete a representative task with AI and rate the output yourself against a simple rubric: was the prompt specific, was the result checked, were the obvious risks handled. Record a number per role and per competency. This is your before measure, and it doubles as a needs analysis: the lowest scores tell you where to spend the most time.
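As a minimal sketch of how those ratings might be recorded, assume each rated task produces one row with a role, a competency, and a rubric score. The 0 to 4 scale, the competency names, and the sample rows below are illustrative assumptions, not data from a real program.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rubric ratings: (role, competency, score on an assumed 0-4 scale).
ratings = [
    ("marketing", "specific_prompt", 3),
    ("marketing", "result_checked", 1),
    ("marketing", "result_checked", 2),
    ("engineering", "specific_prompt", 3),
    ("engineering", "risks_handled", 2),
    ("support", "result_checked", 1),
    ("support", "risks_handled", 1),
]


def baseline_scores(rows):
    """Average the rubric scores for each (role, competency) pair."""
    grouped = defaultdict(list)
    for role, competency, score in rows:
        grouped[(role, competency)].append(score)
    return {key: mean(scores) for key, scores in grouped.items()}


def weakest_first(scores):
    """Sort (role, competency) pairs from lowest to highest average score."""
    return sorted(scores.items(), key=lambda item: item[1])


baseline = baseline_scores(ratings)

# The top of this list is where the needs analysis says to spend the most time.
for (role, competency), score in weakest_first(baseline):
    print(f"{role:12} {competency:16} {score:.1f}")
```

Sorting from weakest to strongest keeps the needs analysis in the same table as the before measure, so the two cannot drift apart.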
Run the same task again a few weeks after training for the after measure, ideally on a fresh example so people cannot rehearse. The gap between before and after, broken out by role, is the honest signal of whether the program changed behavior. If a role barely moves, that is information, not failure: it usually means the training was pitched at the wrong altitude for that job.
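The before and after comparison is simple arithmetic, and a rough sketch might look like the following. The per-role averages, the 0 to 4 scale, and the `MIN_LIFT` threshold for "barely moved" are all assumptions chosen for illustration; a real program would pick its own cutoff.

```python
# Hypothetical average rubric scores per role (assumed 0-4 scale).
before = {"leadership": 1.5, "operations": 2.0, "support": 1.0, "engineering": 2.5}
after = {"leadership": 2.5, "operations": 2.1, "support": 2.5, "engineering": 3.0}

# Assumed threshold: a lift smaller than this counts as "barely moved".
MIN_LIFT = 0.3


def lift_by_role(before_scores, after_scores):
    """Return after-minus-before for every role measured both times."""
    shared = before_scores.keys() & after_scores.keys()
    return {
        role: round(after_scores[role] - before_scores[role], 2)
        for role in sorted(shared)
    }


def barely_moved(lifts, threshold=MIN_LIFT):
    """List the roles whose lift fell below the threshold."""
    return [role for role, lift in lifts.items() if lift < threshold]


lifts = lift_by_role(before, after)
for role, lift in lifts.items():
    print(f"{role:12} {lift:+.1f}")

# Roles flagged here are candidates for re-pitching the training, not for blame.
print("Review the pitch for:", barely_moved(lifts))
```

Only roles measured both times are compared, so a role missing from either round drops out instead of showing a misleading gap.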

What good looks like, job by job
Leadership and executives. The realistic goal is judgment, not keyboard skills: knowing where AI is worth the investment, what it cannot be trusted with, and how to ask for evidence. The risks they must understand are governance and accountability (who owns a model decision, how regulated use cases are controlled, where data goes) and the cost of theatre projects that demo well and deliver nothing. The hands-on outcome is a leader who can read an AI proposal and pressure-test it: what is the baseline, what is the measured lift, what happens when the model is wrong.
Operations. The goal is to find and redesign the repetitive, rules-light work where AI removes drudgery (triage, extraction, drafting, summarizing) without breaking a controlled process. The risks are silent errors that compound across a workflow and over-automation of steps that needed a human check. The hands-on outcome is one real process mapped, one step safely augmented, and a simple control that catches the model when it is wrong.
Sales and marketing. The goal is faster, better first drafts and research, not autopilot. The risks are off-brand or factually wrong output sent to a prospect, and quietly handing customer data to a tool that should not have it. The hands-on outcome is a personalized, accurate piece of outreach or content produced in a fraction of the usual time and visibly checked before it ships.
HR and L&D. The goal is to speed up drafting and synthesis (job descriptions, summaries, learning content) while keeping fairness and privacy intact. The risks are bias in anything touching hiring or evaluation, and sensitive employee data leaving controlled systems. The hands-on outcome is a faster draft workflow plus a clear rule for where AI must not make or rank a people decision.
Support. The goal is faster, more consistent responses with the human firmly in the loop. The risks are confidently wrong answers and leaked account data in transcripts. The hands-on outcome is an agent who can use AI to draft and research a reply, then verify it against the real account before sending.
Technical and engineering. The goal is to use AI as a force multiplier across code, tests, and docs while owning the result. The risks are insecure or subtly broken generated code, license and data leakage, and skill atrophy from pasting without review. The hands-on outcome is an engineer who ships AI-assisted work they have read, tested, and can defend in review.

Role-based training that changes the work
We design training around the jobs your people do, measure before and after, and aim every session at a hands-on outcome rather than awareness.
Start from a baseline
We run a short, task-based assessment per role to capture where people actually stand, then use the lowest scores to decide where the program spends its time. The same task is rerun after training as an honest after measure.
Tracks, not one course
Leadership, operations, sales and marketing, HR and L&D, support, and engineering each get a track with their own realistic goal, the risks that apply to them, and exercises drawn from their real work, not a shared overview.
Practice on real tasks
People train on their own workflows and walk out with one concrete result: a checked draft, an augmented process step with a control, a verified support reply, reviewed AI-assisted code. Competence is shown, not asserted.
Measurable lift, role by role
Good training shows up as a visible move from the before baseline to the after measure in each role, and in work that ships faster without new risk.
A few weeks in, you should see specific changes: leaders asking for baselines and measured lift before approving AI spend, operations running one process with a human-in-the-loop control, sales and support shipping checked output faster, HR with a clear line around people decisions, and engineers reviewing rather than pasting. Each is tied to a number that moved from your baseline.
Just as important is what stops happening: confident-but-wrong outputs reach customers less often, sensitive data no longer quietly leaves controlled systems, and theatre projects that demo well and deliver nothing stop appearing. Role-based training does not just raise capability; it also lowers risk, because each group learns the failure modes that actually apply to it.

AI training questions we get asked
Direct answers to the questions we get asked the most. If yours isn't covered, write to the team.