Adopting AI in Audit: A Practitioner's Honest Playbook (What Works, What Doesn't)

There's a lot of noise about AI in audit. Most of it is either marketing ("automate 80% of your audit!") or fear ("AI will replace CAs!"). The reality, for the practising Indian Chartered Accountant in 2026, sits between the two extremes, and depends heavily on which AI tool and which task.

This post is the practitioner's playbook. What public LLMs (ChatGPT, Claude, Perplexity, Grok) are actually good at for audit work. Where they fail. The DPDPA-and-client-data trap most firms haven't thought through. And when an audit-grade alternative like CORAA makes sense versus stitching together public tools.

No theory. No marketing. Just the trade-offs.

The four tools most CAs are evaluating

Before going into what works, here's the lay of the land as of May 2026:

Tool	Subscription cost (India)	Context window	Long-term memory
ChatGPT Plus	~₹1,999 / month per user	128K tokens (GPT-4o)	Persistent memory across sessions (opt-in)
Claude Pro	~₹1,700 / month per user	200K tokens (Claude Opus / Sonnet)	No persistent memory; Projects feature persists context within a project
Perplexity Pro	~₹1,600 / month per user	~32K tokens (varies by model)	Limited
Grok (X Premium+)	~₹1,300 / month per user (bundled with X)	131K tokens	Limited
CORAA	~₹2,000-3,000 per entity per year (unlimited users)	Audit-specific, full ledger ingestion	Per-engagement persistent

A 10-person CA firm subscribing to all four (ChatGPT, Claude, Perplexity, Grok) for one user each pays ~₹6,600/month, or ~₹80K/year. With every staff member subscribed (10×), it's ~₹8 lakh/year, without security guarantees on client data.

That cost framing matters. Public LLMs are not cheap when scaled across a team — and they're not designed for audit-data confidentiality.

What public LLMs are GENUINELY good at for audit

Despite the limitations, public LLMs do specific things well:

1. Drafting narrative content

Engagement letter drafts. Management Representation Letter language. CARO 2020 observation narratives. KAM paragraph drafts. Subsequent events working paper text. Going concern discussion drafts.

The LLM is essentially a fast typist. Give it the facts, get a competent first draft. Auditor reviews, edits, signs off.

Time saved per engagement: 2-4 hours of drafting work that becomes 30-45 minutes of review.

2. Standards research and summarisation

"What does SA 540 require for accounting estimates?" "Walk me through the Ind AS 115 5-step model." "What's the difference between SA 240 paragraph 26 and paragraph 33?"

Claude, ChatGPT, Perplexity all answer these well — Perplexity with citations from the ICAI website.

Time saved: 15-30 minutes per research question vs reading the standard yourself.

3. Templates and checklists

"Generate a checklist for SA 530 audit sampling working paper." "What should a Section 188 RPT engagement memo cover?" "Create a list of red flags for SA 240 journal entry testing."

LLMs are excellent at expanding a topic into a structured list.

4. Excel formulas and macros

"Write a SUMIFS formula to total transactions above ₹10,000 in column D where column F = 'CASH'." "Give me a VBA macro to highlight duplicate vendor names in column A."

LLMs are excellent at code / formula generation.
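Generated formulas still need checking before they go near a working paper. As a minimal sketch, assuming a small synthetic sample (the rows and the `sumifs_check` helper are hypothetical, not taken from any client file), the same condition as the SUMIFS request above can be recomputed in Python and compared with what the spreadsheet returns:

```python
# Synthetic rows standing in for column D (amount) and column F (mode).
# These are made-up figures, not client data.
rows = [
    {"amount": 12_000, "mode": "CASH"},
    {"amount": 9_500, "mode": "CASH"},
    {"amount": 25_000, "mode": "BANK"},
    {"amount": 10_000, "mode": "CASH"},  # exactly 10,000, so excluded by "above"
    {"amount": 40_000, "mode": "CASH"},
]


def sumifs_check(rows, threshold=10_000, mode="CASH"):
    """Total of amounts strictly above `threshold` where the mode matches."""
    return sum(
        r["amount"] for r in rows if r["amount"] > threshold and r["mode"] == mode
    )


expected_total = sumifs_check(rows)
print(expected_total)  # 52000
```

If the spreadsheet formula gives a different figure on the same sample (for instance because it used ">=" where the request said "above"), the formula is wrong, not the sample.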

5. Translation and simplification

Converting accounting jargon into client-friendly language. Translating between Ind AS terminology and AS terminology for mixed audiences. Simplifying complex audit findings for management presentations.

6. Brainstorming and creative tasks

Naming a new service line. Brainstorming KAMs for a listed audit. Generating discussion topics for a partner meeting on quality. Drafting LinkedIn posts about a recent regulatory update.

What public LLMs are TERRIBLE at for audit

Equally important — what NOT to use public LLMs for:

1. Client data analysis

This is the big one. Never paste a client's ledger, trial balance, GST returns, or any other confidential data into ChatGPT, Claude, Perplexity, or Grok.

Three reasons:

(a) DPDPA exposure. Section 8(5) of the DPDP Act 2023 requires "reasonable security safeguards" for personal data. Section 8(6) requires notifying the Data Protection Board and affected individuals of a personal data breach, and the DPDP Rules 2025 set a 72-hour window for the detailed report to the Board. Pasting payroll data with employee names + PAN + bank accounts into a public LLM is a breach of contractual obligations to your client AND likely a DPDPA failure on the firm's part.

(b) Confidentiality obligations. Every engagement letter under SA 210 includes a confidentiality clause. Client data shared with a third-party LLM provider violates that.

(c) No audit trail. When you paste data into ChatGPT for analysis, there's no record of what you sent, what came back, when, or under what prompt. Five years later when a peer reviewer or NFRA inspector asks for the basis of a finding, you can't reproduce the analysis.

The practical rule: Treat ChatGPT, Claude, Perplexity, Grok exactly like a partner you'd never share client data with. Anonymise everything you put into them.
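Part of the anonymising step can be mechanised. The sketch below is a minimal illustration, not a complete anonymiser: the pattern names and the `mask_identifiers` helper are hypothetical, and regex masking only catches structured identifiers (PAN-like strings, long digit runs, email addresses). Client names, addresses, and anything that identifies the client by context still have to be removed by hand.

```python
import re

# Hypothetical patterns for illustration; extend for GSTIN, Aadhaar, and so on.
PAN_RE = re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b")          # 5 letters, 4 digits, 1 letter
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
ACCOUNT_RE = re.compile(r"\b\d{9,18}\b")                    # long digit runs (also hits phone numbers)


def mask_identifiers(text: str) -> str:
    """Replace PAN-like, email-like, and account-number-like tokens with placeholders."""
    text = PAN_RE.sub("[PAN]", text)
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = ACCOUNT_RE.sub("[ACCOUNT]", text)
    return text


sample = "Salary to ABCDE1234F, a/c 123456789012, contact payroll@example.com"
masked = mask_identifiers(sample)
print(masked)
# Salary to [PAN], a/c [ACCOUNT], contact [EMAIL]
```

Over-masking (a phone number replaced by "[ACCOUNT]") is the safer failure mode here; under-masking is the one that creates exposure.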

2. Anything requiring 100% accuracy

LLMs hallucinate. They generate confident-sounding text that's plausible-but-wrong about 5-15% of the time on factual questions. For audit work, this is fatal:

"What is the threshold for tax audit under Section 44AB?" The LLM might confidently say "₹2 crore" instead of "₹1 crore".
"When does CARO 2020 apply?" The LLM might give the FY 2020-21 effective date (applicability was deferred to FY 2021-22).
"What is the penalty for a Section 269ST breach?" The LLM might cite the wrong penalty provision instead of Section 271DA.

For substantive citations, always verify against the source — Bare Act, ICAI Standards, MCA notifications. Use the LLM for direction-of-research, not for citation.

3. Anything where the answer determines liability

The auditor's opinion form (unmodified / qualified / adverse / disclaimer). Whether to file Form ADT-4. Whether a transaction is a related party. Whether materiality is breached. Whether going concern is doubtful.

These are professional judgement calls. The auditor is liable. The LLM can help structure the thinking, but the decision is yours.

A CA who relies on an LLM's opinion to skip filing ADT-4 within 60 days has no defence at NFRA proceedings. The LLM isn't a defendant.

4. Anything that needs persistent memory across engagements

Each ChatGPT / Claude / Perplexity session starts fresh. Even ChatGPT's "memory" feature retains general user preferences, not engagement-specific context.

If you spent 30 minutes today walking the LLM through your client's business, the structure of their group, the related-party map, and the specific risks — tomorrow you start from scratch. There's no "engagement file" the LLM remembers.

(Claude Projects partially addresses this — you can group conversations under a Project with shared context. But it's still session-bounded and doesn't persist forever.)

For audit work, where 80% of the partner's value is engagement context built up over weeks, this is a major limitation.

5. Generating audit documentation that survives review

A CARO 2020 working paper. A SA 240 fraud testing log. A Section 143(12) communication letter.

LLMs can DRAFT these. They cannot create the audit trail showing when the analysis was performed, what data was examined, who performed the work, and what the response was. That trail has to come from your audit tool (CORAA / Caseware / CCH / Excel + filing system) — the LLM is only the first-draft generator.

The 7-rule adoption framework

For a firm seriously adopting AI tools in audit, the rules:

1. Never paste client data into a public LLM. Anonymise or use only synthetic / publicly-available data for any LLM testing.
2. Treat LLM output as a draft, always. Auditor reviews, edits, takes responsibility. The LLM didn't sign the report; you did.
3. Verify factual citations against the source. Bare Act, ICAI Standards, MCA notifications, NFRA orders. Not what the LLM says.
4. Document the LLM use in working papers: the prompt, the substance of the response, and what you changed. This follows the SA 230 documentation expectation.
5. Don't subscribe to all four tools. Pick one or two based on actual usage. ChatGPT Plus + Claude Pro is the practical pair for most CAs.
6. For client-data analysis, use audit-grade tools: CORAA, or other audit-tech with India hosting and DPDPA compliance. Not public LLMs.
7. Train staff explicitly on what NOT to put into LLMs. The biggest risk is junior staff pasting raw client data because they don't know the limits.
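The documentation rule is easier to follow when the record has a fixed shape. A minimal sketch of such a record follows; the `LLMUsageRecord` schema, its field names, and the sample values are all hypothetical, and the hash is only a simple way to show that an entry has not been edited after the fact, not a substitute for the firm's filing system.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class LLMUsageRecord:
    """One working-paper entry describing a single LLM interaction (hypothetical schema)."""

    engagement_ref: str      # internal reference, not the client name
    tool: str                # which LLM was used
    prompt: str              # the anonymised prompt as sent
    response_summary: str    # substance of the response, not necessarily verbatim
    reviewer_changes: str    # what the auditor changed before using the output
    prepared_by: str
    timestamp_utc: str = ""

    def __post_init__(self):
        # Stamp the record at creation time if no timestamp was supplied.
        if not self.timestamp_utc:
            self.timestamp_utc = datetime.now(timezone.utc).isoformat()

    def fingerprint(self) -> str:
        """SHA-256 over the record's JSON form, so later edits are detectable."""
        payload = json.dumps(asdict(self), sort_keys=True, ensure_ascii=False)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


record = LLMUsageRecord(
    engagement_ref="ENG-2026-014",
    tool="Claude Pro",
    prompt="Draft a KAM paragraph on revenue recognition for a SaaS company.",
    response_summary="Three-paragraph KAM draft: risk, audit response, reference to notes.",
    reviewer_changes="Rewrote the audit response; removed an unverified paragraph citation.",
    prepared_by="Audit senior (initials)",
)
print(json.dumps(asdict(record), indent=2, ensure_ascii=False))
print(record.fingerprint())
```

Storing the printed JSON and its fingerprint alongside the working paper gives a reviewer the prompt, the substance of the response, and the changes in one place.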

Subscription cost — honest math

For a mid-tier firm (5 partners, 8 managers, 15 staff), the public LLM subscription stack:

Partner level: 5 partners × (Claude Pro ₹1,700 + ChatGPT Plus ₹1,999) ≈ ₹18,500 / month
Manager level: 8 managers × ChatGPT Plus only (₹1,999) = ₹15,992 / month
Staff level: 15 staff × ChatGPT free (works for basic tasks) = ₹0

Total: ~₹34,500 / month, or ~₹4.1 lakh / year for public LLM access across the team.
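The total can be reproduced, and adapted to a different headcount, with a few lines of Python. This sketch uses only the list prices and headcounts quoted in this section; the `PRICES` and `TIERS` names are illustrative.

```python
# Monthly list prices in rupees, as quoted in this section.
PRICES = {"Claude Pro": 1_700, "ChatGPT Plus": 1_999, "ChatGPT free": 0}

# (tier, headcount, tools each person in the tier gets)
TIERS = [
    ("Partner", 5, ["Claude Pro", "ChatGPT Plus"]),
    ("Manager", 8, ["ChatGPT Plus"]),
    ("Staff", 15, ["ChatGPT free"]),
]


def monthly_cost(tiers, prices):
    """Return a mapping of tier name to monthly subscription cost."""
    return {
        name: count * sum(prices[tool] for tool in tools)
        for name, count, tools in tiers
    }


by_tier = monthly_cost(TIERS, PRICES)
total_monthly = sum(by_tier.values())
total_annual = total_monthly * 12

print(by_tier)        # {'Partner': 18495, 'Manager': 15992, 'Staff': 0}
print(total_monthly)  # 34487
print(total_annual)   # 413844
```

Changing a headcount or swapping a tool in `TIERS` shows how quickly the annual figure moves when more people are given paid plans.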

What this gets you: drafting, research, brainstorming, code generation, narrative content. Not client-data analysis. Not audit trail. Not engagement persistence.

For comparison, CORAA's mid-tier plan (100 entities): ₹2.4 lakh / year, unlimited users, full client-data analysis with audit trail, India-hosted, DPDPA-compliant. The two are complementary — CORAA for client-data work, public LLMs for everything else.

The smart firm runs both: ~₹4 lakh on public LLMs + ~₹2.4 lakh on CORAA = ₹6.4 lakh / year combined. Versus a junior CA hire at ₹6-10 lakh that provides ~1,800 hours of capacity. The AI stack provides 10,000+ hours of compressed time across the team. Different value proposition.
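One way to read this comparison is as a break-even: how many hours a year the combined stack has to save before it matches the junior hire on cost per hour. The sketch below uses only the figures quoted above; the `breakeven_hours` helper is hypothetical, and it says nothing about whether the hours saved are of comparable quality to a hire's.

```python
def breakeven_hours(stack_annual, hire_cost, hire_hours):
    """Hours the stack must save per year to match the hire's cost per hour."""
    return stack_annual * hire_hours / hire_cost


stack_annual = 400_000 + 240_000   # public LLMs + CORAA, rupees per year
hire_hours = 1_800                 # capacity of one junior hire per year

for hire_cost in (600_000, 1_000_000):
    hourly = hire_cost / hire_hours
    hours = breakeven_hours(stack_annual, hire_cost, hire_hours)
    print(f"hire at ₹{hire_cost:,}: ~₹{hourly:,.0f}/hour, break-even ~{hours:,.0f} hours saved")
```

On these numbers the stack pays for itself once it saves somewhere between roughly 1,150 and 1,920 hours a year across the team.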

See the AI ROI Calculator to model your firm's specific numbers.

Adoption sequence — 90 days

For a firm that's never used AI tools, the rollout sequence that works:

Days 1-30 — Partner trials

2-3 partners try Claude Pro for ~1 hour / day on real (anonymised) tasks
Document what works (research, drafting) and what doesn't
Decide which subscriptions to standardise on

Days 31-60 — Manager onboarding

Roll out chosen tools to managers
Training session on the 7-rule framework (especially Rule 1: no client data)
Manager-level use cases: research, drafting, brainstorming, code generation

Days 61-90 — Staff onboarding

Selective rollout to senior staff
Explicit training on what NOT to do
Maintain a list of "approved" and "prohibited" tasks

Days 90+ — Ongoing

Quarterly review of usage and value
Annual cost-benefit review
Add specialised tools (audit-grade) where the public LLMs hit limits

For firms that want a structured starting point, the CORAA AI Lab has practical guides on using Claude, ChatGPT, and Perplexity for specific audit tasks — including the prompts that work and the prompts that don't.

When does an audit-grade alternative make sense?

Three triggers tell you the public-LLM stack isn't enough:

You're routinely tempted to paste client data into ChatGPT. This is a DPDPA / confidentiality red flag. Move client-data work to an India-hosted audit-grade tool.
You're rebuilding engagement context every session. If you find yourself explaining the same client's business to Claude every week, you've hit the persistent-memory limit.
Peer review / NFRA inspection is upcoming. Public LLMs don't generate an audit trail. Audit-grade tools (CORAA, others) timestamp every analysis with the underlying transaction reference.

For those use cases, see the AI Audit Tool Evaluation Checklist — 46 criteria across India compliance, data security, audit-grade features, integrations, pricing, vendor quality. Use it to evaluate CORAA or any alternative.

What about Grok and Perplexity specifically?

Beyond the headline ChatGPT vs Claude debate:

Perplexity is excellent at citation-anchored research. "What does the latest SEBI BRSR Core circular say about Scope 3 emissions?" — Perplexity pulls up source documents with citations. Best for current-events / regulatory research where citation matters.

Grok is bundled with X Premium+ and is increasingly capable. Useful for current-events research (X integration), less differentiated for audit-specific work. Watch the space — Grok is improving rapidly.

The 3-tool stack that works for many mid-tier firms: Claude Pro (analytical / drafting) + ChatGPT Plus (general / code) + Perplexity Pro (research with citations). ~₹64K/year for one user across all three (₹5,299 / month × 12).

A deeper comparison is in the next post in this series: ChatGPT vs Claude vs Perplexity vs Grok for Indian CAs.

Bottom line

AI tools in audit deliver real value when used correctly. The misuse is what causes problems:

Public LLMs (ChatGPT, Claude, Perplexity, Grok): excellent for drafting, research, brainstorming, code. Subscription cost ~₹1,300-2,000 / user / month. Don't put client data in them.
Audit-grade tools (CORAA, others): for client-data analysis, engagement persistence, audit trail. ~₹2.4 lakh / year for unlimited users in a typical firm.
The 7-rule framework: never paste client data, treat output as draft, verify citations, document use, subscribe selectively, use audit-grade tools for client work, train staff on limits.
90-day rollout: partners first, then managers, then staff. With explicit training on what not to do.

The firms doing this well are typically: 5-20 partner mid-tier, 50-300 engagements / year, mixed statutory + tax + advisory practice. They run a public-LLM stack for narrative / research and an audit-grade tool for client-data work. Combined cost ~₹5-8 lakh / year against ~30,000-50,000 hours of audit work — material ROI.

The firms doing it badly: pasting client ledgers into ChatGPT, treating LLM output as final, no audit trail, no documented limits. That's a DPDPA breach waiting to be discovered + a peer review finding waiting to happen.

Decide which firm you want to be.

Try CORAA → Audit-grade AI for client-data work. India-hosted, DPDPA-aligned, audit-trail-by-default. Per-entity flat pricing, unlimited users. See pricing · Trust Centre · AI ROI Calculator · AI Audit Tool Evaluation Checklist.

Next in this series: Claude for Indian Audit Work — A 90-Day Practitioner's Guide (June 11) and ChatGPT vs Claude vs Perplexity vs Grok for Indian CAs (June 25).
