Auditing AI Systems: A Practical Framework for CA Firms [2026]
Your clients are adopting AI. Banks are using AI for credit scoring. E-commerce companies are using AI for dynamic pricing. Manufacturers are using AI for predictive maintenance that affects their provision estimates. Insurance companies are using AI to process claims. Healthcare companies are using AI for diagnostic support that influences revenue recognition.
This is not a future scenario. These systems are operational today in Indian companies across industries. And when these companies are audited, the auditor must assess whether the AI systems that affect the financial statements are reliable, controlled, and producing outputs that are fairly stated.
Most CA firms have well-established procedures for auditing traditional IT systems — ERP controls, access management, change management, data integrity. But AI systems are fundamentally different from traditional IT. They learn from data rather than following static code. Their outputs may change as the model is retrained. Their decision-making logic may be difficult to explain or inspect.
This article provides a practical, standards-based framework for auditing clients that use AI systems. It is built on SA 315 (understanding the entity and its environment) and SA 330 (audit responses to assessed risks), supplemented by emerging guidance on technology and model risk.
Why AI Systems Are Different From Traditional IT
Before presenting the framework, it is important to understand why existing IT audit approaches are insufficient for AI systems.
Traditional IT: Static Logic
A traditional ERP system executes predefined business rules. When an invoice is posted, the system follows a defined workflow: check credit limit, apply payment terms, post to the correct GL accounts, update the subsidiary ledger. The logic is written in code that does not change unless a developer modifies it. Auditing this system means verifying that the code operates as documented and that changes are controlled.
AI Systems: Learned Behaviour
An AI system does not follow static rules. A credit scoring model, for example, learns patterns from historical data — correlations between borrower characteristics and default outcomes. The model's decision logic is not written by a programmer; it emerges from the training process. If the model is retrained with new data, the decision logic changes.
This creates audit challenges that do not exist in traditional IT:
- Explainability: The model's decisions may be difficult to trace to specific rules or factors. A traditional system can explain why an invoice was posted to a particular account. A neural network may not be able to explain why it assigned a particular credit score.
- Stability: The model's behaviour may drift over time as the underlying data distribution changes. A credit scoring model trained on pre-pandemic data may behave differently when applied to post-pandemic borrowers.
- Bias: The model may incorporate biases present in its training data, leading to systematically incorrect outputs for certain populations.
- Dependency on data quality: The model's outputs are only as reliable as the data it was trained on and the data it processes. Data quality issues can propagate through the model in ways that are not immediately visible.
The Six-Step Framework
Step 1: Identify AI Systems in the Client's Environment
The first step is simply to know what AI systems exist. This is less straightforward than it sounds. AI is increasingly embedded in commercial software products without being prominently labelled. The client's management may not describe their dynamic pricing engine as "artificial intelligence" — they may call it their "pricing optimisation tool" or simply "the system."
Practical procedures:
- Inquiry of management: Ask specifically about systems that learn from data, make predictions, classify information, or generate recommendations. Avoid using the term "AI" exclusively — management may not categorise their systems that way.
- Inquiry of IT personnel: Ask about machine learning models, neural networks, data science platforms, or model deployment infrastructure. IT teams are more likely to use technical terminology.
- Review of IT systems documentation: Examine the entity's IT landscape documentation for systems that involve model training, feature engineering, prediction endpoints, or model monitoring.
- Review of vendor contracts: Third-party AI services (credit bureau scoring, fraud detection platforms, pricing APIs) may be embedded in the client's processes through vendor relationships.
- Industry awareness: Understand which AI applications are common in the client's industry. Banking clients likely use AI in credit risk. Retail clients likely use AI in demand forecasting and pricing. Manufacturing clients likely use AI in quality control and maintenance scheduling.
Output of Step 1: A complete inventory of AI systems that exist in the client's environment, including both internally developed and third-party systems.
Step 2: Assess AI Relevance to Financial Statements
Not every AI system is relevant to the audit. A chatbot that answers customer service queries may have no impact on financial statements. A credit scoring model that determines loan provisions has a direct, material impact.
Assessment criteria:
- Does the AI system affect revenue recognition? For example, an AI pricing engine determines the prices at which transactions are recorded.
- Does the AI system affect provisions or estimates? For example, an expected credit loss model uses AI to predict default probabilities that directly determine the provision for bad debts.
- Does the AI system affect asset valuations? For example, an AI system estimates the fair value of financial instruments or real estate.
- Does the AI system affect classifications? For example, an AI system classifies expenditure as operating or capital, or categorises expenses across cost centres.
- Does the AI system affect completeness? For example, an AI system identifies transactions for accrual or determines which items to include in consolidation.
- Does the AI system affect presentation and disclosure? For example, an AI system generates segment reporting data or calculates metrics disclosed in the financial statements.
Output of Step 2: A prioritised list of AI systems that are relevant to the financial statement audit, ranked by their potential impact on material accounts and assertions.
Step 3: Understand the AI System
For each relevant AI system, the auditor must develop sufficient understanding to assess its reliability and the risks it poses to the financial statements. This is the SA 315 requirement applied to AI specifically.
Key areas of understanding:
Inputs:
- What data does the model use? Source, format, volume, frequency of updates.
- How is input data validated before it enters the model?
- Are there data quality controls — completeness checks, accuracy validations, timeliness monitoring?
- What happens when input data is missing, incomplete, or anomalous?
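Input-data checks like these can often be re-performed directly by the engagement team. The sketch below is illustrative only: the field names (`borrower_id`, `monthly_income`) and the income range are hypothetical, and would in practice come from the client's data dictionary and documented validation rules.

```python
# Illustrative re-performance of input-data validation checks.
# Field names and thresholds are hypothetical examples.

def validate_input_record(record, required_fields, income_range=(0, 10_000_000)):
    """Return a list of validation exceptions for one input record."""
    exceptions = []
    # Completeness: every required field must be present and non-empty
    for field in required_fields:
        if record.get(field) in (None, ""):
            exceptions.append(f"missing:{field}")
    # Reasonableness: flag anomalous values outside an expected range
    income = record.get("monthly_income")
    if income is not None and not (income_range[0] <= income <= income_range[1]):
        exceptions.append("anomalous:monthly_income")
    return exceptions

sample = {"borrower_id": "B001", "monthly_income": -500, "tenure_months": 24}
print(validate_input_record(sample, ["borrower_id", "monthly_income", "credit_history"]))
```

Running such checks over the full input file, rather than a sample, is one advantage of scripted procedures: the auditor obtains evidence about every record the model actually processed.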
Processing logic:
- What type of model is it? (Decision tree, neural network, regression model, ensemble method, large language model)
- What was the training process? What data was used for training and validation?
- How often is the model retrained?
- What is the model's documented accuracy, precision, recall, or other performance metrics?
- Is the model's decision logic explainable? Can individual outputs be traced to input factors?
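Documented performance metrics can be recomputed from the client's own validation results rather than accepted at face value. A minimal sketch, assuming the client can provide a confusion matrix (true positives, false positives, false negatives) from an out-of-sample test — the counts below are hypothetical:

```python
# Recomputing documented precision and recall from confusion-matrix
# counts supplied by the client. Counts here are hypothetical.

def precision_recall(tp, fp, fn):
    """Precision: share of flagged cases that were correct.
    Recall: share of actual cases that were flagged."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# e.g. the model flagged 120 likely defaults, 90 correctly,
# and missed 30 actual defaults
p, r = precision_recall(tp=90, fp=30, fn=30)
print(f"precision={p:.2f}, recall={r:.2f}")  # precision=0.75, recall=0.75
```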
Outputs:
- What does the model produce? Scores, classifications, predictions, recommendations?
- How are outputs used in business processes and financial reporting?
- Is there a confidence threshold — are outputs below a certain confidence level handled differently?
- What override mechanisms exist — can humans override model outputs?
Controls:
- Who is responsible for model governance?
- What change management process governs model updates?
- How is model performance monitored on an ongoing basis?
- What access controls restrict who can modify the model, its parameters, or its training data?
- Is there an independent model validation function?
Output of Step 3: A documented understanding of each relevant AI system sufficient to assess the risks it poses to the financial statements.
Step 4: Evaluate AI-Specific Risks
Traditional IT risk assessment categories — confidentiality, integrity, availability — remain relevant but are insufficient for AI systems. AI introduces additional risk categories that the auditor must evaluate.
Model risk:
- Bias risk: Does the model systematically over- or under-estimate for certain populations? A credit scoring model that is biased against certain borrower categories will produce provisions that are systematically misstated for those categories.
- Drift risk: Has the model's accuracy degraded over time as the real-world data distribution has shifted from the training data distribution? A model trained on historical data may perform poorly on current data if economic conditions, customer behaviour, or business processes have changed.
- Overfitting risk: Was the model trained too closely on historical data, capturing noise rather than signal? An overfitted model performs well on historical data but poorly on new data.
- Explainability risk: If the model cannot explain its outputs, how does management assess whether those outputs are reasonable? How does the auditor evaluate them?
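Drift risk in particular lends itself to a quantitative check. One widely used measure is the population stability index (PSI), which compares the distribution of scores at training time with the distribution in the current period. The sketch below is illustrative; the bucket proportions are hypothetical, and the 0.25 threshold is a common rule of thumb rather than a standard.

```python
# Illustrative drift check using the population stability index (PSI).
# Bucket proportions below are hypothetical examples.
import math

def psi(expected_props, actual_props, eps=1e-6):
    """PSI across score buckets; values above ~0.25 are often
    treated as indicating significant distribution shift."""
    return sum(
        (a - e) * math.log((a + eps) / (e + eps))
        for e, a in zip(expected_props, actual_props)
    )

training = [0.10, 0.20, 0.40, 0.20, 0.10]   # score distribution at training time
current  = [0.05, 0.15, 0.35, 0.25, 0.20]   # score distribution this period
print(round(psi(training, current), 3))
```

A PSI computed this way gives the auditor an independent view of whether the population the model now scores still resembles the population it learned from.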
Data risk:
- Quality risk: Are the inputs to the model accurate, complete, and timely? Errors in input data propagate through the model and may be amplified.
- Relevance risk: Is the training data representative of the current environment? Training data from a different economic period, geographic market, or product mix may not be relevant.
- Completeness risk: Are all relevant data points included? Missing features in the input data can lead to biased or inaccurate outputs.
Operational risk:
- Availability risk: What happens if the AI system fails? Is there a fallback process?
- Version control risk: Is the correct version of the model in production? Are historical versions preserved for auditability?
- Override risk: How are manual overrides of model outputs controlled and documented?
Output of Step 4: A risk assessment for each relevant AI system, identifying specific risks and their potential impact on the financial statements.
Step 5: Design Audit Procedures
Based on the risk assessment, the auditor designs procedures to obtain sufficient appropriate audit evidence about the AI system's outputs. This is the SA 330 response.
Test input controls:
- Verify that data flowing into the model is complete and accurate. Reconcile input data to source systems.
- Test data validation controls — confirm that the system rejects or flags invalid inputs.
- Verify that the data used matches what is documented as the model's expected input format and schema.
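A reconciliation of model inputs to the source system can be scripted as a record-count and control-total comparison. This is a minimal sketch with hypothetical data; in practice the amount field and record keys would come from the client's data mapping.

```python
# Minimal sketch: reconciling model input data to the source system
# by comparing record counts and control totals. Data is hypothetical.

def reconcile(source_rows, model_rows, amount_field="exposure"):
    src_count, mdl_count = len(source_rows), len(model_rows)
    src_total = sum(r[amount_field] for r in source_rows)
    mdl_total = sum(r[amount_field] for r in model_rows)
    return {
        "count_diff": src_count - mdl_count,      # records lost in the feed
        "amount_diff": round(src_total - mdl_total, 2),  # value lost in the feed
    }

source = [{"id": 1, "exposure": 100.0}, {"id": 2, "exposure": 250.5}]
model_input = [{"id": 1, "exposure": 100.0}]
print(reconcile(source, model_input))  # one record missing from the model feed
```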
Verify outputs against independent calculations:
- For a sample of model outputs, perform an independent calculation or obtain independent evidence to verify the output is reasonable.
- For a credit scoring model: independently assess a sample of borrowers and compare to the model's scores.
- For a pricing engine: independently calculate prices for selected products and compare to the engine's output.
- For a provision model: independently estimate expected losses for a sample and compare to the model's predictions.
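The comparison step across all three examples has the same shape: take a sample, obtain an independent value, and flag differences beyond a set tolerance. A hedged sketch, where the item identifiers, values, and 5% tolerance are all hypothetical and the tolerance would in practice be derived from materiality considerations:

```python
# Sketch: comparing model outputs to independently derived estimates
# for a sample, flagging differences beyond a relative tolerance.

def exceptions_vs_independent(pairs, tolerance=0.05):
    """pairs: list of (item_id, model_value, independent_value).
    Returns item_ids where the difference exceeds the tolerance."""
    return [
        item_id
        for item_id, model_v, indep_v in pairs
        if abs(model_v - indep_v) > tolerance * max(abs(indep_v), 1e-9)
    ]

# e.g. PD scores for three sampled loans vs the auditor's own estimates
sample = [("LN001", 0.020, 0.020), ("LN002", 0.080, 0.050), ("LN003", 0.010, 0.0102)]
print(exceptions_vs_independent(sample))  # only LN002 exceeds tolerance
```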
Review model validation reports:
- Obtain and review management's model validation documentation.
- Assess whether the validation methodology is appropriate — does it use out-of-sample data, does it test for bias, does it measure relevant performance metrics?
- Evaluate whether validation is performed with sufficient independence — is the validation function separate from the development function?
- Review model monitoring reports — is model performance tracked over time, and are degradation trends identified?
Test override controls:
- Identify cases where human operators overrode model outputs.
- Assess whether overrides are documented, authorised, and reasonable.
- Evaluate whether override patterns suggest systematic issues with the model (frequent overrides in one direction may indicate model bias).
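The directional analysis in the last bullet can be performed over the full override log. The sketch below is illustrative; the override records are hypothetical, and what counts as a concerning one-sided proportion is a matter of professional judgement.

```python
# Sketch of a directional override analysis: frequent one-way
# overrides may indicate model bias. Records are hypothetical.

def override_direction_summary(overrides):
    """overrides: list of (model_value, final_value) after human override."""
    up = sum(1 for m, f in overrides if f > m)
    down = sum(1 for m, f in overrides if f < m)
    return {
        "raised": up,
        "lowered": down,
        # share of overrides in the dominant direction
        "one_sided": max(up, down) / max(len(overrides), 1),
    }

# e.g. credit scores before and after manual override
log = [(600, 680), (590, 660), (610, 700), (640, 630)]
print(override_direction_summary(log))  # 3 of 4 overrides raised the score
```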
Test change management:
- Verify that model changes (retraining, parameter updates, feature changes) follow a documented change management process.
- Confirm that changes are tested before deployment and that pre- and post-change performance is compared.
- Verify that the version of the model in production at period-end is the version documented in the client's records.
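Where the client stores model artefacts as files, the production-version check in the last bullet can sometimes be reduced to a hash comparison between the deployed artefact and the approved, documented version. This is a sketch under that assumption; the file paths are hypothetical, and many deployment platforms expose version identifiers directly instead.

```python
# Sketch: verifying that the production model artefact matches the
# documented, approved version by comparing SHA-256 hashes.
# File paths would be obtained from the client; examples are hypothetical.
import hashlib

def file_sha256(path):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large model files do not exhaust memory
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def versions_match(production_path, approved_path):
    return file_sha256(production_path) == file_sha256(approved_path)
```

A matching hash evidences that the artefact in production at period-end is byte-for-byte identical to the one that went through change management.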
Output of Step 5: A set of audit procedures tailored to each relevant AI system, responsive to the specific risks identified in Step 4.
Step 6: Document
SA 230 requires documentation that enables an experienced auditor to understand the procedures performed, evidence obtained, and conclusions reached. For AI-related procedures, documentation must be particularly thorough because the technology is evolving rapidly, and reviewers (including NFRA) may need additional context to understand the auditor's approach.
Documentation should include:
- The inventory of AI systems identified and the rationale for which systems were considered relevant
- The auditor's understanding of each relevant AI system (inputs, processing, outputs, controls)
- The risk assessment for each system, including AI-specific risks
- The audit procedures designed in response to those risks
- The results of those procedures, including any exceptions
- The auditor's conclusions about the reliability of the AI system's outputs and their impact on the financial statements
- Any limitations encountered — areas where the auditor was unable to obtain sufficient understanding or evidence
Practical Examples
Example 1: Auditing a Bank's AI Credit Scoring Model
Context: The bank uses a machine learning model to assign credit scores to retail loan applicants. The model's output directly affects the expected credit loss (ECL) provision under Ind AS 109.
Step 1: Identify the credit scoring model through inquiry of the risk management and IT teams.
Step 2: Assess relevance — the model directly determines probability of default (PD), which is a key input to the ECL calculation. High relevance.
Step 3: Understand the model — it is a gradient boosted decision tree trained on five years of historical loan performance data. Inputs include borrower income, employment tenure, existing obligations, credit history, and property value. Output is a PD score between 0 and 1.
Step 4: Key risks include drift risk (economic conditions have changed since the training period), bias risk (potential geographic or demographic bias in training data), and data quality risk (borrower-provided information may be inaccurate).
Step 5: Procedures include reviewing the model validation report, independently calculating ECL for a sample of loans using alternative PD estimates, testing input data quality controls, reviewing override patterns, and comparing model predictions to actual default experience (backtesting).
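The backtesting procedure in Step 5 can be sketched as a bucket-level comparison of average predicted PD against the observed default rate. The figures below are hypothetical; a real backtest would use the client's full loan book, segmented into score buckets, over a defined observation window.

```python
# Hedged backtesting sketch: compare average predicted PD to the
# observed default rate within a score bucket. Figures are hypothetical.

def backtest_bucket(predicted_pds, actual_defaults):
    """predicted_pds: PD scores for loans in one bucket;
    actual_defaults: matching 0/1 default outcomes."""
    avg_pd = sum(predicted_pds) / len(predicted_pds)
    observed = sum(actual_defaults) / len(actual_defaults)
    return {"avg_predicted_pd": round(avg_pd, 3), "observed_rate": round(observed, 3)}

result = backtest_bucket([0.02, 0.03, 0.04, 0.03], [0, 0, 1, 0])
print(result)  # observed rate well above predicted PD for this bucket
```

A large, persistent gap between predicted and observed rates in a bucket is evidence of the drift risk identified in Step 4 and feeds directly into the auditor's evaluation of the ECL provision.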
Step 6: Document all of the above with sufficient detail for a reviewer to understand the approach.
Example 2: Auditing an E-Commerce Company's AI Pricing Engine
Context: The company uses an AI system that dynamically adjusts product prices based on demand, competitor pricing, inventory levels, and other factors. Prices affect revenue recognition.
Step 2: Assess relevance — the pricing engine determines the transaction price for every sale. Direct impact on revenue. High relevance.
Step 3: Understand the system — it is a reinforcement learning model that adjusts prices to maximise revenue within defined bounds (minimum and maximum prices set by management).
Step 4: Key risks include boundary control risk (are the min/max price constraints actually enforced?), explainability risk (can individual pricing decisions be explained?), and competitive pricing risk (does the system ever set prices below cost?).
Step 5: Procedures include testing boundary controls (verify that no transactions occurred at prices outside the defined range), reconciling system prices to invoiced amounts, testing a sample of pricing decisions against the documented pricing policy, and reviewing management's monitoring of pricing patterns.
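The boundary-control test can be run over the full period's transactions. A minimal sketch, with hypothetical SKUs, bounds, and transactions; the real bounds would come from management's documented pricing policy.

```python
# Sketch: testing that no invoiced price fell outside the
# management-set min/max bounds for its product. Data is hypothetical.

def boundary_violations(transactions, bounds):
    """transactions: (txn_id, product, price); bounds: product -> (min, max).
    Returns txn_ids priced outside the permitted range."""
    return [
        txn_id
        for txn_id, product, price in transactions
        if not (bounds[product][0] <= price <= bounds[product][1])
    ]

bounds = {"SKU1": (80.0, 120.0), "SKU2": (40.0, 60.0)}
txns = [("T1", "SKU1", 95.0), ("T2", "SKU2", 39.5), ("T3", "SKU1", 120.0)]
print(boundary_violations(txns, bounds))  # T2 priced below its minimum
```

Because the engine prices every sale, testing the whole population this way gives stronger evidence than sampling, at essentially no extra cost.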
Example 3: Auditing AI-Generated Provisions
Context: An insurance company uses an AI model to estimate claim reserves. The model analyses historical claims data, current claims characteristics, and external factors to predict ultimate claim costs.
Step 2: Direct impact on insurance liabilities. High relevance.
Step 4: Key risks include model accuracy (are historical patterns predictive of future outcomes?), data completeness (are all open claims included?), and catastrophe risk (does the model account for low-probability, high-impact events?).
Step 5: Procedures include engaging an actuarial specialist to independently estimate reserves for a sample of claims, comparing the AI model's aggregate estimate to the specialist's estimate, reviewing the model's treatment of large and catastrophic claims, and testing data completeness by reconciling the model's claim count to the claims register.
Building Competence Within Your Firm
Auditing AI systems requires knowledge that goes beyond traditional audit skills. Firms should consider:
Training: Invest in training that covers AI fundamentals — not at the level of a data scientist, but sufficient to have informed conversations with client management and IT teams.
Specialist resources: For complex AI systems, consider engaging IT audit specialists or data science professionals as part of the engagement team.
Industry knowledge: Develop industry-specific knowledge about how AI is commonly used and what risks are typical. A firm that audits multiple banking clients will develop expertise in credit model auditing that applies across engagements.
Methodology updates: Incorporate AI-specific considerations into your firm's audit methodology and quality management framework. This includes updating risk assessment templates, work programme libraries, and documentation standards.
Standards awareness: Monitor evolving standards and regulatory guidance on AI in audit, both from Indian regulators (ICAI, NFRA) and international bodies (IAASB, PCAOB). This area is developing rapidly, and firms that stay current will be better positioned.
The Auditor's Responsibility
The fundamental principle remains unchanged: the auditor is responsible for the audit opinion. Using AI does not delegate that responsibility to the AI system. When a client's AI system produces an output that flows into the financial statements, the auditor must obtain sufficient appropriate evidence to conclude whether that output is fairly stated.
The framework presented in this article provides a structured approach to meeting that responsibility. It is not exhaustive — every engagement will have unique characteristics that require professional judgement. But it provides a starting point, a common vocabulary, and a methodology that is grounded in existing auditing standards.
As AI adoption accelerates across Indian businesses, the ability to audit AI systems will transition from a specialist skill to a core competency. Firms that develop this competence now will be prepared for an environment where AI is not the exception but the norm in their clients' operations.