DATA MINING & FORENSIC AUDIT
By Dhruv Seth
ds@sethspro.com | www.sethspro.com
CONTENT
• Data Mining
• Methods of doing
• Difference with standard auditing
• Benefits and Risks
• Patterns in data
• Utilisation in different audits
• Forensic Audit
• What is a fraud
• Profile of a fraudster
• Tools available in excel
• Theorems
A PROBLEM…
• A large retail chain, otherwise doing substantially well, had
• Dismal diaper sales; excellent beer sales
SOLUTION
Place them together!
WHICH IS SUSPICIOUS?
• User 1: Login → Click on Product #8473 → Click on Product #157 → Click on Product #102 → Complete Purchase
• User 2: Failed Login → Request Password → Direct Link to Product #821 → Change Shipping Address → Complete Purchase
Data Mining ≠ Computer Expertise
WHAT IS DATA MINING
Data → Information
IDENTIFYING SUSPICIOUS TRANSACTIONS
Computer Behavioral Analytics      | Smartphone Analytics
Mouse dynamics                     | Screen pressure
Typing speed                       | Angle of usage of phone
Previous navigation habits         | Movement across screen
Entry & exit points on website     | Heart rate
DATA MINING - VALUE ADDITION
What was my total revenue in the last five years?
TO
What were sales in UP last March? Drill down to Kanpur
TO
What’s likely to happen to Kanpur sales next month? Why?
DATA MINING - METHODS
•Association
•Sequence or path analysis
•Classification
•Clustering
•Prediction
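As a rough illustration of the first method, association, the sketch below scores item pairs by support, confidence and lift, the usual market-basket measures. The baskets, the function name `pair_rules` and the 0.3 support cut-off are assumptions invented for this example; they are not from the slides.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.3):
    """Return (a, b, support, confidence, lift) for item pairs bought together."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in set(basket))
    pair_counts = Counter(
        pair
        for basket in baskets
        for pair in combinations(sorted(set(basket)), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n                       # share of baskets with both items
        if support < min_support:
            continue
        confidence = count / item_counts[a]       # P(b in basket | a in basket)
        lift = confidence / (item_counts[b] / n)  # above 1: together more often than chance
        rules.append((a, b, support, confidence, lift))
    return sorted(rules, key=lambda rule: rule[4], reverse=True)


# Hypothetical baskets, invented for illustration only.
baskets = [
    ["beer", "diapers", "crisps"],
    ["beer", "diapers"],
    ["milk", "bread"],
    ["beer", "diapers", "milk"],
    ["bread", "crisps"],
]
rules = pair_rules(baskets)
for a, b, support, confidence, lift in rules:
    print(f"{a} + {b}: support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift well above 1 is the kind of signal that would justify shelving two products together, as in the beer and diapers story.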
DATA MINING - TECHNIQUES
•Artificial neural networks
•Decision trees
•The nearest neighbour method
DATA MINING V. REGULAR AUDIT
Labor Verification
Regular Audit                                     | Data Mining
1. Contracted rate = Billing rate                 | 1. Contracted rate = Billing rate
2. The billing is relevant to the audit period    | 2. Employee pay-grade-wise payment
3. Statutory compliances                          | 3. Mapping resignation to last pay
                                                  | 4. Mapping computer / biometric logins after resignation / termination
                                                  | 5. Overtime analysis to determine a) regular overtime b) employees who worked 100 hrs
                                                  | 6. Those not availing leaves
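Item 4 in the data mining column, mapping logins after resignation or termination, could be sketched as follows. The employee ids, dates and the function name `logins_after_exit` are hypothetical, and real data would come from HR and access-control exports.

```python
from datetime import date

# Hypothetical HR and login records, invented for illustration only.
resignations = {
    "E101": date(2015, 3, 31),
    "E102": date(2015, 6, 15),
}
logins = [
    ("E101", date(2015, 3, 30)),
    ("E101", date(2015, 4, 2)),   # two days after the last working day
    ("E102", date(2015, 6, 15)),  # on the last working day: not flagged
    ("E103", date(2015, 4, 2)),   # still employed
]


def logins_after_exit(logins, resignations):
    """Return (employee, login date) pairs dated after the employee's exit date."""
    return [
        (emp, day)
        for emp, day in logins
        if emp in resignations and day > resignations[emp]
    ]


suspicious = logins_after_exit(logins, resignations)
print(suspicious)
```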
DATA MINING - STEPS
•Business Understanding
•Data Understanding
•Data Preparation
•Data Modelling
•Evaluation
•Deployment
DATA MINING – WHY INTEGRATE
•Transaction Volume
•Mitigate Inherent Risk
•Value addition to the client
•Cost Effective
DATA MINING – BENEFITS
•Remove Sampling risk – 100% coverage
•Decrease in Audit costs
•Provide Real time audit opinions
•Establish Completeness and accuracy
DATA MINING – SOFTWARE TYPES
•Generalized Software
•Specialized Software
DATA MINING – SOFTWARE TYPES
Characteristics                   | Generalised | Specialised
Batch processing                  | No          | Yes
Support entire audit procedures   | No          | Yes
User friendly                     | Yes         | No
Require technical skill           | No          | Yes
Automated                         | No          | Yes
Capable of learning               | No          | Yes
Cost                              | Lower       | Higher
DATA MINING – RISKS
•First year costs might be higher
•Strong understanding of operations
•Availability of data in desired format
•Risk of Control totals
DATA MINING – PATTERNS
•Numeric Patterns
•Time Patterns
•Name Patterns
•Geographical Patterns
•Relationship Patterns
•Textual Patterns
DATA MINING – INTERNAL AUDITS
• Purchases
• Vendors and accounts payable
• Employees and payroll
• Expense reimbursement
• Credit card utilisation
• Sales & debtors
• Inventory
• Commission payouts
DATA MINING – INTERNAL AUDITS
PURCHASES
• Round number transactions
• Duplicate transactions
• Same, Same, Different test
• Above average payments
• Transactions exceeding PO quantity
• Sequential invoice numbers
• Too many invoices beginning with “9”
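Two of the purchase tests can be sketched in a few lines. The version of the Same, Same, Different test below assumes it means "same invoice number, same amount, different vendor"; that reading, the field names and the sample payments are assumptions for illustration, not something the slides spell out.

```python
from collections import defaultdict


def same_same_different(rows, same=("invoice_no", "amount"), different="vendor"):
    """Group rows on the `same` fields and keep groups with more than one `different` value."""
    groups = defaultdict(set)
    for row in rows:
        key = tuple(row[field] for field in same)
        groups[key].add(row[different])
    return {key: sorted(values) for key, values in groups.items() if len(values) > 1}


def is_round(amount, base=1000):
    """Round-number test: the amount is an exact multiple of `base`."""
    return amount % base == 0


# Hypothetical payment records, invented for illustration only.
payments = [
    {"invoice_no": "INV-17", "amount": 50000, "vendor": "Alpha Traders"},
    {"invoice_no": "INV-17", "amount": 50000, "vendor": "Alfa Traders"},
    {"invoice_no": "INV-18", "amount": 12345, "vendor": "Beta Supplies"},
]
flagged = same_same_different(payments)
round_rows = [p for p in payments if is_round(p["amount"])]
print(flagged)
print(len(round_rows), "round-number payments")
```

Changing the `same` and `different` arguments gives other variants, for example same vendor and same amount but different invoice number.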
DATA MINING – INTERNAL AUDITS
CREDITORS
• Those with a high percentage of returns
• Those with rapidly increasing purchases
• Small denomination but quick frequency
• Segregation of duties (SOD) between vendor approver and purchaser
DATA MINING – INTERNAL AUDITS
PAYMENT TREND ANALYSIS
• By the day of the week
• By the day of the month
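A payment trend analysis by day of week and day of month is essentially a frequency count. The sketch below uses hypothetical payment dates; the function name `payment_trend` is an assumption for this example.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def payment_trend(payment_dates):
    """Count payments by weekday name and by day of the month."""
    by_weekday = Counter(WEEKDAYS[d.weekday()] for d in payment_dates)
    by_day_of_month = Counter(d.day for d in payment_dates)
    return by_weekday, by_day_of_month


# Hypothetical payment dates, invented for illustration only.
payment_dates = [
    date(2015, 1, 5),
    date(2015, 1, 6),
    date(2015, 1, 10),
    date(2015, 1, 11),
    date(2015, 1, 12),
]
by_weekday, by_day_of_month = payment_trend(payment_dates)

# Payments dated on a Saturday or Sunday often deserve a second look.
weekend_payments = [d for d in payment_dates if d.weekday() >= 5]
print(by_weekday)
print(weekend_payments)
```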
DATA MINING – EXPENSES
DATA MINING – INTERNAL AUDITS
VENDORS MASTER
• Analysis of vendors master for creation date
• Identifying regular prompt vendor payment
• Cross reference vendors to employees
• Same, Same and Different test
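Cross-referencing vendors to employees usually means looking for shared contact or bank details between the two masters. A minimal sketch, assuming both masters carry `phone` and `bank_account` fields (hypothetical field names and records):

```python
def normalise(text):
    """Lower-case and strip everything except letters and digits before comparing."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def vendor_employee_matches(vendors, employees, fields=("bank_account", "phone")):
    """Return (vendor_id, emp_id, field) where a vendor shares a detail with an employee."""
    index = {}
    for emp in employees:
        for field in fields:
            value = normalise(emp.get(field, ""))
            if value:
                index[(field, value)] = emp["emp_id"]
    hits = []
    for vendor in vendors:
        for field in fields:
            value = normalise(vendor.get(field, ""))
            if (field, value) in index:
                hits.append((vendor["vendor_id"], index[(field, value)], field))
    return hits


# Hypothetical master data, invented for illustration only.
vendors = [
    {"vendor_id": "V1", "phone": "98-7654-3210", "bank_account": "001122"},
    {"vendor_id": "V2", "phone": "9000000001", "bank_account": "778899"},
]
employees = [
    {"emp_id": "E7", "phone": "9876543210", "bank_account": "445566"},
]
matches = vendor_employee_matches(vendors, employees)
print(matches)
```

Normalising before comparison matters because the same phone number or account is often typed with different spacing or punctuation in each master.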
DATA MINING – INTERNAL AUDITS
EMPLOYEES AND PAYROLL
• Regularly working overtime
• Not taking leaves
• Accepting unjustified salary deductions without complaint
• Segregating employees paid salary in cash
• Biometric analysis – first to enter / last to leave
DATA MINING – INTERNAL AUDITS
TRAVEL EXPENSES
• Identify weekend or holiday travel
• Search for same or similar claims
• Identify costs outside of policy or costly late bookings
• Identify conveyance claims made for the same time period as car rental or other transportation
• Compare mileage claims to distances reported
• Instances where an employee exchanged a first-class ticket for an economy one but did not refund the difference to the company
DATA MINING – INTERNAL AUDITS
SALES & DEBTORS
• Comparing invoice to shipping
• Conversely, comparing shipping to invoice
• Preference in sale to a particular customer
• Same, Same, Different test to sale price
• Debtors
  • Lapping
  • Old outstanding invoices
DATA MINING – INTERNAL AUDITS
INVENTORY
• Determining slow moving inventory
• Determining quick moving inventory
• Purchasing frequency of a particular product
• Mapping stock valuation to last sale price
DATA MINING – BANKS
• Which transactions does a customer make before shifting to another bank? (to prevent attrition)
• What is the profile of an ATM customer, and which products is he likely to buy? (to cross sell)
• Which patterns in credit transactions lead to fraud? (to detect and deter fraud)
• What are the traits of a high-risk borrower? (to prevent defaults and bad loans, and to improve screening)
DATA MINING – BANKS
• Duplicate customer id
• DP Limit = Limit = Outstanding
• Comparing unsecured and secured within scheme
• Rate of interest being applied
• Last credit amount and date
• Same PAN – different customer id
• Last stock statement summary
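Two of the bank checks lend themselves to a short sketch: the same PAN appearing under different customer ids, and accounts where drawing power, sanctioned limit and outstanding balance are all equal. Reading "DP Limit = Limit = Outstanding" as a fully drawn account is an assumption, as are the field names and the sample accounts.

```python
from collections import defaultdict


def shared_pans(accounts):
    """Return PANs that appear under more than one customer id."""
    ids_by_pan = defaultdict(set)
    for acc in accounts:
        ids_by_pan[acc["pan"].strip().upper()].add(acc["customer_id"])
    return {pan: sorted(ids) for pan, ids in ids_by_pan.items() if len(ids) > 1}


def fully_drawn(accounts):
    """Return customer ids where drawing power, limit and outstanding are all equal."""
    return [
        acc["customer_id"]
        for acc in accounts
        if acc["dp"] == acc["limit"] == acc["outstanding"]
    ]


# Hypothetical loan accounts, invented for illustration only.
accounts = [
    {"customer_id": "C001", "pan": "ABCDE1234F", "limit": 500000, "dp": 500000, "outstanding": 500000},
    {"customer_id": "C002", "pan": "abcde1234f", "limit": 300000, "dp": 250000, "outstanding": 100000},
    {"customer_id": "C003", "pan": "PQRSX5678K", "limit": 200000, "dp": 200000, "outstanding": 150000},
]
duplicate_pans = shared_pans(accounts)
maxed_out = fully_drawn(accounts)
print(duplicate_pans)
print(maxed_out)
```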
DATA MINING – BEHAVIOR
• Rubbing nose
• Frequent blinking
• Moving or tapping feet
• Crossing arms
• Clearing throat
• Pinched eyebrows
• Smirk
FORENSIC AUDIT
REPORT TO THE NATION
• The typical organization loses an estimated 5% of its REVENUE to fraud
• Asset misappropriation is the most common category of fraud
• Frauds generally go UNDISCOVERED for 18 months
• The higher the perpetrator's position, the BIGGER the fraud
• 58% of organizations NEVER recovered anything
FRAUD DETECTION
BANK FRAUDS – 9 MONTHS FY 2014-15
Name          | Number of Cases | Amount
PNB           | 123             | 2036,00,00,000
CBI           | 174             | 1736,00,00,000
SBI           | 474             | 1327,00,00,000
Syndicate     | 114             | 749,00,00,000
OBC           | 86              | 719,00,00,000
BOB           | ---             | 597,00,00,000
IDBI          | ---             | 507,00,00,000
UCO           | ---             | 424,00,00,000
United Bank   | ---             | 376,00,00,000
TOTAL         |                 | 7542,00,00,000
WHAT IS FRAUD?
• A false representation of a matter of fact
• whether by words or by conduct,
• by false or misleading allegations, or
• by concealment of what should have been disclosed
• that deceives and is intended to deceive another
• so that the individual will act upon it to her or his legal injury.
FRAUD TRIANGLE
WHAT IS FORENSIC AUDIT
•The use of accounting skills;
•To investigate frauds / embezzlement and
•To analyze financial information
•For use in legal proceedings
FORENSIC VIS-À-VIS STATUTORY
Forensic                                | Statutory
Very focused and micro approach         | Macro approach with wide coverage
Examines reliability of documentation   | Relies on documentary evidence
Not compulsory                          | Regulatory compliance
Establishing existence of fraud         | Ensuring true and fair view
Determining the quantum of loss         | Verifying correct representations
Gathering evidence                      | Evaluating internal controls
GHOST EMPLOYEES
NEED FOR LEARNING THE TRAITS
Why frauds go unnoticed during statutory audit: fraudsters are often
• Extremely intelligent
• Conversant with internal systems
• Technology savvy
• Aware of stale audit procedures
FRAUDSTERS PROFILE
• Flamboyant lifestyle
• Very aggressive in his approach / targets
• Over protectiveness of data / documents
• Being the first one in and last one out
• Unusual close association with vendor / customers
FORENSIC AUDITOR
Forensic Accountant: Law, Accounting, Criminology, Investigative Auditing, Computer Science
TRAITS OF A FORENSIC AUDITOR
•Think out of the box
•Distrust the obvious
•Develop cognitive dissonance
•Test of absurdity
TEST OF ABSURDITY
Think of events which may be possible but
not probable.
TOOLS AVAILABLE IN EXCEL
•Analyze round number transactions
•Duplicate detection
•Same, Same and different tests
•Above average payments to vendors
TOOLS AVAILABLE IN EXCEL
•Gap detection
•Automated sampling
•MATCH function
•Employee – Vendor match
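The slides name Excel as the tool; purely as an illustration of what gap detection does, here is the same idea in Python. The invoice numbers and the function name `find_gaps` are hypothetical.

```python
def find_gaps(numbers):
    """Return (first_missing, last_missing) ranges in a sequence of document numbers."""
    present = sorted(set(numbers))
    gaps = []
    for prev, cur in zip(present, present[1:]):
        if cur - prev > 1:
            gaps.append((prev + 1, cur - 1))
    return gaps


# Hypothetical invoice numbers, invented for illustration only.
invoice_numbers = [1001, 1002, 1005, 1006, 1010, 1002]
gaps = find_gaps(invoice_numbers)
print(gaps)
```

Each reported range is a run of missing numbers that the auditor can ask the client to explain, for example cancelled or unrecorded invoices.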
SPECIAL MENTION – TIME & SPACE
• Identify transactions recorded in quick succession that should each take substantial time to complete
• Storage recorded in excess of the possible space
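The time test can be sketched as a check on the interval between consecutive events for the same entity. The scenario below (a truck recorded twice within minutes when a round trip should take hours), the 30-minute threshold and the function name `too_quick` are all assumptions for illustration.

```python
from datetime import datetime, timedelta


def too_quick(events, min_gap):
    """Return (entity, previous_time, time) where consecutive events are closer than min_gap."""
    flagged = []
    last_seen = {}
    for entity, stamp in sorted(events, key=lambda e: (e[0], e[1])):
        previous = last_seen.get(entity)
        if previous is not None and stamp - previous < min_gap:
            flagged.append((entity, previous, stamp))
        last_seen[entity] = stamp
    return flagged


# Hypothetical weighbridge entries, invented for illustration only.
events = [
    ("TRUCK-12", datetime(2015, 1, 5, 10, 0)),
    ("TRUCK-12", datetime(2015, 1, 5, 10, 2)),   # second load two minutes later
    ("TRUCK-12", datetime(2015, 1, 5, 13, 0)),
    ("TRUCK-40", datetime(2015, 1, 5, 10, 1)),
]
quick = too_quick(events, min_gap=timedelta(minutes=30))
print(quick)
```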
SPECIAL MENTION – RSF
• Ratio of the largest number to the second largest number in the set
RSF = Largest Number / 2nd Largest Number
• RSF greater than 10 highlights probability of fraud / error
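A minimal sketch of the RSF calculation, applied per vendor on the assumption that each vendor's payments form one "set". The sample payments and function names are hypothetical; the threshold of 10 is the one quoted on the slide.

```python
from collections import defaultdict


def relative_size_factor(amounts):
    """RSF = largest amount / second largest amount; None if it cannot be computed."""
    ordered = sorted(amounts, reverse=True)
    if len(ordered) < 2 or ordered[1] <= 0:
        return None
    return ordered[0] / ordered[1]


def rsf_by_group(rows, threshold=10):
    """Return {group: RSF} for groups whose RSF exceeds the threshold."""
    groups = defaultdict(list)
    for group, amount in rows:
        groups[group].append(amount)
    flagged = {}
    for group, amounts in groups.items():
        rsf = relative_size_factor(amounts)
        if rsf is not None and rsf > threshold:
            flagged[group] = rsf
    return flagged


# Hypothetical vendor payments, invented for illustration only.
payments = [
    ("Alpha", 1200), ("Alpha", 1250), ("Alpha", 125000),  # possible extra zeros
    ("Beta", 800), ("Beta", 900),
]
outliers = rsf_by_group(payments)
print(outliers)
```

Here the 125000 payment is 100 times the next largest payment to the same vendor, which is the pattern a fat-finger error or a miscoded entry tends to leave.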
SPECIAL MENTION – RSF
•Types of errors / frauds it can unearth
• Data Entry mistakes
• Fat Finger errors
• Wrong coding with masters
• Capital Asset written off in expense
• Excess payments in payroll
SPECIAL MENTION – BENFORD’S LAW
• Formulated by Simon Newcomb in 1881; further researched by Frank Benford in 1938
• Analysis based on Benford’s law has been accepted as evidence in the U.S.
• Statistical tool which can also be applied to normal audits to automate sample selection
SPECIAL MENTION – BENFORD’S LAW
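Benford's law states that in many naturally occurring sets of numbers the leading digit d appears with probability log10(1 + 1/d), so about 30.1% of values start with 1 and only about 4.6% start with 9. The sketch below compares observed first-digit shares with those expectations; the sample amounts and function names are hypothetical.

```python
import math
from collections import Counter


def benford_expected(digit):
    """Expected share of values whose first significant digit is `digit` (1 to 9)."""
    return math.log10(1 + 1 / digit)


def first_digit(value):
    """First significant digit of a non-zero number, read from scientific notation."""
    return int(f"{abs(value):.15e}"[0])


def first_digit_profile(amounts):
    """Return {digit: (observed share, expected share)} for digits 1 to 9."""
    digits = [first_digit(a) for a in amounts if a != 0]
    counts = Counter(digits)
    total = len(digits)
    return {d: (counts[d] / total, benford_expected(d)) for d in range(1, 10)}


# Hypothetical invoice amounts, invented for illustration only.
amounts = [9500, 9800, 9900, 120, 9700]
profile = first_digit_profile(amounts)
for digit, (observed, expected) in profile.items():
    print(f"{digit}: observed {observed:.2f}, expected {expected:.3f}")
```

A large excess of one digit, such as too many invoices beginning with 9 just under an approval limit, is a prompt for sampling those items rather than proof of fraud. On real data a formal test on a much larger population would be needed.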
SPECIAL MENTION – M-SCORE
• Model propounded by Prof. Beneish
• Estimates the likelihood that financial statements have been manipulated, based on certain ratios
• Ratios such as
  • Days Sales in Receivables Index and Sales Growth Index
  • Gross Margin Index
  • Asset Quality Index
  • Depreciation Index
SPECIAL MENTION – M-SCORE
• A score greater than -2.22 suggests the financial statements may have been fudged
• Statistically shown to have 76% accuracy
• Model being adopted by the Income Tax Department for CASS
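A minimal sketch of the score calculation, using the coefficients commonly cited for Beneish's eight-variable model. The slides list only some of the ratios; SGAI, TATA and LVGI below come from the published model, not from this deck, and the input values are hypothetical.

```python
def beneish_m_score(dsri, gmi, aqi, sgi, depi, sgai, tata, lvgi):
    """Eight-variable Beneish M-Score with the commonly cited coefficients.

    dsri: Days Sales in Receivables Index   gmi: Gross Margin Index
    aqi:  Asset Quality Index               sgi: Sales Growth Index
    depi: Depreciation Index
    sgai: SG&A Expenses Index (not listed on the slide)
    tata: Total Accruals to Total Assets (not listed on the slide)
    lvgi: Leverage Index (not listed on the slide)
    """
    return (
        -4.84
        + 0.920 * dsri
        + 0.528 * gmi
        + 0.404 * aqi
        + 0.892 * sgi
        + 0.115 * depi
        - 0.172 * sgai
        + 4.679 * tata
        - 0.327 * lvgi
    )


THRESHOLD = -2.22  # cut-off quoted on the slide

# Hypothetical company where nothing changed year on year (all indices 1, no accruals).
steady = beneish_m_score(1, 1, 1, 1, 1, 1, 0.0, 1)

# Hypothetical company with receivables and sales growing fast and high accruals.
stretched = beneish_m_score(1.6, 1.2, 1.1, 1.5, 1.0, 0.9, 0.08, 1.0)

print(f"steady: {steady:.2f}, flagged: {steady > THRESHOLD}")
print(f"stretched: {stretched:.2f}, flagged: {stretched > THRESHOLD}")
```

A score above the threshold is a reason to look closer, not a finding in itself.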
EXCEL LIMITATIONS
•Absence of Log
•Not admissible in court
•Involves slight complexity in applying
•Data size limitation / Instability
•Risk of Hidden data
THANK YOU !

Editor's Notes

  • #10
    Association: Association is one of the best-known data mining techniques. A pattern is discovered based on a relationship between items in the same transaction, which is why the association technique is also known as the relation technique. It is used in market basket analysis to identify sets of products that customers frequently purchase together. Retailers use association to research customers' buying habits. Based on historical sales data, a retailer might find that customers usually buy crisps when they buy beer, and can therefore place beer and crisps next to each other to save customers time and increase sales.

    Sequential patterns: Sequential pattern analysis seeks to discover similar patterns, regular events or trends in transaction data over a business period. In sales, historical transaction data lets a business identify sets of items that customers buy together at different times of the year. The business can then use this information to offer customers better deals based on their past purchasing frequency.

    Classification: Classification is a classic data mining technique based on machine learning. It assigns each item in a data set to one of a predefined set of classes or groups, using mathematical techniques such as decision trees, linear programming, neural networks and statistics. In classification, we develop software that learns how to classify data items into groups. For example, given the records of all employees who left the company, we can predict who will probably leave in a future period. In this case we divide the employee records into two groups, "leave" and "stay", and then ask the data mining software to classify the employees into those groups.

    Clustering: Clustering automatically forms meaningful or useful clusters of objects that have similar characteristics. The clustering technique defines the classes and puts objects in each class, whereas in classification objects are assigned to predefined classes. Take book management in a library as an example. A library holds a wide range of books on various topics, and the challenge is to shelve them so that readers can pick up several books on a particular topic without hassle. With clustering, books that share some kind of similarity are kept in one cluster, or on one shelf, and labelled with a meaningful name. Readers who want books on that topic only have to go to that shelf instead of searching the entire library.

    Prediction: Prediction, as its name implies, discovers the relationships among independent variables and between dependent and independent variables. For instance, prediction analysis can be used in sales to predict future profit: if sales is treated as the independent variable, profit is the dependent variable. Based on historical sales and profit data, we can then fit a regression curve and use it for profit prediction.
  • #11
    Artificial neural networks are non-linear, predictive models that learn through training. Although they are powerful predictive modelling techniques, some of the power comes at the expense of ease of use and deployment. One area where auditors can easily use them is when reviewing records to identify fraud and fraud-like actions. Because of their complexity, they are better employed in situations where they can be used and reused, such as reviewing credit card transactions every month to check for anomalies.

    Decision trees are tree-shaped structures that represent decision sets. These decisions generate rules, which are then used to classify data. Decision trees are the favored technique for building understandable models. Auditors can use them to assess, for example, whether the organization is using an appropriate cost-effective marketing strategy that is based on the assigned value of the customer, such as profit.

    The nearest-neighbor method classifies dataset records based on similar data in a historical dataset. Auditors can use this approach to define a document that is interesting to them and ask the system to search for similar items.
  • #19
    Numeric Patterns – fictitious invoice numbers, fictitiously-generated transaction amounts…
    Time Patterns – transactions occurring too regularly, activity at unusual times or dates…
    Name Patterns – similar and altered names and addresses…
    Geographic Patterns – proximity relationships between apparently unrelated entities…
    Relationship Patterns – degrees of separation…
    Textual Patterns – detection of “tone” rather than words…