
Module 5

Ethics of data science


An actuary

● An actuary is a professional who uses mathematics, statistics, and financial theory to assess and manage risk and uncertainty.
● They help businesses and clients develop policies to
minimize the financial costs of potential events.
Responsibilities of actuaries around data science and
AI

Actuaries play a critical role in integrating data science and AI into their
work, enhancing their ability to analyze risks and predict future outcomes.
Their responsibilities in this area include:
1. Data Management and Integrity: Actuaries ensure the accuracy,
quality, and ethical use of large datasets, crucial for building reliable
predictive models. They clean and prepare data for analysis, ensuring
it meets high standards.
2. AI and Machine Learning Model Development: Actuaries
develop and apply AI and machine learning models to predict
risks, price insurance products, and optimize financial strategies.
They ensure that these models are transparent, explainable,
and aligned with actuarial principles.
3. Risk Evaluation and Mitigation: They assess the risks
introduced by AI, such as algorithmic bias or cybersecurity
vulnerabilities. Actuaries are responsible for ensuring fairness in
the use of AI and mitigating potential negative impacts.
4. Ethical and Regulatory Compliance: Actuaries ensure
that AI and data science practices comply with industry
regulations and ethical standards, preventing discriminatory
outcomes in areas like insurance underwriting.
5. Automation and Process Optimization: They use AI to
automate repetitive tasks, such as claims processing or data
analysis, improving efficiency and allowing actuaries to focus
on more strategic decision-making.
Data science ethics

The ethics of data science revolve around the moral principles and
guidelines that govern the responsible use of data in research, analysis,
and application.

1. Privacy and Consent


● Data Privacy: Ensuring that personal data is protected and that
sensitive information is not misused.
● Informed Consent: Individuals should be informed about how their
data will be used and must provide consent.
● Anonymization: Data should be anonymized or de-identified to
protect individual identities when analyzing personal information.
2. Bias and Fairness
● Bias in Data: Data sets may have inherent biases that can lead to unfair
outcomes, especially in predictive models. Ensuring data is
representative and fair is essential.
● Algorithmic Fairness: Algorithms should not disproportionately
disadvantage or discriminate against particular groups based on race,
gender, socioeconomic status, or other factors.
● Equity in Outcomes: Consideration should be given to ensuring equitable outcomes for all groups, especially when algorithms are used in high-stakes areas like healthcare or criminal justice (a minimal fairness check is sketched below).
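
To make the algorithmic-fairness point above concrete, here is a minimal Python sketch of a demographic-parity check on a model's decisions. The toy data, column names, and the 80% rule-of-thumb threshold are illustrative assumptions, not part of the original material.

```python
import pandas as pd

# Hypothetical scored dataset: one row per person, with a protected
# attribute ("group") and the model's binary decision ("prediction").
scores = pd.DataFrame({
    "group":      ["A", "A", "A", "A", "B", "B", "B", "B"],
    "prediction": [1,   0,   1,   1,   0,   0,   1,   0],
})

# Demographic parity: the rate of positive decisions per group.
rates = scores.groupby("group")["prediction"].mean()
print(rates)  # group A: 0.75, group B: 0.25

# A common rule of thumb (the "80% rule"): flag the model if the
# worst-treated group's rate is below 80% of the best-treated group's.
disparate_impact = rates.min() / rates.max()
if disparate_impact < 0.8:
    print(f"Potential disparate impact: ratio = {disparate_impact:.2f}")
```

Checks like this belong in model validation alongside accuracy metrics, so unfair treatment is caught before deployment.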
3. Transparency and Explainability
● Transparency: Data scientists must be transparent about the
methods and data used, especially when their work affects public
policy or individuals' lives.
● Explainability: Models, especially complex ones like deep learning, can be opaque ("black-box" models). It is important to make these models interpretable or explainable so that their decisions can be understood by non-experts (see the sketch below).
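
As one illustration of explainability, the sketch below fits an inherently interpretable model (logistic regression) on a public scikit-learn dataset and lists the most influential features, so a non-expert can see what drives its predictions. The dataset and model choice are assumptions made for this example; genuinely black-box models would need dedicated explanation techniques instead.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Fit an inherently interpretable model on a public dataset.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X, y)

# Each coefficient shows how strongly a measurement pushes a prediction
# towards the positive class, in a form that can be explained to non-experts.
coefs = model.named_steps["logisticregression"].coef_[0]
top = sorted(zip(X.columns, coefs), key=lambda pair: -abs(pair[1]))[:5]
for name, weight in top:
    print(f"{name:>25s}: {weight:+.2f}")
```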
4. Accountability
● Responsibility: Data scientists and organizations must take
responsibility for the outcomes of their models, especially when
errors or biases occur.
● Governance and Regulation: There should be appropriate
oversight, laws, and policies to ensure that data science is applied
in a way that aligns with ethical standards.
5. Impact on Society
● Social Good: Data science should be used to promote social
good and minimize harm. Professionals must consider the
broader impact of their work on society.
● Autonomy and Human Rights: Data science should respect
human rights and personal autonomy, avoiding manipulative
practices such as "surveillance capitalism" or unjust data-based
decision-making.
6. Data Ownership and Usage
● Who Owns Data?: Questions about data ownership,
particularly for large platforms that collect massive amounts of
user data, need ethical considerations.
● Usage Boundaries: Even if data is legally obtained, its usage
might have ethical implications. For example, predictive policing
based on historical data may perpetuate existing injustices.
Owners of the data
● The owners of the data refer to individuals,
organizations, or entities that have legal rights and
control over the data.
● Data ownership determines who has the authority to
decide how the data is accessed, used, shared, and
monetized.
1. Individuals (Data Subjects):
Individuals are often considered the rightful owners of their personal
data. For example, personal information such as names, addresses,
medical records, and browsing behavior is directly linked to
individuals.
Example: If a person uses a fitness tracker, the data collected from
the device (heart rate, activity level, etc.) belongs to the individual, but
the company may also have rights to process and use it depending on
the terms agreed to.
2. Organizations that Collect the Data (Data Controllers):
Companies and organizations that collect and store data are often
considered data controllers. While they may not “own” the personal
data in the strictest sense, they control how the data is processed and
used.
Example: Social media platforms like Facebook or Twitter collect vast
amounts of user data, which they control and often monetize. Even
though the data comes from individuals, the platform has the right to
use the data based on the user agreements signed during account
creation.
3. Data Processors (Third-Party Organizations):
● Data processors are entities that process data on behalf of the data
controller. While they may not "own" the data, they have access to it
and are bound by the agreements they have with the data controller.
They must adhere to privacy regulations and terms of use outlined by
the data controller.
● Example: A cloud service provider storing customer data for a bank
would be a data processor. They do not own the customer data but
are responsible for safeguarding it and following the bank's
instructions regarding its use.
4. Government Bodies:
● Governments may claim ownership of certain types of data,
particularly public data such as demographic information or data
collected for regulatory purposes (e.g., tax records). Governments
also have rights over data that they collect for national security or
public health.
● Example: A government agency like the U.S. Census Bureau owns
the national population data collected during the census, and this data
is used for policy-making and resource allocation.
5. Data Brokers and Aggregators:
● Data brokers are companies that collect data from various
sources, aggregate it, and sell or license it to other companies.
While they may not technically "own" the original raw data, they
often gain ownership of the compiled datasets they create by
adding value (e.g., cleaning, organizing, or analyzing data).
● Example: CIBIL, one of India's major credit information companies, collects and aggregates data on individuals' credit histories from banks and financial institutions. The data is sold to banks, insurance companies, and lenders to assess an individual's creditworthiness, which helps them make decisions on loans and credit approvals.
Valuing Different Aspects of Privacy in Data Science

Data Privacy
Data privacy refers to the right of individuals to control how
their personal information is collected, used, and shared by
others.
This can include a wide range of information, from basic
demographic data such as age and gender to more sensitive
data such as medical history or financial information.
Laws that govern data privacy
1. General Data Protection Regulation (GDPR):
The GDPR is a European Union (EU) regulation that came into effect in 2018 and applies to all EU member states. The GDPR aims to protect individuals' personal data by regulating its processing and transfer, giving individuals more control over their data, and establishing penalties for non-compliance.
2. California Consumer Privacy Act (CCPA):
The CCPA is a law enacted in California, USA, in 2018 that gives California
residents more control over their personal data. The CCPA requires
organizations to disclose the types of personal data they collect, how it is used,
and to whom it is sold. The CCPA also gives individuals the right to request
access to their personal data, have it deleted, and opt out of its sale.
India’s data privacy laws

1. India Digital Personal Data Protection Act (DPDPA) of 2023

The Act was enacted in August 2023 and applies to organizations that process the personal data of individuals in India.
The law requires companies to get users' consent before processing their
personal data, and gives users the right to withdraw consent at any time. The
law also establishes the Data Protection Board (DPB) to enforce the law and
ensure companies comply with data protection regulations.
2. Information Technology Act of 2000
This act and its subsequent revisions recognize the need for
data privacy protection in India. The act includes provisions
that cover some data collection, usage, retention, and
disclosure issues.
The Supreme Court of India has also recognized the right
to privacy as a fundamental right under Article 21 of the
Constitution of India.
How is data privacy managed and preserved in data science?
1. Anonymization and pseudonymization:
● Anonymization involves removing personally identifiable information from data, so it cannot be linked back to an individual.
● Pseudonymization involves replacing personally identifiable information with a pseudonym or code, so the data is not directly identifiable but can be linked back to an individual if necessary (a minimal sketch follows below).
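
A minimal pseudonymization sketch in Python, assuming a pandas DataFrame with an email identifier and a secret key managed separately from the data (both are illustrative): the identifier is replaced with a keyed hash, so only a key-holder can re-link records to individuals.

```python
import hashlib
import hmac

import pandas as pd

# Hypothetical raw records containing a direct identifier (email).
df = pd.DataFrame({
    "email":      ["alice@example.com", "bob@example.com"],
    "heart_rate": [72, 88],
})

# Assumed to be stored and rotated separately from the data itself.
SECRET_KEY = b"keep-this-key-out-of-the-dataset"

def pseudonymize(value: str) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256); only a
    key-holder can reproduce the mapping back to the individual."""
    return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()

df["subject_id"] = df["email"].map(pseudonymize)
df = df.drop(columns=["email"])  # the direct identifier is not retained
print(df)
```

Full anonymization would go further, for example by also generalizing or suppressing quasi-identifiers so records cannot be re-identified even indirectly.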
2. Data minimization:
● Data minimization involves collecting and storing only the minimum amount of data necessary to achieve a specific purpose.
● This can help to reduce the amount of personal data being collected and limit the risk of data breaches or privacy violations.
● Data scientists should also consider the specific types of data they are collecting and ensure that sensitive or confidential information is not being collected unnecessarily (see the sketch below).
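
A small Python sketch of data minimization with pandas: only the columns needed for the stated purpose survive ingestion. The column names and the modelling purpose are assumed for illustration.

```python
import pandas as pd

# A hypothetical ingestion step: the source exposes many fields,
# but only the ones needed for the stated purpose are ever kept.
raw = pd.DataFrame({
    "name":            ["Alice", "Bob"],   # not needed for the model
    "home_address":    ["...", "..."],     # not needed for the model
    "tenure_months":   [14, 3],
    "monthly_charges": [59.9, 20.0],
    "churned":         [0, 1],
})

REQUIRED_COLUMNS = ["tenure_months", "monthly_charges", "churned"]
dataset = raw[REQUIRED_COLUMNS].copy()  # sensitive fields never enter the pipeline
print(dataset)
```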
3. Access controls:
● Access controls can help to protect data privacy by limiting who has access to sensitive or confidential information.
● Access controls can include password protection, multi-factor authentication, or other security measures to limit access to sensitive data.
● Data scientists should also ensure that access controls are regularly reviewed and updated so that access is only granted to authorized personnel (a minimal role-based check is sketched below).
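
A minimal sketch of a role-based access check in Python. The roles, permissions, and column names are illustrative assumptions; real deployments would rely on the organization's identity and access management tooling rather than hand-rolled checks.

```python
# Columns considered sensitive and the roles allowed to read them
# (all names here are illustrative).
SENSITIVE_COLUMNS = {"medical_history", "salary"}
ROLE_PERMISSIONS = {
    "analyst": {"can_read_sensitive": False},
    "actuary": {"can_read_sensitive": True},
}

def readable_columns(role: str, columns: list[str]) -> list[str]:
    """Return only the columns this role is authorised to read."""
    perms = ROLE_PERMISSIONS.get(role, {"can_read_sensitive": False})
    if perms["can_read_sensitive"]:
        return list(columns)
    return [c for c in columns if c not in SENSITIVE_COLUMNS]

print(readable_columns("analyst", ["age", "salary", "region"]))  # ['age', 'region']
print(readable_columns("actuary", ["age", "salary", "region"]))  # all three columns
```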
4. Secure data storage:
● Data scientists should ensure that personal data is stored securely and encrypted if necessary.
● This can include using secure servers or cloud storage services, implementing firewalls or other security measures, and regularly backing up data to prevent data loss.
● Data scientists should also ensure that data storage policies comply with privacy regulations, such as the GDPR or CCPA (an encryption sketch follows below).
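
A short sketch of encrypting a sensitive record before it is written to disk or uploaded to cloud storage, using the `cryptography` library's Fernet symmetric encryption. The record content is made up, and in practice the key would be held in a key-management service rather than generated next to the data.

```python
from cryptography.fernet import Fernet  # pip install cryptography

# Illustrative only: a real system would fetch the key from a key-management
# service and never store it alongside the encrypted data.
key = Fernet.generate_key()
fernet = Fernet(key)

record = b'{"patient_id": "P-104", "diagnosis": "..."}'
token = fernet.encrypt(record)  # ciphertext that is safe to store

# Only holders of the key can recover the original record.
assert fernet.decrypt(token) == record
```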
5. Data sharing agreements:
● Data sharing agreements can help to protect data privacy when sharing data with third parties.
● Data scientists should ensure that data-sharing agreements include provisions for protecting data privacy, such as requiring third parties to comply with relevant privacy regulations, implementing appropriate security measures, and limiting the use of data to specific purposes.
6. Ethical data science practices:
● Ethical data science practices can include regular review and updating of privacy policies and procedures, ensuring that privacy considerations are integrated into all stages of the data lifecycle, and promoting transparency and accountability in data handling and analysis.
The five C's of data science ethics
The five C's of data science ethics are guiding principles to ensure
responsible and ethical practices in data science.

1. Consent
● What it means: Ensuring that individuals are informed about how
their data is being collected, used, and shared, and that they have
agreed to these practices.
● Key considerations: Clear communication, voluntary participation,
and respecting withdrawal of consent.
● Example: Obtaining informed consent for data collection in a mobile health application.
2. Clarity
● What it means: Data collection and model building should be
transparent. Users should understand how algorithms work, what data is
used, and the potential impacts of decisions made by the models.
● Key considerations: Transparency, simplicity in explanation, and
ensuring the public or stakeholders can understand how data is
processed.
● Example: Publishing an easily understandable explanation of how a
recommendation algorithm works, so users know why they receive certain
recommendations.
3. Consistency
● What it means: Ethical practices must be applied consistently
across all phases of data science projects and across all
individuals affected.
● Key considerations: Ensuring fairness, equal treatment, and
avoiding biases in data collection, model training, and deployment.
● Example: Ensuring that a machine learning model treats all
demographic groups fairly and is not biased against any particular
race or gender.
4. Confidentiality
● What it means: Protecting individuals’ privacy and ensuring that sensitive
or personal data is kept secure and is not disclosed without proper
authorization.
● Key considerations: Data anonymization, encryption, and adhering to
privacy laws like GDPR.
● Example: Using anonymized datasets when developing models and
ensuring data storage systems are secure from unauthorized access.
5. Consequences
● What it means: Understanding the potential social, economic, and
personal impacts of data-driven decisions and ensuring that data
science outcomes do not cause harm.
● Key considerations: Assessing the downstream effects of data
science work, including unintended negative consequences.
● Example: Testing algorithms for any harmful bias, ensuring that
predictive models in fields like criminal justice or hiring do not
perpetuate inequality.
Steps for obtaining informed consent
1. Provide Clear Information:
● Explain what data will be collected (e.g., personal information, browsing habits,
health data).
● Describe how the data will be used, who will have access, and if it will be shared
with third parties.
● Include the purpose of data collection (e.g., research, improving services,
marketing).
2. Explain the Risks and Benefits:
● Outline any potential risks, such as data breaches or unintended consequences.
● Explain how the individual benefits from sharing their data (e.g., personalized
services, contributing to research).
3. Ensure Voluntariness:
● Consent must be freely given without coercion.
● Offer the ability to decline or revoke consent at any time without
facing negative consequences.
4. Use Plain Language:
● Avoid technical jargon so that individuals can easily understand
the terms of consent.
● Include a summary of the terms for clarity, with the option to read
more detailed policies.
5. Allow for Opt-Out:
● Give individuals the choice to opt out of specific data uses (e.g., marketing) or withdraw their consent entirely (see the sketch after this list).
6. Comply with Legal Standards:
● Ensure the consent process complies with laws like
GDPR or CCPA, which require specific, informed, and
voluntary consent.
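
A minimal sketch of how a consent record supporting withdrawal and purpose-specific opt-out (steps 3 and 5 above) might be represented in Python. The class and field names are illustrative assumptions, not a prescribed design.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class ConsentRecord:
    """Illustrative consent record: specific purposes, dated, revocable."""
    subject_id: str
    purposes: set[str]                    # e.g. {"research", "marketing"}
    granted_at: datetime
    withdrawn_at: datetime | None = None

    def withdraw(self, purpose: str | None = None) -> None:
        """Opt out of one purpose, or withdraw consent entirely."""
        if purpose is None:
            self.purposes.clear()
            self.withdrawn_at = datetime.now(timezone.utc)
        else:
            self.purposes.discard(purpose)

    def allows(self, purpose: str) -> bool:
        return purpose in self.purposes

consent = ConsentRecord("user-42", {"research", "marketing"},
                        granted_at=datetime.now(timezone.utc))
consent.withdraw("marketing")        # opt out of a specific use
print(consent.allows("marketing"))   # False
print(consent.allows("research"))    # True
```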
