Research Methodologies Tutorial

The document outlines the importance of research as a systematic process for discovering new knowledge and addressing various problems, emphasizing its role in business decision-making and societal progress. It discusses different types of research, characteristics, methodologies, and the philosophical foundations that guide research practices. Additionally, it highlights the benefits and challenges of conducting research, as well as the responsibilities of researchers in the research process.

Uploaded by

inyanglinus14

CHAPTER ONE

Research Methodology
Introduction
This chapter aims to give the reader an integrated view of several questions related to research.
Research is a process of discovering new knowledge to enrich the progress of society. It is one of
the most essential activities for any business venture.
Companies in various industries spend billions of dollars on research every year. According to
Statista (2022), Amazon, Alphabet and Volkswagen are the top three global companies in terms
of research spending.
Research evolves from the human quest for knowledge. It is therefore shaped by the theoretical
perspectives or orientations that guide the logic of enquiry, the tools and techniques of data collection
and analysis, and the systematic format for organizing the research and reporting its findings. The
hallmarks, or main distinguishing characteristics, of scientific research include purposiveness, rigor,
testability, replicability, precision and confidence, objectivity, generalizability and parsimony. There are
different ‘types of research’ depending on their nature and field of specialization. It is therefore useful
first to look at the underlying basic features so as to be able to identify the different types of research
within their respective notations and usage. The type of research also varies depending on the objectives
of the study.

What is research?
‘Research is a systematic process of discovery and advancement of human knowledge’ (Gratton
& Jones, 2009, p.4). According to Theodorson and Theodorson (1969) it refers to any honest
attempt to study a problem systematically or to add to man’s knowledge of a problem.
According to Saunders et al. (2007) research is something that people undertake to find out
things in a systematic way, thereby increasing their knowledge.

Research is the process of gathering information and data to better understand a particular topic
or phenomenon. It involves using various methods and techniques to collect, analyze, and
interpret data to draw meaningful conclusions.

Research is used to answer questions and solve problems, as well as identify trends and
opportunities. It is an essential tool for businesses and organizations, as it allows them to make
more informed decisions and stay competitive in their respective market.

Companies often conduct market research to create strategies for growth, develop new products
and services, and improve existing ones. The R&D department is usually entrusted with research
activities.

Characteristics of research
Based on the definitions above, there are several characteristics of research that researchers
should be familiar with. They are as follows:
* Research is a systematic (stage by stage) process. An appropriate process must be followed in
order to conduct a study.
* Research is usually conducted to study a problem.
* Researchers conduct an in-depth and critical analysis of all data that they have collected to
ensure that there is no error in the interpretation.

* Research is based on observation or direct experience by the researchers.
* Research is objective, unbiased, and logical.
Saunders et al. (2007) state that there are three key characteristics of research as follows:

1. Data are collected systematically.


2. Data are interpreted systematically.
3. There is a clear purpose: to find things out.

Research problems
Research can be done on any problem that interests the researcher. For example, a company may
have lost market share in recent years, and a researcher might be interested in finding out the
possible reasons. However, the research must have some relevance in order to attract the interest
of a wider audience.

How to choose a research problem?


In colleges and universities, dissertation supervisors (research supervisors) guide and help
student researchers choose a research problem. According to the University of Southern
California (2023) supervisors may do so in three different ways as follows:

1. They may provide researchers with a general topic from which they need to
identify and study a particular aspect.
2. They may provide a list of possible research topics and the researchers need to
select a topic from that list
3. They may simply leave it to the researchers to select a topic to carry out the study
(with their approval though!)

Research topics
Research topics need to be interesting and relevant to the academic and professional fields of the
researcher. How will the research benefit society? How will it benefit the researcher
academically and professionally?

Is there a clear purpose for conducting the research? Answers to these questions should help the
researcher identify a good research topic. The following list offers some interesting examples of
research topics:

* An analysis of the impact of lockdowns on small businesses.
* An analysis of the impact of date on small businesses.
* An investigation into the challenges of engaging students in online learning.
* Evaluating the impact of working from home.
* Analyzing the impact of social media on attracting new customers.
* Exploring potential impacts of income inequality on anti-social behavior in New York.
* Understanding the impact of managerial styles on employee performance – a case study of
McDonald’s.
* Exploring social media as a new market.
* Pros and cons of working from home from an employer’s perspective.
* Relationship between customer service and customer satisfaction – a case study of Walmart.

Types of research
Research can be classified into different types, for example primary research, secondary research,
exploratory research, descriptive research, explanatory research, predictive research, quantitative
research and qualitative research. The main types are described below:
Primary research
Primary research is also called field research. According to Gratton & Jones (2009) primary
research refers to research that has involved the collection of original data specific to a particular
research project, for example through using research methods such as questionnaires or
interviews.

Secondary research
Secondary research is also called desk research. In this type of research, the researcher does not
collect any primary data and relies on existing sources of data. Market research reports, census
data, company websites, news reports and magazine articles are some of the sources of secondary
data. Secondary research is usually carried out at home or in a library with the help of both the
Internet and printed materials.
Descriptive research
Descriptive research aims to accurately and systematically describe a population, situation or
phenomenon. It can answer what, where, when and how questions, but not why questions. A
descriptive research design can use a wide variety of research methods to investigate one or more
variables.
Explanatory research
Explanatory research is a research method used to investigate how or why something occurs
when only a small amount of information is available pertaining to that topic. It can help you
increase your understanding of a given topic.
Predictive research
Predictive research is chiefly concerned with forecasting (predicting) outcomes, consequences,
costs, or effects. This type of research tries to extrapolate from the analysis of existing
phenomena, policies, or other entities in order to predict something that has not been tried,
tested, or proposed before.
Quantitative research
Quantitative research deals in numbers, logic, and an objective stance. Quantitative research
focuses on numeric and unchanging data and detailed, convergent reasoning rather than
divergent reasoning [i.e., the generation of a variety of ideas about a research problem in a
spontaneous, free-flowing manner].
Qualitative research
Qualitative research is multi-method in focus, involving an interpretative, naturalistic approach
to its subject matter. This means that qualitative researchers study things in their natural settings,
attempting to make sense of, or interpret, phenomena in terms of the meanings people bring to
them.

Research tools (data collection tools)
Researchers can use a variety of data collection tools such as questionnaires, interviews, focus
groups, and observations. Quantitative researchers usually use questionnaires whereas qualitative
researchers use interviews.

Why research? Benefits of research


There may be many reasons for research. For example, it can help the researcher investigate
some existing situation or problem (Hussey and Hussey, 1997). Fundamentally, the purpose of
research is to provide solutions to a problem. The results of research may include, but are not
limited to, new knowledge and better insights into a problem that would otherwise not have been possible.

Research can provide a wealth of information that can be used by organizations and businesses
to make better decisions and stay ahead of competition. It can also help them identify potential
opportunities and threats, as well as determine the best strategies for growth. Here are some of
the benefits of research:

Improved decision-making
Research can provide the data and information necessary to make more informed decisions. This
can help organizations and businesses make better decisions in a timely manner.

Increased efficiency
Research can help identify processes that are inefficient and provide ways to improve them. This
can help organizations and businesses save time and money by streamlining their operations.

Improved customer satisfaction


Research can be used to gain insights into customer behavior and preferences. This can help
organisations and businesses develop products and services that are more tailored to their
customers’ needs, resulting in increased customer satisfaction.

Increased innovation
Research can identify potential opportunities and new ideas that can be used to create innovative
products and services.

Reduced risks
Research can help organizations identify potential risks and take measures to mitigate them. This
can help them reduce the chances of failure and increase the chances of success.

Challenges of conducting research


Cost
Conducting research can be expensive, as it requires a great deal of resources such as time,
money, and personnel. Organizations and businesses must ensure that they have the necessary
resources to conduct research without sacrificing their other operations.
Time
Conducting research can take a great deal of time, as it requires the collection and analysis of
data. Organizations and businesses must ensure that they have enough time to conduct research
without sacrificing their other operations.
What does a researcher do?

Researchers are responsible for conducting research, collecting data, and analysing it to draw
meaningful conclusions. They may also be tasked with writing reports, making
recommendations, and presenting their findings to colleagues and the public.
Researchers typically work in teams and may have to collaborate with other professionals to
ensure the success of their projects. They also need to be able to work independently and take
the initiative in order to complete their tasks on time and within budget.
Methodology
In its broadest sense, methodology is the study of research methods. However, the term can also
refer to the methods themselves or to the philosophical discussion of the associated basic
assumptions. A method is a structured procedure to achieve a given goal, such as acquiring
knowledge or checking knowledge claims. This usually involves several steps, such as
selecting a sample, collecting data from that sample, and interpreting the data. The study of
methods includes a detailed description and analysis of these processes. It also includes
evaluative aspects for comparing different methods: their advantages and disadvantages are
assessed, along with the research purposes for which they can be used. These descriptions and
evaluations are based on philosophical assumptions. Examples include how the phenomena
under study are conceptualized and what counts as evidence for or against them. Understood in
the broadest sense, methodology also includes the examination of more abstract topics.
The term methodology refers to the overall approaches & perspectives to the research process as
a whole and is concerned with the following main issues:
➢ Why you collected certain data
➢ What data you collected
➢ Where you collected it
➢ How you collected it
➢ How you analyzed it
Methodology is, therefore, the manner in which we approach and execute systematic
investigative activities. Within a discipline, there are accepted rules of evidence and reasoning.
Research methodology is a way of explaining how a researcher intends to carry out their
research. It’s a logical, systematic plan to resolve a research problem. A methodology details a
researcher’s approach to the research to ensure reliable, valid results that address their aims and
objectives.
Research methodology provides the principles for organizing, planning, designing and
conducting research. (It does not tell you how to do specific research.)
A research method, on the other hand, refers only to the various specific tools or ways in which
data can be collected and analyzed (e.g. a questionnaire, an interview checklist, data analysis
software, etc.). It consists of the specific details of how we do the task.
In short, to further differentiate research methodology from research methods:
➢ Methodology – general approaches or guidelines
➢ Methods – specific details and/or procedures to accomplish a task.
Methodology Vs Methods
The differences between research methods and research methodology are:
• A research method is used to find a solution to the research problem. A research methodology,
on the other hand, determines the appropriateness of the methods applied with a view to
ascertaining the solution.
• Research methodology is applied in the initial stages of the research, while research methods
are employed in the later stages of the study.
• Research methods comprise the different techniques used to carry out the investigation, while
methodology encompasses the overall strategy for accomplishing the research objectives.
Examples of methods
➢ Regression analysis, optimization models, surveys, matching, simulation, etc.
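To make the idea of a method as a specific procedure more concrete, the sketch below fits a simple regression line by ordinary least squares. It is only an illustration: the function name fit_line and the income and energy spending figures are hypothetical and do not come from any study.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data: household income and monthly energy spending.
income = [10, 20, 30, 40, 50]
energy_spend = [3, 5, 7, 9, 11]

slope, intercept = fit_line(income, energy_spend)
print(f"energy_spend = {intercept:.2f} + {slope:.2f} * income")
```

Choosing regression over, say, a survey or a simulation, and justifying that choice, is a question of methodology; the calculation itself is the method.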
Research Methodology in Economics
Research methodology in Economics is a study which integrates the various components of
economics to accomplish defined, goal-directed research. It aims at:
➢ expanding our knowledge and making it useful to the study of world economic problems;
➢ helping us to learn by doing under the supervision of an advisor (shown to be an effective
model);
➢ pulling together various aspects of economic theories, methods, and analysis to present them
in a coherent, logical, reliable and useful manner;
➢ providing a time-tested, proven means of producing new, reliable theories and their
applications in solving real world economic problems.

CHAPTER TWO
Philosophical And Conceptual Foundations Of Research
Introduction
Research is a systematic way of finding out things that are not known, which are called research
problems. It is a process that consists of identifying and defining the research problem, formulating
and testing hypotheses through data collection, organization and analysis, making deductions
and reaching conclusions from the test results of the hypotheses, and reporting and evaluating
the research. Before a research proposal is written, there is a need to identify a problem to
address and then the questions to ask about that problem.

Philosophical foundation of research


All philosophical positions and their attendant methodologies, explicitly or implicitly, hold a
view about reality. This view, in turn, will determine what can be regarded as legitimate
knowledge. Philosophy works by making arguments explicit. You need to develop sensitivity
towards philosophical issues so that you can evaluate research critically. It will help you to
discern the underlying, and perhaps contentious, assumptions upon which research reports are
based even when these are not explicit, and thus enable you to judge the appropriateness of the
methods that have been employed and the validity of the conclusions reached.
Obviously, you will also have to consider these aspects in regard to your own research work.
Your research, and how you carry it out, is deeply influenced by the theory or philosophy that
underpins it. There are different ways of going about doing research depending on your
assumptions about what actually exists in reality and what we can know and how we can acquire
knowledge.
Theoretical perspectives relate to theories of knowledge, which lie within the domain of
philosophy. The key concept associated with these perspectives is the paradigm. Some authors
have classified paradigms into the following three categories:

Realism
Realism assumes that there is a real world that is external to the experience of any particular
person and the goal of research is to understand that world.
Constructivism
Constructivism assumes that everyone has unique experience and beliefs and it posits that no
reality exists outside of those perceptions.

Pragmatism
Pragmatism considers realism and constructivism as two alternative ways to understand the world.
However, questions about the nature of reality are less important than questions about what it
means to act and to experience the consequences of those actions.
Knowledge of all these perspectives enables a researcher to make meaningful choices about:
i. the research problem;
ii. the research questions to investigate this problem;
iii. the research strategies to answer these questions;
iv. the approaches to social enquiry that accompany these strategies;
v. the concepts and theory that direct the investigation;
vi. the sources, forms and types of data;
vii. the methods for collecting and analyzing the data to answer these questions.

Conceptual Foundations of Research
Research is deeply influenced by the theory or philosophy that underpins it. There are different
approaches to research depending on your assumptions about what actually exists in reality and
what we can know and how we can acquire knowledge. In our reasoning process, we are faced
with two separate worlds: the world of reality and the imaginary world (the world of theory or
abstraction). Reality is the phenomenon which exists somewhere and can be verified empirically.
The scientist is interested in the world of reality.
In science, we get close to reality through experience, logical reasoning, induction and
deduction. Whenever we develop a process that leads to the eventual attainment of reality, we
have contributed to knowledge.

The researcher works through a thought process that requires some basic scientific concepts or
language in order to understand reality. These are facts, concepts, constructs, rules,
hypotheses, theories and laws. The researcher may also need models, definitions and generalizations.

Fact
A fact is a reality which exists. It is a truth that can be known only by observation or experience.
Most scientific discoveries are facts or realities which are knowable but which, at one time or
another, were not known, experienced or observed. With time and through a gradual scientific
process, the unknown becomes known. Thus, the unknown reality becomes a known fact. For
example, electricity existed in the past but was unknown, as did solar radiation and other forms
of energy.

Concepts
A concept is our perception of reality to which we have attached a word label for the
purpose of identification. To conceptualize means to form a mental image of reality. Concepts are
therefore very important in research and science. A concept expresses an abstraction formed from
our generalization of different forms of reality.

Constructs
A construct is simply a concept that is deliberately defined for a particular scientific purpose;
it becomes a concept when formalized. Thus, there is a thin line between a concept and a
construct.

Variables
A variable is a construct or concept to which numerical values can be attached. It is a concept
which can take on different values. Examples include height, weight and income. There is a need
to identify the different variables so as to know how to manipulate them and obtain the desired
result in a piece of research.

Definitions
A definition is the meaning assigned to a concept or variable in order to define it. There are two
kinds: constitutive and operational.
i. In a constitutive definition, we substitute the concept or construct we are defining with
other concepts or constructs. For example, we define trade as an act of buying and
selling. Thus, we have substituted the concept of trade with the concepts of act,
buying and selling. It gives the dictionary definition of variables, concepts or constructs.

ii. In an operational definition, the concept or construct is assigned the meaning
which we want it to carry throughout the study. Here, the researcher measures or
manipulates the concept to get the desired result. An operational definition can be
either measured or experimental. A measured operational definition states how the
concept will be measured, for example performance or productivity as the quantity of
goods produced, or the number of students supervised and graduated. An experimental
operational definition shows how the researcher can practically manipulate a given
variable to get the result. For example, motivation: give a worker an incentive or
reward and observe whether he/she is motivated.
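The two kinds of operational definition can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the WorkerRecord class, the worker names and all figures are hypothetical. Productivity is defined by measurement (units produced), while motivation is approached experimentally by recording which workers received an incentive.

```python
from dataclasses import dataclass


@dataclass
class WorkerRecord:
    name: str
    units_produced: int       # measured: quantity of goods produced
    received_incentive: bool  # experimental: condition set by the researcher


def productivity(record):
    """Measured operational definition: productivity = quantity of goods produced."""
    return record.units_produced


def mean_productivity(records, incentive):
    """Average productivity of the workers in one experimental condition."""
    group = [productivity(r) for r in records if r.received_incentive == incentive]
    return sum(group) / len(group)


# Hypothetical observations for four workers.
records = [
    WorkerRecord("A", 40, True),
    WorkerRecord("B", 44, True),
    WorkerRecord("C", 35, False),
    WorkerRecord("D", 37, False),
]

with_incentive = mean_productivity(records, True)
without_incentive = mean_productivity(records, False)
print(with_incentive, without_incentive)
```

Whether a difference between the two groups is meaningful is a separate question that would need a test of significance.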

Hypothesis
A hypothesis is a tentative statement about the relationship between two or more variables. It is
a statement about relationships that needs to be tested and subsequently accepted or rejected. It
puts together all the concepts, constructs and variables to give the researcher a clearer view of the
problem(s) under study.
Laws
If a hypothesis is true, it states a law. A law is a statement of an invariant relationship among
observable or measurable properties. We can distinguish laws of nature from laws of science. Laws
of nature hold whether or not anyone is aware of them. Laws of science, on the other hand, are
hypotheses or postulates which are the objects of rational belief based on evidence and which
state the laws of nature.
Theories
A theory is a statement of invariant relationships among measurable phenomena, with the purpose
of explaining and predicting those phenomena. A theory consists of constructs, concepts,
definitions and propositions (hypotheses), all put together to present a systematic view of a
phenomenon and possibly to predict it.

Research Process
The research process refers to the different steps involved in carrying out research, in a
desirable sequence. However, this does not mean that the steps always occur in that sequence.
The following are the series of actions or steps necessary to carry out research effectively:
Formulating the research problem: A research problem is the issue being addressed in a study.
The issue can be a difficulty or conflict to be eliminated, a condition to be improved, a concern
to handle, a troubling question, or a theoretical or practical controversy (or a gap) that exists in
the scholarly literature. It is the focus or reason for engaging in research; it is the obstacle
which hinders the researcher’s path.
Literature review: A literature review is a step-by-step process that involves the identification
of published and unpublished work from secondary data sources on the topic of interest, the
evaluation of this work in relation to the problem, and the documentation of this work (Sekaran
& Bougie, 2009). In order to gain knowledge of your area of research, you need to read and
assess previous studies so as to know where your own study will fit into that body of work.

Developing the hypothesis: A research hypothesis is a predictive statement, capable of being
tested by scientific methods, that relates an independent variable to some dependent variable.

Preparing the research design: The research design constitutes the blueprint for the collection,
measurement and analysis of data. It includes an outline of what the researcher will do, from
writing the hypothesis and its operational implications to the final analysis of data. More
explicitly, the design decisions concern: the sampling design, which deals with the method of
selecting the items to be observed for the given study; the observational design, which relates to
the conditions under which the observations are to be made; the statistical design, which concerns
the question of how many items are to be observed and how the information and data gathered are
to be analysed; and the operational design, which deals with the techniques by which the
procedures specified in the sampling, statistical and observational designs can be carried out.
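As an illustration of a sampling design, the sketch below draws a simple random sample without replacement from a sampling frame. Simple random sampling is only one of several possible designs, and the household labels, the sample size of 10 and the seed are hypothetical choices made for the example.

```python
import random


def simple_random_sample(population, n, seed=0):
    """Select n distinct items so that every item has an equal chance of selection.

    A fixed seed makes the draw reproducible, which helps when the
    sampling procedure has to be documented in the research report.
    """
    rng = random.Random(seed)
    return rng.sample(population, n)


# Hypothetical sampling frame of 100 households.
households = [f"household_{i}" for i in range(1, 101)]

sample = simple_random_sample(households, 10)
print(sample)
```

The number of items to draw (here 10) belongs to the statistical design, while the rule for selecting them belongs to the sampling design.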

Collecting the data: Data collection is the process of gathering and measuring information on
variables of interest, in an established systematic fashion that enables one to answer stated
research questions, test hypotheses, and evaluate outcomes. The goal for data collection is to
capture quality evidence that then translates to rich data analysis and allows the building of a
convincing and credible answer to questions that have been posed. It is one of the most important
stages in conducting research. The task of data collection begins after a research problem has
been defined and the research design or plan has been drawn up. When deciding on the method of
data collection to be used in a study, two types of data should be kept in mind: primary and secondary.

Analysis of data: The analysis of data requires a number of related operations on the raw data,
such as coding and tabulation, followed by the drawing of statistical inferences. Hypotheses stated
earlier are subjected to tests of significance to determine with what validity the data can be said
to indicate any conclusion(s). If the researcher had no hypothesis to start with, he/she might seek
to explain the findings on the basis of some theory.
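The operations named above (coding, tabulation and a test of significance) can be sketched as follows. This is a minimal illustration, not a recommended analysis: the coding scheme, the survey responses and the choice of a permutation test on the difference of group means are all assumptions made for the example.

```python
import random
from collections import Counter
from statistics import mean

# Hypothetical coding scheme: turn text responses into numbers.
CODES = {"disagree": 1, "neutral": 2, "agree": 3}


def code_responses(responses):
    """Coding: replace each raw response with its numeric code."""
    return [CODES[r] for r in responses]


def tabulate(responses):
    """Tabulation: count how often each response occurs."""
    return Counter(responses)


def permutation_p_value(a, b, n_iter=5000, seed=0):
    """Two-sided permutation test on the difference of group means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)  # reassign responses to groups at random
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    return hits / n_iter


# Hypothetical raw data from two groups of respondents.
group_a = ["agree", "agree", "neutral", "agree", "agree", "neutral"]
group_b = ["disagree", "neutral", "disagree", "disagree", "neutral", "disagree"]

table = tabulate(group_a + group_b)
p_value = permutation_p_value(code_responses(group_a), code_responses(group_b))
print(table, p_value)
```

A small p-value suggests that a difference as large as the one observed would rarely arise if group membership were unrelated to the responses.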

Preparation of the research report: At the final stage, the report should be written with
attention to (i) the preliminary pages, (ii) the main text, and (iii) the end matter.

i. The preliminary pages contain the title, acknowledgements, certification, a table of
contents and, if the report includes any, a list of tables and a list of graphs and charts.

ii. The main text of the report should contain: an introduction, with a clear statement of
the objective of the research and the scope of the study along with its various
limitations; a review of the conceptual, empirical and theoretical literature; an
explanation of the methodology adopted in accomplishing the research; and a statement
of findings and recommendations in non-technical language. If the findings are
extensive, they should be summarized.

iii. At the end of the report, a bibliography (i.e., a list of the books, journals, reports, etc.,
consulted) should be given, and appendices should be included for all technical
data.

Identification of Research Problem


Before moving on to the identification of a research problem, it is important to know what a
research problem is. A research problem is the issue being addressed in a research study. The issue
can be a difficulty or conflict to be eliminated, a condition to be improved, a concern to handle, a
troubling question, or a theoretical or practical controversy (or a gap) that exists in the scholarly
literature. It is the focus or reason for engaging in research; it is the obstacle which hinders the
researcher’s path.
A research problem helps in narrowing the topic down to something that is reasonable for
conducting a study. Creswell (2012) defined a research problem as “a general educational issue,
concern, or controversy addressed in research that narrows the topic”. Ogbonna (2006) defines a
research problem as a felt difficulty, a puzzle, a vague feeling or a quest in the researcher’s mind
to complete a blank or fill in a gap in the researcher’s experience. Awotunde and Ugodulunwa
(2004) define a research problem as an unanswered question.

Identification of the research problem is the first step in the research process. It serves as the
foundation upon which the other activities in the research process are built; if it is well
formulated, you can expect a good study to follow. Student researchers often find it difficult to
identify research problems. Identifying the research problem can be accomplished by asking
ourselves the following questions: What is the issue, problem, or controversy that needs to be
addressed? What controversy leads to a need for this study? What were the concerns being
addressed prior to this study? The following steps are to be followed in identifying a research problem:
a. Determine the field in which the research is to be conducted.
b. Develop mastery of the area.
c. Review previous research conducted in the area to learn about recent trends and studies;
this will aid in identifying the problem.
d. Draw on analogy and insight in identifying a problem, employ personal experience of the
field in locating the problem, or seek assistance from an expert in the field.
e. Pinpoint the specific aspect of the problem which is to be investigated.

The Sources of Research Problem:


a. The classroom, school, home, community and other agencies of education are obvious
sources.
b. Social developments and technological changes are constantly bringing forth new
problems and opportunities for research.
c. Records of previous research: specialized sources such as encyclopedias of
educational research, research abstracts, research bulletins, research reports, research
journals, dissertations and many similar publications are rich sources of research
problems.
d. Textbook assignments, special assignments, reports and term papers will suggest
additional areas of needed research.
e. Discussions: Classroom discussions, seminars and exchanges of ideas with faculty
members and fellow scholars and students will suggest many stimulating problems to be
solved. Close professional relationships, academic discussions and a constructive academic
climate are especially advantageous opportunities.
f. Questioning attitude: A questioning attitude towards prevailing practices and
research-oriented academic experience will effectively promote problem awareness.

g. The most practical source of a problem is to consult a supervisor, experts or the most
experienced persons in the field. They may suggest the most significant problems of the area.

Steps in formulating a research problem
The formulation of a research problem is the most crucial part of the research, as the quality and
relevance of the research depend entirely upon it. After identifying the research problem,
the next step is to formulate it. Every step that constitutes the ‘how’ part of the
research journey depends upon the way the researcher formulates his/her research problem.
The process of formulating a research problem consists of the following steps:
a. Identify a broad field or subject area of interest: A researcher should think of a research area that
he/she is interested in. A researcher who is not interested in a research area will not be keen
on the research, whereas an area of genuine interest gives him/her the zeal to conduct the
research effectively. For example, an economic researcher
might be interested in researching consumer behavior, energy pricing, etc. As far as the research
journey goes, these are the broad research areas. It is imperative that he/she identifies one of
interest before undertaking the research journey.
b. Divide the broad area into subareas: At the outset, the researcher will realize that the broad
areas mentioned (consumer behavior and energy pricing) have many aspects. For example, there
are many aspects and issues in the area of energy, such as oil, gas, wind, hydro and thermal. Make a
list of these subareas. In preparing this list, consult others who have
some knowledge of the area and review the literature in the subject area.
c. Select what is of most interest: The researcher should select the
subarea that is most suitable and appealing to him/her. This is because the researcher’s interest
should be the most important determinant for selection, even though there are other
considerations. One way to decide what interests the researcher most is to start with a process
of elimination: go through the list and delete all those subareas in which he/she is not very
interested.
d. Raise research questions: At this step the researcher should ask himself/herself, ‘What is it that I
want to find out about in this subarea?’ He/she should make a list of whatever questions come to
mind involving the chosen subarea and, if they are too many to be handled,
choose the ones that are more important and discard the rest.
e. Formulate objectives: Both the main objectives and sub-objectives should be formulated, and the
objectives should be drawn from the research questions.
The main difference between objectives and research questions is the way in which they
are written. Research questions are, as the name implies, questions. Objectives transform
these questions into behavioral aims by using action-oriented words such as ‘to find out’,
‘to determine’, ‘to ascertain’, ‘to examine’ and ‘to analyze’.
f. Assess objectives: The researcher should examine the objectives to establish the feasibility
of achieving them through the research process, considering them in the light of the time, resources
(financial and human) and technical expertise at his/her disposal.
g. Double-check: The researcher should go back and give final consideration to whether or
not he/she is sufficiently interested in the study, and have adequate

CHAPTER THREE
Developing Research questions and hypotheses
Research questions
Research questions help researchers to focus on their research by providing a path through the
research and writing process.
Steps involved in developing a research question:
a. Choose an interesting general topic: Most professional researchers focus on topics they
are genuinely interested in studying. A researcher should always choose a broad topic
about which he/she would like to know more. An example of a general topic might be
“greenhouse gas emissions and global warming.”
b. Do some preliminary research on the general topic and narrow it down: Search current
periodicals and journals on the topic to see what has already been done; this will assist in
narrowing the focus. What issues are scholars and researchers discussing when it comes
to the topic? What questions occur to the researcher as he/she reads these articles?
c. Consider the audience: For most college papers, your audience will be academic, but
always keep your audience in mind when narrowing your topic and developing your
question. Would that particular audience be interested in the question you are developing?
d. Ask questions relating to the research topic: Taking into consideration all of the above,
the researcher should ask “how” and “why” questions about the general topic. For
example, “How do greenhouse gas emissions contribute to global warming?” or “Why
do greenhouse gas emissions contribute to global warming?”
e. Evaluate the question asked: After writing down the questions on paper, evaluate these
questions to determine whether they would be effective research questions or whether
they need more revising and refining.

➢ Is the research question clear? With so much research available on any given topic, research
questions must be as clear as possible in order to be effective in helping the writer direct his or
her research.
➢ Is the research question focused? Research questions must be specific enough to be well
covered in the space available.
➢ Is the research question complex? Research questions should not be answerable with a simple
“yes” or “no” or by easily-found facts. They should, instead, require both research and analysis
on the part of the writer. They often begin with “How” or “Why.”

f. Start the research: After coming up with a question, think about the possible paths the
research could take. What sources should be consulted as the researcher seeks answers to the
question(s)? What research process will ensure that a variety of perspectives and
responses to the question(s) are found?

Development of hypothesis
A hypothesis is a tentative answer to a research problem that is advanced so that it can be tested.
The definition of a hypothesis stresses that it can be tested. To meet this criterion, the concepts
employed in the hypothesis must be measurable. Developing hypotheses requires that a
researcher identifies one variable that causes, affects, or has an influence on, another variable.
The variable that affects other variables is called the independent/regressor/explanatory
variable, while the variable that is explained by the independent variable is called the
dependent/regressand/response variable. After an extensive literature review, the researcher should
state in clear terms the research hypothesis or hypotheses. A research hypothesis is a tentative
assumption made in order to draw out and test its logical or empirical consequences. As such, the
way in which research hypotheses are developed is particularly important, since they provide the
central point for research. They also affect the way in which tests must be conducted in the
analysis of data and, indirectly, the quality of data required for the analysis.

In economic research, the development of research hypothesis plays an important role.


A hypothesis should be very specific and limited to the piece of research in hand, because it has to
be tested. It should also be simple and conceptually clear. There is no place for ambiguity in the
construction of a hypothesis, as ambiguity will make the verification of the hypothesis almost
impossible. It should be ‘one-dimensional’, that is, it should test only one relationship at a time.

The role of the hypothesis is to guide the researcher by delimiting the area of research and to
keep him/her on the right track. It sharpens his/her thinking and focuses attention on the more
important aspects of the problem. It also indicates the type of data required and the type of
methods of data analysis to be used.

A research hypothesis is a predictive statement, capable of being tested by scientific
methods, that relates an independent variable to some dependent variable. For example,
“Government expenditure causes economic growth.”
In statistical terms, the alternative hypothesis (H1) is the one the researcher wishes to support, and
the null hypothesis (H0) is the one the researcher wishes to disprove. Thus, the null
hypothesis represents the hypothesis to be rejected, and the alternative hypothesis represents the
hypothesis to be accepted. For example:
H0: Government expenditure does not cause economic growth.
H1: Government expenditure causes economic growth.
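
As a hedged illustration of how such a pair might be made testable, the sketch below assumes a simple linear regression of economic growth on government expenditure. The model and the symbols g_t, G_t, alpha, beta and epsilon_t are assumptions introduced here for illustration only; they are not specified in this tutorial.

```latex
g_t = \alpha + \beta\, G_t + \varepsilon_t
\qquad
H_0:\ \beta = 0
\qquad
H_1:\ \beta \neq 0
```

Under this sketch, rejecting H0 indicates a statistical association between the two variables. A causal reading would need further support from the research design.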

Characteristics of Hypothesis.
Hypothesis should be:
a. Clear and precise.
b. Capable of being tested.
c. Able to state relationship between variables, if it happens to be a relational hypothesis.
d. Limited in scope and must be specific.
e. Stated in simple terms.
f. Consistent with most known facts i.e. one which judges accept as being the most likely.
g. Testable within a reasonable time for one cannot spend a life-time collecting data to test it.
Steps in Developing Research Hypotheses
a. Discussions with colleagues and experts about the problem, its origin and the objectives
in seeking a solution.
b. Examination of data and records, if available, concerning the problem for possible trends,
peculiarities and other clues of the research problem.
c. Review of similar studies in the area or of the studies on similar problems.
d. Exploratory personal investigation which involves original field interviews on a limited
scale with interested parties and individuals with a view to secure greater insight into the
practical aspects of the problem.

Therefore, research hypotheses arise as a result of a-priori thinking about the subject,
examination of the available data and material including related studies and the counsel of
experts and interested parties. Research hypotheses are more useful when stated in precise and
clearly defined terms.

Errors in research
When a researcher tests a hypothesis, two types of errors may occur in his/her research
procedures. These are the Type I error and the Type II error.

Type I error: If the null hypothesis is true but the researcher decides to
reject it, the resulting error is called a Type I error (false positive). It occurs when the
researcher concludes that there is a statistically significant difference when in reality none
exists. This is analogous to a test that shows a patient to have a disease when in fact the patient
does not have the disease. Imagine the consequences of a Type I error in an HIV test that
promises a 99% accuracy rate.

Type II error: If the null hypothesis is actually false and the alternative hypothesis
is true, but the researcher decides not to reject the null hypothesis, he/she is said to commit a Type
II error (false negative). For example, a blood test failing to detect the disease it was designed to
detect in a patient who really has the disease is a Type II error.

Neyman and Pearson (1928) were the first to describe both Type I and Type II errors (Mohajan, 2017). The
Type I error is often regarded as more serious than the Type II error, because the researcher has wrongly
rejected a true null hypothesis. Both Type I and Type II errors are factors that every researcher must take into
account.
A research decision is error-free in two cases:
i. If the null hypothesis is true and the decision is made not to reject it, and
ii. If the null hypothesis is false and the decision is made to reject it.
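
The two error types can be made concrete with a small simulation. The sketch below is only an illustration under stated assumptions: a two-sided z-test with known standard deviation, a 5% significance level, samples of 30 observations, and a true mean of 0.5 when the null hypothesis is false. The function names `reject_null` and `rejection_rate` are hypothetical.

```python
import random
from statistics import NormalDist, mean


def reject_null(sample, mu0=0.0, sigma=1.0, alpha=0.05):
    """Two-sided z-test of H0: population mean == mu0, with known sigma."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / n ** 0.5)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical


def rejection_rate(true_mean, trials=2000, n=30, seed=1):
    """Share of simulated studies in which H0 (mean == 0) is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        if reject_null(sample):
            rejections += 1
    return rejections / trials


# H0 is true here, so every rejection is a Type I error.
type_i_rate = rejection_rate(true_mean=0.0)
# H0 is false here, so every failure to reject is a Type II error.
type_ii_rate = 1 - rejection_rate(true_mean=0.5)

print(f"Type I error rate  ~ {type_i_rate:.3f}")
print(f"Type II error rate ~ {type_ii_rate:.3f}")
```

With these assumptions the Type I rate should sit near the chosen significance level, while the Type II rate depends on the sample size and on how far the true mean lies from the hypothesized one.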

CHAPTER FOUR
Research Organization And Reporting

One of the most important requirements for graduating from a tertiary institution is an
undergraduate research project. An undergraduate student may be required to write a full
research project under the supervision of a lecturer. This student may be assigned to a lecturer
for supervision during the period of this research. The research is usually carried out within the
limits of the student’s field of study. The undergraduate research project writing process can be a very
intimidating exercise.

It is the duty of the supervisor to receive the research proposals and make the necessary assessments and
approvals. Most often an external supervisor is brought in to provide an independent assessment
of the undergraduate research project work.

Research Proposals: Writing an undergraduate research project proposal is usually very tough
for some students, primarily because most students are doing it for the first time.
However, a student who pays careful attention to the project writing
guidelines should have little or no problem in the undergraduate research project
writing process or in writing the proposal. The proposal for the topic should contain the
subject of the study, a brief but comprehensive description and justification of the work, aims
and milestones, software and hardware to be employed, assumptions to be made, the research
methodology and references.

Research Guideline: A typical undergraduate research project passes
through these phases: researching, presentation and print submission. These are to ensure that
the student is adequately prepared for the harsh environment outside school. While developing
the content of your research (which is usually divided into five chapters), you should follow
the guidelines below.

Undergraduate Research Project Format:

• Title page
• Approval page
• Dedication
• Acknowledgement
• Abstract
• Table of content
• List of tables
• List of figures
• List of symbols/nomenclature (where applicable)
• Main work (chapters 1-5)
• References
• Appendices (where applicable)

Title page: this is where you write the name of your institution in full; on no account
should you write an abbreviated version or a slang form of the name of your higher
institution. Write out the name of your higher institution in full, the title of the research,
the name of the author and your matriculation number, then the reason for the research. This is
where you will write “it is in partial fulfillment of the course requirement for the award
of the B.Sc degree (or B.Art, LLB, B.Eng; write what is applicable to your course of study).”
Then add the date.

------------------------------------------ (CASE STUDY OF ------------------)
[The research title/topic comes here]

BY

--------------------------
Matric. No
[Your name in full and matriculation number]

________________
------------------------------- UNIVERSITY OF ______
[Course of study and name of school]

IN PARTIAL FULFILMENT OF THE REQUIREMENT
FOR THE AWARD OF BACHELOR DEGREE IN
-----------------------------------------------
FACULTY OF ---------
[Reason for the research]

November 2023
[Date: month/year]

Use the diagram above as a guide.


Approval page: the name of the institution and department, then a statement signifying approval
for the work by the supervisor, head of department and external supervisor. Space is reserved for
signatures of all listed parties as well.

Dedication Page: this is the part of the research where the student can choose to dedicate
the research to God, a friend, marriage partner, parents, children or ancestors (this is common
practice among African students).
Acknowledgement: this is the part where the researcher acknowledges those who
contributed to the success of the research project.
Abstract: the abstract is a brief summary of the major points of the written work; it is often
written last and in the past tense. This summary is usually less than 100 words and is expected
to summarize the problem statement, the research methodology and recommendations. It
should be a single paragraph and should not exceed the word limit.

Table of content: the main heading(s), sub-heading(s) and page numbers are listed here.
This serves as a navigational map for the research work, making page identification and
reference very easy. The table of content should be edited at the end of the research so that every
part of the work is captured in it.

List of Tables/Figures/Symbols: the list is to aid the reader in locating tables/figures/symbols. It
should contain the tag numbers, tags which reflect the content, and the page numbers. It should
be well-numbered and unambiguous. In the main content, each figure/table should be well-labeled.

Chapter One: This is usually the introduction. It describes the background, scope and purpose
of the research. The rest of the report should be tied to the information supplied here. The researcher
should strive to present sufficient details regarding why the study was carried out. Laying out
your content in a hierarchical order, arranged from top to bottom, is preferred. You can
conclude chapter one of your research project with a linking
paragraph that sets out the objectives, constraints and limitations of the study.

That is, chapter one consists of:


➢ background to the study;
➢ statement of the problem;
➢ objectives of the study;
➢ hypotheses of the study;
➢ significance or justification of the study

Chapter Two: Chapter two of your research project should cover your literature and theoretical
review. Your research should build on the groundwork done by others, and chapter two is the
chapter of the research project where you present the work done by others,
summarizing the points made by past authors and the areas they left untouched. While
writing your literature review you should give due credit to the past authors and
researchers whose work you use as a reference. You should limit the use of direct
quotation; paraphrasing is generally preferred in a literature review. You may
comment on the work of past authors in the field, but care should be taken while writing a
literature review not to derail into criticism of the existing literature on your subject of study, as
this is unprofessional and unethical. Focus on the authors’ contributions, and
also point out some of the relevant facts and details left out by past
authors.

This chapter consists of:


➢ conceptual framework;
➢ theoretical framework; and
➢ review of related literature.

Chapter Three: this part of your work contains the research methodology, and the language
used here should be in the past tense. It sums up the research design and procedures; the area
and population of study, data sampling and data sources are detailed as well. The method used
should also be justified against the alternatives. The materials and equipment used should be
included.

This means that this chapter, the methodology, may consist of:
➢ population of the study;
➢ sources and type of data to be used;
➢ research design (sampling method used in selecting the objects of study, sample size
of the objects);
➢ method of data collection;
➢ methods of data analysis (descriptive and inferential methods – descriptive and
estimation procedures):
➢ pre-estimation tests;
➢ variables measurements;
➢ model specification; and
Of an expert (or other authority) need to be engaged, both for appropriateness, and for good
English.

Chapter Four: Data presentation and analysis. The results collated during the research are
presented here, which is why chapter four is usually the best place to use visual presentations such as
graphs, charts and tables. The results should be discussed here and compared with the results of past
authors. The effects and applications of the results should be detailed as well. When writing chapter
four, a research student should pay very close attention to the data being analyzed to avoid the simple
errors that usually occur during data analysis. Data should be laid out correctly, and since this
chapter involves handling data, it should be easier for a student with access to a
computer.
Chapter four consists of:
➢ data presentation,
➢ Data analysis and interpretation, and
➢ discussion of findings.

Chapter Five: This part houses the conclusions and recommendations. From the results of the
research, conclusions are drawn, and options for improvement are suggested for other
researchers with similar interests. Based on the overall findings, recommendations are
made for tackling the issues raised during the research.

This chapter, comprising the summary, conclusions and recommendations, consists of:


➢ summary of major findings;
➢ conclusions;
➢ policy implications;
➢ recommendations; and
➢ suggestions for further study.

References: this is a comprehensive list of all the books, journals, and other sources of
information cited in the research work; these could be either online or print materials. Some
students still use referenced materials to create heavily plagiarized research, and they face
disciplinary action when caught, although not all schools presently run plagiarism detection
tests on their students’ research. It is, however, better to avoid writing plagiarized research work:
all quoted and exact words from different sources should be properly referenced, both in-text and in
the reference list/bibliography. MLA, APA and Chicago are the commonest referencing
styles.

Appendices: Materials that are relevant to the work but were not included in the main text should be listed here.
This is to back up the facts of the research. It encapsulates extensive proofs, official data from
the case study, lists of parameters, etc.

Tips:

• After writing, time should be taken to proofread the content of the research and, if
possible, an independent editor should be consulted to help edit the research for
grammatical and spelling errors, as these may affect the quality of the research content.
• Ensure that the final submission is clear and uses the specified font and font size required
for the research.
• Above all, ensure that your research is up to standard; read through other works that have
been written in your field and use them as a guide to writing your own research.

CHAPTER FIVE
Citation And Reference

Referencing and citation is the process of acknowledging the sources you have used when
writing your work. Correct referencing and citation also allows the reader of your work to easily
identify those sources used in your work and to follow up on them if necessary. It is good
academic practice to acknowledge the contributions that others have made to your work.

Referencing and citation demonstrate:

• That your ideas, critical thinking and comments are informed by research on the topic,
with reference to established experts and authorities
• Your research skills and that you have made full use of the library resources
• Authenticity by using good academic practice
• Your ability to reference correctly and accurately. Remember: there are often marks
available in your assignments for good referencing and citation.

Information for your assignment may come from a wide range of sources, including books,
articles, websites, and newspapers.
Using a variety of sources demonstrates that you have spent time researching your topic. These
sources will also help you to reinforce your own thoughts and arguments.
Anything that is not common knowledge must be cited.
You should consult lists of recommended reading from your tutors and make full use of
resources available via the Library catalogue. Do NOT rely principally on internet search engines
to find sources.

The Three Main Methods of Citation


APA Citation Basics

When using APA format, follow the author-date method of in-text citation. This means that the
author's last name and the year of publication for the source should appear in the text, like, for
example, (Jones, 1998). One complete reference for each source should appear in the reference
list at the end of the paper.
If you are referring to an idea from another work but NOT directly quoting the material, or
making reference to an entire book, article or other work, you only have to make reference to the
author and year of publication and not the page number in your in-text reference.
On the other hand, if you are directly quoting or borrowing from another work, you should
include the page number at the end of the parenthetical citation. Use the abbreviation “p.” (for
one page) or “pp.” (for multiple pages) before listing the page number(s). Use an en dash for
page ranges. For example, you might write (Jones, 1998, p. 199) or (Jones, 1998, pp. 199–201).
This information is reiterated below.
Regardless of how they are referenced, all sources that are cited in the text must appear in the
reference list at the end of the paper.
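
The author-date rules above can be sketched as a small helper. This is a minimal illustration rather than a full APA implementation; the function name `apa_in_text` and its parameters are hypothetical, and it handles only a single author with an optional page or page range.

```python
def apa_in_text(author, year, first_page=None, last_page=None):
    """Build an APA-style parenthetical citation such as (Jones, 1998, p. 199)."""
    parts = [author, str(year)]
    if first_page is not None:
        if last_page is not None and last_page != first_page:
            # "pp." plus an en dash (U+2013) for a range of pages
            parts.append(f"pp. {first_page}\u2013{last_page}")
        else:
            # "p." for a single page
            parts.append(f"p. {first_page}")
    return "(" + ", ".join(parts) + ")"


print(apa_in_text("Jones", 1998))            # paraphrase: author and year only
print(apa_in_text("Jones", 1998, 199))       # direct quotation from one page
print(apa_in_text("Jones", 1998, 199, 201))  # direct quotation spanning pages
```

Leaving out the page arguments corresponds to citing an idea or an entire work, while supplying them corresponds to a direct quotation.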
In-text citation capitalization, quotes, and italics/underlining
• Always capitalize proper nouns, including author names and initials: D. Jones.
• If you refer to the title of a source within your paper, capitalize all words that are four
letters long or greater within the title of a source: Permanence and Change. Exceptions
apply to short words that are verbs, nouns, pronouns, adjectives, and adverbs: Writing
New Media, There Is Nothing Left to Lose.
(Note: in your References list, only the first word of a title will be capitalized: Writing new
media.)
• When capitalizing titles, capitalize both words in a hyphenated compound word: Natural-
Born Cyborgs.
• Capitalize the first word after a dash or colon: "Defining Film Rhetoric: The Case of
Hitchcock's Vertigo."
• If the title of the work is italicized in your reference list, italicize it and use title case
capitalization in the text: The Closing of the American Mind; The Wizard of Oz; Friends.
• If the title of the work is not italicized in your reference list, use double quotation marks
and title case capitalization (even though the reference list uses sentence case):
"Multimedia Narration: Constructing Possible Worlds;" "The One Where Chandler Can't
Cry."

SHORT QUOTATIONS
If you are directly quoting from a work, you will need to include the author, year of publication,
and page number for the reference (preceded by "p." for a single page and “pp.” for a span of
multiple pages, with the page numbers separated by an en dash).
You can introduce the quotation with a signal phrase that includes the author's last name followed by the
date of publication in parentheses.
According to Jones (1998), "students often had difficulty using APA style, especially when it was
their first time" (p. 199).
Jones (1998) found "students often had difficulty using APA style" (p. 199); what implications
does this have for teachers?

If you do not include the author’s name in the text of the sentence, place the author's last name,
the year of publication, and the page number in parentheses after the quotation.
She stated, "Students often had difficulty using APA style" (Jones, 1998, p. 199), but she did not
offer an explanation as to why.

LONG QUOTATIONS
Place direct quotations that are 40 words or longer in a free-standing block of typewritten lines
and omit quotation marks. Start the quotation on a new line, indented 1/2 inch from the left
margin, i.e., in the same place you would begin a new paragraph. Type the entire quotation on
the new margin, and indent the first line of any subsequent paragraph within the quotation 1/2
inch from the new margin. Maintain double-spacing throughout, but do not add an extra blank
line before or after it. The parenthetical citation should come after the closing punctuation
mark.

QUOTATIONS FROM SOURCES WITHOUT PAGES


Direct quotations from sources that do not contain pages should not reference a page number.
Instead, you may reference another logical identifying element: a paragraph, a chapter number,
a section number, a table number, or something else. Older works (like religious texts) can also
incorporate special location identifiers like verse numbers. In short: pick a substitute for page
numbers that makes sense for your source.
Jones (1998) found a variety of causes for student dissatisfaction with prevailing citation
practices (paras. 4–5).
A meta-analysis of available literature (Jones, 1998) revealed inconsistency across large-scale
studies of student learning (Table 3).

SUMMARY OR PARAPHRASE

If you are paraphrasing an idea from another work, you only have to make reference to the
author and year of publication in your in-text reference and may omit the page numbers. APA
guidelines, however, do encourage including a page range for a summary or paraphrase when it
will help the reader find the information in a longer work.
According to Jones (1998), APA style is a difficult citation format for first-time learners.
APA style is a difficult citation format for first-time learners (Jones, 1998, p. 199).

CHAPTER SIX

What Is the Correlation Coefficient?


The correlation coefficient is a statistical measure of the strength of a linear relationship between
two variables. Its values can range from -1 to 1. A correlation coefficient of -1 describes a
perfect negative, or inverse, correlation, with values in one series rising as those in the other
decline, and vice versa. A coefficient of 1 shows a perfect positive correlation, or a direct
relationship. A correlation coefficient of 0 means there is no linear relationship.
Correlation coefficients are used in science and in finance to assess the degree of association
between two variables, factors, or data sets. For example, since high oil prices are favorable for
crude producers, one might assume the correlation between oil prices and forward returns on oil
stocks is strongly positive. Calculating the correlation coefficient for these variables based on
market data reveals a moderate and inconsistent correlation over lengthy periods.

KEY TAKEAWAYS

• Correlation coefficients are used to assess the strength of associations between data
variables.
• The most common, called a Pearson correlation coefficient, measures the strength and the
direction of a linear relationship between two variables.
• Values always range from -1 for a perfectly inverse, or negative, relationship to 1 for a
perfectly positive correlation. Values at, or close to, zero indicate no linear relationship or
a very weak correlation.
• The coefficient values required to signal a meaningful association depend on the
application. The statistical significance of a correlation can be calculated from the
correlation coefficient and the number of data points in the sample, assuming a normal
population distribution.

Understanding the Correlation Coefficient


Different types of correlation coefficients are used to assess correlation based on the properties
of the compared data. By far the most common is the Pearson coefficient, or Pearson's r, which
measures the strength and direction of a linear relationship between two variables. The Pearson
coefficient cannot assess nonlinear associations between variables and cannot differentiate
between dependent and independent variables.

The Pearson coefficient uses a mathematical statistics formula to measure how closely the data
points combining the two variables (with the values of one data series plotted on the x-axis and
the corresponding values of the other series on the y-axis) approximate the line of best fit. The
line of best fit can be determined through regression analysis.

The further the coefficient is from zero, whether it is positive or negative, the better the fit and
the greater the correlation. The values of -1 (for a negative correlation) and 1 (for a positive one)
describe perfect fits in which all data points align in a straight line, indicating that the variables
are perfectly correlated. In other words, the relationship is so predictable that the value of one
variable can be determined from the matched value of the other. The closer the correlation
coefficient is to zero the weaker the correlation, until at zero no linear relationship exists at all.

Assessments of correlation strength based on the correlation coefficient value vary by
application. In physics and chemistry, a correlation coefficient should be lower than -0.9 or
higher than 0.9 for the correlation to be considered meaningful, while in the social sciences a
coefficient below -0.5 or above 0.5 may already be treated as meaningful.
For correlation coefficients derived from sampling, the determination of statistical
significance depends on the p-value, which is calculated from the data sample's size as well as
the value of the coefficient.
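
As a rough illustration of how the p-value follows from the coefficient and the sample size, the sketch below converts r into a t statistic with n - 2 degrees of freedom and compares the result with R's built-in cor.test(). The data vectors are invented for illustration, and the calculation assumes the usual conditions for this test (independent pairs, roughly normal data).

```r
# Hypothetical paired observations (illustrative values only)
x <- c(2.1, 3.4, 1.9, 5.6, 4.3, 6.8, 5.1, 7.7)
y <- c(1.8, 2.9, 2.5, 4.9, 4.4, 6.1, 4.0, 7.2)

n <- length(x)
r <- cor(x, y)  # Pearson correlation is the default method

# t statistic for H0: the true correlation is zero (n - 2 degrees of freedom)
t_stat  <- r * sqrt((n - 2) / (1 - r^2))
p_value <- 2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)

# Built-in test for comparison
builtin <- cor.test(x, y, method = "pearson")

print(c(r = r, t = t_stat, p = p_value))
print(builtin$p.value)
```

The same r gives a smaller p-value as n grows, which is why both quantities are needed to judge significance.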

Correlation Coefficient Equation


To calculate the Pearson correlation, start by determining each variable's standard deviation as
well as the covariance between them. The correlation coefficient is the covariance divided by the
product of the two variables' standard deviations:

$$\rho_{xy} = \frac{\operatorname{Cov}(x, y)}{\sigma_x \sigma_y}$$

where:
$\rho_{xy}$ = Pearson product-moment correlation coefficient
$\operatorname{Cov}(x, y)$ = covariance of variables x and y
$\sigma_x$ = standard deviation of x
$\sigma_y$ = standard deviation of y

Standard deviation is a measure of the dispersion of data from its average. Covariance shows
whether the two variables tend to move in the same direction, while the correlation coefficient
measures the strength of that relationship on a normalized scale, from -1 to 1.
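The formula above can be elaborated as

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$

where:
r = Correlation coefficient
n = Number of observations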
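
To make the calculation concrete, here is a minimal sketch in R that computes the coefficient in two ways: as covariance divided by the product of the standard deviations, and with the expanded sum-based form. Both are compared with the built-in cor(). The data values are hypothetical.

```r
# Hypothetical paired observations (illustrative values only)
x <- c(10, 12, 15, 17, 20, 24)
y <- c(31, 35, 34, 41, 46, 50)
n <- length(x)

# Form 1: covariance divided by the product of the standard deviations
r_cov <- cov(x, y) / (sd(x) * sd(y))

# Form 2: expanded form written with sums over the paired observations
r_sum <- (n * sum(x * y) - sum(x) * sum(y)) /
  sqrt((n * sum(x^2) - sum(x)^2) * (n * sum(y^2) - sum(y)^2))

# Built-in Pearson correlation for comparison
r_builtin <- cor(x, y)

print(c(r_cov = r_cov, r_sum = r_sum, r_builtin = r_builtin))
```

All three values agree because the n - 1 divisors used by cov() and sd() cancel in the ratio.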

Correlation Statistics and Investing


The correlation coefficient is particularly helpful in assessing and managing investment risks.
For example, modern portfolio theory suggests diversification can reduce the volatility of a
portfolio's returns, curbing risk. The correlation coefficient between historical returns can
indicate whether adding an investment to a portfolio will improve its diversification.
Correlation calculations are also a staple of factor investing, a strategy for constructing a
portfolio based on factors associated with excess returns. Meanwhile, quantitative traders use
historical correlations and correlation coefficients to anticipate near-term changes in securities
prices.

Limitations of the Pearson Correlation Coefficient
Correlation does not imply causation, as the saying goes, and the Pearson coefficient cannot
determine whether one of the correlated variables is dependent on the other.
Nor does the correlation coefficient show what proportion of the variation in the dependent
variable is attributable to the independent variable. That's shown by the coefficient of
determination, also known as R-squared, which for a simple linear regression is the correlation
coefficient squared.
The correlation coefficient does not describe the slope of the line of best fit; the slope can be
determined with the least squares method in regression analysis.
The Pearson correlation coefficient can't be used to assess nonlinear associations or those arising
from sampled data not subject to a normal distribution. Those relationships can be analyzed
using rank-based nonparametric methods, such as Spearman's correlation coefficient or the
Kendall rank correlation coefficient, or with a polychoric correlation coefficient for ordinal data.
The Pearson coefficient can also be distorted by outliers, data points that lie far from the rest of
the scatterplot.
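
A small sketch may help show these limitations in practice. It uses made-up data in which y rises with x along a curve, so the relationship is perfectly monotonic but not linear. It compares Pearson's r with the rank-based coefficients, then checks that R-squared from a simple linear regression equals r squared.

```r
# Hypothetical data: y always increases with x, but along a curve
x <- 1:10
y <- exp(x / 2)

pearson  <- cor(x, y, method = "pearson")   # understates the association
spearman <- cor(x, y, method = "spearman")  # rank-based
kendall  <- cor(x, y, method = "kendall")   # rank-based

# Coefficient of determination from a simple linear regression of y on x
fit       <- lm(y ~ x)
r_squared <- summary(fit)$r.squared

print(round(c(pearson = pearson, spearman = spearman, kendall = kendall), 3))
print(c(r_squared = r_squared, pearson_squared = pearson^2))
```

The rank-based coefficients equal 1 here because the ordering of y matches the ordering of x exactly, while Pearson's r falls short of 1 because the points do not lie on a straight line.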

Finding Correlation Coefficients in Excel


The simplest way to calculate correlation in Excel is to input two data series in adjacent columns
and use the built-in correlation function, CORREL.
If you want to create a correlation matrix across a range of data sets, Excel has a Data Analysis
plugin on the Data tab, under Analyze.
Select the table of returns. In this case, our columns are titled, so we want to check the box
"Labels in first row," so Excel knows to treat these as titles. Then you can choose to output on
the same sheet or on a new sheet.
Hitting enter will produce the correlation matrix. You can add some text and conditional
formatting to clean up the result.
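
For readers working in R rather than Excel, the same two tasks can be sketched as follows. The return series and the column names StockA, StockB, and BondC are hypothetical placeholders, not data from this tutorial.

```r
# Hypothetical periodic returns for three assets (illustrative values only)
returns <- data.frame(
  StockA = c(0.012, -0.004, 0.021, 0.007, -0.015, 0.009),
  StockB = c(0.010, -0.002, 0.018, 0.004, -0.011, 0.012),
  BondC  = c(-0.003, 0.005, -0.006, 0.001, 0.008, -0.002)
)

# Correlation between two series held in adjacent columns
pair_r <- cor(returns$StockA, returns$StockB)

# Correlation matrix across all columns (column names act as labels)
corr_matrix <- cor(returns)

print(pair_r)
print(round(corr_matrix, 2))
```

The matrix is symmetric with ones on the diagonal, since every series is perfectly correlated with itself.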

What Is a Correlation Coefficient?


The correlation coefficient describes how one variable moves in relation to another. A positive
correlation indicates that the two move in the same direction, with a value of 1 denoting a perfect
positive correlation. A value of -1 shows a perfect negative, or inverse, correlation, while zero
means no linear correlation exists.
How Do You Calculate the Correlation Coefficient?
The correlation coefficient is calculated by determining the covariance of the variables and
dividing that number by the product of those variables’ standard deviations.

How Is the Correlation Coefficient Used in Investing?


Correlation coefficients play a key role in portfolio risk assessments and quantitative trading
strategies. For example, some portfolio managers will monitor the correlation coefficients of
their holdings to limit a portfolio's volatility and risk.
CHAPTER SEVEN

What is a Chi-Square Test? Formula, Examples & Application
Statistical analysis is a key tool for making sense of data and drawing meaningful conclusions.
The chi-square test is a statistical method commonly used in data analysis to determine if there is
a significant association between two categorical variables. By comparing observed frequencies
to expected frequencies, the chi-square test can determine if there is a significant relationship
between the variables. So let’s dive into the article to understand all about the chi-square test,
what it is, how it works, and how we can implement it in R.

Learning Objectives

 Understand what the chi-square test is and how it works


 Be able to calculate the chi-square value using the chi-square formula
 Learn about the different types of Chi-Square tests and where and when you should apply them
 Learn to implement a chi-square test in R

What is the Chi-Square Test?


The chi-square test is a statistical test used to determine if there is a significant association
between two categorical variables. It is a non-parametric test, meaning it does not make
assumptions about the underlying distribution of the data.
It compares the observed frequencies of the categories in a contingency table with the expected
frequencies that would occur under the assumption of independence between the variables. The
test calculates a chi-square statistic, which measures the discrepancy between the observed and
expected frequencies.

When to Use the Chi-Square Test?


Let’s start with a case study. I want you to think of your favorite restaurant right now. Let’s say
you can predict a certain number of people arriving for lunch five days a week. At the end of the
week, you observe that the expected footfall was different from the actual footfall.
Sounds like a prime statistics problem? That’s the idea!
So, how will you check the statistical significance between the observed and the expected
footfall values? Remember this is a categorical variable – ‘Days of the week’ – with 5 categories
[Monday, Tuesday, Wednesday, Thursday, Friday].
One of the best ways to deal with this is by using the Chi-Square test.
We can always opt for z-tests, t-tests, or ANOVA when we’re dealing with continuous variables.
But the situation becomes tricky when working with categorical features (as most data scientists
will attest to!). I’ve found the chi-square test to be quite helpful in my own projects.
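
The footfall comparison can be sketched with the chi-square statistic, which sums (observed - expected)² / expected over the categories. The helper chi_square_stat() and the daily counts below are hypothetical and serve only to illustrate the idea.

```r
# Hypothetical helper: chi-square statistic from observed and expected counts
chi_square_stat <- function(observed, expected) {
  stopifnot(length(observed) == length(expected), all(expected > 0))
  sum((observed - expected)^2 / expected)
}

days     <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
expected <- c(50, 50, 50, 50, 50)  # hypothetical predicted lunch footfall
observed <- c(42, 55, 48, 61, 44)  # hypothetical actual lunch footfall
names(observed) <- days

stat <- chi_square_stat(observed, expected)
print(stat)
```

A value of zero would mean the observed footfall matched the prediction exactly; larger values indicate a bigger overall discrepancy.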

Chi-Square Test Formula

$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$

where,

$\chi^2$ = Chi-Square value
$O_i$ = Observed frequency
$E_i$ = Expected frequency

What are Categorical Variables?


I’m sure you’ve encountered categorical variables before, even if you might not have intuitively
recognized them. They can be tricky to deal with in the data science world, so let’s first define
them.
Categorical variables are variables whose values fall into a finite number of categories. These
categories are generally names or labels. These variables are also called qualitative variables as
they depict the quality or characteristics of the thing being described.
For example, the categorical variable “Movie Genre” in a list of movies could take the
categories “Action”, “Fantasy”, “Comedy”, “Romance”, etc.
There are broadly two types of categorical variables:

1. Nominal Variable: A nominal variable has two or more categories with no natural ordering.
For example, Marital Status (Single, Married, Divorced), Gender (Male, Female,
Transgender), etc.
2. Ordinal Variable: A variable for which the categories can be placed in an order. For
example, Customer Satisfaction (Excellent, Very Good, Good, Average, Bad), and so on.

When the data we want to analyze contains this type of variable, we turn to the chi-square test,
denoted by χ², to test our hypothesis.

Why Do We Use It?


Let’s learn the use of chi-square with an intuitive example.

Problem Statement
A research scholar is interested in the relationship between the placement of students in the
statistics department of a reputed University and their C.G.P.A (their final assessment score).
He obtains the placement records of the past five years from the placement cell database (at
random). He records how many students who got placed fell into each of the following C.G.P.A.
categories – 9-10, 8-9, 7-8, 6-7, and below 6.
Suppose there is no relationship between the placement rate and the C.G.P.A. In that case, the
placed students should be equally spread across the different C.G.P.A. categories (i.e., there
should be similar numbers of placed students in each category).
However, if students having C.G.P.A more than 8 are more likely to get placed, then there would
be a large number of placed students in the higher C.G.P.A. categories as compared to the lower
C.G.P.A. categories. In this case, the data collected would make up the observed frequencies.
So the question is, are these frequencies being observed by chance, or do they follow some
pattern?

Chi Square Test Solution


Here enters the chi-square test! The chi-square test helps us answer the above question by
comparing the observed frequencies to the frequencies that we might expect to obtain purely by
chance.

The chi-square test is used in hypothesis testing to test a hypothesis about how observations
(frequencies) are distributed across different categories.
We are almost at the implementation stage, but there’s one more thing we need to learn before
we get there. Descriptive statistics and the chi-square test complement each other: descriptive
statistics provide a snapshot of the data, while a chi-square test can reveal relationships between
categorical variables.
The normal distribution is a fundamental concept in statistics and is often used to model
variables in experiments. A chi-square goodness of fit test can be used to check whether a set of
observations, grouped into intervals, is consistent with a normal distribution.
Properties of Chi-Square Test
The chi-square test possesses several important properties that make it a valuable statistical tool:
Aspect | Description
Non-parametric Test | The chi-square test is non-parametric, making no assumptions about the data’s underlying distribution. Applicable to categorical data.
Test for Independence | Examines association between categorical variables, determining significance of relationship or dependency, not strength or direction.
Goodness of Fit Test | Assesses how well observed data fit an expected distribution, comparing observed frequencies to expected frequencies.
Chi-Square Statistic | Measures discrepancy between observed and expected frequencies in a contingency table, indicating association or goodness of fit.
Degrees of Freedom | Depend on the number of categories in variables. Determine critical values and influence test result interpretation.
Null and Alternative Hypotheses | Null hypothesis assumes no association or difference, alternative hypothesis suggests presence of association or difference.
Test Statistic and P-value | Produces test statistic and corresponding p-value. Compare test statistic to critical value, p-value indicates probability under null hypothesis.
Interpretation | Null hypothesis rejected if test statistic exceeds critical value or p-value is less than chosen significance level. Indicates significant association or deviation.
Assumptions of the Chi-Square Test
The chi-square test uses the sampling distribution to calculate the likelihood of obtaining the
observed results by chance and to determine whether the observed and expected frequencies are
significantly different. Just like any other statistical test, the chi-square test comes with a few
assumptions of its own:

• A large sample size is needed for a reliable outcome, because the chi-square distribution only
approximates the sampling distribution of the test statistic and the approximation improves as
the counts grow.
• The χ² test assumes that the data for the study is obtained through random selection, i.e., the
observations are randomly picked from the population.
• The categories are mutually exclusive, i.e., each subject fits into only one category. In the
restaurant example above, a person who lunched on Monday can’t be counted in the Tuesday
category.
• The data should be in the form of frequencies or counts of a particular category and not in
percentages.
• The data should not consist of paired samples or groups; in other words, the observations should
be independent of each other.
• When more than 20% of the expected counts (frequencies) have a value of less than 5, the
chi-square approximation is unreliable and the test should not be used. To tackle this problem,
either combine categories (only where that is meaningful) or obtain more data.

The total number of observations is a crucial component in determining the validity of the
chi-square test. The larger the number of observations, the better the chi-square approximation
will be. The Yates correction is a continuity adjustment applied to 2 x 2 tables when the
expected counts (frequencies) are small. It makes the test more conservative, but it does not
remove the need for adequate expected counts.
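
The expected-count rule and the Yates correction can be checked in a few lines of R. The 2 x 2 table below is hypothetical and deliberately small. In R's chisq.test(), the correct argument controls the continuity correction, which R applies only to 2 x 2 tables.

```r
# Hypothetical small 2 x 2 table of counts (illustrative values only)
tab <- matrix(c(12, 3, 9, 4), nrow = 2,
              dimnames = list(Outcome = c("Yes", "No"), Group = c("A", "B")))

# Expected counts under independence: row total * column total / grand total
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)

# Share of cells with an expected count below 5 (rule of thumb: at most 20%)
share_small <- mean(expected < 5)

# R warns that the approximation may be incorrect for tables like this one;
# the warning is suppressed here only to keep the output short
corrected   <- suppressWarnings(chisq.test(tab, correct = TRUE))
uncorrected <- suppressWarnings(chisq.test(tab, correct = FALSE))

print(round(expected, 2))
print(share_small)
print(c(with_yates = unname(corrected$statistic),
        without_yates = unname(uncorrected$statistic)))
```

Here half of the cells have expected counts below 5, so the rule of thumb is violated and combining categories or collecting more data would be the safer route.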

Types of Chi-Square Tests


In this section, we will see the different types of chi-square tests and study them through manual
calculations and their implementation in R.
There are two main types of chi-square tests commonly used in statistics:

1. Chi-Square Goodness of Fit Test: This test is used to assess whether observed categorical
data follows an expected distribution. It compares the observed frequencies with the
expected frequencies specified by a hypothesized distribution.
2. Chi-Square Test of Independence (also called Pearson’s chi-square test of association):
This test is used to examine if there is a significant association between two categorical
variables in a sample from a population. It compares the observed frequencies in a
contingency table with the expected frequencies assuming independence between the
variables.

Chi-Square Goodness of Fit Test


This is a non-parametric test. We typically use it to find how the observed value of a given event
is significantly different from the expected value. In this case, we have categorical data for one
independent variable, and we want to check whether the data distribution is similar or different
from the expected distribution.
Let’s consider the above example where the research scholar was interested in the relationship
between the placement of students in a reputed university’s statistics department and their
C.G.P.A.
In this case, the independent variable is C.G.P.A with the categories 9-10, 8-9, 7-8, 6-7, and
below 6.
The statistical question here is: whether or not the observed counts (frequencies) of placed
students are equally distributed for different C.G.P.A categories (so that our theoretical
frequency distribution contains the same number of students in each of the C.G.P.A categories).
We will arrange this data by using the contingency table, which will consist of both the observed
and expected values as below:

C.G.P.A | 10-9 | 9-8 | 8-7 | 7-6 | Below 6 | Total
Observed Frequency of Placed students | 30 | 35 | 20 | 10 | 5 | 100
Expected Frequency of Placed students | 20 | 20 | 20 | 20 | 20 | 100
After constructing the contingency table, the next task is to compute the value of the chi-square
statistic. The formula for chi-square is given as:

$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$

where,

$\chi^2$ = Chi-Square value
$O_i$ = Observed frequency
$E_i$ = Expected frequency

Chi-Square Test Example


Let us look at the step-by-step approach to calculate the chi-square value.
Step 1: Subtract each expected frequency from the related observed frequency
For example, for the C.G.P.A category 10-9, it will be “30-20 = 10”. Apply similar operations
for all the categories.
Step 2: Square each value obtained in step 1, i.e. (O-E)²
For example: for the C.G.P.A category 10-9, the value obtained in step 1 is 10. It becomes 100
on squaring. Apply similar operations for all the categories.
Step 3: Divide all the values obtained in step 2 by the related expected frequencies, i.e. (O-E)²/E
For example: for the C.G.P.A category 10-9, the value obtained in step 2 is 100. On dividing it
by the related expected frequency, which is 20, it becomes 5. Apply similar operations for all the
categories.
Step 4: Add all the values obtained in step 3 to get the chi-square test statistic value
In this case, the chi-square statistic value comes out to be 5 + 11.25 + 0 + 5 + 11.25 = 32.5.
Step 5: Once we have calculated the chi-square value, we will compare it with the critical chi-
square statistic value
The number of degrees of freedom represents the number of values in the data set that are free
to vary. It determines which chi-square distribution the statistic is compared against, and
therefore the critical value at a given significance level. We can find the critical chi-square value
in a chi-square table against the number of degrees of freedom (number of categories – 1) and
the significance level.

In this case, the degrees of freedom are 5-1 = 4. So, the critical value at a 5% significance level is
9.49. The significance level in a chi-square test sets the threshold for rejecting the null
hypothesis in favor of the alternative hypothesis. A significance level of 0.05 means accepting a
5% chance of making a Type I error, that is, of rejecting the null hypothesis when it is actually
true.
Our obtained value of 32.5 is much larger than the critical value of 9.49. Therefore, we can say
that the observed frequencies from the sample data are significantly different from the expected
frequencies. In other words, C.G.P.A is related to the number of placements that occur in the
department of statistics.
Let’s further solidify our understanding by performing the Chi-Square test in R.
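
Before moving to a new data set, the manual steps for the C.G.P.A example can be reproduced in R as a quick check. This is a minimal sketch that uses the observed counts from the table above and assumes equal expected frequencies, as in the worked example.

```r
# Observed counts of placed students per C.G.P.A category (from the table)
observed <- c("10-9" = 30, "9-8" = 35, "8-7" = 20, "7-6" = 10, "Below 6" = 5)

# Equal expected frequency in every category: 100 / 5 = 20
expected <- rep(sum(observed) / length(observed), length(observed))

# Steps 1 to 4: (O - E)^2 / E for each category, then the sum
contributions <- (observed - expected)^2 / expected
chi_sq <- sum(contributions)

# Step 5: critical value at the 5% level with (categories - 1) degrees of freedom
df       <- length(observed) - 1
critical <- qchisq(0.95, df = df)
p_value  <- pchisq(chi_sq, df = df, lower.tail = FALSE)

# Built-in test; with no p argument it assumes equal probabilities
builtin <- chisq.test(observed)

print(contributions)
print(c(chi_sq = chi_sq, critical = critical, p_value = p_value))
```

qchisq() plays the role of the printed chi-square table, and pchisq() gives the p-value directly, so either comparison leads to the same decision.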

The Chi-Square Goodness of Fit Test in R


Let’s implement the chi-square goodness of fit test in R. Time to fire up RStudio!
Problem Statement
Let’s understand the problem statement before we dive into R.
An organization claims that the experience of the employees of different departments is
distributed in the following categories:

 11 – 20 Years = 20%
 21 – 40 Years = 17%
 6 – 10 Years = 41% and
 Up to 5 Years = 22%
A random sample of 1470 employees is collected. Does this random sample provide evidence
against the organization’s claim?
You can download the data here.
Setting up the Hypothesis

 Null hypothesis: The true proportions of the experience of the employees of different
departments are distributed in the following categories: 11 – 20 Years = 20%, 21 – 40 Years =
17%, 6 – 10 Years = 41%, and up to 5 Years = 22%
 Alternative hypothesis: The distribution of experience of the employees of different departments
differs from what the organization states
Let’s begin!

Step 1: First, import the data
Step 2: Validate it for correctness in R
# Step 1 - Importing Data
# Import the csv data (file.choose() opens a file picker)
data <- read.csv(file.choose())

# Step 2 - Validate data for correctness
# Count of rows and columns
dim(data)

# View top 10 rows of the dataset
head(data, 10)
Output:
# Count of rows and columns
[1] 1470    2
# View top 10 rows of the dataset
   age.intervals Experience.intervals
1        41 - 50         6 - 10 Years
2        41 - 50         6 - 10 Years
3        31 - 40         6 - 10 Years
4        31 - 40         6 - 10 Years
5        18 - 30         6 - 10 Years
6        31 - 40         6 - 10 Years
7        51 - 60        11 - 20 Years
8        18 - 30         Upto 5 Years
9        31 - 40         6 - 10 Years
10       31 - 40        11 - 20 Years

Step 3: Create a proportion table for the observed frequencies:
# Step 3 - Calculate the proportion of experience of employees
# Proportion table for observed frequencies
prop.table(table(data$Experience.intervals))
Output:
11 - 20 Years 21 - 40 Years  6 - 10 Years  Upto 5 Years
    0.2312925     0.1408163     0.4129252     0.2149660

Step 4: Calculate the chi-square value
# Step 4 - Calculate the chi-square value
# The probabilities in p must follow the order of the categories in the table
chisq.test(x = table(data$Experience.intervals),
           p = c(0.2, 0.17, 0.41, 0.22))
Output:
Chi-square test for given probabilities

data: table(data$Experience.intervals)
X-squared = 14.762, df = 3, p-value = 0.002032
The p-value here is less than 0.05. Therefore, we will reject our null hypothesis. Hence, the
distribution of experience of the employees of different departments differs from what the
organization states.
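
If the CSV file is not at hand, the result can still be reproduced from the category counts. The sketch below assumes counts of 340, 207, 607, and 316, which are recovered by multiplying the printed proportions by the sample size of 1470. The variable names are illustrative.

```r
# Category counts recovered from the printed proportions (proportion * 1470)
observed <- c("11 - 20 Years" = 340, "21 - 40 Years" = 207,
              "6 - 10 Years" = 607, "Upto 5 Years" = 316)

# Proportions claimed by the organization, in the same category order
claimed <- c(0.20, 0.17, 0.41, 0.22)

result <- chisq.test(x = observed, p = claimed)

print(round(result$expected, 1))  # expected counts under the claim
print(result)
```

Printing the expected counts shows where the sample departs most from the claim: the 11 - 20 Years and 21 - 40 Years categories contribute almost all of the statistic.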

Chi-Square Test of Association
The second type of chi-square test is the Pearson’s chi-square test of association. This test is used
when we have categorical data for two independent variables and want to see if there is any
relationship between the variables.
Let’s take another example to understand this. A teacher wants to know whether the outcome of
a mathematics test is related to the gender of the person taking the test. Or in other words, she
wants to know if males show a different pattern of pass/fail rates than females.
So, here are two categorical variables: Gender (Male and Female) and mathematics test outcome
(Pass or Fail). Let us now look at the contingency table:
Status | Boys | Girls
Pass | 17 | 20
Fail | 8 | 5
Looking at the above contingency table, we can see that girls have a comparatively higher pass
rate than boys. However, to test whether or not this observed difference is significant, we will
carry out the chi-square test.
The steps to calculate the chi-square value are as follows:

Step 1: Calculate the row and column total of the above contingency table
Status | Boys | Girls | Total
Pass | 17 | 20 | 37
Fail | 8 | 5 | 13
Total | 25 | 25 | 50

Step 2: Calculate the expected frequency for each individual cell by multiplying the row total by
the column total and dividing by the grand total
Expected Frequency = (Row Total x Column Total)/Grand Total
For the first cell, the expected frequency would be (37*25)/50 = 18.5; for each cell in the Fail
row it is (13*25)/50 = 6.5. Now, write them in brackets next to the observed frequencies:
Status | Boys | Girls | Total
Pass | 17 (18.5) | 20 (18.5) | 37
Fail | 8 (6.5) | 5 (6.5) | 13
Total | 25 | 25 | 50

Step 3: Calculate the value of chi-square using the formula

$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$

Calculate the term (O-E)²/E for each cell. For example, for the first cell, ((17-18.5)^2)/18.5
= 0.1216.

Step 4: Then, add all the values obtained for each cell. In this case, the values are
0.1216+0.1216+0.3461+0.3461 = 0.9354

Step 5: Calculate the degrees of freedom, i.e. (Number of rows-1)*(Number of columns-1) = 1*1
= 1
The next task is to compare the calculated value with the critical chi-square value for 1 degree of
freedom at the 5% significance level, which is 3.84.
The Chi-Square calculated value is 0.9354, which is less than the critical value of 3.84. So, in
this case, we fail to reject the null hypothesis. This means the data do not show a significant
association between the two variables, i.e., the difference between the boys’ and girls’ pass/fail
rates on their mathematics tests is not statistically significant.
Let’s further solidify our understanding by performing the chi-square test in R.
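
As a quick check of the manual calculation, the boys and girls table can be entered directly as a matrix. This is a minimal sketch: correct = FALSE switches off the continuity correction that R applies to 2 x 2 tables by default, so that the statistic matches the hand calculation. The unrounded value is about 0.9356; the figure of 0.9354 above comes from adding rounded terms.

```r
# Observed pass/fail counts from the contingency table
marks <- matrix(c(17, 8, 20, 5), nrow = 2,
                dimnames = list(Status = c("Pass", "Fail"),
                                Gender = c("Boys", "Girls")))

# Step 2: expected counts = row total * column total / grand total
expected <- outer(rowSums(marks), colSums(marks)) / sum(marks)

# Steps 3 and 4: sum of (O - E)^2 / E over all cells
chi_sq <- sum((marks - expected)^2 / expected)

# Step 5: degrees of freedom and the 5% critical value
df       <- (nrow(marks) - 1) * (ncol(marks) - 1)
critical <- qchisq(0.95, df = df)

# Built-in test without the continuity correction
builtin <- chisq.test(marks, correct = FALSE)

print(expected)
print(c(chi_sq = chi_sq, critical = critical, p_value = builtin$p.value))
```

Calling chisq.test(marks) with the default settings would report a smaller statistic because of the correction, but the conclusion (fail to reject the null hypothesis) is the same.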

Chi-Square Test for Independence in R


Problem Statement
A Human Resources department of an organization wants to check whether the employees’ age
and experience depend on each other. For this purpose, a random sample of 1470 employees is
collected with their age and experience. You can download the data here.
Setting up the hypothesis:

 Null hypothesis: Age and Experience are two independent variables


 Alternative hypothesis: Age and Experience are two dependent variables
Let’s begin!
Step 1: First, import the data
Step 2: Validate it for correctness in R:
# Step 1 - Importing Data
# Import the csv data (file.choose() opens a file picker)
data <- read.csv(file.choose())

# Step 2 - Validate data for correctness
# Count of rows and columns
dim(data)

# View top 10 rows of the dataset
head(data, 10)
Output:
# Count of rows and columns
[1] 1470    2
# View top 10 rows of the dataset
   age.intervals Experience.intervals
1        41 - 50         6 - 10 Years
2        41 - 50         6 - 10 Years
3        31 - 40         6 - 10 Years
4        31 - 40         6 - 10 Years
5        18 - 30         6 - 10 Years
6        31 - 40         6 - 10 Years
7        51 - 60        11 - 20 Years
8        18 - 30         Upto 5 Years
9        31 - 40         6 - 10 Years
10       31 - 40        11 - 20 Years

Step 3: Construct a contingency table and calculate the chi-square value:
# Step 3 - Make a table and calculate the chi-square value
ct <- table(data$age.intervals, data$Experience.intervals)
# Print the contingency table (R is case-sensitive, so the name must be ct, not Ct)
ct
chisq.test(ct)
Output:
> ct <- table(data$age.intervals, data$Experience.intervals)
> ct

          11 - 20 Years 21 - 40 Years 6 - 10 Years Upto 5 Years
  18 - 30            22             0          172          192
  31 - 40           190            20          308          101
  41 - 50            85           112          110           15
  51 - 60            43            75           17            8
> chisq.test(ct)

Pearson's Chi-squared test

data: ct
X-squared = 679.97, df = 9, p-value < 2.2e-16

The p-value here is less than 0.05. Therefore, we will reject our null hypothesis. We can
conclude that age and experience are dependent variables. The test itself does not give the
direction of the relationship, but the contingency table shows that the higher experience bands
are concentrated in the older age groups.
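
The test can also be rerun without the CSV file by typing in the contingency table printed in the output. The sketch below does that and prints the Pearson residuals, (observed - expected) / sqrt(expected), as one informal way to see which cells contribute most to the statistic. The dimension names age and experience are illustrative.

```r
# Contingency table copied from the printed output (rows: age, columns: experience)
ct <- matrix(c( 22,   0, 172, 192,
               190,  20, 308, 101,
                85, 112, 110,  15,
                43,  75,  17,   8),
             nrow = 4, byrow = TRUE,
             dimnames = list(
               age = c("18 - 30", "31 - 40", "41 - 50", "51 - 60"),
               experience = c("11 - 20 Years", "21 - 40 Years",
                              "6 - 10 Years", "Upto 5 Years")))

result <- chisq.test(ct)

print(result)
print(round(result$expected, 1))   # expected counts under independence
print(round(result$residuals, 1))  # Pearson residuals per cell
```

Large positive residuals mark cells with more employees than independence would predict, and large negative residuals mark cells with fewer.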

Limitations of Chi Square Test
While the chi-square test is a useful statistical tool, it does have some limitations that should be
considered:

1. Independence Assumption: The test assumes that the observations are independent of one
another. Violations, such as repeated measurements on the same subjects, can undermine
its validity.
2. Sample Size: Reliability is limited with small samples or low expected cell frequencies.
An expected frequency of at least 5 per cell is recommended.
3. Sensitivity to Sample Composition: Imbalanced frequencies or empty cells can bias or
inaccurately influence test results.
4. Limited to Categorical Variables: Designed for categorical variables; not applicable to
continuous variables unless they are first grouped into categories. It can be applied to
ordinal variables, but it ignores the ordering of their categories.
5. Lack of Directionality or Magnitude: Determines association presence but doesn’t
indicate strength, direction, or magnitude.
6. Type of Association: Detects associations but doesn’t differentiate types or establish
cause-and-effect relationships.
7. Large Sample Bias: In large samples, small deviations may lead to statistically significant
results without practical implications.
8. Multiple Comparisons: Multiple tests on the same data increase the chance of finding
significant results by chance alone. Consider adjustments like Bonferroni correction.
9. Interpretation Considerations: Interpret test results cautiously, considering the context and
research question. Significance doesn’t guarantee meaningful or practically important
associations.
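
The Bonferroni adjustment mentioned in point 8 can be sketched in R with p.adjust(), which multiplies each p-value by the number of tests and caps the result at 1. The four p-values below are hypothetical and stand for four chi-square tests run on the same data.

```r
# Hypothetical p-values from four chi-square tests on the same data set
p_values <- c(test_a = 0.012, test_b = 0.030, test_c = 0.200, test_d = 0.004)
alpha    <- 0.05

# Bonferroni adjustment: each p-value is multiplied by the number of tests
adjusted    <- p.adjust(p_values, method = "bonferroni")
significant <- adjusted < alpha

print(adjusted)
print(significant)
```

In this made-up case, test_b would be significant on its own at the 5% level but no longer is after the adjustment.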

Conclusion
In this article, we learned how to test for significant differences in categorical data with the help
of chi-square tests. We enhanced our knowledge of the use of chi-square, the assumptions
involved in carrying out the test, and how to conduct different types of chi-square tests both
manually and in R.

Key Takeaways

• The Chi-square test is a hypothesis testing method used to compare observed data with expected
data.
• The chi-square value, calculated using the chi-square formula, tells us how far the observed
frequencies deviate from the expected frequencies.
• There are two types of Chi-Square Tests: the Chi-Square Goodness of Fit Test and the Chi-
Square Test of Independence/Association.
Questions
Q1. What is chi-square test and how is it calculated?

A. First find the difference between the observed (O) and expected (E) values in each category.
Take the square of that number and divide it by the expected value. Finally, add all of these
calculated values from the various categories to get the chi-square.
Q2. What is a chi-square test used for?
A. A chi-square test is used to judge how likely the observed frequencies would be if the null
hypothesis were true. It is often used to check whether a set of observations follows a
hypothesized distribution, such as a normal distribution, and to test for a relationship between
two categorical variables.
Q3. What does it mean when the chi-square value is large?
A. If the chi-square value is larger than the critical value, it means that there is a significant
difference between the observed and expected frequencies. The larger the chi-square value, the
smaller the p-value and the stronger the evidence against the null hypothesis.
Q4. What is the p-value in chi-square?
A. The p-value in a chi-square test is the probability of obtaining a chi-square statistic at least as
large as the one observed if the null hypothesis were true. It helps determine the statistical
significance of the result. A smaller p-value indicates stronger evidence against the null
hypothesis.

Effects of cultural factors on the performance of multinational corporations in Nigeria.

Corporate ethics & organizational effectiveness of selected oil & gas companies in Nigeria.

The role of television in electoral education in Nigeria

Impact of motivation on employees’ commitment in an organization

Impact of money supply & interest rate on stock prices in Nigeria

Comparative assessment of knowledge of childhood diarrhea & preventive practices among
nursing mothers as predictors of infant mortality in selected suburban communities in Lagos
metropolis, Nigeria.
