GRAPHICAL REPRESENTATION OF DATA
Representing the available data graphically is a very important step of statistical analysis. Before turning to graphs, we discuss the organization of data. The word 'data' is the plural of 'datum', which means a fact. In statistics the term is used for numerical facts such as measures of height and weight and scores on achievement and intelligence tests. Tests, experiments and surveys in education and psychology provide us with valuable data, mostly in the form of numerical scores. To understand the available data and derive meaningful and useful conclusions, the data have to be organized or arranged in some systematic way. This can be done in the following ways:
1. Statistical tables
2. Rank order
3. Frequency distribution

Statistical tables
The data are tabulated, that is, arranged into rows and columns under different headings. Such tables can list the original raw scores as well as percentages, means, standard deviations and so on.

Example: Table of group means and standard deviations (S.D.) on an anxiety test for dancers and non-dancers
Group                 Dancers    Non-dancers
Mean                  22.66      27.66
Standard deviation    6.018      8.741
N                     15         15
Rules for constructing tables:
1. The title of the table should be simple, concise and unambiguous. As a rule, it should appear at the top of the table.
2. The table should be suitably divided into columns and rows according to the nature of the data and the purpose of the table. These columns and rows should be arranged in a logical order to facilitate comparison.
3. The heading of each column or row should be as brief as possible. Two or more columns or rows with similar headings may be grouped under a common heading to avoid repetition, with subheadings or captions where needed.
4. Subtotals for each separate classification and a general total for all combined classes should be given. These totals should be placed at the bottom or to the right of the items concerned.
5. The units in which the data are given must invariably be mentioned.
6. Necessary footnotes, giving essential explanation of any points that are ambiguous in the tabulated data, must be given at the bottom of the table.
7. The source from which the data have been obtained should be given at the end of the table.
8. In tabulating long columns of figures, a space should be left after every five or ten rows.
9. If the numbers tabulated have more than three significant figures, the digits should be grouped in threes, e.g. 4394756 as 4 394 756.
10. Above all, the table should be as simple as possible, so that readers can study it with the least possible strain and form a clear picture and interpretation of the data.

Rank order
The original raw scores can be arranged in an ascending or a descending series, exhibiting an order with respect to the rank or merit position of the individuals.

Example: Sixteen students of a BA final-year psychology class obtained the following scores on an achievement test:

5 8 4 12 15 17 18 12 20 7 8 19 6 9 10 11

Tabulating the given data in descending order:

S. No.  Score    S. No.  Score    S. No.  Score    S. No.  Score
1       20       5       15       9       10       13      7
2       19       6       12       10      9        14      6
3       18       7       12       11      8        15      5
4       17       8       11       12      8        16      4

Frequency Distribution
Organizing the data in rank order does not help us to summarize a series of raw scores, nor does it tell us the frequency of the raw scores. In a frequency distribution we group the data into arbitrarily
chosen groups or classes and count how many times a particular score or group of scores occurs in the given data. This is known as the frequency distribution of numerical data.

Construction of Frequency distribution table
Finding the range: First of all, the range of the series to be grouped is found by subtracting the lowest score from the highest. In the present problem the range of the distribution is 46 - 12, i.e. 34.
Determining the class interval: After finding the range, we find the class interval, represented by 'i'. The formula for i is
i = Range / number of class intervals desired
i = 34 / 8
i = 4.25
We round this up to a convenient value and take the class interval to be 5.
Writing the contents of the frequency distribution table:
Writing the classes of the distribution. In the first column we write the classes of the distribution. The lowest class is settled first, and the subsequent classes are then written down. In this case we take 10-14 as the lowest class; the higher classes are then 15-19, 20-24, and so on up to 45-49.
Tallying the scores into proper classes. The given scores are tallied into the proper classes in the second column; the tallies are then counted against each class to obtain the frequency of the class.
Example:
Class interval    Tallies    Frequency (non-dancers)
45-49             l          1
40-44             -          0
35-39             ll         2
30-34             lll        3
25-29             llll       4
20-24             ll         2
15-19             ll         2
10-14             l          1
Total frequency (N) = 15

GRAPHICAL REPRESENTATION OF DATA
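Since the graphs for grouped data are all drawn from a frequency table, it may help to see the construction steps given earlier (range, class interval, tallying) as a short Python sketch. The raw scores below are hypothetical: the text does not list them, so they were invented only to agree with the stated range (46 - 12) and N = 15. The function names `suggested_interval` and `frequency_table` are likewise illustrative, and the sketch assumes whole-number scores.

```python
from collections import Counter

# Hypothetical raw scores, invented for illustration only (not from the text).
# They were chosen so that lowest = 12, highest = 46 and N = 15.
scores = [12, 16, 18, 21, 23, 25, 26, 27, 29, 30, 32, 34, 36, 38, 46]


def suggested_interval(scores, desired_classes):
    """i = range / number of class intervals desired."""
    return (max(scores) - min(scores)) / desired_classes


def frequency_table(scores, width, lowest_limit):
    """Count the scores falling in each class, highest class first."""
    # Class index 0 is the lowest class, 1 the next one up, and so on.
    counts = Counter((s - lowest_limit) // width for s in scores)
    n_classes = (max(scores) - lowest_limit) // width + 1
    table = []
    for k in reversed(range(n_classes)):
        low = lowest_limit + k * width
        high = low + width - 1
        table.append(((low, high), counts.get(k, 0)))
    return table


i = suggested_interval(scores, 8)   # 34 / 8
# The width 5 and the lowest class 10-14 are the choices made in the text.
table = frequency_table(scores, width=5, lowest_limit=10)

print(f"suggested i = {i}")
for (low, high), f in table:
    print(f"{low}-{high}  {'l' * f:<5} {f}")
print("N =", sum(f for _, f in table))
```

The suggested interval is only a guide; as in the text, the working width is a convenient value chosen by the person building the table, which is why it is passed in explicitly rather than computed.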
Statistical data may be presented in a more attractive form, appealing to the eye, with the help of graphic aids, i.e. pictures and graphs. Such a presentation has great communicative power: a mere glimpse of the pictures and graphs may enable the viewer to grasp a large amount of data immediately and meaningfully.

Ungrouped data may be represented by a bar diagram, pie diagram, pictograph or line graph. A bar graph represents the data on graph paper in the form of vertical or horizontal bars.
In a pie diagram, the data are represented by a circle of 360 degrees divided into parts, each part representing a component of the data converted into an angle. The total frequency is equated to 360 degrees, and the angles corresponding to the component parts are then calculated. In a pictograph (pictogram), the data are represented by picture figures designed in proportion to the numerical data.
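The angle calculation for a pie diagram described above can be sketched in a few lines of Python. The categories and counts are hypothetical, invented purely for illustration, and `pie_angles` is an illustrative name rather than a library function.

```python
def pie_angles(frequencies):
    """Convert each component's frequency into its sector angle in degrees."""
    total = sum(frequencies.values())
    # The total frequency is equated to 360 degrees, so each component
    # gets the share (frequency / total) of the full circle.
    return {name: value * 360 / total for name, value in frequencies.items()}


# Hypothetical ungrouped data, invented for illustration only.
enrolment = {"Arts": 30, "Science": 45, "Commerce": 15}
angles = pie_angles(enrolment)

for name, angle in angles.items():
    print(f"{name}: {angle} degrees")
```

Whatever the data, the computed angles should add up to 360 degrees, which is a quick check on the arithmetic before drawing the sectors.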
A line graph represents the data concerning one variable on the horizontal axis and the other variable on the vertical axis of the graph paper.

Grouped data may be represented graphically by a histogram, frequency polygon, cumulative frequency graph or cumulative frequency percentage curve (ogive).
A histogram is essentially a bar graph of a frequency distribution. The actual class limits plotted on the x axis give the width of the bars, and the frequencies of the respective class intervals give their height. A frequency polygon is a line graph of a frequency distribution. A cumulative frequency graph represents the cumulative frequency distribution by plotting the actual upper limits of the class intervals on the x axis and the respective cumulative frequencies on the y axis.
The cumulative frequency percentage curve, or ogive, represents the cumulative percentage frequency distribution by plotting the upper limits of the class intervals on the x axis and the respective cumulative percentage frequencies on the y axis.

METHOD FOR CONSTRUCTING A HISTOGRAM
1. The scores are taken in the form of actual class limits, such as 19.5-24.5, 24.5-29.5 and so on, rather than the written class limits, such as 20-24, 25-29.
2. It is customary to take two extra class intervals, one below and one above the grouped intervals.
3. The actual lower limits of all the class intervals are plotted on the x axis. The lower limit of the lowest class interval is placed at the point where the x axis and the y axis intersect.
4. The frequencies of the distribution are plotted on the y axis.
5. Each class interval, with its specific frequency, is represented by a separate rectangle. The base of each rectangle is the width of the class interval, and its height represents the frequency of that class interval.
6. Care should be taken to select appropriate units of representation along the x and y axes. Neither the x axis nor the y axis should be too short or too long.

METHOD FOR CONSTRUCTING A FREQUENCY POLYGON
1. As in the histogram, two extra class intervals are taken, one above and one below the given class intervals.
2. The mid-point of each class interval is calculated.
3. The mid-points are plotted along the x axis and the corresponding frequencies along the y axis.
4. The points obtained by this plotting are joined by straight lines to give the frequency polygon.
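The two methods above differ mainly in which x values are used: actual class limits for the histogram, mid-points for the frequency polygon. A minimal Python sketch of both sets of coordinates follows, using the class intervals and frequencies from the frequency table in this text. It assumes whole-number scores, so that the actual limits lie 0.5 below and above the written limits; the function names are illustrative, and the drawing itself is left to whatever plotting tool is at hand.

```python
# Written class limits and frequencies from the frequency table, lowest class first.
classes = [(10, 14), (15, 19), (20, 24), (25, 29),
           (30, 34), (35, 39), (40, 44), (45, 49)]
frequencies = [1, 2, 2, 4, 3, 2, 0, 1]


def with_extra_intervals(classes, frequencies):
    """Add one empty class below and one above the grouped intervals."""
    width = classes[0][1] - classes[0][0] + 1
    below = (classes[0][0] - width, classes[0][1] - width)
    above = (classes[-1][0] + width, classes[-1][1] + width)
    return [below] + classes + [above], [0] + frequencies + [0]


def histogram_bars(classes, frequencies):
    """Each bar as (actual lower limit, actual upper limit, height)."""
    # Assumes whole-number scores: actual limits are written limits -/+ 0.5.
    return [(low - 0.5, high + 0.5, f)
            for (low, high), f in zip(classes, frequencies)]


def polygon_points(classes, frequencies):
    """Each point as (mid-point of the class interval, frequency)."""
    return [((low + high) / 2, f)
            for (low, high), f in zip(classes, frequencies)]


all_classes, all_freqs = with_extra_intervals(classes, frequencies)
bars = histogram_bars(all_classes, all_freqs)
points = polygon_points(all_classes, all_freqs)

for bar in bars:
    print("bar:", bar)
for point in points:
    print("point:", point)
```

Because the two extra intervals have zero frequency, the polygon built from `points` starts and ends on the x axis, which is the reason for adding them.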
DIFFERENCE BETWEEN HISTOGRAM AND FREQUENCY POLYGON
A histogram is a bar graph, while a frequency polygon is a line graph. The frequency polygon is generally more useful and practical: the trend of the distribution is easy to see in it, which is harder to do in a histogram. The histogram, on the other hand, gives a very clear and accurate picture of the relative proportion of the frequencies from interval to interval.

METHOD FOR CONSTRUCTING A CUMULATIVE FREQUENCY GRAPH
1. First of all, we calculate the actual upper and lower limits of the class intervals; e.g. if the class interval is 20-24, the actual upper limit is 24.5 and the actual lower limit is 19.5.
2. We must now select a suitable scale according to the range of the class intervals, and plot the actual upper limits on the x axis and the respective cumulative frequencies on the y axis.
3. All the plotted points are then joined by successive straight lines, resulting in a line graph.
4. To start the graph on the x axis, an extra class interval with a cumulative frequency of zero is taken below the lowest class interval.
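The steps above can be sketched in Python as follows, again using the class intervals and frequencies from the frequency table in this text. The sketch assumes whole-number scores (actual upper limit = written upper limit + 0.5), and the function names are illustrative. The second function converts the same points into cumulative percentages, which gives the coordinates of the ogive.

```python
from itertools import accumulate

# Written class limits and frequencies from the frequency table, lowest class first.
classes = [(10, 14), (15, 19), (20, 24), (25, 29),
           (30, 34), (35, 39), (40, 44), (45, 49)]
frequencies = [1, 2, 2, 4, 3, 2, 0, 1]


def cumulative_points(classes, frequencies):
    """Points (actual upper limit, cumulative frequency), starting from zero."""
    width = classes[0][1] - classes[0][0] + 1
    # Extra class interval below the lowest one, with cumulative frequency zero.
    extra_upper = classes[0][1] - width + 0.5
    upper_limits = [extra_upper] + [high + 0.5 for _, high in classes]
    cum_freqs = [0] + list(accumulate(frequencies))
    return list(zip(upper_limits, cum_freqs))


def ogive_points(points):
    """Same x values, with cumulative frequencies as percentages of N."""
    n = points[-1][1]   # the last cumulative frequency equals N
    return [(x, cf * 100 / n) for x, cf in points]


cum = cumulative_points(classes, frequencies)
ogive = ogive_points(cum)

for (x, cf), (_, pct) in zip(cum, ogive):
    print(f"{x:5.1f}  cum. f = {cf:2d}  cum. % = {pct:.1f}")
```

Joining the points in `cum` by straight lines gives the cumulative frequency graph; joining those in `ogive` gives the cumulative percentage curve, which always ends at 100.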