Chapter 4: Measures of Central Tendency – Biostatistics

Chapter 4

Measures of Central Tendency

Objectives

After completing this chapter, you will understand the following:

  • The definition, meaning and significance of measures of central tendency [MCT].
  • The calculation of different MCTs such as arithmetic mean, median, mode, geometric mean and harmonic mean.
  • The relationship between the different averages.
  • How these measures can be used to make decisions related to biological studies.
4.1 INTRODUCTION

Statistical methods are needed for summarizing and describing collected numerical data. The main objective of this chapter is to introduce a single representative value that can be used to identify and summarize an entire set of data. This representative value is very helpful in making decisions based on the data collected. MCTs are used to locate the central value around which the data are spread.

4.2 MEASURES OF CENTRAL TENDENCY

The average of a distribution is its representative size. As most of the items of the series cluster around the average, it is called a measure of central tendency. The average is computed in order to reduce the complexity of the data. The entire distribution is reduced to one number, which can be considered typical of an important characteristic of the population, and the same can be used in making comparisons and in examining relations with other distributions.

For example, it is not possible to remember the individual incomes of crores of earning people in India. If the average income is evaluated from all the income data, we get a single value that represents the entire population. The commonly used averages are as follows:

  • Arithmetic Mean
  • Median
  • Mode
  • Geometric mean and
  • Harmonic mean.

4.2.1 Properties of Best Average

It should be

  • rigidly defined.
  • based on all observations of the series.
  • easy to calculate and simple to understand.
  • capable of further algebraic treatment.
  • free from the extreme values, that is, it should not be affected by extreme values.

The arithmetic mean satisfies most of these requirements, although it is affected by extreme values [see Section 4.4.1]. The other averages are also useful in certain specific cases. The median is quite useful for studying data that are not capable of direct quantitative measurement, such as skin colour; the mode is good when the extreme values are not well defined.

4.3 ARITHMETIC MEAN

The evaluation of mean depends on the nature of data. The data for evaluation can be categorized into three types as follows:

  1. Discrete data [DD]
  2. Discrete data with frequency [DDF] and
  3. Continuous data with frequency [CDF]

4.3.1 Discrete Data

Consider a given set of n discrete values Xi [i = 1, 2, …, n], that is, X1, X2, X3, …, Xn. Then the arithmetic mean is defined as

X̄ = [X1 + X2 + … + Xn]/n = ΣXi/n

4.3.2 Discrete Data with Frequency

Consider a given set of n discrete values Xi with corresponding frequencies fi [i = 1, 2, … , n].

Then the arithmetic mean is defined as

X̄ = Σ[fi * Xi]/Σfi

4.3.3 Continuous Data with Frequency

Consider the given set of data. Verify whether the given class intervals are continuous. If they are not, convert them into continuous intervals by a proper method. After that, verify whether the length of the class intervals is uniform. If it is not, make a proper adjustment for the varied lengths.

Let Li be the lower limit of the ith class interval and Ui be the upper limit of the ith class interval.

For each interval, choose the mid value and call it Xi [mid value of the ith class interval]: Xi = [Li + Ui]/2.

Then the arithmetic mean can be computed using the following relationship:

X̄ = Σ[fi * Xi]/Σfi

Alternate method

Find di = [Xi – A]/h for all i = 1, 2, … , n, where

A – any one value of X, preferably the middle value

h – the length of the class interval

Then the arithmetic mean is

X̄ = A + h * [Σ[fi * di]/Σfi]

The relative advantages of arithmetic mean are listed as follows:

  • easy to understand
  • easy to calculate
  • value is definite
  • familiar to all
  • based on all observations
  • least affected by fluctuations of sampling
  • stable, in the sampling sense
  • capable of further algebraic treatment
  • used to evaluate mean deviation
  • used to evaluate the standard deviation
4.4 MATHEMATICAL PROPERTIES OF ARITHMETIC MEAN
  1. The sum of all deviations of the observations from the mean is zero. Mathematically, Σ[Xi – X̄] = 0.

    Note:

    All the evaluated statistics, such as the mean, median and mode, are constants for a given data set.

  2. The sum of the squared deviations of the observations from the mean is minimum. Mathematically, Σ[Xi – X̄]^2 is minimum, that is, Σ[Xi – X̄]^2 ≤ Σ[Xi – a]^2 for any value a.
  3. Composite mean [combined mean] can be evaluated for any number of groups if the mean and size of each group are known. Let [X̄1, n1], [X̄2, n2], … , [X̄k, nk] be the means and sizes of k different groups; then the composite mean X̄ can be evaluated using the relation X̄ = [n1 * X̄1 + n2 * X̄2 + … + nk * X̄k]/[n1 + n2 + … + nk].
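
The three properties can be checked numerically. The following is a minimal sketch, assuming six illustrative values [the lamb weight gains used in Example 1] and an arbitrary split into two groups; all variable names are chosen only for illustration.

```python
values = [11, 13, 19, 2, 10, 1]
n = len(values)
x_bar = sum(values) / n

# Property 1: the deviations from the mean sum to zero (up to rounding error)
sum_dev = sum(x - x_bar for x in values)
print(f"sum of deviations about the mean = {sum_dev:.10f}")


def sum_sq_dev(a):
    """Sum of squared deviations of the values about an arbitrary point a."""
    return sum((x - a) ** 2 for x in values)


# Property 2: the sum of squares about the mean is not larger than about other points
candidates = [x_bar - 2, x_bar - 0.5, x_bar, x_bar + 0.5, x_bar + 2]
for a in candidates:
    print(f"a = {a:6.2f}  sum of squared deviations = {sum_sq_dev(a):8.2f}")

# Property 3: composite mean of two groups from their means and sizes
group1, group2 = values[:2], values[2:]
m1 = sum(group1) / len(group1)
m2 = sum(group2) / len(group2)
composite = (len(group1) * m1 + len(group2) * m2) / (len(group1) + len(group2))
print(f"composite mean = {composite:.4f}, overall mean = {x_bar:.4f}")
```

The composite mean of the two groups coincides with the mean of all six values, as property 3 states.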

4.4.1 Disadvantages of Arithmetic Mean Related to Other Averages

  1. If the number of items in a series is very small, the extreme items affect the arithmetic mean.

    Example:

    Consider the marks secured by six students in Mathematics

    100, 100, 100, 0, 0 and 0. The average mark is 50, which is not representative.

  2. In an organization, the production rates per hour of four of its members are A – 30, B – 30, C – 30 and D – 46. The average comes out as 34. However, 34 cannot be fixed as the production standard because most of the employees cannot achieve it.
  3. The average cannot be evaluated if even one item in the given series is not known.
  4. It cannot be located by mere inspection; it requires computation. In certain situations, the mean value found may be an odd figure. Suppose the mean number of children in 100 families is 2.5; we conclude that on average there are two and a half children in each family. This figure is difficult to conceive of.

Example: 1

The following are the two-week weight gains [kgs] of six young lambs of the same breed who had been raised on the same diet: 11, 13, 19, 2, 10 and 1.

Find the arithmetic mean.

Step 1:

Consider the given values

 

X1 = 11; X2 = 13; X3 = 19; X4 = 2; X5 = 10 and X6 = 1.

 

Here, n = 6.

Step 2:

X̄ = ΣXi/n = [11 + 13 + 19 + 2 + 10 + 1]/6 = 56/6 = 9.33

Hence, the mean or average or arithmetic mean weight gain is 9.33 kg.
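
The same calculation can be sketched in a few lines of Python. This is only an illustration of the definition for discrete data; the cross-check uses the `mean` function of the standard `statistics` module.

```python
from statistics import mean

# Two-week weight gains [kg] of the six lambs from Example 1
gains = [11, 13, 19, 2, 10, 1]

n = len(gains)
total = sum(gains)
arithmetic_mean = total / n

print(f"n = {n}, sum = {total}, mean = {arithmetic_mean:.2f}")

# Cross-check against the standard library
assert abs(arithmetic_mean - mean(gains)) < 1e-12
```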

Example: 2

Six men with high serum cholesterol participated in a study to evaluate the effects of diet on cholesterol level. At the beginning of the study their serum cholesterol levels [mg/dL] were as follows: 366, 327, 274, 292, 274 and 230. Determine the mean.

Step 1:

Consider the serum cholesterol levels of the 6 men

 

X1 = 366, X2 = 327, X3 = 274, X4 = 292, X5 = 274 and X6 = 230; here n = 6.

Step 2:

X̄ = ΣXi/n = [366 + 327 + 274 + 292 + 274 + 230]/6 = 1763/6 = 293.83

Hence, the average serum cholesterol level is 293.83 mg/dL.

Example: 3

A researcher applied the carcinogenic [cancer-causing] compound benzo[a]pyrene to the skin of five mice and measured the concentration in the liver tissue after 48 hours. The results [nmol/g] were as follows:

 

6.3, 5.9, 7, 6.9 and 5.9.

 

Evaluate the mean value.

Step 1:

Construct the discrete distribution with frequency:

Step 2:

Consider the values of X and f and create a new column f * X.

Xi       fi    fi * Xi
5.9      2     11.8
6.3      1     6.3
6.9      1     6.9
7        1     7
Total    5     32

Step 3:

The arithmetic mean can be evaluated using the relation

X̄ = Σ[fi * Xi]/Σfi = 32/5 = 6.4

Hence, the mean concentration is 6.4 nmol/g.
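
The frequency-table route can be sketched as follows. This is a minimal illustration, assuming the five readings listed in the example; `Counter` from the standard library is used here only as a convenient way to build the Xi, fi table.

```python
from collections import Counter

# Liver concentrations [nmol/g] of the five mice from Example 3
readings = [6.3, 5.9, 7, 6.9, 5.9]

freq = Counter(readings)                       # maps each value Xi to its frequency fi
sum_f = sum(freq.values())                     # total frequency
sum_fx = sum(x * f for x, f in freq.items())   # sum of fi * Xi
mean_ddf = sum_fx / sum_f

for x in sorted(freq):
    print(f"Xi = {x}, fi = {freq[x]}, fi * Xi = {x * freq[x]:.1f}")
print(f"mean = {mean_ddf:.2f}")
```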

Example: 4

Find the arithmetic mean for the following data

Step 1:

Consider the values of X and f and create a new column fi * Xi.

Xi       fi     fi * Xi
50       15     750
55       20     1100
60       25     1500
65       30     1950
70       10     700
Total    100    6000

X̄ = Σ[fi * Xi]/Σfi = 6000/100 = 60

Alternate method

Method of assumed mean

Find d = X – A, where A is any one of the X-values, and evaluate [f * d]. Here A = 60, so that Σ[f * d] = 0 and

X̄ = A + [Σ[f * d]/Σf] = 60 + [0/100] = 60 + 0 = 60

Note:

In both the methods, the value of X̄ is the same. Either method can be used. From the computational aspect, the second method is the better one.
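
A short sketch comparing the two methods on the data of Example 4 is given here. The choice A = 60 follows the worked calculation; any other X-value could be assumed instead and would give the same mean.

```python
# Values and frequencies from Example 4
x = [50, 55, 60, 65, 70]
f = [15, 20, 25, 30, 10]
n = sum(f)

# Direct method: mean = sum(f * X) / sum(f)
direct = sum(fi * xi for xi, fi in zip(x, f)) / n

# Assumed mean method: d = X - A, mean = A + sum(f * d) / sum(f)
A = 60
d = [xi - A for xi in x]
sum_fd = sum(fi * di for fi, di in zip(f, d))
assumed = A + sum_fd / n

print(f"direct = {direct}, sum(f * d) = {sum_fd}, assumed-mean result = {assumed}")
```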

Example: 5

For each of 31 healthy dogs, a veterinarian measured the glucose concentration in the anterior chamber of the right eye, and also in the blood serum. The following data are the anterior chamber glucose measurements, expressed as a percentage of the blood glucose.

Convert these data into continuous data with frequency and find the average.

Evaluate the arithmetic mean.

Step 1:

The given class intervals are continuous and have uniform length. The mean can be evaluated either directly or using the difference method.

Direct method

Find the mid value of the class interval [X] and create a column [Xf].

Class = 69 – 76; midpoint = [69 + 76]/2 = 72.5.

For the subsequent classes, add 7 to the previous mid value.

Hence, the average blood glucose level is 85.82.

Alternate method

Using X, find the value d with the help of the relation di = [Xi – A]/h,

where A – any one value of Xi [i = 1, 2, … , 7]; h = length of the class interval = 7; and the number of classes is 7.

Hence, the average blood glucose level is 85.82.

Example: 6

The number of aphids observed per clover plant is recorded. A frequency table grouping these data is as follows:

Number of aphids on a plant    Number of plants observed
0–3                            6
4–7                            17
8–11                           40
12–15                          54
16–19                          59
20–23                          75
24–27                          77
28–31                          55
32–35                          32
36–39                          8
Total                          423

Find the mean.

Step 1:

  • The given class intervals are not continuous, but they have uniform length.
  • The difference between the lower limit of each interval and the upper limit of the preceding interval is uniform, and its value is 1. Half of the difference is [1/2], that is, 0.5.

Step 2:

  • Subtract 0.5 from the lower limits and add 0.5 to the upper limits of the class intervals.
  • Find the mid point of the class intervals.
  • Find the value of di = [Xi – A]/h; let A = 17.5; h = 4.

Step 3:

X̄ = A + h * [Σ[fi * di]/Σfi] = 17.5 + 4 * [353/423] = 17.5 + 3.34 = 20.84

The average number of aphids observed per clover plant is 20.84.
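
The steps of this example [continuity correction, mid points, deviations d and the final mean] can be sketched as follows. The class limits and frequencies are those of the table; the variable names are illustrative assumptions.

```python
# Class limits and frequencies from the aphid table of Example 6
classes = [(0, 3), (4, 7), (8, 11), (12, 15), (16, 19),
           (20, 23), (24, 27), (28, 31), (32, 35), (36, 39)]
freqs = [6, 17, 40, 54, 59, 75, 77, 55, 32, 8]

# Continuity correction: half of the gap between consecutive classes
gap = classes[1][0] - classes[0][1]
half = gap / 2
bounds = [(lo - half, hi + half) for lo, hi in classes]

mids = [(lo + hi) / 2 for lo, hi in bounds]   # mid value of each class
h = bounds[0][1] - bounds[0][0]               # uniform class length
A = 17.5                                      # assumed mean, as in the example

d = [(m - A) / h for m in mids]               # deviations in units of h
n = sum(freqs)
sum_fd = sum(f * di for f, di in zip(freqs, d))
mean_grouped = A + h * sum_fd / n

print(f"h = {h}, n = {n}, sum(f * d) = {sum_fd}, mean = {mean_grouped:.2f}")
```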

Example: 7

Restructure the given problem into a standard form.

In this problem, class 1 and class 5 are open-ended classes. In order to convert the problem into a standard form, these two open-ended classes must be converted into closed-end classes based on the other closed-end classes. Class 1 and class 5 can be changed into

(a) Below 20 => 0–20

(b) 80 and above => 80–100, respectively.

Hence, the given structure becomes

Note:

The mean can be computed based on the modified problem.

Example: 8

Restructure the given problem into a continuous one.

By looking into the given structure, one can identify that the length increases by five.

Hence, the length of the first class interval should be 5, that is, 15–20 and the last one be 25, that is, 65–90. The modified structure is as follows:

Example: 9

The expenditure of 1000 families is given as follows:

The mean for the distribution is 87.50. Calculate the missing frequencies.

Step 1:

Let the missing frequencies be x and y. The given class intervals are not continuous, but their lengths are uniform.

Step 2:

Convert the class intervals into a continuous one.

Difference = 1; half of the difference = ½.

Subtract [1/2] from the lower limits and add [1/2] to the upper limits of the intervals. Find the mid points of the class intervals, where h = 20; the mid point of the first interval = [39.5 + 59.5]/2 = 49.5; add the length h for each subsequent interval.

Using the value of mean in the second equation,

Using the value of x in the first equation, we have 150 + y = 550;

 

y = 550 – 150 = 400. Hence, the missing frequencies are

Class                40–59    80–99
Missing frequency    150      400

Example: 10

The mean age of a combined group of men and women is 30 years. If the mean age of the group of men is 32 and that of the group of women is 27, find out the percentage of the men and women in the group.

Let the total number of men and women in the group = 100.

Let n be the number of men and then [100 – n] be the number of women in the group.

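By definition, the mean of two composite groups is

X̄ = [n1 * X̄1 + n2 * X̄2]/[n1 + n2]     … [1]

Here X̄ = 30 years; X̄1 = 32 years; X̄2 = 27 years; n2 = 100 – n and n1 = n.

Using all the values in equation [1],

30 = [32n + 27 * [100 – n]]/100

3000 = 5n + 2700; n = 300/5 = 60.

Using n = 60, we have n2 = 100 – 60 = 40.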

Percentage of the men = 60;

Percentage of the women = 40.
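
The composite mean relation can be solved for the group size directly. The following sketch assumes, as in the example, a total of 100 people, so that the sizes read as percentages.

```python
combined_mean = 30   # mean age of the whole group [years]
mean_men = 32        # mean age of the men
mean_women = 27      # mean age of the women
total = 100          # assumed total, so sizes are percentages

# From total * combined_mean = men * mean_men + (total - men) * mean_women
men = total * (combined_mean - mean_women) / (mean_men - mean_women)
women = total - men

# Check by substituting back into the composite mean relation
check = (men * mean_men + women * mean_women) / total

print(f"men = {men:.0f}%, women = {women:.0f}%, composite mean = {check}")
```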

Example: 11

The average weight for a group of 25 boys was calculated to be 78.4 lb. It was later discovered that one weight was misread as 69 lb instead of the correct value of 96 lb. Calculate the correct average.

Given,

Mean of 25 boys = 78.4 lb.

Total weight of 25 boys = 78.4 * 25 = 1960 lb.

This 1960 lb includes the incorrect value of 69 lb.

Subtracting the incorrect value and adding the correct value 96 lb, we can have the corrected total weight.

Corrected total weight = 1960 – 69 + 96 = 1987.

Corrected mean = 1987/25 = 79.48 lb.

The corrected average is 79.48 lb.
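
The correction can be sketched as follows. The shortcut in the last step [adding the error divided by the number of items to the reported mean] is an equivalent form of the same calculation, added here only as a cross-check.

```python
n_boys = 25
reported_mean = 78.4      # mean computed with the misread value [lb]
wrong_value = 69          # value used by mistake
right_value = 96          # correct value

reported_total = reported_mean * n_boys
corrected_total = reported_total - wrong_value + right_value
corrected_mean = corrected_total / n_boys

# Equivalent shortcut: shift the mean by the error spread over all items
shortcut = reported_mean + (right_value - wrong_value) / n_boys

print(f"corrected total = {corrected_total:.0f} lb, corrected mean = {corrected_mean:.2f} lb")
```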

4.5 MEDIAN

The median is the size of the middle-most item when the items are arranged in an array [in ascending or descending order]. Half the total number of cases will lie below the median and half will lie above it. It is a ‘positional’ average.

4.5.1 Discrete Data

Consider a given set of n discrete values Xi [i = 1, 2, … , n], that is, X1, X2, X3, … , Xn. Arrange the data in ascending order. If the number of values is odd, the median is the middle value. If the number of values is even, the median is the average of the middle two values.

4.5.2 Discrete Data with Frequency

Consider a given set of n discrete values Xi with corresponding frequencies fi [i = 1, 2, … , n].

Construct the cumulative frequency column. Find the value of n/2, where n here denotes the total frequency. Select the cumulative frequency just greater than the value of n/2. The value of X corresponding to the selected cumulative frequency is the median.

4.5.3 Continuous Data with Frequency

Consider the given set of data. Verify whether the given class intervals are continuous. If they are not, convert them into continuous intervals by a proper method. After that, verify whether the length of the class intervals is uniform. If it is not, make a proper adjustment for the varied lengths.

Let Li be the lower limit of the ith class interval and Ui be the upper limit of the ith class interval.

For each interval, choose the mid value and call it Xi [mid value of the ith class interval].

Construct the cumulative frequency column. Find the value of n/2. Select the cumulative frequency just greater than the value of n/2. Then select the interval that corresponds to the selected cumulative frequency as the median class.

l     lower limit of the median class

h    length of the class interval

n    total frequency

cf   cumulative frequency of the previous class to the median class

f    frequency of the median class

The value of median can be computed using the following relationship:

Median = l + h * [[n/2 – cf]/f]

Relative advantages

  1. The median is easily calculated and can be understood easily.
  2. The value of the median is not affected by the magnitude of the extreme values.

    Example:

    The median of the incomes of 5 employees, 30, 35, 40, 45 and 50, is 40. If we replace the value of the fifth item, 50, by 100, the median is still 40.

  3. The median can be evaluated even if the data are incomplete.

    Example:

    In the above example, even if the exact value of the largest income is not known, the median can still be evaluated, whereas the arithmetic mean cannot.

  4. The median may be located when the items in a series cannot be measured quantitatively such as the fairness of the skin and the intelligence.

Relative disadvantages

  1. It is not capable of further algebraic treatment like the mean. That is, if we know the medians of two groups, the overall median cannot be evaluated.
  2. If there is a high degree of variation among the data set, median cannot be viewed as a representative.

    Example:

    Median for 10, 20, 30, 100, 1000, 2000, 3000 is 100 which is not a representative of the group.

  3. It cannot be considered as a representative when there are few items.

Example: 12

The following are the two-week weight gains [kgs] of six young lambs of the same breed who had been raised on the same diet: 11, 13, 19, 2, 10 and 1.

Find the value of Median.

Data type is DD. Place all the given 6 values in ascending order.

 

1  2  10  11  13  19

 

Select the middlemost item. Since the number of elements is even, select the two middlemost values and find their average.

The mid values are 10 and 11.

 

Median = [10 + 11]/2 = 10.5

 

Hence, the median is 10.5.
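
The rule for discrete data can be sketched as a small function. The function name `median_dd` is an illustrative assumption; the second data set is the income example used above to show that the median is unaffected by an extreme value.

```python
def median_dd(values):
    """Median of discrete data: middle item, or average of the two middle items."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd count: single middle item
    return (ordered[mid - 1] + ordered[mid]) / 2   # even count: average of middle two


lamb_gains = [11, 13, 19, 2, 10, 1]
print(median_dd(lamb_gains))

# Replacing the largest income 50 by 100 leaves the median unchanged
print(median_dd([30, 35, 40, 45, 50]), median_dd([30, 35, 40, 45, 100]))
```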

Example: 13

A researcher applied the carcinogenic [cancer-causing] compound benzo[a]pyrene to the skin of five mice and measured the concentration in the liver tissue after 48 hours. The results [nmol/g] were as follows: 6.3, 5.9, 7, 6.9 and 5.9.

Evaluate the median value.

Step 1:

Construct the discrete distribution with frequency:

Step 2:

The data type is DDF. Based on the given table, construct the cumulative frequency table.

Xi       fi    cf
5.9      2     2
6.3      1     3
6.9      1     4
7        1     5
Total    5

[n + 1]/2 = 6/2 = 3, so the median is the 3rd item in order.

The first cumulative frequency that reaches 3 is 3 itself, which corresponds to X = 6.3.

Hence, the median is 6.3, the middle item of the ordered data 5.9, 5.9, 6.3, 6.9 and 7.

Example: 14

The following table gives the litter size [number of piglets surviving to 21 days] for each of the 36 sows. Determine the median of litter size.

Step 1:

The data type is DDF. Based on the given table, construct the cumulative frequency table.

No. of piglets    Frequency [No. of sows]    cf
5                 1                          1
6                 0                          1
7                 2                          3
8                 3                          6
9                 3                          9
10                9                          18
11                8                          26
12                5                          31
13                3                          34
14                2                          36
Total             36
 

Step 2:

 

Find [n + 1]/2 = 37/2 = 18.5

 

Here, the cumulative frequency just greater than 18.5 is 26.

The median is the value of X corresponding to the cumulative frequency 26.

Hence, the median of litter size is 11.

Example: 15

Evaluate the value of median.

Step 1:

The data type is continuous. Construct the cumulative frequency column.

We have to find the class interval in which the median lies, for which find the value of n/2 = 31/2 = 15.5.

The cumulative frequency just greater than 15.5 is 23, which corresponds to the class [83–90]; this is the median class.

Class interval [blood glucose]    No. of dogs [fi]    cf
69–76                             6                   6
76–83                             9                   15
83–90                             8                   23
90–97                             3                   26
97–104                            2                   28
104–111                           1                   29
111–118                           2                   31
Total                             31
 

Here,

l     lower limit of the median class

h    length of the class interval

n    total frequency

cf   cumulative frequency of the previous class to the median class

f     frequency of the median class

 

l = 83; h = 7; cf = 15; n/2 = 15.5; f = 8;

Median = l + h * [[n/2 – cf]/f] = 83 + 7 * [[15.5 – 15]/8] = 83 + 0.44 = 83.44

Hence, the median is 83.44. The median value implies that 50% of the dogs have a blood glucose level up to 83.44.
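
The search for the median class and the interpolation Median = l + h * [[n/2 – cf]/f] can be sketched as follows. The function name `grouped_median` is a hypothetical helper, and the sketch assumes continuous class intervals given as [lower, upper] pairs.

```python
def grouped_median(bounds, freqs):
    """Median of continuous grouped data by interpolation within the median class."""
    n = sum(freqs)
    half = n / 2
    cf = 0                                   # cumulative frequency before the current class
    for (lower, upper), f in zip(bounds, freqs):
        if cf + f > half:                    # first cumulative frequency just greater than n/2
            h = upper - lower
            return lower + h * (half - cf) / f
        cf += f
    raise ValueError("empty frequency table")


# Class intervals and frequencies of the 31 dogs from Example 15
dog_bounds = [(69, 76), (76, 83), (83, 90), (90, 97),
              (97, 104), (104, 111), (111, 118)]
dog_freqs = [6, 9, 8, 3, 2, 1, 2]

dog_median = grouped_median(dog_bounds, dog_freqs)
print(f"median = {dog_median:.2f}")
```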

Example: 16

The number of aphids observed per clover plant is recorded. A frequency table grouping these data is as follows:

Number of aphids on a plant    Number of plants observed
0–3                            6
4–7                            17
8–11                           40
12–15                          54
16–19                          59
20–23                          75
24–27                          77
28–31                          55
32–35                          32
36–39                          8
Total                          423

Find the value of median.

Step 1:

  1. The given class intervals are not continuous, but they have uniform length.
  2. The difference between the lower limit of each interval and the upper limit of the preceding interval is uniform, and its value is 1. Half of the difference is [1/2], that is, 0.5.

Step 2:

  1. Subtract 0.5 from the lower limits and add 0.5 to the upper limits of the class intervals.

The data are now continuous. Construct the cumulative frequency column.

We have to find the class interval in which the median lies, for which find the value of n/2 = 423/2 = 211.5.

The cumulative frequency just greater than 211.5 is 251, which corresponds to the class [19.5–23.5]; this is the median class.

Number of aphids on a plant    Number of plants observed    cf
-0.5–3.5                       6                            6
3.5–7.5                        17                           23
7.5–11.5                       40                           63
11.5–15.5                      54                           117
15.5–19.5                      59                           176
19.5–23.5                      75                           251
23.5–27.5                      77                           328
27.5–31.5                      55                           383
31.5–35.5                      32                           415
35.5–39.5                      8                            423

Here,

l     lower limit of the median class

h    length of the class interval

n    total frequency

cf   cumulative frequency of the previous class to the median class

f     frequency of the median class

 

l = 19.5; h = 4; cf = 176; n/2 = 211.5; f = 75;

Median = l + h * [[n/2 – cf]/f] = 19.5 + 4 * [[211.5 – 176]/75] = 19.5 + 1.89 = 21.39

Hence, the median is 21.39.

Example: 17

Consider the following data, which relates to the age distribution of 1000 workers in an industry:

Evaluate the median age.

The data type is CDF. If the given table has open-ended classes, first convert them into closed classes. The intervals are then continuous and of uniform length, so the median value can be evaluated directly.

Construct the cumulative frequency column based on the given table.

Age [years]    No. of workers    Cumulative frequency
20–25          120               120
25–30          125               245
30–35          180               425 [cf]
35–40          160 [f]           585
40–45          150               735
45–50          140               875
50–55          100               975
55–60          25                1000
Total          1000

Cumulative frequency just greater than 500 is 585.

Hence, the median class is [35–40].

Here l = 35; n/2 = 500; cf = 425; f = 160 and h = 5,

Median = l + h * [[n/2 – cf]/f] = 35 + 5 * [[500 – 425]/160] = 35 + 2.34 = 37.34

The required median age is 37.34 years. The median value implies that 50% of the workers are below 37.34 years of age and 50% of the workers are above 37.34 years of age.

Example: 18

An incomplete frequency distribution is given as follows:

Given that the median value is 46, find the missing frequencies using the median formula.

The given class intervals are continuous and have uniform class length, so keep them as such. Let X and Y be the missing frequencies of the classes 30–40 and 50–60, respectively. The median value of 46 implies that the median lies in the interval [40–50].

Construct the cumulative frequency column.

Variable    Frequency    Cumulative frequency
10–20       12           12
20–30       30           42
30–40       X            42 + X
40–50       65           107 + X
50–60       Y            107 + X + Y
60–70       25           132 + X + Y
70–80       18           150 + X + Y
Total       229

Given that the total frequency is 229,

150 + X + Y = 229, that is, X + Y = 79     … [1]

The median class [40–50] implies that

n = 229; l = 40; cf = 42 + X; f = 65; h = 10 and median = 46.

By definition,

Median = l + h * [[n/2 – cf]/f]

46 = 40 + 10 * [[114.5 – [42 + X]]/65]

39 = 72.5 – X; X = 33.5

Approximately X = 34.

Using the values of X = 34 in equation 1, we have

 

34 + Y = 79; Y = 79 – 34 = 45.

 

Hence, the missing frequencies are

Variable     30–40    50–60
Frequency    34       45
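
The two unknown frequencies can be recovered by rearranging the median relation for X and then using the total frequency for Y. This is a minimal sketch; rounding 33.5 up to 34 follows the approximation made in the example.

```python
import math

total = 229
known = [12, 30, 65, 25, 18]       # frequencies of the classes that are given
l, h, f = 40, 10, 65               # lower limit, length and frequency of the median class
median_value = 46
cf_known = 12 + 30                 # cumulative frequency before the class 30-40

# median = l + h * (n/2 - (cf_known + x)) / f, solved for x
x_exact = total / 2 - cf_known - (median_value - l) * f / h
x = math.floor(x_exact + 0.5)      # round half up, as in the worked example

# The remaining frequency follows from the total
y = total - sum(known) - x

print(f"x (exact) = {x_exact}, x = {x}, y = {y}")
```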

Property of median

The sum of the absolute deviations about the median is minimum.

It is denoted by: Σ|Xi – Median| is minimum.

4.5.4 Graphical Method to Find the Median

Method 1

Consider the continuous frequency distribution. Construct the less than cumulative frequency column. Draw the less than cumulative frequency curve [Ogive curve] by taking the class interval on the X-axis and the cumulative frequency on the Y-axis.

Find [n/2] and draw a horizontal line at Y = n/2 until it meets the less than Ogive curve; then draw a perpendicular line to the X-axis from that point of intersection. The point at which the perpendicular meets the X-axis is the required median.

Example: 19

Evaluate median using graphical method.

The given class intervals are continuous and having uniform length; so, keep it as such. Construct the cumulative frequency column and mid-class intervals column.

Consider the upper limits of the class intervals on the X-axis and the cumulative frequency along the Y-axis. n/2 = 80/2 = 40; draw a horizontal line at Y = 40. It intersects the cumulative frequency curve. Draw a perpendicular line from the point of intersection to the X-axis. The point at which it meets the X-axis is the required median.

Approximately the median value is 35.

Method 2

Consider the continuous frequency distribution. Construct the less than and more than cumulative frequency columns. Draw the less than Ogive and more than Ogive curves by taking frequency along the Y-axis and the class intervals along the X-axis. Draw the perpendicular line from the point of intersection of both the Ogive curves to the X-axis, the point at which it meets the X-axis is the required median. Consider the previous example. Already the less than cumulative frequency is evaluated. Construct the more than cumulative frequency column.

Draw both the less than and more than Ogive curve [LOC & MOC].

The approximate value of median is 35.

4.6 QUARTILES, DECILES AND PERCENTILES

Quartiles are measures like the median. The median divides the series into two equal parts. Extending this concept, the quartiles Q1, Q2 and Q3 divide the series into four equal parts. Among these three, Q2 is nothing but the median.

Similarly, the distribution can be divided into ten equal parts called deciles and into 100 equal parts called percentiles.

The evaluation procedures of quartiles, deciles and percentiles are similar to the evaluation of the median. All the above said statistics have the merits and demerits similar to median. An Ogive curve can be used to locate them.

Quartiles are particularly useful in statistics for calculating quartile deviation and measuring the skewness of a distribution.

Example: 20

Find the quartiles of the following data:

 

68, 50, 32, 21, 54, 38, 59, 66, 44

 

Rewrite the given data in the ascending order.

 

21, 32, 38, 44, 50, 54, 59, 66, 68

 

Here n = 9.

 

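Q1: find [n + 1]/4 = [9 + 1]/4 = 2.5th item

Q1 = average value of the 2nd and 3rd items = [32 + 38]/2 = 35

Q2: find [n + 1]/2 = [9 + 1]/2 = 5th item

Q2 = 50

Q3: find 3 * [n + 1]/4 = 3 * [9 + 1]/4 = 7.5th item

Q3 = average value of the 7th and 8th items = [59 + 66]/2 = 62.5

Hence, the values of the quartiles are Q1 = 35; Q2 = 50 and Q3 = 62.5.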
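
The positional rule of this example can be sketched as follows. The helper `value_at_position` is an illustrative assumption: it interpolates linearly between neighbouring items, which coincides with the simple averaging used above when the position ends in .5. The cross-check uses `statistics.quantiles`, whose default method is based on the same [n + 1] positions.

```python
from statistics import quantiles


def value_at_position(ordered, pos):
    """Value at a 1-based, possibly fractional, position in an ordered list."""
    lower = int(pos)
    frac = pos - lower
    if frac == 0:
        return ordered[lower - 1]
    # Linear interpolation between the two neighbouring items
    return ordered[lower - 1] + frac * (ordered[lower] - ordered[lower - 1])


data = sorted([68, 50, 32, 21, 54, 38, 59, 66, 44])
n = len(data)

q1 = value_at_position(data, (n + 1) / 4)
q2 = value_at_position(data, (n + 1) / 2)
q3 = value_at_position(data, 3 * (n + 1) / 4)
library = quantiles(data, n=4)     # standard-library cross-check

print(f"Q1 = {q1}, Q2 = {q2}, Q3 = {q3}, statistics.quantiles -> {library}")
```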

Example: 21

The following table gives the litter size [number of piglets surviving to 21 days] for each of the 36 sows. Determine all the quartiles of litter size.

Step 1:

The data type is DDF. Based on the given table, construct the cumulative frequency table.

Number of piglets    Frequency [No. of sows]    cf
5                    1                          1
6                    0                          1
7                    2                          3
8                    3                          6
9                    3                          9
10                   9                          18
11                   8                          26
12                   5                          31
13                   3                          34
14                   2                          36
Total                36
 

Step 2:

 

Q1: Find [n + 1]/4 = 37/4 = 9.25

Here, the cumulative frequency just greater than 9.25 is 18.

Q1 is the value of X corresponding to the cumulative frequency 18. Hence, the Q1 of litter size is 10.

Q2: Find [n + 1]/2 = 37/2 = 18.5

Here, the cumulative frequency just greater than 18.5 is 26.

Q2 is the value of X corresponding to the cumulative frequency 26. Hence, the Q2 of litter size is 11.

Q3: Find [n + 1] * [3/4] = 37 * [3/4] = 27.75

Here, the cumulative frequency just greater than 27.75 is 31.

Q3 is the value of X corresponding to the cumulative frequency 31. Hence, the Q3 of litter size is 12.

Example: 22

Find the quartiles of the following distribution:

Construct the cumulative frequency column and find the value of N/4, N/2 and [3/4] * N.

X        f      Cumulative frequency
10       5      5
15       10     15
20       25     40
25       30     70
30       20     90
35       15     105
40       2      107
Total    107

Here n = 107.

Q1: n/4 = 107/4 = 26.75

The value of cumulative frequency just greater than 26.75 is 40.

The value of X corresponding to the cumulative frequency 40 is 20.

Q1 = 20.

Q2: n/2 = 107/2 = 53.5.

The value of cumulative frequency just greater than 53.5 is 70.

The value of X corresponding to the cumulative frequency 70 is 25.

Q2 = 25.

Q3: [3/4] * n = 3 * 107/4 = 80.25

The value of cumulative frequency just greater than 80.25 is 90.

The value of X corresponding to the cumulative frequency 90 is 30.

Q3 = 30.

Hence, the quartiles are Q1 = 20; Q2 = 25 and Q3 = 30.

Example: 23

Calculate the quartiles and the D3 for the following data.

The given data are continuous with uniform class length. Construct the cumulative frequency column. Also find N/4, N/2, 3 * [N/4] and 3 * [N/10].

Difference in years    Frequency    Cumulative frequency
0–5                    449          449
5–10                   705          1154
10–15                  507          1661
15–20                  281          1942
20–25                  109          2051
25–30                  52           2103
30–35                  16           2119
35–40                  4            2123
Total                  2123

Evaluation of Q1

n/4 = 2123/4 = 530.75

The cumulative frequency just greater than 530.75 is 1154.

The corresponding first quartile class is 5–10.

Here l = 5; h = 5; f = 705; cf = 449.

Q1 = l + h * [[n/4 – cf ]/f ]

      = 5 + 5 * [[530.75 – 449]/705] = 5 + 5 * [0.1160] = 5 + 0.58

Q1 = 5.58.

 

Evaluation of Q2

n/2 = 2123/2 = 1061.5

The cumulative frequency just greater than 1061.5 is 1154.

The corresponding second quartile class is 5–10.

Here l = 5; h = 5; f = 705; cf = 449.

Q2 = l + h * [[n/2 – cf]/f]

      = 5 + 5 * [[1061.5 – 449]/705] = 5 + 5 * [0.8688] = 5 + 4.344

Q2 = 9.344.

 

Evaluation of Q3

3n/4 = [3 * 2123]/4 = 1592.25

The cumulative frequency just greater than 1592.25 is 1661.

The corresponding third quartile class is 10–15.

Here l = 10; h = 5; f = 507; cf = 1154

Q3 = l + h * [[3n/4 – cf]/f]

      = 10 + 5 * [[1592.25 – 1154]/507] = 10 + 5 * [0.8644] = 10 + 4.322

Q3 = 14.322.

 

Evaluation of D3

3N/10 = [3 * 2123]/10 = 636.9

The cumulative frequency just greater than 636.9 is 1154.

The corresponding third decile class is 5–10.

Here l = 5; h = 5; f = 705; cf = 449

D3 = l + h * [[3n/10 – cf ]/f ]

     = 5 + 5 * [[636.9 – 449]/705] = 5 + 5 * [0.2665] = 5 + 1.333

D3 = 6.333.

Hence, the required quartiles and deciles are

Q1 = 5.58; Q2 = 9.344; Q3 = 14.322; D3 = 6.333.
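
Since the quartiles and deciles of grouped data all use the same interpolation with a different fraction of n, one function covers them. The name `grouped_quantile` is a hypothetical helper; the sketch assumes continuous class intervals and reproduces the values of Example 23 up to rounding.

```python
def grouped_quantile(bounds, freqs, fraction):
    """Quantile of continuous grouped data: l + h * (fraction * n - cf) / f."""
    n = sum(freqs)
    target = fraction * n
    cf = 0                                    # cumulative frequency before the current class
    for (lower, upper), f in zip(bounds, freqs):
        if cf + f > target:                   # first cumulative frequency just greater than target
            return lower + (upper - lower) * (target - cf) / f
        cf += f
    raise ValueError("fraction must be below 1 for a non-empty table")


# Class intervals and frequencies from Example 23
bounds = [(0, 5), (5, 10), (10, 15), (15, 20),
          (20, 25), (25, 30), (30, 35), (35, 40)]
freqs = [449, 705, 507, 281, 109, 52, 16, 4]

q1 = grouped_quantile(bounds, freqs, 1 / 4)
q2 = grouped_quantile(bounds, freqs, 1 / 2)
q3 = grouped_quantile(bounds, freqs, 3 / 4)
d3 = grouped_quantile(bounds, freqs, 3 / 10)

print(f"Q1 = {q1:.2f}, Q2 = {q2:.3f}, Q3 = {q3:.3f}, D3 = {d3:.3f}")
```

A percentile is obtained in the same way by passing the fraction k/100.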

Example: 24

Evaluate all the quartile values.

Step 1:

The data type is continuous one. Construct the cumulative frequency column.

Class interval [blood glucose]    No. of dogs [fi]    cf
69–76                             6                   6
76–83                             9                   15
83–90                             8                   23
90–97                             3                   26
97–104                            2                   28
104–111                           1                   29
111–118                           2                   31
Total                             31

Q1: We have to find the class interval in which Q1 lies, for which find the value of n/4 = 31/4 = 7.75.

The cumulative frequency just greater than 7.75 is 15, which corresponds to the class [76–83]; this is the Q1 class.

Here,

l     lower limit of the Q1 class

h    length of the class interval

n    total frequency

cf   cumulative frequency of the previous class to the Q1 class

f     frequency of the Q1 class

 

l = 76; h = 7; cf = 6; n/4 = 7.75; f = 9

Q1 = l + h * [[n/4 – cf]/f] = 76 + 7 * [[7.75 – 6]/9] = 76 + 1.3611 = 77.3611

Hence, the Q1 is 77.3611.

Q2: We have to find the class interval in which Q2 lies, for which find the value of n/2 = 31/2 = 15.5.

The cumulative frequency just greater than 15.5 is 23, which corresponds to the class [83–90]; this is the Q2 [median] class.

Here,

l      lower limit of the Q2 class

h     length of the class interval

n     total frequency

cf    cumulative frequency of the previous class to the Q2 class

f      frequency of the Q2 class

 

l = 83; h = 7; cf = 15; n/2 = 15.5; f = 8;

Q2 = l + h * [[n/2 – cf]/f] = 83 + 7 * [[15.5 – 15]/8] = 83 + 0.44 = 83.44

Hence, the Q2 is 83.44.

Q3: We have to find the class interval in which Q3 lies, for which find the value of 3n/4 = [3 * 31]/4 = 23.25.

The cumulative frequency just greater than 23.25 is 26, which corresponds to the class [90–97]; this is the Q3 class.

Here,

l      lower limit of the Q3 class

h     length of the class interval

n     total frequency

cf    cumulative frequency of the previous class to the Q3 class

f      frequency of the Q3 class

 

l = 90; h = 7; cf = 23; n * [3/4] = 23.25; f = 3;

Q3 = l + h * [[3n/4 – cf]/f] = 90 + 7 * [[23.25 – 23]/3] = 90 + 0.58 = 90.58

Hence, the Q3 is 90.58.

4.7 MODE

The value of the variable that occurs most frequently is called the mode. It is the position of greatest density, the predominant or most common value. It is also a positional average.

4.7.1 Discrete Data

Consider the given set of ‘n’ number of discrete values Xi [i = 1, 2,…, n]

 

X1, X2, X3, … , Xn

 

For the discrete series mode is not well defined. The approximate value of the mode can be computed using the following relationship:

 

Mode = 3 * Median – 2 * Mean

4.7.2 Discrete Data with Frequency

Consider a given set of n discrete values Xi with corresponding frequencies fi [i = 1, 2, … , n].

Select the maximum frequency. The value of X corresponding to the maximum frequency is considered to be the mode of the data set.

4.7.3 Continuous Data with Frequency

Consider the given set of data. Verify whether the given class intervals are continuous. If they are not, convert them into continuous intervals by a proper method. After that, verify whether the length of the class intervals is uniform. If it is not, make a proper adjustment for the varied lengths.

Let Li be the lower limit of the ith class interval and Ui be the upper limit of the ith class interval.

Select the maximum frequency. The class interval corresponding to the maximum frequency is considered to be the modal class of the data set.

l    lower limit of the modal class

h   length of the class interval

f0   frequency of the modal class

f1   frequency of the class preceding to the modal class

f2   frequency of the class succeeding to the modal class

 

The value of mode can be computed using the following relationship:

Mode = l + h * [[f0 – f1]/[2 * f0 – f1 – f2]]
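
A minimal sketch of the grouped-data mode is given here. It assumes the commonly used interpolation Mode = l + h * [[f0 – f1]/[2 * f0 – f1 – f2]], with f1 or f2 taken as zero when the modal class is the first or last class; the function name `grouped_mode` is hypothetical, and the data are the dog glucose classes used elsewhere in this chapter.

```python
def grouped_mode(bounds, freqs):
    """Mode of continuous grouped data by interpolation within the modal class."""
    i = freqs.index(max(freqs))                      # first class with the maximum frequency
    lower, upper = bounds[i]
    f0 = freqs[i]
    f1 = freqs[i - 1] if i > 0 else 0                # preceding class (0 if none)
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0   # succeeding class (0 if none)
    denominator = 2 * f0 - f1 - f2
    if denominator == 0:
        raise ValueError("mode is not well defined: neighbouring frequencies are equal")
    return lower + (upper - lower) * (f0 - f1) / denominator


dog_bounds = [(69, 76), (76, 83), (83, 90), (90, 97),
              (97, 104), (104, 111), (111, 118)]
dog_freqs = [6, 9, 8, 3, 2, 1, 2]

dog_mode = grouped_mode(dog_bounds, dog_freqs)
print(f"modal class starts at 76, mode = {dog_mode}")
```

If several classes share the maximum frequency, this sketch simply takes the first one; such a distribution is better described as bimodal or multimodal.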

The relative advantages and disadvantages of mode

The relative advantages of mode are as follows:

  • It is useful in the study of popular sizes.
  • It is simple.
  • It is not affected by the extreme values and can be calculated even if extreme values are unknown.

Example:

A banker can use mode instead of mean to decide the average balances of depositors.

The relative disadvantages of mode are as follows:

  • It is not well defined. Sometimes it is not possible to locate it properly.
  • A distribution may be bimodal or multimodal.
  • It is not suitable for mathematical treatment.

Example: 25

The following are the two-week weight gains [kgs] of six young lambs of the same breed who had been raised on the same diet: 11, 13, 19, 2, 10 and 1.

Find the value of Mode.

First evaluate the values of mean and median. For the given data mean and median values are 9.33 and 10.5, respectively.

The approximate value of mode can be given by the relation:

 

Mode = 3 * Median – 2 * Mean
Mode = 3 * 10.5 – 2 * 9.33 = 12.84

 

The required mode value is 12.84.
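The calculation of Example 25 can be checked with a short Python sketch. The function name empirical_mode is illustrative only and does not come from the text. The sketch keeps the mean unrounded, so it gives about 12.83 rather than the 12.84 obtained above with the mean rounded to 9.33.

```python
from statistics import mean, median

def empirical_mode(values):
    """Approximate mode from the empirical relation Mode = 3*Median - 2*Mean."""
    return 3 * median(values) - 2 * mean(values)

# Two-week weight gains [kg] of the six lambs in Example 25
weight_gains = [11, 13, 19, 2, 10, 1]

approx_mode = empirical_mode(weight_gains)
print(round(mean(weight_gains), 2), median(weight_gains), round(approx_mode, 2))
```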

Example: 26

A researcher applied the carcinogenic [cancer-causing] compound benzo[a]pyrene to the skin of five mice and measured the concentration in the liver tissue after 48 hours. The results [nmol/g] were as follows: 6.3, 5.9, 7, 6.9 and 5.9.

Evaluate the mode value.

Step 1:

Construct the discrete distribution with frequency. The data type is DDF.

Xi       fi
5.9      2
6.3      1
6.9      1
7        1
Total    5

Step 2:

Select the maximum frequency; here it is 2. The value of X corresponding to this frequency is the required mode.

Hence, Mode = M0 = 5.9.
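The same selection can be sketched in Python by counting how often each value occurs. The helper name modes is an assumption made for illustration. It returns every value that shares the maximum frequency, so it also reports bimodal or multimodal data.

```python
from collections import Counter

def modes(values):
    """Return the values with the highest frequency, and that frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top), top

# Liver concentrations [nmol/g] of the five mice in Example 26
concentrations = [6.3, 5.9, 7, 6.9, 5.9]

mode_values, max_freq = modes(concentrations)
print(mode_values, max_freq)
```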

Example: 27

The following table gives the litter size [number of piglets surviving to 21 days] for each of the 36 sows.

Determine the mode of litter size.

The data type is DDF.

Select the maximum frequency; here it is 9. The value of X corresponding to this frequency is the required mode.

Hence, Mode = M0 = 10.

Example: 28

Evaluate the value of mode.

Step 1:

The data type is continuous.

Class interval [blood glucose]    No. of dogs [fi]
69–76                             6
76–83                             9
83–90                             8
90–97                             3
97–104                            2
104–111                           1
111–118                           2
Total                             31

Select the maximum frequency; here it is 9. This tells us that the modal class is [76–83].

Here,

l     lower limit of the modal class

h    length of the class interval

n    total frequency

f1    frequency of the class preceding the modal class

f0    frequency of the modal class

f2    frequency of the class succeeding the modal class

Here, f1 = 6; f0 = 9; f2 = 8; l = 76; h = 7.

The value of mode can be computed using the following relationship:

Mode = l + h * [[f0 – f1]/[2 * f0 – f1 – f2]]
Mode = 76 + 7 * [[9 – 6]/[2 * 9 – 6 – 8]] = 76 + 7 * [3/4] = 76 + 5.25 = 81.25

Hence, the mode is 81.25.
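The steps of Example 28 can be sketched in Python as follows. The function name grouped_mode and the choice to treat a missing neighbouring class as frequency 0 are assumptions of this sketch, not part of the text. It expects continuous classes of equal length and does not guard against a zero denominator, which arises when f0 = f1 = f2.

```python
def grouped_mode(classes):
    """Mode of grouped data given as (lower, upper, frequency) tuples.

    Applies Mode = l + h * (f0 - f1) / (2*f0 - f1 - f2) to the first
    class that has the maximum frequency.
    """
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))                      # position of the modal class
    lower, upper, f0 = classes[i]
    h = upper - lower                                # length of the class interval
    f1 = freqs[i - 1] if i > 0 else 0                # preceding class
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0   # succeeding class
    return lower + h * (f0 - f1) / (2 * f0 - f1 - f2)

# Blood glucose classes and number of dogs from Example 28
glucose = [(69, 76, 6), (76, 83, 9), (83, 90, 8), (90, 97, 3),
           (97, 104, 2), (104, 111, 1), (111, 118, 2)]

glucose_mode = grouped_mode(glucose)
print(glucose_mode)
```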

Example: 29

Number of aphids observed per clover plant. A frequency table grouping the data of above problem is as follows:

Number of aphids on a plant    Number of plants observed
0–3                            6
4–7                            17
8–11                           40
12–15                          54
16–19                          59
20–23                          75
24–27                          77
28–31                          55
32–35                          32
36–39                          8
Total                          423

Find the value of mode.

Step 1:

The given class intervals are not continuous, but they have uniform length.

The gap between the upper limit of one class and the lower limit of the next is uniform and its value is 1. Half of this gap is [1/2], that is, 0.5.

Step 2:

Subtract 0.5 from the lower limit and add 0.5 to the upper limit of each class interval.

The data type is now continuous.

Number of aphids on a plant    Number of plants observed
–0.5–3.5                       6
3.5–7.5                        17
7.5–11.5                       40
11.5–15.5                      54
15.5–19.5                      59
19.5–23.5                      75
23.5–27.5                      77
27.5–31.5                      55
31.5–35.5                      32
35.5–39.5                      8

Select the maximum frequency; here it is 77. This tells us that the modal class is [23.5–27.5].

Here,

l     lower limit of the modal class

h    length of the class interval

n    total frequency

f1    frequency of the class preceding the modal class

f0    frequency of the modal class

f2    frequency of the class succeeding the modal class

Here, f1 = 75; f0 = 77; f2 = 55; l = 23.5; h = 4.

The value of mode can be computed using the following relationship:

Mode = l + h * [[f0 – f1]/[2 * f0 – f1 – f2]]
Mode = 23.5 + 4 * [[77 – 75]/[2 * 77 – 75 – 55]] = 23.5 + 4 * [2/24] = 23.5 + 0.33 = 23.83

Hence, the mode is 23.83.
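The conversion to continuous class intervals in Example 29 can be sketched in Python. The helper name to_continuous is illustrative, and the sketch assumes that the gap between neighbouring classes is the same throughout, as it is here.

```python
def to_continuous(classes):
    """Widen inclusive classes (lower, upper, f) by half the gap between neighbours."""
    half_gap = (classes[1][0] - classes[0][1]) / 2
    return [(lo - half_gap, hi + half_gap, f) for lo, hi, f in classes]

# Aphids per clover plant from Example 29
aphids = [(0, 3, 6), (4, 7, 17), (8, 11, 40), (12, 15, 54), (16, 19, 59),
          (20, 23, 75), (24, 27, 77), (28, 31, 55), (32, 35, 32), (36, 39, 8)]

continuous = to_continuous(aphids)
freqs = [f for _, _, f in continuous]
i = freqs.index(max(freqs))              # modal class (an interior class here)
l, upper, f0 = continuous[i]
h = upper - l
f1, f2 = freqs[i - 1], freqs[i + 1]
mode = l + h * (f0 - f1) / (2 * f0 - f1 - f2)
print(continuous[0][:2], (l, upper), round(mode, 2))
```

The adjustment changes only the class boundaries, so the frequencies and their total of 423 stay as given.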

Example: 30

Find the mode for the data given below by using the method of grouping.

The given data contains two modes [bimodal distribution], because sizes 4 and 11 both have the highest frequency, 63. So, both can be considered as the mode. To decide which one is more representative of the distribution, group the frequencies as follows, taking the size and frequency columns as the first two columns:

  • Create column-3 by adding the frequencies in successive pairs, starting from the first one.
  • Create column-4 by adding the frequencies in successive pairs, starting from the second one.
  • Create column-5 by adding the frequencies in successive groups of three, starting from the first one.
  • Create column-6 by adding the frequencies in successive groups of three, starting from the second one.
  • Create column-7 by adding the frequencies in successive groups of three, starting from the third one.

In each column, mark the group with the maximum total. Size 4 appears in the maximum number of marked groups. This implies that the modal size is 4.

Example: 31

The expenditure of 100 families is given below:

Mode of the distribution is 24. Calculate the missing frequencies.

The given class intervals are continuous and are in uniform length. Let x and y be the unknown frequencies of classes 10–20 and 30–40, respectively.

Here h = 10; mode = 24 and n = 100.

The modal class is 20–30.

Expenditure    No. of families
0–10           14
10–20          x ➙ f1
20–30          27 ➙ f0
30–40          y ➙ f2
40–50          15
Total          56 + x + y

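Here n = 100, which implies that 100 = 56 + x + y, that is, x + y = 100 – 56 = 44.

Then                      x + y = 44                                        [1]

By definition,

Mode = l + h * [[f0 – f1]/[2 * f0 – f1 – f2]]                               [2]

Here                            l = 20; f0 = 27; f1 = x; f2 = y.

Using the values in Equation [2],

24 = 20 + 10 * [[27 – x]/[54 – x – y]]

4 * [54 – x – y] = 10 * [27 – x], that is, 3x – 2y = 27                     [3]

2 * [1] + [3] implies that 5x = 115; x = 23.

Using the value of x in [1], we have

23 + y = 44, which implies that y = 44 – 23 = 21; y = 21.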

Hence, the missing frequencies are

Expenditure    Number of families
10–20          23
30–40          21
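The result of Example 31 can be verified with a small Python sketch. The function name missing_frequencies and its parameter names are assumptions made for illustration. Because x + y is known, the denominator 2 * f0 – x – y of the mode formula is known as well, so x follows directly.

```python
from fractions import Fraction

def missing_frequencies(mode, l, h, f0, known_total, n):
    """Solve for x = f1 and y = f2 around the modal class.

    Uses x + y = n - known_total together with
    mode = l + h * (f0 - x) / (2*f0 - x - y).
    """
    s = n - known_total                 # x + y
    ratio = Fraction(mode - l, h)       # equals (f0 - x) / (2*f0 - x - y)
    x = f0 - ratio * (2 * f0 - s)
    y = s - x
    return x, y

# Example 31: mode = 24, modal class 20-30, known frequencies 14 + 27 + 15 = 56
x, y = missing_frequencies(mode=24, l=20, h=10, f0=27, known_total=56, n=100)
print(x, y)
```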

Example: 32

A welfare organization introduced an education scholarship scheme for the school-going children of a backward village. The rates of scholarships were fixed as given below:

Age group [in years]    Amount of scholarship per month
5–7                     30
8–10                    40
11–13                   50
14–16                   60
17–19                   70

The ages [years] of 30 school-going children are noted as 11, 8, 10, 5, 7, 12, 7, 17, 5, 13, 9, 8, 10, 15, 7, 12, 6, 7, 8, 11, 14, 18, 6, 13, 9, 10, 6, 15, 13 and 5, respectively. Calculate the mean monthly scholarship. Find the total monthly scholarship amount being paid to the students.

Construct the frequency distribution based on the given data.

Age group [in years]    Amount [X]    No. of children [f]    f * X
5–7                     30            10                     300
8–10                    40            8                      320
11–13                   50            7                      350
14–16                   60            3                      180
17–19                   70            2                      140
Total                                 30                     1290

Mean = 1290/30 = 43.

The average monthly scholarship is 43.

Total monthly scholarship paid to the students is 1290.
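The frequency distribution and the two answers of Example 32 can be reproduced with the following Python sketch. The names rates and amount_for are illustrative and not from the text.

```python
ages = [11, 8, 10, 5, 7, 12, 7, 17, 5, 13, 9, 8, 10, 15, 7,
        12, 6, 7, 8, 11, 14, 18, 6, 13, 9, 10, 6, 15, 13, 5]

# (lowest age, highest age, monthly amount) taken from the scholarship table
rates = [(5, 7, 30), (8, 10, 40), (11, 13, 50), (14, 16, 60), (17, 19, 70)]

def amount_for(age):
    """Look up the monthly scholarship for one child."""
    for low, high, amount in rates:
        if low <= age <= high:
            return amount
    raise ValueError(f"age {age} is outside the scholarship table")

counts = [sum(low <= a <= high for a in ages) for low, high, _ in rates]
total = sum(amount_for(a) for a in ages)
mean_amount = total / len(ages)
print(counts, total, mean_amount)
```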

Example: 33

Calculate mean, median and mode from the following data:

The given table has open-ended classes. First, convert it into a table with uniform class intervals. Let h = 10. Find the mid values of the class intervals. Select the values of A and h.

 

Mean = 55 + 10 * [2748/2140] = 55 + 10 * 1.284 = 55 + 12.84 = 67.84. Mean = 67.84

Median:

 

n = 2140, n/2 = 2140/2 = 1070.

 

Cumulative frequency just greater than 1070 is 1081.

Hence, the median class is [60–70].

Here, l = 60; h = 10; cf = 748; f = 333.

Median = l + h * [[n/2 – cf]/f] = 60 + 10 * [[1070 – 748]/333] = 60 + 10 * 0.967 = 69.67

Median = 69.67

Mode:

Maximum frequency is 358, hence the modal class is [80–90].

Here, l = 80; f0 = 358; f1 = 351; f2 = 350 and h = 10

Mode = l + h * [[f0 – f1]/[2 * f0 – f1 – f2]] = 80 + 10 * [[358 – 351]/[2 * 358 – 351 – 350]] = 80 + 10 * [7/15] = 80 + 10 * 0.4667 = 84.67

Mode = 84.67
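The median step of Example 33 can be checked with a short Python sketch. The function name grouped_median is an assumption for illustration. The values of l, h, cf and f are taken from the working above, since the full table is not reproduced here.

```python
def grouped_median(l, h, n, cf, f):
    """Median of grouped data: l + h * (n/2 - cf) / f.

    l  : lower limit of the median class
    h  : length of the class interval
    n  : total frequency
    cf : cumulative frequency of the class preceding the median class
    f  : frequency of the median class
    """
    return l + h * (n / 2 - cf) / f

median_33 = grouped_median(l=60, h=10, n=2140, cf=748, f=333)
print(round(median_33, 2))
```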

Example: 34

The marks obtained by 10 students in a semester examination in a particular paper are 70, 65, 68, 70, 75, 73, 80, 70, 83 and 86. Find the arithmetic mean, mode and median.

Construct the discrete frequency distribution by considering the given data.

Marks [x]    f     cf    fx
65           1     1     65
68           1     2     68
70           3     5     210
73           1     6     73
75           1     7     75
80           1     8     80
83           1     9     83
86           1     10    86
Total        10          740

Mean:

To find the average mark secured by the students, create a column [fx]. Mean = 740/10 = 74 marks.

Median:

To find the median, evaluate n/2: n/2 = 10/2 = 5.

Cumulative frequency just greater than 5 is 6.

Median is the value of x corresponding to the cumulative frequency 6.

Here it is 73. Median = 73 marks.

Mode: Find the maximum frequency among fi.

The maximum frequency is 3, which corresponds to 70.

Mode = 70 marks. Hence, mean = 74 marks, median = 73 marks and mode = 70 marks.
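The three measures of Example 34 can be reproduced with the following Python sketch, which follows the cumulative-frequency rule used above for the median. The variable names are illustrative. Note that averaging the two middle ordered values, 70 and 73, would instead give 71.5, so the result depends on which convention is adopted.

```python
from collections import Counter
from itertools import accumulate

marks = [70, 65, 68, 70, 75, 73, 80, 70, 83, 86]

# Discrete frequency distribution as sorted (x, f) pairs
freq = sorted(Counter(marks).items())
n = sum(f for _, f in freq)

mean_mark = sum(x * f for x, f in freq) / n

# Median: value of x whose cumulative frequency is just greater than n/2
cum = list(accumulate(f for _, f in freq))
median_mark = next(x for (x, _), c in zip(freq, cum) if c > n / 2)

# Mode: value of x with the maximum frequency
mode_mark = max(freq, key=lambda pair: pair[1])[0]

print(mean_mark, median_mark, mode_mark)
```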

4.7.4 Graphical Method to Evaluate the Mode

The graphical method can be used to evaluate the mode if and only if the given data set follows a continuous distribution with uniform class length.

Example: 35

Evaluate mode using graphical method

Since the given data form a continuous distribution and the lengths of the class intervals are uniform, draw the histogram.

The approximate value of the mode is 2353.

4.8 COMPARISON OF MEAN, MEDIAN AND MODE

Which average best suits a specific variable depends on many factors, and certainly on the data level. The following table summarizes the valid averages for each level of data:

Data level    Averages that can be evaluated
Nominal       Mode
Ordinal       Mode, median
Interval      Mode, median and mean
Ratio         Mode, median and mean

It is often convenient to talk about the shape of the distribution. A symmetrical distribution with one mode is commonly described as a bell-shaped curve. For such a distribution, all three measures, mean, median and mode, are exactly equal in value,

i.e. Mean = Median = Mode.

If a distribution is not symmetrical, it is called an asymmetrical or skewed distribution. If the distribution is moderately asymmetrical, the following relationship holds approximately:

 

Mode = 3 * Median – 2 * Mean [or]

Mean – Mode = 3 * [Mean – Median]

 

The above-mentioned relation is called the empirical relation. Using the empirical relation, if any two of the measures are known, then the third one can be evaluated approximately.

Example: 36

Calculate the arithmetic mean and median for the following distribution. Also, find the mode value using the empirical relationship.

Consider the given open-end intervals and convert them into continuous class intervals with uniform length.

Evaluate the mean and median:

The mean value is 236 and the median value is 232.14.

The value of mode can be obtained using the empirical relation.

Mode = 3 * Median – 2 * Mean = 3 * 232.14 – 2 * 236 = 696.42 – 472 = 224.42

Hence, the approximate mode value is 224.42.

Example: 37

Calculate the mean and median of the following series, and find the value of mode using the empirical relation.

Size of holdings [hectares]    Number of farmers
2.5–3.5                        1000
3.5–4.5                        2300
4.5–5.5                        3600
5.5–6.5                        2400
6.5–7.5                        1700
7.5–8.5                        3000
8.5–9.5                        500

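Consider the given data. Clearly, the given distribution is continuous and has uniform class length. Construct the mid-value column, the cumulative frequency column and the columns containing d and fd.

Here, A = 6, h = 1 and d = [X – A]/h.

Mean = A + h * [Σfd/n] = 6 + 1 * [–2000/14500] = 6 – 0.138 = 5.862 hectares.

n = 14500; n/2 = 7250.

Cumulative frequency just greater than 7250 is 9300.

The median class is 5.5–6.5.

l = 5.5; h = 1; cf = 6900; f = 2400.

Median = l + h * [[n/2 – cf]/f] = 5.5 + 1 * [[7250 – 6900]/2400] = 5.5 + 0.146 = 5.646 hectares.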

The value of mode can be obtained using the empirical relation.

 

Mode = 3 * Median – 2 * Mean = 3 * 5.646 – 2 * 5.862 = 5.214 hectares.

 

Hence,

Mean = 5.862 hectares

Median = 5.646 hectares and

Mode = 5.214 hectares.
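The whole of Example 37 can be sketched in Python as follows, using the step-deviation method for the mean with A = 6 and h = 1 as above. The variable names are illustrative. The sketch works with unrounded values, so the mode comes out as about 5.213 rather than the 5.214 obtained from the rounded mean and median.

```python
# Size of holdings [hectares] as (lower, upper, number of farmers)
classes = [(2.5, 3.5, 1000), (3.5, 4.5, 2300), (4.5, 5.5, 3600), (5.5, 6.5, 2400),
           (6.5, 7.5, 1700), (7.5, 8.5, 3000), (8.5, 9.5, 500)]
A, h = 6, 1    # assumed mean and class length used in the text

n = sum(f for _, _, f in classes)

# Step-deviation mean: A + h * sum(f * d) / n with d = (mid value - A) / h
sum_fd = sum(f * ((lo + hi) / 2 - A) / h for lo, hi, f in classes)
mean = A + h * sum_fd / n

# Median: first class whose cumulative frequency exceeds n/2
cf = 0
for lo, hi, f in classes:
    if cf + f > n / 2:
        median = lo + (hi - lo) * (n / 2 - cf) / f
        break
    cf += f

# Mode from the empirical relation
mode = 3 * median - 2 * mean
print(round(mean, 3), round(median, 3), round(mode, 3))
```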

4.9 WEIGHTED ARITHMETIC MEAN

The weighted arithmetic mean is the average evaluated after applying weights to the items according to their relative importance. It becomes essential whenever the items in the group are not exactly homogeneous.

Example: 38

Assume that the grade representing a semester of work is based on one final examination and two one-hour examinations.

The weights attached should strictly reflect the relative importance of the items. The weights should be rounded off for easy evaluation. When the average is weighted, the importance of all the items is taken into account. The ordinary average is a special case of the weighted average: if every weight is taken as one, the weighted mean becomes the ordinary mean.

Example: 39

A cultivator with a coconut farm grows 20 hybrid coconut trees, and their yield [number] is given as follows:

The Government gives incentives of 20, 25, 30, 40 and 50 for the coconut trees in the respective yield groups: exceeding 100 but not exceeding 120, exceeding 120 but not exceeding 140, and so on up to exceeding 180 but not exceeding 200. Find the total incentive received by the cultivator and also the average incentive per coconut tree.

Average incentive = 650/20 = 32.50

Total incentive received by the cultivator is 650.

Average incentive per coconut tree is 32.50.

4.9.1 Advantages of the Weighted Mean

The weighted mean is used in the following instances.

  • It is used in constructing index numbers. The relative weights of the expenditure such as food, clothing, housing etc. are obtained by surveys and the cost of living index is calculated with those weights.
  • In the educational institutions, it is used to assess the real merit of the student.
  • It is used in evaluating standardized death rates.

4.10 GEOMETRIC MEAN

The geometric mean is the nth root of the product of n items.

GM = [X1 * X2 * … * Xn]^[1/n]

It is a mathematical average and not a positional average. It takes all the given values into account in its evaluation. It gives less weight to the end values than the arithmetic mean. For positive values it never exceeds the arithmetic mean and is usually less than it. If any one of the elements takes the value 0, then GM = 0, and if any one of the elements is negative, the GM may be imaginary or meaningless. So, it is not useful except for certain special situations.

Note:

GM can be used in the following situations:

  • to calculate rates of change
  • in certain cases of averaging ratios and percentages
  • in problems involving rates of interest on invested money
  • to interpolate between items that have a uniform rate of change
  • in the evaluation of index numbers.

Example: 40

A sum of money was invested for five years. The average rates of return for the investment for the five successive years were as follows:

 

5%, 4%, 5%, 6%, 3%

 

What was the average rate of interest for the five years?

Assume that the amount invested at the start of each year is 100. The amounts at the end of each of the five years are then 105, 104, 105, 106 and 103 [e.g. 5% gives 100 + 5 = 105]. Here n = 5.

GM = [105 * 104 * 105 * 106 * 103]^[1/5] = 104.5950 = 104.6

This implies that the average rate of return is [104.6 – 100] = 4.6% per year.
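The calculation of Example 40 can be checked in Python with the standard library function statistics.geometric_mean, which is available from Python 3.8. The variable names are illustrative.

```python
from statistics import geometric_mean

# Yearly rates of return [%] from Example 40
rates = [5, 4, 5, 6, 3]

# Amount at the end of each year for 100 invested at the start of that year
amounts = [100 + r for r in rates]

gm = geometric_mean(amounts)
average_rate = gm - 100
print(round(gm, 4), round(average_rate, 1))
```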

Example: 41

Find the geometric mean for the data given.

Items in cost of living    Price relative [X]    Weight [W]
Food                       128.8                 60
Clothing                   175.6                 20
House rent                 110.0                 10
Miscellaneous              210.0                 10
Total                                            100

Here n = 100, the total of the weights. The weighted geometric mean is

GM = antilog [Σ [W * log X]/Σ W]

Hence, the required geometric mean is 141.65.
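The weighted geometric mean of Example 41 can be sketched in Python through logarithms: take the weighted average of log X and then the antilog. The dictionary layout and variable names are assumptions made for illustration.

```python
import math

# Item: (price relative X, weight W) from Example 41
items = {
    "Food": (128.8, 60),
    "Clothing": (175.6, 20),
    "House rent": (110.0, 10),
    "Miscellaneous": (210.0, 10),
}

total_weight = sum(w for _, w in items.values())

# Weighted average of log10(X), then antilog
mean_log = sum(w * math.log10(x) for x, w in items.values()) / total_weight
weighted_gm = 10 ** mean_log
print(total_weight, round(weighted_gm, 2))
```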

4.11 HARMONIC MEAN

HM is also a mathematical average. It is the reciprocal of the average of the reciprocals of the values.

HM = n/[1/X1 + 1/X2 + … + 1/Xn]

Merits

  • It is used in the averaging of time rates and in manipulation of price data.
  • It is capable of algebraic manipulation.
  • It is of lower value than geometric and arithmetic means.

Demerits

  • It is difficult to calculate.
  • It is not easy to understand.
  • When one of the items is 0, it becomes indeterminate.

Example: 42

A teacher finds that 3 students X, Y, Z take 6, 3 and 8 minutes, respectively, to solve a problem.

Compute the average rate of solving the problem.

Given

Time for solving the problem by X = 6 minutes.

Time for solving the problem by Y = 3 minutes.

Time for solving the problem by Z = 8 minutes.

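Here n = 3.

HM = n/[1/X1 + 1/X2 + 1/X3] = 3/[1/6 + 1/3 + 1/8] = 3/[15/24] = 72/15 = 4.8

Hence, the average time taken to solve the problem is 4.8 minutes.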
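The harmonic mean of the three solving times in Example 42 can be checked in Python. The sketch uses the standard library function statistics.harmonic_mean and, for comparison, the definition written out directly. The variable names are illustrative.

```python
from statistics import harmonic_mean

# Minutes taken by students X, Y and Z to solve the problem
times = [6, 3, 8]

hm = harmonic_mean(times)

# The same value from the definition: n divided by the sum of reciprocals
hm_by_definition = len(times) / sum(1 / t for t in times)
print(round(hm, 2), round(hm_by_definition, 2))
```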

EXERCISES
  1. Evaluate all the statistical measures.
  2. Find all the statistical measures:
  3. Determinations of the amount of phosphorus in leaves.

    Phosphorus [mg/g of leaf]    Frequency [i.e. No. of determinations]
    8.15–8.25                    2
    8.25–8.35                    6
    8.35–8.45                    8
    8.45–8.55                    11
    8.55–8.65                    17
    8.65–8.75                    17
    8.75–8.85                    24
    8.85–8.95                    18
    8.95–9.05                    13
    9.05–9.15                    10
    9.15–9.25                    4

    Evaluate all the measures of central tendency.

  4. Find mean, median, mode and all the quartiles:

    Life [No. of years]    No. of animals
    0–2                    5
    2–4                    16
    4–6                    13
    6–8                    7
    8–10                   5
    10–12                  4
  5. A palaeontologist measured the width [in mm] of the last upper molar in 36 specimens of the extinct mammal Acropithecus rigidus. The results were as follows:
    1. Construct a frequency distribution [refer the page number 3.19].
    2. Evaluate all the measures of central tendency.
  6. In a study of schizophrenia, researchers measured the activity of the enzyme monoamine oxidase [MAO] in the blood platelets of 18 patients. The results were as follows:

    Evaluate all the measures of central tendencies.

  7. Find all the averages for the following data:
  8. The following data refer to the number of eggs laid by 10 lizards in a season. Find the mean and median.
  9. A sample from a population of butterfly wing lengths.

    Xi [cm]    Xi [cm]
    3.3        4
    3.4        4
    3.6        4
    3.6        4.1
    3.7        4.1
    3.8        4.1
    3.8        4.2
    3.8        4.2
    3.9        4.3
    3.9        4.3
    3.9        4.4
    4          4.5

    Find the average wing length.

  10. The life in days of 100 rats is distributed as follows:

    Evaluate median and mode.

  11. Consider the weights gained by 100 fishes in a lab test. Evaluate all the measures of central tendency.
  12. Evaluate the values of mean and median for the following data:
  13. Monthly growth rates of a sheep farm in months are 7%, 8.5%, 4.0%, 12%, 15% and 13%, respectively. What is the compound rate of growth of output per month for the period?
  14. As part of a classic experiment on mutations, ten aliquots of identical size were taken from the same culture of the bacterium E. coli. For each aliquot, the number of bacteria resistant to a certain virus was determined. The results were as follows:

    14, 15, 13, 21, 15, 14, 26, 16, 20 and 13. Evaluate all the MCTs.

  15. The following is the frequency tabulation of the weights of eggs [in mg] of a butterfly.

    x          f
    185–195    2
    195–205    1
    205–215    3
    215–225    4
    225–235    5
    235–245    6
    245–255    4
    255–265    3
    265–275    2
    275–285    1

    Evaluate the value of mean and median.

  16. The weight gains of beef steers were measured over a 140-day test period. The Average daily gains [lb/day] of 9 steers on the same diet were as follows: 3.89, 3.51, 3.97, 3.31, 3.21, 3.36, 3.67, 3.24 and 3.27.

    Determine the mean and median.

  17. Listed in increasing order are the serum creatine phosphokinase [CK] levels [U/L] of 36 healthy men: Evaluate all the MCTs.
  18. To study the spatial distribution of Japanese beetle larvae in the soil, researchers divided a 12 x 12–foot section of a cornfield into 144 one-foot squares. They counted the number of larvae Y in each square, with the results shown in the following table:

    Find all the statistical measures.

  19. One measure of physical fitness is maximal oxygen uptake, which is the maximum rate at which a person can consume oxygen. A treadmill test was used to determine the maximal oxygen uptake of nine college women before and after participation in a ten-week program of vigorous exercise. The accompanying table shows the before and after measurements and the change [after – before]; all values are in mL O2 per min per kg body weight. Determine the mean and median values.
    Maximal oxygen uptake
  20. Find all the measures for the life expectancy of two hypothetical species of birds in captivity.

    Species A Xi [mo]    Species B Xi [mo]
    34                   34
    36                   36
    37                   37
    39                   39
    40                   40
    41                   41
    42                   42
    43                   43
    79                   44
                         45
  21. Calculate the mean from the following data:
  22. Consider the following data which are amino acid concentrations [mg/100 ml] in arthropod haemolymph:

    240, 238, 236, 245, 242, 248 and 237.

    Find the value of mean.

  23. Two female cockroaches laid oothecae having the following weights [mg]. To find out the average weight laid, sample oothecae [egg cases] are selected and weighed accurately. The weights of the egg cases laid by two cockroaches are given in the following table. Evaluate the averages.
  24. Calculate the mean and the median of the distribution of the values of 140 fruits given in the following table.

    X        f
    10       3
    25       8
    30       14
    36       18
    40       27
    44       23
    50       22
    55       17
    60       7
    Total    140
  25. The life expectancy [in months] of 212 catla fishes are given as follows. Calculate the mean.
  26. The body weights, [in grams] collected from a population of rats, are:

    66.1, 77.1, 74.6, 61.8 and 71.5.

    Compute the value of average.

  27. The length of 60 fruits of chilli varieties are given as follows.

    Calculate the mean value.

  28. Calculate the mean.
  29. The lengths of 200 parasites in the human blood were each measured to the nearest micron and are given in the following table. Calculate all the MCTs.

    Length    Frequency
    80–89     2
    70–79     2
    60–69     6
    50–59     20
    40–49     56
    30–39     40
    20–29     42
    10–19     32

    [Modify the class intervals into continuous ones and proceed: 10–19 becomes 9.5–19.5; … ; 80–89 becomes 79.5–89.5.]

    Length       Frequency
    9.5–19.5     32
    19.5–29.5    42
    29.5–39.5    40
    39.5–49.5    56
    49.5–59.5    20
    59.5–69.5    6
    69.5–79.5    2
    79.5–89.5    2
  30. Three researchers A, B and C are rearing fishes in three different units of a laboratory. The mean weight [in grams] per month and SD [in grams] in each section of the laboratory are given below. Calculate the mean weight of all the fishes taken together.

    Researchers    No. of fishes reared    Mean weight [in grams]
    A              50                      113
    B              60                      120
    C              70                      115
  31. The height [in cms] of 12 wheat plants and the number of tillers are given as follows:
  32. In two laboratories X and Y engaged in the same research institution, the average monthly food consumption [in kg] on white leghorn and standard deviations is given in the following table:

    Laboratory    Average consumption [in kg]    No. of white leghorn
    X             24.5                           450
    Y             32.5                           525
  33. The lengths of 466 microfilariae in pleural blood, each measured to the nearest micron, are given below:

    Calculate the mean of this distribution.

  34. The following data show the estimation of eggs produced by spiders during summer and monsoon season. Calculate the mean weight and standard deviation of all the spiders taken together.

    Seasons    No. of spiders    Mean weight
    Monsoon    50                225
    Summer     25                185
  35. The arithmetic mean and standard deviation of the weights of 25 fishes were calculated as 20 grams and 5 grams, respectively. But while weighing them, a fish whose weight was 13 grams was misread as 30 grams. Find the correct arithmetic mean.
  36. Number of aphids observed per clover plant. A frequency table of discrete data is as follows:

    Total number of observations = 424. Find the arithmetic mean.

  37. Number of aphids observed per clover plant. A frequency table grouping the data of above problem: Find the average value.

    Number of aphids on a plant    Number of plants observed
    0–3                            6
    4–7                            17
    8–11                           40
    12–15                          54
    16–19                          59
    20–23                          75
    24–27                          77
    28–31                          55
    32–35                          32
    36–39                          8

    Total number of observations = 424

  38. Determinations of the amount of phosphorus in leaves. A frequency table of continuous data. Find the average value.

    Phosphorus [mg/g of leaf]    Frequency [i.e. no. of determinations]
    8.15–8.25                    2
    8.25–8.35                    6
    8.35–8.45                    8
    8.45–8.55                    11
    8.55–8.65                    17
    8.65–8.75                    17
    8.75–8.85                    24
    8.85–8.95                    18
    8.95–9.05                    13
    9.05–9.15                    10
    9.15–9.25                    4

ANSWER THE QUESTIONS
  1. Statistical methods are needed for_________________and_________________the collected numerical data.
  2. The_________________is computed in order to reduce the complexity of the data collected.
  3. The average is otherwise called_________________.

    (a) mean     (b) measures of central tendency     (c) (a) or (b)     (d) none

  4. The total number of different kinds of averages is_________________.
  5. State the different kinds of averages.
  6. The data can be classified into_________________varieties.
  7. State the different kinds of data classification.
  8. Write down the formulas for evaluating the mean for the following types of data discrete, discrete with frequency and continuous with frequency.
  9. The sum of all the deviations of the observations from the mean is_________________.

    (a) > 0     (b) < 0     (c) 0     (d) None

  10. The sum of the squared deviations of the observations from the mean is_________________.

    (a) Minimum (b) Maximum (c) None

  11. When can the geometric mean be used?
  12. ‘The median value communicates that 50% of the values are above and 50% values are below it.’ – Comment on this statement.
  13. The value of median can be evaluated using graphical method.

    (a) True     (b) False     (c) None

  14. Define the terms quartiles, percentiles and deciles.
  15. Differentiate the discrete and continuous data set.
  16. The_________________mean is the average evaluated after applying weights to the item as judged by their relative importance.

    (a) Mean     (b) Weighted arithmetic mean     (c) None

  17. Median and mode are_________________average.
  18. Median value = Second quartile.

    (a) True     (b) False     (c) None

ANSWERS
  1. Summarizing and describing
  2. Average
  3. (a) or (b)
  4. 5
  5. Arithmetic mean, median, mode, geometric mean and harmonic mean
  6. 3
  7. Refer Section 4.3
  8. Refer Sections 4.3.1, 4.3.2 and 4.3.3
  9. 0
  10. Minimum
  11. Refer Section 4.10
  12. True
  13. True
  14. Refer Section 4.6
  15. Refer Section 4.3
  16. Weighted arithmetic mean
  17. Positional
  18. True