Chapter 6: Skewness, Moments and Kurtosis – Biostatistics

Chapter 6

Skewness, Moments and Kurtosis

Objectives

After completing this chapter, you will understand the following:

  • The definition, meaning and significance of skewness, moments and kurtosis.
  • The method of evaluation of skewness, moments and kurtosis.
  • How these measures can be used to make decisions related to biological studies.
6.1 INTRODUCTION

Skewness refers to the lack of symmetry of a distribution. The curve of a symmetrical distribution typically looks bell-shaped. If a distribution is not symmetrical, it is called a skewed distribution. For a symmetrical, single-peaked distribution the mean, median and mode are exactly the same. Any difference among the values of the three measures indicates that the corresponding distribution is skewed [not symmetric].

In probability theory and statistics, kurtosis [from the Greek word κυρτός, kyrtos or kurtos, meaning bulging] is a measure of the ‘peakedness’ of the probability distribution of a real-valued random variable.

6.2 DISPERSION AND SKEWNESS

A measure of dispersion shows the scatter of the items, that is, how much each item differs from the mean, whereas skewness measures how far the distribution departs from symmetry.

Measure of skewness

Karl Pearson’s measure of skewness for a skewed distribution

 

Sk = [Mean – Mode]/SD

 

For a moderately skewed distribution

 

Sk = 3 * [Mean – Median]/SD

 

Bowley’s measure of skewness

 

Sk = [Q3 – 2 * Q2 + Q1]/[Q3 – Q1]

Either measure can be used to evaluate skewness. If the frequency distribution has open-end classes, Bowley’s measure is the best one, because it depends only on the quartiles and not on the mean or SD.
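The three formulas of this section can be written as small functions. This is only a sketch: the function names are illustrative, the Bowley function uses the standard quartile form [Q3 – 2 * Q2 + Q1]/[Q3 – Q1], and the numbers passed in are hypothetical summary values, not data from this chapter.

```python
def pearson_mode_skewness(mean, mode, sd):
    """Karl Pearson's measure: [Mean - Mode]/SD."""
    return (mean - mode) / sd


def pearson_median_skewness(mean, median, sd):
    """Pearson's measure for a moderately skewed distribution: 3 * [Mean - Median]/SD."""
    return 3 * (mean - median) / sd


def bowley_skewness(q1, q2, q3):
    """Bowley's measure from the three quartiles: [Q3 - 2*Q2 + Q1]/[Q3 - Q1]."""
    return (q3 - 2 * q2 + q1) / (q3 - q1)


# Hypothetical summary values, chosen only to illustrate the calls.
print(pearson_mode_skewness(mean=50.0, mode=45.0, sd=10.0))      # 0.5
print(pearson_median_skewness(mean=50.0, median=48.0, sd=10.0))  # 0.6
print(bowley_skewness(q1=40.0, q2=48.0, q3=60.0))                # 0.2
```

The quartiles are taken as inputs rather than computed, since different quartile conventions can give slightly different values for small samples.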

Skewness can be classified as positive or negative.

Nature               Characteristics
Positively skewed    Mean > Median > Mode; the tail of the curve extends long to the right
Negatively skewed    Mode > Median > Mean; the tail of the curve extends long to the left

It is necessary to find not only the amount of skewness but also its direction, because two different distributions can have the same mean and standard deviation and yet be skewed in opposite directions.
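The point about direction can be illustrated with a small sketch. The data below are hypothetical: a right-tailed sample and its mirror image about the mean, which share the same mean and SD by construction. The SD is assumed to be the population SD (divisor n).

```python
from statistics import mean, median, pstdev

# Hypothetical data: a right-tailed sample and its mirror image about the mean.
right_tailed = [1.0, 2.0, 3.0, 10.0]
centre = mean(right_tailed)
left_tailed = [2 * centre - x for x in right_tailed]


def median_skewness(values):
    """3 * [Mean - Median]/SD, with SD taken as the population SD (divisor n)."""
    return 3 * (mean(values) - median(values)) / pstdev(values)


for name, data in [("right-tailed", right_tailed), ("left-tailed", left_tailed)]:
    print(name, mean(data), round(pstdev(data), 4), round(median_skewness(data), 4))
```

Both samples report the same mean and SD, while the skewness values have equal size and opposite sign.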

Example: 1

A plant physiologist grew birch seedlings in the greenhouse and measured the ATP content of their roots. The results [nmol ATP/mg tissue] were as follows for four seedlings that had been handled identically:

1.05 1.07 1.19 1.45

Calculate the coefficient of skewness based on the mean and median.

Mean = 1.19; Median = [1.07 + 1.19]/2 = 1.13; SD = 0.1594 [Refer Chapter 5]

Sk = 3 * [Mean – Median]/SD = 3 * [1.19 – 1.13]/0.1594 = 1.13
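The same calculation can be checked from the raw data. This sketch uses Python's standard `statistics` module and assumes the SD is the population SD (divisor n), which is the convention that reproduces 0.1594.

```python
from statistics import mean, median, pstdev

atp = [1.05, 1.07, 1.19, 1.45]  # nmol ATP/mg tissue, from the example

m = mean(atp)
med = median(atp)      # average of the two middle values for an even count
sd = pstdev(atp)       # population SD (divisor n)

sk = 3 * (m - med) / sd
print(round(m, 2), round(med, 2), round(sd, 4), round(sk, 2))  # 1.19 1.13 0.1594 1.13
```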

Example: 2

Calculate Pearson’s measure of skewness on the basis of mean, mode and standard deviation from the following data:

The following table gives the litter size [number of piglets surviving to 21 days] for each of the 36 sows.

Step 1:

The data type is DDF.

Mean = 375/36 = 10.41

Mode = 10

Step 2:

Construct the following table.

Skewness = [Mean – Mode]/SD

                = [10.41 – 10]/1.96 = 0.21

Hence, Mean = 10.41, Mode = 10, SD = 1.96 and Skewness = 0.21

Example: 3

Evaluate Pearson’s measure of skewness on the basis of mean, mode and standard deviation from the following data:

Step 1:

Find the value of mean, mode and SD.

Mean = 85.82

Mode = 81.25

SD = 11.57

Step 2:

Skewness = [Mean – Mode]/SD

                = [85.82 – 81.25]/11.57 = 0.39.

Example: 4

Which group is more symmetrical?

            Group-I    Group-II
Mean        22         22
Median      24         25
SD          10         12

Skewness = 3* [Mean – Median]/SD

Skewness for Group-I

Sk=3* [22 – 24]/10 = –0.6

Skewness for Group-II

Sk=3* [22 – 25]/12 = –0.75

Since the absolute value of the skewness of Group-I [0.6] is less than that of Group-II [0.75], Group-I is more symmetrical than Group-II.
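The comparison can be expressed as a short sketch: compute the skewness of each group from its summary values and pick the group whose skewness is closest to zero. The dictionary layout and variable names are illustrative only.

```python
groups = {
    "Group-I": {"mean": 22, "median": 24, "sd": 10},
    "Group-II": {"mean": 22, "median": 25, "sd": 12},
}

# 3 * [Mean - Median]/SD for each group
skewness = {
    name: 3 * (g["mean"] - g["median"]) / g["sd"] for name, g in groups.items()
}

# The more symmetrical group is the one with the smaller absolute skewness.
more_symmetrical = min(skewness, key=lambda name: abs(skewness[name]))

print(skewness)
print(more_symmetrical)
```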

6.3 MOMENTS

Moments are used to describe the peculiarities of a frequency distribution. Their utility lies in the fact that they indicate different aspects of a given distribution: they help to measure the central tendency of a series, its dispersion or variability, its skewness and the peakedness of the curve.

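The moments about the actual arithmetic mean are denoted by the symbol μ. The first four moments about the mean are as follows:

μ1 = Σ[x – Mean]/n
μ2 = Σ[x – Mean]^2/n
μ3 = Σ[x – Mean]^3/n
μ4 = Σ[x – Mean]^4/n

n → number of items given

In general, the rth moment can be defined as

μr = Σ[x – Mean]^r/n

Note:

The values of x are discrete in nature.

If the values of x refer to a frequency distribution, then the formulas for the four moments are as follows:

μ1 = Σf[x – Mean]/n
μ2 = Σf[x – Mean]^2/n
μ3 = Σf[x – Mean]^3/n
μ4 = Σf[x – Mean]^4/n

f → frequency of x; n = Σf

In general, the rth moment can be defined as

μr = Σf[x – Mean]^r/n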
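A minimal sketch of the rth moment about the mean, covering both individual values and frequency distributions. It assumes the divisor n (the total frequency), which matches the SD of 0.1594 quoted for the ATP data in Example 1. The function name and the `freqs` parameter are illustrative.

```python
def central_moment(values, r, freqs=None):
    """r-th moment about the arithmetic mean, with divisor n (the total frequency)."""
    if freqs is None:
        freqs = [1] * len(values)          # individual observations: each counts once
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    return sum(f * (x - mean) ** r for x, f in zip(values, freqs)) / n


atp = [1.05, 1.07, 1.19, 1.45]             # data of Example 1
for r in range(1, 5):
    print(r, round(central_moment(atp, r), 5))
```

The first moment about the mean is always zero up to rounding, and the second is the variance, so its square root gives the SD.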

6.4 KURTOSIS

Kurtosis refers to the peakedness of the frequency curve. Pearson [1905] introduced kurtosis as a measure of how flat the top of a symmetric distribution is when compared to a normal distribution of the same variance. According to Clark, the term kurtosis means the property of a distribution that expresses its peakedness. It is denoted by β2.

Note:

The value of skewness can also be evaluated using moments. Skewness = [μ3]/[μ2]^1.5
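As a sketch of the moment-based skewness, the ratio of the third moment to the second moment raised to the power 1.5 can be computed directly. The helper names are illustrative, the divisor n is assumed, and the ATP data of Example 1 are reused.

```python
def central_moment(values, r):
    """r-th moment about the mean with divisor n."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** r for x in values) / n


def moment_skewness(values):
    """Skewness from moments: mu3 divided by mu2 raised to the power 1.5."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


atp = [1.05, 1.07, 1.19, 1.45]
print(round(moment_skewness(atp), 3))
```

The result is positive, in agreement with the sign of the Pearson measure obtained for the same data, although the two measures are on different scales and need not agree in size.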

A high kurtosis distribution has a sharper ‘peak’ and fatter ‘tails’, whereas a low kurtosis distribution has a more rounded peak with wider ‘shoulders’.

Kurtosis is often quoted relative to the normal distribution as the excess kurtosis, β2 – 3. Distributions with zero excess kurtosis are called mesokurtic [β2 = 3]. The most prominent example of a mesokurtic distribution is the normal distribution family, regardless of the values of its parameters.

A distribution with positive excess kurtosis is called leptokurtic [β2 > 3]. In terms of shape, a leptokurtic distribution has a more acute ‘peak’ around the mean [that is, a higher probability than a normally distributed variable of values near the mean] and ‘fat tails’.

Examples of leptokurtic distributions include the Laplace distribution and the logistic distribution. Such distributions are sometimes termed ‘super Gaussian’.

A distribution with negative excess kurtosis is called platykurtic [β2 < 3]. In terms of shape, a platykurtic distribution has a lower, flatter ‘peak’ around the mean and ‘thin tails’. Examples of platykurtic distributions include the continuous or discrete uniform distributions, and the raised cosine distribution. The most platykurtic distribution of all is the Bernoulli distribution with p = ½.
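The three categories can be sketched as a small classifier on β2. The function names and the tolerance `tol` are assumptions made for illustration; in practice a value computed from data is rarely exactly 3. The two equally likely outcomes of a Bernoulli variable with p = ½ are used as a check, since they give the smallest possible value, β2 = 1.

```python
def beta2(values):
    """Kurtosis beta2 = mu4 / mu2^2, using moments about the mean with divisor n."""
    n = len(values)
    mean = sum(values) / n
    mu2 = sum((x - mean) ** 2 for x in values) / n
    mu4 = sum((x - mean) ** 4 for x in values) / n
    return mu4 / mu2 ** 2


def classify(b2, tol=1e-9):
    """Label a beta2 value by comparing it with 3 (tol is an illustrative tolerance)."""
    excess = b2 - 3
    if abs(excess) <= tol:
        return "mesokurtic"
    return "leptokurtic" if excess > 0 else "platykurtic"


coin = [0, 1]  # the two equally likely outcomes of a Bernoulli variable with p = 1/2
print(beta2(coin), classify(beta2(coin)))  # 1.0 platykurtic
```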

Example: 4

A plant physiologist grew birch seedlings in the greenhouse and measured the ATP content of their roots. The results [nmol ATP/mg tissue] were as follows for four seedlings that had been handled identically:

1.05 1.07 1.19 1.45

Evaluate the value of kurtosis.

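Step 1:

First evaluate the mean value.

Mean = [1.05 + 1.07 + 1.19 + 1.45]/4 = 4.76/4 = 1.19

Step 2: The required four moments are

μ1 = 0; μ2 = 0.1016/4 = 0.0254; μ3 = 0.013104/4 = 0.00328; μ4 = 0.005161/4 = 0.00129

Kurtosis = β2 = [μ4]/[μ2]^2

             β2 = 0.00129/[0.0254]^2 = 2.0

The value of β2 is 2.0, which is less than 3, so the given distribution is platykurtic. [Rounding the moments to μ4 = 0.001 and μ2 = 0.025 before dividing would give 1.6; the conclusion is the same.]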
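The kurtosis of the four ATP values can be checked with a short sketch that keeps the moments unrounded. With full precision β2 comes out close to 2.0, whereas rounding μ4 to 0.001 and μ2 to 0.025 before dividing gives 1.6; either way the value is below 3. The divisor n is assumed throughout.

```python
atp = [1.05, 1.07, 1.19, 1.45]  # nmol ATP/mg tissue

n = len(atp)
mean = sum(atp) / n
mu2 = sum((x - mean) ** 2 for x in atp) / n   # second moment about the mean
mu4 = sum((x - mean) ** 4 for x in atp) / n   # fourth moment about the mean
beta2 = mu4 / mu2 ** 2

print(round(mean, 2), round(mu2, 4), round(mu4, 5), round(beta2, 2))

# The same ratio with the moments rounded before dividing.
print(round(0.001 / 0.025 ** 2, 2))
```

Because β2 is a ratio of small numbers, it is sensitive to early rounding, so carrying extra decimal places in μ2 and μ4 is advisable.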

Example: 5

Find the value of kurtosis of the following data:

The following table gives the litter size [number of piglets surviving to 21 days] for each of the 36 sows.

Step 1:

The data type is DDF.

Mean = 375/36 = 10.41

Step 2:

Construct the following table.

Kurtosis = β2 = [μ4]/[μ2]^2

β2 = 48.35/[3.86]^2 = 3.25

The value of β2 is 3.25, which is more than 3, so the given distribution is leptokurtic.

Example: 6

The following frequency table groups the number of aphids observed per clover plant:

No. of aphids on a plant    No. of plants observed
0–3                         6
4–7                         17
8–11                        40
12–15                       54
16–19                       59
20–23                       75
24–27                       77
28–31                       55
32–35                       32
36–39                       8
Total                       423

Find the value of kurtosis.

Step 1:

  • The given class intervals are not continuous, but they have a uniform length.
  • The gap between the upper limit of one interval and the lower limit of the next is uniform and equal to 1. Half of this gap is [1/2], i.e. 0.5.

Step 2:

  • Add 0.5 to the upper limit and subtract 0.5 from the lower limit of each class interval.
  • Find the midpoint of the class intervals.
  • Find the value of di = [xi – A]/h; let A = 17.5, and h = 4.
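Steps 1 and 2 can be sketched as follows: widen each class by half the gap to make the intervals continuous, take the midpoints, and code them as step deviations with A = 17.5 and h = 4. The variable names are illustrative.

```python
# Class limits from the aphid table.
classes = [(0, 3), (4, 7), (8, 11), (12, 15), (16, 19),
           (20, 23), (24, 27), (28, 31), (32, 35), (36, 39)]

gap = classes[1][0] - classes[0][1]          # 4 - 3 = 1
half = gap / 2                               # 0.5

# Continuous class boundaries and their midpoints.
boundaries = [(lo - half, hi + half) for lo, hi in classes]
midpoints = [(lo + hi) / 2 for lo, hi in boundaries]

A, h = 17.5, 4                               # assumed mean and class width from Step 2
d = [(x - A) / h for x in midpoints]         # step deviations

print(boundaries[0], midpoints[0], d[0])     # (-0.5, 3.5) 1.5 -4.0
```

Choosing A as one of the midpoints makes every step deviation a whole number, which keeps the hand calculation simple.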

Step 3:

Mean = A + h * [Σ fi di]/n = 17.5 + 4 * [353/423] = 20.84

 

The average number of aphids observed per clover plant is 20.84.

Here the value of x refers to the midpoint of each class interval.

Mean = 20.84; n = 423

Kurtosis = β2 = [μ4]/[μ2]^2

β2 = 10304.56/[65.78]^2 = 2.38.

The value of β2 is 2.38, which is less than 3, so the given distribution is platykurtic.
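A sketch of the grouped-data calculation, using the class midpoints and frequencies of the aphid table and the divisor n. Computed without intermediate rounding, the moments and β2 may differ somewhat from the hand-worked figures above, which depend on how intermediate values were rounded; the value of β2 should still fall below 3.

```python
midpoints = [1.5, 5.5, 9.5, 13.5, 17.5, 21.5, 25.5, 29.5, 33.5, 37.5]
freqs = [6, 17, 40, 54, 59, 75, 77, 55, 32, 8]

n = sum(freqs)
mean = sum(f * x for x, f in zip(midpoints, freqs)) / n


def grouped_moment(r):
    """r-th moment about the mean for the grouped data, with divisor n."""
    return sum(f * (x - mean) ** r for x, f in zip(midpoints, freqs)) / n


mu2, mu4 = grouped_moment(2), grouped_moment(4)
beta2 = mu4 / mu2 ** 2

print(n, round(mean, 2), round(mu2, 2), round(mu4, 2), round(beta2, 2))
```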

 

EXERCISES
  1. Evaluate the Karl Pearson measure of skewness.
  2. Evaluate the Karl Pearson measure of skewness.
  3. Determinations of the amount of phosphorus in leaves.
    Phosphorus [mg/g of leaf]    Frequency [i.e. no. of determinations]
    8.15–8.25                    2
    8.25–8.35                    6
    8.35–8.45                    8
    8.45–8.55                    11
    8.55–8.65                    17
    8.65–8.75                    17
    8.75–8.85                    24
    8.85–8.95                    18
    8.95–9.05                    13
    9.05–9.15                    10
    9.15–9.25                    4
  4. Evaluate the Karl Pearson measure of skewness.
    Life [No. of years]    No. of animals
    0–2                    5
    2–4                    16
    4–6                    13
    6–8                    7
    8–10                   5
    10–12                  4
  5. Find Karl Pearson measure of skewness for the following data:
  6. The following data refer to the number of eggs laid by 10 lizards in a season. Find Bowley’s measure of skewness for the following data:
  7. Find Bowley’s measure of skewness for the following data:
  8. The following is the frequency tabulation of the weights of eggs [in mg] of a butterfly.
    x          f
    185–195    2
    195–205    1
    205–215    3
    215–225    4
    225–235    5
    235–245    6
    245–255    4
    255–265    3
    265–275    2
    275–285    1

    Find Bowley’s measure of skewness.

  9. To study the spatial distribution of Japanese beetle larvae in the soil, researchers divided a 12 × 12 -foot section of a cornfield into 144 one-foot squares. They counted the number of larvae Y in each square, with the results shown in the following table:

    Find Pearson’s measure of skewness.

  10. Calculate the median of the distribution of the values of 140 fruits given in the following table and also calculate the statistical measures.
    x        f
    10       3
    25       8
    30       14
    36       18
    40       27
    44       23
    50       22
    55       17
    60       7
    Total    140

    Find Bowley’s measure of skewness for the following data:

  11. The life expectancies [in months] of 212 catla fishes are given below. Calculate Pearson’s measure of skewness.
  12. Compute coefficient of quartile deviation from the following data of life expectancy of hypothetical species of birds in captivity: Evaluate the quartiles and its deviation.
  13. Consider the following frequency tabulation of leaf weights [in grams]. Evaluate the quartiles.
  14. The water content of the eggs of 150 butterflies is given as follows:
    Water content of eggs [Percentage]    Butterflies [Numbers]
    47                                    4
    49                                    10
    51                                    5
    53                                    9
    55                                    25
    57                                    35
    59                                    20
    61                                    10
    63                                    20
    65                                    12

    Find Bowley’s measure of skewness.

  15. The lengths of 500 microfilariae in pleural blood, each measured to the nearest micron, are given as follows:

    Evaluate all the moments and kurtosis.

    [modify the interval into a continuous one]

  16. Consider the following frequency tabulation of leaf weights [in grams]:
    xi           fi
    1.85–1.95    2
    1.95–2.05    1
    2.05–2.15    2
    2.15–2.25    3
    2.25–2.35    5
    2.35–2.45    6
    2.45–2.55    4
    2.55–2.65    3
    2.65–2.75    1

    Evaluate all the moments and kurtosis.

  17. The life in days of 100 rats are distributed as follows:

    Evaluate all the moments and kurtosis.

  18. The following is the frequency tabulation of the weights of eggs [in mg] of a butterfly.
    x          f
    185–195    2
    195–205    1
    205–215    3
    215–225    4
    225–235    5
    235–245    6
    245–255    4
    255–265    3
    265–275    2
    275–285    1

    Evaluate all the moments and kurtosis.

  19. Find the value of kurtosis for the following data, which are amino acid concentrations [mg/100 ml] in arthropod haemolymph:

    240.6, 238.2, 236.4, 244.8, 240.7, 241.3 and 237.9.
ANSWER THE QUESTIONS
  1. ________________ refers to the lack of symmetry of a distribution.
    • Mean
    • SD
    • Skewness
    • None
  2. ________________ is the measure of the peakedness of the probability distribution of a real-valued random variable.
    • Skewness
    • Kurtosis
    • None
  3. ________________ and ________________ are the two measures of the skewness.
  4. If the frequency distribution has open-end classes ________________ measure is best to evaluate the measure of skewness.
    • Karl Pearson’s measure
    • Bowley’s measure
    • None
  5. The distribution is said to be positively skewed if __________________ .
  6. The distribution is said to be negatively skewed if __________________ .
  7. Write down the formulas for evaluating both the measures of skewness.
  8. __________________ are used to describe the peculiarities of a frequency distribution.
  9. Kurtosis can be computed using the relation __________________ .
  10. Even if two distributions have the same mean and SD, their skewness need not be the same; it _____________ .
    • Same
    • May have opposite signs
    • None
  11. When the value of the excess kurtosis [β2 – 3] is zero, the distribution is said to be __________________ .
  12. When the value of the excess kurtosis [β2 – 3] is positive, the distribution is said to be __________________ .
  13. When the value of the excess kurtosis [β2 – 3] is negative, the distribution is said to be __________________ .
ANSWERS
  1. Skewness
  2. Kurtosis
  3. Karl Pearson, Bowley
  4. Bowley’s measure
  5. mean > median > mode
  6. mode > median > mean
  7. Refer Section 6.2
  8. Moments
  9. Refer Section 6.4
  10. May have opposite signs
  11. Mesokurtic
  12. Leptokurtic
  13. Platykurtic