Chapter 7
Correlation and Regression Analysis
Objectives
After completing this chapter, you should be able to understand the following:
 The definition, meaning and significance of correlation coefficient, and rank correlation coefficient.
 The construction of regression lines.
 The utilization of the regression line concept to estimate the values.
The implications for decision-making applications in biological studies.
7.1 INTRODUCTION
We shall now study two [bivariate] or more variables [multivariate] simultaneously and make an attempt to find the relationship among the variables in quantitative/qualitative form. In reality, we have many such related variables such as crop per acre and fertilizer, height and weight, birth and death rate, blood pressure readings based on two different methods, age of elephants and annual maintenance cost, quantum of pesticides applied and intensity of food poisoning, dietary component and plasma lipid level, size of crops and percentage of worms, age and blood pressure, and antibiotics and bacteria.
This methodology of studying the strength of relationship among variables was developed by Sir Francis Galton and Karl Pearson.
7.2 CORRELATION
It is a statistical measure used to evaluate the strength and degree of relationship among two or more variables under study. Here the term ‘relationship’ is used to measure the tendency of the variables to move together. The movement of the variables may be in the same or the opposite direction. The correlation is said to be positive if the variables move in the same direction, and negative if they move in opposite directions. If movements in one variable bear no relation to movements in the other, the variables are not related. Correlation is classified into:
 simple correlation,
 rank correlation and
 group correlation.
7.2.1 Simple Correlation/Correlation
This measure can be evaluated for a discrete series of data that is quantitative in nature. It is denoted by the notation r. The value of r lies in the closed interval [–1 ≤ r ≤ 1]. If the value of r is towards 1, the variables are said to be positively correlated or directly related [if X increases, Y also increases, and if X decreases, Y also decreases]. If it is towards –1, they are said to be negatively correlated or inversely related [if X increases, Y decreases, and if X decreases, Y increases]. If it is 0, the variables are said to be uncorrelated [a change in X does not affect the variable Y, and vice versa].
7.2.2 Rank Correlation
This measure can be evaluated for a discrete series of data that is qualitative in nature. It is denoted by R. The value of R lies in the closed interval [–1 ≤ R ≤ 1].
7.2.3 Group Correlation
This measure can be evaluated for a continuous series of grouped data. It is denoted by r. The value of r lies in the closed interval [–1 ≤ r ≤ 1].
Note:
The larger the value of r, the stronger the linear relationship between Y and X. If r = –1 or r = +1, the regression line will include all data points and the line will be a perfect fit.
7.2.4 Assumptions for Karl Pearson’s Coefficient of Correlation
 The relationship between the two series [X and Y] is linear [the amount of variation in X bears a constant ratio to the corresponding amount of variation in Y].
 Either one of the series is dependent on the other or both are dependent on the third series.
Correlation analysis is applied to most scientific data where inferences are to be made. In agriculture, the amount of fertilizer and crop yield are correlated. In economics, prices and demand, or money and prices. In medicine, the use of cigarettes and the incidence of lung cancer, or the use of a new drug and the percentage of cases cured. In sociology, unemployment and crime, or welfare expenditure and labour efficiency. In demography, wealth and fertility, and so on.
The correlation coefficient r, like other statistics of the sample, is tested to see how far the sample results may be generalized to the parent population.
7.2.5 Limitations of Correlation
 Interpretation of this analysis needs expertise regarding the statistical concepts and the background of data.
 Correlation in statistics is studied by scatter diagrams and regression lines/coefficient of correlation.
7.2.6 Properties of Correlation
 It is independent of any change of origin of reference and the units of measurement.
 Its value lies in the interval [–1, 1].
 It is a constant value, which helps to measure the relationship between two variables.
7.2.7 Scatter Diagram
The scatter diagram is a very valuable graphic device to show the existence of correlation between two variables. Represent the variable X on the x-axis and Y on the y-axis, and mark the coordinate points [x, y]; the existence of correlation can then be studied from the clustering pattern of the coordinate points. The direction and tightness of the scatter reveal the nature and strength of the correlation between the variables.
The scatter diagrams for r = 1 and 0 < r < 1 indicate that the path is linear and the variables move in the same direction. This means the correlation is positive [the relationship between the variables is direct].
The scatter diagrams for r = –1 and –1 < r < 0 indicate that the variables move in opposite directions and the path is linear.
The scatter diagram for r = 0 indicates that the variables have no linear relation; the points show no linear path.
7.3 KARL PEARSON’S COEFFICIENT OF CORRELATION
Consider the pairs of values [X_{1}, Y_{1}], [X_{2}, Y_{2}], … , [X_{n}, Y_{n}] of the variables X and Y. Then, the covariance of these two variables X and Y can be defined as
Cov[X, Y] = [1/n] Σ[X_{i} – X̄][Y_{i} – Ȳ], where X̄ and Ȳ are the means of X and Y.
The standard deviations of X and Y are given by
σ_{X} = √{[1/n] Σ[X_{i} – X̄]^{2}} and σ_{Y} = √{[1/n] Σ[Y_{i} – Ȳ]^{2}}
The correlation coefficient r can be defined as
r = Cov[X, Y]/[σ_{X} σ_{Y}]
An equivalent alternate formula for r is
r = [nΣXY – ΣX ΣY]/√{[nΣX^{2} – (ΣX)^{2}][nΣY^{2} – (ΣY)^{2}]}
Value of r using assumed means
To derive the result, we use the fact that the correlation coefficient is independent of the choice of origin. Take U_{i} = [X_{i} – a] and V_{i} = [Y_{i} – b], where a is any one value of X and b is any one value of Y. Then
r = [nΣUV – ΣU ΣV]/√{[nΣU^{2} – (ΣU)^{2}][nΣV^{2} – (ΣV)^{2}]}
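The origin-invariance claimed above can be checked numerically. A minimal sketch, using illustrative data (hypothetical, not from the chapter's examples), computes r both directly and after shifting by assumed means a and b:

```python
import math

# Illustrative data (hypothetical, for demonstration only)
X = [12, 15, 18, 21, 27, 30]
Y = [5, 7, 6, 11, 14, 16]

def pearson_r(xs, ys):
    """r via the alternate formula: [nΣXY - ΣXΣY] / sqrt([nΣX² - (ΣX)²][nΣY² - (ΣY)²])."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    num = n * sxy - sx * sy
    den = math.sqrt((n * sxx - sx * sx) * (n * syy - sy * sy))
    return num / den

# Direct computation
r_direct = pearson_r(X, Y)

# Assumed-mean computation: shift by a = 18 (a value of X) and b = 7 (a value of Y)
U = [x - 18 for x in X]
V = [y - 7 for y in Y]
r_shifted = pearson_r(U, V)

# The two results agree: r is independent of the choice of origin
```

Because the shift subtracts constants, every deviation from the mean is unchanged, so the numerator and denominator are identical in both computations.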
Example: 1
 In trying to evaluate the effectiveness of antibiotics in killing bacteria, a research institution compiled the following information.
Calculate the correlation coefficient.
Here n = 6; ΣX = 84; ΣY = 39.6
Direct method
Since the value of r is positive, the antibiotics and the bacteria counts are positively related, with r = 0.74 [an association of 74%].
Example: 2
The following table shows the ages [X] and systolic blood pressure [Y] of 8 persons:
Find the value of r.
Here, n = 8; ΣX = 395; ΣY = 1,070.
The age and the blood pressure level are positively related with correlation 0.22.
Example: 3
In a study of the effect of dietary component on plasma lipid composition, the following ratios were obtained on a sample of experimental animals.
Obtain the correlation coefficient.
Let the variables X and Y refer to the dietary component and the plasma lipid composition, respectively.
Here n = 8; ΣX = 23; ΣY = 16.
The dietary component and the plasma lipid composition are negatively related, with correlation –0.29.
Example: 4
Calculate Karl Pearson’s coefficient of correlation for the following data using 20 as the working mean for price and 70 as the working mean for demand:
Let the variables X and Y refer to the level of price and demand, respectively.
The assumed means are given as a = 20 and b = 70.
Here, n = 9.
The correlation value is –0.828; it implies that the demand and the price are negatively related.
Example: 5
A computer, while calculating the value of r between two variables X [advertising expenditure] and Y [sales level] from 25 sets of values, gives n = 25; ΣX = 125; ΣY = 100; ΣX^{2} = 650; ΣY^{2} = 460; and ΣXY = 508. At the time of checking, it was found that two sets of values were wrongly entered.
Evaluate the correct value of r.
Given,
n = 25; ΣX = 125; ΣY = 100; ΣX^{2} = 650; ΣY^{2} = 460 and ΣXY = 508. First, we have to find the corrected sums; that is, subtract the incorrect values from the totals and add the correct values.
Corrected values:
Similarly proceeding,
Hence, the corrected value of the correlation coefficient is [2/3] or 0.67.
7.4 COEFFICIENT OF CORRELATION FOR A GROUPED DATA
In grouped data, the information is given in a correlation table. In each compartment of the table, the deviations from the average of x and the average of y with respect to the corresponding compartment are multiplied and written within brackets. This outcome is further multiplied by the frequency of that cell. Adding all such values leads to Σf d_{x} d_{y}, from which r is computed as
r = [NΣf d_{x} d_{y} – Σf d_{x} Σf d_{y}]/√{[NΣf d_{x}^{2} – (Σf d_{x})^{2}][NΣf d_{y}^{2} – (Σf d_{y})^{2}]}
Example: 6
The following table gives the distribution of the total population and those who are totally or partially blind among them. Find out if there is any relation between age and blindness.
Age  No. of persons in ‘000  Blind
0–10  100  45
10–20  60  40
20–30  40  40
30–40  36  40
40–50  24  36
50–60  11  22
60–70  6  18
70–80  3  15
Create a modified table comprising the percentage of blindness over the population.
Let A = 45; h = 10; n = 8.
There is a close positive correlation between age and blindness.
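The modified-table approach can be sketched in code: take the age-class midpoints as X and the blindness rate per thousand persons as Y, then apply the product-moment formula. (Using class midpoints 5, 15, …, 75 is an assumption consistent with A = 45 and h = 10 in the worked solution.)

```python
import math

ages_mid = [5, 15, 25, 35, 45, 55, 65, 75]   # midpoints of 0-10, ..., 70-80
persons = [100, 60, 40, 36, 24, 11, 6, 3]    # in '000
blind = [45, 40, 40, 40, 36, 22, 18, 15]

# Blindness rate per 1,000 persons in each age class
rate = [b / p for b, p in zip(blind, persons)]

n = len(ages_mid)
sx, sy = sum(ages_mid), sum(rate)
sxy = sum(x * y for x, y in zip(ages_mid, rate))
sxx = sum(x * x for x in ages_mid)
syy = sum(y * y for y in rate)

r = (n * sxy - sx * sy) / math.sqrt((n * sxx - sx**2) * (n * syy - sy**2))
# r comes out strongly positive, consistent with the stated close relation
# between age and blindness
```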
Example: 7
Find the coefficient of correlation between the ages of husbands and the ages of wives given here in the form of a twoway frequency table.
Age of husbands [in years]
Note: Show that r lies between +1 and –1.
Let x_{i} = X_{i} – X̄ and let y_{i} = Y_{i} – Ȳ.
Because each term on the RHS of [1] is a perfect square, the LHS ≥ 0.
Using [2] in [3], we have [1 – r^{2}] ≥ 0; hence r^{2} ≤ 1.
Hence, the correlation coefficient lies in the closed interval [–1, 1].
7.5 PROBABLE ERROR OF THE COEFFICIENT OF CORRELATION
Normally, we use sample data to evaluate the correlation coefficient. So, whenever the result is interpreted, it is necessary to check the reliability of the evaluated sample correlation against the population coefficient. This is determined by the probable error, evaluated using the following result:
Probable error = 0.6745 × [standard error of r]
where the standard error of r = [1 – r^{2}]/√n
Where r is the correlation coefficient and n is the number of pairs of items. The interpretation is that if the P.E. of r = ±a, where a is a constant, then the range of the correlation of the population can be estimated approximately as [r – a, r + a].
This probable-error calculation can be used only when the data are normal or nearly normal, and the sample is selected without bias. In relation to the probable error, the significance of the coefficient of correlation may be judged as follows:
The coefficient of correlation is significant, if it is more than six times the probable error or where the probable error is not much and r exceeds 0.5. It is not significant at all, if it is less than the probable error.
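The probable-error rule can be sketched as a small helper. The sample values below (r = 0.6, n = 64) are illustrative, not taken from the chapter's examples:

```python
import math

def probable_error(r, n):
    """Probable error of r: 0.6745 * (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1 - r * r) / math.sqrt(n)

def is_significant(r, n):
    """Judged significant when |r| exceeds six times the probable error."""
    return abs(r) > 6 * probable_error(r, n)

r, n = 0.6, 64
pe = probable_error(r, n)            # 0.6745 * 0.64 / 8 = 0.05396
population_range = (r - pe, r + pe)  # approximate range for the population correlation
```

Here 6 × 0.05396 ≈ 0.324 < 0.6, so by the rule above this r would be judged significant.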
Example: 8
Calculate the correlation coefficient and its probable error from the following results:
And find its probable error.
Given,
By definition,
The correlation coefficient is 0.75; it implies that X and Y are positively related. The probable error of r is 0.0851.
Example: 9
Calculate the coefficient of correlation between X and Y.
  X series  Y series
No. of items  15  15
Arithmetic mean  25  18
Sum of squares of deviations from mean  136  138
Sum of the product of deviations X and Y series from their respective means is 122.
Given,
X series: n_{1} = 15, mean X̄ = 25, Σ[X – X̄]^{2} = 136
Y series: n_{2} = 15, mean Ȳ = 18, Σ[Y – Ȳ]^{2} = 138
r = 122/√[136 × 138] ≈ 0.89; the relationship between the variables is positive.
Example: 10
Evaluate the correlation coefficient for the following data:
Consider the given data
By definition,
The variables are positively related.
7.6 RANK CORRELATION
Pearson’s correlation coefficient ‘r’ gives a numerical measure of the degree of relationship existing between the two variables X and Y. However, it requires that the joint distribution of X and Y be normal. These limitations can be overcome by the rank correlation coefficient, which is based on the ranking of the variates. It was introduced by Charles Edward Spearman in 1904. It helps in dealing with qualitative characteristics such as beauty and intelligence, and is most suitable when the variables can be arranged in order of merit. It is denoted by R.
Consider n pairs [X_{1}, Y_{1}], [X_{2}, Y_{2}], … , [X_{n}, Y_{n}].
Rank the elements of X series by comparing each and every element of it.
Let it be R_{1}, R_{2}, … R_{n}.
Similarly for Y series, let it be S_{1}, S_{2}, … , S_{n}.
Similarly proceeding, we arrive at Spearman’s rank correlation coefficient
R = 1 – [6Σd_{i}^{2}]/[n[n^{2} – 1]], where d_{i} = R_{i} – S_{i}.
Note for repeated ranks
The above formula holds good if the ranks are not repeated. For repeated ranks, if a rank is repeated m times, the value [m[m^{2} – 1]]/12 should be added along with Σd_{i}^{2}. This must be carried out for each set of repeated ranks.
Merits of rank correlation coefficient
 It is simple to understand and easy to evaluate.
 It is very much useful for qualitative type of data.
 It can be evaluated also for a quantitative type of data.
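A sketch of Spearman's R with average ranks for ties and the m[m² – 1]/12 correction described above. The data here are illustrative (one tied pair), not from the chapter's examples:

```python
def average_ranks(values):
    """Assign 1-based ranks, giving tied values the average of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average of the tied rank positions
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(xs, ys):
    n = len(xs)
    rx, ry = average_ranks(xs), average_ranks(ys)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    # tie correction: add m*(m^2 - 1)/12 for each group of m tied values
    for vals in (xs, ys):
        for v in set(vals):
            m = vals.count(v)
            if m > 1:
                d2 += m * (m * m - 1) / 12
    return 1 - 6 * d2 / (n * (n * n - 1))

A = [10, 12, 12, 15, 18]   # 12 occurs twice (m = 2), ranked 2.5 and 2.5
B = [3, 5, 6, 8, 9]
R = spearman(A, B)         # sum(d^2) = 0.5, correction 0.5, so R = 1 - 6/120 = 0.95
```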
Example: 11
Two referees in a flower beauty competition rank the 10 types of flowers as follows:
Use the rank correlation coefficient and find out what degree of agreement is there between the referees.
n = 10. By definition,
Since the given data set contains ranks, evaluate the difference in ranks.
The rank correlation coefficient is positive; it implies that the variables are positively related.
Example: 12
Ten competitors in a flower beauty contest are ranked by three judges in the following order:
Use the rank correlation coefficient to determine which pair has the nearest approach to common taste in deciding flower beauty.
Since the data set contains ranks, first evaluate the rank correlation coefficient between [J_{1}, J_{2}], [J_{2}, J_{3}], and [J_{3}, J_{1}].
Judges 1 and 3 have the nearest approach to common taste in beauty.
Example: 13
Find the rank correlation coefficient of the following data:
Consider the data given and rank it.
Series A:
98 is repeated 3 times; the corresponding rank positions are 7, 8 and 9.
Series B:
73 is repeated 2 times; the corresponding rank positions are 6 and 7.
As per Spearman’s modified formula for repeated values, for each repeated value the element [m[m^{2} – 1]]/12 should be added along with Σd^{2}, where m is the number of times the value is repeated.
Hence,
The variables are positively related.
Example: 14
The coefficient of rank correlation between marks in mathematics and statistics of a class is 9/11 and the sum of the squares of the differences in ranks is 30. Find the number of students in the class.
Given R = 9/11 and Σd^{2} = 30.
Find the value of n.
By definition,
Using the given values in the relation [1],
Comparing the factors on both the LHS and RHS [n[n^{2} – 1] = 990 = 10 × 9 × 11], it implies that n = 10.
Hence, the number of students in the class is 10.
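Example 14's relation R = 1 – 6Σd²/[n(n² – 1)] can also be solved for n by brute force; a minimal sketch using exact fractions:

```python
from fractions import Fraction

R = Fraction(9, 11)
sum_d2 = 30

# 1 - 6*30/(n(n^2 - 1)) = 9/11  =>  n(n^2 - 1) = 6*30 / (1 - 9/11) = 990
target = Fraction(6 * sum_d2) / (1 - R)

# Smallest n with n(n^2 - 1) equal to the target
n = next(k for k in range(2, 100) if k * (k * k - 1) == target)
```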
7.7 REGRESSION EQUATIONS
7.7.1 Regression
The word regression was first used by Sir Francis Galton in his investigation regarding heredity. Regression means stepping back; however, the term is not used in this sense in statistics. It is a mathematical measure that expresses the average relationship between two variables. It is used to predict the expected value of one variable when the value of the other is given. Of the two variables, one is treated as the independent variable and the other as the dependent variable.
The relationship stated above can be expressed in the form of a linear equation in two variables. Between the two variables X and Y, at a time one can be treated as dependent on the other:
(a) X depends on Y; (b) Y depends on X.
7.7.2 Regression Equation Y depends on X
Consider n pairs of data [X_{1}, Y_{1}], [X_{2}, Y_{2}], … [X_{n}, Y_{n}] and let the linear equation representing these n data be
Y = a + bX … [1]
Taking the summation on both sides of [1] gives
ΣY = na + bΣX … [2]
Multiply both sides of [1] by X:
XY = aX + bX^{2} … [3]
Take the summation on either side of [3]:
ΣXY = aΣX + bΣX^{2} … [4]
[2] and [4] are two linear equations with two unknowns a and b.
Dividing [2] by n on both sides, we have
Ȳ = a + bX̄ … [5]
Solving [1] and [5], we have
By definition,
Comparing [7] and [8], we have
Using the value of a in [6],
[9] is the required regression equation of Y on X.
It is used to estimate the most likely value of Y when the value of X is known.
Here, the value b_{YX} = r[σ_{Y}/σ_{X}] is called the regression coefficient of the regression equation Y on X. Then, [9] can be expressed as
Y – Ȳ = b_{YX}[X – X̄]
Similarly proceeding, we can get the regression equation of X on Y … [10]. The value b_{XY} = r[σ_{X}/σ_{Y}] is called the regression coefficient of the regression equation X on Y. Then, [10] can be expressed as
X – X̄ = b_{XY}[Y – Ȳ]
[9] and [10] are the required two regression equations.
Multiplying the like sides of [9] and [10], we have b_{YX} × b_{XY} = r^{2}.
Note:
The values of the variances σ_{X}^{2} and σ_{Y}^{2} are always positive.
The two regression equations [9] and [10] imply that the two lines pass through the common point [X̄, Ȳ].
 To get the value of the two means, it is sufficient to solve the given two regression equations.
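The identity b_{YX} × b_{XY} = r² can be checked numerically with a least-squares sketch (illustrative data, not from the examples):

```python
import math

# Illustrative data (hypothetical)
X = [1, 2, 3, 4, 5, 6]
Y = [2, 3, 5, 4, 6, 8]
n = len(X)

sx, sy = sum(X), sum(Y)
sxx, syy = sum(x * x for x in X), sum(y * y for y in Y)
sxy = sum(x * y for x, y in zip(X, Y))

b_yx = (n * sxy - sx * sy) / (n * sxx - sx * sx)   # regression coefficient of Y on X
b_xy = (n * sxy - sx * sy) / (n * syy - sy * sy)   # regression coefficient of X on Y
r = (n * sxy - sx * sy) / math.sqrt((n * sxx - sx * sx) * (n * syy - sy * sy))

# The product of the two regression coefficients equals r^2
assert abs(b_yx * b_xy - r * r) < 1e-12
```

Both coefficients share the same numerator, so their product is the squared numerator over the two sums of squares, which is exactly r².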
Example: 15
Blood pressure readings by two different methods were made in 10 patients with essential hypertension. The systolic readings by the two methods are shown in the following table. The clinician wished to investigate the relationship between the two measurements. You are required to find out whether there is any correlation between the two methods of measurement. Is it positive or negative? Is it high or low? Also construct the two regression lines.
Systolic blood pressure readings [mm Hg] by two methods in 10 patients with essential hypertension
Patient  Method 1  Method 2
1  132  130
2  138  134
3  144  132
4  146  140
5  148  150
6  152  144
7  158  150
8  130  125
9  162  160
10  168  150
Let X and Y be the two random variables referring to the blood pressure readings based on method 1 and method 2, respectively. Evaluate the necessary summations using the given data.
Here n = 10; ΣX = 1,478; ΣY = 1,415
The correlation is positive and high.
By definition,
Similarly,
The regression equation Y on X is
The regression equation X on Y is
[1] and [2] are the required two regression equations.
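The computation for this example can be sketched in code using the ten readings from the table, giving r and both regression coefficients in one pass:

```python
import math

method1 = [132, 138, 144, 146, 148, 152, 158, 130, 162, 168]  # X
method2 = [130, 134, 132, 140, 150, 144, 150, 125, 160, 150]  # Y
n = len(method1)

sx, sy = sum(method1), sum(method2)          # 1,478 and 1,415, as in the text
sxx = sum(x * x for x in method1)
syy = sum(y * y for y in method2)
sxy = sum(x * y for x, y in zip(method1, method2))

r = (n * sxy - sx * sy) / math.sqrt((n * sxx - sx**2) * (n * syy - sy**2))
b_yx = (n * sxy - sx * sy) / (n * sxx - sx**2)   # regression coefficient Y on X
b_xy = (n * sxy - sx * sy) / (n * syy - sy**2)   # regression coefficient X on Y
# r comes out close to 0.9: positive and high, as stated
```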
Example: 16
Construct the regression lines between pesticides and food poisoning. Find the value of Y when X = 10.
Quantum of pesticides applied [in kg] X  Intensity of food poisoning Y
17  36
13  46
15  35
16  24
6  12
11  18
14  27
9  22
7  2
12  8
Evaluate the necessary summations using the given data.
Here, n = 10; ΣX = 120; ΣY = 230
Similarly,
The regression equation Y on X is
The regression equation X on Y is
[1] and [2] are the required two regression equations.
Given X = 10, to find the value of Y.
Put X = 10 in equation [1]; Y = 2.5 * 10 – 7 = 18.
When the pesticides level X = 10, the corresponding intensity level of food poisoning Y is 18.
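The fitted line Y = 2.5X – 7 and the prediction at X = 10 can be verified in code using the table's ten pairs:

```python
X = [17, 13, 15, 16, 6, 11, 14, 9, 7, 12]
Y = [36, 46, 35, 24, 12, 18, 27, 22, 2, 8]
n = len(X)

sx, sy = sum(X), sum(Y)                      # 120 and 230, as in the text
sxy = sum(a * b for a, b in zip(X, Y))
sxx = sum(a * a for a in X)

b = (n * sxy - sx * sy) / (n * sxx - sx * sx)   # slope of Y on X: 2.5
a = sy / n - b * sx / n                          # intercept: -7

y_at_10 = a + b * 10    # predicted food-poisoning intensity at X = 10 kg: 18
```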
Example: 17
The following table shows the methyl mercury intake and whole blood mercury values in 10 subjects exposed to methyl through consumption of contaminated fish.
Methyl mercury intake [mg Hg/day] X  Mercury in whole blood [mg/g] Y
180  90
200  120
230  130
410  290
600  310
550  300
580  175
600  380
250  70
115  100
You are required to construct the two regression equations. Also evaluate the value of X given Y = 295. Evaluate the necessary summations using the given data.
Here, n = 10; ΣX = 3,715; ΣY = 1,965
By definition,
The regression equation Y on X is
The regression equation X on Y is
[1] and [2] are the required two regression equations.
Given Y = 295, to find the value of X.
Put Y = 295 in [2]; X = 1.49 * 295 + 79.53 = 519.08.
When the mercury in whole blood level Y = 295 mg/g, the corresponding methyl mercury intake X is 519.08 mg Hg/day.
Example: 18
The correlation coefficient between supply [Y] and price [X] of a commodity is 0.60. If σ_{X} = 150, σ_{Y} = 200, mean [X] = 10 and mean [Y] = 20, find the equations of the regression lines of Y on X and X on Y.
[MBA, 1998]
Given r = 0.6; σ_{X} = 150, σ_{Y} = 200, mean [X] = 10 and mean [Y] = 20.
By definition,
The regression equation Y on X is
The regression equation X on Y is
The regression equation Y on X is Y = 0.8X + 12.
The regression equation X on Y is X = 0.45Y + 1.
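Building the two lines directly from r, the standard deviations, and the means can be sketched as:

```python
r, sd_x, sd_y = 0.6, 150, 200
mean_x, mean_y = 10, 20

b_yx = r * sd_y / sd_x   # 0.6 * 200 / 150 = 0.8
b_xy = r * sd_x / sd_y   # 0.6 * 150 / 200 = 0.45

# Y on X: Y - mean_y = b_yx (X - mean_x)  =>  Y = 0.8 X + 12
intercept_y_on_x = mean_y - b_yx * mean_x
# X on Y: X - mean_x = b_xy (Y - mean_y)  =>  X = 0.45 Y + 1
intercept_x_on_y = mean_x - b_xy * mean_y
```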
Example: 19
In a partially destroyed laboratory record of an analysis of correlation data, the following results only are legible:
Regression equations: 8X – 10Y + 66 = 0; 40X – 18Y = 214.
What were
 the mean values of X and Y.
 The correlation coefficient between X and Y.
If σ_{X}^{2} = 9, find the value of σ_{Y}.
[MBA 1999]
Consider the two regression equations,
We have to choose one equation for X on Y and the other one for Y on X.
Since the magnitude of the coefficient of Y in [1] dominates that of the coefficient of X, choose [1] for Y on X and [2] for X on Y.
[1] can be rewritten as,
[2] can be rewritten as,
Comparing [4] with the actual equation
we have, b_{YX} = 0.8
In the same way, comparing [4] with the actual equation
we have, b_{XY }= 0.45
By definition, b_{YX} = r[σ_{Y}/σ_{X}] … [5]
and b_{XY} = r[σ_{X}/σ_{Y}] … [6]
Multiplying the like sides of [5] and [6], we have b_{YX} × b_{XY} = r^{2}; that is, r^{2} = 0.8 × 0.45 = 0.36.
Since both the regression coefficients are positive, the value of the correlation coefficient must be positive.
Hence, the value of correlation coefficient is 0.6.
To get the mean values of X and Y, solve the given equations [1] and [2] for X and Y. The value of X so obtained is taken to be the mean of X, and the value of Y the mean of Y.
Using the value Y = 17 in [1], we have X = 13.
Hence,
The mean of X is 13 and the mean of Y is 17.
Given σ_{X}_{}^{2} = 9.
Using the values σ_{X} = 3 and r = 0.6 in [5], 0.8 = 0.6 × [σ_{Y}/3]; hence σ_{Y} = 4.
Note:
When no clear dominance exists among the coefficients of the variables, choose either one of the equations for Y on X and the other for X on Y on a trial-and-error basis. The selection should satisfy the condition b_{YX} × b_{XY} ≤ 1. If this condition fails, reverse the selection and proceed.
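This selection rule can be sketched as a helper that tries one assignment and reverses it when b_{YX} × b_{XY} > 1, shown here with the coefficients of Example 19's equations:

```python
def regression_assignment(eq1, eq2):
    """Each equation is (cx, cy, c), meaning cx*X + cy*Y = c.
    Try eq1 as Y-on-X and eq2 as X-on-Y; reverse if b_yx * b_xy > 1."""
    def b_yx_of(cx, cy, c):   # rewrite as Y = (-cx/cy) X + c/cy
        return -cx / cy
    def b_xy_of(cx, cy, c):   # rewrite as X = (-cy/cx) Y + c/cx
        return -cy / cx

    b_yx, b_xy = b_yx_of(*eq1), b_xy_of(*eq2)
    if b_yx * b_xy > 1:       # invalid assignment: r^2 cannot exceed 1
        b_yx, b_xy = b_yx_of(*eq2), b_xy_of(*eq1)
    return b_yx, b_xy

# Example 19: 8X - 10Y = -66 and 40X - 18Y = 214
b_yx, b_xy = regression_assignment((8, -10, -66), (40, -18, 214))
r2 = b_yx * b_xy   # 0.8 * 0.45 = 0.36, so r = 0.6 (positive, as both slopes are positive)
```

With the reversed assignment the product would be (40/18) × (10/8) ≈ 2.78 > 1, which is why the helper rejects it.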
Example: 20
Two lines of regressions are given by x + 2y = 5 and 2x + 3y = 8. Calculate the value of mean of x, mean of y and r.
Consider the given regression equations,
There is no pure dominance between the coefficients of the two variables in either equation; the coefficient of y dominates in magnitude in both. Choosing [1] for y on x on a trial-and-error basis,
[3] implies that b_{yx} = –0.5
Choose the second equation for X on Y.
Then we have, b_{xy} = –1.5
Hence, the selection is correct, since b_{xy} × b_{yx} = 0.75 ≤ 1. [If b_{xy} × b_{yx} > 1, change the selection of equations for y on x and x on y, then proceed.]
Multiplying the like sides of [3] and [4], we have r^{2} = [–0.5] × [–1.5] = 0.75.
Since both the regression coefficients are negative, the value of the correlation coefficient must be negative.
Hence, the value of correlation coefficient is –0.866.
To get the mean values of x and y, solve the two given [1] and [2] for x and y. The value of x is taken to be the mean value of x and the value of y is taken to be the mean value of y.
Solving [1] and [2], we have x = 1 and y = 2.
Hence, the mean of x = 1 and the mean of y = 2.
EXERCISES
 Distinguish between correlation coefficient r and rank correlation coefficient R.
 Analyse critically the assumptions underlying the Karl Pearson’s correlation coefficient.
 Calculate the coefficient of correlation between age group and rate of mortality from the following data.
 Ten competitors in a beauty contest are ranked by three judges. Find which pair of judges has the nearest approach to common taste in beauty.
 Given the regression lines as 3x + 2y = 26 and 6x + y = 31. Find their point of intersection and interpret it. Also find the correlation coefficient between x and y.
 If the Karl Pearson’s coefficient of correlation is 0.95 and the SD of x and y are 3 and 7, what is the covariance of x, y?
 Calculate Spearman’s coefficient of rank correlation for the following data:
 Find the rank correlation coefficient of the following data:
Y is the weight of potassium bromide that will dissolve in 100 g of water at X °C; the values are given below. Fit an equation of the form Y = a + bX by the method of least squares. Use this relation to estimate the weight Y when X = 150 °C.
 Assume that we conduct an experiment with eight fields planted with corn and four fields having no nitrogen fertilizer. The resulting corn yields are shown in the table as bags per acre:
Field Nitrogen [Kg] Corn yields [bags/acre] 10122036306401858012868011278011288076 Compute a linear regression equation by least squares.
 Predict corn yield for a field treated with 60 pounds of fertilizer.
 Find the linear regression equation of percentage worms [Y] on size of the crop [X] based on the following seven observations.
 The following table shows the ages [X] and systolic blood pressure [Y] of eight persons:
Fit a linear regression equation of Y on X and estimate the blood pressure of a 70yearold person.
 In trying to evaluate the effectiveness of antibiotics in killing bacteria, a research institution compiled the following information.
Calculate the regression equation of bacteria on antibiotics. Estimate the probable killings of bacteria when the antibiotics are used in 20 mg.
 From the following data, ascertain whether the birth and death rate of fish that have been reared in the laboratory are correlated.
Month  Birth rate  Death rate
January  100  90
February  104  95
March  110  98
April  125  100
May  130  102
June  140  115
July  145  135
 Some health researchers have reported an inverse relationship between central nervous system malformations and the hardness of the related water supplies. Suppose the data were collected on a sample of nine geographic areas with the following results.
CNS malformation rate [per 1,000 births]  Water hardness [ppm]
9  120
8  130
5  90
1  150
4  160
2  100
3  140
6  80
7  200
Compute the coefficient of correlation. What are your conclusions?
 The body weight [X lbs] and food consumption [Y, 350day food consumption, lbs] of white leghorn is given in the following table:
Show the relationship between body weight and food consumption.
 The following data give the yield of maize grain [in kgs] per plot of size 10 × 4 sq.m for different doses of nitrogen applications.
Calculate the correlation coefficient and draw your inference.
 Calculate the correlation coefficient between height of father and son from the following data:
 Calculate the coefficient of correlation between age of elephants and annual maintenance cost.
Age of elephants [years]  Annual maintenance cost [rupees]
2  1,600
3  1,500
5  1,800
9  1,900
8  1,700
10  2,100
12  2,000
 The following are the results of some experiments:
Age of fish [weeks]  Fish reared [no.]  Fish achieved [required weight]
10–11  200  150
11–12  300  250
12–13  50  20
13–14  150  110
14–15  100  80
15–16  200  190
16–17  250  220
Calculate the coefficient of correlation between age and the number of fish that achieved the required weight in the experiments.
ANSWER THE QUESTIONS
 ____________________ helps us to find the relationship among the variables in quantitative/qualitative form.
 This methodology of studying the strength of relationship among the variables is given by ____________________.
 ____________________ is a statistical measure used to evaluate the strength and degree of the relationship among the two or more variables under study.
 Correlation is classified into ____________________.
The value of correlation [r] lies in the closed interval ____________________.
 ____________________ is used to find the association of the quantitative type of data.
 ____________________ is used to find the association of the qualitative type of data.
 If the data type is continuous, the association can be studied using the method of ____________________.
 State the properties of correlation.
 The ____________________ is a very valuable graphic device to show the existence of correlation between the two variables.
The value of r can be computed using the relation ____________________.
The standard error of r = ____________________.
 The relationship for computing the ____________________ is [0.6745 *[standard error of r]].
 Define the term ____________________.
The word ____________________ was first used by ____________________ in his investigation regarding heredity.
 ____________________ is used to predict the expected value of one variable if the value for another one is given.
 ____________________ is used to express the relationship exists between any two variables in the form of a linear equation.
 The structure of the regression equation can be given as ____________________.
 Both the regression coefficients b_{xy} and b_{yx} should be of ____________________.
 same sign
 opposite in sign
 none
When the covariance is positive, the values of both ____________________ and ____________________ are positive.
ANSWERS
 Bivariate or multi variate analysis
 Sir Francis Galton and Karl Pearson
 Correlation
 simple correlation, rank correlation and group correlation
[–1 ≤ r ≤ 1]
 Simple correlation.
 Rank correlation
 Group correlation
 Refer Section 7.2.6
 Scatter diagram
 [Covariance/{SD[x] * SD[y]}]
 Probable error
 Rank correlation
 Regression and Sir Francis Galton
 Regression
 Regression
 Same sign
 b_{yx} and b_{xy}