ANOVA was developed by Prof. R. A. Fisher, who made extensive use of this statistical technique in his agricultural experiments. In statistics, the term ANOVA means analysis of variance, one of the most widely used advanced statistical techniques in business, the physical sciences, agricultural experimentation and economic theory. In the study of ANOVA, our aim is to find out whether there is a difference among the various population means.
The technique of ANOVA is to split the total variation into component variations due to independent factors, where each component gives an estimate of the population variance.
The total variation is split up into the following two parts:
(i) Variance between samples,
(ii) Variance within samples.
Classification Model:
We use two techniques, in which “ANOVA” can be studied:
(i) One way classification model,
(ii) Two way classification model.
Important:
The following terms are generally used in ANOVA:
(i) Treatment:
An object, cause or procedure whose effect is measured and compared is known as a treatment. Sets of drugs, fertilizers etc. are examples of treatments in their different fields of experimentation.
(ii) Experimental Unit:
It is the smallest division of the experimental material to which a treatment is applied and on which the variable under study is measured. The set of all the experimental units is known as the field or the experimental material.
(iii) Blocks:
In this study, the total experimental material is subdivided into various strata which are more uniform amongst themselves than the field as a whole. These strata are known as blocks.
(iv) Yields:
The experimental results obtained from the different plots are measured and are termed as yields.
(v) Replication:
The repetition of the treatments under comparison is known as replication.
(i) ANOVA-One Way Classification:
In the investigation of analysis of variance (ANOVA), if we consider the influence of only one factor, then it is called the one-way classification ANOVA model. It is designed to test the null hypothesis that the arithmetic means of the populations from which the k samples are randomly drawn are equal to one another.
There are two techniques for analysis of the one-way classification model:
(i) Direct Method
(ii) Short-Cut Method
(i) Direct Method:
The null hypothesis is:
H0: µ1 = µ2 = µ3 = … = µk
where µi, i = 1, 2, 3, …, k, are the arithmetic means of the populations from which the k samples are randomly drawn; i.e., the null hypothesis H0 states that the arithmetic means of the populations from which the k samples are randomly drawn are equal to one another.
(a) Sum of Squares of Variation amongst the Columns (Samples), SSC:
It is the sum of the squares of the deviations of the column (group) means from the grand mean, obtained as follows:
(i) Obtain the mean of each column, i.e., X̅1, X̅2, X̅3, …
(ii) Calculate the grand mean, i.e., the mean of all the items of all the samples taken together:
Grand mean = (N1X̅1 + N2X̅2 + … + NkX̅k)/(N1 + N2 + … + Nk)
(iii) Evaluate the deviation of sample means from the grand mean; square these deviations and multiply by the number of items in samples; find the sum of these figures.
(iv) Divide SSC by the degrees of freedom which is C-1, where C denotes the number of samples. This is the variance between the samples. It is indicated by MSC (Mean Sum of Squares of Columns).
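The steps above can be sketched in Python. The yield figures below are made up purely for illustration, three samples of four fields each:

```python
# Hypothetical data: 3 samples (columns), e.g. yields under 3 fertilizers.
samples = [
    [6, 7, 3, 8],
    [5, 5, 3, 7],
    [5, 4, 3, 4],
]

n = [len(s) for s in samples]                    # items in each sample
col_means = [sum(s) / len(s) for s in samples]   # step (i): column means
grand_mean = sum(sum(s) for s in samples) / sum(n)  # step (ii): grand mean

# Step (iii): squared deviation of each column mean from the grand mean,
# multiplied by the number of items in that column, then summed.
ssc = sum(ni * (m - grand_mean) ** 2 for ni, m in zip(n, col_means))

# Step (iv): divide by C - 1 degrees of freedom to get MSC.
msc = ssc / (len(samples) - 1)
```

With these figures the column means are 6, 5 and 4, the grand mean is 5, SSC = 8 and MSC = 4.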
(b) Sum of Squares of Variation within the Columns (SSE):
The variation within the samples would be on account of chance, since the same treatment is given to all the items in a sample (the same fertilizer is being used on the sampled fields). The difference in the values of different items in a sample, being due to chance, gives an estimate of the error. It is represented by SSE (sum of squares of variations from the means of the series).
We compute it in the following steps:
(i) Calculate the mean of the sample.
(ii) Calculate the deviation of various items of the sample from the mean value of the sample, square these deviations and obtain their total.
(iii) Repeat this process for all samples and total the sum of squares of the deviation of the various samples from their respective means. This would be the value of SSE (Sum of Squares of Variations within the Columns).
(iv) Divide SSE by the degrees of freedom, which would be (n1 – 1) + (n2 – 1) + (n3 – 1) + … or N – K, where N is the total number of items in all the samples and K is the number of samples. This could also be written N – C, as the number of columns equals the number of samples. This would be the variance within the samples, or the variance due to chance. It is indicated by MSE (Mean Sum of Squares of Error).
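Continuing with the same made-up figures, the within-sample steps look like this:

```python
samples = [
    [6, 7, 3, 8],
    [5, 5, 3, 7],
    [5, 4, 3, 4],
]

# Steps (i)-(iii): for each sample, sum the squared deviations of its
# items from that sample's own mean, then total across samples.
sse = 0.0
for s in samples:
    mean = sum(s) / len(s)
    sse += sum((x - mean) ** 2 for x in s)

# Step (iv): divide by N - C degrees of freedom.
N = sum(len(s) for s in samples)   # total number of items
C = len(samples)                   # number of samples (columns)
mse = sse / (N - C)
```

Here SSE = 14 + 8 + 2 = 24, and MSE = 24/9 ≈ 2.67.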
(c) Total Sum of Square of Variations (SST):
It is found by adding the sum of squares of deviations between the samples and the sum of squares of deviations within the samples, i.e., SST = SSC + SSE.
We calculate it in the following steps:
(i) Calculate the grand mean or the mean of the items of all the samples.
(ii) Calculate the deviation of all the items of all the samples from the grand mean; square these deviations and obtain the total of these figures.
(iii) Divide SST by the degrees of freedom, which is N – 1, where N stands for the total number of items in all the samples taken together. This would be the total variance. It is not necessary to calculate the total variance, as the F-ratio is determined on the basis of the variance between the samples and the variance within the samples.
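With the same illustrative figures, the total sum of squares can be computed directly from the grand mean, and it checks against SSC + SSE:

```python
samples = [
    [6, 7, 3, 8],
    [5, 5, 3, 7],
    [5, 4, 3, 4],
]
all_items = [x for s in samples for x in s]
grand_mean = sum(all_items) / len(all_items)

# Steps (i)-(ii): squared deviation of every item from the grand mean.
sst = sum((x - grand_mean) ** 2 for x in all_items)

# Step (iii): total variance, if required, uses N - 1 degrees of freedom.
total_variance = sst / (len(all_items) - 1)
```

For these data SST = 32, which equals SSC + SSE = 8 + 24 from the two previous steps.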
(d) Computation of Variance Ratio:
Variance ratio or F is the ratio between greater variance and smaller variance.
Generally F-Ratio is given below:
F = Variance between the samples/Variance within the samples = MSC/MSE
The computed value of F is compared with the critical (tabulated) value of F to draw inferences. One should be very careful in consulting the table containing the critical values of F. These values are given for various levels of significance on the basis of the degrees of freedom for the greater and smaller variances. If the calculated value of F is greater than the critical value, the null hypothesis is rejected and the difference between the means is said to be significant, i.e., we accept the alternative hypothesis that the population means are not all equal.
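Putting the pieces together, the F-ratio for the illustrative data works out as follows (the comparison with the tabulated value is left as a comment, since the critical value depends on the chosen level of significance):

```python
samples = [
    [6, 7, 3, 8],
    [5, 5, 3, 7],
    [5, 4, 3, 4],
]
n = [len(s) for s in samples]
N, C = sum(n), len(samples)
grand_mean = sum(sum(s) for s in samples) / N

# Variance between the samples (MSC).
col_means = [sum(s) / len(s) for s in samples]
ssc = sum(ni * (m - grand_mean) ** 2 for ni, m in zip(n, col_means))
msc = ssc / (C - 1)

# Variance within the samples (MSE).
sse = sum(sum((x - sum(s) / len(s)) ** 2 for x in s) for s in samples)
mse = sse / (N - C)

f_ratio = msc / mse
# Compare f_ratio with the tabulated F for (C - 1, N - C) degrees of
# freedom at the chosen level of significance; reject H0 if it is larger.
```

Here F = 4 / (24/9) = 1.5, which would be compared with the critical F for (2, 9) degrees of freedom.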
(ii) Short-Cut Method:
The above direct method of computing the F-ratio is very tedious and time-consuming. As such, it is always advisable to use a short-cut method.
The different steps in calculation of F-ratio by the short-cut method are as follows:
(i) Calculate the sum of the values of all the items of all the samples. It is denoted by T.
(ii) Calculate the correction factor, which is equal to T²/N, where N is the total number of items in all the samples.
(iii) Calculate the square of all the items (not deviation) of all the samples and add them together.
(iv) Find the total sum of squares (SST) by subtracting the correction factor from the sum of squares of all the items of the samples, i.e., (iii) – (ii).
(v) Find out the sum of squares between the samples (SSB) by the following/method:
(a) Square the total of each sample, divide it by the number of items in that sample, and add these figures together.
(b) Subtract the correction factor from (a). The resulting figure would be the sum of squares between samples (SSB).
(vi) Find the sum of squares within samples (SSE) by subtracting SSB from SST.
(vii) Set up the table of analysis of variance and calculate F.
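The short-cut steps can be sketched with the same made-up figures used earlier; note that they yield exactly the same F as the direct method:

```python
samples = [
    [6, 7, 3, 8],
    [5, 5, 3, 7],
    [5, 4, 3, 4],
]
N = sum(len(s) for s in samples)
C = len(samples)

T = sum(sum(s) for s in samples)                    # step (i): grand total
cf = T ** 2 / N                                     # step (ii): correction factor
sum_sq = sum(x ** 2 for s in samples for x in s)    # step (iii): sum of squared items
sst = sum_sq - cf                                   # step (iv): SST
ssb = sum(sum(s) ** 2 / len(s) for s in samples) - cf  # step (v): SSB
sse = sst - ssb                                     # step (vi): SSE

f_ratio = (ssb / (C - 1)) / (sse / (N - C))         # step (vii)
```

With these data T = 60, the correction factor is 300, SST = 32, SSB = 8, SSE = 24 and F = 1.5, matching the direct method.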
Remarks:
(i) Sum of all the items of all samples: T = the grand total of all observations, and correction factor = T²/N, where N is the total number of items in all samples.
(ii) Sum of squares between the samples: SSB = (T1²/n1 + T2²/n2 + … + Tk²/nk) – T²/N, where Tj is the total of the j-th sample and nj the number of items in it.
(ii) ANOVA Two-Way Classification (Manifold Classification):
In two-way classification, we study the influence of two different factors, not one, on the various sample groups. Here, the data are classified according to two different factors. For example, fertilizers may be tried on different soil textures. Thus, with the fertilizers in the column classification, the various types of soil texture may be in the rows. But there may be sampling variations besides the two factors considered, which we call ‘residual variations’.
The sum of squares of variations in columns (SSC) plus the sum of squares of variations in rows (SSR) plus the sum of squares of residual variations due to error (SSE) make up the total sum of squares of variations (SST), i.e.,
SST = SSC + SSR + SSE
The total number of degrees of freedom = CR – 1, where C and R refer to the number of columns and rows respectively.
Degrees of freedom between columns = C – 1
Degrees of freedom between rows = R – 1
Degrees of freedom for residuals = (C – 1) (R – 1)
In a two-way classification, one should be careful in finding out the degrees of freedom. For the column totals it is C – 1, for the row totals it is R – 1, and for the residuals it is (C – 1)(R – 1). While calculating the F-ratios for the columns and the rows, the degrees of freedom for the numerators may not be the same. Only if the numbers of columns and rows are equal would the degrees of freedom for the two numerators be the same.
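A minimal sketch of the two-way computation, using an invented 3×3 table (rows as soil textures, columns as fertilizers, one observation per cell):

```python
# Hypothetical yields: 3 soil textures (rows) x 3 fertilizers (columns).
table = [
    [4, 6, 8],
    [5, 7, 9],
    [6, 8, 7],
]
R, C = len(table), len(table[0])
N = R * C

T = sum(sum(row) for row in table)          # grand total
cf = T ** 2 / N                             # correction factor

sst = sum(x ** 2 for row in table for x in row) - cf
row_totals = [sum(row) for row in table]
col_totals = [sum(table[r][c] for r in range(R)) for c in range(C)]
ssr = sum(t ** 2 for t in row_totals) / C - cf   # between rows
ssc = sum(t ** 2 for t in col_totals) / R - cf   # between columns
sse = sst - ssr - ssc                            # residual

# Degrees of freedom: they sum to the total CR - 1.
df_c, df_r, df_e = C - 1, R - 1, (C - 1) * (R - 1)
f_columns = (ssc / df_c) / (sse / df_e)
f_rows = (ssr / df_r) / (sse / df_e)
```

For this table SST = 20 splits into SSC = 14, SSR = 2 and SSE = 4, and the degrees of freedom 2 + 2 + 4 add up to CR – 1 = 8 as stated above.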