Variation indicators. When studying a characteristic that varies among the units of a population, one cannot limit oneself to calculating only the average of the individual values, since the same average may describe populations of very different composition.

Variation of a characteristic is the difference in the individual values of that characteristic within the population being studied.

The term “variation” comes from the Latin variatio – change, fluctuation, difference. However, not all differences are usually called variation.

In statistics, variation is understood as quantitative changes in the value of the characteristic under study within a homogeneous population, caused by the combined influence of various factors. The variability of individual values is characterized by variation indicators. The greater the variation, the further apart the individual values lie from one another on average.

Variation is measured by absolute and relative indicators.

Absolute indicators include the range of variation, the average linear deviation, the standard deviation and the variance (dispersion). All of them except the variance have the same dimension as the quantity being studied; the variance is expressed in squared units.

Relative indicators include the coefficient of oscillation, the relative linear deviation and the coefficient of variation.

Absolute indicators. Let us calculate the absolute indicators that characterize the variation of a characteristic.

The range of variation is the difference between the maximum and minimum values of a characteristic.

R = Xmax – Xmin.

The range is not always informative, since it takes into account only the extreme values of the characteristic, which may differ greatly from all other units.

Variation in a series can be determined more accurately with indicators that take into account the deviations of all variants from the arithmetic mean.

There are two such indicators in statistics: the average linear deviation and the standard deviation.

The average linear deviation (L) is the arithmetic mean of the absolute values of the deviations of individual variants from the mean:

$L = \frac{\sum |x_i - \bar{x}| f_i}{\sum f_i}$

In practice, this indicator is used to analyze the composition of the workforce, the rhythm of production and the uniformity of material supplies.

Its disadvantage is that working with absolute values complicates probabilistic calculations and the use of methods of mathematical statistics.

The standard deviation (σ) is the most common and widely accepted measure of variation. It is somewhat larger than the average linear deviation. For moderately asymmetric distributions, the following approximate relationship between them holds:

$\sigma \approx 1.25\,L$

To calculate it, each deviation from the mean is squared, the squares are summed (taking the weights into account), the sum is divided by the number of terms of the series (the sum of the weights), and the square root of the quotient is taken.

All these steps are expressed by the formula

$\sigma = \sqrt{\frac{\sum (x_i - \bar{x})^2 f_i}{\sum f_i}}$

i.e. the standard deviation is the square root of the arithmetic mean of the squared deviations from the mean.

The standard deviation is a measure of the reliability of the mean. The smaller σ is, the better the arithmetic mean represents the whole population.

The arithmetic mean of the squared deviations of the values of a characteristic from their mean is called the variance, or dispersion (σ²), and is calculated by the formulas

$\sigma^2 = \frac{\sum (x_i - \bar{x})^2}{n}$ (unweighted), $\sigma^2 = \frac{\sum (x_i - \bar{x})^2 f_i}{\sum f_i}$ (weighted).

A distinctive feature of this indicator is that squaring the deviations $(x_i - \bar{x})$ reduces the share of small deviations and increases the share of large ones in the total sum of deviations.
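The absolute indicators described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the dictionary keys and the sample data are hypothetical, and population formulas (division by the sum of the weights) are used throughout.

```python
from math import sqrt


def absolute_indicators(values, weights=None):
    """Range, average linear deviation, variance and standard deviation.

    `weights` are optional frequencies; if omitted, every value counts once.
    """
    if weights is None:
        weights = [1] * len(values)
    total = sum(weights)
    mean = sum(x * f for x, f in zip(values, weights)) / total
    mean_linear = sum(abs(x - mean) * f for x, f in zip(values, weights)) / total
    variance = sum((x - mean) ** 2 * f for x, f in zip(values, weights)) / total
    return {
        "range": max(values) - min(values),
        "mean_linear_deviation": mean_linear,
        "variance": variance,
        "std": sqrt(variance),
    }


# Hypothetical data: five observations of some characteristic.
result = absolute_indicators([2, 4, 4, 6, 9])
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

For this sample the standard deviation comes out larger than the average linear deviation, in line with the remark in the text.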

The variance has a number of properties, some of which simplify its calculation:

1. The variance of a constant is 0.

If $x_i = c$ for all i, then $\bar{x} = c$ and $x_i - \bar{x} = 0$.

Then $\sigma^2 = 0$.

2. If all values of the characteristic (x) are reduced by the same number, the variance does not change.

Let $x'_i = x_i - A$; then, by the properties of the arithmetic mean, $\bar{x}' = \bar{x} - A$ and $x'_i - \bar{x}' = x_i - \bar{x}$.

The variance of the new series is

$\sigma'^2 = \frac{\sum (x'_i - \bar{x}')^2 f_i}{\sum f_i} = \frac{\sum (x_i - \bar{x})^2 f_i}{\sum f_i} = \sigma^2$

i.e. the variance of the new series is equal to the variance of the original series.

3. If all values of the characteristic are reduced by the same factor (k times), the variance decreases by a factor of k².

Let $x'_i = x_i / k$; then $\bar{x}' = \bar{x} / k$ and $x'_i - \bar{x}' = (x_i - \bar{x}) / k$.

The variance of the new series is

$\sigma'^2 = \frac{\sum (x'_i - \bar{x}')^2 f_i}{\sum f_i} = \frac{1}{k^2} \cdot \frac{\sum (x_i - \bar{x})^2 f_i}{\sum f_i} = \frac{\sigma^2}{k^2}$

4. The variance calculated about the arithmetic mean is minimal. The mean square of deviations calculated about an arbitrary number A exceeds the variance about the arithmetic mean by the square of the difference between the mean and that number, i.e. $\sigma_A^2 = \sigma^2 + (\bar{x} - A)^2$. The variance about the mean is therefore always smaller than the mean square of deviations about any other value. In the special case A = 0, when no deviations need to be calculated, the formula takes the following form:

$\sigma^2 = \overline{x^2} - (\bar{x})^2$
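The four properties can be checked numerically. The sketch below is an illustration only: the helper names and the sample series are hypothetical, and the unweighted population variance is used.

```python
from math import isclose


def variance(values):
    """Population variance: mean squared deviation from the arithmetic mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


def mean_square_about(values, a):
    """Mean squared deviation of the values from an arbitrary number a."""
    return sum((x - a) ** 2 for x in values) / len(values)


xs = [3.0, 5.0, 8.0, 12.0]           # hypothetical series
mean_x = sum(xs) / len(xs)
base = variance(xs)

# 1. The variance of a constant is zero.
assert variance([7.0] * 4) == 0.0
# 2. Subtracting the same number from every value leaves the variance unchanged.
assert isclose(variance([x - 2.0 for x in xs]), base)
# 3. Dividing every value by k divides the variance by k squared.
k = 4.0
assert isclose(variance([x / k for x in xs]), base / k ** 2)
# 4. Deviations about any other number A give a larger mean square.
a = 10.0
assert isclose(mean_square_about(xs, a), base + (mean_x - a) ** 2)
# Special case A = 0: variance = mean of squares minus square of the mean.
assert isclose(base, mean_square_about(xs, 0.0) - mean_x ** 2)
```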

The calculation of variation indicators for quantitative characteristics was discussed above, but in economic calculations it may also be necessary to assess the variation of qualitative characteristics. For example, when studying the quality of manufactured products, products can be divided into high-quality and defective.

In this case we speak of alternative characteristics.

Alternative characteristics are those that some units of the population possess and others do not, for example, industrial experience among applicants or an academic degree among university teachers. The presence of the characteristic in a unit is conventionally denoted by 1 and its absence by 0. If the proportion of units possessing the characteristic (in the total number of units) is denoted by p, and the proportion of units not possessing it by q, the variance of the alternative characteristic can be calculated by the general rule. Here p + q = 1 and, therefore, q = 1 – p.

First, we calculate the average value of the alternative characteristic:

$\bar{x} = \frac{1 \cdot p + 0 \cdot q}{p + q} = p$

i.e. the average value of an alternative characteristic is equal to the proportion of units possessing this characteristic.

The variance of the alternative characteristic is:

$\sigma^2 = \frac{(1 - p)^2 p + (0 - p)^2 q}{p + q} = q^2 p + p^2 q = pq(p + q) = pq$

Thus, the variance of an alternative characteristic is equal to the product of the proportion of units possessing the characteristic and the proportion of units not possessing it.

The standard deviation is $\sigma = \sqrt{pq}$.
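As a quick check of the result for alternative characteristics, the following sketch computes the variance of a 0/1 series directly and compares it with the product pq. The function name and the data (90 high-quality items and 10 defective ones) are hypothetical.

```python
from math import isclose, sqrt


def alternative_stats(flags):
    """Return (p, q, variance, std) for a series of 0/1 flags."""
    flags = list(flags)
    n = len(flags)
    p = sum(flags) / n                      # share of units with the characteristic
    q = 1 - p                               # share of units without it
    variance = sum((x - p) ** 2 for x in flags) / n   # general variance formula
    return p, q, variance, sqrt(variance)


# Hypothetical batch: 1 = high-quality item, 0 = defective item.
flags = [1] * 90 + [0] * 10
p, q, var_direct, std = alternative_stats(flags)

assert isclose(var_direct, p * q)           # variance equals p * q
assert isclose(std, sqrt(p * q))            # standard deviation equals sqrt(p * q)
print(p, q, round(var_direct, 4), round(std, 4))
```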

Relative indicators. When comparing the variability of different characteristics in the same population, or of the same characteristic in several populations, variation indicators expressed in relative terms are of interest. The basis for comparison is the arithmetic mean. These indicators are calculated as the ratio of the range of variation, the average linear deviation or the standard deviation to the arithmetic mean or the median.

Most often they are expressed as a percentage and serve not only for a comparative assessment of variation but also as a measure of the homogeneity of the population. The population is considered homogeneous if the coefficient of variation does not exceed 33%. The following relative indicators of variation are distinguished:

1. The oscillation coefficient reflects the relative fluctuation of the extreme values of a characteristic around the mean: $V_R = \frac{R}{\bar{x}} \cdot 100\%$.

2. The relative linear deviation is the ratio of the average linear deviation to the mean: $V_L = \frac{L}{\bar{x}} \cdot 100\%$.

3. The coefficient of variation evaluates how typical the mean is: $V_\sigma = \frac{\sigma}{\bar{x}} \cdot 100\%$.

The smaller $V_\sigma$, the more homogeneous the population is with respect to the characteristic being studied and the more typical the mean. If $V_\sigma \le 33\%$, the distribution is close to normal and the population is considered homogeneous. From the above example, the second population is homogeneous.
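A minimal sketch of the three relative indicators and the 33% homogeneity rule follows. The function names, dictionary keys and both sample series are hypothetical, and unweighted population formulas are assumed.

```python
from math import sqrt


def relative_indicators(values):
    """Oscillation, relative linear deviation and coefficient of variation, in %."""
    n = len(values)
    mean = sum(values) / n
    value_range = max(values) - min(values)
    mean_linear = sum(abs(x - mean) for x in values) / n
    std = sqrt(sum((x - mean) ** 2 for x in values) / n)
    return {
        "oscillation_pct": 100 * value_range / mean,
        "relative_linear_pct": 100 * mean_linear / mean,
        "variation_pct": 100 * std / mean,
    }


def is_homogeneous(values, threshold_pct=33.0):
    """Apply the rule of thumb from the text: coefficient of variation <= 33%."""
    return relative_indicators(values)["variation_pct"] <= threshold_pct


tight = [48, 50, 51, 49, 52]     # hypothetical, values close to the mean
spread = [10, 90, 25, 70, 55]    # hypothetical, same mean, much wider spread

for name, series in (("tight", tight), ("spread", spread)):
    cv = relative_indicators(series)["variation_pct"]
    print(f"{name}: V = {cv:.1f}%, homogeneous: {is_homogeneous(series)}")
```

Both series have the same mean of 50, yet only the first passes the 33% criterion, which illustrates why the mean alone is not enough.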

Types of variances and the rule for adding variances. Along with studying the variation of a characteristic across the population as a whole, it is often necessary to trace quantitative changes in the characteristic within the groups into which the population is divided, as well as between the groups. This is done by calculating and analyzing different types of variance.

In this case, three indicators of the variability of a characteristic in the population can be determined:

1. The total variation of the population, which results from the action of all causes. It is measured by the total variance (σ²), which characterizes the deviations of individual values of the characteristic from the overall mean:

$\sigma^2 = \frac{\sum (x_i - \bar{x})^2 f_i}{\sum f_i}$

2. The variation of group means, which expresses the deviations of the group means from the overall mean and reflects the influence of the factor on which the grouping is based. It is measured by the between-group variance (δ²):

$\delta^2 = \frac{\sum (\bar{x}_j - \bar{x})^2 n_j}{\sum n_j}$

where $\bar{x}_j$ are the group means, $\bar{x}$ is the overall mean for the whole population, and $n_j$ is the number of units in each group.

3. Residual (or within-group) variation, which is expressed in the deviation of individual values of the characteristic in each group from their group mean and therefore reflects the influence of all factors other than the one underlying the grouping. Since the variation in each group is reflected by the group variance

$\sigma_j^2 = \frac{\sum (x_{ij} - \bar{x}_j)^2}{n_j}$

for the whole population the residual variation is reflected by the average of the group variances. This is called the average of the within-group variances ($\overline{\sigma^2}$) and is calculated by the formula

$\overline{\sigma^2} = \frac{\sum \sigma_j^2 n_j}{\sum n_j}$

The three variances are related by the equality

$\sigma^2 = \overline{\sigma^2} + \delta^2$

This equality, which has a strict mathematical proof, is known as the rule of adding variances.

The rule makes it possible to find the total variance from its components when the individual values of the characteristic are unknown and only group indicators are available.
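The rule of adding variances can be verified on a small grouped data set. The sketch below is illustrative only: the function names and the two groups of values are hypothetical.

```python
from math import isclose


def population_variance(values):
    """Mean squared deviation from the mean (division by n)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


def decompose_variance(groups):
    """Return (total, between_group, average_within_group) variances."""
    pooled = [x for group in groups for x in group]
    n = len(pooled)
    overall_mean = sum(pooled) / n
    total = population_variance(pooled)
    # Between-group variance: squared deviations of group means, weighted by group size.
    between = sum(len(g) * (sum(g) / len(g) - overall_mean) ** 2 for g in groups) / n
    # Average of within-group variances, weighted by group size.
    within = sum(len(g) * population_variance(g) for g in groups) / n
    return total, between, within


groups = [[10, 12, 14], [20, 22, 24, 26]]   # hypothetical grouped observations
total, between, within = decompose_variance(groups)

# Rule of adding variances: total = average within-group + between-group.
assert isclose(total, within + between)
print(round(total, 3), round(between, 3), round(within, 3))
```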

Coefficient of determination. The rule of adding variances makes it possible to measure the dependence of a result on a given factor using the coefficient of determination:

$\eta^2 = \frac{\delta^2}{\sigma^2}$

It characterizes the influence of the characteristic underlying the grouping on the variation of the resulting characteristic. Its square root, the empirical correlation ratio η, varies from 0 to 1. If η = 0, the grouping characteristic does not affect the resulting one. If η = 1, the resulting characteristic changes only depending on the characteristic underlying the grouping, and the influence of other factors is zero.
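The two boundary cases mentioned above can be reproduced with a short sketch. It assumes the coefficient of determination is the share of between-group variance in the total variance; the function names and the sample groups are hypothetical.

```python
from math import sqrt


def _variance(values):
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


def determination(groups):
    """Return (eta_squared, eta) for a list of groups of observations."""
    pooled = [x for group in groups for x in group]
    n = len(pooled)
    total = _variance(pooled)
    if total == 0:
        raise ValueError("the characteristic does not vary at all")
    overall_mean = sum(pooled) / n
    between = sum(len(g) * (sum(g) / len(g) - overall_mean) ** 2 for g in groups) / n
    eta_squared = between / total
    return eta_squared, sqrt(eta_squared)


# Hypothetical cases:
print(determination([[5, 5], [9, 9]]))                  # grouping explains everything
print(determination([[1, 3], [2, 2]]))                  # grouping explains nothing
print(determination([[10, 12, 14], [20, 22, 24, 26]]))  # intermediate case
```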

Indicators of asymmetry and kurtosis. In economic phenomena, strictly symmetrical series are extremely rare; more often one has to deal with asymmetrical series.

In statistics, several indicators are used to characterize asymmetry. Since in a symmetric series the arithmetic mean coincides with the mode and the median, the simplest indicator of asymmetry (As) is the difference between the arithmetic mean and the mode, i.e.

$As = \bar{x} - Mo$

The value of kurtosis is calculated by the formula

$Ex = \frac{\mu_4}{\sigma^4} - 3$

where $\mu_4$ is the fourth-order central moment. If Ex > 0, the kurtosis is considered positive (the distribution is peaked); if Ex < 0, the kurtosis is considered negative (the distribution is flat-topped).

The coefficient of variation (CV, also denoted VAR) is a key indicator for assessing the risk of projects and the return on securities. It makes it possible to compare in advance two indicators whose values change over time. If the coefficient is below 0.1, the investment is characterized by a low level of risk. If it is above 0.3, the level of risk is unreasonably high. The calculation is most conveniently done with the STDEVA and AVERAGE functions of the Excel spreadsheet editor.


In order to form a high-quality investment portfolio, investors sometimes have to resort to evaluating the assets included in it, which have different levels of risk and return. For this purpose, an indicator widely known in investment analysis and econometrics is used.

The coefficient of variation (CV, VAR) is a relative financial indicator that compares the dispersion of the values of two random indicators, measured in different units, relative to their expected values.

Reference! Since the coefficient of variation yields comparable results, it is best suited to portfolio analysis, where it combines the risk and return values into a single resulting figure.

The coefficient of variation is a relative statistical indicator which, like NPV and IRR, is used in investment analysis. It is measured as a percentage and can be used to compare the variation of two unrelated criteria. It is most often used by financial and investment analysts.

Reference! The coefficient of variation is used to estimate the so-called “unitized risk”, since it evaluates the relative spread of two indicators in relation to the predicted value.

What is VAR used for?

  • for the purpose of comparing two different indicators;
  • to determine the degree of stability of forecast models (mainly for investments and portfolio investment);
  • to perform XYZ analysis.

Reference! XYZ analysis is an analytical tool within which a company’s products are assessed according to two parameters: stability of consumption and sales.

Formula for calculating the coefficient of variation

To calculate the coefficient of variation for a set of values, first calculate the standard deviation, then the arithmetic mean, and then find their ratio.

In general, the formula for calculating VAR is as follows:

CV = σ / t_avg, where:

CV - coefficient of variation;

σ - standard deviation;

t_avg - the arithmetic mean of the random variable.

The formula for calculating the VAR indicator can take on a wide variety of interpretations depending on the object being assessed.

Important point! It is obvious that applying the above formulas manually, especially when there is a wide range of values, is very difficult. That is why the Excel spreadsheet editor is used for calculations.

VAR values in investment analysis

There is no standard value for this indicator. However, there are some reference criteria that help in its analysis and interpretation.

Important point! The CV coefficient has several disadvantages: it does not take into account the size of the initial investment, it assumes that values are scattered symmetrically around the mean, and it cannot be used for options whose return may be less than 0. Therefore, if in doubt, it is worth additionally using the IRR and NPV indicators.

Examples of VAR calculation in Excel

Calculating the coefficient of variation manually is a complex and time-consuming procedure. If the sample is large, calculating its standard deviation by hand is highly prone to errors and inaccuracies.

A convenient way to determine VAR is offered by the Excel spreadsheet editor. With it you can calculate:

  • standard deviation (STDEVA function);
  • arithmetic mean (AVERAGE function).

In order to understand the intricacies of using CV, it makes sense to give an example of its calculation.

Calculation example: evaluation of two projects with different profits

There are two businesses that have shown different financial results over 5 years. To choose between them, an investor should calculate the coefficient of variation for each.

First, calculate the standard deviation using the Excel statistical function STDEV.S.

Similarly, using the statistical function AVERAGE, calculate the arithmetic mean for both projects.

After this, it remains to divide the standard deviation by the arithmetic mean to obtain the coefficient of variation.

Conclusion! For project A, the risk level turned out to be 40%. In this situation, it seems risky and unstable. For Project B, the risk level is acceptable - only 11.64%. It is appropriate for an investor to invest in a more reliable project B, although in certain periods project A brings greater profits.
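The same three steps (standard deviation, arithmetic mean, their ratio) can be sketched outside Excel. The profit figures below are hypothetical and are not the data behind the 40% and 11.64% quoted above; `statistics.stdev` is used as the sample standard deviation, which is assumed here to play the role of the spreadsheet function.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the arithmetic mean."""
    average = mean(values)
    if average <= 0:
        # The text notes that CV is not meaningful when returns can be non-positive.
        raise ValueError("coefficient of variation requires a positive mean")
    return stdev(values) / average


# Hypothetical annual profits of two projects over 5 years.
project_a = [100, 160, 60, 180, 100]
project_b = [110, 120, 105, 125, 115]

cv_a = coefficient_of_variation(project_a)
cv_b = coefficient_of_variation(project_b)
print(f"Project A: CV = {cv_a:.2%}")
print(f"Project B: CV = {cv_b:.2%}")
```

With these made-up figures project A falls above the 0.3 threshold and project B below 0.1, so the comparison leads to the same kind of conclusion as in the example.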




Goal of the work: to gain practical skills in calculating various indicators (measures) of variation depending on the objectives of the study.

Work order:

1. Determine the type and form (simple or weighted) of variation indicators.

2. Calculate the variation indicators.

3. Formulate conclusions.

1. Determination of the type and form of variation indicators.

Variation indicators are divided into two groups: absolute and relative. The absolute ones include the range of variation, quartile deviation, average linear deviation, variance and standard deviation. The relative ones are the coefficients of oscillation and variation, the relative linear deviation, the relative quartile variation, etc.

The range of variation (R) is the simplest measure of variation of a characteristic and is determined by the formula:

$R = x_{max} - x_{min}$

where $x_{max}$ is the largest value of the varying characteristic;

$x_{min}$ – the smallest value of the varying characteristic.

The quartile deviation (Q) is used to characterize the variation of a characteristic in a population. It can be used instead of the range of variation to avoid the drawbacks of relying on extreme values.

$Q = \frac{Q_3 - Q_1}{2}$

where $Q_1$ and $Q_3$ are the first and third quartiles of the distribution, respectively.

Quartiles are the values of the characteristic in the ranked distribution series chosen so that 25% of the population units are smaller than $Q_1$; 25% lie between $Q_1$ and $Q_2$; 25% lie between $Q_2$ and $Q_3$; and the remaining 25% exceed $Q_3$.

Quartiles 1 and 3 are determined by the formulas:

$Q_1 = x_{Q_1} + h \cdot \frac{\frac{1}{4}\sum f - S_{Q_1 - 1}}{f_{Q_1}}$

where $x_{Q_1}$ is the lower limit of the interval in which the first quartile is located;

$h$ – the width of that interval;

$S_{Q_1 - 1}$ – the sum of the accumulated frequencies of the intervals preceding the interval in which the first quartile is located;

$f_{Q_1}$ – the frequency of the interval in which the first quartile is located.

$Q_2 = Me$

where Me is the median of the series;

$Q_3 = x_{Q_3} + h \cdot \frac{\frac{3}{4}\sum f - S_{Q_3 - 1}}{f_{Q_3}}$

The symbols are the same as for $Q_1$.

In symmetric or moderately asymmetric distributions Q ≈ 2/3 σ. Since the quartile deviation does not take into account the deviations of all values of the characteristic, its use should be limited to cases where determining the standard deviation is difficult or impossible.
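The interval formulas for quartiles can be sketched as a single helper that interpolates inside the interval containing the required share of the accumulated frequencies. The function name is hypothetical. The interval bounds are those of the bank example later in the text; the frequencies are an assumption (only the first group's frequency of 4 can be read from the calculation table), chosen so that the median agrees with the value Me = 588.818 quoted there.

```python
def grouped_quantile(lower_bounds, width, freqs, share):
    """Quantile of an interval series with equal interval width.

    share = 0.25 gives Q1, 0.5 the median (Q2), 0.75 gives Q3.
    """
    target = share * sum(freqs)
    cumulative = 0                      # accumulated frequency of preceding intervals
    for lower, f in zip(lower_bounds, freqs):
        if f > 0 and cumulative + f >= target:
            return lower + width * (target - cumulative) / f
        cumulative += f
    raise ValueError("share must be between 0 and 1")


lower_bounds = [375, 459, 543, 627, 711]   # interval lower limits, million rubles
width = 84                                  # interval width
freqs = [4, 5, 11, 7, 3]                    # assumed numbers of banks per interval

q1 = grouped_quantile(lower_bounds, width, freqs, 0.25)
q2 = grouped_quantile(lower_bounds, width, freqs, 0.50)
q3 = grouped_quantile(lower_bounds, width, freqs, 0.75)
quartile_deviation = (q3 - q1) / 2
print(q1, round(q2, 3), q3, quartile_deviation)
```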

The average linear deviation (L) is the average of the absolute deviations of the values of the characteristic from their mean. It can be calculated by the arithmetic mean formula, either unweighted or weighted, depending on whether the distribution series has frequencies.

$L = \frac{\sum |x_i - \bar{x}|}{n}$ - unweighted average linear deviation,

$L = \frac{\sum |x_i - \bar{x}| f_i}{\sum f_i}$ - weighted average linear deviation.

The variance (σ²) is the mean square of the deviations of individual values of a characteristic from their mean. It is calculated by the unweighted and weighted formulas.

$\sigma^2 = \frac{\sum (x_i - \bar{x})^2}{n}$ - unweighted,

$\sigma^2 = \frac{\sum (x_i - \bar{x})^2 f_i}{\sum f_i}$ - weighted.

The standard deviation (σ), the most common indicator of variation, is the square root of the variance: $\sigma = \sqrt{\sigma^2}$.

The range of variation, the quartile deviation, and the average linear and standard deviations are named quantities and have the dimension of the characteristic being averaged. The variance is expressed in the squared units of the characteristic and therefore has no directly interpretable unit of measurement.

To compare the variability of different characteristics in the same population, or of the same characteristic in several populations, relative indicators of variation are calculated. The basis for comparison is the arithmetic mean. Most often, relative indicators are expressed as percentages and serve not only for a comparative assessment of variation but also as a measure of the homogeneity of the population.

The oscillation coefficient (relative range of variation) is calculated by the formula:

$V_R = \frac{R}{\bar{x}} \cdot 100\%$

Linear coefficient of variation (relative linear deviation):

$V_L = \frac{L}{\bar{x}} \cdot 100\%$

Relative quartile variation index:

$V_Q = \frac{Q}{Me} \cdot 100\%$

The coefficient of variation:

$V_\sigma = \frac{\sigma}{\bar{x}} \cdot 100\%$

The most commonly used indicator of relative variability in statistics is the coefficient of variation. It is used not only for a comparative assessment of variation, but also as a characteristic of the homogeneity of the population. The greater the coefficient of variation, the greater the spread of values around the mean and the greater the heterogeneity of the population. There is a scale for determining the degree of homogeneity of a population depending on the value of the coefficient of variation (17, p. 61).

To obtain an approximate idea of the shape of the distribution, distribution graphs (polygon and histogram) are constructed.

In the practice of statistical research one encounters a wide variety of distributions. When studying homogeneous populations, we usually deal with single-peaked distributions. Several peaks indicate that the population being studied is heterogeneous; the appearance of two or more peaks means that the data should be regrouped to identify more homogeneous groups.

Determining the general nature of the distribution involves assessing the degree of its homogeneity, as well as calculating indicators of asymmetry and kurtosis. A distribution is symmetric if the frequencies of any two variants equally distant on either side of the distribution center are equal. For symmetric distributions, the arithmetic mean, mode and median are equal. The simplest indicator of asymmetry is therefore based on the relationship between the indicators of the distribution center: the greater the difference between the mean and the mode, the greater the asymmetry of the series.

To characterize the asymmetry in the central part of the distribution, that is, in the bulk of the units, or to compare the degree of asymmetry of several distributions, K. Pearson's relative asymmetry index is calculated:

$As = \frac{\bar{x} - Mo}{\sigma}$

The value of As can be positive or negative. A positive value indicates right-sided asymmetry (the right branch relative to the maximum ordinate is more elongated than the left). With right-sided asymmetry, the indicators of the distribution center are related as $Mo < Me < \bar{x}$. A negative value indicates left-sided asymmetry (Fig. 1). In this case the indicators of the distribution center are related as $\bar{x} < Me < Mo$.

Fig. 1. Distribution:

1 – with left-sided asymmetry; 2 – with right-sided asymmetry.

Another indicator, proposed by the Swedish mathematician Lindberg, is calculated by the formula:

$As = P - 50$

where P is the percentage of values of the characteristic that exceed the arithmetic mean.

The most accurate and widespread indicator is based on the third-order central moment (in a symmetric distribution its value is zero):

$As = \frac{\mu_3}{\sigma^3}$

where $\mu_3$ is the third-order central moment: $\mu_3 = \frac{\sum (x_i - \bar{x})^3 f_i}{\sum f_i}$;

σ – standard deviation.

This indicator makes it possible not only to determine the magnitude of asymmetry, but also to answer the question of whether asymmetry is present in the distribution of the characteristic in the general population. The significance of the indicator is assessed using its mean square error, which depends on the number of observations n and is calculated by the formula:

$\sigma_{As} = \sqrt{\frac{6(n - 2)}{(n + 1)(n + 3)}}$

If the ratio $\frac{|As|}{\sigma_{As}} > 3$, the asymmetry is significant and the distribution of the characteristic in the general population is not symmetric. If $\frac{|As|}{\sigma_{As}} < 3$, the asymmetry is insignificant and can be explained by the influence of random circumstances.

For symmetric distributions, the kurtosis (peakedness) indicator is calculated. Lindberg proposed the following indicator for assessing kurtosis:

$Ex = P - 38.29$

where P is the proportion (%) of the variants lying within half a standard deviation on either side of the arithmetic mean.

The most accurate indicator uses the fourth-order central moment:

$Ex = \frac{\mu_4}{\sigma^4} - 3$

where $\mu_4$ is the fourth-order central moment;

$\mu_4 = \frac{\sum (x_i - \bar{x})^4}{n}$ - for ungrouped data;

$\mu_4 = \frac{\sum (x_i - \bar{x})^4 f_i}{\sum f_i}$ - for grouped data.

Figure 2 shows two distributions: one is peaked (the kurtosis is positive), the other is flat-topped (the kurtosis is negative). Kurtosis is the extent to which the top of the empirical distribution rises above or falls below the top of the normal distribution curve. In a normal distribution the ratio $\frac{\mu_4}{\sigma^4} = 3$.

Fig. 2. Distribution:

1, 4 – normal; 2 – peaked; 3 – flat-topped

The mean square error of kurtosis is calculated by the formula:

$\sigma_{Ex} = \sqrt{\frac{24 n (n - 2)(n - 3)}{(n + 1)^2 (n + 3)(n + 5)}}$

where n is the number of observations.

If $\frac{|Ex|}{\sigma_{Ex}} > 3$, the kurtosis is significant; if $\frac{|Ex|}{\sigma_{Ex}} < 3$, it is not significant.

Assessing the significance of the asymmetry and kurtosis indicators allows us to conclude whether the empirical distribution under study can be treated as a normal distribution.
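The moment-based indicators and their significance check can be sketched as follows. The function names and both sample series are hypothetical. The mean square error formulas used here are the commonly cited normal-theory expressions; treating them as the ones intended by the text is an assumption, as is the threshold of 3 for the ratio.

```python
from math import sqrt


def central_moment(values, order):
    """Central moment of the given order for ungrouped data."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** order for x in values) / len(values)


def shape_indicators(values):
    """Moment-based asymmetry and kurtosis with their mean square errors."""
    n = len(values)
    sigma = sqrt(central_moment(values, 2))
    asymmetry = central_moment(values, 3) / sigma ** 3
    kurtosis = central_moment(values, 4) / sigma ** 4 - 3
    # Assumed normal-theory errors of the two indicators.
    asymmetry_error = sqrt(6 * (n - 2) / ((n + 1) * (n + 3)))
    kurtosis_error = sqrt(24 * n * (n - 2) * (n - 3)
                          / ((n + 1) ** 2 * (n + 3) * (n + 5)))
    return {
        "asymmetry": asymmetry,
        "kurtosis": kurtosis,
        "asymmetry_significant": abs(asymmetry) / asymmetry_error > 3,
        "kurtosis_significant": abs(kurtosis) / kurtosis_error > 3,
    }


symmetric = [1, 2, 3, 4, 5]            # hypothetical symmetric series
right_skewed = [1, 1, 2, 2, 3, 10]     # hypothetical series with a long right tail

print(shape_indicators(symmetric))
print(shape_indicators(right_skewed))
```

With so few observations neither indicator is likely to pass the significance threshold; the check becomes informative only for reasonably large n.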

2. Let's consider the methodology for calculating variation indices.

Relative indicators of variation

1. Coefficient of variation (Vσ): the ratio of the standard deviation to the arithmetic mean, $V_\sigma = \frac{\sigma}{\bar{x}}$.

A population is considered qualitatively homogeneous if the coefficient of variation does not exceed 0.33 (or 33%).

Table 5.1.3.

Scale for assessing population homogeneity

In this case, the average value of the studied characteristic can be considered a typical, reliable characteristic of a statistical population.

If the coefficient of variation is greater than 0.33 (or 33%), the variation of the characteristic under study is large, and the mean found represents the whole statistical population poorly: it is not a typical, reliable characteristic, and the population itself is heterogeneous with respect to the characteristic under consideration.

Other relative measures of variation, which are used less frequently in statistical practice, are calculated in the same way as the coefficient of variation:

2. Oscillation indicator: $V_R = \frac{R}{\bar{x}}$; (5.1.12)

3. Linear coefficient of variation: $V_L = \frac{L}{\bar{x}}$. (5.1.13)

Let us calculate the variation indicators for the running example:

Table 5.1.4.

Calculation table for finding the characteristics of the distribution series

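Columns: (1) groups of banks by volume of loan investments, million rubles; (2) middle of the interval, x; (3) number of banks, f; (4) product of variants by frequencies, col. 4 = col. 2 × col. 3; (5) deviation of the midpoint from the mean; (6) squared deviation, col. 6 = col. 5 × col. 5; (7) weighted squared deviation, col. 7 = col. 6 × col. 3.
375.00 – 459.00 | midpoint = 417 | 417 × 4 = | 417 − 585 = −168 | (−168)² = 28224 | 28224 × 4 =
459.00 – 543.00 ? ? ? ?
543.00 – 627.00 ? ? ? ?
627.00 – 711.00 ? ? ? ?
711.00 – 795.00 ? ? ? ?
Total ? X X ?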

Calculation of the arithmetic weighted average:

Variance calculation:


Calculation of standard deviation:

Calculation of coefficient of variation:

Conclusion. Analysis of the obtained values of $\bar{x}$ and σ suggests that the average volume of bank credit investments is _______? million rubles, the deviation from the average volume in one direction or the other is on average _________? million rubles (or ______?%), and the most typical values of the volume of credit investments lie in the range from ______________? million rubles to _______________? million rubles (see Table 3.2.5: _____? banks or ______?% fall within this interval).

The value Vσ = ______?% _____? exceeds 33%, therefore the variation of credit investments in the studied set of banks is insignificant and the set is qualitatively homogeneous in this respect. The discrepancy between the values of $\bar{x}$, Mo and Me is insignificant ($\bar{x}$ = 585 million rubles, Mo = 593.40 million rubles, Me = 588.818 million rubles), which confirms the conclusion about the homogeneity of the population of banks. Thus, the average volume of bank credit investments found (585 million rubles) ______? is a typical, reliable characteristic of the population of banks under study.
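The steps of the calculation table (midpoints, products, deviations, weighted squared deviations) can be sketched in Python. The interval bounds come from the table; the numbers of banks are an assumption, since only the first group's frequency of 4 is recoverable from the text. They were chosen to be consistent with the mean of 585 quoted in the conclusion, and the function name is hypothetical.

```python
from math import sqrt


def grouped_variation(bounds, freqs):
    """Weighted mean, variance, standard deviation and coefficient of variation.

    bounds: list of (lower, upper) interval limits; freqs: units per interval.
    """
    midpoints = [(low + high) / 2 for low, high in bounds]            # column 2
    total = sum(freqs)
    mean = sum(x * f for x, f in zip(midpoints, freqs)) / total       # from column 4
    variance = sum((x - mean) ** 2 * f                                # columns 5-7
                   for x, f in zip(midpoints, freqs)) / total
    sigma = sqrt(variance)
    return mean, variance, sigma, 100 * sigma / mean


bounds = [(375, 459), (459, 543), (543, 627), (627, 711), (711, 795)]
freqs = [4, 5, 11, 7, 3]        # assumed numbers of banks per group

mean, variance, sigma, cv_pct = grouped_variation(bounds, freqs)
print(f"mean = {mean:.2f} million rubles")
print(f"variance = {variance:.2f}, sigma = {sigma:.2f} million rubles")
print(f"coefficient of variation = {cv_pct:.2f}%")
print(f"typical range: {mean - sigma:.2f} to {mean + sigma:.2f} million rubles")
```

Under these assumed frequencies the coefficient of variation stays below 33%, which agrees with the qualitative conclusion drawn in the text.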

