When working with statistics, we have to calculate values such as the variance, the standard deviation and, of course, the coefficient of variation. The last of these deserves special attention. It is important that every beginner who is just starting to work with a spreadsheet editor can quickly calculate the relative scatter of values.

What is the coefficient of variation and why is it needed?

A short theoretical digression will help explain the nature of the coefficient of variation. This indicator reflects the spread of the data relative to the average value. In other words, it is the ratio of the standard deviation to the mean. The coefficient of variation is customarily expressed as a percentage and used to describe the homogeneity of a time series.

The coefficient of variation is an indispensable aid when you need to make a forecast based on data from a given sample. It highlights the ranges of values that are most useful for subsequent forecasting and helps clear the sample of insignificant factors. If the coefficient is 0%, you can say with confidence that the series is homogeneous, meaning that all values in it are equal to one another. If the coefficient of variation exceeds 33%, you are dealing with a heterogeneous series in which individual values differ significantly from the sample average.

How to find the standard deviation?

Since we need the standard deviation to calculate the coefficient of variation in Excel, it makes sense to look at how this parameter is calculated.

From school algebra we know that the standard deviation is the square root of the variance. It measures how far the individual values of the sample deviate from their average. With its help we can measure the absolute variability of the characteristic under study and interpret it clearly.

Calculate the coefficient in Excel

Unfortunately, Excel has no built-in function that calculates the coefficient of variation directly. But this does not mean that you have to do the calculations in your head. The absence of a ready-made function in no way detracts from Excel's abilities: you can easily make the program perform the calculation you need by typing the appropriate formula manually.

To calculate the coefficient of variation in Excel, divide the standard deviation by the sample mean. The formula looks like this: =STDEV(data range)/AVERAGE(data range). Enter it in the cell where you want the result to appear.
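The same division can be sketched outside Excel. The snippet below is a minimal Python illustration, not part of the original tutorial: the function name coefficient_of_variation and the sample data are assumptions, and statistics.stdev plays the role of STDEV (sample standard deviation).

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, like STDEV/AVERAGE."""
    avg = mean(values)
    if avg == 0:
        # The ratio is undefined when the mean is zero.
        raise ValueError("mean is zero; coefficient of variation is undefined")
    return stdev(values) / avg


# Illustrative data only (hypothetical, not taken from a real worksheet).
data = [10, 20, 30, 40, 50, 60, 70]
cv = coefficient_of_variation(data)
print(f"{cv:.1%}")  # 54.0%
```

The percent format specifier does the same job as the percentage cell format in Excel: the stored value stays a fraction and only its display changes.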

Keep in mind that since the coefficient is expressed as a percentage, the cell with the formula needs to be formatted accordingly. You can do this as follows:

  1. Open the Home tab.
  2. Find the "Format Cells" category and select the required option.

Alternatively, you can set the percentage format by right-clicking the selected cell. In the context menu that appears, choose "Format Cells" as in the steps above and set the required value.

Select "Percentage" and optionally enter the number of decimal places.

Perhaps the steps above will seem complicated to some. In fact, calculating the coefficient is as simple as adding two natural numbers. Once you have done it in Excel, you will never go back to tedious multi-step calculations in a notebook.

Still unable to compare the degree of scatter in your data? Lost in the sample size? Then get down to business right now and put into practice all the theory presented above! Statistical analysis and forecasting should no longer cause you fear and negativity. Save your energy and time with

Standard deviation is one of those statistical terms in the corporate world that raises the profile of people who manage to slip it into a conversation or presentation, and leaves a vague sense of confusion in those who don't know what it is but are embarrassed to ask. In fact, most managers don't understand the concept of standard deviation, and if you're one of them, it's time for you to stop living the lie. In today's article, I'll show you how this underrated statistic can help you better understand the data you're working with.

What does standard deviation measure?

Imagine that you own two stores. To avoid losses, it is important to keep stock levels under tight control. To find out which store has the better stock manager, you decide to analyze the stock over the past six weeks. The average weekly value of the stock in both stores is approximately the same, about 32 conventional units. At first glance, the averages suggest that both managers perform equally well.

But if you take a closer look at the second store, you can see that although the average is about the same, the stock varies widely (from 10 to 58 USD). So the mean alone does not always describe the data correctly. This is where the standard deviation comes in.

The standard deviation shows how the values are distributed relative to the mean in our sample. In other words, it tells you how much the stock varies from week to week.

In our example, we used the Excel function STDEV to calculate the standard deviation along with the mean.

In the case of the first manager, the standard deviation was 2. This tells us that each value in the sample deviates from the mean by 2 on average. Is that good? Let's look at the question from a different angle: a standard deviation of 0 tells us that every value in the sample equals the mean (in our case, 32.2). A standard deviation of 2 is not far from 0, which indicates that most of the values are close to the mean. The closer the standard deviation is to 0, the more reliable the mean. Moreover, a standard deviation close to 0 indicates little variability in the data. That is, a stock value with a standard deviation of 2 points to the first manager's remarkable consistency.

In the case of the second store, the standard deviation was 18.9. That is, the value of the stock deviates from the average by 18.9 on average from week to week. A crazy spread! The further the standard deviation is from 0, the less accurate the mean. In our case, the figure of 18.9 indicates that the average value ($32.8 per week) simply cannot be trusted. It also tells us that the weekly stock is highly variable.

This is the concept of standard deviation in a nutshell. Although it gives no insight into other important statistical measures (Mode, Median…), the standard deviation plays a decisive role in most statistics. Understanding its principles will shed light on many processes in your work.

How to calculate standard deviation?

So, now we know what the standard deviation tells us. Let's see how it is calculated.

Consider a data set from 10 to 70 in increments of 10. As you can see, I have already calculated the standard deviation for them using the STDEV function in cell H2 (orange).

Below are the steps Excel takes to arrive at 21.6.

Please note that all calculations are visualized for better understanding. In fact, in Excel, the calculation is instantaneous, leaving all the steps behind the scenes.

Excel first finds the mean of the sample. In our case the mean is 40, and in the next step it is subtracted from each sample value. Each resulting difference is squared, and the squares are summed. The sum is 2800, which must be divided by the number of sample elements minus 1. Since we have 7 elements, we divide 2800 by 6. The square root of the result is the standard deviation.
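The steps described above can be reproduced one by one. The following Python sketch is only an illustration of the arithmetic, using the 10 to 70 data set from the text; the variable names are assumptions, and it makes no claim about how Excel implements STDEV internally.

```python
import math

values = [10, 20, 30, 40, 50, 60, 70]  # the data set from the text
n = len(values)

# Step 1: mean of the sample.
avg = sum(values) / n  # 40.0

# Step 2: subtract the mean from each value and square the difference.
squared_diffs = [(x - avg) ** 2 for x in values]

# Step 3: sum the squared differences.
total = sum(squared_diffs)  # 2800.0

# Step 4: divide by the number of elements minus 1 (sample variance).
sample_variance = total / (n - 1)

# Step 5: the square root of the result is the standard deviation.
sample_std = math.sqrt(sample_variance)

print(round(sample_std, 1))  # 21.6
```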

For those who find the visual explanation unclear, here is the mathematical formula for this value:

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

where $x_i$ are the sample values, $\bar{x}$ is their mean and $n$ is the number of values.

Standard deviation calculation functions in Excel

There are several varieties of standard deviation formulas in Excel. You just need to type =STDEV and you will see for yourself.

It is worth noting that the functions STDEV.S and STDEV.P (the first and second functions in the list) duplicate the functions STDEV and STDEVP (the fifth and sixth functions in the list), respectively, which were retained for compatibility with earlier versions of Excel.

In general, the .S and .P endings indicate whether the standard deviation is calculated for a sample or for the whole population. I explained the difference between the two in a previous article.

A feature of the STDEVA and STDEVPA functions (the third and fourth functions in the list) is that logical and text values are taken into account when calculating the standard deviation. TRUE counts as 1, while text and FALSE count as 0. It's hard for me to imagine a situation where I would need these two functions, so I think they can be ignored.

Statistics uses a huge number of indicators, and one of them is the variance. Calculating it by hand takes a lot of time and invites mistakes. Today we will look at how Excel replaces the mathematical formulas with simple functions, and go through some of the simplest, fastest and most convenient calculation methods, which let you do everything in a matter of minutes.

Computing the variance

The variance of a random variable is the mathematical expectation of the squared deviation of the variable from its mathematical expectation.

Calculating for the general population

To calculate the population variance, Excel provides the function VAR.P, whose syntax is "=VAR.P(number1, number2, ...)". (With regional settings that use a comma as the decimal separator, the arguments are separated by semicolons instead.)

A maximum of 255 arguments can be supplied. Arguments can be plain numbers or references to the cells that contain them. Let's look at how to calculate the variance in Microsoft Excel:

1. Select the cell where the result will be displayed, then click the "Insert Function" button.

2. The Function Wizard will open. Look for the "VAR.P" function there, which can be found in the "Statistical" category or in the full alphabetical list. Select it and click OK.

3. The Function Arguments window will open. Click in the "Number1" field and select the range of cells with the number series on the sheet.

4. After that, the result of the calculation will be displayed in the cell where the function was entered.

This is how you can easily find the variance in Excel.

Calculating for a sample

In this case, the variance is calculated with a denominator that is not the total count of numbers but one less. This reduces the error of the estimate. Excel provides the special function VAR.S for this, with the syntax =VAR.S(number1, number2, ...). The steps are:

  • As in the previous method, select a cell for the result.
  • In the Function Wizard, find "VAR.S" in the "Statistical" category or in the full alphabetical list.
  • Next, the arguments window will appear, and you proceed in the same way as in the previous method.
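To see the effect of the two denominators side by side, here is a small Python sketch. It is an illustration under assumptions: the data values are made up for the example, and statistics.pvariance and statistics.variance stand in for the population and sample variance functions respectively.

```python
from statistics import pvariance, variance

# Illustrative number series (hypothetical values).
data = [10, 20, 30, 40, 50, 60, 70]

# Population variance: sum of squared deviations divided by n.
population_var = pvariance(data)  # 2800 / 7

# Sample variance: the same sum divided by n - 1.
sample_var = variance(data)  # 2800 / 6

print(f"{population_var:.2f}")  # 400.00
print(f"{sample_var:.2f}")      # 466.67
```

The sample variance is always the larger of the two for the same data, and the gap shrinks as the number of values grows.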

Conclusion

Calculating the variance in Excel is very simple, and much faster and more convenient than doing it manually, because the formula is rather cumbersome and working through it by hand can take a lot of time and effort.

Among the many indicators used in statistics, the variance deserves special mention. Calculating it manually is a rather tedious task. Fortunately, Excel has functions that automate the procedure. Let's find out how to work with these tools.

Variance is a measure of variation: the average squared deviation from the mathematical expectation. It thus expresses the spread of the numbers about the mean. The variance can be calculated both for the general population and for a sample.

Method 1: calculation on the general population

To calculate this indicator in Excel for the general population, the function VAR.P is used. The syntax for this expression is as follows:

VAR.P(number1, number2, ...)

In total, from 1 to 255 arguments can be supplied. Arguments can be numerical values or references to the cells that contain them.

Let's see how to calculate this value for a range of numeric data.


Method 2: sample calculation

In contrast to the calculation for the general population, in the calculation for a sample the denominator is not the total count of numbers but one less. This is done to correct the error of the estimate. Excel takes this nuance into account in a special function designed for this type of calculation, VAR.S. Its syntax is as follows:

VAR.S(number1, number2, ...)

The number of arguments, as in the previous function, can also range from 1 to 255.


As you can see, Excel greatly simplifies the calculation of the variance. The application can calculate this statistic for both the population and the sample. All the user has to do is specify the range of numbers to be processed, and Excel does the main work itself. This will certainly save users a significant amount of time.

No statistical analysis is possible without calculations. In this article, we will look at how to calculate the variance, standard deviation, coefficient of variation and other statistical indicators in Excel.

Maximum and minimum value

Average linear deviation

The average linear deviation is the average of the absolute values of the deviations from the mean in the analyzed data set. The mathematical formula looks like this:

$$a = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$$

where

$a$ is the average linear deviation,

$x_i$ are the values of the analyzed indicator,

$\bar{x}$ is the average value of the indicator,

$n$ is the number of values in the analyzed data set.

In Excel this function is called AVEDEV.

After selecting the AVEDEV function, specify the data range for which the calculation should be made and click "OK".
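For comparison, the average linear deviation (AVEDEV in English-language Excel) can be written out in a few lines. The Python sketch below is a minimal illustration; the function name average_linear_deviation and the data are assumptions made for the example.

```python
def average_linear_deviation(values):
    """Mean of the absolute deviations of the values from their mean."""
    n = len(values)
    avg = sum(values) / n
    return sum(abs(x - avg) for x in values) / n


# Illustrative data (hypothetical values).
data = [10, 20, 30, 40, 50, 60, 70]
ald = average_linear_deviation(data)
print(round(ald, 2))  # 17.14
```

Here the absolute deviations are 30, 20, 10, 0, 10, 20 and 30, which sum to 120, and 120 divided by 7 values gives about 17.14.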

Variance

Perhaps not everyone knows what variance is, so I will explain: it is a measure of the spread of the data around the mathematical expectation. However, usually only a sample is available, so the following variance formula is used:

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}$$

where

$s^2$ is the sample variance calculated from observational data,

$x_i$ are the individual values,

$\bar{x}$ is the arithmetic mean over the sample,

$n$ is the number of values in the analyzed data set.

The corresponding Excel function is VAR.P. When analyzing relatively small samples (up to about 30 observations), you should use the unbiased sample variance, which is calculated by the following formula:

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

The difference, as you can see, is only in the denominator. Excel has the function VAR.S for calculating the unbiased sample variance.

Select the desired option (population or sample), specify the range and click the "OK" button. The resulting value may be very large because the deviations are squared first. Variance is a very important indicator in statistics, but it is usually used not on its own but as an input for further calculations.

Standard deviation

The standard deviation is the square root of the variance. It is also called the root-mean-square (RMS) deviation and is calculated by the following formulas:

for the general population

$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}}$$

for a sample

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

You can simply take the square root of the variance, but Excel has ready-made functions for the standard deviation: STDEV.P and STDEV.S (for the general population and the sample, respectively).

Root-mean-square deviation and standard deviation, I repeat, are synonyms.

Next, as usual, specify the desired range and click on "OK". The standard deviation has the same units of measurement as the analyzed indicator, therefore it is comparable with the original data. More on that below.
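The relationship between the variance and the standard deviation, for both the population and the sample versions, can be checked with a short Python sketch. This is an illustration under assumptions: the data are made up, and statistics.pstdev and statistics.stdev are used as counterparts of the population and sample functions.

```python
import math
from statistics import pstdev, pvariance, stdev, variance

# Illustrative data (hypothetical values).
data = [10, 20, 30, 40, 50, 60, 70]

population_std = pstdev(data)  # population version, denominator n
sample_std = stdev(data)       # sample version, denominator n - 1

# Each standard deviation is the square root of the matching variance.
population_matches = math.isclose(population_std, math.sqrt(pvariance(data)))
sample_matches = math.isclose(sample_std, math.sqrt(variance(data)))

print(round(population_std, 1), round(sample_std, 1))  # 20.0 21.6
print(population_matches, sample_matches)              # True True
```

Both results are in the same units as the data, which is why they can be compared directly with the original values.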

The coefficient of variation

All the indicators discussed above are tied to the scale of the initial data and do not give an intuitive idea of the variation of the analyzed population. To obtain a relative measure of data scatter, use the coefficient of variation, which is calculated by dividing the standard deviation by the mean. The formula for the coefficient of variation is simple:

$$V = \frac{\sigma}{\bar{x}}$$

Excel has no ready-made function for the coefficient of variation, but that is not a big problem. The calculation can be made by simply dividing the standard deviation by the mean. To do this, write in the formula bar:

=STDEV.P()/AVERAGE()

The data range is given in the parentheses. If necessary, use the standard deviation for a sample (STDEV.S).

The coefficient of variation is usually expressed as a percentage, so the cell with the formula can be given a percentage format. The button you need is on the ribbon, on the "Home" tab:

You can also change the format from the context menu, which opens when you right-click the selected cell.

The coefficient of variation, unlike other indicators of the spread of values, is used as an independent and very informative indicator of data variation. In statistics, it is generally accepted that a data set is homogeneous if the coefficient of variation is less than 33% and heterogeneous if it is more than 33%. This information can be useful for a preliminary description of the data and for identifying opportunities for further analysis. In addition, the coefficient of variation, measured as a percentage, makes it possible to compare the degree of dispersion of different data sets regardless of their scale and units of measurement. This is a useful property.
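Both properties, the 33% rule of thumb and the independence from scale, can be illustrated with a short Python sketch. Everything named here is an assumption made for the example: the functions cv_percent and describe, the two data sets, and the choice to treat a value of exactly 33% as heterogeneous, which the rule itself does not specify.

```python
import math
from statistics import mean, pstdev


def cv_percent(values):
    """Population standard deviation over the mean, in percent."""
    return pstdev(values) / mean(values) * 100


def describe(values, threshold=33.0):
    """Apply the rule of thumb: below the threshold means homogeneous."""
    # Exactly 33% falls into "heterogeneous" here; that is an arbitrary choice.
    return "homogeneous" if cv_percent(values) < threshold else "heterogeneous"


tight = [98, 100, 102, 100]             # hypothetical, values close together
spread = [10, 20, 30, 40, 50, 60, 70]   # hypothetical, values far apart

print(describe(tight), describe(spread))  # homogeneous heterogeneous

# Changing the units (for example, multiplying by 1000) leaves the CV unchanged.
rescaled = [x * 1000 for x in spread]
print(math.isclose(cv_percent(spread), cv_percent(rescaled)))  # True
```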

Oscillation coefficient

Another measure of data scatter is the oscillation coefficient. This is the ratio of the range of variation (the difference between the maximum and minimum values) to the mean. There is no ready-made Excel function, so you have to combine three functions, MAX, MIN and AVERAGE: =(MAX(data range)-MIN(data range))/AVERAGE(data range).

The oscillation coefficient indicates the degree of variation relative to the mean, which can also be used to compare different datasets.
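The same combination of maximum, minimum and mean can be sketched in Python. This is a minimal illustration; the function name oscillation_coefficient and the data are assumptions made for the example.

```python
def oscillation_coefficient(values):
    """Range of variation (max minus min) divided by the mean."""
    avg = sum(values) / len(values)
    if avg == 0:
        # The ratio is undefined when the mean is zero.
        raise ValueError("mean is zero; oscillation coefficient is undefined")
    return (max(values) - min(values)) / avg


# Illustrative data (hypothetical values).
data = [10, 20, 30, 40, 50, 60, 70]
oc = oscillation_coefficient(data)
print(f"{oc:.0%}")  # 150%
```

Because it depends only on the two extreme values, this measure is much more sensitive to a single outlier than the coefficient of variation.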

In general, many statistics are very simple to calculate in Excel. If something is not clear, you can always use the search box in the Insert Function dialog. And, of course, Google is there to help.