Condition:

There is data on the age composition of workers (years): 18, 38, 28, 29, 26, 38, 34, 22, 28, 30, 22, 23, 35, 33, 27, 24, 30, 32, 28, 25, 29, 26, 31, 24, 29, 27, 32, 25, 29, 29.

    1. Construct an interval distribution series.
    2. Construct a graphical representation of the series.
    3. Graphically determine the mode and median.

Solution:

1) According to the Sturges formula, the population should be divided into 1 + 3.322 lg 30 ≈ 5.9, i.e. 6 groups.

Maximum age: 38; minimum age: 18.

Interval width: (38 - 18) / 6 ≈ 3.3. Since the ends of the intervals must be integers, we divide the population into 5 groups instead. The interval width is then (38 - 18) / 5 = 4.

To make the calculations easier, we arrange the data in ascending order: 18, 22, 22, 23, 24, 24, 25, 25, 26, 26, 27, 27, 28, 28, 28, 29, 29, 29, 29, 29, 30, 30, 31, 32, 32, 33, 34, 35, 38, 38.

Age distribution of workers
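The grouping step can be sketched in Python as follows. This is only an illustration: the function name `interval_series` is hypothetical, and the boundary convention (lower boundary included, upper boundary excluded except in the last interval) is an assumption, since the solution above does not state which convention its table used.

```python
# Ages of the 30 workers, taken from the condition above.
ages = [18, 38, 28, 29, 26, 38, 34, 22, 28, 30, 22, 23, 35, 33, 27,
        24, 30, 32, 28, 25, 29, 26, 31, 24, 29, 27, 32, 25, 29, 29]


def interval_series(values, start, width, groups):
    """Return a list of ((lower, upper), frequency) pairs.

    Assumed convention: the lower boundary is included and the upper
    boundary is excluded, except in the last interval, which also
    includes its upper boundary.
    """
    series = []
    for k in range(groups):
        lower = start + k * width
        upper = lower + width
        last = (k == groups - 1)
        freq = sum(1 for v in values
                   if lower <= v < upper or (last and v == upper))
        series.append(((lower, upper), freq))
    return series


age_series = interval_series(ages, start=min(ages), width=4, groups=5)
for (lower, upper), freq in age_series:
    print(f"{lower}-{upper}: {freq}")
```

With a different boundary convention (for example, upper boundary included) the frequencies of neighbouring groups shift, so the convention should be fixed before the histogram is drawn.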

Graphically, a series can be depicted as a histogram or a polygon. A histogram is a bar chart: the base of each column is the width of the interval, and its height is equal to the frequency.

A polygon (or distribution polygon) is a frequency graph. To build it from a histogram, we connect the midpoints of the upper sides of the rectangles. We close the polygon on the Ox axis at distances equal to half an interval from the extreme values of x.

Mode (Mo) is the value of the characteristic being studied, which occurs most frequently in a given population.

To determine the mode from a histogram, select the highest rectangle, draw a line from its upper right vertex to the upper right vertex of the previous rectangle, and draw a line from its upper left vertex to the upper left vertex of the subsequent rectangle. From the intersection of these lines, drop a perpendicular to the x-axis. The abscissa of its foot is the mode. Mo ≈ 27.5. This means that the most common age in this population is 27-28 years.

Median (Me) is the value of the characteristic being studied, which is in the middle of the ordered variation series.

We find the median using the cumulate. A cumulate is a graph of accumulated frequencies: the abscissas are the variants of the series, and the ordinates are the accumulated frequencies.

To determine the median from the cumulate, we find the point on the ordinate axis corresponding to 50% of the accumulated frequencies (in our case, 15), draw a straight line through it parallel to the Ox axis, and from the point of its intersection with the cumulate drop a perpendicular to the x-axis. The abscissa is the median. Me ≈ 25.9. This means that half of the workers in this population are under 26 years of age.

What a grouping of statistical data is, and how it is related to distribution series, was discussed in this lecture, where you can also learn what discrete and interval variation series are.

Distribution series are one of the varieties of statistical series (dynamics series, i.e. time series, are also used in statistics); they are used to analyze data on social phenomena. Constructing a variation series is a feasible task for anyone. However, there are rules that need to be remembered.

How to construct a discrete variational distribution series

Example 1. There is data on the number of children in 20 surveyed families. Construct a discrete variation series of the distribution of families by number of children.

0 1 2 3 1
2 1 2 1 0
4 3 2 1 1
1 0 1 0 2

Solution:

  1. Let's start with a table layout, into which we will then enter the data. Since a distribution series has two elements, the table will consist of two columns. The first column is always the variant, that is, what we are studying. We take its name from the task (the end of the sentence stating the task): "by number of children". This means our variant is the number of children.

The second column is the frequency, that is, how often our variant occurs in the phenomenon under study. We also take the name of the column from the task: "distribution of families". This means our frequency is the number of families with the corresponding number of children.

  2. Now from the source data we select those values that occur at least once. In our case these are 0, 1, 2, 3 and 4.

Let's arrange these values in the first column of our table in logical order, in this case increasing from 0 to 4.

And finally, let's count how many times each value of the variant appears.

0 1 2 3 1

2 1 2 1 0

4 3 2 1 1

1 0 1 0 2

As a result, we obtain a completed table, that is, the required distribution series of families by number of children.
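The counting step for Example 1 can be reproduced with a few lines of Python. This is a minimal sketch using the standard-library `Counter`; the variable names are illustrative only.

```python
from collections import Counter

# Number of children in the 20 surveyed families (Example 1).
children = [0, 1, 2, 3, 1,
            2, 1, 2, 1, 0,
            4, 3, 2, 1, 1,
            1, 0, 1, 0, 2]

counts = Counter(children)

# Variants in ascending order, each paired with its frequency.
discrete_series = [(x, counts[x]) for x in sorted(counts)]

print("children  families")
for x, f in discrete_series:
    print(f"{x:>8}  {f:>8}")
```

The frequencies add up to 20, the number of surveyed families, which is a quick check that no observation was lost.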

Exercise. There is data on the wage grades of 30 workers at the enterprise. Construct a discrete variation series for the distribution of workers by wage grade.

2 3 2 4 4 5 5 4 6 3

1 4 4 5 5 6 4 3 2 3

4 5 4 5 5 6 6 3 3 4

How to construct an interval variational distribution series

Let's construct an interval distribution series and see how its construction differs from a discrete series.

Example 2. There is data on the amount of profit received by 16 enterprises, million rubles: 23 48 57 12 118 9 16 22 27 48 56 87 45 98 88 63. Construct an interval variation series of the distribution of enterprises by profit, identifying 3 groups with equal intervals.

The general principle of constructing the series remains the same: the same two columns, the same variants and frequencies. In this case, however, the variants are intervals, and the frequencies are counted differently.

Solution:

  1. As in the previous task, let's start by building a table layout, into which we will then enter the data. Since a distribution series has two elements, the table will consist of two columns. The first column is always the variant, that is, what we are studying. We take its name from the task (the end of the sentence stating the task): "by profit". This means our variant is the amount of profit received.

The second column is the frequency, that is, how often our variant occurs in the phenomenon under study. We also take the name of the column from the task: "distribution of enterprises". This means our frequency is the number of enterprises with the corresponding profit, in this case the number falling into each interval.

As a result, our table layout will look like this:

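The size of the interval is determined by the formula

$$ i = \frac{X_{max} - X_{min}}{n} $$

where i is the size (length) of the interval,

Xmax and Xmin are the maximum and minimum values of the characteristic,

n is the required number of groups according to the conditions of the problem.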

Let's calculate the size of the interval for our example. To do this, we find the largest and smallest values among the initial data:

23 48 57 12 118 9 16 22 27 48 56 87 45 98 88 63

The maximum value is 118 million rubles, and the minimum is 9 million rubles. Let's carry out the calculation using the formula: i = (118 - 9) / 3 = 36.(3).

The calculation gives 36.(3), that is, 36.333... with the 3 repeating. In such situations the value of the interval must be rounded up, so that the maximum value is not lost after the calculations. This is why the interval size is taken to be 36.4 million rubles.

  2. Now let's construct the intervals, which are our variants in this problem. The first interval starts at the minimum value; adding the interval size to it gives the upper limit of the first interval. The upper limit of the first interval then becomes the lower limit of the second interval; adding the interval size to it gives the second interval. This is repeated as many times as needed to construct the number of intervals required by the condition. In our example this gives the intervals 9 - 45.4, 45.4 - 81.8 and 81.8 - 118.2.

Note that if we had not rounded the interval size up to 36.4 but had left it at 36.3, the last upper limit would have been 117.9. It is precisely to avoid losing data that the interval size must be rounded up.

  3. Let's count the number of enterprises falling into each interval. When processing the data, remember that the lower boundary of an interval is included in that interval, while the upper boundary is not: a value equal to the upper boundary is counted in the next interval. The only exception is the last interval, which includes its upper boundary.

When carrying out data processing, it is best to indicate the selected data with symbols or colors to simplify processing.

23 48 57 12 118 9 16 22

27 48 56 87 45 98 88 63

We mark the first interval (in yellow, for example) and determine how many values fall into the interval from 9 to 45.4; a value of exactly 45.4, if it were present in the data, would be counted in the second interval. As a result, we get 7 enterprises in the first interval. Proceeding in the same way for the other intervals gives 5 enterprises in the second interval and 4 in the third.

  4. (Additional step) Let's calculate the total profit received by the enterprises in each interval and overall. To do this, we add up the values marked in different colors.

For the first interval - 23 + 12 + 9 + 16 + 22 + 27 + 45 = 154 million rubles.

For the second interval - 48 + 57 + 48 + 56 + 63 = 272 million rubles.

For the third interval - 118 + 87 + 98 + 88 = 391 million rubles.
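The whole of Example 2 can be checked with a short script. It is a sketch under the conventions stated in the text (interval size rounded up to one decimal place; lower boundary included, upper boundary excluded except in the last interval); the variable names are illustrative.

```python
import math

# Profit of the 16 enterprises, million rubles (Example 2).
profits = [23, 48, 57, 12, 118, 9, 16, 22, 27, 48, 56, 87, 45, 98, 88, 63]
groups = 3

# Interval size, rounded UP to one decimal place so the maximum is not lost.
raw_width = (max(profits) - min(profits)) / groups   # 36.333...
width = math.ceil(raw_width * 10) / 10               # 36.4

# Interval boundaries; round() only suppresses floating-point noise.
edges = [round(min(profits) + k * width, 1) for k in range(groups + 1)]

rows = []
for k in range(groups):
    lower, upper = edges[k], edges[k + 1]
    last = (k == groups - 1)
    # Lower boundary included, upper excluded (except in the last interval).
    members = [p for p in profits
               if lower <= p < upper or (last and p == upper)]
    rows.append((lower, upper, len(members), sum(members)))

for lower, upper, freq, total in rows:
    print(f"{lower} - {upper}: {freq} enterprises, total profit {total}")
print("overall profit:", sum(r[3] for r in rows))
```

The group totals add up to the overall profit of all 16 enterprises, which confirms that every value was assigned to exactly one interval.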

Exercise. There is data on the amount of deposits in the bank of 30 depositors, thousand rubles:

150, 120, 300, 650, 1500, 900, 450, 500, 380, 440,

600, 80, 150, 180, 250, 350, 90, 470, 1100, 800,

500, 520, 480, 630, 650, 670, 220, 140, 680, 320

Build an interval variation series of the distribution of depositors by size of deposit, identifying 4 groups with equal intervals. For each group, calculate the total amount of deposits.

The simplest way to summarize statistical material is to construct series. The result of summarizing a statistical study may be distribution series. A distribution series in statistics is an ordered distribution of population units into groups according to a single characteristic, qualitative or quantitative. If a series is constructed on a qualitative characteristic, it is called attributive; if on a quantitative one, it is called variational.

A variation series is characterized by two elements: the variant (X) and the frequency (f). A variant is a separate value of a characteristic of an individual unit or group of the population. The number showing how many times a given value of the characteristic occurs is called the frequency. If the frequency is expressed as a relative number (a share or a percentage of the total), it is called the relative frequency. A variation series can be an interval series, when boundaries "from" and "to" are defined, or a discrete series, when the characteristic being studied takes specific individual values.

Let's look at the construction of variation series using examples.

Example. There is data on the wage grades of 60 workers in one of the plant's workshops.

Distribute the workers by wage grade and build a variation series.

To do this, we write down all the values of the characteristic in ascending order and count the number of workers in each group.

Table 1.4

Distribution of workers by category

Columns: worker's category (X); number of workers: persons (f) and in % of the total (relative frequency)

We obtained a discrete variation series, in which the characteristic being studied (the worker's category) is represented by a specific number. For clarity, variation series are depicted graphically. Based on this distribution series, a distribution polygon was constructed.

Fig. 1.1. Polygon of the distribution of workers by wage grade

We will consider the construction of an interval series with equal intervals using the following example.

Example. Data are known on the value of the fixed capital of 50 firms, in million rubles. It is required to show the distribution of the firms by value of fixed capital.

To show the distribution of the firms by value of fixed capital, we first decide how many groups we want to identify. Suppose we decide to identify 5 groups of firms. Then we determine the size of the interval in a group. To do this, we use the formula

$$ i = \frac{X_{max} - X_{min}}{n} $$

where Xmax and Xmin are the maximum and minimum values of the characteristic and n is the number of groups. In our example the interval is 7 million rubles (the first group starts at 10, and the next boundaries are 17, 24, and so on).

By successively adding the interval size to the minimum value of the characteristic, we obtain the groups of firms by value of fixed capital.

A unit whose value lies on the boundary between two groups belongs to the group in which this value is the upper limit (i.e., the value 17 goes to the first group, 24 to the second, etc.).

Let's count the number of firms in each group.

Table 1.5

Distribution of firms by value of fixed capital (million rubles)

Columns: value of fixed capital, million rubles (X); number of firms (frequency) (f); accumulated (cumulative) frequencies

This distribution gives an interval variation series, from which it follows that 36 firms have fixed capital worth from 10 to 24 million rubles, etc.

Interval distribution series can be represented graphically in the form of a histogram.

The results of data processing are presented in statistical tables. A statistical table has its own subject and predicate.

The subject is the totality or part of the totality that is being characterized.

Predicates are indicators that characterize the subject.

Tables are divided into simple, group and combination tables, and into tables with simple and complex development of the predicate.

In a simple table, the subject contains a list of individual units.

If the subject contains a grouping of units, the table is called a group table. Examples are a grouping of enterprises by number of workers or of the population by gender.

The subject of a combination table contains a grouping by two or more characteristics. For example, the population is divided by gender and then into groups by education, age, etc.

Combination tables contain information that allows one to identify and characterize the relationship of a number of indicators and the pattern of their changes both in space and time. To make the table clear when developing its subject, limit yourself to two or three characteristics, forming a limited number of groups for each of them.

The predicate in tables can be developed in different ways. With a simple development of the predicate, all its indicators are located independently of each other.

With complex development of the predicate, the indicators are combined with each other.

When constructing any table, one must proceed from the purposes of the study and the content of the processed material.

In addition to tables, statistics also uses graphs and diagrams. A chart depicts statistical data using geometric shapes. Charts are divided into line and bar charts, but there are also pictorial charts (drawings and symbols), pie charts (the circle represents the size of the entire population, and the areas of the individual sectors show the share of its components) and radial charts (built on polar coordinates). A cartogram is a combination of a contour map or a site plan with a diagram.

A discrete variation series is constructed for discrete characteristics.

In order to construct a discrete variation series, you need to perform the following steps:

1) arrange the units of observation in increasing order of the studied value of the characteristic;

2) determine all possible values of the characteristic x i and arrange them in ascending order;

3) count how many times each value occurs;

4) write the resulting data into a table.

The value of the characteristic is called a variant and is denoted x i. The number showing how many times the value occurs is called the frequency of the value and is denoted f i. The sum of all frequencies of a series is equal to the number of elements in the population being studied.

Example 1 .

List of grades received by students in exams: 3; 4; 3; 5; 4; 2; 2; 4; 4; 3; 5; 2; 4; 5; 4; 3; 4; 3; 3; 4; 4; 2; 2; 5; 5; 4; 5; 2; 3; 4; 4; 3; 4; 5; 2; 5; 5; 4; 3; 3; 4; 2; 4; 4; 5; 4; 3; 5; 3; 5; 4; 4; 5; 4; 4; 5; 4; 5; 5; 5.

Here X, the grade, is a discrete random variable, and the resulting list of grades is the statistical (observed) data.

1) arrange the observation units in ascending order of the studied characteristic value:

2; 2; 2; 2; 2; 2; 2; 2; 3; 3; 3; 3; 3; 3; 3; 3; 3; 3; 3; 3; 4; 4; 4; 4; 4; 4; 4; 4; 4; 4; 4; 4; 4; 4; 4; 4; 4; 4; 4; 4; 4; 4; 4; 5; 5; 5; 5; 5; 5; 5; 5; 5; 5; 5; 5; 5; 5; 5; 5; 5.

2) determine all possible values of the characteristic x i and arrange them in ascending order:

In this example, all grades can be divided into four groups with the following values: 2; 3; 4; 5.

The value of a random variable corresponding to a particular group of observed data is called the value of the characteristic, or variant, and is denoted x i.

The number that shows how many times the corresponding value of the characteristic occurs in a series of observations is called the frequency of the value and is denoted f i.

For our example

score 2 occurs - 8 times,

score 3 occurs - 12 times,

score 4 occurs - 23 times,

score 5 occurs - 17 times.

There are 60 ratings in total.

4) write the received data into a table of two rows (columns) - x i and f i.

Based on these data, a discrete variation series can be constructed.

A discrete variation series is a table in which the occurring values of the characteristic being studied are listed as individual values in ascending order together with their frequencies.
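As an illustration of these steps, the sketch below builds the series for the 60 grades and adds a relative-frequency column. The relative frequency is not part of the example above; it is included here only as an assumption about what a reader may want to see next to the absolute frequency.

```python
from collections import Counter

# The 60 exam grades from Example 1, in the order given in the text.
grades = [3, 4, 3, 5, 4, 2, 2, 4, 4, 3, 5, 2, 4, 5, 4, 3, 4, 3, 3, 4,
          4, 2, 2, 5, 5, 4, 5, 2, 3, 4, 4, 3, 4, 5, 2, 5, 5, 4, 3, 3,
          4, 2, 4, 4, 5, 4, 3, 5, 3, 5, 4, 4, 5, 4, 4, 5, 4, 5, 5, 5]

n = len(grades)
freq = Counter(grades)

# One row per variant x_i: (x_i, frequency f_i, relative frequency f_i / n).
table = [(x, freq[x], freq[x] / n) for x in sorted(freq)]

for x, f, share in table:
    print(f"x = {x}: f = {f:2d}, share = {share:.1%}")

# The sum of all frequencies must equal the number of observations.
print("sum of frequencies:", sum(f for _, f, _ in table))
```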

  1. Construction of an interval variation series

In addition to the discrete variational series, a method of grouping data such as an interval variational series is often encountered.

An interval series is constructed if:

    the characteristic varies continuously;

    there are many discrete values (more than 10);

    the frequencies of the discrete values are very small (no more than 1-3 with a relatively large number of observation units);

    many discrete values of the characteristic have the same frequency.

An interval variation series is a way of grouping data in the form of a table that has two columns (the values of the characteristic in the form of an interval of values, and the frequency of each interval).

Unlike a discrete series, the values of the characteristic in an interval series are represented not by individual values but by an interval of values ("from - to").

The number that shows how many observation units fall into each selected interval is called the frequency of the value and is denoted f i. The sum of all frequencies of a series is equal to the number of elements (units of observation) in the population being studied.

If a unit has a characteristic value equal to the upper limit of the interval, then it should be assigned to the next interval.

For example, a child with a height of 100 cm will fall into the 2nd interval, and not into the first; and a child with a height of 130 cm will fall into the last interval, and not into the third.

Based on these data, an interval variation series can be constructed.

Each interval has a lower limit (xn), an upper limit (xv) and an interval width (i).

The interval boundary is the value of the characteristic that lies on the border between two intervals.

[Table: children's height (cm), in intervals, and number of children; the last interval is "more than 130"]

If an interval has both an upper and a lower boundary, it is called a closed interval. If an interval has only a lower or only an upper boundary, it is an open interval. Only the very first or the very last interval can be open. In the example above, the last interval is open.

Interval width (i) is the difference between the upper and lower limits:

i = xv - xn

The width of an open interval is assumed to be the same as the width of the adjacent closed interval.

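[Table: children's height (cm), number of children and interval width (i). For the open interval "more than 130", the width is taken as 20, because the width of the adjacent closed interval is 20, so the upper limit used for calculations is 130 + 20 = 150.]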

All interval series are divided into interval series with equal intervals and interval series with unequal intervals. In interval series with equal intervals, the width of all intervals is the same. In interval series with unequal intervals, the widths of the intervals differ.

In the example under consideration - an interval series with unequal intervals.


Posted on http://www.allbest.ru/

TASK 1

The following information is available about the wages of employees at the enterprise:

Table 1.1

Wages, conventional monetary units

It is required to construct an interval distribution series and use it to find:

1) average salary;

2) average linear deviation;

4) standard deviation;

5) range of variation;

6) oscillation coefficient;

7) linear coefficient of variation;

8) simple coefficient of variation;

10) median;

11) asymmetry coefficient;

12) Pearson asymmetry index;

13) kurtosis coefficient.

Solution

As you know, the variants (observed values) are arranged in ascending order to form a discrete variation series. With a large number of variants (more than 10), interval series are constructed even in the case of discrete variation.

If an interval series is compiled with equal intervals, the range of variation is divided by the specified number of intervals. If the resulting value is an integer and unambiguous (which is rare), the length of the interval is taken to be equal to this number. In other cases the value is rounded, necessarily upward, so that the last remaining digit is even. Obviously, as the length of the interval increases, the range of variation expands by an amount equal to the number of intervals multiplied by the difference between the rounded and the initial length of the interval.

a) If the expansion of the range of variation is insignificant, it is either added to the largest value of the characteristic or subtracted from the smallest;

b) If the expansion of the range of variation is noticeable, then, so that the center of the range does not shift, it is divided approximately in half, with one half added to the largest value of the characteristic and the other subtracted from the smallest.

If an interval series with unequal intervals is compiled, the process is simpler, but the length of the intervals must still be expressed as a number whose last digit is even, which greatly simplifies the subsequent calculation of numerical characteristics.

30 is the sample size.

Let's determine the number of groups for the interval distribution series using the Sturges formula:

K = 1 + 3.32*lg n,

where K is the number of groups and n is the sample size;

K = 1 + 3.32*lg 30 ≈ 5.9, which rounds to 6.

We find the range of the characteristic x (the wages of workers at the enterprise) using the formula R = xmax - xmin and divide it by 6: R = 195 - 112 = 83.

Then the length of the interval will be l = 83 : 6 = 13.83.

The beginning of the first interval is 112. Adding l = 13.83 to 112, we get its upper limit, 125.83, which is also the beginning of the second interval, and so on. The end of the sixth interval is 195.
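The number of groups, the interval length and the boundaries can be reproduced as follows. This is a sketch: `sturges_groups` is an illustrative helper, and setting the last boundary to the maximum (195) rather than to 112 + 6 * 13.83 = 194.98 follows what the text does.

```python
import math


def sturges_groups(n):
    """Number of groups by the Sturges formula K = 1 + 3.32 * lg(n), rounded."""
    return round(1 + 3.32 * math.log10(n))


n = 30                      # sample size
x_min, x_max = 112, 195     # smallest and largest wage

k = sturges_groups(n)                # number of groups
value_range = x_max - x_min          # R = xmax - xmin
length = round(value_range / k, 2)   # interval length l

# Interval boundaries; the last one is set to x_max, as in the text.
edges = [round(x_min + j * length, 2) for j in range(k)] + [x_max]

print("K =", k, " R =", value_range, " l =", length)
print("boundaries:", edges)
```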

When finding frequencies, one should be guided by the rule: “if the value of a feature coincides with the boundary of the internal interval, then it should be attributed to the previous interval.”

We obtain an interval series of frequencies and cumulative frequencies.

Table 1.2

Thus, 3 employees have wages from 112 to 125.83 conventional monetary units. Only 6 employees have the highest wages, from 181.15 to 195 conventional monetary units.

To calculate the numerical characteristics, we transform the interval series into a discrete series, taking the midpoints of the intervals as the variants:

Table 1.3 (calculation table; the column of values $(x_i - \bar{x})^2 f_i$ totals 14131.83)

Using the weighted arithmetic mean formula:

$$ \bar{x} = \frac{\sum x_i f_i}{\sum f_i} = 159.485 $$

conventional monetary units

Average linear deviation:

$$ L = \frac{\sum |x_i - \bar{x}| f_i}{\sum f_i} $$

where $x_i$ is the value of the characteristic being studied for the i-th unit of the population,

$\bar{x}$ is the average value of the characteristic being studied.

L = 17.765 conventional monetary units

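Standard deviation:

$$ \sigma = \sqrt{\frac{\sum (x_i - \bar{x})^2 f_i}{\sum f_i}} = \sqrt{\frac{14131.83}{30}} \approx 21.704 $$

Dispersion:

$$ \sigma^2 = \frac{\sum (x_i - \bar{x})^2 f_i}{\sum f_i} = \frac{14131.83}{30} \approx 471.06 $$

Relative range of variation (oscillation coefficient): $c = R : \bar{x}$

Relative linear deviation: $q = L : \bar{x}$

Coefficient of variation: $V = \sigma : \bar{x}$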

The oscillation coefficient shows the relative fluctuation of the extreme values of the characteristic around the arithmetic mean, and the coefficient of variation characterizes the degree of variation and the homogeneity of the population.

$c = R : \bar{x}$ = 83 / 159.485 * 100% = 52.043%

Thus, the difference between the extreme values is 47.957% (= 100% - 52.043%) less than the average salary of employees at the enterprise.

$q = L : \bar{x}$ = 17.765 / 159.485 * 100% = 11.139%

$V = \sigma : \bar{x}$ = 21.704 / 159.485 * 100% = 13.609%

The coefficient of variation is less than 33%, which indicates a weak variation in wages of workers at the enterprise, i.e. that the average value is a typical characteristic of workers’ wages (the population is homogeneous).
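The indicators above can be checked numerically. The sketch below rests on one assumption that should be stated loudly: Table 1.2 is not reproduced in the text, so the frequencies 3, 3, 5, 7, 6, 6 are reconstructed from the fragments that are given (3 employees in the first interval, 6 in the last, an accumulated frequency of 3 + 3 + 5 + 7 for the median interval, and a total of 30).

```python
import math

# Interval boundaries from the text; the frequencies are an ASSUMPTION,
# reconstructed from the partial information given about Table 1.2.
edges = [112, 125.83, 139.66, 153.49, 167.32, 181.15, 195]
freqs = [3, 3, 5, 7, 6, 6]

# Midpoints of the intervals serve as the variants of the discrete series.
mids = [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]
n = sum(freqs)

mean = sum(x * f for x, f in zip(mids, freqs)) / n
lin_dev = sum(abs(x - mean) * f for x, f in zip(mids, freqs)) / n
variance = sum((x - mean) ** 2 * f for x, f in zip(mids, freqs)) / n
std = math.sqrt(variance)

value_range = edges[-1] - edges[0]
osc = value_range / mean * 100    # oscillation coefficient, %
rel_lin = lin_dev / mean * 100    # relative linear deviation, %
var_coef = std / mean * 100       # coefficient of variation, %

print(f"mean = {mean:.3f}, L = {lin_dev:.3f}, sigma = {std:.3f}")
print(f"c = {osc:.3f}%, q = {rel_lin:.3f}%, V = {var_coef:.3f}%")
```

With these frequencies the script reproduces the values quoted in the text (mean 159.485, L = 17.765, standard deviation 21.704), which supports the reconstruction but does not prove it.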

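In an interval distribution series, the mode is determined by the formula

$$ Mo = x_{Mo} + i_{Mo} \cdot \frac{f_{Mo} - f_{Mo-1}}{(f_{Mo} - f_{Mo-1}) + (f_{Mo} - f_{Mo+1})} $$

where $f_{Mo}$ is the frequency of the modal interval, i.e. the interval containing the greatest number of variants;

$f_{Mo-1}$ is the frequency of the interval preceding the modal one;

$f_{Mo+1}$ is the frequency of the interval following the modal one;

$i_{Mo}$ is the length of the modal interval;

$x_{Mo}$ is the lower limit of the modal interval.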

To determine the median in an interval series, we use the formula

$$ Me = x_{Me} + i_{Me} \cdot \frac{\frac{1}{2}\sum f_i - S_{Me-1}}{f_{Me}} $$

where $S_{Me-1}$ is the cumulative (accumulated) frequency of the interval preceding the median one;

$x_{Me}$ is the lower limit of the median interval;

$f_{Me}$ is the frequency of the median interval;

$i_{Me}$ is the length of the median interval.

The median interval is the interval whose accumulated frequency (= 3 + 3 + 5 + 7 = 18) exceeds half the sum of frequencies (15): (153.49; 167.32).
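The two interpolation formulas can be sketched in code as follows. The function names are illustrative, and the frequencies 3, 3, 5, 7, 6, 6 are again an assumption reconstructed from the text, because Table 1.2 itself is not available.

```python
def interval_mode(lower, width, f_mode, f_prev, f_next):
    """Mode of an interval series by the interpolation formula."""
    return lower + width * (f_mode - f_prev) / ((f_mode - f_prev) + (f_mode - f_next))


def interval_median(lower, width, f_median, cum_before, n):
    """Median of an interval series by the interpolation formula."""
    return lower + width * (n / 2 - cum_before) / f_median


edges = [112, 125.83, 139.66, 153.49, 167.32, 181.15, 195]
freqs = [3, 3, 5, 7, 6, 6]   # ASSUMPTION: reconstructed, not quoted from Table 1.2
n = sum(freqs)

# Modal interval: the one with the highest frequency.
m = freqs.index(max(freqs))
f_prev = freqs[m - 1] if m > 0 else 0
f_next = freqs[m + 1] if m + 1 < len(freqs) else 0
mode = interval_mode(edges[m], edges[m + 1] - edges[m], freqs[m], f_prev, f_next)

# Median interval: the first one whose accumulated frequency reaches n / 2.
cum = 0
for k, f in enumerate(freqs):
    if cum + f >= n / 2:
        break
    cum += f
median = interval_median(edges[k], edges[k + 1] - edges[k], freqs[k], cum, n)

print(f"Mo = {mode:.2f}, Me = {median:.2f}")
```

Under this assumption both the mode and the median fall in the interval (153.49; 167.32), which agrees with the median interval named in the text.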

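Let's calculate the asymmetry and kurtosis, for which we create a new calculation table:

Table 1.4 (columns: factual data; calculated data)

Let's calculate the third-order central moment:

$$ \mu_3 = \frac{\sum (x_i - \bar{x})^3 f_i}{\sum f_i} $$

Therefore, the asymmetry is equal to

$$ As = \frac{\mu_3}{\sigma^3} $$

Since |As| = 0.3553 > 0.25, the asymmetry is considered significant.

Let's calculate the fourth-order central moment:

$$ \mu_4 = \frac{\sum (x_i - \bar{x})^4 f_i}{\sum f_i} $$

Therefore, the kurtosis is equal to

$$ Ex = \frac{\mu_4}{\sigma^4} - 3 $$

Since Ex < 0, the distribution is flat-topped (platykurtic).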

The degree of asymmetry can also be determined using the Pearson asymmetry coefficient (As):

$$ As = \frac{\bar{x} - Mo}{\sigma} $$

where $\bar{x}$ is the arithmetic mean of the distribution series; Mo is the mode; $\sigma$ is the standard deviation.

With a symmetric (normal) distribution, $\bar{x}$ = Mo, and therefore the asymmetry coefficient is zero. If As > 0, then $\bar{x}$ is greater than the mode, and therefore there is right-sided asymmetry.

If As < 0, then $\bar{x}$ is less than the mode, and therefore there is left-sided asymmetry. The asymmetry coefficient can vary from -3 to +3.

The distribution is not symmetrical; it has left-sided asymmetry.
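The moment-based asymmetry, the kurtosis and the Pearson coefficient can be checked with the sketch below. As before, the frequencies are an assumption reconstructed from the text, and the mode is computed from the interpolation formula for the interval (153.49; 167.32) under that same assumption.

```python
import math

edges = [112, 125.83, 139.66, 153.49, 167.32, 181.15, 195]
freqs = [3, 3, 5, 7, 6, 6]   # ASSUMPTION: reconstructed, not quoted from Table 1.2

mids = [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]
n = sum(freqs)
mean = sum(x * f for x, f in zip(mids, freqs)) / n
std = math.sqrt(sum((x - mean) ** 2 * f for x, f in zip(mids, freqs)) / n)


def central_moment(order):
    """Weighted central moment of the given order."""
    return sum((x - mean) ** order * f for x, f in zip(mids, freqs)) / n


skewness = central_moment(3) / std ** 3     # As = mu3 / sigma^3
excess = central_moment(4) / std ** 4 - 3   # Ex = mu4 / sigma^4 - 3

# Mode of the modal interval (frequency 7, neighbours 5 and 6).
mode = 153.49 + 13.83 * (7 - 5) / ((7 - 5) + (7 - 6))
pearson = (mean - mode) / std               # Pearson asymmetry coefficient

print(f"As = {skewness:.4f}, Ex = {excess:.3f}, Pearson As = {pearson:.4f}")
```

Both asymmetry measures come out negative here, which is consistent with the left-sided asymmetry stated in the text, and the absolute value of the moment-based coefficient matches the quoted 0.3553.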

TASK 2

What should the sample size be so that with probability 0.954 the sampling error does not exceed 0.04 if, based on previous surveys, it is known that the variance is 0.24?

Solution

The sample size for non-repetitive sampling is calculated using the formula:

$$ n = \frac{t^2 \sigma^2 N}{\Delta_x^2 N + t^2 \sigma^2} $$

where t is the confidence coefficient (for a probability of 0.954 it equals 2.0; determined from tables of the probability integral),

$\sigma^2$ = 0.24 is the variance;

N = 10,000 people is the size of the general population;

$\Delta_x$ = 0.04 is the maximum error of the sample mean.

$$ n = \frac{2^2 \cdot 0.24 \cdot 10000}{0.04^2 \cdot 10000 + 2^2 \cdot 0.24} = \frac{9600}{16.96} \approx 566 $$

With a probability of 95.4%, it can be stated that the sample size ensuring an error of no more than 0.04 should be at least 566 families.
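The calculation can be sketched as a small function. The function and parameter names are illustrative; the formula is the one for non-repetitive sampling used in the solution, with N = 10,000 taken from the solution rather than from the problem statement.

```python
def sample_size_no_replacement(t, variance, max_error, population):
    """Required sample size for non-repetitive (without replacement) sampling."""
    return (t ** 2 * variance * population) / (
        max_error ** 2 * population + t ** 2 * variance
    )


n_required = sample_size_no_replacement(
    t=2.0,               # confidence coefficient for probability 0.954
    variance=0.24,       # variance from previous surveys
    max_error=0.04,      # maximum sampling error
    population=10_000,   # size of the general population
)
print(f"n = {n_required:.2f}")
```

The raw value is slightly above 566. The text reports 566; rounding the raw value up to the next whole unit would give 567.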

TASK 3

The following data is available on income from the main activities of the enterprise, million rubles.

To analyze a series of dynamics, determine the following indicators:

1) chain and basic:

Absolute increases;

Growth rates;

Rates of increase;

2) average

Dynamics row level;

Absolute increase;

Growth rate;

Rate of increase;

3) absolute value of 1% increase.

Solution

1. Absolute increase (Δy) is the difference between a given level of the series and the previous (or base) level:

chain: Δy = yi - yi-1,

basic: Δy = yi - y0,

where yi is the level of the series,

i is the number of the level,

y0 is the base year level.

2. Growth rate (Tu) is the ratio of a given level of the series to the previous level (or to the level of the base year, 2001):

chain: $T_u = \frac{y_i}{y_{i-1}} \cdot 100\%$;

basic: $T_u = \frac{y_i}{y_0} \cdot 100\%$

3. Rate of increase (TΔ) is the ratio of the absolute increase to the previous (or base) level, expressed in %:

chain: $T_\Delta = \frac{y_i - y_{i-1}}{y_{i-1}} \cdot 100\%$;

basic: $T_\Delta = \frac{y_i - y_0}{y_0} \cdot 100\%$

4. Absolute value of a 1% increase (A) is the ratio of the chain absolute increase to the chain rate of increase, expressed in %:

$$ A = \frac{y_i - y_{i-1}}{T_\Delta} = 0.01 \cdot y_{i-1} $$

The average level of the series is calculated using the arithmetic mean formula:

$$ \bar{y} = \frac{\sum y_i}{n} $$

Average level of income from core activities for 4 years:

The average absolute increase is calculated by the formula:

$$ \overline{\Delta y} = \frac{y_n - y_0}{n - 1} $$

where n is the number of levels of the series.

On average, income from core activities increased by 3.333 million rubles per year.

The average annual growth rate is calculated using the geometric mean formula:

$$ \bar{T}_u = \sqrt[n-1]{\frac{y_n}{y_0}} \cdot 100\% $$

where yn is the final level of the series,

y0 is the first level of the series.

$\bar{T}_u$ = 102.174%

The average annual rate of increase is calculated by the formula:

$\bar{T}_\Delta = \bar{T}_u - 100\%$ = 102.174% - 100% = 2.174%.

Thus, income from the main activities of the enterprise increased by 2.174% per year on average.
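The indicators of this task can be sketched in code. The source table is not reproduced in the text, so the four levels below are hypothetical: only the first and last values (150 and 160) were chosen so that the averages match the 3.333 million rubles and 102.174% quoted above, and the two middle values are invented purely for illustration.

```python
# HYPOTHETICAL income levels for 4 years, million rubles (not the source data).
levels = [150.0, 152.0, 157.0, 160.0]
base = levels[0]
pairs = list(zip(levels, levels[1:]))   # (previous level, current level)

# Absolute increases.
chain_abs = [cur - prev for prev, cur in pairs]
base_abs = [cur - base for cur in levels[1:]]

# Growth rates, %.
chain_growth = [cur / prev * 100 for prev, cur in pairs]
base_growth = [cur / base * 100 for cur in levels[1:]]

# Rates of increase, %.
chain_inc = [g - 100 for g in chain_growth]
base_inc = [g - 100 for g in base_growth]

# Absolute value of a 1% increase: 0.01 of the previous level.
one_percent = [prev / 100 for prev, _ in pairs]

# Averages.
n = len(levels)
avg_level = sum(levels) / n
avg_abs = (levels[-1] - levels[0]) / (n - 1)
avg_growth = (levels[-1] / levels[0]) ** (1 / (n - 1)) * 100
avg_inc = avg_growth - 100

print(f"average absolute increase = {avg_abs:.3f}")
print(f"average growth rate = {avg_growth:.3f}%, rate of increase = {avg_inc:.3f}%")
```

Note that the average absolute increase and the average growth rate depend only on the first and last levels, so they are unaffected by the invented middle values.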

TASK 4

Calculate:

1. Individual price indices;

2. General trade turnover index;

3. Aggregate price index;

4. Aggregate index of the physical volume of sales of goods;

5. Break down the absolute increase in the value of trade turnover by factors (due to changes in prices and the number of goods sold);

6. Draw brief conclusions on all obtained indicators.

Solution

1. According to the condition, the individual price indices for products A, B and C were:

ipA = 1.20; ipB = 1.15; ipC = 1.00.

2. We calculate the general trade turnover index using the formula:

$$ I_w = \frac{\sum p_1 q_1}{\sum p_0 q_0} = \frac{1470}{1045} \cdot 100\% = 140.67\% $$

Trade turnover increased by 40.67% (140.67% - 100%).

3. Aggregate price index:

$$ I_p = \frac{\sum p_1 q_1}{\sum p_0 q_1} = \frac{1470}{1333.478} \cdot 100\% = 110.24\% $$

On average, commodity prices increased by 10.24%.

The amount of additional costs of buyers from price increases:

Δw(p) = Σp1q1 - Σp0q1 = 1470 - 1333.478 = 136.522 million rubles.

As a result of rising prices, buyers had to spend an additional 136.522 million rubles.

4. General index of the physical volume of trade turnover:

$$ I_q = \frac{\sum p_0 q_1}{\sum p_0 q_0} = \frac{1333.478}{1045} \cdot 100\% = 127.61\% $$

The physical volume of trade turnover increased by 27.61%.

5. Let's determine the overall change in trade turnover in the second period compared to the first period:

Δw = 1470 - 1045 = 425 million rubles,

due to price changes:

Δw(p) = 1470 - 1333.478 = 136.522 million rubles,

due to changes in physical volume:

Δw(q) = 1333.478 - 1045 = 288.478 million rubles.

The turnover of goods increased by 40.67%. Prices on average for 3 goods increased by 10.24%. The physical volume of trade turnover increased by 27.61%.

In general, sales volume increased by 425 million rubles, including due to rising prices it increased by 136.522 million rubles, and due to an increase in sales volumes - by 288.478 million rubles.
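The index system can be verified from the three aggregates that appear in the solution. This is a sketch: the per-product table is not available, so only the totals quoted in the text are used, and the variable names are illustrative.

```python
# Aggregates quoted in the solution, million rubles.
sum_p1q1 = 1470.0      # turnover of the reporting period
sum_p0q0 = 1045.0      # turnover of the base period
sum_p0q1 = 1333.478    # reporting-period quantities at base-period prices

turnover_index = sum_p1q1 / sum_p0q0 * 100   # I_w
price_index = sum_p1q1 / sum_p0q1 * 100      # I_p
volume_index = sum_p0q1 / sum_p0q0 * 100     # I_q

delta_total = sum_p1q1 - sum_p0q0    # overall change in turnover
delta_price = sum_p1q1 - sum_p0q1    # part due to price changes
delta_volume = sum_p0q1 - sum_p0q0   # part due to physical volume

print(f"I_w = {turnover_index:.2f}%, I_p = {price_index:.2f}%, I_q = {volume_index:.2f}%")
print(f"change: {delta_total:.3f} = {delta_price:.3f} + {delta_volume:.3f}")
```

Two consistency checks follow from the definitions: the turnover index equals the product of the price and volume indices, and the overall change equals the sum of the two factor changes.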

TASK 5

The following data is available for 10 factories in one industry.

Table columns: plant number; product output, thousand pcs. (X); electricity consumption (Y)

Based on the given data:

1) to confirm the conclusions of logical analysis about the presence of a linear correlation between the factor characteristic (volume of output) and the resultant characteristic (electricity consumption), plot the initial data on a graph of the correlation field, draw conclusions about the form of the relationship, and indicate its formula;

2) determine the parameters of the connection equation and plot the resulting theoretical line on the graph of the correlation field;

3) calculate the linear correlation coefficient;

4) explain the meaning of the indicators obtained in paragraphs 2) and 3);

5) using the resulting model, make a forecast about the possible energy consumption at a plant with a production volume of 4.5 thousand units.

Solution

The values of the factor characteristic (the volume of production) will be denoted by xi, and the values of the resultant characteristic (electricity consumption) by yi; the points with coordinates (x, y) are plotted on the correlation field OXY.

The points of the correlation field lie along a certain straight line. Therefore, the relationship is linear, and we look for the regression equation in the form of a straight line $\hat{y}_x = ax + b$. To find it, we use the system of normal equations:

$$ \begin{cases} a \sum x_i^2 + b \sum x_i = \sum x_i y_i \\ a \sum x_i + b n = \sum y_i \end{cases} $$

Let's create a calculation table.

Using the averages found, we compose the system and solve it with respect to the parameters a and b:

$$ \begin{cases} a \overline{x^2} + b \bar{x} = \overline{xy} \\ a \bar{x} + b = \bar{y} \end{cases} $$

So, we get the regression equation of y on x: $\hat{y}$ = 3.57692 x + 3.19231

We build a regression line on the correlation field.

Substituting the x values from column 2 into the regression equation, we obtain the calculated values $\hat{y}$ (column 7) and compare them with the y data, which is reflected in column 8. Incidentally, the correctness of the calculations is confirmed by the coincidence of the average values of y and $\hat{y}$.

The linear correlation coefficient evaluates the closeness of the relationship between the characteristics x and y and is calculated using the formula
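The formula is not reproduced in the text. One common form of the linear (Pearson) correlation coefficient, written in the same averages used for the regression, is r = (mean(xy) - mean(x)·mean(y)) / (σx·σy). The sketch below assumes this form; the function name and the sample points are hypothetical.

```python
import math


def linear_correlation(xs, ys):
    """Pearson correlation coefficient computed from averages."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    mean_xy = sum(x * y for x, y in zip(xs, ys)) / n
    # Population standard deviations (divide by n, not n - 1).
    sigma_x = math.sqrt(sum(x * x for x in xs) / n - mean_x ** 2)
    sigma_y = math.sqrt(sum(y * y for y in ys) / n - mean_y ** 2)
    if sigma_x == 0 or sigma_y == 0:
        raise ValueError("a constant sample has no defined correlation")
    return (mean_xy - mean_x * mean_y) / (sigma_x * sigma_y)


# Hypothetical illustration data (NOT the task's table).
r = linear_correlation([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 5.0, 9.0])
print(round(r, 4))
```

The coefficient always lies between -1 and 1, and its sign matches the sign of the slope a.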

The slope a of the regression line (the coefficient of x) characterizes the direction of the identified dependence between the characteristics: for a > 0 they change in the same direction, for a < 0 in opposite directions. Its absolute value measures the change in the resultant characteristic when the factor characteristic changes by one unit of measurement.

The intercept of the regression line reveals the direction, and its absolute value gives a quantitative measure, of the influence of all other factors on the resultant characteristic.

If < 0, then the resource of the factor characteristic of an individual object is used with less efficiency, and if > 0 with greater efficiency, than the average for the entire set of objects.

Let's conduct a post-regression analysis.

1. The coefficient of x in the regression equation is 3.57692 > 0; therefore, as product output increases (decreases), electricity consumption increases (decreases). An increase in output of 1 thousand units raises electricity consumption by 3.57692 thousand kWh on average.

2. The intercept of the regression line is 3.19231; therefore, the influence of other factors adds 3.19231 thousand kWh, in absolute terms, to the effect of product output on electricity consumption.

3. The correlation coefficient of 0.8235 reveals a very close dependence of electricity consumption on product output.

Forecasts are easy to make with the regression equation: values of x (the volume of production) are substituted into the equation to predict electricity consumption. The values of x can be taken not only within the observed range, but also outside it.

Let's make a forecast about the possible energy consumption at a plant with a production volume of 4.5 thousand units.

Yx = 3.57692 · 4.5 + 3.19231 = 19.28845 thousand kWh.
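The substitution can be reproduced with a short Python sketch. The coefficients are the ones from the regression equation above; the function name `forecast_consumption` is hypothetical.

```python
A = 3.57692  # slope from the regression equation in the text
B = 3.19231  # intercept from the regression equation in the text


def forecast_consumption(output_thousand_units, a=A, b=B):
    """Predicted electricity consumption (thousand kWh) for a given output."""
    return a * output_thousand_units + b


print(f"{forecast_consumption(4.5):.5f}")  # 19.28845
```

The same function gives a forecast for any other output volume. Values far outside the observed range should be treated with more caution, because the linear form has only been checked on the available data.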

