Measures of Central Tendency in Statistics

Edited By Komal Miglani | Updated on Sep 19, 2024 02:04 PM IST

In statistics, the central value of a data set is an important concept because it summarizes the whole set with a single representative value, such as the mean. Measures of this kind are most informative when the data cluster around one value, and they are the starting point for many problems in statistics.

Central Value of Data

A measure of central tendency (or central value) is a single value that attempts to describe a set of data by identifying the central position within that set of data. Apart from mean (often called the average), there are other central values such as the median and the mode.

The mean, median, and mode are all valid measures of central tendency, but under different conditions, some measures of central tendency become more appropriate to use than others.

Importance of Central Value of Data:

1. It simplifies complex data by summarizing it with a single value.

2. It facilitates comparison between different data sets.

Mean

The mean is equal to the sum of all the values in the data set divided by the number of values in the data set. If we have $n$ values in a data set, i.e. $x_1, x_2, x_3, \ldots, x_n$, then its mean, usually denoted by $\bar{x}$ (pronounced " $x$ bar"), is:

$
\bar{x}=\frac{x_1+x_2+\cdots+x_n}{n}
$

Applications of Mean:
1. Calculating average income or expenditure.
2. Analyzing trends and patterns.

For example, to calculate the mean weight of 50 people, add the 50 weights together and divide by 50. Technically this is the arithmetic mean.

Mean of the Ungrouped Data

If the $n$ observations in the data are $x_1, x_2, x_3, \ldots, x_n$, then the arithmetic mean $\bar{x}$ is given by

$
\bar{x}=\frac{x_1+x_2+x_3+\cdots+x_n}{n}=\frac{1}{n} \sum\limits_{i=1}^n x_i
$
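The formula translates directly into a few lines of Python. This is only a minimal sketch: the function name `arithmetic_mean` and the sample weights are illustrative assumptions, not part of the original text.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Hypothetical sample data (weights in kg), chosen only for illustration.
weights = [62, 70, 55, 81, 67]
print(arithmetic_mean(weights))  # 67.0
```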

Mean of Ungrouped Frequency Distribution

If the observations in the data are $x_1, x_2, x_3, \ldots, x_n$ with respective frequencies $f_1, f_2, f_3, \ldots, f_n$, then

Sum of the values of the observations $=f_1 x_1+f_2 x_2+f_3 x_3+\cdots+f_n x_n$
and number of observations $=f_1+f_2+f_3+\cdots+f_n$.
The mean in this case is given by

$\bar{x}=\frac{f_1 x_1+f_2 x_2+f_3 x_3+\cdots+f_n x_n}{f_1+f_2+f_3+\cdots+f_n}=\frac{\sum\limits_{i=1}^n f_i x_i}{\sum\limits_{i=1}^n f_i}$
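As a rough illustration of the frequency-weighted formula, the sketch below computes the mean from paired lists of values and frequencies. The function name `frequency_mean` and the marks data are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def frequency_mean(values, freqs):
    """Mean of an ungrouped frequency distribution: sum(f_i * x_i) / sum(f_i)."""
    if len(values) != len(freqs):
        raise ValueError("values and freqs must have the same length")
    total = sum(freqs)  # number of observations
    if total == 0:
        raise ValueError("the total frequency must be positive")
    return sum(f * x for x, f in zip(values, freqs)) / total


# Hypothetical data: marks obtained and the number of students with each mark.
marks = [2, 3, 5, 8]
counts = [1, 4, 3, 2]
print(frequency_mean(marks, counts))  # (2 + 12 + 15 + 16) / 10 = 4.5
```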

Grouped Frequency Distribution

For grouped data, each class (interval) is represented by its mid-point $m_i$, which takes the place of $x_i$:

$
m_i=\frac{\text { lower boundary }+ \text { upper boundary }}{2}
$

Then, $\bar{x}=\frac{\sum\limits_{i=1}^n f_i m_i}{\sum\limits_{i=1}^n f_i}$, where $f_i$ is the frequency of the $i$-th class.

For example,
The frequency table below shows the results of a professor's last statistics test. Find the best estimate of the class mean.

$
\begin{array}{|c|c|}
\hline \text { Grade Interval } & \text { Number of Students } \\
\hline 10-12 & 1 \\
\hline 12-14 & 2 \\
\hline 14-16 & 0 \\
\hline 16-18 & 4 \\
\hline 18-20 & 1 \\
\hline
\end{array}
$

First find the midpoints for all intervals

$
\begin{array}{|c|c|}
\hline \text { Grade Interval } & \text { Midpoint } \\
\hline 10-12 & 11 \\
\hline 12-14 & 13 \\
\hline 14-16 & 15 \\
\hline 16-18 & 17 \\
\hline 18-20 & 19 \\
\hline
\end{array}
$

Now calculate the sum of the product of each interval frequency and midpoint,

$
\begin{aligned}
& \sum_{i=1}^n f_i m_i=11(1)+13(2)+15(0)+17(4)+19(1)=124 \\
& \sum_{i=1}^n f_i=1+2+0+4+1=8 \\
& \bar{x}=\frac{\sum\limits_{i=1}^n f_i m_i}{\sum\limits_{i=1}^n f_i}=\frac{124}{8}=15.5
\end{aligned}
$
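The same calculation can be reproduced in code. The sketch below uses the grade intervals and frequencies from the table above; the function name `grouped_mean` is an assumption made for illustration.

```python
def grouped_mean(intervals, freqs):
    """Estimate the mean of grouped data using the class mid-points."""
    midpoints = [(lower + upper) / 2 for lower, upper in intervals]
    total = sum(freqs)
    if total == 0:
        raise ValueError("the total frequency must be positive")
    return sum(f * m for f, m in zip(freqs, midpoints)) / total


# Grade intervals and student counts from the frequency table in the example.
intervals = [(10, 12), (12, 14), (14, 16), (16, 18), (18, 20)]
students = [1, 2, 0, 4, 1]
print(grouped_mean(intervals, students))  # 124 / 8 = 15.5
```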

Median

The median is the middle value for a set of data that has been arranged in ascending or descending order.

It is a number that separates ordered data into 2 equal halves. Half the values are the same number or smaller than the median, and half the values are the same number or larger.

For example, to find the median of the following data

$\begin{array}{lllllllllll}65 & 55 & 89 & 56 & 35 & 14 & 56 & 55 & 87 & 45 & 92\end{array}$

We first rearrange that data into order (ascending)
$\begin{array}{lllllllllll}14 & 35 & 45 & 55 & 55 & 56 & 56 & 65 & 87 & 89 & 92\end{array}$
The median mark is the value exactly in the middle, in this case 56.
When the number of values $n$ is even, take the two middle values and average them.

The median is useful in income distribution analysis, because it is not pulled up or down by a few extreme values.

Median of Ungrouped Data

If the number of observations is $n$,
First arrange the observations in ascending or descending order.

If n is odd :

$
\text { Median }=\left(\frac{n+1}{2}\right)^{t h} \text { observation }
$

If n is even :

$
\text { Median }=\frac{\text { Value of }\left(\frac{n}{2}\right)^{t h} \text { observation }+ \text { Value of }\left(\frac{n}{2}+1\right)^{t h} \text { observation }}{2}
$

For example,

Consider the following data: $1 ; 11.5 ; 6 ; 7.2 ; 4 ; 8 ; 9 ; 10 ; 6.8 ; 8.3 ; 2 ; 2 ; 10 ; 1$
Ordered from smallest to largest: $1 ; 1 ; 2 ; 2 ; 4 ; 6 ; 6.8 ; 7.2 ; 8 ; 8.3 ; 9 ; 10 ; 10 ; 11.5$
Since there are $n=14$ observations, the median is the average of the $\left(\frac{n}{2}\right)^{th}=7^{th}$ and $\left(\frac{n}{2}+1\right)^{th}=8^{th}$ terms. So the median is the average of 6.8 and 7.2, which equals 7.

The median is 7. Half of the values are smaller than 7 and half of the values are larger than 7.
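The odd and even cases can be handled by one short function. The following is a minimal sketch (the name `median_ungrouped` is an assumption); it sorts the data and then applies the two rules stated above, using the data sets from this section.

```python
def median_ungrouped(values):
    """Median of raw data: middle value, or average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)-th observation sits at 0-based index mid.
        return ordered[mid]
    # Even n: average the (n / 2)-th and (n / 2 + 1)-th observations.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_data = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92]
even_data = [1, 11.5, 6, 7.2, 4, 8, 9, 10, 6.8, 8.3, 2, 2, 10, 1]
print(median_ungrouped(odd_data))   # 56
print(median_ungrouped(even_data))  # average of 6.8 and 7.2
```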

Median of Ungrouped Frequency Distribution

To find the median, first arrange the observations in ascending order. Then compute the cumulative frequencies.

Let the sum of the frequencies be denoted by $N$.
If $N$ is odd, identify the observation whose cumulative frequency is equal to or just greater than $\frac{N+1}{2}$. This observation lies in the middle of the data and is therefore the required median.

If $N$ is even, find two observations: the first whose cumulative frequency is equal to or just greater than $\frac{N}{2}$, and the second whose cumulative frequency is equal to or just greater than $\frac{N}{2}+1$. The median is the average of these two observations.
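The cumulative-frequency procedure can be sketched as follows. The function names and the sample marks are hypothetical; the lookup simply returns the first observation whose cumulative frequency reaches the requested position.

```python
from itertools import accumulate


def value_at_position(values, cum_freqs, position):
    """First value whose cumulative frequency is equal to or greater than position."""
    for x, cf in zip(values, cum_freqs):
        if cf >= position:
            return x
    raise ValueError("position exceeds the total frequency")


def frequency_median(values, freqs):
    """Median of an ungrouped frequency distribution."""
    pairs = sorted(zip(values, freqs))           # ascending order of observations
    xs = [x for x, _ in pairs]
    cum = list(accumulate(f for _, f in pairs))  # cumulative frequencies
    if not cum or cum[-1] == 0:
        raise ValueError("the total frequency must be positive")
    n = cum[-1]
    if n % 2 == 1:
        return value_at_position(xs, cum, (n + 1) // 2)
    lower = value_at_position(xs, cum, n // 2)
    upper = value_at_position(xs, cum, n // 2 + 1)
    return (lower + upper) / 2


# Hypothetical data: N = 10, so the 5th and 6th observations (3 and 5) are averaged.
print(frequency_median([2, 3, 5, 8], [1, 4, 3, 2]))  # 4.0
```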

Median of Continuous Frequency Distribution

In this case, with the classes arranged in ascending order, the following formula can be used:

$
\text { Median }=l+\frac{\left(\frac{N}{2}-c f\right)}{f} \times h
$

where
$l=$ lower limit of the median class,
$N=$ number of observations,
$c f=$ cumulative frequency of the class preceding the median class,
$f=$ frequency of the median class,
$h=$ class size (width), assuming all class sizes are equal.
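A short sketch of this formula is given below. It assumes that the median class is the first class whose cumulative frequency reaches $N/2$, which is the usual convention but is not spelled out above; the function name `grouped_median` and the class data are hypothetical.

```python
def grouped_median(lower_limits, freqs, h):
    """Median of a continuous frequency distribution with equal class width h."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("the total frequency must be positive")
    cf = 0  # cumulative frequency of the classes before the current one
    for l, f in zip(lower_limits, freqs):
        # Assumed convention: the median class is the first one whose
        # cumulative frequency is equal to or greater than N / 2.
        if cf + f >= n / 2:
            return l + (n / 2 - cf) * h / f
        cf += f


# Hypothetical classes 0-10, 10-20, 20-30, 30-40, 40-50 with N = 50.
lower_limits = [0, 10, 20, 30, 40]
freqs = [5, 8, 20, 12, 5]
# Median class is 20-30: l = 20, cf = 13, f = 20, h = 10.
print(grouped_median(lower_limits, freqs, 10))  # 20 + (25 - 13) * 10 / 20 = 26.0
```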

Mode

The mode is the most frequent value in our data set.

The mode is most often used for categorical data, where we wish to know which category is the most common, but it applies to numerical data as well. For example, consider the data

$
\begin{array}{llllllllllll}
65 & 55 & 89 & 56 & 35 & 14 & 56 & 55 & 87 & 45 & 92 & 55
\end{array}
$

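In the above case, the mode of the data set is 55, because it occurs three times, more often than any other value.

The mode is useful in market research, for example to identify the most popular product or choice.

In general, the mode is the value among the observations that occurs most often, that is, the observation having the maximum frequency.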
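Counting occurrences is all that is needed to find the mode of raw data. The sketch below uses `collections.Counter` from the Python standard library; the function name `modes` is an assumption, and it returns a list because a data set can have more than one most frequent value.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the maximum frequency."""
    if not values:
        raise ValueError("the mode of an empty data set is undefined")
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


data = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92, 55]
print(modes(data))  # [55]
```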

In a grouped frequency distribution, the mode cannot be read off directly from the frequencies. We can only locate the class with the maximum frequency, called the modal class. The mode is a value inside the modal class, and is given by the formula:

Mode $=l+\left(\frac{f_1-f_0}{2 f_1-f_0-f_2}\right) \times h$
where
$l=$ lower limit of the modal class,
$\mathrm{h}=$ size of the class interval (assuming all class sizes to be equal),
$\mathrm{f}_1=$ frequency of the modal class,
$\mathrm{f}_0=$ frequency of the class preceding the modal class,
$\mathrm{f}_2=$ frequency of the class succeeding the modal class.
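The grouped-data formula can be sketched in code as follows. Two choices here are assumptions rather than part of the text: when several classes share the maximum frequency the first one is used, and a missing neighbouring class (before the first or after the last) is treated as having frequency 0. The function name and the class data are hypothetical.

```python
def grouped_mode(lower_limits, freqs, h):
    """Mode of a grouped frequency distribution with equal class width h."""
    if not freqs:
        raise ValueError("at least one class is required")
    i = max(range(len(freqs)), key=lambda k: freqs[k])  # index of the modal class
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0                   # assumed 0 if no preceding class
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0      # assumed 0 if no succeeding class
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("the formula is undefined when 2*f1 - f0 - f2 is zero")
    return lower_limits[i] + (f1 - f0) * h / denominator


# Hypothetical classes 0-10, 10-20, 20-30, 30-40, 40-50.
lower_limits = [0, 10, 20, 30, 40]
freqs = [5, 8, 20, 12, 5]
# Modal class is 20-30: l = 20, f1 = 20, f0 = 8, f2 = 12, h = 10.
print(grouped_mode(lower_limits, freqs, 10))  # 20 + (12 / 20) * 10 = 26.0
```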

Solved Examples Based On Central Value Of Data:

Example 1: The median of the items $6,10,4,3,9,11,22,18$ is

1) $9$

2) $10$

3) $9.5$

4) $11$

Solution

A measure of central tendency (or measure of location) describes the average character of the data under study by a single quantity.

Let us arrange the items in ascending order: $3,4,6,9,10,11,18,22$.
In this data, the number of items is $\mathbf{n}=8$, which is even.
Median $=\mathrm{M}=$ average of $\left(\frac{n}{2}\right)$ th and $\left(\frac{n}{2}+1\right)$ th terms.
$=$ Average of $\left(\frac{8}{2}\right)$ th and $\left(\frac{8}{2}+1\right)$ th terms
$=$ Average of $4^{\text {th }}$ and $5^{\text {th }}$ terms
$
=\frac{9+10}{2}=\frac{19}{2}=9.5
$

Hence, the answer is option 3.

Example 2: In a class of 100 students there are $70$ boys whose average marks in a subject are $75$. If the average marks of the complete class is $72$, then what is the average of the girls?

1) $73$

2) $65$

3) $68$

4) $74$

Solution

Let $S_B$, $S_G$ and $S_T$ denote the total marks of the boys, the girls and the whole class, respectively.

$\begin{aligned} & \frac{\sum\limits_{i=1}^{70} x_i}{70}=75 \\ & \Rightarrow \frac{S_B}{70}=75 \\ & S_B=5250 \\ & \text { Also } \\ & \qquad \frac{\sum\limits_{i=1}^{100} x_i}{100}=72 \\ & \Rightarrow \frac{S_T}{100}=72 \\ & S_T=7200 \\ & \Rightarrow S_G=7200-5250 \\ & \quad=1950\end{aligned}$

$
\begin{aligned}
&\text { Since there are } 100-70=30 \text { girls, the mean marks of the girls }\\
&\begin{aligned}
& =\frac{1950}{30} \\
& =65
\end{aligned}
\end{aligned}
$

Hence, the correct option is option (2).
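The same reasoning (total sum minus the known group's sum, divided by the size of the remaining group) can be checked with a short sketch. The function name and parameter names are illustrative assumptions.

```python
def remaining_group_mean(total_count, total_mean, group_count, group_mean):
    """Mean of the rest of a population once one group's mean is known."""
    rest_count = total_count - group_count
    if rest_count <= 0:
        raise ValueError("the remaining group must be non-empty")
    total_sum = total_count * total_mean  # S_T in the solution above
    group_sum = group_count * group_mean  # S_B in the solution above
    return (total_sum - group_sum) / rest_count


# 100 students with mean 72, of whom 70 boys have mean 75.
print(remaining_group_mean(100, 72, 70, 75))  # (7200 - 5250) / 30 = 65.0
```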

Example 3: The mean of $5$ observations is $5$ and their variance is $12.4$. If three of the observations are $1, 2$ and $6$, then the mean deviation from the mean of the data is:

1) $2.4$

2) $2.8$

3) $2.5$

4) $2.6$

Solution

We first recall the following concepts.

Arithmetic Mean -

$
\begin{aligned}
&\text { For the values } x_1, x_2, \ldots, x_n \text { of the variate } x \text { the arithmetic mean is given by }\\
&\bar{x}=\frac{x_1+x_2+x_3+\cdots+x_n}{n}
\end{aligned}
$

Mean Deviation -

If $x_1, x_2, \ldots, x_n$ are $n$ observations, then the mean deviation from the point $A$ is given by:

$
\frac{1}{n} \sum\left|x_i-A\right|
$

Variance -

In case of discrete data,

$\sigma^2=\left(\frac{\sum x_i^2}{n}\right)-\left(\frac{\sum x_i}{n}\right)^2$

Now,

$\begin{aligned} & \frac{\sum x_i}{5}=5 \Rightarrow \sum x_i=25 \\ & \frac{\sum x_i^2}{n}-\left(\frac{\sum x_i}{n}\right)^2=12.4 \\ & \frac{\sum x_i^2}{5}-25=12.4 \\ & \sum x_i^2=37.4 \times 5=187\end{aligned}$

Let the two remaining observations be $a$ and $b$.

$
\begin{aligned}
& a+b+1+2+6=25 \\
& a+b=16 \\
& a^2+b^2+1^2+2^2+6^2=187 \\
& a^2+b^2+1+4+36=187 \\
& a^2+b^2=146 \\
& 2 a b=(a+b)^2-\left(a^2+b^2\right)=256-146=110 \Rightarrow a b=55
\end{aligned}
$

So $a$ and $b$ are the roots of $t^2-16 t+55=0$, that is, $5$ and $11$.

$\begin{aligned} & \text { Mean deviation }=\frac{\sum\left|x_i-5\right|}{5}=\frac{|1-5|+|2-5|+|6-5|+|a-5|+|b-5|}{5} \\ & =\frac{8+|5-5|+|11-5|}{5}=\frac{8+6}{5}=2.8\end{aligned}$

Hence, the answer is option 2.

Example 4: In a set of $2n$ distinct observations, each of the observations below the median of all the observations is increased by $5$ and each of the remaining observations is decreased by $3$. Then the mean of the new set of observations :

1) increases by $1$.

2) decreases by $1$.

3) decreases by $2$.

4) increases by $2$.

Solution

Let the observations in ascending order be $x_1, x_2, \ldots, x_{2n}$. Since they are distinct, exactly the $n$ smallest lie below the median.
New observations $=x_1+5, x_2+5, \ldots, x_n+5$

and $x_{n+1}-3, x_{n+2}-3, \ldots, x_{2 n}-3$

$\begin{aligned} \bar{x}_{\text {new }} & =\frac{\sum x_i+5 n-3 n}{2 n} \\ & =\frac{\sum x_i}{2 n}+1 \\ & =\bar{x}_{\text {old }}+1\end{aligned}$

Hence, the answer is option 1.
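The result can be checked numerically on any set of $2n$ distinct values. The sketch below uses hypothetical data and an illustrative function name; it raises the lower half by 5, lowers the upper half by 3, and reports the change in the mean.

```python
def change_in_mean(values):
    """Change in the mean when the lower half gains 5 and the upper half loses 3."""
    ordered = sorted(values)
    if len(ordered) % 2 != 0 or len(set(ordered)) != len(ordered):
        raise ValueError("expected an even number of distinct observations")
    half = len(ordered) // 2
    # With 2n distinct values, the n smallest lie below the median.
    new_values = [x + 5 for x in ordered[:half]] + [x - 3 for x in ordered[half:]]
    return (sum(new_values) - sum(ordered)) / len(ordered)


# Hypothetical data with 2n = 6 distinct observations.
print(change_in_mean([4, 9, 15, 20, 27, 31]))  # (5*3 - 3*3) / 6 = 1.0
```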

Example 5: All the students of a class performed poorly in Mathematics. The teacher decided to give grace marks of $10$ to each of the students. Which of the following statistical measures will not change even after the grace marks are given?

1) variance

2) mean

3) median

4) mode

Solution

Mean, median, and mode are measures of central tendency. When the same amount is added to every observation, each of them increases by that amount.

Variance is a measure of the scattering of data. It is a measure of dispersion, which does not change if every observation changes by the same amount.

So the measures of central tendency change, but the measures of dispersion do not.

Therefore, the variance will not change.

Hence, the answer is option (1).
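This behaviour is easy to confirm with Python's standard `statistics` module. The marks below are hypothetical and serve only as an illustration; `pvariance` is the population variance.

```python
import statistics

# Hypothetical marks before the grace marks are added.
marks = [12, 15, 15, 18, 20]
graced = [m + 10 for m in marks]  # every student receives 10 grace marks

# Each measure of central tendency shifts by exactly 10.
print(statistics.mean(marks), statistics.mean(graced))      # 16 26
print(statistics.median(marks), statistics.median(graced))  # 15 25
print(statistics.mode(marks), statistics.mode(graced))      # 15 25

# The variance, a measure of dispersion, is unchanged.
print(statistics.pvariance(marks) == statistics.pvariance(graced))  # True
```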

Summary

The central value of data plays a critical role in understanding and analyzing a data set. The mean, median, and mode each summarize the data around a central point, and each suits different kinds of data. Understanding these measures helps the analyst interpret results more reliably.

Frequently Asked Questions (FAQs)

1. What are the measures of central tendency?

The valid measures of central tendency are mean, median and mode.

2. Define mean.

The mean is equal to the sum of all the values in the data set divided by the number of values in the data set.

3. What is a median?

The median is the middle value for a set of data that has been arranged in ascending or descending order.

4. What is mode?

The mode is the most frequent value in our data set.
