The distribution of a data set describes how often each value occurs. When the values are plotted on a frequency graph, the shape of the graph shows the distribution. In most cases, we cannot collect all of the data for our variable of interest, so we take a sample and use its results to draw conclusions about the entire data set. To ensure that the sample accurately reflects the entire data set, we must first understand the limitations of sampling.
A symmetric distribution is one in which the left and right sides are mirror images of one another. The normal distribution, which has a distinct bell shape, is the most well-known symmetric distribution.
A left-skewed distribution has a long left tail. Left-skewed distributions are also called negatively skewed distributions, because the long tail extends in the negative direction on the number line. In addition, the mean lies to the left of the peak.
A right-skewed distribution has a long right tail. Right-skewed distributions are also called positively skewed distributions, because the long tail extends in the positive direction on the number line. In addition, the mean lies to the right of the peak.
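The three shapes described above can be told apart roughly by comparing the mean with the median, since a long tail pulls the mean toward it. The sketch below is only a rule of thumb, not a formal test: the function name `describe_skew` and the three small data sets are hypothetical and chosen for illustration.

```python
from statistics import mean, median

def describe_skew(values):
    """Rough shape check: a long tail pulls the mean away from the median."""
    m, med = mean(values), median(values)
    if m < med:
        return "left-skewed"      # mean pulled toward a long left tail
    if m > med:
        return "right-skewed"     # mean pulled toward a long right tail
    return "roughly symmetric"

# Hypothetical data sets (not taken from the text).
left_tail = [1, 7, 8, 8, 8, 9, 9, 10]
right_tail = [1, 2, 2, 3, 3, 3, 4, 10]
balanced = [1, 2, 3, 4, 5]

print(describe_skew(left_tail))   # left-skewed
print(describe_skew(right_tail))  # right-skewed
print(describe_skew(balanced))    # roughly symmetric
```

This comparison can mislead on small or unusual data sets, so a histogram or dot plot remains the more reliable way to judge shape.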
The mean absolute deviation (MAD) is calculated using the mean.
When describing a symmetric data distribution, use the mean to describe the centre and the MAD to describe the variation.
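As a minimal sketch of how the MAD is calculated from the mean: find the mean, find each value's distance from the mean, and average those distances. The function name `mean_absolute_deviation` and the data set are hypothetical.

```python
from statistics import mean

def mean_absolute_deviation(values):
    """Average distance of each value from the mean of the data."""
    center = mean(values)
    return sum(abs(v - center) for v in values) / len(values)

# Hypothetical, roughly symmetric data set (not taken from the text).
data = [2, 4, 4, 5, 6, 6, 8]

print(mean(data))                               # 5
print(round(mean_absolute_deviation(data), 2))  # 1.43
```

Here the values are centred on 5 and lie, on average, about 1.43 units away from it.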
The interquartile range (IQR) is calculated using quartiles. When describing a skewed data distribution, use the median to describe the centre and the IQR to describe the variation.
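The sketch below shows one way to compute the IQR from the quartiles. It assumes the "median of each half" convention often taught in school, where the first and third quartiles are the medians of the lower and upper halves of the ordered data (the overall median is left out of both halves when the count is odd). Other quartile conventions exist and can give slightly different values. The function names and the data set are hypothetical, and the data set needs at least two values.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves convention."""
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[: n // 2]          # values below the middle position
    upper = ordered[(n + 1) // 2 :]    # values above the middle position
    return median(lower), median(ordered), median(upper)

def interquartile_range(values):
    """Spread of the middle half of the data: Q3 minus Q1."""
    q1, _, q3 = quartiles(values)
    return q3 - q1

# Hypothetical right-skewed data set (not taken from the text).
data = [1, 2, 2, 3, 3, 4, 5, 9, 15]

print(interquartile_range(data))  # 5.0
```

For this data set the median is 3 and the middle half of the values spans 5 units, even though the largest value, 15, lies far out in the right tail.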
The least value, greatest value, and quartiles of the data are used to represent a data set along a number line in a box-and-whisker plot. A box-and-whisker plot depicts the variability of a data set.
The diagram below shows a box-and-whisker plot:
The steps to create a Box-and-Whisker Plot are shown below:
Step 1: Arrange the given data into ascending order.
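Step 2: Draw a number line that includes the least and greatest values of the data. Plot the least value, the first quartile, the median, the third quartile, and the greatest value as points above the number line. These five values are known as the five-number summary.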
Step 3: Using the first and third quartiles, draw a box. Draw a line through the box at the median. Draw whiskers from the ends of the box to the least and greatest values.
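The values needed for the steps above can be computed as in the following sketch. It assumes the quartiles are found as the medians of the lower and upper halves of the ordered data, which is one common school convention. The function name `five_number_summary` and the data set are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (least, Q1, median, Q3, greatest) for a box-and-whisker plot."""
    ordered = sorted(values)              # Step 1: ascending order
    n = len(ordered)
    lower = ordered[: n // 2]             # lower half of the data
    upper = ordered[(n + 1) // 2 :]       # upper half of the data
    return (
        ordered[0],        # least value: end of the left whisker
        median(lower),     # first quartile: left edge of the box
        median(ordered),   # median: line inside the box
        median(upper),     # third quartile: right edge of the box
        ordered[-1],       # greatest value: end of the right whisker
    )

# Hypothetical data set (not taken from the text).
data = [12, 5, 7, 9, 3, 8, 10, 6]

print(five_number_summary(data))  # (3, 5.5, 7.5, 9.5, 12)
```

With these five values, the box runs from 5.5 to 9.5, the median line sits at 7.5, and the whiskers reach out to 3 and 12.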
Example 1: The frequency table shows the age distribution of people watching a comedy in a theatre. Display the data in a histogram. Then describe the shape of the distribution.
Answer: Draw and label the axes, then draw a bar to represent the frequency of each interval.
Most of the data are on the right, with the tail extending to the left. Hence, the distribution is skewed left.
Example 2: The histogram shows the number of passes made by a player in the game of rugby union. Using the histogram below, describe the shape of the distribution.
Answer: The graph’s left side is almost a mirror image of the graph’s right side. As a result, the distribution follows a symmetrical pattern.
Example 3: The histogram below shows data on students studying at night. Identify the shape of the distribution.
Answer: Most of the data are on the left, with the tail extending to the right. Hence, the distribution is skewed right.
Example 4: The dot plot depicts the average number of hours each class member sleeps each night. Describe the data set’s center and variability.
Answer: The majority of the data values are clustered around 9 on the right, with the tail extending to the left. Because the distribution is skewed to the left, the median and interquartile range are the best ways to describe the center and variation.
The median is 8.5 hours. The first quartile is 7.5 and the third quartile is 9, so the interquartile range is 9 – 7.5 = 1.5 hours.
The data are centered around 8.5 hours. The middle half of the data varies by no more than 1.5 hours.
Example 5: The body mass index (BMI) of a sixth-grade class is depicted in this box-and-whisker plot.
(a) Are the data more spread out below the first quartile or above the third quartile? Explain.
(b) Find and interpret the data’s interquartile range.
(c) What percentage of the students have a BMI of 22 or higher?
Answer: (a) The right whisker is slightly longer than the left. As a result, the data are more spread out above the third quartile than below the first quartile.
(b) Interquartile range = third quartile – first quartile = 22 – 19 = 3. As a result, the BMIs of the middle half of the students differ by no more than 3 points.
(c) Students with a BMI of at least 22 are represented by the right whisker. As a result, roughly a quarter of the students have a BMI of at least 22.
Probability plots can help identify the distribution of your data. They may be the most effective way to see whether your data follow a specific distribution: if the data points follow the straight line on the graph, the distribution fits your data.
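To make the idea concrete, the sketch below builds the points of a normal probability plot by hand, pairing each ordered data value with the standard normal value expected at its rank. It then uses the correlation between the two as a rough measure of how straight the plotted points are. The plotting position (i – 0.5) / n is one common convention assumed here, and the function names and both data sets are hypothetical.

```python
from statistics import NormalDist, mean

def normal_probability_points(values):
    """Pair each ordered value with the normal quantile expected at its rank."""
    ordered = sorted(values)
    n = len(ordered)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    # Assumed plotting position for the i-th ordered value: (i - 0.5) / n.
    expected = [standard_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(expected, ordered))

def straightness(points):
    """Correlation of the plotted points; values near 1 suggest a straight line."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical data sets (not taken from the text).
symmetric = [4.1, 4.8, 5.0, 5.2, 5.9, 4.5, 5.5]
skewed = [1, 1, 2, 2, 3, 4, 9, 20]

print(straightness(normal_probability_points(symmetric)))  # close to 1
print(straightness(normal_probability_points(skewed)))     # noticeably lower
```

The symmetric data lie almost on a straight line, while the long right tail of the skewed data bends the points away from it.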
The normal distribution, also known as the Gaussian distribution, is a symmetric probability distribution centred on the mean, indicating that data near the mean occur more frequently than data far from it. The normal distribution will appear as a bell curve on a graph.
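These properties can be checked numerically with a short sketch. The mean of 50 and standard deviation of 10 are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist

# Hypothetical normal distribution: mean 50, standard deviation 10.
scores = NormalDist(mu=50, sigma=10)

# The curve is higher near the mean than far from it.
print(scores.pdf(50) > scores.pdf(80))                  # True

# The curve is symmetric: equal distances from the mean give equal heights.
print(abs(scores.pdf(40) - scores.pdf(60)) < 1e-12)     # True

# Share of the data within one standard deviation of the mean.
print(round(scores.cdf(60) - scores.cdf(40), 2))        # 0.68
```

About 68% of the values fall within one standard deviation of the mean, which is why the bell curve is tallest in the middle.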
A data distribution is a function or a list that displays all of the data’s possible values (or intervals). It also tells you how often each value occurs (which is very important).
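As a small illustration of a distribution written as a list, the sketch below counts how often each value occurs and prints one row of marks per value, much like a dot plot turned on its side. The function name `frequency_distribution` and the data set are hypothetical.

```python
from collections import Counter

def frequency_distribution(values):
    """Map each distinct value to how often it occurs, in ascending order."""
    counts = Counter(values)
    return {value: counts[value] for value in sorted(counts)}

# Hypothetical data set (not taken from the text).
data = [7, 8, 8, 9, 9, 9, 10]

distribution = frequency_distribution(data)
print(distribution)  # {7: 1, 8: 2, 9: 3, 10: 1}

# One row of marks per value.
for value, count in distribution.items():
    print(f"{value:>2} | {'*' * count}")
```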