Box Plot

A box plot displays the distribution of a data set in a standardized way using the five-number summary: minimum, first quartile (Q1), median, third quartile (Q3), and maximum. It shows graphically how much the data values vary or spread out. Measures of central tendency alone do not convey this, which is where a box plot helps. It also takes up little space. See the image below.

A box plot is used to identify:

  • Outliers and their values
  • Symmetry of the data
  • How tightly the data is grouped
  • Skewness of the data: whether it is skewed and in which direction

Important Terms of Box Plots

  • Median – the middle value of the data (the line drawn inside the box)
  • First quartile – the middle value between the smallest value and the median (lower quartile, 25th percentile)
  • Third quartile – the middle value between the median and the largest value (upper quartile, 75th percentile)
  • Interquartile range (IQR) – the range from the 25th percentile to the 75th percentile, that is, Q3 minus Q1
  • Whiskers – the two lines extending from the box to the lowest and highest observations that are not outliers
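  • Outliers – observations that fall below the minimum or above the maximum as defined in the next two items; they are usually drawn as individual points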
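  • Maximum – Third quartile + 1.5 * (interquartile range); the upper limit, above which a value is treated as an outlier
  • Minimum – First quartile - 1.5 * (interquartile range); the lower limit, below which a value is treated as an outlier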
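
The terms above can be computed directly. The following is a minimal sketch, not a definitive implementation: the function name `box_plot_summary` and the sample values are made up for illustration, and the quartiles come from Python's `statistics.quantiles` with `method="inclusive"`. Other quartile conventions exist and can give slightly different numbers.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Return the quantities a box plot is drawn from (illustrative helper)."""
    data = sorted(values)
    # Assumption: "inclusive" quartiles; other conventions may differ slightly.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    minimum = q1 - 1.5 * iqr  # lower limit as defined in the list above
    maximum = q3 + 1.5 * iqr  # upper limit as defined in the list above
    inside = [x for x in data if minimum <= x <= maximum]
    outliers = [x for x in data if x < minimum or x > maximum]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "minimum": minimum,
        "maximum": maximum,
        # The whiskers stop at the most extreme observations inside the limits.
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Made-up sample with one unusually large value.
sample = [2, 4, 5, 7, 8, 9, 11, 12, 14, 40]
summary = box_plot_summary(sample)
print(summary)
```

For this sample, Q1 is 5.5, the median is 8.5 and Q3 is 11.75, so the interquartile range is 6.25 and the limits are -3.875 and 21.125. The value 40 lies above the upper limit and is reported as an outlier, while the whiskers end at 2 and 14.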

Box and Whisker Plot

A box and whisker plot is a method of summarizing a set of data measured on an interval scale. It is widely used in data analysis. We use this type of graph to see:

  • The shape of the distribution
  • Its central value
  • Its variability
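
These three properties can also be read off numerically from the quartiles. The sketch below is only one possible approach, under stated assumptions: the helper name `describe_shape` and both sample lists are hypothetical, the quartiles again use `statistics.quantiles` with `method="inclusive"`, and the shape is measured with the quartile (Bowley) skewness coefficient, (Q3 + Q1 - 2 * median) / (Q3 - Q1), which is a choice made here rather than something the box plot itself prescribes.

```python
from statistics import quantiles


def describe_shape(values):
    """Summarize center, spread and skew from the quartiles (illustrative helper)."""
    q1, median, q3 = quantiles(sorted(values), n=4, method="inclusive")
    iqr = q3 - q1
    # Quartile (Bowley) skewness: 0 when the median sits in the middle of the
    # box, positive when the upper half of the box is longer (right skew),
    # negative when the lower half is longer (left skew).
    skew = (q3 + q1 - 2 * median) / iqr if iqr else 0.0
    return {"center": median, "spread": iqr, "quartile_skew": skew}


# Made-up samples for illustration.
symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]
right_skewed = [1, 2, 2, 3, 3, 4, 6, 9, 15]

symmetric_shape = describe_shape(symmetric)
right_skewed_shape = describe_shape(right_skewed)
print(symmetric_shape)
print(right_skewed_shape)
```

The symmetric sample gives a skew of 0.0, and the second sample gives 0.5, which matches a box whose median line sits closer to the lower edge.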