The box plot is one of the important statistical tools that we use in Lean Six Sigma projects. This article covers the tips and tricks that will help you draw and analyse a box plot successfully. A box plot summarises a data set through its quartiles: the median, the first and third quartiles, the whiskers and any outliers. It can be applied to any set of continuous data to get a detailed insight into the distribution.
What is a Box plot?
The box plot is a simple graphical method that is widely used in statistics to show how a continuous variable is distributed and how that distribution relates to a second, attribute variable. Box plots take some concentration to understand in depth, as they involve statistical analysis.
- Box plots are one of the important Lean Six Sigma topics and need to be understood correctly before they can be applied in real-world applications.
- This article will explain box plots in-depth so that you can score high marks on the box plot topic in the exam.
- We will also learn how to make a box plot in MINITAB.
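- A box plot shows the median, the whiskers, any outliers, the 1st, 2nd and 3rd quartiles (with the maximum sometimes labelled as the 4th quartile) and the interquartile range (IQR), all of which can also be located on a normal distribution curve.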
- The following diagram shows the similarities between the box plot and the bell curve.
Notice how each quartile contains 25 % of the total data in the above diagram.
How to make a Box plot?
There are a total of 7 tips to complete box plots correctly. These tips were created by our experts, who have over 30 years of experience. Read the points below carefully to understand box plots completely.
- If input measures are represented as X factors and outputs as Y factors, then box plots are used when X is attribute data and Y is continuous data.
- A box plot is a type of chart normally used to observe and visually display the relationship between two different variables, one an attribute and one continuous.
- The values are represented by a box and whiskers, and their position along the vertical axis gives the value of the respective data points; box plots therefore use the Cartesian coordinate system to display the values in a data set.
- Since the box plot is made of a box and whiskers, it is sometimes called a box-and-whisker plot.
- A graphical representation of two variables drawn in this way is known as a box plot diagram. Box plots are also known as box diagrams, box graphs and box charts.
- Some common challenges arise with the use of box plot diagrams, such as interpreting correlation as causation, and over-plotting.
- The most important thing to remember about correlation is that it does not imply causation: an observed relationship doesn't mean that the changes in the X variable are responsible for the changes observed in the Y variable.
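The first tip (attribute X, continuous Y) can be sketched in Python as one five-number summary per category, which is what each box in a grouped box plot draws. This is only an illustrative sketch: the speed figures are made-up values, the names `speeds_by_status` and `five_number_summary` are hypothetical, and the quartiles use the standard library's `"exclusive"` method, which places quartiles at (n+1)-based positions. Other software may use a different quartile convention.

```python
from statistics import quantiles

# Hypothetical data: attribute X (arrival status) mapped to
# continuous Y (car speed in km/h). These values are invented.
speeds_by_status = {
    "late": [38, 42, 45, 47, 50, 52, 55],
    "not late": [48, 52, 55, 58, 60, 63, 66],
}

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for one group of Y values."""
    q1, q2, q3 = quantiles(values, n=4, method="exclusive")
    return min(values), q1, q2, q3, max(values)

# One summary per category of X: each tuple describes one box.
summaries = {
    status: five_number_summary(speeds)
    for status, speeds in speeds_by_status.items()
}

for status, summary in summaries.items():
    print(status, summary)
```

Comparing the two tuples side by side is the numeric equivalent of comparing two boxes on the same vertical axis. A difference between them shows a relationship, not a cause.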
Features of box plot
The following features of the box plot are helpful for studying the relationship between X factors (inputs) and Y factors (outputs). A box plot shows the median, the whiskers, any outliers, the quartiles and the interquartile range (IQR), all of which can also be located on a normal distribution curve. The diagram below places a box plot and a normal distribution curve side by side: the area highlighted in blue is the interquartile range between Q1 and Q3, which forms the box in the box plot.
Box plots can also be skewed
A box plot is often skewed. Skewness appears as an uneven spread of the data around the median line: when the data is concentrated on one side of the median, so that the median sits off-centre in the box or one whisker is much longer than the other, we say that the box plot is skewed. The diagram below shows skewed data.
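One way to put a number on this is the quartile (Bowley) skewness coefficient, (Q3 + Q1 - 2·Q2) / (Q3 - Q1), which measures how far the median sits from the centre of the box. The article itself does not use this measure; it is shown here only as a sketch, and the `right_skewed` values are made up for illustration.

```python
from statistics import quantiles

def quartile_skewness(values):
    """Bowley's quartile skewness: 0 when the median is centred in the box,
    positive when the upper half of the box is longer (right skew),
    negative when the lower half is longer (left skew).
    Assumes the IQR is not zero."""
    q1, q2, q3 = quantiles(values, n=4, method="exclusive")
    return (q3 + q1 - 2 * q2) / (q3 - q1)

symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
right_skewed = [1, 2, 2, 3, 3, 4, 5, 8, 13, 21]  # hypothetical values

print(quartile_skewness(symmetric))     # median centred in the box
print(quartile_skewness(right_skewed))  # median pushed towards Q1
```

The coefficient only looks at the box, so a long whisker or a few outliers on one side will not change it. The plot itself should still be inspected.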
Box plot examples:
Let us take a simple example of numbers 1 to 10.
The data set is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.
The median of the data is (5+6)/2 = 5.5, as there is an even number of terms. To plot the box plot in MINITAB, enter the data in column C1, then go to Graph > Box Plot > Simple > OK > select “C1” > OK.
We will get the following box plot with the data given below:
Q1 = 2.75, Q2 = 5.5, Q3 = 8.25, Q4 = 10 (the maximum value)
IQR = Q3 - Q1 = 8.25 - 2.75 = 5.5
Whiskers = 1, 10
Median = 5.5 (notice that the median is simply Q2)
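The same figures can be checked outside MINITAB. The sketch below uses Python's `statistics.quantiles` with the `"exclusive"` method, which places quartiles at (n+1)-based positions and reproduces the values above. The 1.5 × IQR rule for whiskers and outliers is an assumption added here (it is a common convention but is not stated in this article).

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Quartiles at (n+1)-based positions; Q2 is the median.
q1, q2, q3 = quantiles(data, n=4, method="exclusive")
iqr = q3 - q1

# Assumption: whiskers reach the most extreme points within 1.5 * IQR
# of the box, and anything beyond those fences is an outlier.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
inside = [x for x in data if lower_fence <= x <= upper_fence]
whiskers = (min(inside), max(inside))
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print("Q1, Q2, Q3:", q1, q2, q3)
print("IQR:", iqr)
print("Whiskers:", whiskers)
print("Outliers:", outliers)
```

With these ten values the fences fall at -5.5 and 16.5, so every point lies inside them, the whiskers run to 1 and 10, and no outliers are drawn.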
Example Exercise 1: We use a box plot when input is ___ and output is ___ data.
- Attribute, continuous.
- Continuous, continuous.
- Continuous, attribute.
- Attribute, attribute.
Answer: Option 1
Example Exercise 2: The interquartile range is the difference between which two parameters?
- 1st quartile and 2nd quartile
- 1st quartile and 3rd quartile
- 2nd quartile and 3rd quartile
- 1st quartile and 4th quartile
Answer: Option 2. The interquartile range is the difference between the 1st quartile and the 3rd quartile.
Example Exercise 3: Which of the following statements are correct?
- a. The median of the data is the same as Q2.
- b. Outliers are represented by the asterisk mark (also called “astronomical point”).
- c. Box plots can be used to correlate two continuous variables.
Options:
- a and b
- b and c
- c and a
- a, b, c all correct.
Answer: Option 1 (a and b). Statement c is incorrect because a box plot needs an attribute X and a continuous Y.
To summarise, the box plot is one of the important statistical tools that we use in Lean Six Sigma projects. It can show the relationship between attribute data and continuous data (for example, arriving late or not late at the office and the speed of the car). A box plot gives insight into statistics such as the median, the quartiles and the interquartile range, and it can be plotted in MINITAB in a few easy steps.