How to Winsorize Data in Excel

This guide will explain how to winsorize data in Excel.

Since Excel has several built-in functions and tools, we can easily perform many kinds of statistical and mathematical calculations, and winsorizing data is one of them.

When we winsorize data, we replace extreme outliers with the value at a specific percentile of the data.

We usually winsorize data when we want to keep the observations at the extreme ends but do not want to take their values too literally.

Usually, this entire process would take some time to perform manually. However, we can easily and quickly winsorize data in Excel using the built-in functions and tools. In this case, we will use the PERCENTILE.INC function and the IF function to winsorize data in Excel.

Let’s take a sample scenario wherein we need to winsorize data in Excel.

Suppose you have a list of numeric values sorted in ascending order. Since there are extreme values at each end of the data set, you want to winsorize the data so that you keep those observations while limiting their influence.

To do this, you can use the PERCENTILE.INC function to calculate the percentile cutoffs. Then, you can use the IF function to replace any value that falls beyond those cutoffs.

Before we move on to a real example of winsorizing data, let’s first learn how to write the functions we will use in Excel.

 

The Anatomy of the PERCENTILE.INC Function

The syntax or the way we write the PERCENTILE.INC function is as follows:

=PERCENTILE.INC(array, k)

Let’s take apart this formula and understand what each term means:

  • = the equal sign is how we begin any function in Excel.
  • PERCENTILE.INC() refers to our PERCENTILE.INC function. This function returns the k-th percentile of the values in the selected range, where k is between 0 and 1. The function is inclusive, meaning that k may be exactly 0 or 1, in which case the smallest or largest value in the range is returned.
  • array is a required argument. It refers to the array or range of data that defines relative standing.
  • k is another required argument. It refers to the percentile, written as a value from 0 to 1, inclusive. For example, 0.05 is the 5th percentile.
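To see what PERCENTILE.INC does behind the scenes, here is a minimal Python sketch of the same idea. It assumes the usual linear interpolation between the two closest ranks, and the name percentile_inc is our own choice for illustration, not an Excel or Python built-in.

```python
def percentile_inc(values, k):
    """Sketch of an inclusive percentile using linear interpolation.

    `values` plays the role of the array argument and `k` the
    percentile between 0 and 1, inclusive.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= k <= 1:
        raise ValueError("k must be between 0 and 1, inclusive")

    data = sorted(values)
    # Zero-based fractional position of the k-th percentile.
    rank = k * (len(data) - 1)
    lower = int(rank)          # index of the value at or below the position
    fraction = rank - lower    # how far we are toward the next value

    if lower + 1 >= len(data):
        # k is 1 (or rounds to the last position): return the largest value.
        return float(data[-1])
    return data[lower] + fraction * (data[lower + 1] - data[lower])


# The median of 1, 2, 3, 4 falls halfway between 2 and 3.
print(percentile_inc([1, 2, 3, 4], 0.5))  # 2.5
```

With k set to 0 the sketch returns the smallest value and with k set to 1 it returns the largest, which is what "inclusive" refers to.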

 

The Anatomy of the IF Function

The syntax or the way we write the IF function is as follows:

=IF(logical_test, [value_if_true], [value_if_false])

Let’s take apart this formula and understand what each term means:

  • = the equal sign is how we activate any function in Excel.
  • IF() is our IF function. And this is used to check whether the set condition is met. If TRUE, it will return one value. If FALSE, it will return another value.
  • logical_test is a required argument. So this refers to any value or expression that can be evaluated as TRUE or FALSE.
  • value_if_true is the value that will be returned if the logical test is TRUE. If this argument is left empty, 0 is returned.
  • value_if_false is an optional argument. It refers to the value that will be returned if the logical test is FALSE. If it is omitted, FALSE is returned. If it is left blank after a comma, 0 is returned.

Great! Now let’s move on to a real example of how to winsorize data in Excel.

 

A Real Example of How to Winsorize Data in Excel

Let’s say we have a data set containing a list of numeric values, some of which are extreme. So our initial data set would look like this:

Initial data set

 

Since we have extreme values in the data set, we want to winsorize the data to keep them in the observation. When we winsorize data, we replace extreme outliers with the value at a specific percentile of the data.

For instance, we will set all the values in the data set that are greater than the 95th percentile to be equal to the value at the 95th percentile. Then, we will set all the values that are less than the 5th percentile to be equal to the value at the 5th percentile. And this is what we call a 90% winsorization.
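As a rough cross-check outside of Excel, the 90% winsorization described above can be sketched in Python. The sample values are made up for illustration and are not the values from our spreadsheet, and the helper names percentile_inc and winsorize are our own. The percentile helper assumes linear interpolation between the closest ranks.

```python
def percentile_inc(values, k):
    """Inclusive percentile with linear interpolation (k from 0 to 1)."""
    data = sorted(values)
    rank = k * (len(data) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(data) - 1)
    return data[lower] + (rank - lower) * (data[upper] - data[lower])


def winsorize(values, lower_k=0.05, upper_k=0.95):
    """Pull values below/above the two percentiles back to those percentiles."""
    low = percentile_inc(values, lower_k)
    high = percentile_inc(values, upper_k)
    # Values between the cutoffs are left unchanged.
    return [min(max(v, low), high) for v in values]


# Hypothetical data: seven values with one extreme value at each end.
sample = [2, 14, 15, 16, 18, 19, 60]
result = winsorize(sample)
print(result)  # first value becomes roughly 5.6, last roughly 47.7
```

Only the smallest and largest values change here. The five values in the middle already sit between the two cutoffs, so they pass through as they are.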

To do this in Excel, we will be utilizing the PERCENTILE.INC function and the IF function. So the PERCENTILE.INC function will be used to find the 95th percentile and the 5th percentile. Afterward, we will use the IF function to winsorize the data using the returned value from the PERCENTILE.INC function.

Next, we can simply copy and paste the formula to apply it to the other cells and return the winsorized data value. Furthermore, we can perform 80% winsorization, 75% winsorization, 88% winsorization, etc. And we can do this by simply calculating different values of the upper and lower percentiles. 
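To find the upper and lower percentiles for a different level, we can split the leftover share evenly between the two tails. The short Python sketch below shows the arithmetic. It assumes a symmetric split, as in the 90% example, and percentile_bounds is a name we made up for illustration.

```python
def percentile_bounds(level):
    """Return (lower_k, upper_k) for a symmetric winsorization level.

    `level` is the share of data left untouched, e.g. 0.90 for 90%.
    """
    if not 0 < level <= 1:
        raise ValueError("level must be greater than 0 and at most 1")
    tail = (1 - level) / 2  # share trimmed back on each side
    return tail, 1 - tail


for level in (0.90, 0.80, 0.75, 0.88):
    lower_k, upper_k = percentile_bounds(level)
    print(f"{level:.0%} winsorization: k = {lower_k:.3f} and {upper_k:.3f}")
```

So an 80% winsorization would use 0.1 and 0.9 as the k values in PERCENTILE.INC, while a 75% winsorization would use 0.125 and 0.875.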

So our final data set would look like this:

Final data set

 

You can make your own copy of the spreadsheet above using the link attached below. 

Amazing! Now we can dive into the steps of how to winsorize data in Excel. 

 

How to Winsorize Data in Excel

In this section, we will explain the step-by-step process of how to winsorize data in Excel. Furthermore, each step will contain detailed instructions and pictures to guide you along the way.

1. Firstly, we need to calculate the upper and lower percentiles using the PERCENTILE.INC function. In this case, we want to perform a 90% winsorization. So we will be calculating the 95th percentile and the 5th percentile. Let’s first calculate the 5th percentile.

To do this, we will type in the formula “=PERCENTILE.INC(B3:B9, 0.05)”. Then, we will press the Enter key to return the result.


 

2. Secondly, we will calculate the 95th percentile, which is the upper percentile of the data set. To do this, we will input the formula “=PERCENTILE.INC(B3:B9, 0.95)”. Then, we will press the Enter key to return the result.


 

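3. Thirdly, we will now winsorize the data using a formula with the IF function. So we will type in the formula “=IF(B3<$E$2, $E$2, IF(B3>$E$3,$E$3,B3))”. In this formula, cell E2 holds the 5th percentile and cell E3 holds the 95th percentile, and the dollar signs keep both references fixed when the formula is copied to other cells. Next, we will press the Enter key to return the result.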


 

4. Afterward, we will drag down the Fill Handle tool to copy and apply it to the rest of the cells.

Drag down formula

 

5. And tada! We have successfully winsorized the data in Excel.

Final data set
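If the nested IF formula from step 3 feels hard to read, it may help to see the same checks written out one at a time. The Python sketch below mirrors the order in which the formula evaluates its conditions. The function name winsorize_cell and the cutoff values are placeholders for illustration, not values from our spreadsheet.

```python
def winsorize_cell(value, low, high):
    """Mirror of =IF(value<low, low, IF(value>high, high, value))."""
    if value < low:       # outer IF: below the lower percentile
        return low
    if value > high:      # inner IF: above the upper percentile
        return high
    return value          # otherwise keep the original value


# Hypothetical cutoffs standing in for the 5th and 95th percentiles.
low, high = 5.6, 47.7
print(winsorize_cell(2, low, high))   # 5.6
print(winsorize_cell(18, low, high))  # 18
print(winsorize_cell(60, low, high))  # 47.7
```

The same result can be reached by taking the larger of the value and the lower cutoff, then the smaller of that and the upper cutoff, which is why this step is sometimes called clamping.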

 

And that’s pretty much it! We have explained how to winsorize data in Excel. Now you can apply this learning and use it in your own work whenever you need to winsorize data in Excel.

 Are you interested in learning more about what Excel can do? You can now use the PERCENTILE.INC function and the various other Microsoft Excel formulas available to create great worksheets that work for you. Make sure to subscribe to our newsletter to be the first to know about the latest guides and tutorials from us.
