This guide will explain how you can use regular expressions in Excel to remove whitespace and empty lines from text values in your spreadsheet.
Since Excel does not have any built-in regex support, we’ll be adding custom VBA code to our worksheet to create a RegExpReplace function.
Regular expressions are sequences of characters that allow programs to perform advanced pattern matching. For example, users can use the following expression to find URLs:
We can also use regular expressions to find specific characters and replace them with other characters. If we replace a character with an empty string, this effectively removes this character.
Users who want to remove whitespace and empty lines in a string can use regular expressions.
Let’s take a look at a quick example of a scenario where we can use regular expressions to remove whitespace.
Suppose you have a database that allows you to give each entry a tag or comment. For consistency with existing data, you would like to remove all whitespace from the user-inputted tags. For example, the tag ‘To Be Paid’ must be converted to ‘ToBePaid’.
We can use a simple regular expression to find all whitespace characters. The main challenge is that Excel does not have any built-in functions that support regular expressions. However, there is a workaround that involves adding custom code.
Excel supports adding custom VBA code to define new functions. VBA stands for Visual Basic for Applications. Excel VBA is the programming language of Excel and includes a regular expression object. Custom code can use the
VBScript.RegExp object to perform operations such as matching and replacing.
Now that we know when to use regular expressions to remove whitespace characters, head over to the next section to see how the expressions
A Real Example of Using Regex to Remove Whitespace in Excel
Let’s take a look at a real example of the custom
RegExpReplace function being used in an Excel spreadsheet.
In the example below, we have a list of strings in column A that contain whitespace characters and line breaks. We use the
RegExpReplace function in column B to remove all whitespace characters. Line breaks found in the two-line value in cell A4 are converted into a single line.
To get the values in Column C, we just need to use the following formula:
=RegExpReplace(A2, "\s+", " ")
The first argument of the function refers to the original text. We then input the regular expression to use for pattern matching. In this example, the ‘\s+’ regex string refers to one or more whitespace characters.
You can make your own copy of the spreadsheet above using the link attached below.
If you’re ready to try out the
RegExpReplace function in Excel, let’s begin writing it ourselves!
How to Use Regex to Remove Whitespace and Empty Lines in Excel
This section will explain each step needed to start using a custom regex function to remove whitespaces and empty lines in Excel. You’ll learn how we can add our own VBA function to our Excel workbook.
Follow these steps to start using the
- First, we’ll have to enter the Visual Basic Editor. You can open the VBE by clicking on the Visual Basic icon in the Developer tab. Alternatively, you can use the shortcut Alt + F11.
- Next, right-click on your workbook name in the project pane on the left and select Insert -> Module from the pop-up menu.
- Visual Basic will now open a new pane where you can add custom code. Paste the
RegExpReplacecode into the open pane in the VBA editor.
- Next, save your worksheet as an ‘Excel Macro-Enabled Workbook”. This option is found under the ‘Save as type’ drop-down list. Click the Save button.
- Now that you’ve defined the
RegExpReplacefunction, you can now use it in your Excel spreadsheet. Select cell B2, type the function into the formula bar, and provide the necessary arguments. In this example, we’ve added arguments that will replace all whitespace characters with an empty string.
- You can now drag down the formula in cell B2 to fill out the rest of the column.
- If you would like to replace whitespace characters with another character, such as a hyphen or underscore, simply change the third argument of
- Different regular expressions can be used for slightly different effects. For example, we can choose to specify only leading or trailing whitespace characters.
- In the example below, we’ve used a regular expression that removes only duplicate whitespace characters. For example, a string with five consecutive spaces will be converted to a string without the last four spaces.
Frequently Asked Questions (FAQ)
- What counts as a whitespace character?
The ‘\s’ character refers to several types of whitespace characters. This includes spaces, tabs, new line characters (\n), and the carriage return (\r).
- Can I run VBA-defined functions in Excel online?
Excel for the web does not support creating, running, or editing VBA macros for the web.
- What counts as a whitespace character?
That’s all you need to remember to start using a regular expression function to replace whitespace and empty lines in Excel. This step-by-step guide shows how you can easily add your own custom VBA code to your worksheet to use regex pattern matching.
Regular expression support is just one example of a feature in Excel that is used to manipulate text. With so many other Excel functions out there, you can surely find one that best suits your worksheet.
Are you interested in learning more about what Excel can do? Stay notified of new guides like this by subscribing to our newsletter!