This guide will explain how to use the Python in Excel feature to match strings with regular expressions.
A regular expression, often shortened to “regex”, is a sequence of characters that defines a search pattern. This pattern can be used to match strings or to find and replace substrings within larger strings.
For example, you can use regex patterns to check whether an email address follows a valid format or whether a given text is a valid date.
Python provides support for working with regular expressions through the re module. This library allows you to compile and use regex patterns to perform various operations on strings.
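As a rough illustration of what the re module offers, the sketch below compiles a pattern once and reuses it to search and to replace. The date pattern and the sample string are made up for this example and are not part of the workbook used later in this guide.

```python
import re

# Hypothetical pattern: a date written as YYYY-MM-DD.
date_pattern = re.compile(r"\d{4}-\d{2}-\d{2}")

# Hypothetical sample text.
sample = "Invoice issued 2024-01-15, due 2024-02-14."

# search() returns the first match object, or None if nothing matches.
first = date_pattern.search(sample)
first_date = first.group() if first else None

# sub() replaces every match with the given replacement text.
redacted = date_pattern.sub("<date>", sample)
```

Compiling the pattern is optional for one-off checks, but it keeps the pattern in one place when the same expression is applied several times.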
With the new Python in Excel feature, we can now take advantage of Python’s re module and use it on Excel data.
In this guide, we will walk through, step by step, how to match and extract text with regular expressions using Python in Excel.
The Anatomy of the PY Function
The syntax of the PY function is as follows:
=PY(python_code,return_type)
Let’s look at each argument to understand how to use the PY function.
- PY() refers to our PY function. This function allows us to run Python code on a secure Microsoft Cloud runtime.
- python_code refers to the Python code you wish to run. This value must be static text.
- The return_type parameter allows us to specify the desired return type. A value of 0 indicates an Excel value, while a value of 1 indicates a Python object.
- Do note that the PY function cannot be used with any other Excel functions.
A Real Example of Using Python in Excel for Regex
Let’s explore a few simple examples where we can use Python in Excel to match and find strings based on specific regex patterns.
Using the match() Function
The re.match() function in Python checks whether a regex pattern matches starting at the beginning of a given string. It returns a match object if it does and None otherwise. In this section, we’ll create a Python script that validates email addresses using this function.

In the table above, we have a column of email addresses. We want to return either TRUE or FALSE in column B, depending on whether the corresponding email address follows the right pattern.
We can use the following Python code to determine the validity of each email:
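import re
def is_valid_email(email):
    # Pattern: local part, "@", domain name, ".", domain suffix
    pattern = r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$"
    return bool(re.match(pattern, str(email)))
# Load the email column, including its header row, into a DataFrame
df = xl("A1:A11", headers=True)
df['is_valid_email'] = df['Email Address'].apply(is_valid_email)
# The last expression is what the cell returns
df['is_valid_email'].values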
The script above defines a custom function named is_valid_email(). This function accepts an email address string and returns True or False depending on the result of the re.match() function.
The script then creates a DataFrame object containing our original list of email addresses. We create a new field in our DataFrame labeled is_valid_email and use that field to store the result of our is_valid_email() function. The script then returns the values of that second column.

After running the Python script, we’re able to determine that two of the provided emails are invalid.
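To see how the validation behaves on individual values, here is a minimal sketch that can be run outside Excel. It reuses the email pattern from the script above but replaces the xl() call with a plain Python list. The sample addresses are hypothetical and are not the ones in the workbook.

```python
import re

# Email pattern taken from the script above.
EMAIL_PATTERN = r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$"

def is_valid_email(email):
    # str() guards against non-text cell values such as numbers or None.
    return bool(re.match(EMAIL_PATTERN, str(email)))

# Hypothetical sample values standing in for column A.
samples = [
    "jane.doe@example.com",         # well-formed address
    "missing-at-sign.example.com",  # no "@" character
    "user@domain",                  # no dot after the domain name
    None,                           # empty cell
]

results = [is_valid_email(value) for value in samples]
```

Only the first sample passes. Keep in mind that a pattern like this checks the shape of an address, not whether the mailbox actually exists.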
Using the findall() Function
Besides matching text to a regex pattern, we can also output all matches of a regex pattern as a list. We can use the re.findall() function to return all the non-overlapping matches of a specific regex pattern.
For example, suppose you have a string containing one or more URLs. If we define a regex pattern that filters for only valid URLs, we can use the re.findall() function to find all URLs in a given string.
We’ll use the following Python script to extract all URLs from cell A2:
import re
def extract_urls(text):
    # Match http:// or https:// followed by a run of characters allowed in a URL
    pattern = r"http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\\(\\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+"
    return re.findall(pattern, text)
extract_urls(xl("A2"))

After running the code, we were able to extract two distinct URLs from a sample text.
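The following sketch shows the same idea outside Excel, assuming a hypothetical sample sentence in place of the contents of cell A2. It reuses the URL pattern from the script above.

```python
import re

# URL pattern taken from the script above.
URL_PATTERN = r"http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\\(\\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+"

def extract_urls(text):
    # findall() returns every non-overlapping match as a list of strings.
    return re.findall(URL_PATTERN, str(text))

# Hypothetical text standing in for cell A2.
sample = "Docs at https://example.com/docs and mirror at http://example.org/mirror today"

urls = extract_urls(sample)
no_urls = extract_urls("No links in this sentence")
```

Because the pattern accepts characters such as periods and commas, a URL that is immediately followed by punctuation will include that punctuation in the match. When no URL is present, re.findall() returns an empty list rather than raising an error.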
Head to the next section to read our step-by-step tutorial on how to use Python RegEx in Excel.
How to Use Python RegEx in Excel
- Use the keyboard shortcut Ctrl+Shift+Alt+P to insert Python code in cell B2.

- We’ll start the script by importing the re module.

- Next, we’ll specify the regex pattern to use, then use the re.match() function to see if the value in cell A2 matches that pattern.

- Hit Ctrl+Enter to run the Python code.
If the email address follows the regex pattern, the script will return TRUE.
Otherwise, the Python code will return FALSE.
- If you want to use the re.match() function on an entire dataset, we recommend using a custom Python function.
In the code above, we set up a function called is_valid_email that accepts an email string.
- Next, use the xl() Python function to load our Excel data into a DataFrame object.

- Once we have a DataFrame object, we’ll create a new column called is_valid_email. We’ll use the apply() function to generate our values for is_valid_email based on the existing column “Email Address”.
The apply() function is called on an existing column and takes the function you want to run on each value as its argument.
- To return Excel values, click on the Python Output button next to the formula bar and select Excel Value.
The PY formula should now return a range of TRUE and FALSE values.
These are all the steps you need to learn to start using Python regex patterns in Excel.
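For reference, the single-cell check from the first steps might look like the sketch below. It is written so that it also runs outside Excel: the variable cell_value is a hypothetical stand-in for xl("A2"), and the pattern is the email pattern used earlier in this guide.

```python
import re

# Email pattern used earlier in this guide.
pattern = r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$"

# Hypothetical stand-in for the value of cell A2.
# Inside a PY cell, this line would read: cell_value = xl("A2")
cell_value = "jane.doe@example.com"

# re.match() returns a match object or None; bool() turns that into True or False.
is_match = bool(re.match(pattern, str(cell_value)))
```

In a PY cell, ending the script with the bare expression is_match should make that value the cell’s result.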
If you want to learn more about Python in Excel, you can read our post on how to use the scikit-learn library in Excel. You may also be interested in our guide on removing empty lines using Regex in Excel.
That’s all for this guide! Be sure to check out our library of spreadsheet resources, tips, and tricks!