Please Note: This article is written for users of the following Microsoft Excel versions: 97, 2000, 2002, and 2003. If you are using a later version (Excel 2007 or later), this tip may not work for you. A version of this tip written specifically for later versions of Excel is available under the same title: Extracting a Pattern from within Text.

Extracting a Pattern from within Text

by Allen Wyatt
(last updated June 2, 2015)

Tom has a worksheet that contains about 20,000 cells full of textual data. From within these cells he needs to extract a specific pattern of text. The pattern is ##-##### where each # is a digit. This pattern does not appear at a set place in each cell. Tom wonders if there is a way to extract the desired information.

There are several ways that you can approach this problem, and the correct solution for your needs will depend on the characteristics of the data with which you are working. If you know that the only place in your data that you will have a dash is within your pattern, then you can key off of the presence of the dash by using a formula such as the following:

=MID(A1,FIND("-",A1)-2,8)

This finds the first dash and then grabs the eight characters beginning two characters to the left of it. The formula returns a #VALUE! error if the cell contains no dash or if the dash is one of the first two characters. It also will not work if there are dashes in other places in the text or if it is possible to have "patterns" that include non-digits (such as 12-34B32) and you want those excluded. In that case you'll need a much more complex formula:

=IF(ISERROR(INT(MID(A1, FIND("-", A1, 1)-2, 2)) & INT(MID(
A1, FIND("-", A1, 1)+1, 5))), "", MID(A1, FIND("-", A1)-2, 8))

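This includes an error-checking component that tests whether the two characters just before the dash and the five characters just after it can be converted to numbers. If either conversion fails, the formula returns an empty string. Note that it still examines only the first dash in the cell.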
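If the nested formula becomes hard to maintain, one possible alternative is a short user-defined function. The sketch below is not part of the original tip: the name FirstCode is hypothetical, and it assumes you only want the first match in the text. It slides an eight-character window across the text and tests each window with the Like operator, so dashes elsewhere in the cell do not interfere.

```vba
' Hypothetical helper (not part of the original tip): returns the first
' ##-##### code found in sText, or an empty string when there is none.
Function FirstCode(ByVal sText As String) As String
    Dim i As Long

    FirstCode = ""
    ' Slide an eight-character window across the text. The loop body
    ' never runs when the text is shorter than eight characters.
    For i = 1 To Len(sText) - 7
        If Mid(sText, i, 8) Like "##-#####" Then
            FirstCode = Mid(sText, i, 8)
            Exit Function
        End If
    Next i
End Function
```

Placed in a standard module, the function could then be called from a worksheet as =FirstCode(A1). Like the formulas above, it returns only one result per cell.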

The one thing that these formula-based approaches don't do is handle situations where the pattern occurs more than once within the same cell. In that case, a macro is the best approach. The following macro extracts the valid patterns and places them in a new worksheet called "Results", replacing any existing worksheet of that name.

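Sub ExtractPattern()
    Dim SourceSheet As Worksheet
    Dim TargetSheet As Worksheet
    Dim c As Range
    Dim sRaw As String
    Dim sTemp As String
    Dim iPos As Long
    Dim iTargetRow As Long

    Set SourceSheet = ActiveSheet

    ' Delete any existing "Results" sheet without a confirmation prompt.
    ' Errors are ignored only here, in case the sheet does not exist.
    On Error Resume Next
    Application.DisplayAlerts = False
    Worksheets("Results").Delete
    Application.DisplayAlerts = True
    On Error GoTo 0

    ' Create a fresh "Results" sheet with a bold heading
    Set TargetSheet = Worksheets.Add
    TargetSheet.Name = "Results"
    TargetSheet.Cells(1, 1).Value = "Found Codes"
    TargetSheet.Cells(1, 1).Font.Bold = True
    iTargetRow = 2

    ' Examine every cell in the used range of the source sheet
    For Each c In SourceSheet.UsedRange.Cells
        sRaw = ""
        If Not IsError(c.Value) Then sRaw = c.Value
        If sRaw Like "*##-#####*" Then
            iPos = InStr(sRaw, "-")
            Do While iPos > 0
                ' Pad the text so two characters always precede the dash
                If iPos < 3 Then
                    sRaw = "  " & sRaw
                    iPos = iPos + 2
                End If
                ' Candidate: two characters before the dash through five after
                sTemp = Mid(sRaw, iPos - 2, 8)
                sRaw = Mid(sRaw, iPos + 6)
                If sTemp Like "##-#####" Then
                    TargetSheet.Cells(iTargetRow, 1).Value = sTemp
                    iTargetRow = iTargetRow + 1
                Else
                    ' No match: put back the five characters after the dash
                    ' so that a later dash among them is still examined
                    sRaw = Mid(sTemp, 4, 5) & sRaw
                End If
                iPos = InStr(sRaw, "-")
            Loop
        End If
    Next c
End Sub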

Note that the macro uses the Like operator in two places. The first instance determines whether the pattern occurs anywhere in the cell, and the second determines whether the extracted characters exactly match the desired pattern.
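The same idea can be adapted to other fixed-length patterns. The following is a minimal sketch under stated assumptions rather than part of the original macro: the names ExtractAll and DemoExtractAll are hypothetical, the caller must supply both the Like pattern and the number of characters it matches, and matches are assumed not to overlap. Instead of keying off the dash, it tests every window of that length against the pattern.

```vba
' Hypothetical generalization (names are illustrative): collects every
' non-overlapping run of nLen characters in sText that matches sPattern.
Function ExtractAll(ByVal sText As String, ByVal sPattern As String, _
                    ByVal nLen As Long) As Collection
    Dim colFound As New Collection
    Dim i As Long

    Set ExtractAll = colFound
    If nLen < 1 Then Exit Function   ' guard against an endless loop

    i = 1
    Do While i <= Len(sText) - nLen + 1
        If Mid(sText, i, nLen) Like sPattern Then
            colFound.Add Mid(sText, i, nLen)
            i = i + nLen             ' skip past the match
        Else
            i = i + 1
        End If
    Loop
End Function

' Example use: list six-digit numbers that start with 3
Sub DemoExtractAll()
    Dim v As Variant
    For Each v In ExtractAll("id 312345, ref 398765", "3#####", 6)
        Debug.Print v
    Next v
End Sub
```

A pattern such as "[A-Za-z][A-Za-z]-##-#####" with a length of 11 would match codes made of two letters followed by digit groups. As with the macro above, the sketch does not check the characters surrounding a match, so a longer run of digits can still yield a match from its middle.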

This tip (7348) applies to Microsoft Excel 97, 2000, 2002, and 2003.

Comments

2017-01-12 01:13:02

Gaurav Gupta

How can I extract a word, e.g. "eye", written anywhere in a cell in Excel with the help of a formula?


2016-03-02 10:04:18

Chris

Love the pattern match, and am hoping to use the macro version as my data field has multiple instances of the pattern key (in my case a decimal point).

My question...
Is there a way to use the pattern match macro with a variable length pattern? For example my pattern is
XXX.XXXXXXXXXXXX
or
XXX.XXXXXXX
or
XXX.XXXXXXXXXX

Other similar patterns occur in the cell and need to be excluded such as:
auto.xyz.df3
or
111.222.333.444


2015-12-21 17:52:59

help

I'm working on a similar problem where I'm looking to extract a pattern from cells containing large strings. The pattern is AA-##-##### (where the A's represent alphas and the #'s represent numbers). How could I alter this macro to accomplish this?


2015-09-03 14:07:05

Steve C.

Excellent instructions. Thanks very, very much. You are a lifesaver.


2013-04-17 11:36:39

Vitold

Wow!

That's pretty cool! Thanks for that.

What I'm trying to achieve is to extract patterns which will look like this: 3##### (it is a number always starting with digit 3 and having 6 digits altogether). There could be a number of such numbers in one cell (from none to 9 maximum).

I'll play with your macro (thanks again for posting this), but as I'm not a macro guru, I'd be very grateful if you could change the macro to reflect my pattern.

cheers

Vitold

