# Extracting a Pattern from within Text

This article is written for users of the following Microsoft Excel versions: 97, 2000, 2002, and 2003.

Tom has a worksheet that contains about 20,000 cells full of textual data. From within these cells he needs to extract a specific pattern of text. The pattern is ##-##### where each # is a digit. This pattern does not appear at a set place in each cell. Tom wonders if there is a way to extract the desired information.

There are several ways to approach this problem, and the best solution depends on the characteristics of your data. If you know that the only dash in each cell is the one within your pattern, then you can key off the dash by using a formula such as the following:

```
=MID(A1,FIND("-",A1)-2,8)
```

This finds the dash and then grabs the eight characters beginning two characters to the left of the dash. For instance, if A1 contains "Order 12-34567 shipped", the dash is at position 9, so the formula returns the eight characters starting at position 7: 12-34567. The formula returns the wrong text if dashes appear elsewhere in the cell or if it is possible to have "patterns" that include non-digits (such as 12-34B32). If you want such cases excluded, you'll need a more complex formula:

```
=IF(ISERROR(INT(MID(A1, FIND("-", A1, 1)-2, 2)) &
INT(MID(A1, FIND("-", A1, 1)+1, 5))), "", MID(A1, FIND("-", A1)-2, 8))
```

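This includes an error-checking component that determines whether the two characters just before the dash and the five characters just after it contain anything other than digits. If they do, then an empty string is returned. Note that the formula still examines only the first dash in the cell.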
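
If the first dash in a cell is not always the one inside the pattern, a short user-defined function can test every position in the text instead. The following is a minimal sketch and not part of the original tip; the function name FirstCode is hypothetical. It returns the first ##-##### match it finds, or an empty string if there is none.

```vba
' Hypothetical helper (an illustration, not from the original tip):
' returns the first ##-##### match in sText, or "" when none exists.
Function FirstCode(ByVal sText As String) As String
    Dim i As Long

    ' Test every 8-character window, left to right
    For i = 1 To Len(sText) - 7
        If Mid(sText, i, 8) Like "##-#####" Then
            FirstCode = Mid(sText, i, 8)
            Exit Function
        End If
    Next i

    ' No window matched
    FirstCode = ""
End Function
```

Placed in a standard module, the function could be called from a cell as =FirstCode(A1). Like the formulas, it reports only one match per cell.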

What these formula-based approaches cannot do is handle cells that contain more than one occurrence of the pattern. In that case, a macro is the best approach. The following macro extracts every valid pattern and places the results in a new worksheet called "Results".

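```
Sub ExtractPattern()
    Dim SourceSheet As Worksheet, TargetSheet As Worksheet
    Dim c As Range
    Dim sRaw As String, sTemp As String
    Dim iPos As Long, iTargetRow As Long

    Set SourceSheet = ActiveSheet

    ' Remove any existing "Results" sheet; the lookup fails if none exists
    On Error Resume Next
    Set TargetSheet = ActiveWorkbook.Sheets("Results")
    If Err = 0 Then
        Application.DisplayAlerts = False   ' skip the delete confirmation
        Worksheets("Results").Delete
        Application.DisplayAlerts = True
    End If
    On Error GoTo 0

    ' Create the new "Results" sheet and its heading
    Worksheets.Add
    ActiveSheet.Name = "Results"
    Set TargetSheet = ActiveSheet
    Cells(1, 1).Value = "Found Codes"
    Cells(1, 1).Font.Bold = True
    iTargetRow = 2

    ' Select A1 through the last used cell of the source sheet
    SourceSheet.Select
    Selection.SpecialCells(xlCellTypeLastCell).Select
    Range(Selection, Cells(1)).Select

    For Each c In Selection.Cells
        If Not IsError(c.Value) Then    ' error cells cannot be compared
            ' Only examine cells that contain the pattern somewhere
            If c.Value Like "*##-#####*" Then
                sRaw = c.Value
                iPos = InStr(sRaw, "-")
                Do While iPos > 0
                    ' Pad the text so Mid never starts before position 1
                    If iPos < 3 Then
                        sRaw = "  " & sRaw
                        iPos = iPos + 2
                    End If
                    ' Candidate: two characters before the dash through five after
                    sTemp = Mid(sRaw, iPos - 2, 8)
                    sRaw = Mid(sRaw, iPos + 6, Len(sRaw))
                    If sTemp Like "##-#####" Then
                        TargetSheet.Cells(iTargetRow, 1) = sTemp
                        iTargetRow = iTargetRow + 1
                    Else
                        ' Not a match: put back the text after the dash
                        sRaw = Mid(sTemp, 4, 5) & sRaw
                    End If
                    iPos = InStr(sRaw, "-")
                Loop
            End If
        End If
    Next c
End Sub
```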

Note that the macro uses the Like operator in two places. The first instance determines whether the pattern occurs anywhere in the cell, and the second determines whether the extracted characters exactly match the desired pattern.
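
The difference between the two comparisons can be seen in a small demonstration. This is a sketch for illustration only; the procedure name LikeDemo and the sample strings are made up. In a Like pattern, # matches any single digit and * matches any run of characters, including none.

```vba
' Hypothetical demonstration of the two Like patterns used in the macro.
' Results appear in the Immediate window of the VBA editor.
Sub LikeDemo()
    ' Pattern wrapped in asterisks: matches anywhere in the text
    Debug.Print "Order 12-34567 shipped" Like "*##-#####*"   ' True

    ' Pattern without asterisks: the whole string must match
    Debug.Print "12-34567" Like "##-#####"                   ' True
    Debug.Print "12-34B67" Like "##-#####"                   ' False
    Debug.Print "Order 12-34567" Like "##-#####"             ' False
End Sub
```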

Gaurav Gupta    12 Jan 2017, 01:13
How can I extract a word, e.g. "eye", written anywhere in an Excel cell with the help of a formula?

Chris    02 Mar 2016, 10:04
Love the pattern match, and am hoping to use the macro version as my data field has multiple instances of the pattern key (in my case a decimal point).

My question...
Is there a way to use the pattern match macro with a variable length pattern? For example my pattern is
XXX.XXXXXXXXXXXX
or
XXX.XXXXXXX
or
XXX.XXXXXXXXXX

Other similar patterns occur in the cell and need to be excluded such as:
auto.xyz.df3
or
111.222.333.444

help    21 Dec 2015, 17:52
I'm working on a similar problem where I'm looking to extract a pattern from cells containing large strings. The pattern is AA-##-##### (where the A's represent letters and the #'s represent digits). How could I alter this macro to accomplish this?

Steve C.    03 Sep 2015, 14:07
Excellent instructions. Thanks very, very much. You are a lifesaver.

Vitold    17 Apr 2013, 11:36
Wow!

That's pretty cool! Thanks for that.

What I'm trying to achieve is to extract patterns which look like this: 3##### (it is a number always starting with the digit 3 and having 6 digits altogether). There could be several such numbers in one cell (from none to 9 maximum).

I'll play with your macro (thanks again for posting this), but as I'm not a macro guru, I'd be very grateful if you could change the macro to reflect my pattern.

cheers

Vitold

