Quote:
Originally Posted by gmaxey
Batmat1,
Specifically the last instance ", John smith," is returned as a match. How could we prevent that?
|
Code:
Sub ScratchMacro()
    Dim RegEx As Object, Matches As Object, Match As Object
    Set RegEx = CreateObject("VBScript.RegExp")
    With RegEx
        .Global = True
        .Pattern = ", [A-Z][^, ]+(|( [^, ]+)* [A-Z][^,]+),"
    End With
    Set Matches = RegEx.Execute(ActiveDocument.Range.Text)
    For Each Match In Matches
        Debug.Print Match.Value
    Next
End Sub
The author of the thread did not really give any criteria for the input data. If we have requirements for the form of the results, we must also specify the form of the input data. If the input can be anything, we have to list every character that is allowed between the commas. Note that the code above also returns ", Beate 123-van4 Ackeren," as a match, and that is not a first name and surname, right?
The code below accepts only the characters listed in the Const characters. Tested with the data shown in the picture.
Code:
Sub ScratchMacro()
    ' Character class listing every character allowed inside a name
    Const characters As String = "[A-Za-zü\-]"
    Dim RegEx As Object, Matches As Object, Match As Object
    Set RegEx = CreateObject("VBScript.RegExp")
    With RegEx
        .Global = True
        ' Same structure as the previous pattern, but built from the allowed characters only
        .Pattern = ", [A-Z]" & characters & "+(|( " & characters & "+)* [A-Z]" & characters & "+),"
    End With
    Set Matches = RegEx.Execute(ActiveDocument.Range.Text)
    For Each Match In Matches
        Debug.Print Match.Value
    Next
End Sub
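To see the difference between the two patterns without opening a document, here is a minimal sketch that runs both against a hard-coded string. The sample string and the name "Anna Meyer" are made up for illustration; only ", John smith," and ", Beate 123-van4 Ackeren," come from this thread. The Sub name ComparePatterns is also just a placeholder.

```vba
Sub ComparePatterns()
    ' Hypothetical sample text (not the thread author's data)
    Const sample As String = "x, Anna Meyer, y, John smith, z, Beate 123-van4 Ackeren, end"
    Const characters As String = "[A-Za-zü\-]"
    Dim RegEx As Object, Match As Object
    Set RegEx = CreateObject("VBScript.RegExp")
    RegEx.Global = True

    ' First pattern: any character except comma and space is accepted in a name
    RegEx.Pattern = ", [A-Z][^, ]+(|( [^, ]+)* [A-Z][^,]+),"
    Debug.Print "Any characters:"
    For Each Match In RegEx.Execute(sample)
        Debug.Print "  " & Match.Value
    Next

    ' Second pattern: only the characters from the Const are accepted
    RegEx.Pattern = ", [A-Z]" & characters & "+(|( " & characters & "+)* [A-Z]" & characters & "+),"
    Debug.Print "Listed characters only:"
    For Each Match In RegEx.Execute(sample)
        Debug.Print "  " & Match.Value
    Next
End Sub
```

With this sample, the first pattern should print ", Anna Meyer," and ", Beate 123-van4 Ackeren," while the second should print only ", Anna Meyer,". Neither returns ", John smith," because the last word must start with a capital letter.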
Quote:
If you don't mind, can you explain what each part of your pattern is intended to perform?
For others following, with mine it is
1. "," match a comma
2. "\s" match a space
3. "[A-Z]" match a capital letter A to Z
4. "[^,]*" match any characters excluding a comma zero or more times
5. "," match a comma
|
2. "\s" matches any whitespace character, not only a space: space, TAB, form-feed, .... It is equivalent to "[ \f\n\r\t\v]"
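A small sketch to check this, assuming the quoted pattern put together from the listed parts is ",\s[A-Z][^,]*,". The test strings are made up for illustration, and WhitespaceClassDemo is just a placeholder name.

```vba
Sub WhitespaceClassDemo()
    Dim RegEx As Object
    Dim tabbed As String
    Set RegEx = CreateObject("VBScript.RegExp")

    ' Hypothetical input: a TAB instead of a space after the first comma
    tabbed = "," & vbTab & "John Smith,"

    RegEx.Pattern = ",\s[A-Z][^,]*,"
    Debug.Print RegEx.Test(tabbed)      ' True: \s accepts the TAB

    RegEx.Pattern = ", [A-Z][^,]*,"
    Debug.Print RegEx.Test(tabbed)      ' False: a literal space does not match a TAB

    RegEx.Pattern = ",[ \f\n\r\t\v][A-Z][^,]*,"
    Debug.Print RegEx.Test(tabbed)      ' True: the explicit class behaves like \s here

    ' "*" also allows zero characters after the capital letter
    RegEx.Pattern = ",\s[A-Z][^,]*,"
    Debug.Print RegEx.Test(", J,")      ' True
End Sub
```

If only a plain space should be accepted after the comma, a literal space in the pattern is the stricter choice.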