Thread: [Solved] Regex-pattern
 
01-03-2025, 02:36 PM
batman1 (Windows 11, Office 2013)

Quote:
Originally Posted by gmaxey
Batman1,


Specifically the last instance ", John smith," is returned as a match. How could we prevent that?

Code:
Sub ScratchMacro()
Dim RegEx As Object, Matches As Object, Match As Object
    Set RegEx = CreateObject("VBScript.RegExp")
    With RegEx
        .Global = True
        .Pattern = ", [A-Z][^, ]+(|( [^, ]+)* [A-Z][^,]+),"
    End With
    Set Matches = RegEx.Execute(ActiveDocument.Range.text)
    For Each Match In Matches
        Debug.Print Match.Value
    Next
End Sub
The author of the thread did not really give any criteria for the input data. If we have requirements for the form of the results, we must also specify the form of the input data. If the input can be anything, we have to list every character that is allowed between the commas. Note that the code quoted above also returns ", Beate 123-van4 Ackeren," as a match, and that is not a first name and surname, right?

The code below accepts only the characters listed in the Const characters. Tested with the data shown in the attached picture.
Code:
Sub ScratchMacro()
Const characters As String = "[A-Za-zü\-]"
Dim RegEx As Object, Matches As Object, Match As Object
    Set RegEx = CreateObject("VBScript.RegExp")
    With RegEx
        .Global = True
        .Pattern = ", [A-Z]" & characters & "+(|( " & characters & "+)* [A-Z]" & characters & "+),"
    End With

    Set Matches = RegEx.Execute(ActiveDocument.Range.text)
    For Each Match In Matches
        Debug.Print Match.Value
    Next
End Sub
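To see the difference between the two patterns without opening a document, here is a minimal sketch that runs both against a string. The helper CountMatches, the Sub name and the sample text are made up for illustration and are not from the thread.
Code:
' Hypothetical helper (not from the thread): counts matches of a pattern in a string.
Function CountMatches(ByVal pattern As String, ByVal source As String) As Long
Dim RegEx As Object
    Set RegEx = CreateObject("VBScript.RegExp")
    With RegEx
        .Global = True
        .Pattern = pattern
    End With
    CountMatches = RegEx.Execute(source).Count
End Function

Sub CompareNamePatterns()
Const characters As String = "[A-Za-zü\-]"
Dim sample As String, loosePattern As String, strictPattern As String
    ' Made-up sample: one plausible name and one entry containing digits
    sample = ", Beate van Ackeren, and, Beate 123-van4 Ackeren, end"
    ' Pattern from the quoted post: anything except comma/space is accepted
    loosePattern = ", [A-Z][^, ]+(|( [^, ]+)* [A-Z][^,]+),"
    ' Restricted pattern: only the characters in the Const are accepted
    strictPattern = ", [A-Z]" & characters & "+(|( " & characters & "+)* [A-Z]" & characters & "+),"
    Debug.Print CountMatches(loosePattern, sample)   ' 2: both entries match
    Debug.Print CountMatches(strictPattern, sample)  ' 1: the entry with digits is rejected
End Sub
If names with other letters (for example ä, ö, é) can occur in the data, they would have to be added to the Const, otherwise those names are skipped.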

Quote:

If you don't mind, can you explain what each part of your pattern is intended to perform?


For others following, with mine it is
1. "," match a comma
2. "\s" match a space
3. "[A-Z]" match a capital letter A to Z
4. "[^,]*" match any characters excluding a comma zero or more times
5. "," match a comma
2. "\s" matches not only a space but any whitespace character (space, TAB, form feed, ...); it is equivalent to "[ \f\n\r\t\v]"
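A small sketch of why this matters: with "\s" the pattern also accepts a TAB after the comma, while a literal space in the pattern does not. The Sub name and the sample string are made up for illustration.
Code:
Sub CompareWhitespaceClasses()
Dim RegEx As Object
Dim sample As String
    Set RegEx = CreateObject("VBScript.RegExp")
    ' Made-up sample: a comma followed by a TAB instead of a space
    sample = "," & vbTab & "Smith,"
    RegEx.Pattern = ",\s[A-Z]"
    Debug.Print RegEx.Test(sample)   ' Test returns True: \s accepts the TAB
    RegEx.Pattern = ", [A-Z]"
    Debug.Print RegEx.Test(sample)   ' Test returns False: only a literal space is accepted
End Sub
Since "\s" also includes "\r", and Word stores a paragraph mark as a carriage return, a pattern using "\s" can match a comma at the end of one paragraph followed by a capital letter at the start of the next. A literal space, as in the code above, avoids that.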
Attached Images
regex.png (37.2 KB)