Microsoft Office Forums

  #1  
04-18-2018, 06:29 AM
Cosmo (Windows Vista, Office 2007)
Regular expressions and field codes

I have a function that uses a regular expression to search the document text and then changes the found text based on its position in the document.

This works fine unless the document contains a field. In that case the positions of the found text are not correct, and I can't figure out what is offsetting them. (I tried comparing the lengths of Field.Result and Field.Code, but the amount of offset doesn't match any combination of these lengths.)

Here is a boiled down version of the code I am using.
Code:
' Requires a reference to "Microsoft VBScript Regular Expressions 5.5"
Set re = New RegExp
re.Pattern = "(TEXT1)( Text2)? \(text3\)( text4)?(?: text5)?"
re.IgnoreCase = True
re.Global = True
txt = ActiveDocument.Range.Text

If re.Test(txt) Then
    ' Get all matches
    Set allmatches = re.Execute(txt)
    ' Look at each match and highlight the corresponding range
    For Each m In allmatches
        ' Build a range from the match position in the text string
        startPos = m.FirstIndex
        endPos = startPos + m.Length
        Set newRNG = ActiveDocument.Range(Start:=startPos, End:=endPos)

        ' This range is NOT correct if there are fields
        newRNG.Select

        ' Code here to process found text
        If (condition) Then
            ' Edit range here

        End If

    Next m
End If
This code works fine in a document without fields: it runs through and selects each range that matches the pattern. But if there is a field in the document (e.g. a 'CreateDate' field or a text form field), the ranges selected after it are not the found text; each selection lands before the found text.

Is there a proper way to do the regular expression search that will allow me to edit the found ranges when necessary? I don't believe I can use Word's 'Find' with wildcards, since I need to use the SubMatch values (left out here for brevity), not the full found text, and I don't think wildcards can perform the search I need.

I hope I have explained my issue clearly. Please let me know if there is any more information I need to provide.
  #2  
04-18-2018, 10:48 AM
Cosmo (Windows Vista, Office 2007)

To hopefully explain the problem better: I use a regular expression to find all matches in the document's text. When I go through the matches, if a field precedes a match, the FirstIndex value doesn't match the position of the found text in the document's range.

For example, if the first match is at position 168 in the text and there's a field before it, the match might actually be at position 228 in the document.
  #3  
04-19-2018, 09:47 PM
macropod (Windows 7 64bit, Office 2010 32bit)

Hi Cosmo,

Can you attach a document to a post with some representative data (delete anything sensitive) demonstrating the problem? You do this via the paperclip symbol on the 'Go Advanced' tab at the bottom of this screen.

I'm not that familiar with RegEx, but may be able to advise you on how to adjust the ranges to account for the fields.
__________________
Cheers,
Paul Edstein
[Fmr MS MVP - Word]
  #4  
04-20-2018, 05:39 AM
Cosmo (Windows Vista, Office 2007)

Thanks for the response. I'm attaching a demo file with a function 'testingRegEx'.

For the purposes of this test, I have simplified the regular expression to a simple text pattern, but the one I will be using is much more complicated. This doesn't affect the purpose of this test.

Running the function should highlight every instance of 'Lorem' within the document. There are 2 fields (a 'CreateDate' field, and a text form field) after the second paragraph. In the paragraphs after these fields, the text highlighted is not the found text.

I figured it was due to a discrepancy between Field.Code ( CREATEDATE \@ "M/d/yyyy" \* MERGEFORMAT ) and Field.Result (4/2/2018), but I don't see any correlation between those numbers: the date field's code is 43 characters and its result is 8 characters, yet the found range seems to be offset by 46 characters.

If I could find out how to calculate the offset (46 for the date) from each field, then I could loop through all fields that precede the found text and adjust the position.
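One reading that fits the numbers above, offered as an assumption rather than a confirmed explanation: the document's character positions may count the field code plus three field marker characters (field begin, separator, field end), none of which appear in Range.Text while field codes are hidden. That would give 43 + 3 = 46 for the date field. The diagnostic sketch below prints the figures needed to check this against a real document; the procedure name ReportFieldOffsets is hypothetical, and the "code + 3" rule may not hold for every field type (nested fields, fields without a result, or form fields).

```vba
Sub ReportFieldOffsets()
    ' Diagnostic sketch: compares the length of the text string with the
    ' document's character span and lists each field's code/result lengths.
    Dim fld As Field
    Dim rng As Range

    Set rng = ActiveDocument.Range
    Debug.Print "Len(Range.Text):   " & Len(rng.Text)
    Debug.Print "Range.End - Start: " & (rng.End - rng.Start)

    For Each fld In ActiveDocument.Fields
        ' ASSUMPTION: characters missing from Range.Text per field
        ' = code length + 3 marker characters. Verify before relying on it.
        Debug.Print "Field type " & fld.Type & _
                    ": code=" & Len(fld.Code.Text) & _
                    ", result=" & Len(fld.Result.Text) & _
                    ", code+3=" & (Len(fld.Code.Text) + 3)
    Next fld
End Sub
```

If the difference between the two lengths on the first lines equals the sum of the "code+3" values, the assumption holds for that document; if not, some field type contributes a different number of characters.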
Attached file: TestingRegex.docm (19.7 KB)
  #5  
04-20-2018, 02:48 PM
Cosmo (Windows Vista, Office 2007)

Just found out while experimenting that the highlighted text is different if I have the field codes toggled open (i.e. the range text then includes the field's code instead of its result). With the codes showing, the offset was 1 character more than the length of the date field's result. But that doesn't hold for more complicated fields, or for text form fields.

Oddly, it doesn't work if I toggle the field codes 'on' in the function, only if they were toggled on manually. Not sure why that would be, but it is yet another annoyance.
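What Range.Text returns can also be set on the range itself through its TextRetrievalMode property, independently of what the window displays. The sketch below requests field codes and hidden text, then checks whether the string length equals the range's character span before trusting string indices as positions. Whether the two actually agree in a given document is not confirmed here, and the procedure name CheckTextAlignment is hypothetical.

```vba
Sub CheckTextAlignment()
    Dim rng As Range
    Dim txt As String

    Set rng = ActiveDocument.Range
    ' Ask this range to include field codes and hidden text in .Text,
    ' regardless of what the window currently displays.
    rng.TextRetrievalMode.IncludeFieldCodes = True
    rng.TextRetrievalMode.IncludeHiddenText = True
    txt = rng.Text

    If Len(txt) = rng.End - rng.Start Then
        Debug.Print "Lengths agree; string indices may map directly to range positions."
    Else
        Debug.Print "Lengths differ by " & ((rng.End - rng.Start) - Len(txt)) & _
                    "; string indices cannot be used as range positions."
    End If
End Sub
```

Note that with field codes included, the regular expression would also see (and could match inside) the field code text.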

I'll have to experiment some more next week. I would like to solve this problem, but I might have to settle for converting ALL fields in the document to text before running the find function I need to perform.
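For reference, the fallback of converting fields to text could be sketched with the Fields.Unlink method, which replaces fields with their current results. This is destructive, so it is best run on a copy of the document. ActiveDocument.Fields covers only the main text story, so fields in headers, footers, or text boxes would need separate handling, and how form fields behave when unlinked is not verified here. The procedure name UnlinkBodyFields is hypothetical.

```vba
Sub UnlinkBodyFields()
    ' Destructive: replaces each field in the main text story with its
    ' current result text. Run on a copy, or close without saving afterwards.
    ActiveDocument.Fields.Unlink
End Sub
```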
  #6  
04-20-2018, 04:10 PM
macropod (Windows 7 64bit, Office 2010 32bit)

Perhaps:
Code:
Private Function testingRegEx()
    ' Requires a reference to "Microsoft VBScript Regular Expressions 5.5"
    Dim re As RegExp
    Dim txt As String
    Dim allmatches As MatchCollection, m As Match

    Set re = New RegExp

    re.Pattern = "(Lorem)"
    re.IgnoreCase = True
    re.Global = True

    txt = ActiveDocument.Range.Text

    If re.Test(txt) Then
        ' Get all matches
        Set allmatches = re.Execute(txt)
        ' For each match, let Word's own Find locate and highlight the text,
        ' so no character positions from the string are needed
        For Each m In allmatches
            With ActiveDocument.Range.Find
                .ClearFormatting
                .Replacement.ClearFormatting
                .Text = m.Value
                .Replacement.Text = "^&"
                .Replacement.Highlight = True
                .Forward = True
                .Wrap = wdFindStop
                .Execute Replace:=wdReplaceAll
            End With
        Next m
    End If
End Function
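If each match needs its own Range (for example, to decide on an edit from the SubMatches), one variation on the code above is to search forward from the end of the previous hit, so that the n-th regex match is paired with the n-th occurrence in the document. This is a sketch under several assumptions: the matched text is under Find's 255-character limit, contains no paragraph marks, and does not straddle a field boundary. The procedure name ProcessEachMatch is hypothetical, and late binding is used so no library reference is required.

```vba
Sub ProcessEachMatch()
    Dim re As Object, m As Object
    Dim hit As Range

    Set re = CreateObject("VBScript.RegExp")   ' late binding, no reference needed
    re.Pattern = "(Lorem)"
    re.IgnoreCase = True
    re.Global = True

    ' Start with the whole document as the search range
    Set hit = ActiveDocument.Range
    For Each m In re.Execute(ActiveDocument.Range.Text)
        With hit.Find
            .ClearFormatting
            .Text = Replace(m.Value, "^", "^^")  ' escape Find's special character
            .Format = False
            .MatchCase = True                    ' m.Value is the exact text matched
            .MatchWholeWord = False
            .MatchWildcards = False
            .Forward = True
            .Wrap = wdFindStop
            If .Execute Then
                ' hit now covers the matched text in the document
                hit.HighlightColorIndex = wdYellow
                ' m.SubMatches(0) etc. are available here for the edit decision

                ' Continue searching after this hit
                hit.SetRange hit.End, ActiveDocument.Range.End
            End If
        End With
    Next m
End Sub
```

Because the Range is live, edits made to a hit do not invalidate the positions of later hits the way precomputed string indices would.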