#1
Wildcard replace any string in context with a specified string
I'm working with OCR'd documents with frequently repeated titles, like "CBS EVENING NEWS WITH WALTER". For cases where WITH is completely misspelled, as in "CBS EVENING NEWS BIFG WALTER", I need to find that string and replace it with WITH. Pseudocode would be something like "find NEWS followed by any single word except WITH, followed by WALTER. Replace with NEWS WITH WALTER".
I'm having no luck specifying "any string except this", like "CBS EVENING NEWS [!WITH] WALTER". Search just says it's not found. Is it possible to specify a particular string using [!], or does it work only for single characters?
#2
You could use a wildcard Find/Replace where:
Find = (CBS EVENING NEWS )*( WALTER)
Replace = \1WITH\2
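For anyone cleaning the OCR text outside Word, the same capture-group idea can be sketched in Python. This is only an analogy: Python's `re` syntax is not Word's wildcard syntax, the lazy `.*?` is an approximate stand-in for `*`, and the function name `fix_title` is made up for illustration.

```python
import re

# Group 1 is the fixed text before the damaged word, group 2 the fixed text after it.
# The lazy .*? between them plays roughly the role of * in the Word search above.
PATTERN = re.compile(r"(CBS EVENING NEWS ).*?( WALTER)")


def fix_title(text: str) -> str:
    """Replace whatever sits between the two groups with WITH (hypothetical helper)."""
    return PATTERN.sub(r"\1WITH\2", text)


print(fix_title("CBS EVENING NEWS BIFG WALTER"))  # CBS EVENING NEWS WITH WALTER
```

A title that is already correct passes through unchanged, since WITH is simply replaced with WITH.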
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]
#3
Well, that certainly works; thanks very much, Paul!
#4
If this is a search and replace activity, you don't NEED to exclude the correct word since you would be replacing it with itself.
Find: EVENING NEWS ???? WALTER
Replace with: EVENING NEWS WITH WALTER
As for why the search you tried isn't working:
1. [!WITH] looks for a single character other than those four letters. If you wanted a four-letter word, you would need [!WITH]{4}.
2. However, that fix wouldn't work as expected, because the search would exclude any four-letter word that happens to contain any one of those letters. Since 'I' appears in BIFG, it would be missed.
3. It still wouldn't work if you got specific about the placement of the excluded characters with [!W][!I][!T][!H], because if the second letter is I, as in your example, the string won't be found.
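The three points above can be checked with a rough Python equivalent. Note the assumptions: Python writes a negated set as `[^...]` where Word uses `[!...]`, and `.{4}` is used here as an approximate stand-in for `????`. The variable names are invented for this sketch.

```python
import re

sample = "CBS EVENING NEWS BIFG WALTER"

# A negated set excludes individual characters, not the word WITH.
per_char = re.compile(r"EVENING NEWS [^WITH]{4} WALTER")
# Excluding one letter per position fails as soon as one letter survived OCR.
per_position = re.compile(r"EVENING NEWS [^W][^I][^T][^H] WALTER")
# Any four characters, with no exclusion at all.
any_four = re.compile(r"EVENING NEWS .{4} WALTER")

print(per_char.search(sample) is not None)      # False: BIFG contains an I
print(per_position.search(sample) is not None)  # False: the second letter is I
print(any_four.search(sample) is not None)      # True
```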
__________________
Andrew Lockton Chrysalis Design, Melbourne Australia
#5
The advantage of using * instead of ???? is that the OCR process may have created more or fewer than 4 characters instead of WITH. The disadvantage is that it may make false matches (e.g. EVENING NEWS WITHOUT WALTER).
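Here is a small Python sketch of that trade-off, offered as an analogy rather than as Word's actual behaviour. `FLEXIBLE` approximates `*`, `FIXED` approximates `????`, and the helper `repair` and the sample string with a five-character OCR error are invented for illustration.

```python
import re

FLEXIBLE = re.compile(r"(EVENING NEWS ).*?( WALTER)")  # rough analogue of *
FIXED = re.compile(r"(EVENING NEWS ).{4}( WALTER)")    # rough analogue of ????


def repair(pattern: re.Pattern, text: str) -> str:
    """Apply the WITH replacement using the given pattern (hypothetical helper)."""
    return pattern.sub(r"\1WITH\2", text)


# OCR turned WITH into five characters: only the flexible pattern repairs it.
print(repair(FLEXIBLE, "EVENING NEWS VVITH WALTER"))  # EVENING NEWS WITH WALTER
print(repair(FIXED, "EVENING NEWS VVITH WALTER"))     # EVENING NEWS VVITH WALTER

# A legitimate longer word: the flexible pattern makes a false match.
print(repair(FLEXIBLE, "EVENING NEWS WITHOUT WALTER"))  # EVENING NEWS WITH WALTER
print(repair(FIXED, "EVENING NEWS WITHOUT WALTER"))     # EVENING NEWS WITHOUT WALTER
```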
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]
#6
Another disadvantage of * is that the search is not restricted to a single word, so there is the danger that
EVENING NEWS BY HERBERT blah blah multiple paragraphs blah blah EVENING NEWS WITH WALTER
would become
EVENING NEWS WITH WALTER
so you would lose a lot of content you possibly didn't want to lose.
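The content-loss scenario is easy to reproduce in a Python sketch. The assumptions here: `re.DOTALL` is used so that `.` crosses line breaks, as a stand-in for `*` running across paragraph marks, and the sample text and the `one_word` alternative using `\S+` are invented for illustration.

```python
import re

text = (
    "EVENING NEWS BY HERBERT\n"
    "First paragraph of unrelated content.\n"
    "Second paragraph of unrelated content.\n"
    "EVENING NEWS WITH WALTER"
)

# Unbounded match: starts at the first EVENING NEWS and runs to the first WALTER,
# swallowing every paragraph in between.
unbounded = re.compile(r"(EVENING NEWS ).*?( WALTER)", re.DOTALL)
print(unbounded.sub(r"\1WITH\2", text))  # EVENING NEWS WITH WALTER

# Restricting the gap to one run of non-space characters keeps the match to a single word.
one_word = re.compile(r"(EVENING NEWS )\S+( WALTER)")
print(one_word.sub(r"\1WITH\2", text) == text)  # True: nothing is lost
```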
__________________
Andrew Lockton Chrysalis Design, Melbourne Australia
#7
True, in which case one might replace the * in the F/R Find I posted with *{3,6}, for example, to limit the wildcard matches to strings with 3-6 characters.
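As a rough Python illustration of limiting the match length: the sketch below uses `\S{3,6}` (three to six non-space characters), which is my own choice for the analogy and is narrower than "any 3-6 characters". The name `BOUNDED` and the sample strings are hypothetical.

```python
import re

# Between the two fixed groups, allow only 3 to 6 non-space characters.
BOUNDED = re.compile(r"(CBS EVENING NEWS )\S{3,6}( WALTER)")

samples = [
    "CBS EVENING NEWS BIFG WALTER",     # 4 characters: repaired
    "CBS EVENING NEWS VVITH WALTER",    # 5 characters: repaired
    "CBS EVENING NEWS WITHOUT WALTER",  # 7 characters: left alone
]

for line in samples:
    print(BOUNDED.sub(r"\1WITH\2", line))
```

A length limit narrows the false matches but does not remove them: any other legitimate word of 3-6 characters in that position would still be replaced.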
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]
#8
Thank you, Andrew and Paul; I now know a lot more about F/R exclusions. I'll try both the * and ???? methods to see which is best in my situation.