#1
Convert RegEx to Word (Devanagari Font Find/Replace)
Help! I have a converter from the (non-Unicode) Sanskrit 1.2 font to the (Devanagari Unicode) Sanskrit 2003 font: http://is.gd/sw9ruq. It works perfectly in EmEditor, but Word does not support full RegEx expressions, so I am asking for help.
Code:
1) Works perfectly:
"([क-ह]|[क़-य़])ρ","ρ\1"
"((([क-ह]|[क़-य़])्)+)ρ","ρ\1"
"i((([क-ह]|[क़-य़])्)+)","\1i"
2) Works perfectly:
"i([क-ह]|[क़-य़])","\1ि"
"([१३])(([॒॑])+)","\1"
3) Does not work at all: we look for "([॒॑])([ा-ौ]|[ॢॣ]|[ँंः])" and replace it with "\2\1"
"(([ा-ौ]|[ॢॣ]|[ँंः]|[॒॑])+)ρ","ρ\1"
HTML Code:
(()())
Code:
Word.Find find = app.Selection.Find;
string finds = listView1.Items[s].Text; // search string
find.Text = finds;
string fonts = txB_NAME_FONT.Text;
find.Font.Name = fonts; // font to search for
find.Replacement.ClearFormatting();
string Repl = listView1.Items[s].SubItems[1].Text; // replacement string
Repl = Repl.Replace("\r", string.Empty);
find.Replacement.Text = Repl;
Object wrap = Word.WdFindWrap.wdFindContinue;
Object replace = Word.WdReplace.wdReplaceAll;
if ((s == 14) || (s == 199))
{
    find.Text = "";
    if ((finds.IndexOf("[") > 0) || (finds.IndexOf("]") > 0))
    {
        find.Text = "";
        find.Text = finds;
    }
    else
    {
        find.Text = "";
        find.Text = "[" + finds + "]"; // wrap in brackets, otherwise the wildcard search throws an exception
    }
    find.Execute(FindText: Type.Missing, MatchCase: false, MatchWholeWord: false,
        MatchWildcards: true, // enable wildcard ("regex") matching
        MatchSoundsLike: missing, MatchAllWordForms: false, Forward: true,
        Wrap: wrap, Format: true, ReplaceWith: missing, Replace: replace);
}
#2
In Word, if you're using wildcards (as your code indicates) to Find a literal '(' or ')', you must precede them with '\'. Other characters you need to treat the same way include:
{}[]@*^<>?\!
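The escaping rule above can be applied mechanically before a string is handed to a wildcard Find. The following is only a minimal Python sketch: the function name `escape_word_wildcards` and the constant are hypothetical, and the character set is simply the list from this post plus the two parentheses.

```python
# Characters that act as operators in a Word wildcard Find, taken from the
# list in this post plus "(" and ")". The names here are illustrative only.
WORD_WILDCARD_SPECIALS = set("(){}[]@*^<>?\\!")


def escape_word_wildcards(literal: str) -> str:
    """Prefix each wildcard operator with a backslash so it is matched literally."""
    return "".join(
        "\\" + ch if ch in WORD_WILDCARD_SPECIALS else ch
        for ch in literal
    )


if __name__ == "__main__":
    # Prints the escaped form of a short literal search string.
    print(escape_word_wildcards("(du>o dhnay nm> izvay.)"))
```

Text without any of the listed characters passes through unchanged, so the helper can be applied to every literal search string, not only to those known to contain brackets.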
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]
#3
Thank you
Code:
-i*iàyay -vraeg-yaphay %¢ay dugR-vsagrtar[ay, JyaeitmRyay punéÑvvar[ay dairÔ(du>o dhnay nm> izvay. 3. cmaRMbray zv-Smivlepnay -ale][ay )i[ku{flmi{ftay, mÃIrpadyuglay jqaxray dairÔ(du>o dhnay nm> izvay. 4.
HTML Code:
(du>o dhnay nm> izvay. 3. cmaRMbray zv-Smivlepnay -ale][ay )
#4
As per my previous post, if those are literal Find strings, you would need to use:
\(du\>o dhnay nm\> izvay. 3. cmaRMbray zv-Smivlepnay -ale\]\[ay \)
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]
#5
Thanks for your advice. I thought there was no real VBA discussion around Word, and I'm glad I was wrong. I did understand it the first time. Do you know how Word treats "(()())"? In Excel, constructions like "(()())" work perfectly, but I do not know about Word.
Do you mean that "i((([क-ह]|[क़-य़])्)+)" should be written as "i\(\(\([क-ह]|[क़-य़]\)्\)\+\)"? So the nested syntax "(()())" does not work at all, and I should rewrite (([क-ह]|[क़-य़])्)h as ([क-ह]्|[क़-य़]्)h?
#6
As I don't know what your Excel construction "(()())" is supposed to represent, I can't really say what the equivalent might be in Word, or whether it's even possible. Word can certainly use a wildcard expression like Find = (A)(B) Replace = \2\1, but you can't use a wildcard expression like Find = ((A)(B)C).
Even so, anything you can do with RegEx code in Excel you can do with the same RegEx code in Word. The RegExp Object is accessed in Word the same as it is in Excel, via either Early or Late Binding. Early Binding requires setting a VBA reference from the Visual Basic Editor via Tools>References>Microsoft VBScript Regular Expressions 5.5.
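To see what a true regular-expression engine does with the group swap mentioned above (Find = (A)(B), Replace = \2\1), rule 3 from post #1 can be tried outside Word first. This is only a sketch in Python's `re` module, which is an assumption standing in for the VBScript RegExp object and is not identical to it (for one thing, VBScript writes back-references in the replacement as $1 and $2). The name `swap_accent_and_sign` is hypothetical, and the code points are written as escapes so the classes are unambiguous.

```python
import re

# Rule 3 from post #1, written with explicit code points:
#   group 1: an accent mark, U+0952 (anudatta) or U+0951 (udatta)
#   group 2: a vowel sign U+093E..U+094C, or U+0962/U+0963,
#            or one of U+0901, U+0902, U+0903
RULE_3 = re.compile(
    "([\u0952\u0951])"
    "([\u093e-\u094c]|[\u0962\u0963]|[\u0901\u0902\u0903])"
)


def swap_accent_and_sign(text: str) -> str:
    """Move the accent mark after the sign that follows it (replace with \\2\\1)."""
    return RULE_3.sub(r"\2\1", text)


if __name__ == "__main__":
    before = "\u0915\u0951\u093e"  # KA + udatta + vowel sign AA
    after = swap_accent_and_sign(before)
    # Show the code points so the reordering is visible regardless of font.
    print([hex(ord(ch)) for ch in after])
```

Once a rule behaves as intended here, the same pattern can be carried over to the RegExp object, whereas the '|' alternation inside group 2 has no direct counterpart in a Word wildcard Find.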
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]
#7
Cross-posted at: http://www.vbaexpress.com/forum/showthread.php?t=45896
For cross-posting etiquette, please read: http://www.excelguru.ca/content.php?184
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]
#8
Paul, you are the eye of Ra
How do I rewrite these RegEx patterns for Word?
1) @e|@ˆ
2) AaE|Aa‰
3) [ा-ौ]|[ॢॣ]|[ँंः]
I do not know how to write the OR argument for Word.
#9
In Word wildcard terms, your #3 has three OR arguments. This [ा-ौ] is an OR argument. So is this [ॢॣ] and this [ँंः]. Word does not use the RegEx '|' for an OR separator. If you want to be able to search for any of those terms using a wildcard Find in Word, you would need to use [ा-ौॢॣँंः].
Your #1 and #2 represent RegEx OR expressions but they are not Word wildcard expressions. If you want to use true regular expressions, then use RegEx, not Word wildcards. As advised in post #6, you can use RegEx in Word just the same as in Excel.
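The rewrite described above, where several bracketed sets joined by '|' collapse into one bracketed set, can be done mechanically. The following is a minimal Python sketch under stated assumptions: the helper name `merge_classes` is hypothetical, and it deliberately refuses anything that is not a plain alternation of simple bracketed sets (such as #1 and #2), since those have no single-set equivalent.

```python
import re

# One simple bracketed set with no nested brackets, e.g. "[a-c]" or "[xy]".
_SIMPLE_CLASS = re.compile(r"\[([^\[\]]+)\]")


def merge_classes(alternation: str) -> str:
    """Turn "[A]|[B]|[C]" into the single set "[ABC]" for a Word wildcard Find."""
    members = []
    for part in alternation.split("|"):
        match = _SIMPLE_CLASS.fullmatch(part)
        if match is None:
            # Multi-character alternatives like "AaE" cannot be merged this way.
            raise ValueError(f"not a simple bracketed set: {part!r}")
        members.append(match.group(1))
    return "[" + "".join(members) + "]"


if __name__ == "__main__":
    print(merge_classes("[a-c]|[xy]|[0-9]"))
```

Applied to pattern #3, the helper concatenates the three sets in order, which gives the single set shown in this post. Patterns #1 and #2 raise `ValueError`, matching the point that they are RegEx OR expressions rather than Word wildcard expressions.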
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]
#10
Now also cross-posted, again without providing links, at: http://windowssecrets.com/forums/sho...2013-VBA-RegEx
If you want to be banned from here, keep cross-posting without providing links.
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]