#1
|
|||
|
|||
Extract phone number from word file
Hello
I’m looking for a word macro that will extract numbers from a word file Its has bunch of numbers. The format of those numbers are ###+###-####, so it’s a 10 digit format after first 3 digits there is a plus sign and then after 3 digits there is a minus sign and then the last 4 digits. What the macro should do is find the phone number and extract that into a new word window and then if the there is a DUPLICATE OF THAT NUMBER then the word should SKIP THAT NUMBER. and then search and extract the next number since the file has lots of duplcate number and i only need unique numbers. Below is the sample data, data data data data data data data data data data data data data data data 951+907-3887 data data data data data data data data data data data data data data data 951+907-3887 data data data data data data data data data data data data data data data 951+907-3887 data data data data data data data data data data data data data data data 916+555-1212 data data data data data data data data data data data data data data data 916+555-1212 data data data data data data data data data data data data data data data Thanks. |
#2
|
||||
|
||||
Are there numbers that are not phone numbers in the document?
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
#3
|
||||
|
||||
Actually, I don't need to know that - try the following macro:
Code:
Sub Demo() Application.ScreenUpdating = False Dim StrTxt As String StrTxt = "|" With ActiveDocument.Range With .Find .ClearFormatting .Replacement.ClearFormatting .Text = "[0-9]{3}+[0-9]{3}-[0-9]{4}" .Replacement.Text = "" .Forward = True .Wrap = wdFindStop .Format = False .MatchWildcards = True .Execute End With Do While .Find.Found If InStr(StrTxt, .Duplicate.Text) = 0 Then StrTxt = StrTxt & .Duplicate.Text & "|" .Collapse wdCollapseEnd .Find.Execute Loop End With With ActiveDocument.Range .InsertAfter Replace(StrTxt, "|", vbCr) .Characters.Last.Delete End With Application.ScreenUpdating = True End Sub
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
#4
|
|||
|
|||
Extract Phone number from word file
Hello Paul
Thanks for the macro, it worked great.!! After running the macro I realized that there was more data that need to be extracted. And they are in a block format. The format of the block is below beginning code=abc bunch of data 908+555-1212 ending code=xyz garbage data garbage data garbage data garbage data garbage data Sample Data =========== beginning code=abc bunch of data 908+555-1212 ending code=xyz garbage data garbage data garbage data garbage data garbage data beginning code=abc bunch of data 908+555-1212 ending code=xyz garbage data garbage data garbage data garbage data garbage data beginning code=abc bunch of data 510+555-1212 ending code=xyz garbage data garbage data garbage data garbage data garbage data beginning code=abc bunch of data 510+555-1212 ending code=xyz garbage data garbage data garbage data garbage data garbage data beginning code=abc bunch of data 510+555-1212 ending code=xyz garbage data garbage data garbage data garbage data garbage data beginning code=abc bunch of data 510+555-1212 ending code=xyz garbage data garbage data garbage data garbage data garbage data beginning code=abc bunch of data 510+555-1212 ending code=xyz garbage data garbage data garbage data garbage data garbage data beginning code=abc bunch of data 510+555-1212 ending code=xyz garbage data garbage data garbage data garbage data garbage data beginning code=abc bunch of data 510+555-1212 ending code=xyz garbage data garbage data garbage data garbage data garbage data beginning code=abc bunch of data 510+555-1212 ending code=xyz garbage data garbage data garbage data garbage data garbage data I was able to locate the 9 digit phone number with the following code. Code:
Do Selection.Find.ClearFormatting With Selection.Find .Text = "^?^?^?+^?^?^?-^?^?^?^?" .Replacement.Text = """" .Forward = True .Wrap = wdFindContinue .Format = False .MatchCase = False .MatchWholeWord = False .MatchWildcards = False .MatchSoundsLike = False .MatchAllWordForms = False End With Selection.Find.Execute Selection.MoveRight Unit:=wdCharacter, Count:=2 Selection.HomeKey Unit:=wdLine, Extend:=wdExtend Selection.MoveUp Unit:=wdLine, Count:=1, Extend:=wdExtend Selection.Cut Windows(2).Activate Selection.Paste Selection.TypeParagraph Windows(1).Activate Loop Until Selection.Find.Found = False 1) find the 9 digits number and take out that whole record into a new window 2) now the next few records are just in sequence following the 1st record, so the macro should check to see if the next records have the same number, if it finds it then it should delete THAT NUMBER and then check for the next record and so on, once it finds a unique number then it should extract THAT RECORD AND ADD IT DOWN BELOW 1ST RECORD THAT WAS EXTRACTED IN THAT NEW WINDOW. any help would be greatly appreciated. Thanks a lot. Last edited by macropod; 06-15-2012 at 06:22 PM. Reason: Added code tags & formatting |
#5
|
||||
|
||||
In your latest post you have numerous refernces to 'beginning code=abc' and 'ending code=xyz'. What are the actual codes? Does the same phone number appear with more than one kind of 'bunch of data' and, if so, what is to be done in such cases?
It's not apparent what your use of: Code:
Selection.MoveRight Unit:=wdCharacter, Count:=2 Selection.HomeKey Unit:=wdLine, Extend:=wdExtend Selection.MoveUp Unit:=wdLine, Count:=1, Extend:=wdExtend Finally, you refer to 'take out that whole record into a new window'. I take it you mean another document. Is this a new document or an existing document that you need the content pasted into? PS: When posting code, please use the code tags - they're on the 'Advanced' screen.
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
#6
|
|||
|
|||
Hello Paul
Thanks for replying back to me. Below is my response In your latest post you have numerous references to 'beginning code=abc' and 'ending code=xyz'. What are the actual codes? Does the same phone number appear with more than one kind of 'bunch of data' and, if so, what is to be done in such cases? The beginning code is FirstN and the ending code is LastN Between the begin code and end code there are other items and those change all the time like address, ss#, dob, phone number, place of business so the line record looks something like this. FirstN=john, address=123 xyz street, ss#=555-55-5555, dob=09-09-2009, phone=916-555-1212, place of business=abc co, LastN=smith. And then there is bunch of garbage data, af;adkfjakfjadfjadfjdfasdfadfasdfadfsdfdsfdfdsafad safasfdfs Then the records begins again in the garbage data, as the above sequence. So data looks something like this. Sffsaffafa afadfadfasdfadf adfafdadfasdfdsfsaf sdfadfsdf FirstN=john, address=123 xyz street, ss#=555-55-5555, dob=09-09-2009, phone=916-555-1212, place of business=abc co, LastN=smith. Sfdfsdfsaf fadfasdfsfsffa sdfasdfasdfsadf fsadfadsfasdf sdfadfasf FirstN=john, address=123 xyz street, ss#=555-55-5555, dob=09-09-2009, phone=916-555-1212, place of business=abc co, LastN=smith. Dfsdfsfadfasfa afsadfd asdfasdfas sfdafdaf sdfsdfsfa sfdfsfdsf FirstN=john, address=123 xyz street, ss#=555-55-5555, dob=09-09-2009, phone=916-555-1212, place of business=abc co, LastN=smith. Asfasfsfdfsfa sfsfdfsfs sfdfsfdsfs FirstN=Dave, address=345 xyz street, ss#=444-44-444, dob=09-08-2008, phone=908-555-1212, place of business=abczey co, LastN=Bartel. Sdfsdfdsf sdfadfafkjl a fafjdfajd;fk; asdff sdfsddfsdf sfdfsfdsf sfsfdfsfs sfdfsfdsfs FirstN=Dave, address=345 xyz street, ss#=444-44-444, dob=09-08-2008, phone=908-555-1212, place of business=abczey co, LastN=Bartel. Sdfsdfdsf sdfadfafkjl a fafjdfajd;fk; asdff sdfsddfsdf sfdfsfdsf sfsfdfsfs sfdfsfdsfs FirstN=Dave, address=345 xyz street, ss#=444-44-444, dob=09-08-2008, phone=908-555-1212, place of business=abczey co, LastN=Bartel. Sdfsdfdsf sdfadfafkjl a fafjdfajd;fk; asdff sdfsddfsdf sfdfsfdsf sfsfdfsfs sfdfsfdsfs FirstN=Dave, address=345 xyz street, ss#=444-44-444, dob=09-08-2008, phone=908-555-1212, place of business=abczey co, LastN=Bartel. Sdfsdfdsf sdfadfafkjl a fafjdfajd;fk; asdff sdfsddfsdf sfdfsfdsf so as you can see from the above data, I need to pull the record out of this data a record is considred FirstN=john, address=123 xyz street, ss#=555-55-5555, dob=09-09-2009, phone=916-555-1212, place of business=abc co, LastN=smith. and it’s a TEXT file with continuous data, one thing is good that the once the 1st number is found the duplicate number and next few duplicate numbers appears right in the flow of the file like in the above data and they don’t appear anywhere else in the file. Does the same phone number appear with more than one kind of 'bunch of data' and, if so, what is to be done in such cases? No the bunch of data and phone numbers are always same, so I only need to extract one time and if the next number is duplicate then the macro should skip it, since deleting will take too much time for the file to adjust upward. It's not apparent what your use of: Code: Selection.MoveRight Unit:=wdCharacter, Count:=2 Selection.HomeKey Unit:=wdLine, Extend:=wdExtend Selection.MoveUp Unit:=wdLine, Count:=1, Extend:=wdExtendis for. It doesn't seem to have any logical connection with the data you've posted. Perhaps you could explain this. Yeah you can ignore the code that I posted, I’m using windows 7 and word 2003, ( forgot to mentioned that earlier) but don’t need the above code. I used word 2003 macro record functions to record the macro keystrokes. If you can extract the data that I mentioned some how then that’s all I need. Finally, you refer to 'take out that whole record into a new window'. I take it you mean another document. Is this a new document or an existing document that you need the content pasted into? Just a new document window and switching back and forth to extract the data from the main window and then put in the other window. PS: When posting code, please use the code tags - they're on the 'Advanced' screen. I will try, not too familiar with the system. I will really appreciate all your help Thanks.!! |
#7
|
||||
|
||||
OK, try the following version of the macro. You don't even need to create the output document beforehand - the macro does that for itself.
Code:
Sub Demo() Application.ScreenUpdating = False Dim StrTxt As String StrTxt = "" With ActiveDocument.Range With .Find .ClearFormatting .Replacement.ClearFormatting .Text = "FirstN*[0-9]{3}[\+\-][0-9]{3}[\+\-][0-9]{4}*LastN\=*>?" .Replacement.Text = "" .Forward = True .Wrap = wdFindStop .Format = False .MatchWildcards = True .Execute End With Do While .Find.Found If InStr(StrTxt, .Duplicate.Text) = 0 Then StrTxt = StrTxt & .Duplicate.Text & "|" .Collapse wdCollapseEnd .Find.Execute Loop End With Documents.Add With ActiveDocument.Range .InsertAfter Replace(StrTxt, "|", vbCr) .Characters.Last.Delete End With Application.ScreenUpdating = True End Sub
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
#8
|
|||
|
|||
Extract phone number from word file
Hello Paul
Thanks for the Macro It worked great. I had another file and it’s pretty much the same type of thing, but the beginning and ending codes are different. Can you tweak the code for the below data. The records begins as rfn= and ends with rph= Sample record rfn=Michael;fzp=95070;fst=CA;qn=;fln=;fph=;rst=CA; rzp=95070;lt=R;fad=15160+xy+zat+tlm;rln=dffdsfdsfs fd;rph=408+555-1212 rfn=Kim;fzp=84569;fst=CA;qn=;fln=;fph=;rst=CA;rzp= 84569;lt=R;fad=2660+sdfdf+dsf;rln=Jang;rph=508+555-1212 Below is the dummy data rfn=Michael;fzp=95070;fst=CA;qn=;fln=;fph=;rst=CA; rzp=95070;lt=R;fad=15160+xy+zat+tlm;rln=dffdsfdsfs fd;rph=408+555-1212;rad=2343+sfdsf+dfsfsf;ord=232231701404?' var tile = window.dfp_tile_count = window.dfp_tile_count ? ++window.dfp_tile_count : 1 url = url .replace(/;ord=/, ";bt=" + rsi_data + ";ord=") .replace(/;tile=\d*;/, ";tile=" + tile + ";") document.write('\x3Cscript type="text/javascript" src="' + url + '" >\x3C/script>') })() </script> </div> <div class="ad ad_720x300 ad_under"> <script type="text/javascript"> (function () { var dfd = 'dfadfadfada adsfadfa afadfafasd adfadffadf asdfa da adfafas rfn=Michael;fzp=95070;fst=CA;qn=;fln=;fph=;rst=CA; rzp=95070;lt=R;fad=15160+xy+zat+tlm;rln=dffdsfdsfs fd;rph=408+555-1212;rad=2343+sfdsf+dfsfsf;ord=232231701404?' var tile = window.dfp_tile_count = window.dfp_tile_count ? ++window.dfp_tile_count : 1 url = url .replace(/;ord=/, ";bt=" + rsi_data + ";ord=") .replace(/;tile=\d*;/, ";tile=" + tile + ";") document.write('\x3Cscript type="text/javascript" src="' + url + '" >\x3C/script>') })() </script> </div> <div class="ad ad_970x1 ad_"> <script type="text/javascript"> (function () { var url = 'fadsffafdf/N6794/adj/sfdsfs;sz=1x1;tile=2;pos=left;sid=wp.us;pgid=addre ss.results;fct=Saratoga;exact=0;xp=E;ffn=;rct=Saog a;rfn=Michael;fzp=12345;fst=CA;qn=;fln=;fph=;rst=CA; rzp=12345;lt=R;fad=15160+sdfd+dsfsdf+sfd;rln=sfdsf ;rph=408+555-1212;rad=25160+sfsdf+ssdfsd;ord=842231701404?' var tile = window.dfp_tile_count = window.dfp_tile_count ? ++window.dfp_tile_count : 1 url = url .replace(/;ord=/, ";bt=" + rsi_data + ";ord=") .replace(/;tile=\d*;/, ";tile=" + tile + ";") document.write('\x3Cscript type="text/javascript" src="' + url + '" >\x3C/script>') })() </script> </div> <div id="right"> <div class="ad ad_160x600-300x250-300x600 ad_top"> <script type="text/javascript"> (function () { var url = 'sdfdfadfsfresults;sz=720x300;tile=11;pos=over;sid =wp.us;pgid=address.results;fct=meilroy;exact=0;xp =E;ffn=;rct=meilroy;rfn=Kim;fzp=84569;fst=CA;qn=;fln=;fph=;rst=CA;rzp= 84569;lt=R;fad=2660+sdfdf+dsf;rln=Jang;rph=508+555-1212;rad=2660+sdfdsf+sdf;ord=7174278076361?' Really appreciate your help Thanks. !! |
#9
|
||||
|
||||
Simply change:
.Text = "FirstN*[0-9]{3}[\+\-][0-9]{3}[\+\-][0-9]{4}*LastN\=*>?" to: .Text = "rfn=*rph=[0-9]{3}[\+\-][0-9]{3}[\+\-][0-9]{4}"
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
#10
|
|||
|
|||
ct phone number from word file
I did change that. but some how it extracted bunch of stuff also,
can you test the macro with the above data to see what causing it. Thanks. |
#11
|
||||
|
||||
I'm not sure what you mean by 'extra stuff' - the macro extracts 3 records from your dummy data. Although the first two have the same phone number, the other details differ.
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
#12
|
|||
|
|||
Hello Paul
Thanks for the macro, Just needed a bit of tweaking. seems like when there is no phone numbers in the file, then the macro loose it block and grabs on to other data, here is the modified data with number missing and if you run the macro on the below data, you will see how it grab the extra stuff. Thanks. rfn=Michael;fzp=95070;fst=CA;qn=;fln=;fph=;rst=CA; rzp=95070;lt=R;fad=15160+xy+zat+tlm;rln=dffdsfdsfs fd;rph=408+555-1212;rad=2343+sfdsf+dfsfsf;ord=232231701404?' var tile = window.dfp_tile_count = window.dfp_tile_count ? ++window.dfp_tile_count : 1 url = url .replace(/;ord=/, ";bt=" + rsi_data + ";ord=") .replace(/;tile=\d*;/, ";tile=" + tile + ";") document.write('\x3Cscript type="text/javascript" src="' + url + '" >\x3C/script>') })() </script> </div> <div class="ad ad_720x300 ad_under"> <script type="text/javascript"> (function () { var dfd = 'dfadfadfada adsfadfa afadfafasd adfadffadf asdfa da adfafas rfn=Michael;fzp=95070;fst=CA;qn=;fln=;fph=;rst=CA; rzp=95070;lt=R;fad=15160+xy+zat+tlm;rln=dffdsfdsfs fd;rph=408+555-1212;rad=2343+sfdsf+dfsfsf;ord=232231701404?' var tile = window.dfp_tile_count = window.dfp_tile_count ? ++window.dfp_tile_count : 1 url = url .replace(/;ord=/, ";bt=" + rsi_data + ";ord=") .replace(/;tile=\d*;/, ";tile=" + tile + ";") document.write('\x3Cscript type="text/javascript" src="' + url + '" >\x3C/script>') })() </script> </div> <div class="ad ad_970x1 ad_"> <script type="text/javascript"> (function () { var url = 'fadsffafdf/N6794/adj/sfdsfs;sz=1x1;tile=2;pos=left;sid=wp.us;pgid=addre ss.results;fct=Saratoga;exact=0;xp=E;ffn=;rct=Saog a;rfn=Michael;fzp=12345;fst=CA;qn=;fln=;fph=;rst=C A; rzp=12345;lt=R;fad=15160+sdfd+dsfsdf+sfd;rln=sfdsf ;rph=;rad=25160+sfsdf+ssdfsd;ord=842231701404?' var tile = window.dfp_tile_count = window.dfp_tile_count ? ++window.dfp_tile_count : 1 url = url .replace(/;ord=/, ";bt=" + rsi_data + ";ord=") .replace(/;tile=\d*;/, ";tile=" + tile + ";") document.write('\x3Cscript type="text/javascript" src="' + url + '" >\x3C/script>') })() </script> </div> <div id="right"> <div class="ad ad_160x600-300x250-300x600 ad_top"> <script type="text/javascript"> (function () { var url = 'sdfdfadfsfresults;sz=720x300;tile=11;pos=over;sid =wp.us;pgid=address.results;fct=meilroy;exact=0;xp =E;ffn=;rct=meilroy;rfn=Kim;fzp=84569;fst=CA;qn=;f ln=;fph=;rst=CA;rzp= 84569;lt=R;fad=2660+sdfdf+dsf;rln=Jang;rph=508+555-1212;rad=2660+sdfdsf+sdf;ord=7174278076361?' |
#13
|
||||
|
||||
Try using:
.Text = "rfn=[!^13]@rph=[0-9]{3}[\+\-][0-9]{3}[\+\-][0-9]{4}"
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
Thread Tools | |
Display Modes | |
|
Similar Threads | ||||
Thread | Thread Starter | Forum | Replies | Last Post |
Extract Video from .ppsx file | designer | PowerPoint | 1 | 10-14-2011 08:00 AM |
Phone number formatting | Mark Micallef | Outlook | 1 | 08-04-2010 02:38 PM |
Can I Extract a Page from Word and Make a New DOCX File? | tatihulot | Word | 1 | 06-20-2010 11:38 PM |
in WORD, how do i type a 9-digit phone number "xxxxxxxxx" and have it automatically.. | jay8962 | Word | 0 | 04-08-2010 11:08 AM |
automatically extract footnotes into new file and apply character format to footnote | hrdwa | Word | 0 | 02-27-2010 03:16 AM |