10-22-2015, 01:33 AM
PRA007 (Windows 7 32bit, Office 2010 32bit)
Extract Data From Text File Based on Pattern

I have a table like this in Word:



This table has two types of rows:
Rows having multiple numbers
Rows having a single number

In both cases I would like to find the number in the first line only.

I have a large .txt file (800 MB) containing text in the following format:

8232394 06774483 N 19850910 19870818 19910818 EXP.
8309716 06774483 N 19850910 19870818 19910319 REM.
4687262 06908244 N 19860917 19870818 19990815 EXP.
4687262 06908244 N 19860917 19870818 19990309 REM.
4687262 06908244 N 19860917 19870818 19950221 M184
4687262 06908244 N 19860917 19870818 19910108 M173
4687262 06908244 N 19860917 19870818 19880802 ASPN
4687263 06868897 N 19860527 19870818 19990128 M185
4687263 06868897 N 19860527 19870818 19950509 RMPN
4687263 06868897 N 19860527 19870818 19950509 ASPN
4687263 06868897 N 19860527 19870818 19950119 M184
4687263 06868897 N 19860527 19870818 19910311 ASPN
4687263 06868897 N 19860527 19870818 19910124 M173
4687264 06882047 N 19860703 19870818 19990815 EXP.
4687264 06882047 N 19860703 19870818 19990309 REM.
4687264 06882047 N 19860703 19870818 19950503 RMPN
4687264 06882047 N 19860703 19870818 19950503 ASPN
4687264 06882047 N 19860703 19870818 19950119 M184
4687264 06882047 N 19860703 19870818 19910311 ASPN
RE45781 14176526 N 20140210 20151027 20150929 ASPN
RE45786 14260890 N 20140424 20151027 20150929 ASPN
RE45790 14454285 Y 20140807 20151103 20151008 ASPN
RE45793 13445791 N 20120412 20151103 20151006 ASPN

There are three important columns here:
Column 1 has the patent number
Column 5 has the event date
Column 6 has the event status code

I want to search in MS Word for US [0-9]{7}, US [0-9,]{9} or US RE[0-9]{5},
then use the number after US (without the commas in the second case) to look it up in the .txt file and extract the data from the last column of that space-separated file.

For example, if my Word table has US 8,309,716 B2, I would like to search for the corresponding number, i.e. 8309716, in the .txt file and extract the data in the last column, which in the sample above is REM.
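
To make the search-and-normalise step concrete, here is a minimal Python sketch (not Word VBA) of pulling the number out of the first line of a cell. The function name first_patent_number is made up for illustration, and the comma pattern is written a bit more strictly than [0-9,]{9} so that it only accepts the d,ddd,ddd shape; that tightening is my assumption, not part of the original patterns.

```python
import re

# Mirrors the three shapes from the post: "US RE45781", "US 8,309,716", "US 8309716".
# Assumption: the comma form is always d,ddd,ddd (stricter than [0-9,]{9}).
PATENT_RE = re.compile(r"US\s+(RE[0-9]{5}|[0-9],[0-9]{3},[0-9]{3}|[0-9]{7})\b")


def first_patent_number(cell_text):
    """Return the comma-free number found in the first line of a cell, or None."""
    lines = cell_text.splitlines()
    first_line = lines[0] if lines else ""
    match = PATENT_RE.search(first_line)
    if match is None:
        return None
    # Drop the commas so "8,309,716" can be compared with column 1 of the .txt file.
    return match.group(1).replace(",", "")
```

Only the first line of the cell is inspected, so a row holding several numbers still yields just the first one.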

The .txt file sometimes contains these numbers somewhere else as well; I want to search in column 1 only. If the search in column 1 gives multiple results, I would like to keep the one having the latest event date (column 5).
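
The column 1 lookup with "keep the latest date" could be sketched in Python roughly as below. This is only an illustration under assumptions: the name latest_event_codes is hypothetical, and counting every whitespace-separated field in the sample rows, I am taking the event date to be the sixth field and the event code to be the seventh (last) field. The file is read line by line, so an 800 MB file does not have to fit in memory.

```python
def latest_event_codes(lines, wanted):
    """Map each wanted patent number to the code of its latest event.

    `lines` is any iterable of text lines; an open file object works.
    `wanted` is a set of comma-free patent numbers such as {"8309716"}.
    """
    best = {}  # patent number -> (event_date, event_code)
    for line in lines:
        fields = line.split()
        # Match on the first field only, so numbers elsewhere in the line are ignored.
        if len(fields) < 7 or fields[0] not in wanted:
            continue
        # Assumption: field 6 is the event date, field 7 the event code.
        number, event_date, code = fields[0], fields[5], fields[6]
        # Dates are YYYYMMDD, so plain string comparison is chronological.
        # On equal dates the first row encountered is kept.
        if number not in best or event_date > best[number][0]:
            best[number] = (event_date, code)
    return {number: code for number, (_, code) in best.items()}
```

It would be called with an open file, for example with open("events.txt") as f: codes = latest_event_codes(f, {"4687262"}), where events.txt is a placeholder file name.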

I want to copy the event code from the text file into column 3 of the Word table.

After finding the event code, I want to replace it with the corresponding event description from another Word document that has the following two columns.
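
The code-to-description replacement amounts to a two-column lookup. A small Python sketch follows, assuming the two columns have already been read out as (code, description) pairs; the helper names build_lookup and describe are hypothetical.

```python
def build_lookup(rows):
    """Turn (code, description) pairs into a dict, trimming surrounding whitespace."""
    return {code.strip(): description.strip() for code, description in rows}


def describe(code, lookup):
    """Return the description for a code, or the code itself if it is not listed."""
    return lookup.get(code, code)
```

Falling back to the raw code for unlisted entries is a design choice in this sketch, so that a missing row in the lookup document does not blank out the table cell.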



Finally, the table should look like this:
