Microsoft Office Forums

Go Back   Microsoft Office Forums > >

Reply
 
Thread Tools Display Modes
  #1  
Old 10-22-2015, 01:33 AM
PRA007's Avatar
PRA007 PRA007 is offline Extract Data From Text file based on Pattern Windows 7 32bit Extract Data From Text file based on Pattern Office 2010 32bit
Competent Performer
Extract Data From Text file based on Pattern
 
Join Date: Dec 2014
Location: Ahmedabad, Gujrat, India
Posts: 145
PRA007 is on a distinguished road
Default Extract Data From Text file based on Pattern

I have Table like this in word



In this table there are tow types of row
Row having multiple numbers
Row having single number

In both the case I would like to find number in first line only

I have a large .txt file (800 mb) containing text having formate.

8232394 06774483 N 19850910 19870818 19910818 EXP.
8309716 06774483 N 19850910 19870818 19910319 REM.
4687262 06908244 N 19860917 19870818 19990815 EXP.
4687262 06908244 N 19860917 19870818 19990309 REM.
4687262 06908244 N 19860917 19870818 19950221 M184
4687262 06908244 N 19860917 19870818 19910108 M173
4687262 06908244 N 19860917 19870818 19880802 ASPN
4687263 06868897 N 19860527 19870818 19990128 M185
4687263 06868897 N 19860527 19870818 19950509 RMPN
4687263 06868897 N 19860527 19870818 19950509 ASPN
4687263 06868897 N 19860527 19870818 19950119 M184
4687263 06868897 N 19860527 19870818 19910311 ASPN
4687263 06868897 N 19860527 19870818 19910124 M173
4687264 06882047 N 19860703 19870818 19990815 EXP.
4687264 06882047 N 19860703 19870818 19990309 REM.
4687264 06882047 N 19860703 19870818 19950503 RMPN
4687264 06882047 N 19860703 19870818 19950503 ASPN
4687264 06882047 N 19860703 19870818 19950119 M184
4687264 06882047 N 19860703 19870818 19910311 ASPN


RE45781 14176526 N 20140210 20151027 20150929 ASPN
RE45786 14260890 N 20140424 20151027 20150929 ASPN
RE45790 14454285 Y 20140807 20151103 20151008 ASPN
RE45793 13445791 N 20120412 20151103 20151006 ASPN

there are three important column here
Column I is having Patent Numbers
Column 5 is having event Date
column 6 is having event status code

I want to search in ms word US [0-9]{7} or US [0-9,]{9} or US RE[0-9]{5}
and use number after US (without coma in second case) to crawl in .txt file and extract data from last column in space separated file.

for example if my word table is having US 8,309,716 B2 I would like to search corresponding number i.e. 8309716 from .txt file and extract data at last column for example in this case EXP.

.txt file sometimes contain Numbers somewhere else, I just want to search in column 1 only. in case of multiple result during search in column 1, I would like to keep search having latest column 5 date.

I want to extract event code from the text file to Word table column 3

After finding event code I want to replace it with corresponding event from another word document having following two column



Finally the table should look like this.

Reply With Quote
  #2  
Old 10-22-2015, 01:54 PM
macropod's Avatar
macropod macropod is offline Extract Data From Text file based on Pattern Windows 7 64bit Extract Data From Text file based on Pattern Office 2010 32bit
Administrator
 
Join Date: Dec 2010
Location: Canberra, Australia
Posts: 21,956
macropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond repute
Default

An 800MB text file is way too large to process - I'm not even sure Word could open it. The published limit is 512MB with no more than 32MB of text. I suggest you use another app to filter the data to eliminate everything that doesn't match your US/RE pattern. You may even need to break those up into smaller chunks.
__________________
Cheers,
Paul Edstein
[Fmr MS MVP - Word]
Reply With Quote
  #3  
Old 10-23-2015, 09:57 AM
PRA007's Avatar
PRA007 PRA007 is offline Extract Data From Text file based on Pattern Windows 7 32bit Extract Data From Text file based on Pattern Office 2010 32bit
Competent Performer
Extract Data From Text file based on Pattern
 
Join Date: Dec 2014
Location: Ahmedabad, Gujrat, India
Posts: 145
PRA007 is on a distinguished road
Default Extract Data from Text file based on Pattern Match

I was able to cut the large file to small based on the pattern. I used word wild card search to find patent numbers and made a txt file neming small

I found answers from superuser to cut the file for what I want

http://superuser.com/questions/99066...search-pattern
grep -f small_list.txt big_file.txt

Now I have made the resultant file in excel having only 3 column
1st column containing Patent numbers, second for event date and last for event code.
Is it possible with small files to do the above task

I also created word document having table with same as excel file if thats easy to be used.
Reply With Quote
  #4  
Old 10-26-2015, 04:20 AM
PRA007's Avatar
PRA007 PRA007 is offline Extract Data From Text file based on Pattern Windows 7 32bit Extract Data From Text file based on Pattern Office 2010 32bit
Competent Performer
Extract Data From Text file based on Pattern
 
Join Date: Dec 2014
Location: Ahmedabad, Gujrat, India
Posts: 145
PRA007 is on a distinguished road
Default Re: Extract Data From Text file based on Pattern.

I have Table like this in word (main.docx)


My search cell may contain multiple numbers.
I would like to search for first line only in every case.

I have another database word/Excel (database1.docx/xlsx) file containing following table.

4818798 19970406 EXP.
4818798 19961112 REM.
4818798 19921116 ASPN
4818798 19920810 M183
4894153 19980118 EXP.
4894153 19970826 REM.
4894153 19930715 M283
4896965 20020130 EXP.
4896965 20010821 REM.
4896965 19970203 M184
4896965 19930630 M183
5021423 19990606 EXP.
5021423 19981229 REM.
5021423 19941117 M183
5107211 19960421 EXP.
5107211 19951128 REM.
5117870 19960602 EXP.
5117870 19960109 REM.
5185937 20010218 EXP.

Column one contains patent nubers
Column two contains event date in yyyymmdd formate
Column three contains event code in short formate.

I Want to search (i): US ([3-9][0-9,RE]{7}) in my word document and search for corresponding \1 result in database1 file and put column 3 ^13 column 2 data in (i+1) cell.

In case of multiple result from database file I would like to consider event code with latest event date. Generally it is first entry but if possible I would like to consider the date as filter as in future USPTO may change the formate.

After putting the event code in target cell (i+1) I would like to replace it with real event from another two column database2 word/excel file(database2.docx/xlsx. I just want to find and replace Event code with real events as in table.


Files can be found at:
https://sites.google.com/site/rtsk2015/fo

I am not able to attach it in my office as it is forbidden because of cyberoam.

Sorry for setting up strict requrement as I am at learning stage.

Last edited by PRA007; 10-26-2015 at 10:38 PM. Reason: Editing Requested by admin
Reply With Quote
  #5  
Old 10-26-2015, 06:56 PM
macropod's Avatar
macropod macropod is offline Extract Data From Text file based on Pattern Windows 7 64bit Extract Data From Text file based on Pattern Office 2010 32bit
Administrator
 
Join Date: Dec 2010
Location: Canberra, Australia
Posts: 21,956
macropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond repute
Default

It would be helpful if you posted an actual Word document and an actual Excel worksheet, rather than expecting others to recreate what they think you might be working with, only to find out it's something different.
__________________
Cheers,
Paul Edstein
[Fmr MS MVP - Word]
Reply With Quote
  #6  
Old 10-26-2015, 10:24 PM
PRA007's Avatar
PRA007 PRA007 is offline Extract Data From Text file based on Pattern Windows 7 32bit Extract Data From Text file based on Pattern Office 2010 32bit
Competent Performer
Extract Data From Text file based on Pattern
 
Join Date: Dec 2014
Location: Ahmedabad, Gujrat, India
Posts: 145
PRA007 is on a distinguished road
Default

Files are uploaded and link to file is edited in above post.
I am not able to upload files because of cyberoam in my office.

Files can be found at:
https://sites.google.com/site/rtsk2015/fo

Last edited by PRA007; 10-27-2015 at 03:06 AM.
Reply With Quote
  #7  
Old 10-27-2015, 03:57 AM
macropod's Avatar
macropod macropod is offline Extract Data From Text file based on Pattern Windows 7 64bit Extract Data From Text file based on Pattern Office 2010 32bit
Administrator
 
Join Date: Dec 2010
Location: Canberra, Australia
Posts: 21,956
macropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond repute
Default

Try:
Code:
Sub GetPatentStatus()
Application.ScreenUpdating = False
Dim Tbl As Table, i As Long, j As Long, k As Long, lRow As Long, ArrFnd
Dim xlApp As Object, xlWkBk As Object, xlWkSht As Object, xlRng As Object
Dim bStrt As Boolean, bFnd As Boolean, bOpen As Boolean, bBar As Boolean, bFit As Boolean
Dim StrTxt As String, StrWkBkNm As String, StrFnd As String, StrWkSht As String
'Word Find expressions
ArrFnd = Array("US [0-9]{7}", "US [0-9,]{9}", "US RE[0-9]{5}")
'Excel constants for use with late binding
Const xlCellTypeLastCell As Long = 11: Const xlValues As Long = -4163
Const xlWhole As Long = 1: Const xlByRows As Long = 1
'Excel workbook name & path
StrWkBkNm = "C:\Users\" & Environ("Username") & "\Downloads\Database1.xlsx"
'Excel worksheet name
StrWkSht = "Sheet1"
If Dir(StrWkBkNm) = "" Then
  MsgBox "Cannot find the designated workbook: " & StrWkBkNm, vbExclamation
  Exit Sub
End If
bStrt = False ' Flag to record if we start Excel, so we can close it later.
bOpen = False ' Flag to record if we open the workbook, so we can close it later.
' Test whether Excel is already running.
On Error Resume Next
Set xlApp = GetObject(, "Excel.Application")
'Start Excel if it isn't running
If xlApp Is Nothing Then
  Set xlApp = CreateObject("Excel.Application")
  If xlApp Is Nothing Then
    MsgBox "Can't start Excel.", vbExclamation
    Exit Sub
  End If
  ' Record that we've started Excel.
  bStrt = True
End If
On Error GoTo 0
With xlApp
  'Hide our Excel session if we started it
  If bStrt = True Then .Visible = False
  'Check if the workbook is open.
  For Each xlWkBk In .Workbooks
    If xlWkBk.FullName = StrWkBkNm Then ' It's open
      Set xlWkBk = xlWkBk
      bOpen = True
      Exit For
    End If
  Next
  ' If not open by the current user.
  If bOpen = False Then
    ' Check if another user has it open.
    If IsFileLocked(StrWkBkNm) = True Then
      ' Report and exit if true
      MsgBox "The Excel workbook is in use." & vbCr & "Please try again later.", vbExclamation, "File in use"
      GoTo ErrExit
    End If
    ' The file is available, so open it.
    Set xlWkBk = .Workbooks.Open(FileName:=StrWkBkNm)
    If xlWkBk Is Nothing Then
      MsgBox "Cannot open:" & vbCr & StrWkBkNm, vbExclamation
      GoTo ErrExit
    End If
  End If
  On Error Resume Next
  Set xlWkSht = xlWkBk.Sheets(StrWkSht)
  On Error GoTo 0
  If xlWkSht Is Nothing Then
    MsgBox "Cannot find the worksheet named: '" & StrWkSht & "' in:" & vbCr & StrWkBkNm, vbExclamation
    GoTo ErrExit
  End If
  With xlWkSht.UsedRange
  lRow = .Cells.SpecialCells(xlCellTypeLastCell).Row
  End With
End With
' Store current Status Bar status, then switch on
bBar = Application.DisplayStatusBar
Application.DisplayStatusBar = True
With ActiveDocument
  For Each Tbl In .Tables
    With Tbl
      bFit = .AllowAutoFit
      .AllowAutoFit = False
      j = .Rows.Count
      For i = 1 To j
        Application.StatusBar = "Processing row " & i & " of " & j
        StrTxt = ""
        With .Cell(i, 2).Range.Paragraphs(1).Range
          'Find the references
          For k = 0 To UBound(ArrFnd)
            StrFnd = ArrFnd(k): bFnd = False
            With .Find
              .ClearFormatting
              .Replacement.ClearFormatting
              .Format = False
              .Forward = True
              .Wrap = wdFindStop
              .MatchWildcards = True
              .Text = StrFnd
              .Execute
            End With
            If .Find.Found Then
              StrTxt = Split(.Text, " ")(1): bFnd = True
              With xlWkSht
                Set xlRng = .Range("A1:A" & lRow).Find(What:=StrTxt, LookIn:=xlValues, _
                  LookAt:=xlWhole, SearchOrder:=xlByRows, MatchCase:=True, SearchFormat:=False)
                If Not xlRng Is Nothing Then
                  StrTxt = .Cells(xlRng.Row, 3).Text
                Else
                  StrTxt = ""
                End If
              End With
            End If
            If bFnd = True Then Exit For
          Next
        End With
        With .Cell(i, 3).Range
          If Len(.Text) > 2 Then
            .InsertBefore StrTxt & vbCr
          Else
            .InsertBefore StrTxt
          End If
        End With
      Next
      .AllowAutoFit = bFit
    End With
  Next
End With
' Clear the Status Bar
Application.StatusBar = False
' Restore original Status Bar status
Application.DisplayStatusBar = bBar
MsgBox "Finished!", vbInformation
ErrExit:
If Not xlWkBk Is Nothing Then If bOpen = False Then xlWkBk.Close
If Not xlApp Is Nothing Then If bStrt = True Then xlApp.Quit
Set xlRng = Nothing: Set xlWkSht = Nothing: Set xlWkBk = Nothing: Set xlApp = Nothing
Application.ScreenUpdating = True
End Sub
 
 
Function IsFileLocked(strFileName As String) As Boolean
  On Error Resume Next
  Open strFileName For Binary Access Read Write Lock Read Write As #1
  Close #1
  IsFileLocked = Err.Number
  Err.Clear
End Function
As coded, the macro looks in your Downloads folder for the Database1.xlsx file and Sheet1 in that workbook. You can substitute another folder, Excel file or worksheet if you prefer.
__________________
Cheers,
Paul Edstein
[Fmr MS MVP - Word]
Reply With Quote
  #8  
Old 10-28-2015, 03:16 AM
PRA007's Avatar
PRA007 PRA007 is offline Extract Data From Text file based on Pattern Windows 7 32bit Extract Data From Text file based on Pattern Office 2010 32bit
Competent Performer
Extract Data From Text file based on Pattern
 
Join Date: Dec 2014
Location: Ahmedabad, Gujrat, India
Posts: 145
PRA007 is on a distinguished road
Default

WORKED.... still throws error '91' on some occasion. I will work on solution one given by you. Can we make code not to disturb content already present in target cell. I mean can we paste the resulting content to be pasted before the text present in the content.

I made Big mistake in pinning image for final.docx as from my office while pasting the images it is not visible. The image shown in above question for final.docx is actually of database 2 word file. Any way after finding event code I may use simple recorded macro to convert event code to actual event without bothering about database 2 file.
Reply With Quote
  #9  
Old 10-28-2015, 03:34 AM
macropod's Avatar
macropod macropod is offline Extract Data From Text file based on Pattern Windows 7 64bit Extract Data From Text file based on Pattern Office 2010 32bit
Administrator
 
Join Date: Dec 2010
Location: Canberra, Australia
Posts: 21,956
macropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond repute
Default

You could try changing:
.Cell(i, 3).Range.Text = StrTxt
to:
.Cell(i, 3).Range.InsertBefore StrTxt & vbcr
__________________
Cheers,
Paul Edstein
[Fmr MS MVP - Word]
Reply With Quote
  #10  
Old 10-30-2015, 11:31 PM
PRA007's Avatar
PRA007 PRA007 is offline Extract Data From Text file based on Pattern Windows 7 32bit Extract Data From Text file based on Pattern Office 2010 32bit
Competent Performer
Extract Data From Text file based on Pattern
 
Join Date: Dec 2014
Location: Ahmedabad, Gujrat, India
Posts: 145
PRA007 is on a distinguished road
Default

After trial and error I could find why my error '91' is still there.

Coincidently document I provided contains all the occurrences of word find in the corresponding database files.

In my day to day word there may be some US numbers that may not be present in excel document.

In those cases I want to go ahead without adding anything.

Is there any way to skip if the search not foud
Reply With Quote
  #11  
Old 10-30-2015, 11:49 PM
macropod's Avatar
macropod macropod is offline Extract Data From Text file based on Pattern Windows 7 64bit Extract Data From Text file based on Pattern Office 2010 32bit
Administrator
 
Join Date: Dec 2010
Location: Canberra, Australia
Posts: 21,956
macropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond repute
Default

Quote:
Originally Posted by PRA007 View Post
After trial and error I could find why my error '91' is still there.
Unless you can provide details of when the error occurs (e.g. what is the first row in the Word document that doesn't get updated & does the column 2 value exist in the workbook), there's not much I can do about that.
Quote:
Originally Posted by PRA007 View Post
Is there any way to skip if the search not foud
That's what the code already does.
__________________
Cheers,
Paul Edstein
[Fmr MS MVP - Word]
Reply With Quote
  #12  
Old 10-31-2015, 02:01 AM
PRA007's Avatar
PRA007 PRA007 is offline Extract Data From Text file based on Pattern Windows 7 32bit Extract Data From Text file based on Pattern Office 2010 32bit
Competent Performer
Extract Data From Text file based on Pattern
 
Join Date: Dec 2014
Location: Ahmedabad, Gujrat, India
Posts: 145
PRA007 is on a distinguished road
Default

It worked perfect for first sample files I provided.

Please forgive me for providing new sample files as it seems the only solution for now.

In my current sample file there are all the elements (or say possible problems).

For my current file till raw 4 your code works fine after that when it comes to raw 5 it shows error. I thought error is because US RE44048 is not present in the database 1 file.

I don't know what the problem is but it shows
I got temporary solution.

I recorded macro to remove US [0-9]{5} and US [0-9]{8,11} in a temporary
It works fine for now. I Replaced column 3 then with original document.
Code:
Run-time Error '91':
Object Variable or with block variable not set.
I tried to add RE44048 in database1 file (last raw) still problem persist. It stopped without showing any error.

I don't have any clue whats going on now.

Please find my files at:
https://sites.google.com/site/rtsk2015/fo
Download: Main.docx
Download: Database1.xlsx

Last edited by PRA007; 10-31-2015 at 04:40 AM.
Reply With Quote
  #13  
Old 10-31-2015, 01:24 PM
macropod's Avatar
macropod macropod is offline Extract Data From Text file based on Pattern Windows 7 64bit Extract Data From Text file based on Pattern Office 2010 32bit
Administrator
 
Join Date: Dec 2010
Location: Canberra, Australia
Posts: 21,956
macropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond reputemacropod has a reputation beyond repute
Default

I've updated the original macro.
__________________
Cheers,
Paul Edstein
[Fmr MS MVP - Word]
Reply With Quote
  #14  
Old 11-01-2015, 11:20 PM
PRA007's Avatar
PRA007 PRA007 is offline Extract Data From Text file based on Pattern Windows 7 32bit Extract Data From Text file based on Pattern Office 2010 32bit
Competent Performer
Extract Data From Text file based on Pattern
 
Join Date: Dec 2014
Location: Ahmedabad, Gujrat, India
Posts: 145
PRA007 is on a distinguished road
Default

That is Huge Help.
I have Now complete solution (from MSoffice forum and superuser.com) how to get Patent fees status from USPTO.
Reply With Quote
Reply

Thread Tools
Display Modes


Similar Threads
Thread Thread Starter Forum Replies Last Post
Extract Data From Text file based on Pattern How to Extract Data from table based on pattern. PRA007 Word Tables 4 03-17-2015 11:05 PM
Retrieving data from data base based on text selection capitala PowerPoint 0 12-10-2014 08:10 AM
Extract Data From Text file based on Pattern Inserting text from one word file into another based on an excel input jmaxcy Excel 14 11-01-2013 04:07 PM
Inserting text from one word file into another based on an excel input jmaxcy Word 3 11-01-2013 01:26 AM
Is there a way to do this? (automatically enter text based on form data) TIKKI555 Word 0 05-26-2010 09:21 AM

Other Forums: Access Forums

All times are GMT -7. The time now is 12:18 PM.


Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2024, vBulletin Solutions Inc.
Search Engine Optimisation provided by DragonByte SEO (Lite) - vBulletin Mods & Addons Copyright © 2024 DragonByte Technologies Ltd.
MSOfficeForums.com is not affiliated with Microsoft