#1
Hi, Good Day!
Please assist with the following. I have a 1,500-page Word file full of text, and I want to create a table that lists the unique words in it. For this, I believe I need code that copies each word from the file into its own cell of a table in Word or Excel. I can then run other code, found online, that performs a case-sensitive 'Match Duplicates' function in Excel, leaving me with only the unique words. Many thanks.
#2
See: http://gregmaxey.com/word_tip_pages/...cy_report.html
It might take a considerably long time to process a file that large.
#3
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word]
#4
Hi, thanks for the 2 replies. I used the link from Macropod and it works on the small files I tested. For the large file, I will try running it overnight, because the PC jams if I run it now.
I don't need columns 2 and 3 of its output. Which lines should I omit from the code?
Last edited by Singh_Edm; 11-13-2015 at 08:46 PM. Reason: Removed the nested question and posted it separately.
#5
Quote:
Code:
    Sub Demo()
        Application.ScreenUpdating = False
        Dim StrIn As String, StrOut As String, StrTmp As String, StrExcl As String
        Dim i As Long, j As Long, k As Long, l As Long, Rng As Range
        'Define the exclusions list
        StrExcl = "a,am,an,and,are,as,at,b,be,but,by,c,can,cm,d,did," & _
                  "do,does,e,eg,en,eq,etc,f,for,g,get,go,got,h,has,have," & _
                  "he,her,him,how,i,ie,if,in,into,is,it,its,j,k,l,m,me," & _
                  "mi,mm,my,n,na,nb,no,not,o,of,off,ok,on,one,or,our,out," & _
                  "p,q,r,re,s,she,so,t,the,their,them,they,this,t,to,u,v," & _
                  "via,vs,w,was,we,were,who,will,with,would,x,y,yd,you,your,z"
        With ActiveDocument
            'Get the document's text
            StrIn = .Content.Text
            'Strip out unwanted characters. Amongst others, hyphens and formatted single quotes are retained at this stage
            For i = 1 To 255
                Select Case i
                    Case 1 To 35, 37 To 38, 40 To 43, 45, 47, 58 To 64, 91 To 96, 123 To 127, 129 To 144, 147 To 149, 152 To 162, 164, 166 To 171, 174 To 191, 247
                        StrIn = Replace(StrIn, Chr(i), " ")
                End Select
            Next
            'Delete any periods or commas at the end of a word. Formatted numbers are thus retained.
            StrIn = Replace(Replace(Replace(Replace(StrIn, Chr(44) & Chr(32), " "), Chr(44) & vbCr, " "), Chr(46) & Chr(32), " "), Chr(46) & vbCr, " ")
            'Convert smart single quotes to plain single quotes & delete any at the start/end of a word
            StrIn = Replace(Replace(Replace(Replace(StrIn, Chr(145), "'"), Chr(146), "'"), "' ", " "), " '", " ")
            'Convert to lowercase
            StrIn = " " & LCase(Trim(StrIn)) & " "
            'Process the exclusions list
            For i = 0 To UBound(Split(StrExcl, ","))
                While InStr(StrIn, " " & Split(StrExcl, ",")(i) & " ") > 0
                    StrIn = Replace(StrIn, " " & Split(StrExcl, ",")(i) & " ", " ")
                Wend
            Next
            'Clean up any duplicate spaces (the search string is two spaces, the replacement is one)
            While InStr(StrIn, "  ") > 0
                StrIn = Replace(StrIn, "  ", " ")
            Wend
            StrIn = " " & Trim(StrIn) & " "
            j = UBound(Split(StrIn, " "))
            l = j
            For i = 1 To j
                'Take the next word and remove all of its occurrences from the remaining text
                StrTmp = Split(StrIn, " ")(1)
                While InStr(StrIn, " " & StrTmp & " ") > 0
                    StrIn = Replace(StrIn, " " & StrTmp & " ", " ")
                Wend
                'Update the output string
                StrOut = StrOut & StrTmp & vbCr
                l = UBound(Split(StrIn, " "))
                If l = 1 Then Exit For
                DoEvents
            Next
            'Create the concordance table on a new last page
            Set Rng = .Range.Characters.Last
            With Rng
                .InsertAfter vbCr & Chr(12) & StrOut
                .Start = .Start + 2
                .ConvertToTable Separator:=vbTab, Numcolumns:=1
                .Tables(1).Sort Excludeheader:=False, FieldNumber:=1, _
                    SortFieldType:=wdSortFieldAlphanumeric, _
                    SortOrder:=wdSortOrderAscending, CaseSensitive:=False
            End With
        End With
        Application.ScreenUpdating = True
    End Sub
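
Note that the macro above converts everything to lowercase, whereas the original question asked for a case-sensitive list. The following is only a rough sketch of one possible alternative, not part of the macro above: it assumes a Windows PC (where `Scripting.Dictionary` is available), and the Sub name, the `PUNCT` constant and the choice of separators are all made up for illustration. Its cleanup is much cruder than the macro's (it will, for example, split formatted numbers such as 1,000), and it applies no exclusions list.

```vb
Sub ListUniqueWordsCaseSensitive()
    ' Hypothetical sketch: collects each distinct word, case-sensitively,
    ' in a late-bound Scripting.Dictionary and writes the list to a new document.
    Const PUNCT As String = ".,;:!?()[]{}"""   ' assumed punctuation set, ends with a double quote
    Dim Dict As Object, Words() As String, StrIn As String
    Dim i As Long, Sep As Variant

    Set Dict = CreateObject("Scripting.Dictionary")
    Dict.CompareMode = vbBinaryCompare   ' "Word" and "word" are kept as separate keys

    StrIn = ActiveDocument.Content.Text

    ' Assumption: paragraph marks, tabs, line breaks, page breaks and
    ' non-breaking spaces all count as word separators.
    For Each Sep In Array(vbCr, vbTab, vbLf, Chr(11), Chr(12), Chr(160))
        StrIn = Replace(StrIn, Sep, " ")
    Next

    ' Crude punctuation removal, one character at a time.
    For i = 1 To Len(PUNCT)
        StrIn = Replace(StrIn, Mid$(PUNCT, i, 1), " ")
    Next

    ' Each non-empty token becomes a dictionary key; repeats are ignored.
    Words = Split(StrIn, " ")
    For i = 0 To UBound(Words)
        If Len(Words(i)) > 0 Then Dict(Words(i)) = True
    Next

    ' One word per paragraph in a new, unsorted document.
    If Dict.Count > 0 Then
        Documents.Add.Range.Text = Join(Dict.Keys, vbCr)
    End If
End Sub
```

Because each word is looked up in the dictionary once instead of being stripped out of the whole text with repeated `Replace` calls, this approach will likely cope better with a very large file, though I have not timed it on one. The resulting list can then be converted to a table and sorted in the same way as in the macro above.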
#6
Now cross-posted at: http://www.mrexcel.com/forum/excel-q...-keep-1-a.html
For cross-posting etiquette, please read: http://www.excelguru.ca/content.php?184
Singh_Edm, you've been advised of the expected cross-posting etiquette before (https://www.msofficeforums.com/word/...html#post57406). If you make a habit of ignoring that etiquette, you're liable to find people will stop providing help.
#7
Hi Macropod
Thank you for alerting me. I should have deleted that question from here first, because this is a Word forum. I have now deleted it from my post in this thread. Can the post in the Excel forum be okayed now?