![]() |
|
|
|
#1
|
|||
|
|||
|
I was hoping this could be done without using VBA. If VBA is the only way I can figure that out.
I have a spreadsheet that among other things has a list of text words and phrases. I would like to create a new list that lists all the unique words used in the original list with a 2nd col that shows how often that word is used and have it sorted by highest to lowest. If I have the list below 1. red widgets 2. blue widgets 3. red widgets that glow 4. buy widgets 6. blue and red widget repair (you count the blue and red in this) This would give widgets 4 red 3 blue 2 buy 1 repair 1 widget 1 glow 1 and 1 that 1 |
|
#2
|
||||
|
||||
|
Try the following. In columns B & C you should get sorted count of the words in column A.
The macro also has an exclusion list, so that various words and phrases can be excluded. That list (defined by the words & phrases in the StrExcl string variable) holds the words and phrases to be omitted from the concordance. Any phrases should be inserted into the exclusions list ahead of any of the single-word exclusions (so that conflicts don’t occur). The code also strips out trailing apostrophes, with the result that some possessive word forms may look a bit odd. Code:
Sub WordFrequencyCounter()
Application.ScreenUpdating = False
Dim StrWrds As String, StrTmp As String
Dim i As Long, j As Long, k As Long, l As Long, r As Long
'Define the exlusions list
Const StrExcl As String = "a,am,an,and,are,as,at,b,be,but,by,c,can,cm," & _
"d,did,do,does,e,eg,en,eq,etc,f,for,g,get,go,got,h,has,have," & _
"he,her,him,how,i,ie,if,in,into,is,it,its,j,k,l,m,me,mi," & _
"mm,my,n,na,nb,no,not,o,of,off,ok,on,one,or,our,out,p,q" & _
",r,re,s,she,so,t,the,their,them,they,this,t,to,u,v,via," & _
"vs,w,was,we,were,who,will,with,would,x,y,yd,you,your,z"
With ActiveSheet
r = .UsedRange.Range("A" & .Cells.SpecialCells(xlCellTypeLastCell).Row).Row
For i = 1 To r
StrTmp = Trim(.Cells(i, 1).Text)
If StrTmp <> "" Then StrWrds = StrWrds & " " & StrTmp
Next
'Strip out unwanted characters. Amongst others, hyphens and formatted single quotes are retained at this stage
For i = 1 To 255
Select Case i
'To strip out numbers, delete ', 58'
Case 1 To 35, 37 To 38, 40 To 43, 45, 47, 58 To 64, 91 To 96, 123 To 127, 129 To 144, 147 To 149, 152 To 162, 164, 166 To 171, 174 To 191, 247
StrWrds = Replace(StrWrds, Chr(i), " ")
End Select
Next
'Delete any periods or commas at the end of a word. Formatted numbers are thus retained.
StrWrds = Replace(Replace(Replace(Replace(StrWrds, Chr(44) & Chr(32), " "), Chr(44) & vbCr, " "), Chr(46) & Chr(32), " "), Chr(46) & vbCr, " ")
'Convert smart single quotes to plain single quotes & delete any at the start/end of a word
StrWrds = Replace(Replace(Replace(Replace(StrWrds, Chr(145), "'"), Chr(146), "'"), "' ", " "), " '", " ")
'Convert to lowercase
StrWrds = " " & LCase(Trim(StrWrds)) & " "
'Process the exclusions list
For i = 0 To UBound(Split(StrExcl, ","))
While InStr(StrWrds, " " & Split(StrExcl, ",")(i) & " ") > 0
StrWrds = Replace(StrWrds, " " & Split(StrExcl, ",")(i) & " ", " ")
Wend
Next
'Clean up any duplicate spaces
While InStr(StrWrds, " ") > 0
StrWrds = Replace(StrWrds, " ", " ")
Wend
StrWrds = " " & Trim(StrWrds) & " "
If Trim(StrWrds) = "" Then Exit Sub
j = UBound(Split(StrWrds, " "))
l = j: r = 1
.Cells(r, 2).Value = "Word"
.Cells(r, 3).Value = "Count"
For i = 1 To j
'Find how many occurences of each word there are in the column
StrTmp = Split(StrWrds, " ")(1)
While InStr(StrWrds, " " & StrTmp & " ") > 0
StrWrds = Replace(StrWrds, " " & StrTmp & " ", " ")
Wend
'Calculate the number of words replaced
k = l - UBound(Split(StrWrds, " "))
r = r + 1
'Update the output range
.Cells(r, 2).Value = StrTmp
.Cells(r, 3).Value = k
l = UBound(Split(StrWrds, " "))
If l = 1 Then Exit For
DoEvents
Next
With .Sort
With .SortFields
.Clear
.Add Key:=Range("C2:C" & r), SortOn:=xlSortOnValues, Order:=xlDescending, DataOption:=xlSortNormal
.Add Key:=Range("B2:B" & r), SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:=xlSortNormal
End With
.SetRange Range("B1:C" & r)
.Header = xlYes
.MatchCase = False
.Orientation = xlTopToBottom
.SortMethod = xlPinYin
.Apply
End With
End With
Application.ScreenUpdating = True
End Sub
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
|
#3
|
|||
|
|||
|
Cool thanks it worked. I was hoping if I could not get a formula I would figure it out myself. I supposed I shouldn't want to recreate the wheel if you have already done it.
|
|
#4
|
||||
|
||||
|
I very much doubt you'd find a useful formula for dealing with your requirements. The macro I posted was an adaptation of one I'd written for Word, so I didn't have to start from scratch.
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
|
| Thread Tools | |
| Display Modes | |
|
|
Similar Threads
|
||||
| Thread | Thread Starter | Forum | Replies | Last Post |
| Create a list that contains duplicate and unique names without blanks | Iced42 | Excel | 1 | 06-30-2015 04:50 PM |
| Excel list to Word with Page Break on new unique data | jeverett1 | Mail Merge | 3 | 07-30-2014 11:59 AM |
| 'Auto' fields, I want to create a unique ID... | kateattat | Word | 7 | 07-10-2012 05:10 AM |
| Have Outlook suggest words as you create/reply to an email | daveblack | Outlook | 0 | 04-05-2012 05:46 AM |
| How do you create a list similar to an itunes list? | hatemail13 | Excel | 1 | 08-06-2010 02:21 AM |