Microsoft Office Forums

Help in trying to identify class string in each stage of URL

#1
05-19-2024, 05:00 AM
Rasec (Windows 10, Office 2021)

Hi everyone, maybe someone could help me!

I'm trying to identify the class name of the element that contains each "toSearch" string on each URL. My logic is to loop through every link on the page and look for text that is present at the specified level.

With the code below I can identify the class names for level 1 and level 2, but it does not work for level 3 and level 4. The other issue is that some cities have only one location; in that case level 4 exists but level 3 doesn't.

So could someone show me how to identify the class names for level 3 and level 4, taking into account the cases where there are only three levels (level 1, level 2, level 4)? Also, is there a way to give only the first URL as input and have the macro work out the other three as needed at each stage?

Level1 = Name of the state
Level2 = Name of the city
Level3 = Some text (location) in a link that is present for each city
Level4 = The street address (plain text, not a link)

Thanks in advance

Code:
Sub GetClass()
Dim url1 As String, url2 As String, url3 As String, url4 As String
Dim toSearch1 As String, toSearch2 As String, toSearch3 As String, toSearch4 As String
Dim HTMLDoc As New HTMLDocument


    'URL levels
    url1 = "https://locations.bojangles.com/"
    url2 = "https://locations.bojangles.com/al.html"
    url3 = "https://locations.bojangles.com/al/huntsville.html"
    url4 = "https://locations.bojangles.com/al/huntsville/11375-south-memorial-pkwy.html"
    
    'Text to search in each level
    toSearch1 = "Alabama"
    toSearch2 = "Huntsville"
    toSearch3 = "South Memorial Pkwy"
    toSearch4 = "11375 South Memorial Pkwy"
    
    'Print className for each level
    Call LoopElements(url1, toSearch1, "Level1")
    Call LoopElements(url2, toSearch2, "Level2")
    Call LoopElements(url2, toSearch3, "Level3")
    Call LoopElements(url4, toSearch4, "Level4")

End Sub

Function LoopElements(url As String, toSearch As String, level As String)
Dim HTMLDoc As New HTMLDocument
Dim links As Object
Dim i As Integer

    With New ServerXMLHTTP60
        .Open "Get", url, False
        .send
        HTMLDoc.body.innerHTML = .responseText
    End With
        
    Set links = HTMLDoc.body.getElementsByTagName("a")
    
    With links
        For i = 0 To .Length - 1
            If .Item(i).innerText Like "*" & toSearch & "*" Then
                Debug.Print level & " ClassName: " & .Item(i).className
            End If
        Next i
    End With

End Function
#2
05-20-2024, 04:49 AM
Guessed (Windows 10, Office 2016)

I'm not sure what you are asking on the levels question but this variation returns the values you found along the way so you can derive the subsequent URLs. I suspect getting an empty string back will answer your levels question.
Code:
Sub GetClass()
  Dim url1 As String, url2 As String, url3 As String, url4 As String
  Dim toSearch1 As String, toSearch2 As String, toSearch3 As String, toSearch4 As String
  Dim HTMLDoc As New HTMLDocument

    'URL levels
    url1 = "https://locations.bojangles.com/"
    
    'Text to search in each level
    toSearch1 = "Alabama"
    toSearch2 = "Huntsville"
    toSearch3 = "South Memorial Pkwy"
    toSearch4 = "11375 South Memorial Pkwy"
    
    'Print className for each level and build the next URL from the href that was found
    url2 = url1 & LoopElements(url1, toSearch1, "Level1")
    url3 = url1 & LoopElements(url2, toSearch2, "Level2")
    url4 = url1 & LoopElements(url3, toSearch3, "Level3")
    url4 = Replace(url4, "../", "")
    Call LoopElements(url4, toSearch4, "Level4")

End Sub

Function LoopElements(url As String, toSearch As String, level As String) As String
  Dim HTMLDoc As New HTMLDocument
  Dim links As Object
  Dim i As Integer

  Debug.Print "Now searching: " & url, toSearch, level
  
  With New ServerXMLHTTP60
      .Open "Get", url, False
      .send
      HTMLDoc.body.innerHTML = .responseText
  End With
      
  Set links = HTMLDoc.body.getElementsByTagName("a")
  
  With links
    For i = 0 To .Length - 1
      If .Item(i).innerText Like "*" & toSearch & "*" Then
        Debug.Print "", .Item(i), .Item(i).innerText
        Debug.Print "", level & " ClassName: " & .Item(i).className
        LoopElements = Split(.Item(i), ":")(1)
      End If
    Next i
  End With
End Function
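For the level 4 case, where the address is plain text rather than a link, one possible variation is to search every element instead of only the <a> tags and report the innermost element whose text contains the search string. This is only a minimal sketch and has not been tested against the live site. FindInnermostClass is a hypothetical helper name, it relies on the Microsoft HTML Object Library reference that HTMLDocument already requires, and it assumes the whole address sits inside a single element.

```vba
'Hypothetical helper: returns the className of the first innermost element
'whose innerText contains toSearch, or "" if nothing matches.
Function FindInnermostClass(doc As HTMLDocument, toSearch As String) As String
  Dim allEls As Object, el As Object
  Dim i As Long, j As Long
  Dim childMatches As Boolean

  'All elements under <body>, in document order (parents before children)
  Set allEls = doc.body.getElementsByTagName("*")

  For i = 0 To allEls.Length - 1
    Set el = allEls.Item(i)
    If InStr(1, el.innerText, toSearch, vbTextCompare) > 0 Then
      'A parent's innerText includes its children's text, so keep
      'descending until no direct child contains the string
      childMatches = False
      For j = 0 To el.Children.Length - 1
        If InStr(1, el.Children.Item(j).innerText, toSearch, vbTextCompare) > 0 Then
          childMatches = True
          Exit For
        End If
      Next j
      If Not childMatches Then
        FindInnermostClass = el.className
        Exit Function
      End If
    End If
  Next i
End Function
```

Inside LoopElements it could be called after the links loop, for example Debug.Print level & " ClassName: " & FindInnermostClass(HTMLDoc, toSearch). Note that an empty string is ambiguous: it means either that nothing matched or that the matching element has no class attribute.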
__________________
Andrew Lockton
Chrysalis Design, Melbourne Australia
#3
05-20-2024, 06:08 AM
Rasec (Windows 10, Office 2021)

Quote:
Originally Posted by Guessed
I'm not sure what you are asking on the levels question but this variation returns the values you found along the way so you can derive the subsequent URLs. I suspect getting an empty string back will answer your levels question.

Thanks for your help, it works better now. My issue with the levels is this:

This test is against the city "Huntsville", which has 4 locations, so getting the class of the address means drilling down to the 4th level.

Level1 = state = Alabama
Level2 = city = Huntsville
Level3 = location = 11375 South Memorial Pkwy (here the anchor text is an address, but it is a link)
Level4 = address = 11375 South Memorial Pkwy (the actual address, which is plain text and not a link <a></a>)

Now if we select Albertville instead of Huntsville, there is only one location and only 3 clicks are needed to reach the address: click the state Alabama (level 1), then click the city Albertville (level 2), which takes us directly to the same kind of page that appears at the 4th level for Huntsville. Only 3 levels exist (1, 2 and 4).

So, how do I get the class name when my input is a city that has only one location (3 levels)? The macro should handle both cases, 3 and 4 levels.

And how do I get the class name at the 4th level (url4)? Nothing is printed there because the address is text, not a link.

Thanks again
#4
05-20-2024, 07:39 AM
Guessed (Windows 10, Office 2016)

OK, now I understand what you are saying about levels, but I don't see the point of trying to apply that logic. If you are working with hard-coded search parameters (toSearchX), then you already know all four levels and the code is just revealing something you already know.

Regardless, looking at this line shows you have a key piece of info in the innerText value:
Code:
Debug.Print "", .Item(i), .Item(i).innerText
The result in your Immediate window shows the count of found items, e.g.
Code:
about:al.html Alabama(33)
about:al/huntsville.html    Huntsville(4)
I would expect that if the brackets show (1), the result has only one possible location.
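As a rough sketch of that idea, the count could be pulled out of the link text with plain string functions. LocationCount is a hypothetical helper name, and it assumes the link text ends with a number in round brackets as in the output above. Whether a single-location city actually shows "(1)" on the site is not confirmed here.

```vba
'Hypothetical helper: returns the number inside the last pair of round
'brackets in linkText, or -1 if there is no bracketed number.
Function LocationCount(linkText As String) As Long
  Dim openPos As Long, closePos As Long
  Dim inner As String

  LocationCount = -1

  'Find the last ")" and the "(" that precedes it
  closePos = InStrRev(linkText, ")")
  If closePos = 0 Then Exit Function
  openPos = InStrRev(linkText, "(", closePos)
  If openPos = 0 Then Exit Function

  inner = Trim$(Mid$(linkText, openPos + 1, closePos - openPos - 1))

  'Accept only non-empty, digits-only content
  If Len(inner) > 0 And Not (inner Like "*[!0-9]*") Then
    LocationCount = CLng(inner)
  End If
End Function
```

With this, LocationCount("Huntsville(4)") returns 4 and LocationCount("Huntsville") returns -1. If the count for a city comes back as 1, the macro could treat the page behind that link as the address page and skip the level 3 search.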
__________________
Andrew Lockton
Chrysalis Design, Melbourne Australia
#5
05-20-2024, 02:32 PM
Rasec (Windows 10, Office 2021)

Yes, when the state or city shows "(number)", that is the number of cities in that state or locations in that city.

Let me explain. There is already a macro (macro 2) that needs the class of each level as manual user input; with only those inputs it extracts the street addresses. I'm using this logic because I'm trying to build a more "generic" macro (macro 1) that could work for a few other sites with a similar structure (the same levels), taking as input only plain text that appears on the website (state, city, location, address) and no HTML. Macro 1 would extract the classes and feed them to macro 2, and macro 2 would then get the text info needed.

One of the goals is that the user supplies only text, so that anyone without knowledge of HTML could look at the desired website and type the text for state, city, location and address into 4 worksheet cells.

I know each site needs custom code when web scraping, but my idea is to take advantage of the visual similarities between some websites. I hope that makes sense.

Regards
All times are GMT -7.

