Microsoft Office Forums

  #1  
11-10-2022, 09:15 PM
KenseyB (Windows 10, Office 2021)
Novice
Join Date: Nov 2022
Posts: 2
Deleting pictures, ads, social media links, etc. from news articles copy/pasted from the web

Would love any help on how to copy/paste articles like the ones in the links below into Word and then run a macro that only preserves the text of the news article and changes it to a standard format (such as Arial font in size 11 in the color black - with paragraphs and spacing all consistent). The articles online are full of pictures, ads, columns, links, etc. and I just need to preserve the actual content of the article for 100+ pages every day. Also including what I've written so far for the macro below the article links. Would appreciate any help - thank you so much.


Analysis: Silicon Valley's greatest minds misread pandemic demand. Now their employees are paying for it. | CNN Business

UN Expert Group Proposes Rules for Net Zero Commitments - ESG Today

https://apnews.com/article/challenge...94f142fea26126

5 facts drivers need to know about tech-enabled safety features




Sub Macro1()
    ' Reformat a pasted article: set all text to Arial 11 pt black with
    ' uniform paragraph spacing, then strip repeated spaces, page breaks,
    ' empty paragraphs, and inline graphics.
    Selection.WholeStory

    ' Character formatting for the whole document.
    With Selection.Font
        .Name = "Arial"
        .Size = 11
        .Color = wdColorBlack   ' black text, as requested above
        .Italic = False
        .Underline = wdUnderlineNone
        .UnderlineColor = wdColorAutomatic
        .StrikeThrough = False
        .DoubleStrikeThrough = False
        .Outline = False
        .Emboss = False
        .Shadow = False
        .Hidden = False
        .SmallCaps = False
        .AllCaps = False
        .Engrave = False
        .Superscript = False
        .Subscript = False
        .Spacing = 0
        .Scaling = 100
        .Position = 0
        .Animation = wdAnimationNone
        .Ligatures = wdLigaturesNone
        .NumberSpacing = wdNumberSpacingDefault
        .NumberForm = wdNumberFormDefault
        .StylisticSet = wdStylisticSetDefault
        .ContextualAlternates = 0
    End With

    ' Paragraph formatting: no indents, single spacing, 12 pt after each paragraph.
    With Selection.ParagraphFormat
        .LeftIndent = InchesToPoints(0)
        .RightIndent = InchesToPoints(0)
        .SpaceBefore = 0
        .SpaceBeforeAuto = False
        .SpaceAfter = 12
        .SpaceAfterAuto = False
        .LineSpacingRule = wdLineSpaceSingle
        .Alignment = wdAlignParagraphLeft
        .WidowControl = True
        .KeepWithNext = False
        .KeepTogether = False
        .PageBreakBefore = False
        .NoLineNumber = False
        .Hyphenation = True
        .FirstLineIndent = InchesToPoints(0)
        .CharacterUnitLeftIndent = 0
        .CharacterUnitRightIndent = 0
        .CharacterUnitFirstLineIndent = 0
        .LineUnitBefore = 0
        .LineUnitAfter = 0
        .MirrorIndents = False
        .TextboxTightWrap = wdTightNone
        .CollapsedByDefault = False
    End With

    ' Collapse runs of two or more spaces into one (wildcard search).
    ' wdFindContinue avoids the "continue searching?" prompt of wdFindAsk.
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    With Selection.Find
        .Text = "( ){2,}"
        .Replacement.Text = " "
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchAllWordForms = False
        .MatchSoundsLike = False
        .MatchWildcards = True
    End With
    Selection.Find.Execute Replace:=wdReplaceAll

    ' Remove manual page breaks.
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    With Selection.Find
        .Text = "^m"
        .Replacement.Text = ""
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchAllWordForms = False
        .MatchSoundsLike = False
        .MatchWildcards = True
    End With
    Selection.Find.Execute Replace:=wdReplaceAll

    ' Collapse runs of empty paragraphs. A single pass of "^p^p" -> "^p"
    ' leaves doubles behind in runs of three or more, so match the whole
    ' run with a wildcard search instead (^13 is the paragraph mark).
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    With Selection.Find
        .Text = "^13{2,}"
        .Replacement.Text = "^p"
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchAllWordForms = False
        .MatchSoundsLike = False
        .MatchWildcards = True
    End With
    Selection.Find.Execute Replace:=wdReplaceAll

    ' Remove graphics. "^g" matches inline graphics; floating shapes may
    ' need separate handling.
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    With Selection.Find
        .Text = "^g"
        .Replacement.Text = ""
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
End Sub
  #2  
11-11-2022, 01:01 PM
BrianHoard (Windows 10, Office 2019)
Advanced Beginner
Join Date: Jul 2022
Location: Haymarket, VA USA
Posts: 85

What about starting the process by using the Firefox browser with an ad blocker, such as uBlock Origin or Adblock Plus, and then hitting the Reader View button, which removes images and simplifies the page?
See screenshot.
Attached image: snap.png (73.9 KB)
  #3  
11-13-2022, 05:39 PM
KenseyB (Windows 10, Office 2021)
Novice
Join Date: Nov 2022
Posts: 2

This is helpful, thank you! Only thing is that it is my work computer, so I need to make sure I can do that on it. Will try. Thank you again!
  #4  
11-13-2022, 07:22 PM
BrianHoard (Windows 10, Office 2019)
Advanced Beginner
Join Date: Jul 2022
Location: Haymarket, VA USA
Posts: 85

If that fails, my next thought would be to save the webpage out first as a plain ol' text file. That alone would remove all styling and images. Of course, with 100-plus pages per day, that would definitely need to be automated.
I find myself recommending Python a lot for questions on this forum. Python has some great tools for web scraping and text processing. I'm not sure yet about doing it all from Word.
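
As a rough sketch of the Python route, here is one way to pull just the paragraph text out of a saved web page using only the standard library. The names ParagraphExtractor and extract_article_text are made up for this illustration, and the list of tags to skip is a guess. Real news sites vary a lot, so treat this as a starting point and not a finished scraper.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text of <p> elements, dropping scripts, styles, and page furniture."""

    # Assumption: containers that usually hold navigation, ads, or captions.
    SKIP = {"script", "style", "nav", "aside", "figure", "footer", "header"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.paragraphs = []
        self._buf = None      # pieces of the <p> currently open, or None
        self._skip_depth = 0  # > 0 while inside a skipped container

    def _flush(self):
        """Store the open paragraph (if any) with whitespace normalized."""
        if self._buf is not None:
            text = " ".join("".join(self._buf).split())
            if text:
                self.paragraphs.append(text)
            self._buf = None

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "p":
            self._flush()  # HTML allows <p> without a closing tag
            if self._skip_depth == 0:
                self._buf = []

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "p":
            self._flush()

    def handle_data(self, data):
        if self._buf is not None and self._skip_depth == 0:
            self._buf.append(data)


def extract_article_text(html):
    """Return the page's paragraph text, one blank line between paragraphs."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    parser._flush()
    return "\n\n".join(parser.paragraphs)
```

To use it, read a saved page into a string (for example with pathlib's Path.read_text on whatever file you saved from the browser) and pass it to extract_article_text. Link text is kept but the links themselves are dropped, and images never make it through because only text inside paragraphs is collected. The result is plain text that can be pasted into a Word document whose style is already set to the font and spacing you want.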
  #5  
11-14-2022, 10:37 AM
Italophile (Windows 11, Office 2021)
Expert
Join Date: Mar 2022
Posts: 554

Why not just use the Paste as Text Only option?

Make sure the paragraph you are pasting into has the formatting you require. Simplest to create or modify a style to the settings you want and apply that first.

All this can be done without using any code.
All times are GMT -7.

