Extract Internal and External Links
Mass Harvest Internal and Outbound Links from Webpages
This addon, included free with ScrapeBox, can extract the internal links, the external links, or both from a list of webpages. This is ideal if you want a count of the number of links on every page of your website, or if you need to extract all the outbound links, for example to scan them with the ScrapeBox Malware Filter addon and ensure you are not linking to any bad pages from your website. It can also be used to collect all the internal links on your website and then create an .xml or .html sitemap, which ScrapeBox can generate in seconds.
Once installed, you can load a list of domains harvested in ScrapeBox or load domains from a file, then set how many concurrent threads you wish to use. The addon is multi-threaded and can retrieve the internal and external links for hundreds of domains per minute. Once the results are fetched, you can export the data to a plain text file for use in ScrapeBox or other software.
Google and other search engines generally recommend not creating more than 100 links per page for usability, and not linking to bad neighborhoods. This addon is useful for checking your site's pages on both counts. It may also help you discover hidden links in your site's source code which are not visible when viewing the page normally in a browser.
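To illustrate the idea of splitting a page's links into internal and external, here is a minimal Python sketch using only the standard library. It is not how ScrapeBox itself works; the names `LinkCollector` and `split_links`, the sample HTML and the example.com URLs are all hypothetical. Because it reads the raw source, it also picks up links hidden with CSS.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag found in the source, visible or not."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def split_links(page_url, html):
    """Return (internal, external) lists of absolute http(s) links.

    Assumption: a link is internal only if its hostname exactly
    matches the hostname of the page it was found on.
    """
    collector = LinkCollector()
    collector.feed(html)
    page_host = urlparse(page_url).hostname
    internal, external = [], []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript: and similar
        if parsed.hostname == page_host:
            internal.append(absolute)
        else:
            external.append(absolute)
    return internal, external


# Hypothetical sample page, including one link hidden with CSS.
sample_html = """
<a href="/about">About</a>
<a href="https://other.example.org/page">Partner</a>
<a href="contact.html" style="display:none">Hidden</a>
<a href="mailto:me@example.com">Mail</a>
"""
internal, external = split_links("https://example.com/index.html", sample_html)
print(len(internal), "internal,", len(external), "external")
```

Comparing `len(internal) + len(external)` against 100 gives a quick per-page check of the link count mentioned above.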
Link Extractor Settings
The Link Extractor Addon also has a number of settings available to customize the extracted URLs.
It can run multi-threaded with up to 1,000 connections, or single-threaded with a delay as a human emulation option to avoid being blocked by some web servers.
You can also skip overly long URLs, such as advertising URLs with large query strings, and set up blacklist and whitelist keywords.
There are even options to scrape or ignore https URLs, randomize your lists, and treat subdomains as either external or internal links.
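The filtering settings described above can be sketched in Python roughly as follows. This is only an illustration of the concepts, not the addon's actual logic: the function names, parameter names and default values are assumptions, and the subdomain check is a simple suffix match on the hostname.

```python
from urllib.parse import urlparse


def is_internal(url, site_host, subdomains_internal=True):
    """Decide whether url belongs to site_host.

    When subdomains_internal is True, blog.example.com counts as
    internal to example.com; otherwise only an exact host match does.
    """
    host = urlparse(url).hostname or ""
    if host == site_host:
        return True
    return subdomains_internal and host.endswith("." + site_host)


def keep_url(url, max_length=200, blacklist=(), whitelist=(), allow_https=True):
    """Apply length, https and keyword filters to a single URL."""
    if len(url) > max_length:
        return False  # e.g. long advertising URLs with big query strings
    if not allow_https and urlparse(url).scheme == "https":
        return False
    lowered = url.lower()
    if any(word in lowered for word in blacklist):
        return False
    # An empty whitelist means "accept everything not already rejected".
    if whitelist and not any(word in lowered for word in whitelist):
        return False
    return True
```

For example, `keep_url("https://example.com/ads/banner", blacklist=("ads",))` rejects the URL, while `is_internal("https://blog.example.com/post", "example.com", subdomains_internal=False)` treats the subdomain as external.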
Extracted Links
The Link Extractor Addon saves all extracted URLs to a text file in real time while it's running.
External links extracted from well-known sites like the Wall Street Journal, BBC, New York Times and similar websites can then be tested using the ScrapeBox Domain Availability Checker to find expired domains that are free to register and are linked from some of the most popular sites on the internet.
Some sites charge a large monthly fee for expired domain finding software; ScrapeBox can do the same thing for a low one-time payment.
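Before checking availability, a list of extracted external links usually needs to be reduced to a list of unique hostnames. The Python sketch below shows one simple way to do that; the function name `unique_hosts` is hypothetical, and it only strips a leading "www." rather than working out the registrable domain, which would require a public suffix list.

```python
from urllib.parse import urlparse


def unique_hosts(urls):
    """Return hostnames from urls, de-duplicated, in first-seen order."""
    seen = set()
    hosts = []
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]  # treat www.example.org and example.org as one
        if host and host not in seen:
            seen.add(host)
            hosts.append(host)
    return hosts
```

The resulting list can be saved one hostname per line to a plain text file for use in other tools.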
Link Extractor Tutorial
View our video tutorial showing the Internal and External Link Extractor in action. This is a free addon included with ScrapeBox, and is also compatible with our Automator Plugin.
We have hundreds of video tutorials for ScrapeBox.