

Showing posts from January, 2017

Java Web Crawler using JSoup

A web crawler is a much-needed tool for many business applications. There are a few paid programs that accomplish this task, but I wanted to create a web crawler that is expandable. Currently it traverses the website and gathers the different links. Future updates will provide capabilities such as:

Checking broken image links
Gathering useful information from each page
Checking broken links
Checking for repetitive information

Each one of these is crucial for SEO, and now there's going to be an automated way to check for each.
To begin, download the website-crawler from
Create a project and copy the files in the src folder to your IDE. Open the file and change the websiteAddress property. That's it. Compile and run.
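To give a feel for what "traverse the website and gather the links" involves, here is a minimal sketch, not the project's actual code. The class LinkCrawlerSketch and the methods crawl and sameHost are hypothetical names made up for illustration; only websiteAddress and output.txt come from this post. Page fetching is passed in as a function so the sketch runs without a network connection. With JSoup, that function could be built from Jsoup.connect(url).get().select("a[href]") and Element.absUrl("href").

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch: names here are illustrative, not taken from the project.
class LinkCrawlerSketch {

    // Breadth-first traversal starting at websiteAddress.
    // fetchLinks returns the absolute URLs found on a given page.
    static Set<String> crawl(String websiteAddress,
                             Function<String, Collection<String>> fetchLinks) {
        String host = URI.create(websiteAddress).getHost();
        Set<String> seen = new LinkedHashSet<>(); // keeps discovery order, drops duplicates
        Deque<String> queue = new ArrayDeque<>();
        seen.add(websiteAddress);
        queue.add(websiteAddress);
        while (!queue.isEmpty()) {
            String page = queue.remove();
            for (String link : fetchLinks.apply(page)) {
                // Follow only links on the same host, and only once each.
                if (sameHost(link, host) && seen.add(link)) {
                    queue.add(link);
                }
            }
        }
        return seen;
    }

    // True when the link parses as a URI whose host matches the site's host.
    static boolean sameHost(String link, String host) {
        try {
            return host != null && host.equalsIgnoreCase(URI.create(link).getHost());
        } catch (IllegalArgumentException e) {
            return false; // malformed URL: skip it
        }
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for a real site (an assumption, so no network is needed).
        Map<String, List<String>> site = Map.of(
            "http://example.com/", List.of("http://example.com/about", "http://other.org/"),
            "http://example.com/about", List.of("http://example.com/"));
        Set<String> links = crawl("http://example.com/",
                                  url -> site.getOrDefault(url, List.of()));
        // Store one link per line in output.txt.
        Files.write(Path.of("output.txt"), links);
    }
}
```

Keeping the visited links in a set is what stops the crawler from looping forever when two pages link to each other, and the same-host check keeps it from wandering off to other websites.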
To walk through the program: it instantiates In, and the output.txt file is created so that the links can be stored in it. The URL is passed to the storeL…