Find missing links and server errors with Screaming Frog SEO Spider


Every time I push a significant feature set or prepare to publish a release, I run some simple checks to make sure I didn't forget to deploy a file or run a SQL script. A few months ago I was looking for a cheap web service to do some web page crawling, but I found Screaming Frog SEO Spider instead, and it's been useful from the start.

The free version supports up to 500 pages in a single crawl. It captures h1 and h2 tags, along with metadata, titles, images, and all sorts of other details that are useful for SEO work.

Here's a sample of some output from -

You can see that I had a few 500 errors. These are pages that would have failed to display due to a configuration issue or an unhandled exception in the server-side processing.
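If you'd rather pull the 500s out of an exported crawl than eyeball the grid, something like this rough sketch might do it. It assumes the export is a CSV with columns named "Address" and "Status Code"; those names are my assumption, so check them against your own export. The sample rows, including /Home, are made up for illustration.

```python
import csv
import io

def server_errors(csv_file, url_col="Address", status_col="Status Code"):
    """Return (url, status) pairs for rows whose status code is 5xx.

    The column names are assumptions about the export format.
    """
    errors = []
    for row in csv.DictReader(csv_file):
        status = (row.get(status_col) or "").strip()
        # Skip blank or non-numeric status cells instead of crashing.
        if status.isdigit() and 500 <= int(status) <= 599:
            errors.append((row[url_col], int(status)))
    return errors

# Made-up sample standing in for a real export file.
sample = io.StringIO(
    "Address,Status Code\n"
    "/Build/Files/ID,500\n"
    "/Build,500\n"
    "/Home,200\n"
)
errors = server_errors(sample)
for url, status in errors:
    print(status, url)
```

With a real export you would pass an open file handle instead of the in-memory sample.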

There are entries with URLs matching /Build/Files/ID; these are links to view all files shipped by a build. Looking into the issue, it turns out that in the zips for these three builds, every file sits inside a subfolder rather than at the root of the zip. So that's a new bug.

Then there's the URL that's selected: /Build. Build Keeper.NET is designed to require an ID parameter for the default action in the "Build" area of the site, because it needs to know which build the user wants to look at. Nothing in the site should ever link to /Build without a build ID.

This means that somewhere in my site, something is trying to link to a build but is failing to include the build's ID.

In the bottom panel of SEO Spider, I selected "In Links", which shows the pages that link to the selected 500'd page. It should be pretty clear that there's a link in /Build/Delete/ID that has a problem.
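The idea behind an "In Links" view is just the crawl's link graph turned around: instead of "which URLs does this page link to", you ask "which pages link to this URL". Here's a minimal sketch of that inversion. It is not how SEO Spider does it internally, and the pages, IDs, and the `build_in_links` helper are hypothetical.

```python
from collections import defaultdict

def build_in_links(out_links):
    """Invert a page -> linked URLs map into a URL -> linking pages map."""
    in_links = defaultdict(set)
    for page, targets in out_links.items():
        for target in targets:
            in_links[target].add(page)
    return in_links

# Hypothetical crawl data: each page and the URLs it links to.
out_links = {
    "/Build/Index/7": ["/Build/Delete/7", "/Build/Files/7"],
    "/Build/Delete/7": ["/Build"],
}
in_links = build_in_links(out_links)

# Which pages link to the URL that returned a 500?
print(sorted(in_links["/Build"]))
```

Looking up the failing URL in the inverted map points straight at the page that holds the bad link.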

When I go to the source of that page...

     <input type="submit" value="Delete" /> | @Html.ActionLink("Cancel", "Index")

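The Cancel link returns to /Build/Index (which is the details page), but it doesn't include the build's ID, so it ends up pointing at the bare /Build URL that returned the 500. The fix is to pass the build's ID as a route value in the ActionLink call.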

And Bingo was his nameo. Another bug.

Anyway, you can see how this might be a very useful tool to have in the box. The free version is useful, and for 99 bucks you can remove the 500-page limit.