
"before swearing at me."

Followed by "shitty" and "web crawlers" in all caps ^^ Someone ate a clown for breakfast, I see.

"No, we are not going to ban all the links from GitHub"

What? Slow down there -- why would you care about invalid links? Did you just say that you can't possibly allow users to post links unless you know they work for automatic crawlers, not just for human visitors? And someone else chimed in claiming you gave arguments? Heh.

Well, you make an attempt at one, with "we are not crawling in the sense of", and then refute it yourself with the bit you quoted: "web crawlers and other web robots". It's not called webscraper.txt; it's robots.txt, period.
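The point being argued here is that robots.txt addresses all automated agents, not only recursive crawlers. As a sketch of what honoring it looks like (the robots.txt rules and the "LinkCheckerBot" agent name below are made up for illustration), Python's standard library can parse a robots.txt and answer per-agent, per-path questions:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content -- not from any real site.
robots_txt = """\
User-agent: *
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# Any automated agent -- crawler or single-URL checker -- is expected
# to consult these rules before fetching.
print(rp.can_fetch("LinkCheckerBot", "https://example.com/private/page"))  # False
print(rp.can_fetch("LinkCheckerBot", "https://example.com/public/page"))   # True
```

Nothing in the file distinguishes "scrapers" from "crawlers"; the rules apply to whichever user agent asks.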

So how then should a website identify a rogue user agent? If you dress up like the slimy guys, you get the banhammer -- what do you expect? If you care so much about Facebook and Twitter "content" that it's worth being indistinguishable from attackers, then just cope with it. But don't pout at me; just eat what you ordered.

And what delusions about the internet? You just beat around the bush and then finish with that strawman? And what is an "imaginary delusion", by the way? The one you imagine I have? Now that's a Freudian slip if I ever saw one ^^

"Completely evil, wrong and backwards"... so... You're entitled to know the validity of links posted on your site, but website owners aren't allowed to care about their resources and who they offer them to? Who's deluded?



"Slow down there -- why would you care about invalid links?"

Because Stack Overflow is a site whose purpose is to answer questions. People may provide links when asking or answering a question, and those links may be important in understanding either the question or the answer. So invalid links degrade the value of the site.

What they're doing is fundamentally different from web crawling. Web crawlers are about discovering content: they start at a root and crawl outward to see what they can find, and one URL can spawn many more URLs to look at. Stack Overflow's link check, by contrast, starts with a known URL and simply sees whether that URL can be visited. It has one URL, and it visits only that one URL.
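The distinction described above -- one request to a known URL, no link extraction, no recursion -- can be sketched in a few lines. This is a hypothetical helper, not Stack Overflow's actual implementation; the `LinkCheckerBot/1.0` user-agent string is also made up:

```python
import urllib.request


def link_is_valid(url: str, timeout: float = 10.0) -> bool:
    """Check a single known URL. One HEAD request; no crawling, no
    discovery of further URLs. (Illustrative sketch only.)"""
    req = urllib.request.Request(
        url,
        method="HEAD",
        headers={"User-Agent": "LinkCheckerBot/1.0"},
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            # Treat 2xx/3xx as "link works"; 4xx/5xx raise or fail.
            return 200 <= resp.status < 400
    except OSError:
        # Covers URLError, connection refused, timeouts, DNS failures.
        return False
```

A crawler would instead parse the response body for more links and enqueue them; this function never reads the body at all.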





