DRKSpiderJava is a legacy, stand-alone website crawler developed by drkbugs to map website architecture and detect broken links. It crawls specified domains to analyze internal links, map links to external domains, and build structured page trees.
Core Features
Hierarchical Site Mapping: It visualizes the internal structure of a target website by generating a tree that maps out the distribution of pages.
Link Validation: It checks all discovered links, both internal ones and those pointing to external domains, and flags broken URLs.
Granular Crawl Controls: Users can configure specific limits, including maximum depth levels, custom URL exclusion lists, and options to strictly obey or ignore robots.txt definitions.
Global Content Searching: It features an optional in-memory storage setting, allowing users to run global search queries across all cached text content from the crawled site.
Data Exporting: Once a crawl completes, the results can be exported as a full list of page links, a structured sitemap, and a list of broken links.
Technical Limitations
As a legacy Java application, it does not natively export sitemaps in the modern .xml sitemap format. It works primarily as an analytical utility rather than as a modern SEO automation platform.
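Since the tool does not produce .xml sitemaps itself, one possible workaround is to convert an exported page list afterwards. The following is only a minimal sketch: it assumes the export is a plain text file with one URL per line (the actual export layout of DRKSpiderJava is not described here), and the class and method names (SitemapSketch, toSitemapXml, escapeXml) are hypothetical and not part of the tool. It wraps each URL in the urlset/url/loc structure of the sitemaps.org protocol and entity-escapes the characters that protocol requires.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper, NOT part of DRKSpiderJava: converts a list of page
// URLs into a sitemaps.org-style XML document.
class SitemapSketch {

    // Entity-escape the five characters the sitemap protocol requires
    // to be escaped inside <loc> values. '&' must be replaced first.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    // Build the XML document; blank lines in the input are skipped.
    static String toSitemapXml(List<String> urls) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n");
        for (String url : urls) {
            String trimmed = url.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            sb.append("  <url><loc>")
              .append(escapeXml(trimmed))
              .append("</loc></url>\n");
        }
        sb.append("</urlset>\n");
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the exported URL list
        // (assumed format: one URL per line, UTF-8 text).
        List<String> urls = Files.readAllLines(Paths.get(args[0]));
        System.out.print(toSitemapXml(urls));
    }
}
```

Under these assumptions it could be run as `java SitemapSketch exported-links.txt > sitemap.xml`, where exported-links.txt is a placeholder file name. Optional sitemap fields such as lastmod are left out because a plain URL list does not carry that information.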
DRKSpiderJava v0.83 – drkbugs