[Sigia-l] automated site mapping tools?
karl fast
karl.fast at pobox.com
Sat Jun 22 09:06:19 EDT 2002
> I'm presented with a site with 7,000+ static HTML files which has grown,
> um, organically over the years.
>
> What are the recommendations for software which will crawl the site and
> produce a list of all pages and all links on those pages? It doesn't
> necessarily need to produce pretty hierarchical diagrams (we're not even
> certain if the site is truly hierarchical).
You can always buy something, but there are lots of open source
tools that will do the same thing. I haven't used many of them but
here are a few that I found with a few minutes of digging:
The ODP (Open Directory Project) has lists of these tools:
Site Management Tools
http://directory.google.com/Top/Computers/Software/Internet/Site_Management/
Link Management Tools
http://directory.google.com/Top/Computers/Software/Internet/Site_Management/Link_Management/
Freshmeat.net lists open source tools under Link Checking. There are
34, but only a few will meet your needs.
http://freshmeat.net/browse/244/?topic_id=244
Some possibilities include:
SiteMapper (PHP)
http://agent-source.com/sitemapper/
SiteMapper.php was created to build a "site map" of a web site. It
takes a given URL and spiders/crawls the local links found from
there to build a single HTML page listing all links found. The
resultant page is useful in the following ways:
Sitemapper.pl (Perl)
http://www.cpan.org/modules/by-module/LWP/sitemapper-1.019.readme
http://www.cpan.org/modules/by-module/LWP/sitemapper-1.019.tar.gz
sitemapper.pl is a simple Perl script which generates an HTML site
map from a given URL. It does this by traversing the site, getting
the home page, extracting links from it, getting all the pages
linked, and so on.
nSite (Perl?)
http://www.horsburgh.com/h_nsite.html
nSite generates site maps for a given WWW site. It walks a site
from the root URL and generates an HTML, TEXT, or XML link page
which illustrates the structure of the site.
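If none of these fit, the traversal that sitemapper.pl describes (get the
home page, extract its links, get the linked pages, and so on) is small
enough to sketch yourself. The following is a rough Python sketch using only
the standard library, not code taken from any of the tools above. The names
LinkExtractor, fetch_html, and crawl, and the max_pages cap, are my own
assumptions for illustration. It only follows <a href> links on the same
host, so frames, image maps, and script-generated links would be missed.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Download a page and return its text, or "" if it is not HTML."""
    with urlopen(url, timeout=10) as response:
        if "html" not in response.headers.get_content_type():
            return ""
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_html, max_pages=10000):
    """Breadth-first crawl; returns {page URL: [absolute links on that page]}."""
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    site = {}
    while queue and len(site) < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except OSError:
            # urllib's URLError and HTTPError are OSError subclasses;
            # record the page as visited with no links and move on.
            site[url] = []
            continue
        parser = LinkExtractor()
        parser.feed(html)
        links = []
        for href in parser.links:
            # Resolve relative links and drop "#fragment" suffixes.
            absolute = urldefrag(urljoin(url, href)).url
            links.append(absolute)
            # Only queue pages on the starting host, and only once each.
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
        site[url] = links
    return site

# Hypothetical usage (the URL is a placeholder):
#   for page, links in crawl("http://www.example.com/").items():
#       print(page, len(links))
```

The result is a plain dictionary of pages and the links found on each, which
can be dumped to a text file or spreadsheet. The fetch argument is there so
the crawl logic can be tried against canned pages before pointing it at the
live site.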
Hope this helps....
--karl