Content Migration: Mapping Old Links Into Sitellite

One of the big downsides of redeveloping a web site with new technologies, or of reorganizing a web site at all, is that the URLs used to access existing pages often have to change. Visitors who have bookmarked internal pages on your web site can then end up in limbo or see undesirable server error messages. This how-to discusses a method for rerouting old page URLs to the proper pages on your newly redeveloped web site running Sitellite, so that your visitors never have to experience such things.

Taking stock

The first step is to identify all of the old pages that need to be rerouted. An easy way to retrieve the list of active pages from your former site is to analyze your Apache access logs. The following code parses the Apache access log to find all requests for .html and .htm pages and saves them to a text file named old_links.txt.

Note that if your web site uses a different file extension for web pages you need to reroute, you can add as many file extensions as you want to the $ext list at the top of the script. Also note that the path I've specified for my access log may not be correct for your web site. Check your server configuration or check with your web server administrator to find out the correct path for your site.

<?php
// valid file extensions to map
$ext = array ('html', 'htm');

// open the access log file
$fp = fopen ('/var/log/httpd/access_log', 'r');
if (! $fp) {
    die ('File open failed!');
}

$pages = array ();

// read the access log one line at a time
while (($line = fgets ($fp, 4096)) !== false) {
    // the request is the first double-quoted field of each log line
    $parts = explode ('"', $line);
    if (count ($parts) < 3) {
        continue; // skip blank or malformed lines
    }
    $request = $parts[1];

    // keep only requests for pages with one of the extensions above
    if (preg_match (
        '/^(GET|POST) (.+)\.(' . join ('|', $ext) . ') HTTP\/1/i',
        $request,
        $regs
    )) {
        $pages[] = $regs[2] . '.' . $regs[3];
    }
}
fclose ($fp);

// save the links, one per line, without duplicates
$fp = fopen ('old_links.txt', 'w');
if (! $fp) {
    die ('File write failed!');
}
foreach (array_unique ($pages) as $page) {
    fwrite ($fp, $page . "\n");
}
fclose ($fp);
?>
You should be able to run this by saving it to a file called parse_log.php in your web site document root, then issuing the following command on your command line:

php -f parse_log.php

Once you've successfully created the old_links.txt file, you won't need this script again, so you may delete it if you like.
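Before running the script against a large log, it can help to check what its pattern captures from a single line. The following is a minimal sketch, not part of the original script: the extract_page() helper and the sample log lines are made up for illustration, and the lines assume Apache's common log format, where the request is the first double-quoted field.

```php
<?php
// Hypothetical helper (not part of the original script): returns the page
// path requested in one access log line, or null if the line is not a
// request for a page with one of the given extensions.
function extract_page ($line, $ext = array ('html', 'htm')) {
    // the request is the first double-quoted field of the log line
    $parts = explode ('"', $line);
    if (count ($parts) < 3) {
        return null;
    }
    // same pattern as the log parsing script uses
    if (preg_match (
        '/^(GET|POST) (.+)\.(' . join ('|', $ext) . ') HTTP\/1/i',
        $parts[1],
        $regs
    )) {
        return $regs[2] . '.' . $regs[3];
    }
    return null;
}

// made-up sample lines, assuming Apache's common log format
$samples = array (
    '127.0.0.1 - - [10/Oct/2008:13:55:36 -0700] "GET /about/team.html HTTP/1.1" 200 2326',
    '127.0.0.1 - - [10/Oct/2008:13:55:37 -0700] "GET /images/logo.gif HTTP/1.1" 200 512',
    '127.0.0.1 - - [10/Oct/2008:13:55:38 -0700] "GET /news.htm?id=3 HTTP/1.1" 200 1024',
);

foreach ($samples as $sample) {
    var_dump (extract_page ($sample));
}
?>
```

Only the first sample yields a path (/about/team.html). The second is skipped because .gif is not in the extension list, and the third because the pattern expects the file extension to come immediately before HTTP/1, so requests carrying a query string are not collected. If your old site relied on query strings, you may need to adjust the pattern accordingly.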
Copyright © 2008, SIMIAN systems Inc. All rights reserved. Some of the icons on this site were created by the Gnome Project.