Website backup questions for those who are savvy....

Started by s/v necessity, January 01, 2014, 08:37:08 PM

Previous topic - Next topic

0 Members and 1 Guest are viewing this topic.

s/v necessity

I know several of the members here are good with computer-related stuff, so I thought I might throw this query out there. I have a good sailing friend who is now deceased, and his website is still up and running (but I do not know for how much longer). Is there a way for me to back up his website, even though I'm not the account holder? (Or should I just trust the Wayback Machine for this?) I know the website represented a lot of his hard work (and documented it, too), and I just don't want it to disappear. (I already regret losing a lot of his old emails.)

Travelnik

#1
One easy way to save the info from the site would be to go to each page individually and save it to your computer using the drop-down menu in your browser: File > Save Page As > Web Page, Complete.

Put them all in a special folder, and it will keep all the information, pictures, HTML formatting, etc. on your computer.

Then, if you wanted, you could make a new website with all of the info and HTML intact. Even if you couldn't do it yourself, you can always get someone else to help you with that. The main thing is that you would have all the information, photos, etc. saved in the event that your friend's site goes away.

There may be some other ways of doing it, but this method has always worked for me when I backed up a website for future reference.
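
For anyone who would rather do the same thing from a terminal, a rough single-page equivalent is something like this (the address is only a placeholder):

wget --page-requisites --convert-links --adjust-extension http://site-address-you-want-to-copy.com/some-page.html

--page-requisites pulls in the images and stylesheets the page needs, --convert-links rewrites the links so the saved copy works offline, and --adjust-extension makes sure the page is saved with an .html extension.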
I'm Dean, and my boat is a 1969 Westerly Nomad. We're in East Texas (Tyler) for now.

cap-couillon

Quick and dirty for the Linux geeks like moi...
Open a terminal in the directory where you want to save the files and enter the following command.
wget -r --limit-rate=200k --no-clobber --convert-links --random-wait -p -E -e robots=off -U mozilla http://site-address-you-want-to-copy.com

I have been told that HTTrack, a free program (Window$ / Linux), works well if the above is beyond your geek level.
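
If you end up trying HTTrack from a terminal rather than its point-and-click interface, the basic call looks roughly like this (the address and output folder are just placeholders):

httrack "http://site-address-you-want-to-copy.com/" -O "./site-backup" -v

-O sets the folder the mirror gets written into, and -v just prints progress while it runs.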

If you have issues drop me a PM with the site address and I will see if I can help you out.
Cap' Couillon

"It seemed like a good idea at the time"
SailingOffTheEdge.com

s/v necessity

I run Ubuntu on my laptop at home, but I frequently have a heck of a time remembering how to open a terminal, so that should give you an idea where I'm at :)

I'll give it a try with Linux.  I already knew how to save an individual page, but I worry I'll miss something.

cap-couillon

Quote from: s/v necessity on January 02, 2014, 04:36:55 PM
I run Ubuntu on my laptop at home, but I frequently have a heck of a time remembering how to open a terminal, so that should give you an idea where I'm at :)
I'll give it a try with Linux.  I already knew how to save an individual page, but I worry I'll miss something.
I have been using *nix since '93 and I still have a cheat sheet taped to the salon bulkhead.  The old AT&T printed Unix manual was the size of the Manhattan telephone directory.

The wget command is really useful for a number of things... In your case the long list of options tells it to grab the whole site recursively (-r), look like a regular browser (-U mozilla), ignore the robots.txt file which restricts access to certain areas (-e robots=off), and rewrite the links in the saved pages so they point at your local copy instead of back at the live site (--convert-links). (All probably more info than you wanted to know, but posted for the geeks amongst us.)
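
Once the download finishes, an easy sanity check (assuming wget dropped everything into a folder named after the site, which is its default behaviour when mirroring) is to open the local copy in a browser:

cd site-address-you-want-to-copy.com
xdg-open index.html

If the front page comes up and the links stay on your own machine instead of heading back out to the web, the mirror is good.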

Like I said, if you have problems, contact me and I will provide whatever help I can. History should not be lost, else we are doomed to repeat it.



Cap' Couillon

"It seemed like a good idea at the time"
SailingOffTheEdge.com

SalientAngle


SalientAngle

PS: necessity, I am more than happy to place it online in development space and also forward a zip file.
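
If it helps, once the site has been scraped into a folder, the zip file is just one command away (the folder and file names here are only examples):

zip -r site-backup.zip scraped-site-folder

That one archive can then be emailed around or stashed anywhere safe.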

SalientAngle

#7
Just one more note: it will be raw HTML, no CMS functionality, just a snapshot, so to speak, of all the pages. Now, if it is a Joomla, WordPress, Drupal, or other CMS, then with permissions a functional, updateable site is easily accomplished. Cheers, -jim (I hope this is helpful in remembering your friend.)

SalientAngle

Have to hit the sack; will check Friday for the URL, necessity. It will only take a little while to scrape. Cheers, -jim, and best of New Years to you and yours.

CapnK

I'll offer the same as Jim - if we can get permission (if needed?), we can scrape the current site and mirror it on the sailFar.net server.
http://sailfar.net
Please Buy My Boats. ;)