Tag: shell script

How to download an entire Google Site

When using Google Sites, there is currently no way to make a backup of your site, or download the site so you can host it on another server.

This command uses a tool called wget to spider through a website and download all the public files to the local computer. Unix users will most likely have the wget tool already installed (if not, you can install it via your preferred package manager), while Windows users can get it from here.

Once wget is installed, run it with the following parameters:

#Downloads all public pages on a Google Site

wget -e robots=off -m -k -K -E -rH -Dsites.google.com http://sites.google.com/a/domain/site/

This tells wget to spider through all the links on your site and download the HTML files and linked content (such as images). In particular, -m mirrors the site recursively, -k converts links so the pages work when browsed locally, -K keeps the original files alongside the converted ones, -E appends .html extensions where needed, -e robots=off ignores robots.txt restrictions, and -rH with -Dsites.google.com lets wget follow links to other hosts while staying within the sites.google.com domain. Note that pages that aren't linked from anywhere on the site won't be downloaded.

This technique will also work for websites other than the ones hosted on Google Sites.
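As a rough sketch, mirroring a site on its own domain only needs the -D list and the URL changed (example.com below is a placeholder, not from the original post):

#Mirrors a hypothetical site hosted on its own domain

wget -e robots=off -m -k -K -E -rH -Dexample.com http://example.com/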


Backing up data to an external server via SSH

I recently needed to back up the contents of a website, but found that a disk quota was preventing me from doing so. What I really needed to do was find a way to compress all the files and, instead of storing the archive locally, pipe the output to another server.

After much Googling and messing about, I ended up with the following command:

#Uses the tar utility to back up files to an external server

tar zcvf - /path/to/backup | ssh -p port user@server 'dd of=filename.tgz obs=1024'
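
To check that the archive arrived intact, the same pipe can be run in the other direction to list its contents without extracting anything (a sketch assuming the same user, server, port and filename as above):

#Lists the contents of the remote archive without extracting it

ssh -p port user@server 'cat filename.tgz' | tar ztvf -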

Of course, this is only practical for a one-off data dump. If regular backups were needed, using rsync would be the best option, as it only transfers incremental changes. An excellent tutorial can be found here.
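
As a minimal sketch, a recurring incremental backup over SSH might look something like this (the paths, user, server and port are placeholders):

#Incrementally syncs /path/to/backup to the external server over SSH

rsync -avz -e "ssh -p port" /path/to/backup/ user@server:/path/to/destination/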
