A Bash Script to Quickly Download Entire PicasaWeb Albums


Google restricts the download of entire PicasaWeb albums to users of its proprietary (and Windows-only) software, Picasa. A fairly simple Bash script can do the same, and even more.

<mylife>
I finally succeeded in making my mother understand that there exist cleaner ways to share her Christmas pictures than filling up her contacts' mailboxes. Thus, she now has a PicasaWeb "gallery" that she feeds via F-Spot. Her "novel" approach was such a success that the rest of the family now does the same. Now, she wants to download a whole album without having to view the pictures one by one.
</mylife>

Searching the Web for solutions, I found an incorrect Bash script that depends on an XML parser, and a C# program that was reported to work with Mono. All in all, nothing that pleased me. Furthermore, both take the RSS feed address as an argument (instead of the album URL you receive in the invitation mail), and neither can download all the public albums of a given user at once.

Hence, I looked at the source code of the PicasaWeb pages and quickly understood that it would be easy to achieve my goals in a few lines of Bash, without any dependency (the most complicated commands I use are wget and grep!). Here is the result:

#!/bin/bash
# Distributed under the terms of the GNU General Public License v3 or later
# AUTHOR: Loïc Cerf
# e-mail: magicbanana@gmail.com
WGET_OPT="-q -T 180 -t 3 -c"
EX_USAGE=64
EX_NOHOST=68
if [ -z "$1" -o "$1" = "--help" -o "$1" = "-h" ]
then
    echo "Usage: $0 url [destination]"
    exit
fi
page=${1#*picasaweb.google.*/}
if [ "$page" = "$1" ]
then
    echo "\"$1\" is not the URL of a PicasaWeb album or gallery" 1>&2
    exit $EX_USAGE
fi
temp=`mktemp`
trap "rm $temp" EXIT
if wget $WGET_OPT -O $temp "$1"
then
    finalPage=${page#*/}
    if [ -z "$finalPage" -o "$finalPage" = "$page" ]
    then
        # $temp is a gallery
        if [ -z "$2" ]
        then
            destination=`grep -m 1 "^var _user" $temp`
            destination=${destination##*nickname:\"}
            set "$1" "${destination%%\"*}"
        fi
        mkdir -p "$2"
        cd "$2"
        grep -E -o "$1"[/]?[[:alnum:]:.%~_-]+ $temp | sort -u |
        while read album
        do
            "$0" $album &
        done
    else
        # $temp is an album
        if [ -z "$2" ]
        then
            destination=`grep -m 1 "^var _album" $temp`
            destination=${destination##*title:\"}
            set "$1" "${destination%%\"*}"
        fi
        grep -E -o {id:\"[0-9]+\",s:\"[[:alnum:]:\\.%~_-]+ $temp |
        while read picture
        do
            picture=${picture##*\"}
            picture=${picture/\x2Fs144/}
            wget $WGET_OPT -P "$2" ${picture//\x2F//} &
        done
    fi
else
    exit $EX_NOHOST
fi

Using this script is very easy: picasaweb-download URL [destination]

The URL may point either to an album (public, or private with the authentication key at the end of the URL) or to a whole gallery. For example, the following command will download the album titled "GPLv3-2006 Conference" from the PicasaWeb gallery of "Ramprasad B":
$ picasaweb-download http://picasaweb.google.com/ramprasad.i82/GPLv32006Conference/
To download all his public albums:
$ picasaweb-download http://picasaweb.google.com/ramprasad.i82/
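
The script tells these two cases apart with nothing more than parameter expansion. The following is a minimal sketch of that test in isolation; the function name classify_url is made up for the illustration, while the two expansions are the ones used in the script:

```shell
#!/bin/bash
# classify_url is a hypothetical helper name, used only for this illustration.
classify_url() {
    local url="$1"
    # Strip everything up to and including "picasaweb.google.<tld>/"
    local page="${url#*picasaweb.google.*/}"
    if [ "$page" = "$url" ]
    then
        echo "invalid"
        return
    fi
    # Strip the user name and the slash that follows it
    local finalPage="${page#*/}"
    if [ -z "$finalPage" ] || [ "$finalPage" = "$page" ]
    then
        echo "gallery"
    else
        echo "album"
    fi
}

classify_url "http://picasaweb.google.com/ramprasad.i82/GPLv32006Conference/"
classify_url "http://picasaweb.google.com/ramprasad.i82/"
classify_url "http://example.org/photos/"
```

The three calls print album, gallery and invalid, in that order. A gallery URL written without the trailing slash is also classified as a gallery, because the second expansion then leaves the string unchanged.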

The destination is optional. If present, it is the directory where the pictures are downloaded (it is created if it does not exist). If absent, an album is downloaded into a directory named after its title, and a gallery into a directory named after its author's nickname. In the case of a gallery, every album is downloaded into a separate sub-directory bearing its name.
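
How the default directory name is obtained can be shown on a single line of text. This is only a sketch: the sample line is invented, and the only thing the script actually implies about the page is that a line starting with "var _album" contains a title:"..." fragment. The helper name default_destination is hypothetical as well.

```shell
#!/bin/bash
# The sample line is made up; the real page format is an assumption.
sample='var _album = {id:"42",title:"Holidays 2008",description:""};'

# default_destination is a hypothetical helper name.
default_destination() {
    local line="$1"
    # Drop everything up to and including the last 'title:"'
    line="${line##*title:\"}"
    # Drop everything from the first remaining double quote onwards
    echo "${line%%\"*}"
}

default_destination "$sample"
```

On this sample line the function prints Holidays 2008, which the script would then use as the destination directory.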

To be able to download galleries, this script must be saved in a directory of the PATH variable (typically /usr/bin).

Obtain root permissions (*buntu users would use sudo instead):
$ su
Open your favourite text editor (here gedit) on a new file in a directory of the PATH variable (here /usr/bin):
# gedit /usr/bin/picasaweb-download
Copy and paste the script into your text editor, save the file and close the editor. Finally, make the script executable:
# chmod 755 /usr/bin/picasaweb-download

Any feedback (even positive!) would be appreciated.

EDIT: Free Software is Good! A user named Peter Woulfenberg has just mailed me to suggest a very simple and clever improvement to my code. His proposal fits in one character: '&'. Placed at the end of the line that downloads a picture, it makes this job run in the background. In this way, the script continues and starts downloading the remaining pictures without waiting. Following the same idea, I made an analogous modification to grab several albums in parallel. The script is now amazingly fast. Thank Peter for this performance. Thank Free Software too. Indeed, if this software were proprietary, you would still be stuck with poor performance.
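
The effect of the trailing '&' can be observed without any network access. In this sketch, a function called fetch (a stand-in invented for the illustration) merely sleeps for one second instead of calling wget:

```shell
#!/bin/bash
# fetch is a stand-in for "wget ... &": it only sleeps, then reports.
fetch() {
    sleep 1
    echo "done: $1"
}

start=$SECONDS
for picture in a.jpg b.jpg c.jpg
do
    fetch "$picture" &   # the trailing '&' runs the job in the background
done
wait                     # block until every background job has finished
elapsed=$((SECONDS - start))
echo "elapsed: ${elapsed}s"
```

The three jobs overlap, so the loop takes roughly as long as one job rather than three. The wait at the end is an addition of this sketch: the script in the article does not call it, so the prompt may come back while some downloads are still running.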

EDIT2: Two small improvements.


Comments


Bug

Thank you for your script, it is very useful!

But I found a bug. In Picasa Web Albums it is possible to upload images with the same name to the same album; but when I download such an album with your script, there is a problem: all the images with the same name get mixed into a single, wrong file.

Is it possible to avoid this behaviour?


I will look into that

It must be an interesting way of visualizing parallelism!

Well, I did not think of this issue. Sorting the pictures' names, checking whether the current name differs from the previous one, and maintaining a count of the duplicate names should be the key. I will look into that when I have time (though I am not sure when that will be!).


How to turn URIs into UTF-8?

Well... I basically tried what I suggested above. It works... except that any file name containing "special" characters (spaces, accented characters, Asian ones, etc.) now keeps its "URI form". Although it is not a problem if the picture's name is like img_2514.jpg, I would like to do the same magic as wget, turning the URIs into UTF-8. If anyone knows how to do that easily, please tell me! It looks like the tr command does not help here...
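
One approach often used for this in pure Bash is to turn every %XX into \xXX and let printf expand the escapes. The function below is a sketch of that idea, not part of the original script; its name, urldecode, is an assumption, and it will also expand any literal backslash sequence present in the input.

```shell
#!/bin/bash
# urldecode is a hypothetical helper, not part of the original script.
urldecode() {
    # Replace every % with \x, then let printf's %b expand the \xXX escapes.
    printf '%b\n' "${1//%/\\x}"
}

urldecode "my%20picture.jpg"
urldecode "caf%C3%A9%20au%20lait.jpg"
```

The first call prints the name with a plain space; the second one prints the bytes of an accented, UTF-8 encoded name. In the code that follows, the decoded name could presumably be used when building the output path, for example with $(urldecode "$name") in place of $name.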

Anyway, if you want to avoid the duplication trouble at the cost of losing the translation from URI to UTF-8, here is a new end for the script (the top does not change):
        # $temp is an album
        if [ -z "$2" ]
        then
            destination=`grep -m 1 "^var _album" $temp`
            destination=${destination##*title:\"}
            set "$1" "${destination%%\"*}"
        fi
        mkdir -p "$2"
        grep -E -o {id:\"[0-9]+\",s:\"[[:alnum:]:\\.%~_-]+ $temp |
        while read picture
        do
            # Get the url from the picture line
            picture=${picture##*\"}
            picture=${picture/\x2Fs144/}
            echo ${picture//\x2F//}
        done | sort -t "/" -k 8 |
        while read url
        do
            # Rename the picture if its name was previously encountered
            name=${url##*/}
            if [ "$name" = "$oldname" ]
            then
                let "counter += 1"
                out="$2/$name.$counter"
            else
                oldname="$name"
                counter=0
                out="$2/$name"
            fi
            # Download the picture
            wget $WGET_OPT -O "$out" $url &
        done
    fi
else
    exit $EX_NOHOST
fi
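
The renaming logic can be tried on its own, without downloading anything. The sketch below assumes, as the "-k 8" above suggests, that the file name is the eighth "/"-separated field of the URL; the URLs themselves are invented and the function name name_outputs is hypothetical.

```shell
#!/bin/bash
# name_outputs is a hypothetical helper; it reads URLs on stdin and prints
# the file name under which each picture would be saved.
name_outputs() {
    local oldname="" counter=0 name url
    sort -t "/" -k 8 |
    while read -r url
    do
        name=${url##*/}
        if [ "$name" = "$oldname" ]
        then
            # Same name as the previous picture: append a counter
            counter=$((counter + 1))
            echo "$name.$counter"
        else
            oldname=$name
            counter=0
            echo "$name"
        fi
    done
}

# Made-up URLs with the file name in the eighth field
printf '%s\n' \
    "http://lh3.example.com/u/a/b/c/img.jpg" \
    "http://lh4.example.com/u/d/e/f/other.jpg" \
    "http://lh5.example.com/u/g/h/i/img.jpg" | name_outputs
```

With these three URLs the output is img.jpg, img.jpg.1 and other.jpg: sorting brings the two identical names next to each other, so comparing each name with the previous one is enough to detect duplicates.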


I tested it

I just tried your new version and I like it because it avoids merging files with the same name. Now I can download all the pictures without having to worry about duplicate names, and I thank you very much!

It is true that there is a small problem with URIs and UTF-8 (it is not serious), but I am not able to help you with it.

P.S. I have a question: the old script ends with "rm $temp", but this line is not in the new end; should I keep it or delete it? For now I have left it at the end of the updated script and it seems to work well.


The removal of $temp is now trapped

You had the previous version of the script. I changed it a few weeks ago so that the temporary file is removed even if you terminate the application with Ctrl+C. The key is the trap built-in command:
temp=`mktemp`
trap "rm $temp" EXIT
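
The behaviour of trap can be checked in isolation. In this sketch a subshell stands in for the whole script, so that the result can be inspected from outside once it has exited; the single quotes (which defer the expansion of $temp until the trap runs) and the -f option are choices of the sketch, not of the script.

```shell
#!/bin/bash
# A subshell stands in for the whole script.
path=$(
    temp=$(mktemp)
    trap 'rm -f "$temp"' EXIT   # runs when the subshell exits
    echo "$temp"
)

if [ -e "$path" ]
then
    echo "temporary file still present"
else
    echo "temporary file removed"
fi
```

The file created by mktemp is gone by the time the subshell has finished, even though no rm command appears at the end of its body.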

Hence, if you use the script currently in the article as a base (and substitute its end), you do not need the rm command at the end.


Picasa is not Windows-only

Picasa has a Linux version; see http://picasa.google.com/linux/


Anyway, it is proprietary...

Oops... It must be pretty new! When you go to your PicasaWeb homepage (at least on mine), you can still read that there is no clean way to transfer pictures from/to GNU/Linux (approximate translation from French).

Anyway, I refuse to install proprietary software on my GNU/Linux system.

PS: By the way, this Bash script is a very clean way to download albums from PicasaWeb, and F-Spot (from version 0.4.0) does a good job of sending pictures to this site (tags included).