You can buy the book here. You can read the book on O'Reilly OFPS now. Work the chapter code examples as you go. Don't forget to initialize your Python environment first. If any of the requirements fail to install in your virtualenv, try the equivalent Linux (apt-get, yum) or OS X (brew, port) packages.
# From project root
# Set up Python virtualenv
virtualenv -p `which python2.7` venv --distribute
source venv/bin/activate
pip install -r requirements.txt
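Before installing or running anything, it can help to confirm that the interpreter you are using really is the one inside the virtualenv. The snippet below is a minimal sketch and not part of the repository; the function name `in_virtualenv` is made up for illustration. It relies only on standard `sys` attributes and should work under both Python 2.7 and Python 3.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older releases of the virtualenv tool set sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # The built-in venv module (and newer virtualenv releases) instead make
    # sys.base_prefix differ from sys.prefix. Fall back to sys.prefix when
    # base_prefix does not exist, which yields False.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print("virtualenv active" if in_virtualenv() else "no virtualenv active")
print(sys.executable)
```

If this reports no active virtualenv, or the printed path is not under `venv/`, re-run `source venv/bin/activate` in the current shell.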
# From ch3
# Download your gmail inbox
cd gmail
./gmail.py -m automatic -u [email protected] -p 'my_password_' -s ./email.avro.schema -f '[Gmail]/All Mail' -o /tmp/test_mbox 2>&1 &
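The command above puts the password directly on the command line, where it ends up in your shell history. One possible way around that is a small wrapper that prompts for the password and then builds the same invocation. This is a sketch under assumptions: `build_gmail_command` and `main` are hypothetical helpers, not part of the repository, and only the flags that appear in the command above (`-m`, `-u`, `-p`, `-s`, `-f`, `-o`) are used. Note that the password is still passed to `gmail.py` as an argument, so it stays visible in the process list while the download runs.

```python
import getpass
import subprocess

try:
    read_line = raw_input  # Python 2.7, as used by the virtualenv above
except NameError:
    read_line = input  # Python 3


def build_gmail_command(username, password,
                        schema="./email.avro.schema",
                        folder="[Gmail]/All Mail",
                        output="/tmp/test_mbox"):
    """Return the argument list for the gmail.py invocation shown above.

    Hypothetical helper: the defaults mirror the values in the README
    command, and no flags beyond those are assumed.
    """
    return [
        "./gmail.py",
        "-m", "automatic",
        "-u", username,
        "-p", password,
        "-s", schema,
        "-f", folder,
        "-o", output,
    ]


def main():
    """Prompt for credentials, then run gmail.py in the foreground."""
    username = read_line("Gmail address: ")
    password = getpass.getpass("Gmail password: ")
    # cwd="gmail" plays the role of `cd gmail`; start this from the ch3 directory.
    return subprocess.call(build_gmail_command(username, password), cwd="gmail")
```

Calling `main()` from the ch3 directory runs the download in the foreground, unlike the original command, which sends it to the background with `&`.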
An example spreadsheet is available at ch02/Email Analysis.xlsb. Example Pig code is available at ch02/probability.pig.
The full tutorial is in the Chapter 3 README.
Highlight:
# From ch3
# Download your gmail inbox
cd gmail
./gmail.py -m automatic -u [email protected] -p 'my_password_' -s ./email.avro.schema -f '[Gmail]/All Mail' -o /tmp/test_mbox 2>&1 &
Chapter 4: To the Cloud!