TigerParse
Oct 23, 2000
Kees Cook <kees@outflux.net>


There are two steps to hauling all the plain-text data into a database:
1) create tables
2) parse each county you want

Step 1: Create the Database Tables
------
Since I didn't want to retype all the table specifications in the Tiger
technical manual, I used the "xpdf" package (version 0.90 or better) to
run "pdftotext" on the technical doc.  From there, I have a Perl parser
that extracts the Record Type information and spits out a series of
mysql "create table" commands.

To do this for your Tiger data, you'll need the Technical Manual (I
wrote my parser against the 1999 Technical Manual).  Here's how I used
it:

	mysqladmin -u root -p create Tiger
	pdftotext tiger99-manual.pdf tiger99-manual.txt
	./parse-tech -mt tiger99-manual.txt | mysql -u root -p Tiger

That built all my tables.
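
To give a rough idea of what the table-generation step produces, here is
a minimal shell sketch.  It is not the Perl parser itself: it assumes the
field layout has already been reduced to hypothetical "NAME START END"
lines, and the function name, the sample field names, and the column
positions are illustrative only.  Take the real ones from the manual.

```shell
# Sketch only: turn a simplified field spec into a "create table" command.
# Hypothetical input format, one field per line: NAME START END
# (1-based, inclusive character columns of a fixed-width record).
spec_to_create_table() {
    # $1 = table name; the field spec is read from stdin
    awk -v table="$1" '
        BEGIN { printf "CREATE TABLE %s (\n", table }
        NF == 3 {
            # Put a comma between column definitions, not after the last
            if (n++) printf ",\n"
            # Field width is END - START + 1 characters
            printf "  %s CHAR(%d)", $1, $3 - $2 + 1
        }
        END { printf "\n);\n" }
    '
}
```

For example, with made-up fields:

	printf 'RT 1 1\nVERSION 2 5\n' | spec_to_create_table Type1

Every column is declared as CHAR here to keep the sketch short; a real
schema might pick numeric types for some fields.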


Step 2: Parse Each County You Want Into the Database
------
Grab a county you're interested in, and unzip it somewhere.  To get an idea
of how much data you'll be parsing, count the lines with "wc" first:

	unzip ../tgr17031.zip
	wc -l TGR*

Then run "parse-tech" on the county.  The -v option is optional; with it,
you can watch the parser count up while it reads in the files:

	./parse-tech -dvt tiger99-manual.txt -c 17031 | mysql -u root -p Tiger

This will populate the database with the flat-file data for county
17031.
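
If you want more than one county, the same steps can be wrapped in a
loop.  This is only a sketch: the helper names are made up, it assumes
each archive is named tgr<code>.zip like the example above, and it
assumes "parse-tech" and the manual text sit in the current directory
where the files are unpacked.

```shell
# Derive the county code from an archive name such as "tgr17031.zip".
# Assumes the tgr<code>.zip naming used in the example above.
county_code() {
    name=$(basename "$1" .zip)
    printf '%s\n' "${name#tgr}"
}

# Unpack and load each archive given as an argument.
# mysql will prompt for the password once per county.
load_counties() {
    for zip in "$@"; do
        code=$(county_code "$zip")
        # -o overwrites existing files without asking
        unzip -o "$zip"
        ./parse-tech -dt tiger99-manual.txt -c "$code" | mysql -u root -p Tiger
    done
}
```

After sourcing the functions, a run might look like:

	load_counties ../tgr*.zip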



I hope this is useful to someone!  :)

- Kees
