1. Fetch more data from the print page ([url removed, login to view]) and store it in the database. Essentially, capture every field found on that page. A spreadsheet of the fields we want is here: [url removed, login to view]
2. Automate the fetching so that it reads from a database of zip codes and works through them one by one, scraping Zillow for addresses. Store the addresses in a database table rather than in a txt file that gets deleted, so that we can run through the database again without re-fetching addresses (or add addresses to a zip code).
3. Ensure that each house has an ID in the database so that the same data is not stored twice if the script runs over the same zip code again. The same house can be sold in different years, so the identifier could perhaps be zillowid+year_sold.
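
To make items 2 and 3 concrete, here is a rough sketch of how the tables and the de-duplication could look. It is only an illustration and assumes SQLite and Python, neither of which is required by this project. The table names, the `price` column, and the `fetch_addresses` callable (a stand-in for whatever scraper is written) are all hypothetical; only `zillowid` and `year_sold` come from the requirements above.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) tables if missing."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS zip_codes (zip TEXT PRIMARY KEY);

        -- Addresses persist here instead of in a throwaway txt file.
        CREATE TABLE IF NOT EXISTS addresses (
            zip     TEXT NOT NULL,
            address TEXT NOT NULL,
            UNIQUE (zip, address)
        );

        -- One row per sale: the key is zillowid + year_sold.
        CREATE TABLE IF NOT EXISTS houses (
            zillowid  TEXT    NOT NULL,
            year_sold INTEGER NOT NULL,
            address   TEXT,
            price     INTEGER,  -- assumed example field
            PRIMARY KEY (zillowid, year_sold)
        );
        """
    )
    return conn


def crawl_zip_codes(conn, fetch_addresses):
    """Go through the stored zip codes one by one and save their addresses.

    fetch_addresses is a caller-supplied function (hypothetical) that takes
    a zip code and returns an iterable of address strings.
    """
    zips = [row[0] for row in conn.execute("SELECT zip FROM zip_codes ORDER BY zip")]
    for zip_code in zips:
        for address in fetch_addresses(zip_code):
            # OR IGNORE skips addresses already stored for this zip code.
            conn.execute(
                "INSERT OR IGNORE INTO addresses (zip, address) VALUES (?, ?)",
                (zip_code, address),
            )
        conn.commit()


def save_house(conn, zillowid, year_sold, address, price):
    """Insert a sale record; return True if it was new, False if a duplicate."""
    before = conn.total_changes
    conn.execute(
        "INSERT OR IGNORE INTO houses (zillowid, year_sold, address, price) "
        "VALUES (?, ?, ?, ?)",
        (zillowid, year_sold, address, price),
    )
    conn.commit()
    return conn.total_changes > before
```

With a composite key like this, running the script over the same zip code twice leaves the tables unchanged, while a second sale of the same house in a different year is stored as a separate row.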