Zillow Scraper

Hi, I'm new to Apify. I'm trying to scrape Zillow data for a school project. I found an actor created by maxcopell, along with the Python API code to pull some basic info (https://apify.com/maxcopell/zillow-api-scraper/api#search). When I ran this code, I didn't get the school info. In the README I found a reference to how to store the school data (see below), but I'm unsure how to use this code. Any help would be appreciated.
async ({ item, data }) => {
    if (!data.schools || !data.schools.length) {
        return null; // omit output
    }
    item.schools = data.schools; // add new array data
    item.photos = undefined; // remove the photos array from the output, making it CSV friendly
    delete item.photos; // works as well
    return item; // need to return the item here, otherwise your dataset will be empty
}
8 Replies
inland-turquoise (3y ago)
Hi @RMInnovate! Please share your run ID.
rising-crimson (OP, 3y ago)
jJ5p6H6VLrrz4ODsy
inland-turquoise (3y ago)
I see now, it's fairly straightforward. Go to the actor input -> Extend scraper functionality -> Extend output function and add the snippet you found (it's better to simplify it slightly):
async ({ item, data }) => {
    if (data.schools) item.schools = data.schools;
    return item;
}
That's it. Save the input and run the actor again; the schools should now be in the output.
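To check what this function does before spending a run on it, you can call it locally with mock objects. This is only a sketch: the sample `item` and `data` shapes below are made up for illustration and are not the actor's real output.

```javascript
// Same function as the one pasted into "Extend output function".
const extendOutputFunction = async ({ item, data }) => {
    if (data.schools) item.schools = data.schools;
    return item;
};

// Hypothetical sample objects; the real actor passes much larger ones.
const sampleItem = { zpid: '123', price: 350000 };
const sampleData = { schools: [{ name: 'Example Elementary', rating: 7 }] };

// A listing without school data is returned unchanged.
const itemWithoutSchools = { zpid: '456' };

extendOutputFunction({ item: sampleItem, data: sampleData })
    .then((result) => console.log(JSON.stringify(result, null, 2)));
extendOutputFunction({ item: itemWithoutSchools, data: {} })
    .then((result) => console.log(JSON.stringify(result, null, 2)));
```

Unlike the README version, this variant never returns null, so listings without school data stay in the dataset.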
rising-crimson (OP, 3y ago)
Thanks! Right now the Extend output function has async ({ data, item, customData, Apify }) => { return item; }. Do I replace this or append to it? Also, would you know the variable to grab the "walk score" or "bike score" shown on Zillow? Or, to ask it another way, do you know how I can review all the data fields available to grab from Zillow?
inland-turquoise (3y ago)
You should replace it, the whole function. I am not familiar with the Zillow API, but you could try a run with something like this in the same function (it includes the schools addition):
async ({ item, data }) => {
    if (data.schools) item.schools = data.schools;
    item.fullResponse = data;
    return item;
}
You are basically appending the whole response, as it's received by the crawler, to the output. Once you see which fields it contains, you can pick out the ones you need the same way as with schools.
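Once the full response is in the dataset, one way to hunt for a field is to flatten it into dotted key paths and search those. A minimal sketch follows; `listKeyPaths` is a hypothetical helper, and the sample object (including `scores.walk`) is invented, so the real field names may well differ.

```javascript
// Recursively collect dotted key paths from a nested object or array.
function listKeyPaths(value, prefix = '') {
    const isContainer = value !== null && typeof value === 'object';
    const entries = isContainer ? Object.entries(value) : [];
    // Leaf value (or empty container): the path ends here.
    if (entries.length === 0) return prefix ? [prefix] : [];
    return entries.flatMap(([key, child]) =>
        listKeyPaths(child, prefix ? `${prefix}.${key}` : key));
}

// Made-up stand-in for item.fullResponse; not Zillow's real structure.
const sampleResponse = {
    zpid: 123,
    schools: [{ name: 'Example Elementary' }],
    scores: { walk: 80 },
};

const paths = listKeyPaths(sampleResponse);
// Search the paths for anything that looks like a walk or bike score.
const matches = paths.filter((path) => /walk|bike/i.test(path));
console.log(matches);
```

If nothing matches, that suggests the score is not part of the response the actor receives, at least under that name.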
rising-crimson (OP, 3y ago)
Thanks so much for your help!!
absent-sapphire (3y ago)
Hi @Andrey Bykov, any idea why the scraper might not use all the zpids provided as input? Here is an example where 7 of the listings are not part of the output. Thanks in advance!
45 zpids from input
Done with 38 listings!
https://api.apify.com/v2/logs/TmQb9g01mUjwcbili
Anyone who can help with this problem?
inland-turquoise (3y ago)
Hey there! Apologies, I was offline for quite some time. Is there any chance some of those IDs are returning an empty page or something like that? I am not very familiar with the workflow of this particular actor, but I'll give it a try, and if I don't notice anything, I'll pass it to the team.
Checked, and indeed: some items are sold and some are pending. If you enable Debug log in Proxy and browser configuration, you will see that all pages are actually opened, but the sold ones are not in the output.
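To find out exactly which zpids were dropped, you can diff the input list against the dataset items. A minimal sketch, assuming each output item carries a `zpid` field (not confirmed in this thread); `findMissingZpids` is a hypothetical helper and the sample values are made up.

```javascript
// Return the input zpids that have no matching item in the output.
// IDs are compared as strings, since they may arrive as numbers or strings.
function findMissingZpids(inputZpids, outputItems) {
    const seen = new Set(outputItems.map((item) => String(item.zpid)));
    return inputZpids.filter((zpid) => !seen.has(String(zpid)));
}

// Made-up sample data standing in for the actor input and dataset.
const inputZpids = ['111', '222', '333'];
const outputItems = [{ zpid: 111 }, { zpid: '333' }];

const missing = findMissingZpids(inputZpids, outputItems);
console.log(missing); // [ '222' ]
```

The missing IDs can then be checked on Zillow by hand to confirm whether they are sold or pending.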
