Making an API request with Crawlee?

I need to make an API POST request and read the information returned in the response body. I tried BasicCrawler, since it uses got-scraping under the hood, but had no success, so I opted to install axios instead. The code:
import { BasicCrawler } from 'crawlee';
// import { stored_data } from './cookies.js';

const crawler = new BasicCrawler({
    async requestHandler({ sendRequest, log }) {
        const response = await sendRequest({
            url: 'https://intouch.techdata.com/Intouch/ProductFE/api/ProductSearch/Search',
            method: 'POST',
            headers: {
                hello: 'hello',
                referer: 'https://intouch.techdata.com/InTouch/MVC/ProductSearch/Search?tscv=4294949918',
            },
            json: {"TaxonomySubClassValue":4294950346,"DimensionValueIds":[4294950346],"SpecialistId":"1","SpecialistName":"BroadLine","IsWestminster":false,"SortBy":0,"ResultPreview":0,"RecordsPageSize":20,"RecordsOffset":2,"ShowTopSellersFromEndeca":true},
            responseType: 'json',
        });
        log.info(response.body);
    },
});

await crawler.run()
11 Replies
foreign-sapphire•3y ago
And what is the exact problem you have?
like-goldOP•3y ago
I just wanted to understand why it didn't work with BasicCrawler, is there a mistake in my code?
foreign-sapphire•3y ago
I can't tell without checking that you are passing everything correctly. Look at the got library documentation for the correct parameters.
complex-teal•3y ago
I'd recommend using HttpCrawler https://crawlee.dev/docs/examples/http-crawler
like-goldOP•3y ago
How can I use HttpCrawler for a POST request with custom headers and a body?
complex-teal•3y ago
Here's an example:
import { HttpCrawler } from 'crawlee';
import type { RequestOptions } from 'crawlee';

const requests: RequestOptions[] = [
    {
        url: 'https://foo.com',
        method: 'POST',
        payload: JSON.stringify({
            foo: 'bar',
        }),
        headers: {
            'Content-Type': 'application/json',
            'cookie': 'key=value; other_key=other_value;'
        },
    },
];

const crawler = new HttpCrawler({
    requestHandler: () => {
        // ...
    },
});

await crawler.run(requests);
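When several POST requests go to the same URL and differ only in their body (for example, paging through search results), it may help to set `useExtendedUniqueKey` so the request queue does not treat them as duplicates. The following is a minimal sketch under stated assumptions: `PostRequest` is a local stand-in for the subset of `RequestOptions` used here, `buildPagedRequests` is a hypothetical helper, and the offset values are illustrative only.

```typescript
// Local stand-in for the subset of Crawlee's RequestOptions used in this sketch.
interface PostRequest {
    url: string;
    method: 'POST';
    payload: string;
    headers: Record<string, string>;
    useExtendedUniqueKey: boolean;
}

// Hypothetical helper: builds one POST request per offset value.
function buildPagedRequests(
    url: string,
    baseBody: Record<string, unknown>,
    offsets: number[],
): PostRequest[] {
    return offsets.map((offset): PostRequest => ({
        url,
        method: 'POST',
        // The body is sent as a JSON string, as in the example above.
        payload: JSON.stringify({ ...baseBody, RecordsOffset: offset }),
        headers: { 'Content-Type': 'application/json' },
        // Derive the unique key from URL, method and payload, so requests to
        // the same URL with different bodies are kept as separate requests.
        useExtendedUniqueKey: true,
    }));
}

// Illustrative values; the real meaning of RecordsOffset depends on the API.
const requests = buildPagedRequests('https://foo.com', { RecordsPageSize: 20 }, [0, 1, 2]);
```

The resulting array has the same shape as the `requests` array above, so it could be passed to `crawler.run(requests)` in the same way.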
complex-teal•3y ago
I'd recommend checking out the RequestOptions interface to understand all the options available: https://crawlee.dev/api/core/interface/RequestOptions
like-goldOP•3y ago
Thanks, that's really helpful. I guess I won't need to install axios in future projects once I master this.
foreign-sapphire•3y ago
You can also make any extra HTTP requests with context.sendRequest from the handler.
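As a rough illustration of that idea, here is a minimal sketch of a helper that a request handler could call with its `sendRequest` function. The `JsonResponse` and `SendRequest` types are simplified local stand-ins, not Crawlee's actual types, and `postJson` is a hypothetical name. The `json` and `responseType` options are the same got options used in the original question.

```typescript
// Simplified local stand-ins for the response and the sendRequest function;
// with the real library you would use Crawlee's own context types instead.
interface JsonResponse {
    statusCode: number;
    body: unknown;
}
type SendRequest = (overrideOptions: Record<string, unknown>) => Promise<JsonResponse>;

// Hypothetical helper: sends a JSON POST and returns the parsed response body.
async function postJson(
    sendRequest: SendRequest,
    url: string,
    body: Record<string, unknown>,
): Promise<unknown> {
    const response = await sendRequest({
        url,
        method: 'POST',
        json: body, // serialized as the JSON request body
        responseType: 'json', // response body is parsed as JSON
    });
    // Defensive check in case the client is configured not to throw on HTTP errors.
    if (response.statusCode >= 400) {
        throw new Error(`Request failed with status ${response.statusCode}`);
    }
    return response.body;
}
```

Inside a handler this might be used as `const data = await postJson(sendRequest, 'https://foo.com', { foo: 'bar' });`, where `sendRequest` comes from the handler's context. Note that the handler only runs for requests the crawler has been given, for example via `crawler.run(requests)`.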
like-goldOP•3y ago
Isn't it equally or more convenient to use addRequest?
