Scrapy Crawl URLs in Order

Scrapy's Request now has a priority attribute.

If you yield several Requests from one callback and want a particular one processed first, you can set its priority:

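from scrapy import Request

# Callback method of a scrapy.Spider subclass.
def parse(self, response):
    # Higher priority values are scheduled earlier; the default is 0.
    url = "http://www.example.com/first"
    yield Request(url=url, callback=self.parse_data, priority=1)

    # No priority given, so this request uses the default of 0.
    url = "http://www.example.com/second"
    yield Request(url=url, callback=self.parse_data)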

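Scrapy will schedule the request with priority=1 first: the default priority is 0, and requests with a higher value are processed earlier. Negative values are allowed for requests that should run late.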
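
If you have more than two URLs to crawl in a fixed order, one option is to give each request a priority that decreases with its position in the list. Here is a minimal sketch of that idea. The helper name with_descending_priority and the third URL are my own assumptions, not part of Scrapy; the helper only computes (url, priority) pairs, and the comment at the end shows how they could be passed to Request.

```python
def with_descending_priority(urls):
    """Pair each URL with a priority so that earlier URLs get higher values.

    Hypothetical helper, not part of Scrapy. The first URL gets priority
    len(urls) and the last gets 1, so every value stays above the default 0.
    """
    total = len(urls)
    return [(url, total - index) for index, url in enumerate(urls)]


urls = [
    "http://www.example.com/first",
    "http://www.example.com/second",
    "http://www.example.com/third",  # assumed URL, for illustration only
]
pairs = with_descending_priority(urls)

# Inside a spider callback, each pair would become one request, e.g.:
#     for url, priority in pairs:
#         yield Request(url=url, callback=self.parse_data, priority=priority)
```

Keep in mind that priority only affects the order in which the scheduler hands requests to the downloader. Scrapy downloads concurrently, so responses may still reach parse_data in a different order unless concurrency is limited (for example with the CONCURRENT_REQUESTS setting).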
