scrapy
Web-crawling framework. More information: <https://scrapy.org>.
Install
- All systems:
curl cmd.cat/scrapy.sh
- Debian:
apt-get install python-scrapy
- Ubuntu:
apt-get install python-scrapy
- Arch Linux:
pacman -S scrapy
- Kali Linux:
apt-get install python-scrapy
- Fedora:
dnf install python-scrapy
- Windows (WSL2):
sudo apt-get update
sudo apt-get install python-scrapy
- Raspbian:
apt-get install python-scrapy
- Dockerfile:
dockerfile.run/scrapy
- Create a project:
scrapy startproject project_name
- Create a spider (in the project directory):
scrapy genspider spider_name website_domain
- Edit a spider (in the project directory):
scrapy edit spider_name
- Run a spider (in the project directory):
scrapy crawl spider_name
- Fetch a webpage as Scrapy sees it and print the source to `stdout`:
scrapy fetch url
- Open a webpage in the default browser as Scrapy sees it (disable JavaScript in the browser for a more faithful view):
scrapy view url
- Open the Scrapy shell for a URL, which lets you interact with the page source from a Python shell (or IPython, if available):
scrapy shell url
© tl;dr authors and contributors