4. Deploy spiders
Author: Luis Rosenstrauch
This is normally for internal use only.
Scrapyd is a daemon that receives deployed projects and schedules crawl runs.
Configure your live instance hostname in scrapy.cfg. Once you have tested everything locally, you can deploy to the live Scrapyd instance and schedule crawls using scrapyd-client.
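A minimal sketch of the deploy target in scrapy.cfg; the hostname and the settings module name are placeholders, not our actual values:

[settings]
# assumed settings module for the Hoaxlyspiders project
default = Hoaxlyspiders.settings

[deploy:live]
# replace <scrapyd-host> with the hostname of your live Scrapyd instance
url = http://<scrapyd-host>:6800/
project = Hoaxlyspiders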
docker exec -ti cli bash
scrapyd-deploy live
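To verify the deployment, you can query Scrapyd's standard listprojects.json and listspiders.json endpoints (the host is a placeholder):

# check that the project was registered
curl "http://<scrapyd-host>:6800/listprojects.json"
# list the spiders available in the deployed project
curl "http://<scrapyd-host>:6800/listspiders.json?project=Hoaxlyspiders"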
Once deployed, you can interact with Scrapyd directly through its web API, either using the client:
docker exec -ti cli bash
scrapyd-client -t <target-url> schedule -p Hoaxlyspiders climatefeedback.org
or from anywhere else, by calling the web API with curl:
curl "http://<scrapyd-host>:6800/schedule.json" -d project=Hoaxlyspiders -d spider=climatefeedback.org
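The same web API can be used to monitor and cancel runs; listjobs.json and cancel.json are standard Scrapyd endpoints (host and job id are placeholders):

# list pending, running and finished jobs for the project
curl "http://<scrapyd-host>:6800/listjobs.json?project=Hoaxlyspiders"
# cancel a running job by its job id
curl "http://<scrapyd-host>:6800/cancel.json" -d project=Hoaxlyspiders -d job=<job-id>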
A crawl can be scheduled to run at regular intervals by deploying the project to a dedicated server and triggering the schedule call from a timer there, as sketched below.
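A minimal sketch of such a timer using cron; the crontab line, host and schedule are assumptions, not part of our current setup:

# crontab entry: schedule the climatefeedback.org spider every day at 03:00
0 3 * * * curl -s "http://<scrapyd-host>:6800/schedule.json" -d project=Hoaxlyspiders -d spider=climatefeedback.org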
For Portia spiders, deployment should work the same way, but it currently requires a workaround in our settings.