If Redis goes down while the spider is running, the spider crashes. Is there a setting in settings.py that makes the spider wait for Redis to come back?
Adding a try ... except block to RedisSpider's next_requests() method may be one solution. Or perhaps there is a more Pythonic way.
def next_requests(self):
    """Returns a request to be scheduled or none."""
    use_set = self.settings.getbool('REDIS_START_URLS_AS_SET', defaults.START_URLS_AS_SET)
    fetch_one = self.server.spop if use_set else self.server.lpop
    # XXX: Do we need to use a timeout here?
    found = 0
    # TODO: Use redis pipeline execution.
    while found < self.redis_batch_size:
        try:
            data = fetch_one(self.redis_key)
        except Exception:
            # Decide from a setting what to do here: raise the exception,
            # or give up on this batch and retry on the next call.
            # NOTE: 'REDIS_WAIT_ON_ERROR' is a hypothetical setting name.
            if not self.settings.getbool('REDIS_WAIT_ON_ERROR', False):
                raise
            self.logger.warning("Could not read from '%s'; will retry later", self.redis_key)
            break
        if not data:
            # Queue empty.
            break
        req = self.make_request_from_data(data)
        if req:
            yield req
            found += 1
        else:
            self.logger.debug("Request not made from data: %r", data)
    if found:
        self.logger.debug("Read %s requests from '%s'", found, self.redis_key)
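The "wait and retry" option mentioned in the except branch above could be sketched on its own, independent of scrapy-redis. This is only a minimal illustration, not anything the library provides: the helper name `fetch_with_retry` and all of its parameters are hypothetical, and the exception types to retry on are passed in by the caller (with redis-py you would presumably pass its own connection error class rather than the built-in `ConnectionError` used as the default here).

```python
import time


def fetch_with_retry(fetch, key, retry_on=(ConnectionError,), max_attempts=5,
                     base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call fetch(key), retrying with exponential backoff on the given errors.

    Every name here is hypothetical; none of them is a scrapy-redis setting.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch(key)
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller decide what to do
            # Double the delay after each failure, capped at max_delay.
            sleep(min(base_delay * 2 ** (attempt - 1), max_delay))


if __name__ == "__main__":
    # Simulate a backend that is down for the first two calls.
    state = {"calls": 0}

    def flaky_fetch(key):
        state["calls"] += 1
        if state["calls"] < 3:
            raise ConnectionError("backend is down")
        return b"http://example.com"

    # Pass a no-op sleep so the demo finishes immediately.
    print(fetch_with_retry(flaky_fetch, "myspider:start_urls", sleep=lambda s: None))
```

Inside next_requests() the call `fetch_one(self.redis_key)` could then be wrapped as `fetch_with_retry(fetch_one, self.redis_key, retry_on=...)`. Keep in mind that `time.sleep` blocks the whole Scrapy process while it waits, so this is a crude approach; giving up on the current batch and retrying on the next call avoids the blocking.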