
        Is it possible to run another spider from a Scrapy spider?
                  This article explains how to handle the question "Is it possible to run another spider from a Scrapy spider?". It should be a useful reference for anyone facing the same problem; follow along below to learn more.

                  Problem Description


                  For now I have 2 spiders, and what I would like to do is:

                  1. Spider 1 goes to url1, and if url2 appears, it calls Spider 2 with url2. It also saves the content of url1 using a pipeline.
                  2. Spider 2 goes to url2 and does something.

                  Due to the complexity of both spiders, I would like to keep them separated.

                  My attempt using scrapy crawl:

                  def parse(self, response):
                      p = multiprocessing.Process(
                          target=self.testfunc())  # note: the () calls testfunc right here and passes its return value as target
                      p.join()                     # join() is called before start()
                      p.start()
                  
                  def testfunc(self):
                      settings = get_project_settings()
                      crawler = CrawlerRunner(settings)
                      crawler.crawl(<spidername>, <arguments>)  # only schedules the crawl; nothing runs without a running reactor
                  

                  It does load the settings, but it doesn't crawl:

                  2015-08-24 14:13:32 [scrapy] INFO: Enabled extensions: CloseSpider, LogStats, CoreStats, SpiderState
                  2015-08-24 14:13:32 [scrapy] INFO: Enabled downloader middlewares: DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, HttpAuthMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
                  2015-08-24 14:13:32 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
                  2015-08-24 14:13:32 [scrapy] INFO: Spider opened
                  2015-08-24 14:13:32 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
                  

                  The documentation has an example of launching a crawl from a script, but what I'm trying to do is launch another spider while using the scrapy crawl command.
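
                  For reference, that from-script pattern from the documentation looks roughly like the following standalone script (a minimal sketch only; TestSpider2 is just a stand-in for the spider being launched, mirroring the full code below):

                  import scrapy
                  from scrapy.crawler import CrawlerRunner
                  from scrapy.utils.project import get_project_settings
                  from twisted.internet import reactor
                  
                  
                  class TestSpider2(scrapy.Spider):
                      name = "test2"
                      start_urls = ['http://www.google.com']
                  
                      def parse(self, response):
                          return
                  
                  
                  runner = CrawlerRunner(get_project_settings())
                  d = runner.crawl(TestSpider2)         # only schedules the crawl and returns a Deferred
                  d.addBoth(lambda _: reactor.stop())   # stop the reactor once the crawl finishes
                  reactor.run()                         # nothing is fetched until the reactor is running
                  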

                  Full code

                  from scrapy.crawler import CrawlerRunner
                  from scrapy.utils.project import get_project_settings
                  from twisted.internet import reactor
                  from multiprocessing import Process
                  import scrapy
                  import os
                  
                  
                  def info(title):
                      print(title)
                      print('module name:', __name__)
                      if hasattr(os, 'getppid'):  # only available on Unix
                          print('parent process:', os.getppid())
                      print('process id:', os.getpid())
                  
                  
                  class TestSpider1(scrapy.Spider):
                      name = "test1"
                      start_urls = ['http://www.google.com']
                  
                      def parse(self, response):
                          info('parse')
                          a = MyClass()
                          a.start_work()
                  
                  
                  class MyClass(object):
                  
                      def start_work(self):
                          info('start_work')
                          p = Process(target=self.do_work)
                          p.start()
                          p.join()
                  
                      def do_work(self):
                  
                          info('do_work')
                          settings = get_project_settings()
                          runner = CrawlerRunner(settings)
                          runner.crawl(TestSpider2)
                          d = runner.join()
                          d.addBoth(lambda _: reactor.stop())
                          reactor.run()
                          return
                  
                  class TestSpider2(scrapy.Spider):
                  
                      name = "test2"
                      start_urls = ['http://www.google.com']
                  
                      def parse(self, response):
                          info('testspider2')
                          return
                  

                  What I am hoping for is something like this:

                  1. scrapy crawl test1 (for example, when response.status_code is 200:)
                  2. Inside test1, call scrapy crawl test2

                  Recommended Answer

                  I won't go into depth since this question is really old, but I'll go ahead and drop this snippet from the official Scrapy docs... You are very close!

                  import scrapy
                  from scrapy.crawler import CrawlerProcess
                  
                  class MySpider1(scrapy.Spider):
                      # Your first spider definition
                      ...
                  
                  class MySpider2(scrapy.Spider):
                      # Your second spider definition
                      ...
                  
                  process = CrawlerProcess()
                  process.crawl(MySpider1)
                  process.crawl(MySpider2)
                  process.start() # the script will block here until all crawling jobs are finished
                  

                  https://doc.scrapy.org/en/latest/topics/practices.html

                  And then, using callbacks, you can pass items between your spiders to implement whatever logic you're talking about.
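
                  As a rough illustration of that idea (a sketch only, not taken from the asker's project: the example URLs, spider names, and the shared found_urls list are all made up here), the docs' "running spiders sequentially" pattern with CrawlerRunner lets the first spider finish before the second one starts, so the second spider can consume whatever the first one collected:

                  import scrapy
                  from scrapy.crawler import CrawlerRunner
                  from scrapy.utils.log import configure_logging
                  from twisted.internet import defer, reactor
                  
                  found_urls = []  # filled by Spider1, read by Spider2 (stands in for the url2 hand-off)
                  
                  
                  class Spider1(scrapy.Spider):
                      name = "spider1"
                      start_urls = ['http://example.com/url1']  # placeholder for url1
                  
                      def parse(self, response):
                          # treat every link on the page as a candidate "url2"
                          for href in response.css('a::attr(href)').getall():
                              found_urls.append(response.urljoin(href))
                          yield {'url': response.url}  # still goes through the normal item pipelines
                  
                  
                  class Spider2(scrapy.Spider):
                      name = "spider2"
                  
                      def start_requests(self):
                          # consume whatever Spider1 collected
                          for url in found_urls:
                              yield scrapy.Request(url, callback=self.parse)
                  
                      def parse(self, response):
                          yield {'visited': response.url}
                  
                  
                  configure_logging()
                  runner = CrawlerRunner()
                  
                  
                  @defer.inlineCallbacks
                  def crawl():
                      yield runner.crawl(Spider1)   # wait for Spider1 to finish
                      yield runner.crawl(Spider2)   # then run Spider2 on the urls it found
                      reactor.stop()
                  
                  
                  crawl()
                  reactor.run()

                  If the two spiders really must stay in separate processes or projects, another common approach is to have Spider 1's pipeline write the url2 values somewhere durable (a file or a queue) and feed them to Spider 2 in a later run.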

                  That's all for this article on whether you can run another spider from a Scrapy spider. We hope the recommended answer helps, and thank you for supporting html5模板網!

