processor.coprocessor.phantomjs Module
PhantomJS page loading and scrolling.
-
class wpull.processor.coprocessor.phantomjs.PhantomJSCoprocessor(phantomjs_driver_factory: typing.Callable, processing_rule: wpull.processor.rule.ProcessingRule, phantomjs_params: wpull.processor.coprocessor.phantomjs.PhantomJSParamsType, warc_recorder=None, root_path='.')
Bases: object
PhantomJS coprocessor.
Parameters:
- phantomjs_driver_factory – Callback function that accepts a params argument and returns a PhantomJSDriver.
- processing_rule – Processing rule.
- warc_recorder – WARC recorder.
- root_path (str) – Root directory path for temp files.
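The factory contract described above (a callable that takes a params argument and returns a driver) can be sketched as follows. This is a minimal illustration and not wpull code: `StubDriver`, `make_driver`, and the `exe_path` argument are hypothetical stand-ins so the example runs without wpull or PhantomJS installed.

```python
from functools import partial
from typing import Any, Callable


class StubDriver:
    """Hypothetical stand-in for PhantomJSDriver; not part of wpull."""

    def __init__(self, exe_path: str, params: Any):
        self.exe_path = exe_path
        self.params = params


def make_driver(params: Any, exe_path: str = "phantomjs") -> StubDriver:
    # The factory receives `params` from its caller and returns a driver.
    return StubDriver(exe_path=exe_path, params=params)


# Bind everything except `params` ahead of time, leaving a callable
# that only needs the `params` argument.
driver_factory: Callable[..., StubDriver] = partial(
    make_driver, exe_path="/usr/bin/phantomjs"
)

# A caller such as a coprocessor would invoke the factory like this.
driver = driver_factory(params={"url": "http://example.com/"})
print(driver.exe_path, driver.params)
```

Using `functools.partial` keeps configuration (here, the assumed executable path) out of the caller, which only has to supply `params`.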
-
class wpull.processor.coprocessor.phantomjs.PhantomJSCoprocessorSession(phantomjs_driver_factory, root_path, processing_rule, file_writer_session, request, response, item_session: wpull.pipeline.session.ItemSession, params, warc_recorder)
Bases: object
PhantomJS coprocessor session.
-
exception wpull.processor.coprocessor.phantomjs.PhantomJSCrashed
Bases: Exception
PhantomJS exited with non-zero code.
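As a rough illustration of the condition this exception describes, the sketch below runs a child process and raises when it exits with a non-zero code. `ProcessCrashed` and `run_checked` are hypothetical names used only for this example; how wpull itself detects the exit code is not shown in this reference.

```python
import subprocess
import sys


class ProcessCrashed(Exception):
    """Hypothetical stand-in for PhantomJSCrashed."""


def run_checked(args):
    # Run the child process and wait for it to exit.
    completed = subprocess.run(args)
    if completed.returncode != 0:
        # A non-zero exit code is treated as a crash.
        raise ProcessCrashed(
            "exited with code {}".format(completed.returncode)
        )
    return completed.returncode


try:
    # A Python child process stands in for the PhantomJS executable.
    run_checked([sys.executable, "-c", "import sys; sys.exit(3)"])
except ProcessCrashed as error:
    message = str(error)
    print(message)
```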
-
wpull.processor.coprocessor.phantomjs.PhantomJSParams
PhantomJS parameters.
-
wpull.processor.coprocessor.phantomjs.snapshot_type (list) – Snapshot file types. Accepted values are html, pdf, png, gif.
-
wpull.processor.coprocessor.phantomjs.wait_time (float) – Time between page scrolls.
-
wpull.processor.coprocessor.phantomjs.num_scrolls (int) – Maximum number of scrolls.
-
wpull.processor.coprocessor.phantomjs.smart_scroll (bool) – Whether to stop scrolling if the number of requests and responses does not change.
-
wpull.processor.coprocessor.phantomjs.snapshot (bool) – Whether to take snapshot files.
-
wpull.processor.coprocessor.phantomjs.viewport_size (tuple) – Width and height of the page viewport.
-
wpull.processor.coprocessor.phantomjs.paper_size (tuple) – Width and height of the paper size.
-
wpull.processor.coprocessor.phantomjs.load_time (float) – Maximum time to wait for page load.
-
wpull.processor.coprocessor.phantomjs.custom_headers (dict) – Default HTTP headers.
-
wpull.processor.coprocessor.phantomjs.page_settings (dict) – Page settings.
PhantomJSParams is an alias of PhantomJSParamsType.
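The interaction between wait_time, num_scrolls, and smart_scroll can be sketched as a simple loop. This is an assumption-laden illustration, not wpull's implementation: `ScrollParams` is a hypothetical subset of the documented fields with made-up defaults, and `scroll_once` and `activity_count` are placeholder callbacks standing in for the real page interaction.

```python
import time
from typing import Callable, NamedTuple


class ScrollParams(NamedTuple):
    """Hypothetical subset of the documented fields; defaults are illustrative."""
    wait_time: float = 0.0
    num_scrolls: int = 10
    smart_scroll: bool = True


def scroll_page(params: ScrollParams,
                scroll_once: Callable[[], None],
                activity_count: Callable[[], int]) -> int:
    """Scroll up to num_scrolls times; return how many scrolls were performed."""
    performed = 0
    previous = activity_count()
    for _ in range(params.num_scrolls):
        scroll_once()
        performed += 1
        time.sleep(params.wait_time)  # pause between scrolls
        current = activity_count()
        # With smart_scroll, stop once a scroll triggers no new
        # requests or responses.
        if params.smart_scroll and current == previous:
            break
        previous = current
    return performed


# Simulated page: only the first three scrolls trigger new requests.
state = {"scrolls": 0}


def fake_scroll() -> None:
    state["scrolls"] += 1


def fake_count() -> int:
    return min(state["scrolls"], 3)


smart = scroll_page(ScrollParams(), fake_scroll, fake_count)
print(smart)
```

In this simulation the loop stops on the first scroll that produces no new activity; with smart_scroll disabled it would always run the full num_scrolls.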