processor.coprocessor.phantomjs Module
PhantomJS page loading and scrolling.
class wpull.processor.coprocessor.phantomjs.PhantomJSCoprocessor(phantomjs_driver_factory: typing.Callable, processing_rule: wpull.processor.rule.ProcessingRule, phantomjs_params: wpull.processor.coprocessor.phantomjs.PhantomJSParamsType, warc_recorder=None, root_path='.')
Bases: object
PhantomJS coprocessor.
Parameters:
- phantomjs_driver_factory – Callback function that accepts a params argument and returns a PhantomJSDriver.
- processing_rule – Processing rule.
- phantomjs_params (PhantomJSParamsType) – PhantomJS parameters.
- warc_recorder – WARC recorder.
- root_path (str) – Root directory path for temp files.
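The driver factory is described only as a callback that accepts a params argument and returns a driver. A minimal sketch of that callback shape follows, assuming stand-in types: `FakeDriver`, `FakeParams`, and the `exe_path` argument are hypothetical names used for illustration and are not part of wpull.

```python
import functools
from collections import namedtuple

# Hypothetical stand-in for the parameters handed to the factory.
FakeParams = namedtuple('FakeParams', ['url', 'wait_time'])


class FakeDriver:
    """Hypothetical stand-in for PhantomJSDriver; it only stores its inputs."""

    def __init__(self, exe_path, params):
        self.exe_path = exe_path
        self.params = params


# Bind everything except ``params`` ahead of time, so the resulting
# callable has the documented shape: it accepts ``params`` and
# returns a driver object.
driver_factory = functools.partial(FakeDriver, 'phantomjs')

driver = driver_factory(
    params=FakeParams(url='http://example.com/', wait_time=1.0)
)
```

Binding the fixed arguments up front keeps the coprocessor unaware of how the driver is configured; it only needs to supply the per-page parameters.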
class wpull.processor.coprocessor.phantomjs.PhantomJSCoprocessorSession(phantomjs_driver_factory, root_path, processing_rule, file_writer_session, request, response, item_session: wpull.pipeline.session.ItemSession, params, warc_recorder)
Bases: object
PhantomJS coprocessor session.
exception wpull.processor.coprocessor.phantomjs.PhantomJSCrashed
Bases: Exception
PhantomJS exited with non-zero code.
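The exception signals a non-zero exit code from the PhantomJS process. How wpull detects and raises it is not shown here, so the following is only a sketch of the general pattern: the `PhantomJSCrashed` class below is a local stand-in, `run_checked` is a hypothetical helper, and the Python interpreter plays the role of the PhantomJS executable.

```python
import subprocess
import sys


class PhantomJSCrashed(Exception):
    """Local stand-in mirroring the documented exception."""


def run_checked(args):
    """Run a child process and raise if it exits with a non-zero code."""
    completed = subprocess.run(args)
    if completed.returncode != 0:
        raise PhantomJSCrashed(
            'process exited with code {}'.format(completed.returncode)
        )
    return completed.returncode


# The Python interpreter stands in for the PhantomJS executable here.
try:
    run_checked([sys.executable, '-c', 'import sys; sys.exit(3)'])
except PhantomJSCrashed as error:
    message = str(error)
```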
wpull.processor.coprocessor.phantomjs.PhantomJSParams
PhantomJS parameters.
- wpull.processor.coprocessor.phantomjs.snapshot_type (list) – File types. Accepted are html, pdf, png, gif.
- wpull.processor.coprocessor.phantomjs.wait_time (float) – Time between page scrolls.
- wpull.processor.coprocessor.phantomjs.num_scrolls (int) – Maximum number of scrolls.
- wpull.processor.coprocessor.phantomjs.smart_scroll (bool) – Whether to stop scrolling when the number of requests and responses does not change.
- wpull.processor.coprocessor.phantomjs.snapshot (bool) – Whether to take snapshot files.
- wpull.processor.coprocessor.phantomjs.viewport_size (tuple) – Width and height of the page viewport.
- wpull.processor.coprocessor.phantomjs.paper_size (tuple) – Width and height of the paper.
- wpull.processor.coprocessor.phantomjs.load_time (float) – Maximum time to wait for page load.
- wpull.processor.coprocessor.phantomjs.custom_headers (dict) – Default HTTP headers.
- wpull.processor.coprocessor.phantomjs.page_settings (dict) – Page settings.
PhantomJSParams is an alias of PhantomJSParamsType.
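The wait_time, num_scrolls, and smart_scroll parameters together describe a bounded scrolling loop. The sketch below shows one way such a loop could behave. It is not wpull's implementation: `ScrollParams`, `scroll_page`, the two callbacks, and all default values are assumptions made for illustration.

```python
import time
from collections import namedtuple

# Illustrative stand-in covering three of the documented fields.
# The default values are assumptions, not wpull's actual defaults.
ScrollParams = namedtuple(
    'ScrollParams',
    ['wait_time', 'num_scrolls', 'smart_scroll'],
    defaults=[0.0, 10, True],
)


def scroll_page(scroll_once, count_activity, params, sleep=time.sleep):
    """Scroll at most params.num_scrolls times; return the scrolls performed.

    ``scroll_once`` and ``count_activity`` are hypothetical callbacks:
    the first scrolls the page one step, the second returns the running
    total of requests and responses seen so far.
    """
    scrolls_done = 0
    for _ in range(params.num_scrolls):
        before = count_activity()
        scroll_once()
        scrolls_done += 1
        sleep(params.wait_time)  # give the page time to load new content
        if params.smart_scroll and count_activity() == before:
            break  # no new requests or responses, so stop early
    return scrolls_done


# Simulated page that loads new content only on the first three scrolls.
state = {'scrolls': 0, 'activity': 0}


def fake_scroll():
    state['scrolls'] += 1
    if state['scrolls'] <= 3:
        state['activity'] += 2  # one request plus one response


performed = scroll_page(fake_scroll, lambda: state['activity'], ScrollParams())
```

With smart_scroll enabled the loop stops on the first scroll that triggers no new network activity; with it disabled the loop always runs for num_scrolls iterations.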