pipeline.session Module
-
class wpull.pipeline.session.ItemSession(app_session: wpull.pipeline.app.AppSession, url_record: wpull.pipeline.item.URLRecord)
Bases: object
Item for a URL that needs to be processed.
-
add_child_url(url: str, inline: bool=False, link_type: typing.Union=None, post_data: typing.Union=None, level: typing.Union=None, replace: bool=False)
Add links scraped from the document with automatic values.
Parameters:
- url – A full URL. (It can’t be a relative path.)
- inline – Whether the URL is an embedded object.
- link_type – Expected link type.
- post_data – URL-encoded form data. If given, the request will be made using POST. (Don’t use this to upload files.)
- level – The child depth of this URL.
- replace – Whether to replace the existing entry in the database table so that the URL will be downloaded again.
This function provides values automatically for:
- inline
- level
- parent: The referring page.
- root
See also
add_url().
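The automatic values can be illustrated with a minimal, self-contained sketch. It does not import wpull: `SketchRecord` and `sketch_child` are hypothetical stand-ins, and the rules used here (level defaults to the parent's level plus one, root is inherited from the parent) are assumptions about how such values might be derived, not the library's confirmed behavior. The `post_data` string is built with the standard library's `urllib.parse.urlencode`.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlencode


@dataclass
class SketchRecord:
    """Hypothetical stand-in for a URL record; not wpull's URLRecord."""
    url: str
    level: int = 0
    parent_url: Optional[str] = None
    root_url: Optional[str] = None
    inline: bool = False
    post_data: Optional[str] = None


def sketch_child(parent: SketchRecord, url: str, inline: bool = False,
                 post_data: Optional[str] = None,
                 level: Optional[int] = None) -> SketchRecord:
    """Fill in level, parent, and root from the parent record (assumed rules)."""
    return SketchRecord(
        url=url,
        # Assumption: an omitted level means one step deeper than the parent.
        level=parent.level + 1 if level is None else level,
        # The parent is the referring page.
        parent_url=parent.url,
        # Assumption: the root is inherited, or is the parent if it has none.
        root_url=parent.root_url or parent.url,
        inline=inline,
        post_data=post_data,
    )


start = SketchRecord(url="http://example.com/")
form = urlencode({"q": "wpull", "page": "2"})  # URL-encoded form data
child = sketch_child(start, "http://example.com/search", post_data=form)
print(child.level, child.parent_url, child.root_url, child.post_data)
```

In this sketch the caller supplies only the URL and the form data; the depth, referring page, and root are derived from the parent record, which is the convenience the method description refers to.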
-
child_url_record(url: str, inline: bool=False, link_type: typing.Union=None, post_data: typing.Union=None, level: typing.Union=None)
Return a child URLRecord.
This function is useful for testing filters against a URL before adding it to the table.
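The "test first, add second" pattern might look like the following sketch. Everything in it is a stand-in: `Candidate` takes the place of the record that `child_url_record()` would return, `max_level_filter` is a hypothetical filter, and a plain list plays the role of the database table. In real code the accepted URL would presumably be passed to `add_child_url()` instead.

```python
from typing import Callable, List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical stand-in for the record returned by child_url_record()."""
    url: str
    level: int


def max_level_filter(limit: int) -> Callable[[Candidate], bool]:
    """Build a hypothetical filter that accepts records up to a depth limit."""
    def accept(record: Candidate) -> bool:
        return record.level <= limit
    return accept


def add_if_accepted(table: List[str], record: Candidate,
                    filters: List[Callable[[Candidate], bool]]) -> bool:
    """Append the URL to the stand-in table only if every filter accepts it."""
    if all(check(record) for check in filters):
        table.append(record.url)
        return True
    return False


table: List[str] = []
filters = [max_level_filter(2)]
added_shallow = add_if_accepted(table, Candidate("http://example.com/a", 1), filters)
added_deep = add_if_accepted(table, Candidate("http://example.com/a/b/c", 3), filters)
print(added_shallow, added_deep, table)
```

Building the candidate record first keeps rejected URLs out of the table entirely, so no cleanup is needed when a filter says no.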
-
is_processed
Return whether the item has been processed.
-
is_virtual
-
request
-
response
-