protocol.http.web Module
Advanced HTTP Client handling.
-
class wpull.protocol.http.web.LoopType
Bases: enum.Enum
Indicates the type of request and response.
-
authentication = None
Response to an HTTP authentication.
-
normal = None
Normal response.
-
redirect = None
Redirect.
-
robots = None
Response to a robots.txt request.
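As a rough illustration of how a caller might branch on the value returned by loop_type(), the sketch below uses a local stand-in enum with the same four member names. The stand-in class, its numeric values, and the describe() helper are assumptions made for this example; they are not part of wpull, and the actions described for each member are only plausible guesses at how a caller could react.

```python
import enum


class LoopType(enum.Enum):
    """Stand-in mirroring the documented member names (values are arbitrary)."""
    normal = 1
    redirect = 2
    authentication = 3
    robots = 4


def describe(loop_type):
    """Hypothetical helper: say how a caller might treat each kind of response."""
    if loop_type is LoopType.normal:
        return 'final document: save the body'
    if loop_type is LoopType.redirect:
        return 'redirect: discard the body and follow the next request'
    if loop_type is LoopType.authentication:
        return 'authentication: retry with credentials'
    if loop_type is LoopType.robots:
        return 'robots.txt: read the rules, then continue'
    raise ValueError('unknown loop type: {!r}'.format(loop_type))


print(describe(LoopType.redirect))
```

Comparing with `is` works here because enum members are singletons.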
-
-
class wpull.protocol.http.web.WebClient(http_client: typing.Union=None, request_factory: typing.Callable=<class 'wpull.protocol.http.request.Request'>, redirect_tracker_factory: typing.Union=<class 'wpull.protocol.http.redirect.RedirectTracker'>, cookie_jar: typing.Union=None)
Bases: object
A web client that handles redirects, cookies, and basic authentication.
Parameters:
- http_client – An HTTP client.
- request_factory – A function that returns a new http.request.Request.
- redirect_tracker_factory – A function that returns a new http.redirect.RedirectTracker.
- cookie_jar – A cookie jar.
-
cookie_jar
Return the Cookie Jar.
-
http_client
Return the HTTP Client.
-
redirect_tracker_factory
Return the Redirect Tracker factory.
-
request_factory
Return the Request factory.
-
session(request: wpull.protocol.http.request.Request) → wpull.protocol.http.web.WebSession
Return a fetch session.
Parameters: request – The request to be fetched.
Example usage:

    from wpull.protocol.http.request import Request
    from wpull.protocol.http.web import WebClient

    # Run inside a coroutine, since the session methods are driven with `yield from`.
    client = WebClient()
    session = client.session(Request('http://www.example.com'))

    with session:
        while not session.done():
            request = session.next_request()
            print(request)

            response = yield from session.start()
            print(response)

            if session.done():
                # Final response: open the file for binary writing and save the body.
                with open('myfile.html', 'wb') as file:
                    yield from session.download(file)
            else:
                yield from session.download()

Returns: WebSession
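The control flow of the example usage can be tried without a network connection using a toy model. The sketch below is only an approximation: FakeSession and fetch() are hypothetical names, the responses are canned (status, body) pairs, the `with session:` block is left out, and it uses async/await instead of the `yield from` style of the real API. It shows the one point the loop turns on, namely that the body is written to a file only for the final response, while intermediate responses are downloaded without a file.

```python
import asyncio
import io


class FakeSession:
    """Toy stand-in for a web session; not wpull's WebSession."""

    def __init__(self, responses):
        # Each entry is (status_code, body); the last one is the final document.
        self._responses = list(responses)
        self._index = -1

    def done(self):
        # Finished once the final response has been started.
        return self._index >= len(self._responses) - 1

    async def start(self):
        # Move to the next canned response and return its status code.
        self._index += 1
        return self._responses[self._index][0]

    async def download(self, file=None):
        # Write the current body only if a file object was supplied.
        body = self._responses[self._index][1]
        if file is not None:
            file.write(body)


async def fetch(session, file):
    statuses = []
    while not session.done():
        statuses.append(await session.start())
        if session.done():
            await session.download(file)  # final document: keep the body
        else:
            await session.download()      # intermediate response: no file
    return statuses


buffer = io.BytesIO()
session = FakeSession([(301, b''), (302, b''), (200, b'<html>ok</html>')])
statuses = asyncio.run(fetch(session, buffer))
print(statuses, buffer.getvalue())
```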
-
class wpull.protocol.http.web.WebSession(request: wpull.protocol.http.request.Request, http_client: wpull.protocol.http.client.Client, redirect_tracker: wpull.protocol.http.redirect.RedirectTracker, request_factory: typing.Callable, cookie_jar: typing.Union=None)
Bases: object
A web session.
-
done() → bool
Return whether the session has finished.
Returns: If True, the document has been fully fetched.
Return type: bool
-
download(file: typing.Union=None, duration_timeout: typing.Union=None)
Download content.
Parameters:
- file – An optional file object for the document contents.
- duration_timeout – Maximum time in seconds within which the entire file must be read.
Returns: An instance of http.request.Response.
Return type: http.request.Response
See WebClient.session() for proper usage of this function.
Coroutine.
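The duration_timeout parameter is described as a limit on reading the entire file, as opposed to a limit on each individual read. How wpull enforces this internally is not shown in this section; the sketch below illustrates the idea with asyncio.wait_for as one possible approach. The read_all() and download_with_limit() functions, the chunk lists, and the delays are all hypothetical.

```python
import asyncio


async def read_all(chunks, delay):
    """Hypothetical reader: sleeps before each chunk to mimic network I/O."""
    data = b''
    for chunk in chunks:
        await asyncio.sleep(delay)
        data += chunk
    return data


async def download_with_limit(chunks, delay, duration_timeout=None):
    # wait_for applies one deadline to the entire read, not to each chunk.
    # A timeout of None means no limit.
    return await asyncio.wait_for(read_all(chunks, delay), timeout=duration_timeout)


async def main():
    # Two quick chunks finish well inside a one-second limit.
    fast = await download_with_limit([b'a', b'b'], delay=0.01, duration_timeout=1.0)
    try:
        # Five chunks at 0.05 s each need about 0.25 s, exceeding the 0.1 s limit.
        await download_with_limit([b'a'] * 5, delay=0.05, duration_timeout=0.1)
        timed_out = False
    except asyncio.TimeoutError:
        timed_out = True
    return fast, timed_out


fast, timed_out = asyncio.run(main())
print(fast, timed_out)
```

In the second call every individual chunk arrives quickly, yet the download as a whole still times out, which is the behavior a whole-file limit implies.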
-
loop_type() → wpull.protocol.http.web.LoopType
Return the type of response.
See also: LoopType.
-
redirect_tracker
Return the Redirect Tracker.
-