Introduction

Wpull is a Wget-compatible web downloader and crawler. It can be described as a remake, clone, replacement, or alternative to Wget.

[Image: a dog pulling a box via a harness]

Notable Features:

  • Written in Python: lightweight, modifiable, robust, & scriptable
  • Graceful stopping; on-disk database resume
  • PhantomJS & youtube-dl integration (experimental)

Wpull is designed to be (almost) a drop-in replacement for Wget with minimal changes to options. It is intended for running large crawls rather than for speedily downloading a single file.

Wpull’s behavior is not an exact duplicate of Wget’s behavior. As such, you should not expect Wpull’s output and operation to match Wget’s exactly. However, it aims to be a useful alternative because its source code can be easily modified to fix, change, or extend its behavior.

For instructions, read on to the next sections. Confused? Check out the Frequently Asked Questions.