Downloading Imgur Posts Linked From Reddit with Python

UPDATE – I have updated this article to use BeautifulSoup to parse the HTML rather than regular expressions. This makes it much easier.

Reddit is a popular site that allows users to post and vote on interesting web links. It is divided into many topical subreddits. Many Redditors use Imgur to host their images (and I highly recommend it: Imgur is free and easy to use). This tutorial shows you how to write a Python script that scans Reddit and downloads the images from the Imgur submissions it finds. It is aimed at beginner-level programmers with a small amount of Python experience.

You can download the source code directly or view the GitHub repo.

This post will cover:

  • Basic web scraping concepts.
  • Command line options.
  • Accessing Reddit with the PRAW module.
  • Using regular expressions to find text patterns in a web page.
  • Downloading files with the Requests module.
  • Detecting which files are on our computer with the os and glob modules.
  • Opening files using Python’s with statement.
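
To give a feel for how the last three topics fit together, here is a minimal sketch. It is not the script from this tutorial: the function names and the `reddit_<post id>_<file name>` naming scheme are assumptions made up for illustration, and the URL is assumed to be a direct link to an image file. The only Requests calls it relies on are `requests.get()`, `raise_for_status()` and the `content` attribute.

```python
import glob
import os


def already_downloaded(directory, post_id):
    """Return True if a file for this post ID already exists in directory."""
    # Hypothetical naming scheme: reddit_<post id>_<original file name>
    pattern = os.path.join(glob.escape(directory),
                           "reddit_{}_*".format(glob.escape(post_id)))
    return len(glob.glob(pattern)) > 0


def save_image(directory, filename, data):
    """Write raw image bytes to directory/filename and return the full path."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, filename)
    # The with statement closes the file even if the write fails.
    with open(path, "wb") as image_file:
        image_file.write(data)
    return path


def download_image(url, directory, post_id):
    """Download url unless post_id is already on disk; return the path or None."""
    if already_downloaded(directory, post_id):
        return None
    # Third-party module; imported here so the helpers above work without it.
    import requests
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # stop on 404s and other HTTP errors
    filename = "reddit_{}_{}".format(post_id, os.path.basename(url))
    return save_image(directory, filename, response.content)
```

Checking the disk before making the request means that running the script a second time skips everything it has already fetched.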
