Today I wanted to download a course from an RSS feed. The media files (mp3) were attached to the RSS entries. So instead of downloading tens of files one by one, I wrote a script!
With the help of Alvin's post on parsing RSS with Python, I added a download function, and the script was ready to do the work for me.
First, here is the script:
[rss_downloader.py]
#!/usr/bin/python
import feedparser
import sys
import urllib2


#
# Takes a url and a directory for saving the file. Directory must exist.
#
def download(url, dir_name):
    file_name = url.split('/')[-1]
    u = urllib2.urlopen(url)
    f = open(dir_name+'/'+file_name, 'wb')
    meta = u.info()
    file_size = int(meta.getheaders("Content-Length")[0])
    print "Downloading File: %s (Size: %s Bytes)" % (file_name, file_size)

    file_size_dl = 0
    block_sz = 8192
    while True:
        buffer = u.read(block_sz)
        if not buffer:
            break

        file_size_dl += len(buffer)
        f.write(buffer)
        status = r"%10d [%3.2f%%]" % (file_size_dl, file_size_dl * 100. / file_size)
        status = status + chr(8)*(len(status)+1)
        print status,
    f.close()


#
# Take url and directory parameters from user call
#
url = sys.argv[1]
dir_name = sys.argv[2]

#
# Get the feed data from the url
#
feed = feedparser.parse(url)

#
# Collect urls to download
#
urls_to_download = []
for entry in feed.entries:
    links = entry.links
    for link in links:
        if link.type == u'audio/mpeg':
            urls_to_download.append(link.href)

print("Files count: %s" % (len(urls_to_download)))

#
# Download files
#
for url in urls_to_download:
    download(url, dir_name)
    # print(url)
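The script above is written for Python 2: urllib2 and the print statement do not exist in Python 3. If you are on Python 3, the download function could be sketched roughly as follows. This is only an illustration, not the original code: it swaps in urllib.request, uses a carriage return instead of backspace characters for the progress line, and the fallback for a missing Content-Length header is my own assumption.

```python
import os
import sys
import urllib.request


def download(url, dir_name, block_size=8192):
    """Stream url into dir_name and return the path of the saved file.

    The directory must already exist, as in the original script.
    """
    file_name = url.split("/")[-1]
    target = os.path.join(dir_name, file_name)

    with urllib.request.urlopen(url) as response, open(target, "wb") as out:
        # Assumption: Content-Length may be missing, so treat 0 as "unknown size".
        total = int(response.headers.get("Content-Length") or 0)
        print("Downloading File: %s (Size: %s Bytes)" % (file_name, total))

        done = 0
        while True:
            chunk = response.read(block_size)
            if not chunk:
                break
            out.write(chunk)
            done += len(chunk)
            if total:
                status = "%10d [%3.2f%%]" % (done, done * 100.0 / total)
            else:
                status = "%10d" % done
            # "\r" moves back to the start of the line so the status overwrites itself.
            sys.stdout.write("\r" + status)
            sys.stdout.flush()
    print()
    return target
```

The rest of the script (reading sys.argv, parsing the feed, collecting the links) should not need changes beyond this function and the import line.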
To use it, call the script and pass it the RSS URL and the directory to save the files in:
python rss_downloader.py http://rss.dw.com/xml/DKpodcast_dwn1_en /home/madly/DeutschWarumNicht/serie1
This parses the RSS feed, collects the links with type "audio/mpeg" (you could change this to be a value passed by the user), prints how many files it found, and downloads them with a progress display.
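If you want the link type to come from the user rather than being hard-coded, one possible sketch is below. The names collect_links and parse_args and the --type option are made up for illustration; they are not part of the original script. The filter only relies on dictionary-style access, which feedparser's entries support.

```python
import argparse


def collect_links(entries, mime_type="audio/mpeg"):
    """Return the href of every link in entries whose type equals mime_type."""
    urls = []
    for entry in entries:
        for link in entry.get("links", []):
            if link.get("type") == mime_type:
                urls.append(link["href"])
    return urls


def parse_args(argv=None):
    """Parse the feed URL, the target directory, and an optional link type."""
    parser = argparse.ArgumentParser(
        description="Download media files attached to an RSS feed.")
    parser.add_argument("url", help="RSS feed URL")
    parser.add_argument("dir_name", help="existing directory to save the files in")
    # Hypothetical option: the original script always uses "audio/mpeg".
    parser.add_argument("--type", dest="mime_type", default="audio/mpeg",
                        help="MIME type of the links to download")
    return parser.parse_args(argv)
```

In the script, this would be used as args = parse_args() followed by collect_links(feedparser.parse(args.url).entries, args.mime_type).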
That's it. But there is more...
After doing this, I realized that I could simply print the links, copy them all into my download manager as a batch download, and enjoy the features of my download manager. After all, I wanted to download the files, not build a full application. Silly me :D
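For the print-only route, a small helper like the one below could stand in for the download loop. It is a sketch under my own assumptions: write_link_list is a hypothetical name, and skipping repeated URLs is just a precaution I added, not something the original script does.

```python
import sys


def write_link_list(urls, stream=sys.stdout):
    """Write one URL per line to stream and return how many were written.

    Repeated URLs are skipped (an added precaution); feed order is kept.
    """
    seen = set()
    count = 0
    for url in urls:
        if url in seen:
            continue
        seen.add(url)
        stream.write(url + "\n")
        count += 1
    return count
```

Calling write_link_list(urls_to_download) and redirecting the output to a text file gives a list that most download managers should be able to import as a batch.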
However you choose to use the script, I hope you find it useful.