After a few months of blogging photos from my life events, I realised that I should have kept a local version (dummy me). I kept everything else, so why did I forget this time?!
Anyways, I realised that this is not an easy job if you are free hosting on Wordpress.com. All I could have is either an XML with posts and references to images, or a full migration to another Wordpress blog.
So I rolled up my sleeves and reused a python script I used before.
Here is the script:
I simply call the script and pass the backup XML file I've just downloaded from my Wordpress.com admin panel (/wp-admin) and the path to the folder where I want to save the images.
One more update can be to use a better parsing technique and use an element called wp:post_id to sort the images into folders by their post. But I am satisfied with the current result. I can do the rest manually as it will be faster in my case :)
Anyways, I realised that this is not an easy job if you are free hosting on Wordpress.com. All I could have is either an XML with posts and references to images, or a full migration to another Wordpress blog.
So I rolled up my sleeves and reused a python script I used before.
Here is the script:
#!/usr/bin/python # -*- coding: utf-8 -*- # This script parses a wordpress backup XML and downloads the contained media files. # Example: # python wordpress-media-downloader.py /Users/madly/Downloads/mycoolblog.wordpress.2016-05-02.xml /Users/madly/Downloads/media-library import sys import urllib2 import re # # Takes a url and a directory for saving the file. Directory must exist. # def download(url, dir_name): file_name = url.split('/')[-1] u = urllib2.urlopen(url) f = open(dir_name+'/'+file_name, 'wb') meta = u.info() file_size = int(meta.getheaders("Content-Length")[0]) print "Downloading File: %s (Size: %s Bytes)" % (file_name, file_size) file_size_dl = 0 block_sz = 8192 while True: buffer = u.read(block_sz) if not buffer: break file_size_dl += len(buffer) f.write(buffer) status = r"%10d [%3.2f%%]" % (file_size_dl, file_size_dl * 100. / file_size) status = status + chr(8)*(len(status)+1) print status, f.close() # # Take file name and directory parameters from the user call # file_name = sys.argv[1] dir_name = sys.argv[2] img_regex = '(http:\/\/.*files\.wordpress\.com.*\.(?:jpeg|jpg|png))' # # Get the images urls from the xml file # urls_to_download = [] file = open(file_name, "r") for line in file.readlines(): m = re.search(img_regex, line) if m: img_url = m.group(1) urls_to_download.append(img_url) urls_count = len(urls_to_download) print("Images to download: %s images." % (urls_count)) # # Download images # i = 0 for url in urls_to_download: i += 1 print("File %s/%s" % (i, urls_count)) download(url, dir_name) # print(url) # use this line instead of download() if you want to export the output to a download manager
I simply call the script and pass the backup XML file I've just downloaded from my Wordpress.com admin panel (/wp-admin) and the path to the folder where I want to save the images.
One more update can be to use a better parsing technique and use an element called wp:post_id to sort the images into folders by their post. But I am satisfied with the current result. I can do the rest manually as it will be faster in my case :)