My wife has a beautiful pink iPod shuffle that she loves listening to music on during her daily commute. She got bored with her songs and wanted a new playlist, and I was entrusted with downloading new ones. Yep, all pirated. Now, downloading songs these days is really boring stuff, given how much bandwidth we have and apps like Spotify. So I decided to automate the whole affair.

I was downloading from BengaliMp3Downloads. After some messing around with their HTML, I wrote this quick script to download the files from the site:

import os
from urllib.parse import urlparse  # Python 3 home of the old urlparse module

import requests
from bs4 import BeautifulSoup

start_url = 'https://bengalimp3downloads.com/Anjan_Dutta~songs_by_artists-opencontents-1-1.html'
domain = 'https://bengalimp3downloads.com'


def download_files_from(url):
    response = requests.get(url)

    try:
        doc = BeautifulSoup(response.text, 'html.parser')

        # Each song on the listing page sits in a div with this inline style.
        for div in doc.find_all('div', style="vertical-align:bottom;"):
            for a in div.find_all('a'):
                download_page_url = domain + a['href']
                res = requests.get(download_page_url)
                soup = BeautifulSoup(res.text, 'html.parser')
                # The download page exposes the MP3 through <source> tags.
                for source in soup.find_all('source'):
                    print('MP3 URL is ' + source['src'])
                    filename = os.path.basename(urlparse(source['src']).path)
                    print('Saving to file ' + filename + '...')
                    d = requests.get(source['src'])
                    # The with block closes the file; no explicit close() needed.
                    with open(filename, 'wb') as f:
                        f.write(d.content)

        # Follow the "[Next]" link, if there is one, to the next listing page.
        for nxt in doc.find_all(string='[Next]'):
            print('Next page found. Continuing...')
            nxt_pg_link = domain + nxt.parent.attrs['href']
            print(nxt_pg_link)
            download_files_from(nxt_pg_link)
    except Exception as e:
        # Report the failure instead of silently swallowing it.
        print('Giving up on ' + url + ': ' + str(e))


download_files_from(start_url)
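
The script builds page URLs by gluing `domain` onto each `href` and takes the filename from the last part of the MP3 URL's path. Here is a minimal sketch of those two steps using only the standard library, assuming the site might also serve absolute links or percent-encoded filenames (I haven't checked that it does). The helper names `resolve_link` and `filename_from_url` and the example.com URLs are made up for illustration; they are not part of the script above.

```python
import os
from urllib.parse import unquote, urljoin, urlparse


def resolve_link(page_url, href):
    """Turn an href found on page_url into an absolute URL.

    urljoin copes with relative paths, root-relative paths and
    already-absolute URLs, which plain string concatenation does not.
    """
    return urljoin(page_url, href)


def filename_from_url(url, default="download.mp3"):
    """Derive a local filename from the path component of a URL.

    The query string is ignored, percent-escapes are decoded, and
    `default` (a hypothetical fallback name) is used when the path
    ends in a slash and so has no final component.
    """
    path = unquote(urlparse(url).path)
    return os.path.basename(path) or default


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    page = "https://example.com/artists/list-1.html"
    print(resolve_link(page, "/download/song-1.html"))
    print(filename_from_url("https://example.com/files/My%20Song.mp3?token=abc"))
```

Decoding before calling `os.path.basename` means an encoded slash in the URL can't smuggle a directory into the saved filename.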

It’s nothing to show off, but I felt like writing about it and enjoying the extra time while the files download!

P.S. Always use VirtualEnv!
