How to divide and conquer a problem the UNIX way
Finding and getting the names of the files to download
My TorrentFlux saves torrents to the folder /torrents/incoming. Once the file is ripe for download, I can take a gander at that folder and find the file sitting there.
But how do I get the file name, when all I have is the torrent file name? More importantly, where is that all-important torrent file when I need to pry crucial info out of its sneaky little hands?
A cursory find /torrents revealed its true colors:
[rudd-o@tobey torrents]$ find .
./.transfers
./.transfers/queue
./.transfers/DragonFear_dance_MegaMixx_2006_volume_3.torrent
./.transfers/dragonfear_dance_megamixx_2006_volume_3.stat
./.transfers/dragonfear_dance_megamixx_2006_volume_3.stat.pid
./incoming
./.BitTornado
find: ./.BitTornado: Permiso denegado
./.rsscache
./.fluxd
./.fluxd/rssad
Ooohhh, shiny! The torrents are stored right next to the incoming folder, in a hidden folder named .transfers. Great news!
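If the script needs that list rather than my eyeballs, the output of find can be filtered in a few lines. This is only a sketch: torrent_names is a name made up for illustration, and it works on text already captured from the server rather than talking to it.

```python
import posixpath

def torrent_names(find_output, transfers_dir="./.transfers"):
    """Return the .torrent file names that sit directly in transfers_dir.

    find_output is the text printed by `find .`; torrent_names is a
    hypothetical helper, not part of TorrentFlux.
    """
    names = []
    for line in find_output.splitlines():
        path = line.strip()
        # Keep only direct children of the transfers folder ending in .torrent
        if posixpath.dirname(path) == transfers_dir and path.endswith(".torrent"):
            names.append(posixpath.basename(path))
    return names

sample = """./.transfers
./.transfers/queue
./.transfers/DragonFear_dance_MegaMixx_2006_volume_3.torrent
./.transfers/dragonfear_dance_megamixx_2006_volume_3.stat
./incoming
"""
print(torrent_names(sample))
```

The .stat and .stat.pid files are skipped on purpose, since only the .torrent file carries the metadata I am after.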
Time to unleash a can of whoopass onto the torrent in question. After a good while of Googling, it turns out the BitTorrent package for the Linux distribution on my server ships an interesting command named torrentinfo-console. It dumps information about a torrent out as text:
[rudd-o@tobey torrents]$ torrentinfo-console /torrents/.transfers/DragonFear_dance_MegaMixx_2006_volume_3.torrent
torrentinfo-console 4.2.2 - decodifica BitTorrent archivos de metainfo
archivo de metainfo.......: DragonFear_dance_MegaMixx_2006_volume_3.torrent
trozo de info.............:467c1a7c04f9b658c5dca05900d31bfe8950db7e
nombre del archivo.........: DragonFear dance MegaMixx 2006, volume 3.mp3
tamaño de archivo........: 553417496 (2111 * 262144 + 31512)
rastreador de url announce: http://tpb.tracker.thepiratebay.org/announce
comentario................:
Oh, crap. It's all in Spanish! That's useless: what will happen if I use this program tomorrow on a computer set up in English?
Wrong again. A little-known environment variable, LANG, comes to my rescue:
[rudd-o@tobey torrents]$ LANG=C torrentinfo-console ./.transfers/DragonFear_dance_MegaMixx_2006_volume_3.torrent
torrentinfo-console 4.2.2 - decode BitTorrent metainfo files
metainfo file.......: DragonFear_dance_MegaMixx_2006_volume_3.torrent
info hash...........: 467c1a7c04f9b658c5dca05900d31bfe8950db7e
file name...........: DragonFear dance MegaMixx 2006, volume 3.mp3
archive size........: 553417496 (2111 * 262144 + 31512)
tracker announce url: http://tpb.tracker.thepiratebay.org/announce
comment.............:
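One caveat worth a sketch: LANG is only the fallback. If LC_ALL is set on a machine, it overrides LANG, so pinning both is the safer bet when a program's output is going to be parsed. The helper names below (c_locale_env, run_in_c_locale) are my own inventions for illustration, and this version runs a command locally rather than over SSH.

```python
import os
import subprocess

def c_locale_env(base_env=None):
    """Return a copy of the environment that asks for untranslated messages."""
    env = dict(os.environ if base_env is None else base_env)
    env["LANG"] = "C"
    # LC_ALL, when set, overrides LANG, so pin it as well.
    env["LC_ALL"] = "C"
    return env

def run_in_c_locale(argv):
    """Run argv locally under the C locale and return its output as text."""
    return subprocess.check_output(argv, env=c_locale_env(),
                                   universal_newlines=True)

# The caller's environment is copied, never modified in place.
spanish = {"LANG": "es_EC.UTF-8", "PATH": "/usr/bin:/bin"}
print(c_locale_env(spanish)["LANG"])
```

Something like run_in_c_locale(["torrentinfo-console", "some.torrent"]) would then produce the English output, assuming the tool is installed on the machine running the script.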
I win. Now it’s text processing time:
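def get_file_name(torrentname):
    # Ask the server to describe the torrent, forcing English output with LANG=C
    cmd = "LANG=C %s %s/.transfers/%s" % (torrentinfo, torrentflux_base_dir, torrentname)
    stdout = getssh(cmd).splitlines()
    prefix = "file name...........: "
    filenames = [l[len(prefix):] for l in stdout if l.startswith(prefix)]
    # Exactly one file name is expected (== compares values; "is" tests identity)
    assert len(filenames) == 1
    return filenames[0]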
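The parsing half of that function can be tried without a server by feeding it captured output. Here is a rough standalone sketch, assuming Python 3 and a single-file torrent; parse_file_name and info_command are hypothetical names, and the shlex.quote call is an addition of mine so that torrent names containing spaces survive the remote shell.

```python
import shlex

PREFIX = "file name...........: "

def parse_file_name(output):
    """Extract the single file name from torrentinfo-console's English output."""
    names = [line[len(PREFIX):] for line in output.splitlines()
             if line.startswith(PREFIX)]
    if len(names) != 1:
        raise ValueError("expected exactly one file name, found %d" % len(names))
    return names[0]

def info_command(torrentinfo, base_dir, torrentname):
    """Build the remote command line, quoting the path for the remote shell."""
    path = "%s/.transfers/%s" % (base_dir, torrentname)
    return "LANG=C %s %s" % (shlex.quote(torrentinfo), shlex.quote(path))

sample = (
    "metainfo file.......: DragonFear_dance_MegaMixx_2006_volume_3.torrent\n"
    "info hash...........: 467c1a7c04f9b658c5dca05900d31bfe8950db7e\n"
    "file name...........: DragonFear dance MegaMixx 2006, volume 3.mp3\n"
    "archive size........: 553417496 (2111 * 262144 + 31512)\n"
)
print(parse_file_name(sample))
print(info_command("torrentinfo-console", "/torrents", "My File.torrent"))
```

Raising ValueError instead of asserting is a design choice: assertions vanish when Python runs with -O, and a torrent that reports zero or several file names is exactly the case I would want to hear about.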
Very cool. Now I can pop the file out, add it to a path name, and start downloading it.
Downloading the files
How do you usually download your files? I bet using a Web server or an FTP server, combined with the right client on your computer.
Though I don't have an FTP server, I already have a Web server. Still, I don't think a Web server is the right choice. Sure, I could pop the files into a Web-shared folder on my server, but that would expose potentially sensitive content to the world.
Let's reuse SSH! Normally, I would use scp (secure copy) or sftp (secure FTP), but these are (in my experience) a bit harder to script, and big files can't easily be resumed. So a better solution, rsync, is more appropriate.
rsync is fantastic. It synchronizes files between two different computers and transfers only the portions of the files that have changed. Given --partial (or -P, which also shows progress), it keeps an interrupted download around and picks it up from there on the next run. It's also marvelously easy to automate. All I need to do is invoke a command like this on my home computer:
rsync -avz rudd-o.com:/torrents/incoming/filename.avi /home/rudd-o/downloads/.
and filename.avi would be diligently downloaded to my downloads folder. Neat, huh? What's more, with an extra --remove-source-files argument, once the file is done and verified against rsync's whole-file checksum, the file is removed from the server, freeing disk space!
Let’s code:
def dorsync(filename):
    cmdline = ["rsync", "-avzP", "--remove-source-files",
               "%s:%s/%s" % (torrentflux_server, torrentflux_download_dir, filename), "."]
    return passthru(cmdline)
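For anyone who wants to try this step on its own, here is a hedged, self-contained variant. It assumes subprocess.call as a stand-in for passthru (whose definition is not shown in this part), and rsync_argv and fetch are names chosen only for illustration.

```python
import subprocess

def rsync_argv(server, remote_dir, filename, dest="."):
    """Build the rsync argument list used to fetch one file over SSH."""
    source = "%s:%s/%s" % (server, remote_dir, filename)
    return ["rsync", "-avzP", "--remove-source-files", source, dest]

def fetch(server, remote_dir, filename, dest="."):
    """Run rsync and return its exit status (0 means success)."""
    # A list rather than a string: no local shell is involved, so the
    # local side needs no quoting.
    return subprocess.call(rsync_argv(server, remote_dir, filename, dest))

print(rsync_argv("rudd-o.com", "/torrents/incoming", "filename.avi"))
```

Building the argument list separately from running it makes the command easy to inspect before anything gets deleted from the server. Depending on the rsync version, the remote side may still split a name that contains spaces, so that case is worth testing by hand before trusting --remove-source-files with it.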
Good. Great!
How does the actual main program take shape? Keep reading!
November 5th, 2007 at 0:10
Hi there,
thank you so much for sharing that information on how you created your script and solved that problem. Believe it or not, there are people out in the world (like me) who like to read such things, as the process you are following does not necessarily come naturally. It's great to learn how to break up a problem into pieces like that.
thanks
Michaelg