How to automate torrent downloads using TorrentFlux-b4rt, cron and rsync

by Rudd-O published 2007/10/31 12:40:00 GMT+0, last modified 2013-06-26T03:24:20+00:00

TorrentFlux-b4rt is an awesome masterpiece of engineering. You install it on your Web server, and then you can start downloading BitTorrent torrents right away. The catch is that those torrents stay on your Web server until you actually download them to your PC, and having to schedule those downloads separately is a pain. Well, no more.

Awesome awesomeness: start your downloads minutes after TorrentFlux finishes them

I've created a very, very complicated and interwoven (translation: awesome and straightforward) script that you may use and modify on your home computer. It works like this:

  • Using SSH, it queries your TorrentFlux server's transfer list, asking which torrents are completely downloaded.
  • For each of the completed torrents, it grabs the file name of the downloaded file.
  • Then, it uses rsync to incrementally transfer the downloaded file to your PC.
  • Again using rsync, it removes the file from your TorrentFlux server, freeing disk space.
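
The first step above comes down to matching each line of the transfer list against a pattern. Here is a minimal sketch of that idea, reusing the regular expression from the script further down. The sample lines are hypothetical, shaped only to fit that pattern; real fluxcli output may look different. It is written to run on Python 3, unlike the script itself.

```python
import re

# Same pattern the script uses to pick out finished transfers.
FINISHED = re.compile(r"^- (.+) - [0-9.]+ [KMG]B - (Seeding|Done)")

def parse_finished(lines):
    """Return (torrent_name, status) pairs for lines that look finished."""
    matches = (FINISHED.match(line) for line in lines)
    return [(m.group(1), m.group(2)) for m in matches if m]

# Hypothetical sample lines, invented for illustration only.
sample = [
    "- example.iso.torrent - 700.5 MB - Done",
    "- other.torrent - 1.2 GB - Seeding",
    "- pending.torrent - 300 MB - Leeching",
]

for name, status in parse_finished(sample):
    print("%s is %s" % (name, status))
```

Anything whose status is neither Seeding nor Done simply fails to match and is skipped.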


What you need to do before using the script

For you to use it, you will need to do the following:

  • Enable SSH access to your (soon-to-be) TorrentFlux server.
  • Set up public key (passwordless) authentication so that your home computer's user account can freely log in to your server without asking for a password.
  • Install TorrentFlux-b4rt and set it up so that your SSH user has read/write access to the torrents and downloads folders/files. Umask/permissions, you know the drill.
  • Enable fluxcli usage on your TorrentFlux setup.
  • Install rsync on both your server and your home computer.
  • Install the BitTorrent package that contains the program named torrentinfo-console on your server.
  • Create a destination directory on your home computer.
  • Install a mail program such as mailx on your home computer.
  • Install cron on your home computer.
  • Configure the script by altering the variables at the top.
Yeah, it's a lot of work. Go bitch somewhere else, we're here for the fun.
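
If you want a quick sanity check of the home-computer side of that list, something like the sketch below might help. It only looks for programs on your PATH; the name "mail" is what the script calls, and the helper name missing_programs is made up for this example. It needs Python 3.3 or later for shutil.which.

```python
import shutil

def missing_programs(names):
    """Return the program names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]

# Programs the script runs on the home computer.
needed = ["ssh", "rsync", "mail"]

missing = missing_programs(needed)
if missing:
    print("Missing: " + ", ".join(missing))
else:
    print("All required programs found")
```

It says nothing about the server side (fluxcli, torrentinfo-console) or about your SSH keys; those you still have to check by hand.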


Installing the script

Put the script somewhere in your PATH. Make it executable. Set up a cron job for the purpose:

# m h  dom mon dow   command
   0,10,20,30,40,50 * * * * /home/rudd-o/bin/torrentleecher -q
Make sure the cron job line ends with a newline; otherwise cron won't run it.


The script itself

It's a stunning, hackish example of subprocessing, SSH remoting, pipeline integration, text processing, POSIX daemonizing and file locking, and samizdat logging. It gets almost all of them wrong, and yet it manages to stay standing. If you're interested in reading a complete account of how this script grew out of nothing, well, click here.
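
Of the techniques in that list, the file locking is the easiest to show on its own. The sketch below is a standalone illustration of a non-blocking exclusive lock with fcntl.lockf, the same call the script uses; the function name try_lock and the temporary lock path are assumptions for the example, and it is Unix-only.

```python
import errno
import fcntl
import os
import tempfile

def try_lock(path):
    """Return an open file holding an exclusive lock on path,
    or None if another process already holds it."""
    handle = open(path, "w")
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.lockf(handle.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
    except (IOError, OSError) as e:
        handle.close()
        if e.errno in (errno.EAGAIN, errno.EACCES):
            return None
        raise
    return handle

# Hypothetical lock location, created fresh for the demonstration.
lock_path = os.path.join(tempfile.mkdtemp(), "example.lock")
held = try_lock(lock_path)
print("got the lock" if held else "another process is running")
```

Keep the returned file object around for as long as you need the lock: closing it, or exiting, releases the lock.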

It's a miracle:

#!/usr/bin/env python

from subprocess import Popen,PIPE,STDOUT,call
import fcntl
import re
import os
import sys
import signal
import time
from threading import Thread

#FIXME: this script doesn't deal with multifile torrents
# vars!
# wherever they are used, it's MOST LIKELY they need quoting
# FIXME whatever calls SSH remoting needs to protect/quote the commands for spaces or else this might turn out to be a bitch
torrentflux_base_dir = "/home/rudd-o/vhosts/"
torrentflux_download_dir = "/home/rudd-o/vhosts/"
torrentflux_server = ""
torrentleecher_destdir = "/home/rudd-o/download/Autodownload"
fluxcli = "fluxcli"
torrentinfo = "torrentinfo-console"
email_address = "rudd-o"

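def shell_quote(shellarg):
	# body missing from the published listing; reconstructed as plain POSIX single-quoting
	return "'%s'"%shellarg.replace("'","'\\''")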
def getstdout(cmdline):
	p = Popen(cmdline,stdout=PIPE)
	output = p.communicate()[0]
	if p.returncode != 0: raise Exception, "Command %s return code %s"%(cmdline,p.returncode)
	return output
def getstdoutstderr(cmdline,inp=None): # return stdout and stderr in a single string object
	p = Popen(cmdline,stdin=PIPE,stdout=PIPE,stderr=STDOUT)
	output = p.communicate(inp)[0]
	if p.returncode != 0: raise Exception, "Command %s return code %s"%(cmdline,p.returncode)
	return output
def passthru(cmdline): return call(cmdline) # return status code, pass the outputs thru
def getssh(cmd): return getstdout(["ssh","-o","BatchMode yes","-o","ForwardX11 no",torrentflux_server] + [cmd]) # return stdout of ssh.  doesn't return stderr
def sshpassthru(cmd): return call(["ssh","-o","BatchMode yes","-o","ForwardX11 no",torrentflux_server] + [cmd]) # return status code from a command executed using ssh
def mail(subject,text): return getstdoutstderr(["mail","-s",subject,email_address],text)

def get_finished_torrents():
	stdout = getssh("%s transfers"%fluxcli)
	stdout = stdout.splitlines()[2:-5]
	stdout = [ re.match("^- (.+) - [0123456789\.]+ [KMG]B - (Seeding|Done)",line) for line in stdout ]
	return [ (match.group(1), match.group(2)) for match in stdout if match ] # (torrent name, status)

def get_file_name(torrentname):
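	# arguments truncated in the published listing; reconstructed from the variables defined above
	path = shell_quote("%s/.transfers/%s"%(torrentflux_base_dir,torrentname))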
	cmd = "LANG=C %s %s"%(torrentinfo,path)
	stdout = getssh(cmd).splitlines()
	filenames = [ l[22:] for l in stdout if l.startswith("file name...........: ") ]
	if not len(filenames):
		filelistheader = stdout.index("files...............:")
		# we disregard the actual filenames, we now want the dir name
		#filenames = [ l[3:] for l in stdout[filelistheader+1:] if l.startswith("   ") ]
		filenames = [ l[22:] for l in stdout if l.startswith("directory name......: ") ]
	assert len(filenames) == 1
	return filenames[0]

def dorsync(filename,delete=False):
	# need to single-quote the *path* for the purposes of the remote shell so it doesn't fail, because the path is used in the remote shell
	path = "%s/%s"%(torrentflux_download_dir,filename)
	path = shell_quote(path)
	path = "%s:%s"%(torrentflux_server,path)
	opts = ["-arvzP"]
	if delete: opts.append("--remove-source-files")
	cmdline = [ "rsync" ] + opts + [ path , "." ]
	return passthru(cmdline)

def exists_on_server(filename):
	path = shell_quote("%s/%s"%(torrentflux_download_dir,filename))
	cmd = "test -f %s -o -d %s"%(path,path)
	returncode = sshpassthru(cmd)
	if returncode == 1: return False
	elif returncode == 0: return True
	else: raise AssertionError, "exists on server returned %s"%returncode

def remove_dirs_only(filename):
	path = shell_quote("%s/%s"%(torrentflux_download_dir,filename))
	cmd = "find %s -type d -depth -print0 | xargs -0 rmdir"%(path,)
	returncode = sshpassthru(cmd)
	if returncode == 0: return
	else: raise AssertionError, "remove dirs only returned %s"%returncode

def remove_remote_download(filename):
	path = shell_quote("%s/%s"%(torrentflux_download_dir,filename))
	cmd = "rm -fr %s"%path
	returncode = sshpassthru(cmd)
	if returncode == 0: return
	else: raise AssertionError, "remove remote download returned %s"%returncode

def get_files_to_download():
	torrents = get_finished_torrents()
	for name,status in torrents:
		yield (name,status,get_file_name(name))

def speak(text):
	return passthru(["/usr/local/bin/swift","-n","William",text])

def lock():
	global f
	try:
		f=open(os.path.join(torrentleecher_destdir,".torrentleecher.lock"), 'w')
		fcntl.lockf(f.fileno(),fcntl.LOCK_EX | fcntl.LOCK_NB)
	except IOError,e:
		if e.errno == 11: return False
		else: raise
	return True

def daemonize():
	"""Detach a process from the controlling terminal and run it in the
	background as a daemon."""
	try: pid = os.fork()
	except OSError, e: raise Exception, "%s [%d]" % (e.strerror, e.errno)

	if (pid == 0):		 # The first child.
		os.setsid()		 # reconstructed: become session leader, as in the usual double-fork recipe
		try: pid = os.fork()		  # Fork a second child.
		except OSError, e: raise Exception, "%s [%d]" % (e.strerror, e.errno)

		if (pid == 0):	 # The second child.
			# reconstructed: rsync writes to ".", so work from the destination directory
			os.chdir(torrentleecher_destdir)
		else:
			# exit() or _exit()?  See below.
			os._exit(0)	 # Exit parent (the first child) of the second child.
	else: os._exit(0)		 # Exit parent of the first child.

	import resource				  # Resource usage information.
	maxfd = resource.getrlimit(resource.RLIMIT_NOFILE)[1]
	if (maxfd == resource.RLIM_INFINITY):
		maxfd = 1024
	# Iterate through and close all file descriptors.
	for f in [ sys.stderr, sys.stdout, sys.stdin ]:
		try: f.flush()
		except: pass

	for fd in range(0, 2):
		try: os.close(fd)
		except OSError: pass

	for f in [ sys.stderr, sys.stdout, sys.stdin ]:
		try: f.close()
		except: pass

	sys.stdin = file("/dev/null", "r")
	sys.stdout = file(os.path.join(torrentleecher_destdir,".torrentleecher.log"), "a",0)
	sys.stderr = file(os.path.join(torrentleecher_destdir,".torrentleecher.log"), "a",0)
	os.dup2(1, 2)


sighandled = False
def sighandler(signum,frame):
	global sighandled
	if not sighandled:
		print "Received signal %s"%signum
		# temporarily immunize from signals
		oldhandler = signal.signal(signum,signal.SIG_IGN)
		sighandled = True
def report_file_failed(filename):
	try: os.symlink(os.path.join(torrentleecher_destdir,".torrentleecher.log"),"%s.log"%filename)
	except OSError,e:
		if e.errno != 17: raise #file exists should be ignored of course
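	# message truncated in the published listing; completed with the directory that holds the log
	errortext = "Please take a look at the log files in %s"%torrentleecher_destdir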
	mail("Leecher: error -- %s"%filename,errortext)
def set_dir_icon(filename,iconname):
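	# rest of the template is missing from the published listing; Icon= is the
	# standard .directory key and the only plausible use for iconname
	text ="[Desktop Entry]\nIcon=%s\n"%iconname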
	try: file(os.path.join(filename,".directory"),"w").write(text)
	except: pass

def mark_dir_complete(filename): set_dir_icon(filename,"dialog-ok-apply.png")
def mark_dir_downloading(filename): set_dir_icon(filename,"document-open-remote.png")
def mark_dir_error(filename): set_dir_icon(filename,"dialog-cancel.png")

def mark_dir_downloading_when_it_appears(filename):
	def dowatch():
		starttime = time.time()
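		while not os.path.isdir(filename) and time.time() - starttime < 60:
			time.sleep(1) # loop body missing from the listing; the polling interval is a guess
		if os.path.isdir(filename): mark_dir_downloading(filename)
	t = Thread(target=dowatch)
	t.start() # missing from the listing; without it the watcher never runs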

def speakize(filename):
	try:
		filename,extension = os.path.splitext(filename)
		# splitext keeps the leading dot, so a three-letter extension has length 4;
		# drop those and put anything else back
		if len(extension) != 4: filename = filename + extension
	except ValueError: pass
	for char in "[]{}.,-_": filename = filename.replace(char," ")
	return filename

def main():
	# NOTE: the published listing lost several control-flow lines (try, else,
	# continue, sys.exit) and the calls that remove the remote copy. Lines
	# marked "reconstructed" are best guesses based on the messages around them.
	if not ( len(sys.argv) > 1 and "-D" in sys.argv[1:] ): daemonize()
	if not lock(): # we need to lock the file after the daemonization
		if not ( len(sys.argv) > 1 and "-q" in sys.argv[1:] ):
			print "Other process is downloading the file -- add -q argument to command line to squelch this message"
		sys.exit(0) # reconstructed
	if not ( len(sys.argv) > 1 and "-q" in sys.argv[1:] ):
		print "Starting download of finished torrents"
	try: # reconstructed
		for torrent,status,filename in get_files_to_download():
			# Set loop vars up
			download_lockfile = ".%s.done"%filename
			fully_downloaded = os.path.exists(download_lockfile)
			seeding = status == "Seeding"
			# If the remote files don't exist, skip
			print "Checking if %s from torrent %s exists on server"%(filename,torrent)
			if not exists_on_server(filename):
				print "%s from %s is no longer available on server, continuing to next torrent"%(filename,torrent)
				continue # reconstructed
			# If the download to this machine is complete, but the torrent's still seeding, skip
			if fully_downloaded:
				if seeding:
					print "%s from %s is complete but still seeding, continuing to next torrent"%(filename,torrent)
					continue # reconstructed
				else: # reconstructed
					remove_remote_download(filename) # reconstructed: removal call missing from the listing
					print "Removal of %s complete"%filename
					speak ("Removal of %s complete"%speakize(filename))
					continue # reconstructed
			else: # reconstructed
				# Start download.
				print "Downloading %s from torrent %s"%(filename,torrent)
				retvalue = dorsync(filename)
				if retvalue != 0: # rsync failed
					if retvalue == 20:
						print "Download of %s stopped -- rsync process interrupted"%(filename,)
						print "Finishing by user request"
						sys.exit(2) # reconstructed; exit code is a guess
					elif retvalue < 0:
						print "Download of %s failed -- rsync process killed with signal %s"%(filename,-retvalue)
						print "Aborting"
						sys.exit(1) # reconstructed; exit code is a guess
					else: # reconstructed
						print "Download of %s failed -- rsync process exited with return status %s"%(filename,retvalue)
						print "Aborting"
						sys.exit(1) # reconstructed; exit code is a guess
				# Rsync successful
				# mark file as downloaded
				try: file(download_lockfile,"w").write("Done")
				except OSError,e:
					if e.errno != 17: raise
				# report successful download
				print "Download of %s complete"%filename
				speak ("Download of %s complete"%speakize(filename))
				mail("Leecher: done -- %s"%filename,"The file is at %s"%torrentleecher_destdir)
				# is it seeding?
				if not seeding:
					# reconstructed: removal call missing from the listing; the
					# introduction says rsync is used to free the remote disk space
					dorsync(filename,delete=True)
					print "Removal of %s complete"%filename
					speak ("Removal of %s complete"%speakize(filename))

	except Exception,e:
		report_file_failed("00 - General error")

	if not ( len(sys.argv) > 1 and "-q" in sys.argv[1:] ):
		print "Download of finished torrents complete"

if __name__ == "__main__": main()
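
The FIXME comments at the top of the script worry about quoting paths for the remote shell. As a rough illustration of what that quoting has to do, here is a small sketch using the standard library's quote function. The server name, directory and file name are hypothetical, and the helper remote_rsync_source is invented for this example, not part of the script.

```python
try:
    from shlex import quote   # Python 3.3 and later
except ImportError:
    from pipes import quote   # Python 2

def remote_rsync_source(server, directory, filename):
    """Build a server:path argument for rsync, quoting the path so
    the remote shell treats it as a single word."""
    return "%s:%s" % (server, quote("%s/%s" % (directory, filename)))

# Hypothetical values, for illustration only.
source = remote_rsync_source("user@example.org", "/srv/downloads", "My File.avi")
print(source)
```

Paths made only of safe characters come back unchanged, while anything with spaces or shell metacharacters gets wrapped in single quotes.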