Python 101: An Intro to ftplib
Need to do FTP in Python? Read on!
The File Transfer Protocol (FTP) is used by many companies and organizations for sharing data. Python's standard library includes a module called ftplib that implements the client side of FTP. You can learn all about the protocol by reading RFC 959, which is available online. However, the full specification is outside the scope of this article. Instead, we will focus on the following topics:
- Connecting to an FTP server
- Navigating its structure
- Downloading files from the FTP server
- Uploading files to an FTP server
Let's get started!
Connecting to an FTP Server
The first thing we need to do is find an FTP server to connect to. There are many free ones you can use. For example, most Linux distributions have FTP mirrors that are publicly accessible. If you go to Fedora's website, you will find a long list of mirrors that you can use. They aren't just FTP though, so be sure that you choose the correct protocol, or you will receive a connection error.
For this example, we will use ftp.cse.buffalo.edu. The official Python documentation uses ftp.debian.org, so feel free to try that as well. Let's try to connect to the server now. Open up the Python interpreter in your terminal or use IDLE to follow along:
>>> from ftplib import FTP
>>> ftp = FTP('ftp.cse.buffalo.edu')
>>> ftp.login()
'230 Guest login ok, access restrictions apply.'
Let's break this down a bit. Here we import the **FTP** class from ftplib. Then, we create an instance of the class by passing it the host that we want to connect to. Since we did not pass a username or password, Python assumes we want to log in anonymously. If you happen to need to connect to the FTP server using a non-standard port, then you can do so using the connect method. Here's how:
>>> from ftplib import FTP
>>> ftp = FTP()
>>> HOST = 'ftp.cse.buffalo.edu'
>>> PORT = 12345
>>> ftp.connect(HOST, PORT)
This code will fail as the FTP server in this example doesn't have port 12345 open for us. However, the idea is to convey how to connect to a port that differs from the default.
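Since a connection attempt like this one can fail, it may help to see how the failure could be caught. The following is only a sketch: the try_connect name and its ftp_class parameter are made up for illustration (ftp_class exists so a stand-in class can be swapped in for testing), while connect, login, close and all_errors come from ftplib itself.

```python
from ftplib import FTP, all_errors


def try_connect(host, port=21, timeout=10, ftp_class=FTP):
    """Return a logged-in FTP object, or None if connecting fails.

    try_connect and ftp_class are hypothetical names for this sketch;
    by default the real ftplib.FTP class is used.
    """
    ftp = ftp_class()
    try:
        ftp.connect(host, port, timeout)
        ftp.login()  # no credentials given, so this is an anonymous login
    except all_errors as exc:
        # all_errors is a tuple covering ftplib errors, OSError and EOFError
        print(f'Could not connect to {host}:{port}: {exc}')
        ftp.close()
        return None
    return ftp
```

Calling try_connect('ftp.cse.buffalo.edu', 12345) should then print a message and return None instead of raising an exception.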
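If the FTP server that you're connecting to requires TLS security, then you will want to import the FTP_TLS class instead of the FTP class. The FTP_TLS class accepts an ssl.SSLContext through its context parameter. Older versions of Python also accepted a keyfile and a certfile, but those parameters were removed in Python 3.12. Logging in secures the control connection. If you want to secure the data connection as well, then you will need to call prot_p to do so.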
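Here is a minimal sketch of what a TLS login might look like. The connect_secure function and its ftps_class parameter are hypothetical names used only for this illustration, and the host in the usage comment is a placeholder. The login and prot_p calls are the real FTP_TLS methods.

```python
from ftplib import FTP_TLS


def connect_secure(host, user='', passwd='', ftps_class=FTP_TLS):
    """Connect over explicit TLS and protect the data channel too.

    connect_secure and ftps_class are hypothetical names for this
    sketch; ftps_class lets a stand-in class be used for testing.
    """
    ftps = ftps_class(host)    # connects to the host
    ftps.login(user, passwd)   # secures the control connection, then logs in
    ftps.prot_p()              # switches data transfers to a protected channel
    return ftps


# Usage, assuming a server that supports FTP over TLS:
# ftps = connect_secure('ftp.example.com', 'username', 'password')
# ftps.retrlines('LIST')
# ftps.quit()
```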
Navigating Directories With ftplib
Let's learn how to see what's on the FTP server and change directories! Here is some code that demonstrates the normal method of doing so:
>>> from ftplib import FTP
>>> ftp = FTP('ftp.cse.buffalo.edu')
>>> ftp.login()
'230 Guest login ok, access restrictions apply.'
>>> ftp.retrlines('LIST')
total 28
drwxrwxrwx 2 0 0 4096 Sep 6 2015 .snapshot
drwxr-xr-x 2 202019 5564 4096 Sep 6 2015 CSE421
drwxr-xr-x 2 0 0 4096 Jul 23 2008 bin
drwxr-xr-x 2 0 0 4096 Mar 15 2007 etc
drwxr-xr-x 6 89987 546 4096 Sep 6 2015 mirror
drwxrwxr-x 7 6980 546 4096 Jul 3 2014 pub
drwxr-xr-x 26 0 11 4096 Apr 29 20:31 users
'226 Transfer complete.'
>>> ftp.cwd('mirror')
'250 CWD command successful.'
>>> ftp.retrlines('LIST')
total 16
drwxr-xr-x 3 89987 546 4096 Sep 6 2015 BSD
drwxr-xr-x 5 89987 546 4096 Sep 6 2015 Linux
drwxr-xr-x 4 89987 546 4096 Sep 6 2015 Network
drwxr-xr-x 4 89987 546 4096 Sep 6 2015 X11
'226 Transfer complete.'
Here we get logged in, and then we send the LIST command to the FTP server. This is done by calling our FTP object's retrlines method. By default, the retrlines method prints out the result of the command we called. In this example, we called LIST, which retrieves a list of files and/or folders, along with their respective information, and prints them out. Then we used the cwd method to change our working directory to a different folder and re-ran the LIST command to see what was in it. You could also use your FTP object's dir method to get a listing of the current folder.
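Printing is only the default behavior. The retrlines method also accepts a callback, so you can collect the lines and work with them instead. The sketch below uses a made-up helper called list_directories and assumes the server returns a Unix-style listing like the one shown above. The LIST format is not standardized, so treat the parsing as an assumption rather than something that works everywhere.

```python
def list_directories(ftp):
    """Return the names that a Unix-style LIST marks as directories.

    list_directories is a hypothetical helper; it assumes each entry
    has nine whitespace-separated fields, with the name last.
    """
    lines = []
    ftp.retrlines('LIST', lines.append)  # collect lines instead of printing
    names = []
    for line in lines:
        if line.startswith('d'):
            # Split into at most nine fields so names with spaces survive
            names.append(line.split(None, 8)[-1])
    return names
```

Summary lines such as "total 28" are skipped because they do not start with the letter d.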
Downloading a File via FTP
Just viewing what's on an FTP server isn't all that useful. You will almost always want to download a file from the server. Let's find out how to download a single file:
>>> from ftplib import FTP
>>> ftp = FTP('ftp.debian.org')
>>> ftp.login()
'230 Login successful.'
>>> ftp.cwd('debian')
'250 Directory successfully changed.'
>>> out = '/home/mike/Desktop/README'
>>> with open(out, 'wb') as f:
...     ftp.retrbinary('RETR ' + 'README.html', f.write)
...
For this example, we log into the Debian Linux FTP server and change to the debian folder. Then, we create the name of the file we want to save to and open it in write-binary mode. Finally, we use the FTP object's retrbinary method to send the RETR command, which retrieves the file, and write it to our local disk. If you'd like to download all the files, then you'll need a file listing.
import ftplib
import os

ftp = ftplib.FTP('ftp.debian.org')
ftp.login()
ftp.cwd('debian')

# Names of everything in the current directory (files and directories)
filenames = ftp.nlst()

for filename in filenames:
    host_file = os.path.join(
        '/home/mike/Desktop/ftp_test', filename)
    try:
        with open(host_file, 'wb') as local_file:
            ftp.retrbinary('RETR ' + filename, local_file.write)
    except ftplib.error_perm:
        # The server refused the request, most likely a directory; skip it
        pass

ftp.quit()
This example is fairly similar to the previous one. You will need to modify it to match your own preferred download location, though. The first part of the code is pretty much the same, but then you will note that we call nlst, which gives us a list of filenames and directories. You can give it a directory to list, or call it without one and it will assume you want a listing of the current directory. Note that the results of nlst don't tell us how to differentiate between files and directories. For this example, though, we simply don't care. This is more of a brute force script, so it will loop over the list returned and attempt to download each entry. If the "file" happens to actually be a directory, then we'll end up creating an empty file on our local disk with the same name as the directory on the FTP server.
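If those leftover empty files bother you, one option is to delete the local file whenever the server refuses the request. The following is a sketch of that idea, not the only way to do it. The download_or_skip name is made up for this example.

```python
import ftplib
import os


def download_or_skip(ftp, filename, local_dir):
    """Download one entry; return True on success, False if refused.

    download_or_skip is a hypothetical helper for this sketch.
    """
    local_path = os.path.join(local_dir, filename)
    try:
        with open(local_path, 'wb') as local_file:
            ftp.retrbinary('RETR ' + filename, local_file.write)
    except ftplib.error_perm:
        # Most likely a directory or an unreadable file, so remove
        # the empty file that open() just created
        os.remove(local_path)
        return False
    return True
```

You could call this inside the loop over ftp.nlst() in place of the try/except block used above.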
There is an MLSD command that you can call via the mlsd method, but not all FTP servers support this command. If they do, then you might be able to differentiate between the two.
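On servers that do support it, the mlsd method yields (name, facts) pairs, where facts is a dictionary that may include a type entry. Here is a rough sketch of how that could be used to separate files from directories. The split_entries function is a hypothetical helper, and which facts a given server actually reports is an assumption you should verify.

```python
def split_entries(entries):
    """Split (name, facts) pairs, as yielded by FTP.mlsd(), into two lists.

    split_entries is a hypothetical helper for this sketch. Entries
    without a usable 'type' fact are left out of both lists.
    """
    files, dirs = [], []
    for name, facts in entries:
        entry_type = facts.get('type', '').lower()
        if entry_type == 'file':
            files.append(name)
        elif entry_type == 'dir':
            dirs.append(name)
        # 'cdir' and 'pdir' (current and parent directory) are ignored
    return files, dirs


# Usage, assuming the server supports MLSD:
# files, dirs = split_entries(ftp.mlsd())
```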
Uploading Files to an FTP Server
The other major task that you do with an FTP server is upload files to it. Python can handle this, too. There are actually two methods that you can use for uploading files:
- storlines – Used for uploading text files (TXT, HTML, RST).
- storbinary – Used for uploading binary files (PDF, XLS, etc).
Let's look at an example of how we might do this:
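import ftplib
import os


def ftp_upload(ftp_obj, path, ftype='TXT'):
    """
    A function for uploading files to an FTP server
    @param ftp_obj: The file transfer protocol object
    @param path: The path to the file to upload
    @param ftype: 'TXT' for text files; anything else is sent as binary
    """
    # Store the file under its base name rather than the full local path
    remote_name = os.path.basename(path)
    # ftplib expects the file to be opened in binary mode for both methods
    with open(path, 'rb') as fobj:
        if ftype == 'TXT':
            ftp_obj.storlines('STOR ' + remote_name, fobj)
        else:
            ftp_obj.storbinary('STOR ' + remote_name, fobj, 1024)


if __name__ == '__main__':
    # Passing a username and password logs us in as part of connecting
    ftp = ftplib.FTP('host', 'username', 'password')

    path = '/path/to/something.txt'
    ftp_upload(ftp, path)

    pdf_path = '/path/to/something.pdf'
    ftp_upload(ftp, pdf_path, ftype='PDF')

    ftp.quit()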
In this example, we create a function for uploading files. It takes an FTP object, the path of the file we want to upload and the type of the file. Then, we do a quick check on the file type to determine if we should use storlines or storbinary for our upload process. Finally, in our conditional statement at the bottom, we connect to the FTP server, log in and upload a text file and a PDF file. An easy enhancement to add to this is some logic for changing to a specific directory once we're logged in as we probably don't want to just upload files to the root location.
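The directory-changing enhancement mentioned above might look something like the following sketch. The upload_to_dir name is made up for illustration, and it assumes the remote directory already exists and that the file should be stored under its base name. It always uses storbinary to keep the example short.

```python
import os


def upload_to_dir(ftp_obj, remote_dir, path):
    """Change to remote_dir, then upload path in binary mode.

    upload_to_dir is a hypothetical helper; remote_dir is assumed
    to exist on the server already.
    """
    ftp_obj.cwd(remote_dir)
    remote_name = os.path.basename(path)  # drop the local directory part
    with open(path, 'rb') as fobj:
        return ftp_obj.storbinary('STOR ' + remote_name, fobj)
```

For example, upload_to_dir(ftp, 'uploads', '/path/to/something.pdf') would try to store something.pdf inside a folder named uploads, where uploads is just a placeholder name.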
At this point, you should know enough to get started using Python's ftplib. It has a lot of other methods that are well worth checking out in Python’s documentation on the module. But you now know the basics of listing a directory, navigating the folder structure, as well as downloading and uploading files.
Published at DZone with permission of Mike Driscoll, DZone MVB.