Over a million developers have joined DZone.

Python 101: How to Traverse a Directory

Python is so handy for operating system and administration work. Review how to traverse a directory here.

· Web Dev Zone

Start coding today to experience the powerful engine that drives data application’s development, brought to you in partnership with Qlik.

Every so often you will find yourself needing to write code that traverses a directory. They tend to be one-off scripts or clean up scripts that run in cron in my experience. Anyway, Python provides a very useful method of walking a directory structure that is aptly called os.walk. I usually use this functionality to go through a set of folders and sub-folders where I need to remove old files or move them into an archive directory. Let’s spend some time learning about how to traverse directories in Python!

Using os.walk

Using os.walk takes a bit of practice to get it right. Here’s an example that just prints out all your file names in the path that you passed to it:

import os

def pywalker(path):
    for root, dirs, files in os.walk(path):
        for file_ in files:
            print( os.path.join(root, file_) )

if __name__ == '__main__':
    pywalker('/path/to/some/folder')

By joining the root and the file_ elements, you end up with the full path to the file. If you want to check the date of when the file was made, then you would use os.stat. I’ve used this in the past to create a cleanup script, for example.

If all you want to do is check out a listing of the folders and files in the specified path, then you’re looking for os.listdir. Most of the time, I usually need drill down to the lowest sub-folder, so listdir isn’t good enough and I need to use os.walk instead.

Using os.scandir() Directly

Python 3.5 recently added os.scandir(), which is a new directory iteration function. You can read about it in PEP 471. In Python 3.5, os.walk is implemented using os.scandir “which makes it 3 to 5 times faster on POSIX systems and 7 to 20 times faster on Windows systems” according to the Python 3.5 announcement.

Let’s try out a simple example using os.scandir directly.

import os

folders = []
files = []

for entry in os.scandir('/'):
    if entry.is_dir():
        folders.append(entry.path)
    elif entry.is_file():
        files.append(entry.path)

print('Folders:')
print(folders)

Scandir returns an iterator of DirEntry objects which are lightweight and have handy methods that can tell you about the paths you’re iterating over. In the example above, we check to see if the entry is a file or a directory and append the item to the appropriate list. You can also a stat object via the DirEntry’s stat method, which is pretty neat!

Wrapping Up

Now you know how to traverse a directory structure in Python. If you’d like the speed improvements that os.scandir provides in a version of Python older than 3.5, you can get the scandir package on PyPI.

Create data driven applications in Qlik’s free and easy to use coding environment, brought to you in partnership with Qlik.

Topics:
python ,dictionary ,directory

Published at DZone with permission of Mike Driscoll, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

The best of DZone straight to your inbox.

SEE AN EXAMPLE
Please provide a valid email address.

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.
Subscribe

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}