
A Tool to Generate Customizable Test Data with Python


Sometimes you need a dataset to run some tests - just a bunch of data, anything - and it can be unexpectedly difficult to find something that works. There are some useful and readily available options out there; for example, Matthew Dubins has worked with the Enron email dataset and a complete list of 9/11 victims.

However, you may have more specific needs, particularly around format and fitting the structure of a database, or you may want to customize your dataset to test one thing in particular. In that case, take a look at python-testdata, a Python package for generating customizable test data. It can be set up to generate names in various forms, companies, addresses, emails, and more. The project's GitHub repository also includes some help to get started, as well as example use cases.
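To give a feel for what customizable test data generation looks like, here is a minimal sketch that uses only Python's standard library. It is not python-testdata's API; the function names (`make_record`, `generate_records`), the sample value pools, and the record fields are all assumptions made up for illustration. For the package's actual interface, see the examples in its repository.

```python
import random

# Hypothetical sample pools; swap in whatever values suit your schema.
FIRST_NAMES = ["Ada", "Grace", "Alan", "Linus", "Barbara"]
LAST_NAMES = ["Lovelace", "Hopper", "Turing", "Torvalds", "Liskov"]
COMPANIES = ["Acme Corp", "Globex", "Initech", "Umbrella Ltd"]
STREETS = ["Main St", "Oak Ave", "Pine Rd", "Maple Dr"]


def make_record(rng, record_id):
    """Build one fake user record from the sample pools above."""
    first = rng.choice(FIRST_NAMES)
    last = rng.choice(LAST_NAMES)
    return {
        "id": record_id,
        "name": f"{first} {last}",
        "company": rng.choice(COMPANIES),
        # The id is appended so every email is unique within a run.
        "email": f"{first}.{last}{record_id}@example.com".lower(),
        "address": f"{rng.randint(1, 999)} {rng.choice(STREETS)}",
    }


def generate_records(count, seed=None):
    """Yield `count` fake records; a fixed seed makes the run repeatable."""
    rng = random.Random(seed)
    for record_id in range(1, count + 1):
        yield make_record(rng, record_id)


if __name__ == "__main__":
    for row in generate_records(3, seed=42):
        print(row)
```

Passing a seed keeps the generated rows identical between runs, which helps when a failing test needs to be reproduced. Each record is a plain dictionary, so it can be handed to a database insert or written out with the `csv` or `json` modules.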

So, if you have code to test but no suitable data to test it on, python-testdata might be what you need.
