Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

A Tool to Generate Customizable Test Data with Python

DZone's Guide to

A Tool to Generate Customizable Test Data with Python

· Big Data Zone ·
Free Resource

Hortonworks Sandbox for HDP and HDF is your chance to get started on learning, developing, testing and trying out new features. Each download comes preconfigured with interactive tutorials, sample data and developments from the Apache community.

Sometimes you need a dataset to run some tests - just a bunch of data, anything - and it can be unexpectedly difficult to find something that works. There are some useful and readily-available options out there; for example, Matthew Dubins has worked with the Enron email dataset and a complete list of 9/11 victims.

However, if you have more specific needs, particularly when it comes to format and fitting within the structure of a database, and you want to customize your dataset to test one thing or another in particular, take a look at this Python package called python-testdata used to generate customizable test data. It can be set up to generate names in various forms, companies, addresses, emails, and more. The Github also includes some help to get started, as well as examples for use cases.

So, if you find that you have more solutions than you have problems to apply them to, python-testdata might be what you need.

Hortonworks Sandbox for HDP and HDF is your chance to get started on learning, developing, testing and trying out new features. Each download comes preconfigured with interactive tutorials, sample data and developments from the Apache community.

Topics:

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}