DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports Events Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Partner Zones AWS Cloud
by AWS Developer Relations
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Partner Zones
AWS Cloud
by AWS Developer Relations
The Latest "Software Integration: The Intersection of APIs, Microservices, and Cloud-Based Systems" Trend Report
Get the report

Random Sampling From a File

See the Linux commands in action that will set up a random sampling of values from a file, either with or without replacement.

John Cook user avatar by
John Cook
·
May. 28, 19 · Tutorial
Like (2)
Save
Tweet
Share
2.87K Views

Join the DZone community and get the full member experience.

Join For Free

I recently learned about the Linux command line utility shuf from browsing The Art of Command Line. This could be useful for random sampling.

Given just a file name, shuf randomly permutes the lines of the file.

With the option -n, you can specify how many lines to return. So it’s doing sampling without replacement. For example...

shuf -n 10 foo.txt


... would select 10 lines from foo.txt.

Actually, it would select at most 10 lines. You can’t select 10 lines without replacement from a file with fewer than 10 lines. If you ask for an impossible number of lines, the -n option is ignored.

You can also sample with replacement using the -r option. In that case, you can select more lines than are in the file since lines may be reused. For example, you could run ...

shuf -r -n 10 foo.txt


... to select 10 lines drawn with replacement from foo.txt, regardless of how many lines foo.txt has. For example, when I ran the command above on a file containing

alpha
beta
gamma


I got the output:

beta
gamma
gamma
beta
alpha
alpha
gamma
gamma
beta


I don’t know how shuf seeds its random generator. Maybe from the system time. But if you run it twice you will get different results. Probably.

  • Randomization that will stand up in court
  • Volatility in adaptive randomization clinical trials
  • How to test a random number generator

Published at DZone with permission of John Cook, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Popular on DZone

  • Host Hack Attempt Detection Using ELK
  • Documentation 101: How to Properly Document Your Cloud Infrastructure Project
  • Reliability Is Slowing You Down
  • Create a CLI Chatbot With the ChatGPT API and Node.js

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com
  • +1 (919) 678-0300

Let's be friends: