DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
View Events Video Library
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Integrating PostgreSQL Databases with ANF: Join this workshop to learn how to create a PostgreSQL server using Instaclustr’s managed service

Mobile Database Essentials: Assess data needs, storage requirements, and more when leveraging databases for cloud and edge applications.

Monitoring and Observability for LLMs: Datadog and Google Cloud discuss how to achieve optimal AI model performance.

Automated Testing: The latest on architecture, TDD, and the benefits of AI and low-code tools.

Related

  • What Is Encryption and How Does It Work?
  • Why Is Fuzzy Matching Software a Key for Deduplication?
  • 5 Cyber Security Risks Every Freelancer Must Avoid While Working With Foreign Clients
  • When Is It Time to Retire Your Legacy System and Go Cloud?

Trending

  • Supercharge Your Communication With Twilio and Ballerina
  • How To Learn Secure Software Development Lifecycle (SDLC)
  • Beyond the Prompt: Unmasking Prompt Injections in Large Language Models
  • A Continuous Testing Approach to Performance
  1. DZone
  2. Data Engineering
  3. AI/ML
  4. Combinatorics Beyond the Basics

Combinatorics Beyond the Basics

Stats, probability, and combinatorics come in handy for devs and engineers more than you might think. Let's take a deeper-than-average (but quick) look at combinatorics.

John Cook user avatar by
John Cook
·
May. 24, 18 · Presentation
Like (3)
Save
Tweet
Share
5.73K Views

Join the DZone community and get the full member experience.

Join For Free

Most basic combinatorial problems can be solved in terms of multiplication, permutations, and combinations. The next step beyond the basics, in my experience, is counting selections with replacement. Often when I run into a problem that is not quite transparent, it boils down to this.

Examples of Selection With Replacement

Here are three problems that reduce to counting selections with replacement.

Looking Ahead in an Experiment

For example, suppose you're running an experiment in which you randomize to n different treatments and you want to know how many ways the next k subjects can be assigned to treatments. So if you had treatments A, B, and C, and five subjects, you could assign all five to A, or four to A and one to B, etc. for a total of 21 possibilities. You're choosing 5 things from a set of 3, with replacement because you can (and will) assign the same treatment more than once.

Partial Derivatives

For another example, if you're taking the kth order partial derivatives of a function of n variables, you're choosing k things (variables to differentiate with respect to) from a set of n (the variables). Equality of mixed partials for smooth functions says that all that matters is how many times you differentiate with respect to each variable. The order doesn't matter: differentiating with respect to x and then with respect to y gives the same result as taking the derivatives in the opposite order, as long as the function you're differentiating has enough derivatives. I wrote about this example here.

Sums Over Even Terms

I recently had an expression come up that required summing over n distinct variables, each with a non-negative even value, and summing to 2 k. How many ways can that be done? As many as dividing all the variable values in half and asking that they sum to k. Here the thing being chosen, the variable, and since the indexes sum to k, I have a total of k choices to make, with replacement since I can choose a variable more than once. So again I'm choosing k things with replacement from a set of size n.

Formula

I wrote up a set of notes on sampling with replacement that I won't duplicate here, but in a nutshell:

The symbol on the left is Richard Stanley's suggested notation for the number of ways to select k things with replacement from a set of size n. It turns out that this equals the expression on the right side. The derivation isn't too long, but it's not completely obvious either. See the aforementioned notes for details.

I started by saying basic combinatorial problems can be reduced to multiplication, permutations, and combinations. Sampling with replacement can be reduced to a combination, the right side of the equation above, but with non-obvious arguments. Hence the need to introduce a new symbol, the one on the right, that maps directly to problem statements.

Enumerating Possibilities

Sometimes you just need to count how many ways one can select k things with replacement from a set of size n. But often you need to actually enumerate the possibilities, say to loop over them in software.

In conducting a clinical trial, you may want to enumerate all the ways pending data could turn out. If you're going to act the same way, no matter how the pending data work out, there's no need to wait for the missing data before proceeding. This is something I did many times when I worked at MD Anderson Cancer Center.

When you evaluate a multivariate Taylor series, you need to carry out a sum that has a term for corresponding to each partial derivative. You could naively sum over all possible derivatives, not taking into account equality of mixed partials, but that could be very inefficient, as described here.

The example above summing over even partitions of an even number also comes out of a problem with multivariate Taylor series, one in which odd terms drop out by symmetry.

To find algorithms for enumerating selections with replacement, see Knuth's TAOCP, volume 4, Fascicle 3.

Related Posts

  • Richard Stanley’s 12-fold way (pdf)
  • Big derivatives
  • Irrelevant uncertainty
Data (computing) Enumerate (project) Uncertainty IT Drops (app) Partition (database) POST (HTTP) Software Algorithm

Published at DZone with permission of John Cook, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • What Is Encryption and How Does It Work?
  • Why Is Fuzzy Matching Software a Key for Deduplication?
  • 5 Cyber Security Risks Every Freelancer Must Avoid While Working With Foreign Clients
  • When Is It Time to Retire Your Legacy System and Go Cloud?

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends: