A threshold test is a test inserted into a deployment pipeline that monitors some measurable phenomenon by comparing the value in the current build against a threshold value. Should the current build's value cross the threshold, the test fails, and thus fails the build.
A common use of threshold tests is for performance. The team takes a representative set of operations and times them. They then set a threshold test to fail should these operations take significantly longer than this current value. Thresholds like this are handy for spotting cases where a commit introduces a performance degradation. In data-intensive applications, these often occur due to badly written queries or poor use of indexing.
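A minimal sketch of such a test, in Python. The representative operations are simulated here with a sleep; in a real pipeline they would be the team's actual queries run against a test dataset, and the threshold would be set a little above the currently measured time so normal variation doesn't fail the build.

```python
import time

# Hypothetical stand-in for the team's representative operations;
# a real suite would run the actual queries against test data.
def run_representative_operations():
    time.sleep(0.01)

# Threshold in seconds, set somewhat above the currently observed
# time so that ordinary run-to-run variation doesn't fail the build.
THRESHOLD_SECONDS = 0.5

def test_performance_threshold():
    start = time.perf_counter()
    run_representative_operations()
    elapsed = time.perf_counter() - start
    # The build fails if the measured time crosses the threshold.
    assert elapsed < THRESHOLD_SECONDS, (
        f"representative operations took {elapsed:.3f}s, "
        f"over the {THRESHOLD_SECONDS}s threshold")
```

Run as part of the pipeline's test suite, a failing assertion fails the build, flagging the commit that introduced the slowdown.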
By having a threshold test, you can spot this kind of problem when it's first introduced. This makes it easier to fix, since you know which commit caused the failure, which narrows down where you have to search for the problem. Furthermore, maintaining a threshold test prevents these kinds of problems from building up in the code base. If you have many of these poor queries, it's easier for more to creep in, since their effects are obscured by the ones already present.
Threshold tests can be used with ratcheting, where you steadily tighten the threshold over time to improve the value. So, each time you add a commit that improves performance, you'd lower the threshold, locking in the gain. Using threshold tests with ratcheting is particularly helpful when you begin in a poor place and start a program of improvement.
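One way to sketch ratcheting, assuming the threshold lives in a JSON file checked into the repository (the file name and margin are illustrative). After a build measures the timing, the threshold is tightened to sit just above the new best value, but it is only ever lowered automatically, never raised.

```python
import json
from pathlib import Path

# Hypothetical threshold store: a JSON file kept under version control.
THRESHOLD_FILE = Path("perf_threshold.json")

def load_threshold(default=0.5):
    """Read the current threshold in seconds, falling back to a default."""
    if THRESHOLD_FILE.exists():
        return json.loads(THRESHOLD_FILE.read_text())["seconds"]
    return default

def ratchet(measured_seconds, margin=1.1):
    """Tighten the threshold when a build comes in well under it.

    The candidate threshold is the measured time plus a safety margin;
    we only ever lower the stored threshold, never raise it here --
    relaxing it stays a deliberate, manual decision.
    """
    current = load_threshold()
    candidate = measured_seconds * margin
    if candidate < current:
        THRESHOLD_FILE.write_text(json.dumps({"seconds": candidate}))
        return candidate
    return current
```

For example, a build that measures 0.2s would ratchet a 0.5s threshold down to 0.22s, and a later build measuring 0.3s would leave the tightened threshold untouched.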
There are times when a threshold test fails and the team decides the threshold is too aggressive, and relaxes it. Depending on the circumstances, that may be the right thing to do -- the threshold test helps here by making this decision a conscious one.