In this week's Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, gets you up to speed on the t-digest, an algorithm you can add to any anomaly detector to set the number of alarms you get as a percentage of the total samples. It estimates percentiles very accurately, especially high or low percentiles, so you can place the alarm threshold at the percentile that corresponds to that percentage.
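To make the idea of a percentile-based alarm threshold concrete, here is a minimal sketch in Python. It is not the t-digest itself: it sorts all the scores and picks an exact quantile, whereas the t-digest estimates the same quantile from a compact summary. The function name `percentile_threshold`, the `alarm_fraction` parameter, and the synthetic anomaly scores are all assumptions made for illustration.

```python
import math
import random


def percentile_threshold(scores, alarm_fraction):
    """Return a cutoff such that roughly alarm_fraction of scores exceed it.

    Exact, sort-based stand-in for a quantile estimate (hypothetical helper,
    not part of any t-digest library).
    """
    if not 0.0 < alarm_fraction < 1.0:
        raise ValueError("alarm_fraction must be strictly between 0 and 1")
    ordered = sorted(scores)
    # Nearest-rank quantile at q = 1 - alarm_fraction.
    rank = math.ceil((1.0 - alarm_fraction) * len(ordered))
    rank = min(max(rank, 1), len(ordered))
    return ordered[rank - 1]


# Synthetic anomaly scores (assumed data, for illustration only).
random.seed(42)
scores = [random.expovariate(1.0) for _ in range(100_000)]

# Ask for alarms on about 0.1% of samples, i.e. the 99.9th percentile.
threshold = percentile_threshold(scores, 0.001)
alarms = sum(1 for s in scores if s > threshold)
print(f"threshold={threshold:.3f}, alarms={alarms} of {len(scores)}")
```

With the threshold set this way, the alarm rate stays near the requested fraction regardless of how the scores are distributed, which is the property the walkthrough relies on.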
Take a look at Ted's GitHub repository for the t-digest.
We also found a great blog post on the t-digest.
The concept is also described in great detail in Ted Dunning and Ellen Friedman's book Practical Machine Learning: A New Look at Anomaly Detection.