Identifying outliers
An outlier is a data point in your dataset that is out of place or doesn’t fit well with other data. For example, we might collect data about our daily revenue and see that each day we consistently have $1,000 in total sales. If we then have a day where our daily revenue is about $6,000, then that would be an outlier if the daily revenue then goes back down to $1,000. Outliers can be either positive or negative; in fact, they are just any deviation from a typical or expected result. It is important to identify and deal with outliers because they can cause problems when we try to make business decisions as they skew decisions; when outliers have been properly identified, they will help ensure the higher accuracy of insights gained from your data. You can define your calculations on what an outlier is. You can create measures to define what would be considered anomalous in your dataset.
Math
In mathematics, outliers are often defined as being more than 2 standard deviations away from the mean. As this is not a math course, I will not go into any more detail about it.
Calculated columns can be used to identify outliers; however, since we know calculated columns are only updated when data in the model is refreshed, the more dynamic approach would be using a visualization or DAX formula. After outliers have been identified, the most common next step would be to visualize them, likely using a scatter chart, as shown in Figure 13.1. Outliers can be highlighted or made easier to notice using filters or slicers:

Figure 13.1 – Scatter plots can quickly identify outliers across multiple dimensions
By combining DAX measures and visualizations, you will be able to spot outliers and address them quickly. Just remember, not every outlier is a bad data point. Sometimes the real world conspires against our mathematical models. For example, as we previously discussed, you may have daily sales revenue data that is typically about $1,000 each day. On a day the daily sales are $6,000, it might have actually been the start of a huge new promotion, so just because it looks like an anomaly, doesn’t immediately mean it’s an outlier. We have to take other factors into consideration that might have caused that anomaly.
Now that we have learned all about the basics of finding outliers, we will explore a unique way to do so, using anomaly detection.