Finds outliers in a dataset. Uses the median absolute deviation (MAD) to flag values that lie further than a given multiple of the MAD from the median. See stats::mad() for more information.

Usage

find_outliers(x, col, group_col = NULL, threshold = 10)

Arguments

x

A data.frame with the data

col

The column in which to look for outliers

group_col

The column to group by. Optional; defaults to NULL.

threshold

The multiplier applied to the MAD. Values further than threshold * MAD from the median are flagged as outliers. Defaults to 10.
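
The threshold rule can be illustrated on a plain numeric vector. The following is a minimal sketch, not the package's implementation: flag_outliers_mad() is a hypothetical helper, and it assumes that the default scaling constant of stats::mad() (1.4826) is used and that the comparison is strict.

```r
# Hypothetical helper illustrating the rule; not part of the package.
flag_outliers_mad <- function(values, threshold = 10) {
  med <- stats::median(values, na.rm = TRUE)
  # stats::mad() multiplies the raw MAD by 1.4826 by default (assumed here).
  mad_value <- stats::mad(values, na.rm = TRUE)
  # TRUE where the distance from the median exceeds threshold * MAD.
  abs(values - med) > threshold * mad_value
}

values <- c(1, 2, 3, 4, 5, 100)
flags <- flag_outliers_mad(values)
# Only the last value (100) is flagged.
print(flags)
```

With the default threshold of 10, only values very far from the median are flagged; a smaller threshold would flag more values.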

Value

A copy of the input data.frame with three added columns: .median (the median), .mad (the MAD) and .outlier (a logical flag that is TRUE for outliers).
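
To show what such columns might contain when a grouping column is supplied, here is a base R sketch that builds them by hand on toy data. It assumes that the median and MAD are computed separately within each group; the data frame toy and the use of ave() are illustrative choices, not taken from the package.

```r
# Toy data for illustration; not one of the package's datasets.
toy <- data.frame(
  y = c(1, 2, 3, 4, 100, 10, 11, 12, 13, 14),
  group = rep(c("a", "b"), each = 5)
)
threshold <- 10

# ave() applies FUN within each group and returns a vector aligned to the rows.
toy$.median <- ave(toy$y, toy$group, FUN = stats::median)
toy$.mad <- ave(toy$y, toy$group, FUN = stats::mad)
toy$.outlier <- abs(toy$y - toy$.median) > threshold * toy$.mad

# Only the row with y = 100 in group "a" is flagged.
print(toy[toy$.outlier, ])
```

Computing the statistics per group means a value is judged against its own series, so a value that is ordinary in one group can still be flagged in another.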

Examples

silk_data1 |>
  find_outliers("y", group_col = "group") |>
  head()
#>   time         y   group  .median     .mad .outlier
#> 1    1  8.584244 series1 8.065949 2.713192    FALSE
#> 2    2  9.159694 series1 8.065949 2.713192    FALSE
#> 3    3  9.717704 series1 8.065949 2.713192    FALSE
#> 4    4 10.249923 series1 8.065949 2.713192    FALSE
#> 5    5 10.748432 series1 8.065949 2.713192    FALSE
#> 6    6 11.205878 series1 8.065949 2.713192    FALSE