h2(#description). Description

This template checks whether the provided variable has any outliers.

h3(#introduction). Introduction

An outlying observation, or outlier, is one that appears to deviate markedly from other members of the sample in which it occurs.

There are several ways to detect outliers in data. None of them can be called the perfect method, so it is useful to take different methods into consideration. We present four of them here: one graphical (a box plot based on the IQR) and three statistical tests (Lund test, Grubbs' test, Dixon's test).

h4(#references). References

* Grubbs, F. E.: 1969, Procedures for detecting outlying observations in samples. Technometrics 11, pp. 1-21.

h3(#charts). Charts

Among graphical displays, box plots are widespread because of their several advantages. For example, they give a quick and reasonably accurate first impression of the data, and they show the positions of the (possible) outliers.

The box plot used here is based on the IQR (interquartile range), which is the difference between the upper and the lower quartiles. On the chart, the blue box shows the "middle half" of the data, and the so-called whiskers show the borders beyond which values can be called outliers. The lower whisker is placed 1.5 times the IQR below the first quartile; similarly, the upper whisker is placed 1.5 times the IQR above the third quartile.

"!plots/OutlierTest-1.png(Boxplot: edu)!":plots/OutlierTest-1-hires.png

h4(#references-1). References

* Chambers, John, William Cleveland, Beat Kleiner, and Paul Tukey, (1983), Graphical Methods for Data Analysis, Wadsworth.
* Upton, Graham; Cook, Ian (1996). Understanding Statistics. Oxford University Press. p. 55.

h3(#lund-test). Lund test

It seems that _4_ extreme values can be found in "Internet usage for educational purposes (hours per day)". These are: _10_, _0.5_, _1.5_ and _0.5_.

h4(#explanation). Explanation

The above test for outliers was based on _lm(edu ~ 1)_:

Linear model: edu ~ 1

|_. |_. Estimate |_. Std. Error |_. t value |_. Pr(>&#124;t&#124;) |
| *(Intercept)* | 2.048 | 0.07797 | 26.27 | 7.939e-105 |

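
For a model that contains only an intercept, such as _lm(edu ~ 1)_, the estimate is the sample mean, its standard error is the sample standard deviation divided by the square root of the sample size, and the t value is their ratio (in the table above, 2.048 / 0.07797 ≈ 26.27). The Python sketch below reproduces these quantities together with the standardized residuals, assuming that a Lund-type test compares the largest absolute standardized residual with a tabulated critical value. The function name @intercept_only_fit@ and the sample data are hypothetical, the report itself was produced in R, and the critical value from Lund's tables is not computed here.

```python
import math
import statistics


def intercept_only_fit(values):
    """Return (estimate, std_error, t_value, standardized_residuals)
    for a linear model that contains only an intercept."""
    n = len(values)
    estimate = statistics.fmean(values)   # intercept equals the sample mean
    s = statistics.stdev(values)          # residual standard error, n - 1 df
    std_error = s / math.sqrt(n)
    t_value = estimate / std_error
    # In an intercept-only model every observation has leverage 1/n.
    scale = s * math.sqrt(1 - 1 / n)
    residuals = [(v - estimate) / scale for v in values]
    return estimate, std_error, t_value, residuals


# Hypothetical data, not the values behind the report.
hours = [1.0, 2.0, 2.0, 3.0]
estimate, std_error, t_value, residuals = intercept_only_fit(hours)
print(estimate, std_error, t_value)
print(residuals)
```

Observations whose absolute standardized residual is unusually large are the candidates that an outlier test would examine further.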
h4(#references-2). References

* Lund, R. E. 1975, "Tables for An Approximate Test for Outliers in Linear Models", Technometrics, vol. 17, no. 4, pp. 473-476.
* Prescott, P. 1975, "An Approximate Test for Outliers in Linear Models", Technometrics, vol. 17, no. 1, pp. 129-132.

h3(#grubbs-test). Grubbs' test

Grubbs' test for one outlier shows that the highest value, 12, is an outlier (p=_0.0001964_).

h4(#references-3). References

* Grubbs, F.E. (1950). Sample Criteria for testing outlying observations. Ann. Math. Stat. 21, 1, 27-58.

h3(#dixons-test). Dixon's test

The chi-squared test for an outlier shows that the highest value, 12, is an outlier (p=_7.441e-07_).

h4(#references-4). References

* Dixon, W.J. (1950). Analysis of extreme values. Ann. Math. Stat. 21, 4, 488-506.
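
The 1.5 times IQR rule described in the Charts section can be sketched in a few lines of Python. This is a minimal illustration, not the code that drew the box plot in this report: the function names @iqr_fences@ and @flag_outliers@ and the sample data are hypothetical, and the quartiles come from @statistics.quantiles@ with the @inclusive@ method, which may differ slightly from the quartile definition a plotting routine uses.

```python
import statistics


def iqr_fences(values, k=1.5):
    """Return the (lower, upper) limits located k * IQR outside the quartiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, k=1.5):
    """Return the values that fall outside the IQR fences."""
    lower, upper = iqr_fences(values, k)
    return [v for v in values if v < lower or v > upper]


# Hypothetical hours-per-day data, not the values behind the report.
hours = [0.5, 1, 1, 1.5, 2, 2, 2.5, 3, 3, 12]
lower, upper = iqr_fences(hours)
print(lower, upper)          # -1.5 5.5
print(flag_outliers(hours))  # [12]
```

Changing @k@ makes the rule stricter or more lenient; 1.5 is the value used for the whiskers described in this report.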

This report was generated with "R":http://www.r-project.org/ (3.0.1) and "rapport":https://rapporter.github.io/rapport/ (0.51) in _1.082_ sec on x86_64-unknown-linux-gnu platform. !images/logo.png!