Statistics for IT Managers

Author: Kishore R • July 29, 2016 • Study Guide • 7,268 Words (30 Pages) • 1,078 Views

STATISTICS FOR IT MANAGERS

STUDY GUIDE

This study guide is intended to give you an idea of the content I have in mind when writing the comprehensive final exam. It is not meant to be an exhaustive list of the topics or ideas that can appear, but it is designed as a general guideline. That said, a question on the final will pertain to content not covered in this guide only by accident.

Descriptive Statistics

Measures of central tendency pertain to the data’s tendency to cluster or center about certain values

The mean is the average of values: We sum the observations and divide by the number of observations
We label the n observations from a sample as x1, x2, ..., xn, where x1 is the first, x2 is the second, and xn is the last

Sample mean:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$

Population mean:

$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$

Median: The middle observation when data are placed in ascending or descending order

Mode: This is the observation that occurs with greatest frequency
A dataset is skewed if one tail has more extreme observations than the other tail

Right-skewed data: The right tail or the high end of the distribution has more extreme observations
Left-skewed data: The left tail or the low end of the distribution has more extreme observations.
In left-skewed data, the extreme values tend to pull the mean leftward, below the median; in right-skewed data, they pull the mean rightward, above the median.
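As a small illustration of these measures, the sketch below uses Python's standard `statistics` module on a made-up right-skewed sample. The data values are hypothetical and not taken from the course.

```python
import statistics

# Hypothetical right-skewed sample: one extreme value in the high tail.
data = [2, 3, 3, 4, 5, 6, 40]

mean = statistics.mean(data)      # sum of the observations divided by their count
median = statistics.median(data)  # middle observation of the sorted data
mode = statistics.mode(data)      # observation that occurs most often

print("mean:", mean, "median:", median, "mode:", mode)

# In right-skewed data the extreme high values pull the mean above the median.
print("mean above median:", mean > median)
```

Comparing the mean with the median is a quick, informal check for skew; it does not replace looking at a histogram.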

Measures of variability pertain to the spread of the data.

The two most important measures of variability are variance and standard deviation

The sample variance for a sample of n measurements is equal to the sum of squared deviations from the mean divided by (n − 1).

$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

The population variance:

$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$

The standard deviation is the square root of the variance

$s = \sqrt{s^2}$
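The variance and standard deviation definitions can be checked by hand on a small sample. The sketch below computes them directly from the definitions and compares the results with Python's `statistics` module. The data are hypothetical.

```python
import math
import statistics

data = [4, 8, 6, 5, 7]  # hypothetical measurements

n = len(data)
x_bar = sum(data) / n

# Sum of squared deviations from the mean.
ss = sum((x - x_bar) ** 2 for x in data)

s2 = ss / (n - 1)   # sample variance: divide by n - 1
sigma2 = ss / n     # population variance, if these n values were the whole population
s = math.sqrt(s2)   # sample standard deviation

print("sample variance:", s2)
print("population variance:", sigma2)
print("sample standard deviation:", round(s, 4))

# The library functions follow the same definitions.
print(math.isclose(s2, statistics.variance(data)))
print(math.isclose(sigma2, statistics.pvariance(data)))
```

The only difference between the two variances is the divisor, so the sample variance is always the larger of the two for the same data.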

Empirical Rule:

(1) Approximately 68% of all observations fall within 1 s.d. of the mean
(2) Approximately 95% of all observations fall within 2 s.d. of the mean
(3) Approximately 99.7% of all observations fall within 3 s.d. of the mean
If the data are not bell-shaped, the empirical rule does not apply

Regardless of the shape of the distribution, we can use Chebysheff’s Theorem: the proportion of observations in any sample or population that lies within k standard deviations of the mean is at least $1 - \frac{1}{k^2}$ for k > 1

Ex. If k = 2, then at least 75% of observations are within 2 s.d. of the mean
Ex. If k = 3, then at least 88.9% of observations are within 3 s.d. of the mean
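To see how the two rules relate, the sketch below evaluates the Chebysheff lower bound 1 − 1/k² and, for comparison, the exact proportion of a normal distribution within k standard deviations of the mean. The function names are illustrative only.

```python
from statistics import NormalDist

def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k s.d. of the mean (needs k > 1)."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k**2

def normal_proportion(k):
    """Proportion of a normal distribution within k s.d. of its mean."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

for k in (2, 3):
    print(k, round(chebyshev_lower_bound(k), 3), round(normal_proportion(k), 4))
```

The Chebysheff value is a guaranteed minimum for any distribution, so it is lower than the figures the empirical rule gives for bell-shaped data.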

Box plot:

A box plot uses five statistics to represent the data:
The minimum and maximum observations help us construct vertical (horizontal) lines called “whiskers”
The 1st, 2nd, and 3rd quartiles are represented by 3 horizontal (vertical) lines.

Any point outside the whiskers is an outlier

Each whisker extends to the most extreme observation that is not an outlier, so it reaches at most 1.5 times the IQR beyond the box.

To build a box plot, draw the ends (hinges) at the lower and upper quartiles, Q1 and Q3 (or Ql and Qu), respectively

The points at distances of 1.5× IQR from each hinge define the inner fences

Lines (whiskers) are drawn from each hinge to the most extreme measurement inside the inner fence:

Lower inner fence: Q1 − 1.5(IQR)
Upper inner fence: Q3 + 1.5(IQR)

Outer fences lie at a distance of 3 × IQR from the hinges.

Lower outer fence: Q1 − 3(IQR)
Upper outer fence: Q3 + 3(IQR)

No lines are drawn at the fences themselves. A symbol like ∗ might indicate an observation between the inner and outer fences.

A symbol like 0 might indicate observations beyond outer fences.

We can identify suspected outliers using the boxplot:

Observations between the inner and outer fences, that is, between 1.5(IQR) and 3(IQR) from the hinges, are suspicious

Observations beyond the outer fences, more than 3(IQR) from the hinges, are very suspicious
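The fence rules can be sketched in a few lines of Python. This is an illustration under one assumption: quartile conventions differ between textbooks and software, and the code uses the default ("exclusive") method of `statistics.quantiles`, which may not match the method used in class. The function name and data are hypothetical.

```python
import statistics

def classify_by_fences(data):
    """Return inner fences, outer fences, and the observations beyond each."""
    # Assumption: quartiles from the default method of statistics.quantiles.
    q1, _median, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1

    inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    outer = (q1 - 3 * iqr, q3 + 3 * iqr)

    suspicious, very_suspicious = [], []
    for x in data:
        if x < outer[0] or x > outer[1]:
            very_suspicious.append(x)   # beyond the outer fences
        elif x < inner[0] or x > inner[1]:
            suspicious.append(x)        # between inner and outer fences
    return inner, outer, suspicious, very_suspicious

data = [10, 12, 13, 14, 15, 16, 17, 18, 20, 35, 60]  # hypothetical sample
inner, outer, suspicious, very_suspicious = classify_by_fences(data)

print("inner fences:", inner)
print("outer fences:", outer)
print("suspicious:", suspicious)
print("very suspicious:", very_suspicious)
```

For this sample Q1 = 13 and Q3 = 20, so IQR = 7. The value 35 falls between the inner and outer fences, and 60 lies beyond the upper outer fence.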

We can also use the z−score to find outliers:

Sample z−score:

$z = \frac{x - \bar{x}}{s}$

Population z−score:

$z = \frac{x - \mu}{\sigma}$

|z| > 2 indicates possible outlier
|z| > 3 indicates an outlier
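A minimal sketch of the z-score rule, using the sample mean and sample standard deviation. The function name and data are hypothetical.

```python
import statistics

def flag_outliers(data):
    """Return (value, label) pairs for observations with |z| > 2."""
    x_bar = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1 divisor)

    flagged = []
    for x in data:
        z = (x - x_bar) / s
        if abs(z) > 3:
            flagged.append((x, "outlier"))
        elif abs(z) > 2:
            flagged.append((x, "possible outlier"))
    return flagged

data = [10, 12, 11, 13, 12, 11, 10, 12, 11, 13, 12, 30]  # hypothetical sample
print(flag_outliers(data))
```

Note that an extreme value also inflates s, which shrinks its own z-score. In a sample of n observations, |z| can never exceed (n − 1)/√n, so very small samples cannot produce |z| > 3 at all.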

Normal Distribution

The normal distribution is symmetric about its mean, µ
Its spread is determined by the standard deviation, σ
To graph the distribution, we need µ and σ.
To find the probability a normal random variable falls into an interval we must compute the area in the interval under the curve

We reduce the number of tables by standardizing: subtract the mean and divide by the standard deviation
When the variable is normal, the transformed variable is called a standard normal random variable and denoted Z, i.e. $Z = \frac{X - \mu}{\sigma}$
The probability statement about X is transformed into one about Z.
The standard normal distribution is the normal distribution with µ = 0 and σ = 1.

A random variable with the standard normal distribution is denoted z and called a standard normal random variable.

∗ Ex. P (x ≤ 12) = P (z ≤ 1.25) = 0.8944

∗ Ex. P (x > 12) = P (z > 1.25) = 0.1056

∗ Ex. P (x ≤ −12) = P (z ≤ −1.25) = 0.1056

∗ Ex. P (−1.04 ≤ x ≤ 12) = P (−0.38 ≤ z ≤ 1.25) = 0.8944 − 0.3520 = 0.5424
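Probabilities like these can be checked without a table using `statistics.NormalDist`. The guide does not state the mean and standard deviation behind the examples. The sketch assumes µ = 2 and σ = 8, which is one choice that reproduces the z-values 1.25 and −0.38.

```python
from statistics import NormalDist

# Assumed parameters (not given in the guide): with mu = 2 and sigma = 8,
# x = 12 standardizes to z = 1.25 and x = -1.04 standardizes to z = -0.38.
mu, sigma = 2, 8
X = NormalDist(mu, sigma)
Z = NormalDist()  # standard normal: mean 0, standard deviation 1

z_hi = (12 - mu) / sigma
z_lo = (-1.04 - mu) / sigma

p_le = Z.cdf(z_hi)                     # P(z <= 1.25)
p_gt = 1 - Z.cdf(z_hi)                 # P(z > 1.25), by the complement rule
p_between = Z.cdf(z_hi) - Z.cdf(z_lo)  # P(-0.38 <= z <= 1.25)

print(round(p_le, 4), round(p_gt, 4), round(p_between, 4))

# Standardizing does not change the probability: X.cdf(12) equals Z.cdf(1.25).
print(abs(X.cdf(12) - p_le) < 1e-12)
```

An interval probability is always the difference of two cumulative probabilities, so it must be smaller than the cumulative probability at the upper endpoint.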

...
