Friday, July 13, 2018

Measures of Spread, Outliers, and Histogram

Measures of Spread
1. Range
2. Interquartile Range (IQR)
3. Standard Deviation
4. Variance

Histograms
- center
- spread
- shape
- outliers

Five Number Summary
Minimum: The smallest number in the dataset.
Q1: The value such that 25% of the data fall below.
Q2(= Median): The value such that 50% of the data fall below.
Q3: The value such that 75% of the data fall below.
Maximum: The largest value in the dataset.


Range
Max - Min

Interquartile Range
Q3 - Q1
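
As a rough illustration, the five number summary, range, and IQR can be computed with Python's standard library. This is only a sketch: the `scores` list is made-up data, `five_number_summary` is a hypothetical helper name, and the choice of `method="inclusive"` is an assumption, since textbooks and libraries use slightly different quartile conventions and may give slightly different Q1 and Q3 values.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of at least two numbers."""
    data = sorted(values)
    # method="inclusive" treats the data as the whole population, so the
    # quartiles always fall between the minimum and the maximum.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, q2, q3, data[-1]

scores = [7, 1, 9, 3, 5, 2, 8, 4, 6]  # made-up example data
minimum, q1, median, q3, maximum = five_number_summary(scores)

value_range = maximum - minimum  # Range = Max - Min
iqr = q3 - q1                    # Interquartile Range = Q3 - Q1

print("Five number summary:", minimum, q1, median, q3, maximum)
print("Range:", value_range, "IQR:", iqr)
```

For this data the summary is 1, 3.0, 5.0, 7.0, 9, so the range is 8 and the IQR is 4.0.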

Box Plot
Useful for quickly comparing the spread of two data sets




















Standard Deviation
is one of the most common measures of the spread of data.
It can be read as roughly the average distance of each observation from the mean; formally, it is the square root of the variance.


Variance:
The average squared difference of each observation from the mean
\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2
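
The definition above (the average squared difference from the mean) can be sketched directly in Python. The data values and the helper name `population_variance` are assumptions for illustration. This divides by n, which matches `statistics.pvariance`; the sample versions (`statistics.variance`, `statistics.stdev`) divide by n - 1 instead.

```python
import math
import statistics

def population_variance(values):
    """Average squared difference of each observation from the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example data with mean 5

variance = population_variance(data)
std_dev = math.sqrt(variance)  # standard deviation is the square root of the variance

print("Variance:", variance)
print("Standard deviation:", std_dev)

# The standard library gives the same population values.
print(statistics.pvariance(data), statistics.pstdev(data))
```

Here the squared differences sum to 32 over 8 observations, so the variance is 4.0 and the standard deviation is 2.0.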



Shape of Histogram









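Left Skewed: mean < median
-> "Skewed left" means the long tail extends to the left: most points sit on the right, and a few small values pull the mean below the median.
Right Skewed: mean > median
-> The long tail extends to the right: most points sit on the left, and a few large values pull the mean above the median.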
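
A small sketch of the mean versus median relationship, using made-up numbers. `right_skewed` has a long tail toward large values, and `left_skewed` is simply its mirror image; both lists are assumptions chosen for illustration.

```python
import statistics

# Most values are small, with one large value stretching the tail to the right.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]
# Mirror image: most values are large, with one small value stretching the tail to the left.
left_skewed = [21 - x for x in right_skewed]

right_mean = statistics.mean(right_skewed)
right_median = statistics.median(right_skewed)
left_mean = statistics.mean(left_skewed)
left_median = statistics.median(left_skewed)

print("Right skewed: mean", right_mean, "median", right_median)
print("Left skewed:  mean", left_mean, "median", left_median)
```

In the right-skewed list the mean (4.75) lands above the median (3), and in the left-skewed list the mean (16.25) lands below the median (18).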

  • When our data follow a normal distribution, the mean and standard deviation are enough to describe the dataset.
  • However, if our dataset is skewed, the five number summary (and the median, the measure of center associated with it) might summarize the dataset better.


Outliers
Points that fall very far from the rest of the data points.

Common Techniques for Outliers
1. Noting they exist and the impact on summary statistics.
2. If an outlier is a typo, remove or fix it.
3. Understanding why they exist, and the impact on questions we are trying to answer about our data.
4. Reporting the 5 number summary values is often a better indication than measures like the mean and standard deviation when we have outliers.
5. Be careful in reporting. Know how to ask the right questions.
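
To see why the five number summary values can hold up better than the mean and standard deviation, here is a minimal sketch comparing the same data with and without one outlier. The numbers, the `summarize` helper, and the "typo" scenario are all hypothetical, and the quartile method is an assumed convention.

```python
import statistics

def summarize(values):
    """Return the mean, standard deviation, median, and IQR of the values."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "std_dev": statistics.pstdev(values),
        "median": median,
        "iqr": q3 - q1,
    }

clean = [10, 12, 13, 14, 15, 16, 18]  # made-up example data
with_outlier = clean + [150]          # hypothetical typo, e.g. 15 entered as 150

before = summarize(clean)
after = summarize(with_outlier)

for name in ("mean", "std_dev", "median", "iqr"):
    print(f"{name:8} before: {before[name]:8.2f}  after: {after[name]:8.2f}")
```

With this data the mean jumps from 14 to 31 and the standard deviation grows many times over, while the median only moves from 14 to 14.5 and the IQR from 3 to 3.75.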
