Wednesday, July 23, 2025

Statistics Overview

Statistics is the science of collecting, analyzing, interpreting, and presenting data. Below is an overview of its main subtopics, each with a brief explanation and a practical example.

1. Descriptive Statistics

Descriptive statistics summarize and describe the main features of a dataset using numerical measures or visualizations.

a. Measures of Central Tendency

These describe the center of a dataset.

  • Mean (Average): Sum of values divided by the number of values.
    • Example: The test scores of 5 students are 85, 90, 78, 92, and 88. The mean is \( \frac{85 + 90 + 78 + 92 + 88}{5} = 86.6 \).
  • Median: The middle value when data is ordered.
    • Example: For the scores 78, 85, 88, 90, 92, the median is 88.
  • Mode: The most frequent value in a dataset.
    • Example: In the dataset {3, 5, 5, 7, 8}, the mode is 5 (appears twice).
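
The three measures above can be checked with Python's standard `statistics` module. This is a minimal sketch that reuses the numbers from the examples; the variable names are illustrative.

```python
import statistics

scores = [85, 90, 78, 92, 88]  # test scores from the mean example
values = [3, 5, 5, 7, 8]       # dataset from the mode example

mean_score = statistics.mean(scores)      # sum of values / number of values
median_score = statistics.median(scores)  # sorts internally, then takes the middle value
mode_value = statistics.mode(values)      # most frequent value

print(mean_score, median_score, mode_value)
```

Note that `statistics.median` sorts the data itself, so the scores do not need to be ordered first.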

b. Measures of Dispersion

These describe the spread or variability of the data.

  • Range: Difference between the maximum and minimum values.
    • Example: For the scores 78, 85, 88, 90, 92, the range is \( 92 - 78 = 14 \).
  • Variance: Average of squared deviations from the mean.
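    • Example: For the dataset {2, 4, 6}, the mean is 4. Dividing by the number of values (the population variance) gives Variance = \( \frac{(2-4)^2 + (4-4)^2 + (6-4)^2}{3} = \frac{4 + 0 + 4}{3} \approx 2.67 \). A sample variance would divide by \( n - 1 = 2 \) instead.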
  • Standard Deviation: Square root of variance.
    • Example: Using the variance 2.67, the standard deviation is \( \sqrt{2.67} \approx 1.63 \).
  • Interquartile Range (IQR): Difference between the 75th percentile (Q3) and 25th percentile (Q1).
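    • Example: For the dataset {1, 3, 5, 7, 9}, taking Q1 and Q3 as the medians of the lower half {1, 3} and the upper half {7, 9} gives Q1 = 2 and Q3 = 8, so IQR = \( 8 - 2 = 6 \). Other quartile conventions can give slightly different values.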
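
The dispersion measures can be reproduced with the standard library as well. The sketch below assumes the population forms of variance and standard deviation (dividing by n), which is what the worked example uses. For the quartiles it relies on the default "exclusive" method of `statistics.quantiles`, which happens to match Q1 = 2 and Q3 = 8 for this dataset; the "inclusive" method would give 3 and 7 instead.

```python
import statistics

scores = [78, 85, 88, 90, 92]
data = [2, 4, 6]
quartile_data = [1, 3, 5, 7, 9]

value_range = max(scores) - min(scores)
variance = statistics.pvariance(data)  # population variance: divides by n
std_dev = statistics.pstdev(data)      # square root of the population variance

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
q1, _, q3 = statistics.quantiles(quartile_data, n=4)
iqr = q3 - q1

print(value_range, round(variance, 2), round(std_dev, 2), iqr)
```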

c. Data Visualization

Graphical representations to summarize data.

  • Histogram: Displays the distribution of a continuous variable.
    • Example: A histogram of student exam scores (e.g., 60–70, 70–80, etc.) shows how many students fall into each score range.
  • Bar Chart: Represents categorical data with bars.
    • Example: A bar chart showing the number of students in each major (e.g., 50 in Biology, 30 in Physics, 20 in Math).
  • Box Plot: Visualizes the spread and identifies outliers using quartiles.
    • Example: A box plot of salaries in a company shows the median salary, IQR, and outliers (e.g., an unusually high CEO salary).
  • Scatter Plot: Shows relationships between two continuous variables.
    • Example: A scatter plot of hours studied vs. exam scores to see if more study time correlates with higher scores.
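
As a rough illustration of the bar chart idea, the sketch below draws the student-major counts from the example as a text chart, using only the standard library. The helper name `text_bar_chart` and the scale of one `#` per 5 students are arbitrary choices for this sketch; a plotting library would normally be used for real figures.

```python
majors = {"Biology": 50, "Physics": 30, "Math": 20}  # counts from the bar chart example

def text_bar_chart(counts, scale=5):
    """Return one line per category, drawing one '#' per `scale` units."""
    width = max(len(name) for name in counts)
    lines = []
    for name, count in counts.items():
        bar = "#" * (count // scale)
        lines.append(f"{name.ljust(width)} | {bar} {count}")
    return lines

chart = text_bar_chart(majors)
print("\n".join(chart))
```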

2. Inferential Statistics

Inferential statistics use sample data to make generalizations or predictions about a population.

a. Hypothesis Testing

Tests claims about a population based on sample data.

  • Example: A company claims its new drug lowers blood pressure. A t-test compares the average blood pressure of a sample taking the drug (e.g., mean = 120 mmHg) with that of a control group (e.g., mean = 130 mmHg) to determine whether the difference is statistically significant, conventionally judged by \( p < 0.05 \).
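
To make the t-test concrete, here is a minimal sketch that computes Welch's t statistic (the unequal-variance form of the two-sample t-test) by hand. The individual readings are hypothetical, chosen only so that the group means match the 120 and 130 mmHg in the example. Turning the statistic into a p-value requires the t distribution, which the standard library does not provide, so the sketch stops at the statistic and its degrees of freedom.

```python
import math
import statistics

# Hypothetical readings (mmHg); only the means (120 and 130) come from the example.
drug = [118, 122, 121, 119, 120, 123, 117, 120]
control = [131, 128, 130, 132, 129, 133, 127, 130]

def welch_t(a, b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    va = statistics.variance(a) / len(a)  # squared standard error of mean(a)
    vb = statistics.variance(b) / len(b)  # squared standard error of mean(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

t_stat, df = welch_t(drug, control)
print(t_stat, df)
```

With these made-up readings the statistic is far beyond the two-sided 5% critical value for 14 degrees of freedom (about 2.145), so the difference would be called significant. With real data the readings, and therefore the conclusion, would differ.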

b. Confidence Intervals

Estimates a range within which a population parameter likely lies.

  • Example: A survey finds that 60% of 1,000 voters support a candidate, with a 95% confidence interval of 57% to 63%.
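
The interval in this example can be reproduced with the normal-approximation (Wald) formula for a proportion, \( \hat{p} \pm z \sqrt{\hat{p}(1-\hat{p})/n} \). The sketch below assumes that formula; the source does not say which method the survey used, and other interval methods give slightly different endpoints.

```python
import math
from statistics import NormalDist

def proportion_ci(p_hat, n, confidence=0.95):
    """Normal-approximation confidence interval for a proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

low, high = proportion_ci(0.60, 1000)
print(f"{low:.1%} to {high:.1%}")
```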

c. Regression Analysis

Models the relationship between variables.

  • Simple Linear Regression: Predicts a dependent variable based on one independent variable.
    • Example: Predicting house prices based on square footage. A regression model might find: \( \text{Price} = 50{,}000 + 200 \times \text{SquareFeet} \).
  • Multiple Regression: Uses multiple independent variables.
    • Example: Predicting house prices using square footage, number of bedrooms, and location.
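
For simple linear regression, the slope and intercept come from the ordinary least squares formulas. The sketch below fits them by hand; the four listings are hypothetical and were constructed to lie exactly on the example line, so the fit recovers an intercept of 50,000 and a slope of 200. Real data would be noisy and would not fit perfectly.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical listings that lie exactly on Price = 50,000 + 200 * SquareFeet.
square_feet = [1000, 1500, 2000, 2500]
prices = [250_000, 350_000, 450_000, 550_000]

intercept, slope = fit_line(square_feet, prices)
predicted = intercept + slope * 1800  # predicted price of an 1,800 sq ft house
print(intercept, slope, predicted)
```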

d. Analysis of Variance (ANOVA)

Compares means across multiple groups.

  • Example: Testing whether three different teaching methods (lecture, online, hybrid) lead to different average test scores among students.
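
One-way ANOVA compares the variation between group means with the variation inside the groups, summarized as an F statistic. The following sketch computes that statistic by hand; the test scores for the three teaching methods are hypothetical.

```python
import statistics

def one_way_anova_f(groups):
    """Return the F statistic and its (between, within) degrees of freedom."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical scores for the three teaching methods.
lecture = [70, 75, 80]
online = [80, 85, 90]
hybrid = [90, 95, 100]

f_stat, df_between, df_within = one_way_anova_f([lecture, online, hybrid])
print(f_stat, df_between, df_within)
```

A large F value suggests that at least one group mean differs from the others. For these made-up scores, F exceeds the 5% critical value for (2, 6) degrees of freedom, which is about 5.14.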

e. Chi-Square Tests

Tests relationships between categorical variables.

  • Example: A study tests if gender (male/female) is associated with voting preference (Candidate A/B).
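
A chi-square test of independence compares observed counts with the counts expected if the two variables were unrelated. The sketch below computes the statistic without a continuity correction; the 2×2 table of counts is hypothetical.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical counts: rows are male/female, columns are Candidate A/B.
observed = [[30, 20],
            [20, 30]]

chi2, df = chi_square_statistic(observed)
print(chi2, df)
```

For one degree of freedom the 5% critical value is about 3.841, so a statistic of 4.0 from these made-up counts would just cross the conventional threshold.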

3. Probability

Probability quantifies the likelihood of events, forming the foundation for inferential statistics.

a. Basic Probability

Calculates the chance of an event occurring.

  • Example: The probability of rolling a 6 on a fair die is \( \frac{1}{6} \approx 0.167 \).

b. Conditional Probability

Probability of an event given that another event has occurred.

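  • Example: In a class, 40% of students are female, and 25% of the female students wear glasses, so \( P(\text{glasses} \mid \text{female}) = 0.25 \). By the multiplication rule, the probability that a student is female and wears glasses is \( P(\text{female}) \times P(\text{glasses} \mid \text{female}) = 0.40 \times 0.25 = 0.10 \).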

c. Probability Distributions

Describes how probabilities are distributed over values of a random variable.

  • Binomial Distribution: Models the number of successes in a fixed number of trials.
    • Example: If 70% of customers buy a product, the probability that exactly 3 out of 5 customers buy it follows a binomial distribution.
  • Normal Distribution: A bell-shaped curve describing many natural phenomena.
    • Example: IQ scores are normally distributed with a mean of 100 and a standard deviation of 15. About 68% of people have IQs between 85 and 115.
  • Poisson Distribution: Models the number of events in a fixed interval.
    • Example: If a call center receives an average of 5 calls per hour, the Poisson distribution gives the probability of receiving exactly 3 calls in an hour.
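
The three distribution examples can be evaluated numerically. This is a small sketch using only the standard library; the helper names `binomial_pmf` and `poisson_pmf` are introduced here for illustration.

```python
import math
from statistics import NormalDist

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, rate):
    """P(exactly k events when the average number per interval is `rate`)."""
    return math.exp(-rate) * rate ** k / math.factorial(k)

p_three_buyers = binomial_pmf(3, 5, 0.70)  # exactly 3 of 5 customers buy
iq = NormalDist(mu=100, sigma=15)
p_iq_85_115 = iq.cdf(115) - iq.cdf(85)     # within one standard deviation of the mean
p_three_calls = poisson_pmf(3, 5)          # exactly 3 calls when the average is 5

print(round(p_three_buyers, 4), round(p_iq_85_115, 4), round(p_three_calls, 4))
```

The results are roughly 0.31 for the binomial case, 0.68 for the normal case (matching the "about 68%" above), and 0.14 for the Poisson case.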

4. Data Collection

Methods for gathering data to ensure accuracy and reliability.

a. Sampling

Selecting a subset of a population for analysis.

  • Simple Random Sampling: Every individual has an equal chance of selection.
    • Example: Randomly selecting 100 students from a school of 1,000 to survey their lunch preferences.
  • Stratified Sampling: Dividing the population into subgroups and sampling from each.
    • Example: Dividing a city into districts and sampling 50 people from each district to study voting behavior.
  • Cluster Sampling: Dividing the population into clusters and sampling entire clusters.
    • Example: Selecting 5 schools from a city and surveying all students in those schools.
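
The three sampling schemes differ mainly in what gets randomized. The sketch below mimics each example with `random.sample`; the student IDs, district names and sizes, and school enrollments are all hypothetical placeholders.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Simple random sampling: 100 of 1,000 student IDs, each equally likely.
students = list(range(1, 1001))
simple_sample = rng.sample(students, 100)

# Stratified sampling: 50 residents drawn from every district.
districts = {
    "North": [f"N{i}" for i in range(400)],
    "South": [f"S{i}" for i in range(300)],
    "East": [f"E{i}" for i in range(300)],
}
stratified_sample = {name: rng.sample(people, 50) for name, people in districts.items()}

# Cluster sampling: pick 5 whole schools, then survey everyone in them.
schools = {f"School {c}": [f"{c}-{i}" for i in range(30)] for c in "ABCDEFGHIJ"}
chosen_schools = rng.sample(sorted(schools), 5)
cluster_sample = [pupil for name in chosen_schools for pupil in schools[name]]

print(len(simple_sample), {k: len(v) for k, v in stratified_sample.items()}, len(cluster_sample))
```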

b. Experimental Design

Structuring experiments to test hypotheses.

  • Randomized Controlled Trial (RCT): Randomly assigns subjects to treatment or control groups.
    • Example: Testing a new drug by randomly assigning patients to receive the drug or a placebo and comparing outcomes.
  • Factorial Design: Tests multiple factors simultaneously.
    • Example: Studying the effect of fertilizer type and watering frequency on plant growth, testing all combinations.
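
Both designs can be sketched in a few lines. Random assignment is a shuffle followed by a split, and a full factorial design is the Cartesian product of the factor levels. The patient IDs and the specific fertilizer and watering levels below are hypothetical.

```python
import itertools
import random

rng = random.Random(0)  # fixed seed so the assignment is repeatable

# Randomized controlled trial: shuffle patients, then split into two arms.
patients = [f"patient_{i}" for i in range(20)]
shuffled = patients[:]
rng.shuffle(shuffled)
treatment, placebo = shuffled[:10], shuffled[10:]

# Factorial design: every fertilizer is paired with every watering schedule.
fertilizers = ["organic", "synthetic"]
watering = ["daily", "every 3 days", "weekly"]
conditions = list(itertools.product(fertilizers, watering))

print(len(treatment), len(placebo), conditions)
```

With 2 fertilizer types and 3 watering schedules, the factorial design has 2 × 3 = 6 conditions.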

c. Observational Studies

Analyzing data without manipulating variables.

  • Example: Studying the relationship between smoking and lung cancer by observing smokers and non-smokers over time without assigning treatments.

5. Statistical Modeling

Creating mathematical representations of data relationships.

a. Time Series Analysis

Analyzes data points collected over time.

  • Example: Forecasting monthly sales for a store based on historical sales data, accounting for seasonal trends.
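
Two very simple time series tools are a moving average, which smooths out seasonal swings, and a seasonal naive forecast, which repeats the value from one season earlier. The sketch below shows both as a baseline, not as a recommended forecasting method. The sales figures are hypothetical and quarterly rather than monthly, to keep the example short.

```python
def moving_average(series, window):
    """Trailing moving average; the first value covers series[0:window]."""
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]

def seasonal_naive_forecast(series, season_length, steps):
    """Forecast each future point as the value one full season earlier."""
    history = list(series)
    for _ in range(steps):
        history.append(history[-season_length])
    return history[len(series):]

# Hypothetical quarterly sales with a peak every fourth quarter.
sales = [100, 120, 110, 180, 105, 125, 115, 190]

trend = moving_average(sales, 4)                 # one value per full year of data
forecast = seasonal_naive_forecast(sales, 4, 4)  # next four quarters
print(trend, forecast)
```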

b. Bayesian Statistics

Uses probability to update beliefs based on new data.

  • Example: Estimating the probability a patient has a disease based on prior prevalence (prior probability) and a positive test result (new evidence).
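
The update in this example is an application of Bayes' theorem: \( P(D \mid +) = \frac{P(+ \mid D)\,P(D)}{P(+ \mid D)\,P(D) + P(+ \mid \neg D)\,P(\neg D)} \). The sketch below plugs in hypothetical numbers (1% prevalence, 95% sensitivity, 5% false positive rate), none of which come from the text above.

```python
def posterior_probability(prior, sensitivity, false_positive_rate):
    """P(disease | positive test) via Bayes' theorem."""
    true_positive = sensitivity * prior
    false_positive = false_positive_rate * (1 - prior)
    return true_positive / (true_positive + false_positive)

# All three inputs are hypothetical.
posterior = posterior_probability(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(round(posterior, 3))
```

Even with a fairly accurate test, the posterior under these assumed numbers is only about 16%, because the disease is rare to begin with.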

c. Machine Learning Models

Uses statistical techniques to predict or classify data.

  • Example: A logistic regression model predicts whether a customer will buy a product based on age, income, and browsing history.
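
A logistic regression model turns a weighted sum of the inputs into a probability through the sigmoid function. The sketch below shows only the prediction step. The coefficients are hypothetical and not fitted to any data, and "pages viewed" is an assumed stand-in for browsing history.

```python
import math

def purchase_probability(age, income_thousands, pages_viewed, bias, weights):
    """Sigmoid of a linear combination of the three features."""
    z = (bias
         + weights[0] * age
         + weights[1] * income_thousands
         + weights[2] * pages_viewed)
    return 1 / (1 + math.exp(-z))

# Hypothetical coefficients; a real model would learn these from training data.
bias = -4.0
weights = (0.02, 0.03, 0.15)

probability = purchase_probability(35, 60, 8, bias, weights)
will_buy = probability >= 0.5  # classify with a 0.5 threshold
print(round(probability, 3), will_buy)
```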

Example Visualization: Bar Chart
