
Statistic Review Path

Probability:

Description: Probability is the foundation of statistics and data science. You should have a strong understanding of basic probability concepts such as independence, conditional probability, Bayes’ theorem, and probability distributions.

Core Concepts:

  • Probability rules: Probability of an event occurring, complement, union, intersection, conditional probability.
  • Random variables: Discrete and continuous random variables, probability mass functions and probability density functions, expected value and variance, moment generating functions.
  • Common distributions: Bernoulli, binomial, Poisson, normal, exponential, uniform, etc.
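The conditional probability rule above leads directly to Bayes' theorem. As a minimal sketch, the helper below (a hypothetical function, not from any library) computes the probability of having a condition given a positive test. The prevalence, sensitivity, and false positive rate are made-up numbers chosen only for illustration.

```python
def bayes_posterior(prior: float, sensitivity: float, false_positive_rate: float) -> float:
    """Return P(condition | positive test) using Bayes' theorem."""
    # Law of total probability:
    # P(+) = P(+ | condition) P(condition) + P(+ | no condition) P(no condition)
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive

# Hypothetical inputs: 1% prevalence, 95% sensitivity, 5% false positive rate.
posterior = bayes_posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(f"P(condition | positive) = {posterior:.3f}")
```

With these inputs the posterior is only about 16%, because true positives from the rare condition are outnumbered by false positives from the much larger healthy group.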

Review Pathway: Start by understanding basic concepts such as sample spaces, events, and probability rules. Learn about probability distributions such as the normal distribution, binomial distribution, and Poisson distribution. Practice calculating probabilities using formulas and simulation techniques.
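One way to practice both approaches is to compute a probability from its formula and then check it by simulation. The sketch below does this for a binomial probability using only the Python standard library. The function names, the choice of 10 trials with success probability 0.5, and the seed are assumptions made for the example.

```python
import math
import random

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Exact P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def simulate_binomial_pmf(k: int, n: int, p: float, trials: int = 100_000, seed: int = 0) -> float:
    """Estimate P(X = k) as the fraction of simulated experiments with k successes."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        successes = sum(rng.random() < p for _ in range(n))
        if successes == k:
            hits += 1
    return hits / trials

exact = binomial_pmf(3, 10, 0.5)           # 120 / 1024
approx = simulate_binomial_pmf(3, 10, 0.5)
print(f"exact = {exact:.4f}, simulated = {approx:.4f}")
```

The two numbers should agree to roughly two decimal places. Increasing `trials` tightens the agreement, which is a useful habit for checking any formula you are unsure about.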

Descriptive statistics:

Description: Descriptive statistics involves summarizing and describing data using measures such as mean, median, mode, variance, standard deviation, skewness, and kurtosis. These measures are important for understanding the distribution of your data.

Core Concepts:

  • Measures of central tendency: Mean, median, mode.
  • Measures of variability: Range, variance, standard deviation, interquartile range.
  • Skewness and kurtosis: Measures of the shape of the distribution.
  • Graphical representations: Histograms, box plots, stem-and-leaf plots, scatterplots.

Review Pathway: Learn about measures of central tendency such as mean, median, and mode, and measures of variability such as variance and standard deviation. Understand the concept of skewness and kurtosis. Practice calculating these measures and interpreting their meanings.
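A small worked example can make these measures concrete. The sketch below uses Python's standard `statistics` module on a made-up data set. Skewness and excess kurtosis are computed from central moments with a hypothetical helper, using the population (divide by n) definitions, which is only one of several conventions in use.

```python
import statistics

# Hypothetical sample, chosen so the arithmetic is easy to check by hand.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
data_range = max(data) - min(data)
sample_variance = statistics.variance(data)   # divides by n - 1
sample_sd = statistics.stdev(data)
q1, q2, q3 = statistics.quantiles(data, n=4)  # quartile conventions differ between tools
iqr = q3 - q1

def central_moment(values, order):
    """Average of (value - mean) ** order over all values."""
    m = statistics.fmean(values)
    return sum((v - m) ** order for v in values) / len(values)

skewness = central_moment(data, 3) / central_moment(data, 2) ** 1.5
excess_kurtosis = central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

print(f"mean={mean}, median={median}, mode={mode}, range={data_range}")
print(f"variance={sample_variance:.3f}, sd={sample_sd:.3f}, IQR={iqr}")
print(f"skewness={skewness:.3f}, excess kurtosis={excess_kurtosis:.3f}")
```

Here the mean (5) sits above the median (4.5) and the skewness is positive, both of which point to a longer right tail caused by the values 7 and 9.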

Inferential statistics:

Description: Inferential statistics involves making inferences about a population based on a sample. You should be familiar with hypothesis testing, confidence intervals, and p-values.

Core Concepts:

  • Sampling: Random sampling, sampling distributions.
  • Estimation: Point estimates, interval estimates, confidence intervals.
  • Hypothesis testing: Null and alternative hypotheses, p-values, type I and type II errors, power.
  • Common tests: One- and two-sample t-tests, z-tests, ANOVA, chi-square tests.
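To see how a test statistic, a p-value, and a confidence interval fit together, here is a minimal sketch of a two-sided one-sample z-test using only the standard library. It assumes the population standard deviation is known (the case where a z-test applies). The data, the null mean of 50, and sigma = 4 are hypothetical, and `one_sample_z_test` is a helper written for this example.

```python
import math
from statistics import NormalDist, mean

def one_sample_z_test(sample, mu0, sigma):
    """Two-sided z-test of H0: population mean == mu0, with known sigma."""
    standard_error = sigma / math.sqrt(len(sample))
    z = (mean(sample) - mu0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical measurements; their mean is 52.
sample = [48, 50, 52, 54, 56, 52, 50, 54, 48, 56, 52, 52, 50, 54, 51, 53]
z, p_value = one_sample_z_test(sample, mu0=50, sigma=4)

# 95% confidence interval for the mean under the same known-sigma assumption.
margin = NormalDist().inv_cdf(0.975) * 4 / math.sqrt(len(sample))
ci = (mean(sample) - margin, mean(sample) + margin)

print(f"z = {z:.2f}, p = {p_value:.4f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

The p-value falls just below 0.05 and the 95% interval just excludes 50. These are two views of the same result: a two-sided test at the 5% level rejects exactly when the 95% interval excludes the null value.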

Review Pathway: Learn about hypothesis testing, confidence intervals, and p-values. Understand the difference between null and alternative hypotheses, and how to conduct a hypothesis test using a t-test or z-test. Learn about the Central Limit Theorem and its importance in inferential statistics.
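The Central Limit Theorem is easier to trust once you have watched it work. The sketch below repeatedly draws samples from a strongly skewed distribution (exponential with mean 1 and standard deviation 1) and looks at the distribution of the sample means. The sample size, repetition count, and seed are arbitrary choices for the demonstration.

```python
import random
import statistics

def simulate_sample_means(sample_size, repetitions, seed=42):
    """Return the mean of each of `repetitions` exponential(1) samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(repetitions)
    ]

means = simulate_sample_means(sample_size=50, repetitions=5000)

center = statistics.fmean(means)    # should be close to the population mean, 1
spread = statistics.stdev(means)    # should be close to sigma / sqrt(n)
predicted_spread = 1 / 50 ** 0.5

print(f"mean of sample means = {center:.3f}")
print(f"sd of sample means = {spread:.3f} (theory: {predicted_spread:.3f})")
```

Even though individual draws are far from normal, the sample means cluster around 1 with a spread near sigma divided by the square root of n. This is what justifies using normal-based tests and intervals for means of reasonably large samples.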

Regression analysis:

Description: Regression analysis involves modeling the relationship between two or more variables. You should understand linear regression, logistic regression, and generalized linear models.

Core Concepts:

  • Simple linear regression: Relationship between two variables, least squares method, residuals, coefficient of determination.
  • Multiple linear regression: Relationship between multiple variables, interpretation of coefficients, multicollinearity, model selection.
  • Logistic regression: Modeling binary outcomes, odds and odds ratios.

Review Pathway: Start with simple linear regression and understand the concepts of regression coefficients, residuals, and R-squared. Learn about multiple linear regression and logistic regression. Practice building regression models, interpreting their results, and checking assumptions.
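Before reaching for a library, it helps to fit a simple linear regression by hand once. The sketch below applies the least squares formulas directly and reports the slope, intercept, residuals, and R-squared. The data points are invented for illustration and `fit_simple_linear_regression` is a helper defined here, not a library function.

```python
def fit_simple_linear_regression(x, y):
    """Least squares fit of y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ss_res = sum(r ** 2 for r in residuals)       # unexplained variation
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)   # total variation
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, residuals, r_squared

# Hypothetical data that roughly follows y = 2x.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept, residuals, r_squared = fit_simple_linear_regression(x, y)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, R^2 = {r_squared:.4f}")
```

A quick assumption check is to look at the residuals: with an intercept in the model they always sum to zero, so what matters is whether they show a pattern (curvature, widening spread) when plotted against `x`.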

Machine learning algorithms:

Description: Machine learning algorithms are used to build predictive models from data. You should be familiar with popular machine learning algorithms such as decision trees, random forests, k-nearest neighbors, support vector machines, and neural networks.

Core Concepts:

  • Supervised learning: Regression, classification, decision trees, random forests, support vector machines, k-nearest neighbors.
  • Unsupervised learning: Clustering (k-means clustering, hierarchical clustering), dimensionality reduction (principal component analysis).

Review Pathway: Start with supervised learning algorithms such as decision trees, random forests, and k-nearest neighbors. Learn about unsupervised learning algorithms such as clustering and dimensionality reduction. Practice building models, evaluating their performance, and tuning hyperparameters.
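The build, evaluate, tune loop can be sketched with k-nearest neighbors written from scratch. The toy data set, the train/test split, and the candidate values of k below are all made up for illustration. In practice you would more likely use a library such as scikit-learn and cross-validation rather than a single split.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k):
    """Predict the majority label among the k training points nearest to `query`."""
    neighbors = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]

def accuracy(train_points, train_labels, test_points, test_labels, k):
    """Fraction of test points whose predicted label matches the true label."""
    correct = sum(
        knn_predict(train_points, train_labels, point, k) == label
        for point, label in zip(test_points, test_labels)
    )
    return correct / len(test_points)

# Hypothetical two-class data: class "a" near (1, 1), class "b" near (4, 4).
train_points = [(1.0, 1.1), (1.2, 0.9), (0.8, 1.0), (1.1, 1.3),
                (4.0, 4.2), (3.8, 4.1), (4.2, 3.9), (4.1, 4.3)]
train_labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
test_points = [(1.0, 0.8), (0.9, 1.2), (4.0, 4.0), (3.9, 4.4)]
test_labels = ["a", "a", "b", "b"]

# Tune the hyperparameter k by comparing held-out accuracy.
results = {
    k: accuracy(train_points, train_labels, test_points, test_labels, k)
    for k in (1, 3, 5)
}
print(results)
```

Because these clusters are well separated, every k scores perfectly. On real data the accuracies would differ, and you would pick k from a validation set, not from the final test set.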

Experimental design:

Description: Experimental design involves planning and executing experiments to test hypotheses. You should understand the basic principles of experimental design, such as randomization, replication, and control.

Core Concepts:

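  • Randomization: Random assignment of units to treatments, random sampling of units from the population.
  • Replication: Applying each treatment to multiple units so that variability can be estimated and effects measured more precisely.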
  • Control: Holding variables constant to isolate the effect of the treatment.
  • Experimental designs: Completely randomized design, randomized block design, factorial design.

Review Pathway: Learn about the basic principles of experimental design, such as randomization, replication, and control. Understand the difference between observational studies and experiments. Practice designing experiments and interpreting their results.
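Random assignment is simple to practice in code. The sketch below implements a completely randomized design by shuffling the experimental units and dealing them out to treatments in turn. The subject IDs, treatment names, and the `completely_randomized_design` helper are hypothetical.

```python
import random

def completely_randomized_design(units, treatments, seed=None):
    """Randomly assign units to treatments with group sizes as equal as possible."""
    rng = random.Random(seed)
    shuffled = list(units)   # copy so the caller's list is not modified
    rng.shuffle(shuffled)

    assignment = {treatment: [] for treatment in treatments}
    for index, unit in enumerate(shuffled):
        # Deal units out in turn, like cards, after the shuffle.
        assignment[treatments[index % len(treatments)]].append(unit)
    return assignment

units = [f"subject_{i}" for i in range(1, 13)]   # 12 hypothetical subjects
treatments = ["control", "dose_low", "dose_high"]

assignment = completely_randomized_design(units, treatments, seed=7)
for treatment, group in assignment.items():
    print(treatment, group)
```

Each treatment receives four subjects, which gives replication within every group, and the control group provides the baseline for comparison. A randomized block design would apply the same shuffle separately within each block.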

Data visualization:

Description: Data visualization involves representing data visually using graphs and charts. You should be familiar with common visualization techniques such as histograms, scatterplots, and box plots, and with tools such as ggplot.

Core Concepts:

  • Chart types: Bar charts, pie charts, histograms, box plots, scatterplots, line charts.
  • Principles of effective visualizations: Clear labeling, avoiding clutter, using color effectively, choosing the right chart type, emphasizing important features.

Review Pathway: Learn about the principles of effective data visualization, such as choosing the right chart type, labeling axes and legends, and using color effectively. Practice creating visualizations using tools such as Excel, ggplot, and Tableau.

This post is licensed under CC BY 4.0 by the author.
