The Data Science training teaches you to apply the methods and tools used to interpret data.
Learning objectives of the Data Science training:
- Use R to clean, analyze, and visualize data
- Navigate the entire data science pipeline from data acquisition to publication
- Use GitHub to manage data science projects
- Perform regression and least squares analyses, and draw inferences using regression models
Data Science training program:
- Module 1 of the Data Science training: The tools of the data scientist
- Set up R, RStudio, GitHub, and other useful tools
- Explain the essential concepts of study design
- Understand the data, problems, and tools that data analysts work with
- Create a GitHub repository
- Module 2 of the Data Science training: Programming in R
- Understand the essential concepts of the programming language
- Use R loop functions and debugging tools
- Configure statistical programming software
- Gather detailed information using the R profiler
- Module 3 of the Data Science training: Obtaining and cleaning data
- Understand common data storage systems
- Use R for text and date manipulation
- Apply the basic principles of data cleaning to make data tidy
- Obtain usable data from the web, APIs and databases
- Module 4 of the Data Science training: Exploratory data analysis
- Understand analytical graphs and the basic plotting system in R
- Create graphical representations of very high-dimensional data
- Use advanced graphics systems such as the Lattice system
- Apply cluster analysis techniques to locate patterns in data
- Module 5 of the Data Science course: Reproducible research
- Organize data analysis to make it more reproducible
- Determine the reproducibility of the analysis project
- Write a reproducible data analysis using knitr
- Publish reproducible web documents using Markdown
- Module 6 of the Data Science training: Statistical inference
- Understand the process of drawing conclusions about populations or scientific truths from data
- Describe variability, distributions, limits and confidence intervals
- Use p-values, confidence intervals and permutation tests
- Make informed data analysis decisions
- Module 7 of the Data Science training: Regression models
- Use regression analysis, least squares, and inference
- Understand model cases of ANOVA and ANCOVA
- Investigate the analysis of residuals and variability
- Describe new uses for regression models such as scatterplot smoothing
- Module 8 of the Data Science training: Practical machine learning
- Use the basic components of building and applying prediction functions
- Understand concepts such as training and test sets, overfitting, and error rates
- Describe machine learning methods such as regression or classification trees
- Explain the full process of constructing prediction functions
- Module 9 of the Data Science course: Development of data products
- Develop basic applications and interactive graphics using GoogleVis
- Use Leaflet to create annotated interactive maps
- Build an R Markdown presentation that includes a data visualization
- Create a data product that tells a story to a mass audience
- Module 10 of the Data Science training: Final Data Science project
- Create a useful data product for the public
- Apply your exploratory data analysis skills
- Build an efficient and accurate prediction model
- Create a slide deck to showcase your results