Practical Data Science with R

This course gives you insight into practical methods for analyzing and modeling data, and into using the resulting models for prediction and decision making. You learn to work in a statistically sound way and to interpret datasets. This is essential knowledge for every business analyst and for any organization that wants to work with Big Data.

'Practical Data Science with R' starts with a short review of the fundamentals of statistics and probability. This provides a good starting point for the data analysis methods in the important areas of Data Mining (Big Data) and Time Series Analysis. You also become acquainted with the rapidly emerging software package R (which, unlike SPSS for example, is freeware).

By the end of the course you have both insight into and hands-on skills in using statistical techniques to process and analyze data.

Methods for making predictions and decision making

The course 'Practical Data Science with R' consists of three modules:

  1. Applied Probability and Statistics Revisited
    A refresher on the principles and techniques of applied probability and statistics that are used for analyzing and modeling data.
  2. Data Mining in a Nutshell
    An introduction to and overview of common methods in Predictive Analytics for generating predictions and classifications from suitable models and for recognizing patterns in large data sets.
  3. Time Series Analysis in a Nutshell
    Methods for modeling time-dependent data and for forecasting based on these models.


The explosive growth in data has greatly broadened the use of statistical methods and increased their importance. In addition to traditional applications in the laboratory and in industrial product quality monitoring, statistics is now also applied in areas such as marketing and business intelligence. This course offers the essential insights and skills needed to analyze data responsibly in all of these applications.

This course also gives you a strong foundation for our specialist follow-up courses in statistics, such as Data Mining & Business Analytics and Time Series Analysis.

Intended for

This course is intended for academics and higher professional education students who want to use modern applied statistical techniques in their work, familiarize themselves with the relevant skills, and get acquainted with the latest statistical freeware. Participants are expected to have mathematics at VWO level (Dutch pre-university secondary education). Some knowledge of statistics is desirable.

Course material

At the beginning of the course you receive a course book containing copies of the hand-outs used during the course, assignments, and additional information about the open source software used: R and R Commander. Participants are invited to bring their own laptop to the course, with the software installed on it. You will receive further information about installing this software, which is free of charge.

At the request of the participants this course can be taught in Dutch or English.

  • Information
    Trainer: Dr. J.J.M. Rijpkema (Technische Universiteit Eindhoven (TU/e))
    Course dates: June 5, 12, 14 and 19, 2019
    Location: TU Eindhoven
    Price: € 2,295.00 excl. VAT
    In cooperation with: TU/e, Faculty of Mathematics & Computer Science
    Language: The program can (partially) be taught in English.
  • Program

    I. Applied Probability and Statistics Revisited

    • Introduction and overview of the course and the software to be used.
    • Exploratory Data Analysis.
    • Introduction to probability and probability distributions.
    • Exercises on Exploratory Data Analysis and Probability.
    • Statistical Testing & Estimation in a nutshell.
    • Selection, validation and use of probability distributions in practice.
    • Exercises on probability distributions and the principles of testing & estimation.

    II. Data Mining in a Nutshell

    • Predictive modeling based on linear regression methods.
    • Selection, validation and use of regression models in practice.
    • Exercises on regression models.
    • Classification modeling based on logistic regression methods.
    • Selection, validation and use of logistic regression models in practice.
    • Exercises on logistic regression models.
    • Alternative methods for predictive and classification modeling.
    • Selection, validation and use of predictive and classification models in practice.
    • Cluster analysis.
    • Exercises on predictive and classification models and cluster analysis.

    III. Time Series Analysis in a Nutshell

    • Introduction, characterization and exploratory analysis of time series data.
    • Time series models based on Exponential Smoothing.
    • Selection, validation and use of exponential smoothing models in practice.
    • Exercises on exponential smoothing models.
    • Box-Jenkins models for time series data.
    • Selection, validation and use of Box-Jenkins models in practice.
    • Exercises on Box-Jenkins models.
  • Reviews
    Participants rate this course 8.4.
    “Good start to use R. You really get tools to analyze data.”
    participant of Royal HaskoningDHV
    “Excellent course, meets all expectations.”
    participant of NXP Semiconductors BV Nijmegen
    “Very informative and applicable.”
    participant of Albemarle Catalysts BV
    “Very informative: good starting point for using R.”
    participant of Albemarle Catalysts BV