Practical data science

Statistics for big data & business intelligence (with R)

Because data is now available on a large scale, the use of statistical methods has broadened considerably and data science has grown in importance, not only in the laboratory and in industry but also in marketing and business intelligence.

For such applications, this course offers essential insight into the statistical concepts and skills needed to apply data analysis techniques responsibly. You will learn how to work in a statistically sound way and how to interpret datasets and models correctly.

The course starts with a review of basic principles from the fields of statistics and probability theory. This provides a good starting point for the data analysis methods within the important fields of data mining (big data) and time series analysis, discussed thereafter. You will also gain experience in working effectively with the statistical software R.

A well-prepared start in data science

The course consists of three topics:

  1. Applied probability and statistics revisited
    A refresher on the principles and techniques used in applied probability and statistics for analyzing and modeling data.
  2. Data mining in a nutshell
    An introduction to and overview of commonly used methods in predictive analytics, used to generate predictions and classifications from adequate models and to recognize patterns in large data sets.
  3. Time series analysis in a nutshell
    Methods for modeling time-dependent data and for making forecasts based on these models.

This course also gives you a strong base for our specialist follow-up courses in statistics, such as Data mining & business analytics and Time series analysis and forecasting.

Intended for

Academics and highly educated professionals who want to use modern applied statistical techniques in their work, familiarize themselves with the relevant skills, and get acquainted with the latest free statistical software. The course is also suited for lecturers at universities or colleges of higher education who want to stay informed about current methods for data analysis and data science.

You need mathematics at secondary education level or higher. Basic knowledge of statistics is desirable.

In consultation with the participants, this course can be taught in Dutch or English.

  • Information
    Trainer: Dr. J.J.M. Rijpkema (Eindhoven University of Technology (TU/e))
    Course dates: June 5, 12, 14 and 19, 2019
    Location: Campus Eindhoven University of Technology
    Price: € 2,295.00 excl. VAT
    In cooperation with: TU/e, Department of Mathematics & Computer Science
    The program can (partially) be taught in English.
  • Program

    I. Statistics and applied probability theory, including an introduction to the software program R

    • Introduction to and overview of the course and the software to be used (R)
    • Exploratory data analysis
    • Review of probability, probability calculation and probability distributions
    • Statistical testing and estimation
    • Selection, validation and use of probability distributions in practice
    • Exercises: probability distributions and statistical testing & estimation

    II. Data mining

    • Prediction models, based on regression techniques
    • Selection, validation and use of regression models in practice
    • Classification models based on logistic regression techniques
    • Selection, validation and use of logistic regression models in practice
    • Alternative methods for prediction and classification
    • Selection, validation and use of prediction and classification models in practice
    • Cluster analysis
    • Exercises: prediction & classification models and cluster analysis

    III. Time series analysis

    • Introduction, characterization and exploratory analysis of time series data
    • Time series models based on exponential smoothing
    • Selection, validation and use of exponential smoothing models in practice
    • Box-Jenkins models for time series data
    • Selection, validation and use of Box-Jenkins models in practice
    • Exercises: exponential smoothing and Box-Jenkins models
  • Reviews
    Participants rate this course 8.4.
    “Good start to use R. You really get tools to analyze data.”
    participant of Royal HaskoningDHV
    “Excellent course, meets all expectations.”
    participant of NXP Semiconductors BV Nijmegen
    “Very informative and applicable.”
    participant of Albemarle Catalysts BV
    “Very informative: good starting point for using R.”
    participant of Albemarle Catalysts BV
    “Very interesting. Despite the level of difficulty a good start to apply in my daily practice.”
    participant of Abbott