Business Analytics - Course Notes and Materials

General Course Description and Objectives

This is a practical, skill-focused introduction to using open-source programming software (R, RStudio, and R Markdown) for several aspects of Business Analytics. The course covers basic scripting in R, data wrangling, advanced graphing, and machine learning. You will learn how to load data, assemble and disassemble data objects, navigate R’s environment system, write your own functions, and use R’s programming tools to apply machine learning techniques. No prior knowledge of R programming, data science, or machine learning is required. The chapters are arranged around 14 practical projects with concrete examples. These examples are short and easy to understand, cover the essentials you need, and give you immediate practice. Learning to program is like learning to speak another language: you progress faster when you practice.

Course objectives

After completing this module, you will be able to:

  • Obtain large amounts of data from the Internet via APIs or web scraping
  • Clean and transform data
  • Explore and visualize data in a goal-oriented way
  • Model data using modern machine learning techniques for classification and prediction
  • Communicate data and results in the form of products and applications

Course structure

Over the course of seven days you will complete 14 sessions. Each session combines a short lecture on R concepts with a large amount of time for you to work on coding and analysis problems.

DataCamp

Once you have RStudio working and your GitHub page set up (explained in detail in the corresponding chapter), you can get started with online tutorials from DataCamp and begin experimenting in R. To do so, join the NIT data science team on DataCamp via the following link (please register with your TUHH email address):

These tutorials are optional, and you can choose whichever courses you want. At the end of each session, I will recommend tutorials that match the session's content.

Schedule

Session  Topic
1        Introduction to R, RStudio IDE & GitHub
2        Introduction to the tidyverse
3        Data Acquisition
4        Data Wrangling
5        Data Visualization
6        Fundamentals of Machine Learning
7        Supervised ML: Regression (I)
8        Supervised ML: Regression (II)
9        Automated ML with H2O (I)
10       Automated ML with H2O (II)
11       ML Performance Measures
12       Explainable ML with LIME
13       ML: Deep Learning
14       Reporting with RMarkdown, Shiny, Flexdashboard