The world is quietly being reshaped by machine learning. We no longer need to teach computers how to perform complex tasks like image recognition or text translation: instead, we build systems that let them learn how to do it themselves. R is a powerful open-source language and environment, widely used for data analytics and machine learning by large companies including Google. In this course you will learn how to use R and machine learning algorithms to solve business problems and extract insights that help companies stay one step ahead of their competitors.

Getting the Hang of R

The R Website

Downloading and Installing R from CRAN

Running the R Program

The interpreter and the console

Tools to work efficiently with R

Finding Your Way with R

Getting Help via the CRAN Website and the Internet

The Help Command in R

Anatomy of a Help Item in R

Command Packages

Standard Command Packages

What Extra Packages Can Do for You

How to Get Extra Packages of R Commands

Numerics

Special values

Characters

Logicals

Vectors

Creating a vector and accessing its properties

Factors

Data frames

Creating a data frame and accessing its properties

Creating Data Frames

Accessing Data Frames

Extracting Subdata Frames

More on Treatment of NA Values

Using the rbind() and cbind() Functions and Alternatives

Applying apply()

Extended Example: WWE Case Study

Efficient Data Frames

Tidying Data with tidyr and Regular Expressions

Make Wide Tables Long with gather()

Split Joint Variables with separate()

Other tidyr Functions

Regular Expressions

Merging Data Frames

Extended Example: An Employee Database

Efficient Data Processing with dplyr

Renaming Columns

Changing Column Classes

Filtering Rows

Chaining Operations

Data Aggregation

Combining Datasets

Working with Databases

Data Processing with data.table

The if() Versus ifelse() Functions

If… else conditionals

The use of the if conditional statement

An extra trick with the if conditional statement

Sorting and Ordering

Reversing Elements

Which Indices are TRUE?

Converting Factors to Numerics

Logical AND and OR

Row and Column Operations

is.na() and anyNA()

The cut() Function

Writing your first function in R

Writing functions with multiple arguments and default values

Handling data types in input arguments

Producing different output types and return values

Making a recursive call to a function

Handling exceptions and error messages

Text functions

Data cleaning with efficient text functions

Inbuilt Numeric functions of R

Inbuilt String functions of R

Other Inbuilt functions of R

nchar(), paste(), substr(), strsplit(), etc.

Pivot Table of Excel in R

Table function

Count function of plyr package

Learning SQL queries using R

Grouping numeric data

User-defined functions (Macros) in R

Visualizing Data

Date functions with the lubridate package

Apply functions

Box-whisker Plots

Basic Boxplots

Customizing Boxplots

Horizontal Boxplots

Scatter Plots

Basic Scatter Plots

Adding Axis Labels

Plotting Symbols

Setting Axis Limits

Line Charts

Line Charts Using Numeric Data

Line Charts Using Categorical Data

Pie Charts

Bar Charts

Single-Category Bar Charts

Multiple Category Bar Charts

Horizontal Bars

Bar Charts from Summary Data

Means: The Lure of Averages

The Average in R: mean()

Medians: Caught in the Middle

The Median in R: median()

Statistics à la Mode

The Mode in R

Deviating from the Average

Measuring Variation

Back to the Roots: Standard Deviation

Standard Deviation in R

Conditions, Conditions, Conditions …

Meeting Standards and Standings

Catching Some Z’s

How Many?

The High and the Low

Living in the Moments

Tuning in the Frequency

Summarizing a Data Frame

Hitting the Curve

Working with Normal Distributions

A Distinguished Member of the Family

Drawing Conclusions from Data

Understanding Sampling Distributions

An EXTREMELY Important Idea: The Central Limit Theorem

Confidence: It Has Its Limits!

Fit to a t

Hypotheses, Tests, and Errors

Hypothesis Tests and Sampling Distributions

Catching Some Z's Again

Z Testing in R

t for One

t Testing in R

Working with t-Distributions

Visualizing t-Distributions

Testing a Variance

Working with Chi-Square Distributions

Visualizing Chi-Square Distributions

Hypotheses Built for Two

Sampling Distributions Revisited

t for Two

Like Peas in a Pod: Equal Variances

t-Testing in R

A Matched Set: Hypothesis Testing for Paired Samples

Paired Sample t-testing in R

Testing Two Variances

Testing More Than Two

ANOVA in R

Another Kind of Hypothesis, Another Kind of Test

Getting Trendy

Trend Analysis in R

Cracking the Combinations

Two-Way ANOVA in R

Two Kinds of Variables … at Once

After the Analysis

Uses and abuses of machine learning

Machine learning successes

How machines learn

Machine learning in practice

Machine learning with R

Understanding regression

Simple linear regression

Ordinary least squares estimation

Multiple Linear Regression

Regression: What a Line!

Linear Regression in R

Juggling Many Relationships at Once: Multiple Regression

Exploring and preparing the data

ANOVA: Another Look

Formulae and Linear Models

Model Building

Training a model on the data

Evaluating model performance

Improving model performance

Goodness of Fit with Data—The Perils of Overfitting

Root-Mean-Square Error

Model Simplicity and Goodness of Fit

Assumption checking

Assumption checking using packages

Case studies of Linear Regression

Estimating the quality of wines

Price prediction of real estate

Movie popularity prediction

Retail sales prediction

Understanding logistic regression

The logit model

Generalized Linear Model

Simple logistic regression

Multiple logistic regression

Customer satisfaction analysis with multiple logistic regression

Multiple logistic regression with categorical data

The Dataset and the Data Dictionary

Data Import in R

EDD in R

Outlier Treatment in R

Missing Value treatment in R

Variable transformation and Deletion in R

Dummy variable creation in R

Automatic dummy variable creation

Formulae and Logistic Models

Model Building

Training a model on the data

Evaluating model performance

Improving model performance

Goodness of Fit with Data—The Perils of Overfitting

Confusion Matrix

Creating a Confusion Matrix in R

Introduction to Time Series Data

Notation for Time Series Data

Peculiarities of Time Series Data

Setting the Frequency

Treatment of missing values

White Noise

Stationarity

Seasonality

Correlation Between Past and Present Values

The Autocorrelation Function (ACF)

The Partial Autocorrelation Function (PACF)

Picking the Correct Model

The Autoregressive (AR) Model

ARMA

ARIMA

Unsupervised Learning & Clustering: theory

K-Means Clustering: Theory

Example K-Means Clustering in R

Visualize K-Means Results in R

Model-based Unsupervised Clustering in R

How to assess the clustering tendency of a dataset

Selecting the number of clusters for unsupervised Clustering methods (K-Means)

Assessing the performance of unsupervised learning (clustering) algorithms

How to compare the performance of different unsupervised clustering algorithms?

A Simple Tree Model

Deciding How to Split Trees

The stopping criteria for controlling tree growth

Tree Entropy and Information Gain

Pros and Cons of Decision Trees

Tree Overfitting

Pruning Trees

Decision Trees for Classification

Conditional Inference Trees

Conditional Inference Tree Classification

Building a decision tree in R

Model Validation

Model Improvement

Model Interpretation

Ensemble technique

Random Forest Classification

Splitting Data into Test and Train Set in R

Choose the number of trees

Model Validation

Model Improvement

Model Interpretation

Accuracy of the model

Decision Tree vs. Random Forest