Introduction to Data Science – Organized Syllabus

1. Association Rule Mining

1.1 Introduction to Association Rule Mining

  • Frequent Patterns

  • Associations

  • Correlations

  • Basic Concepts and Road Map

1.2 Association Rules

  • Definition and Concepts

  • Support and Confidence

  • Strong Association Rules
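
A minimal sketch of how support, confidence, and the "strong rule" test might be computed. The transactions, thresholds, and function names below are illustrative assumptions, not part of the syllabus.

```python
# Toy transactions (hypothetical data, for illustration only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

# Rule under test: {bread} -> {milk}
rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)

# A rule is usually called "strong" when it meets both minimum thresholds.
# These threshold values are assumptions chosen for the example.
MIN_SUPPORT = 0.5
MIN_CONFIDENCE = 0.6
is_strong = rule_support >= MIN_SUPPORT and rule_confidence >= MIN_CONFIDENCE
```

Here the rule {bread} -> {milk} appears in 2 of 4 transactions, and bread appears in 3 of 4, so the confidence is 2/3.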

1.3 Apriori Algorithm

  • Working of Apriori Algorithm

  • Candidate Generation

  • Frequent Itemset Mining
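
The level-wise idea behind Apriori (generate candidates of size k from frequent itemsets of size k-1, prune, then count) can be sketched as follows. This is a simplified teaching sketch under assumed toy data, not an optimized or reference implementation; the function name `apriori` and its parameters are chosen here for illustration.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def frequent(candidates):
        # Count each candidate and keep those that meet the threshold.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        return {c: k / n for c, k in counts.items() if k / n >= min_support}

    # Level 1: frequent single items.
    level = frequent({frozenset([item]) for t in transactions for item in t})
    result = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        level = frequent(candidates)
        result.update(level)
        k += 1
    return result

# Hypothetical toy data.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent_itemsets = apriori(transactions, min_support=0.5)
```

With a minimum support of 0.5, the three single items and the pairs {bread, milk} and {bread, butter} are frequent; {milk, butter} is not, so the three-item candidate is pruned without being counted.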


2. Classification and Prediction

2.1 Classification

Introduction to Classification

  • Definition of Classification

  • Applications of Classification

Issues Regarding Classification

  • Data Quality

  • Overfitting and Underfitting

  • Bias and Variance

Classification Techniques

  • Classification by Decision Tree Induction

  • Bayesian Classification

  • Rule-Based Classification

Evaluation of Classifiers

  • Metrics for Evaluating Classifier Performance

    • Accuracy

    • Precision

    • Recall

    • F1-Score

  • Holdout Method

  • Random Subsampling
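
As an illustration of the metrics listed above, the sketch below derives accuracy, precision, recall, and F1-score from confusion-matrix counts for a binary problem. The labels are made-up data, and the helper name `classification_metrics` is an assumption for this example.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute binary classification metrics from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical labels: 3 true positives, 1 false positive,
# 1 false negative, 3 true negatives.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

In the holdout method these metrics would be computed on a test split kept apart from training; random subsampling repeats the holdout split several times and averages the results.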


2.2 Prediction

Introduction to Prediction

  • Definition of Prediction

  • Applications of Prediction

Issues Regarding Prediction

  • Prediction Accuracy

  • Model Complexity

Accuracy and Error Measures

  • Mean Absolute Error (MAE)

  • Mean Squared Error (MSE)

  • Root Mean Squared Error (RMSE)
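
The three error measures above can be sketched in a few lines. The values below are invented for illustration, and the function names are assumptions.

```python
import math

def mean_absolute_error(y_true, y_pred):
    """Average of |actual - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_squared_error(y_true, y_pred):
    """Average of (actual - predicted)^2."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def root_mean_squared_error(y_true, y_pred):
    """Square root of MSE, expressed in the same units as the target."""
    return math.sqrt(mean_squared_error(y_true, y_pred))

# Hypothetical actual and predicted values; the errors are 1, 0, -2, -1.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 8.0]

mae = mean_absolute_error(y_true, y_pred)       # (1 + 0 + 2 + 1) / 4
mse = mean_squared_error(y_true, y_pred)        # (1 + 0 + 4 + 1) / 4
rmse = root_mean_squared_error(y_true, y_pred)  # sqrt(mse)
```

Because MSE squares each error, the single error of size 2 contributes more to MSE and RMSE than it does to MAE.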

Evaluating Accuracy

  • Evaluating the Accuracy of a Classifier or Predictor


3. Clustering

3.1 Introduction to Clustering

  • Cluster Analysis

  • Applications of Clustering

3.2 Hierarchical Clustering

  • Agglomerative Hierarchical Clustering

  • Divisive Hierarchical Clustering

  • Comparison: Agglomerative vs Divisive

3.3 Distance Measures

  • Distance Measures in Algorithms

    • Euclidean Distance

    • Manhattan Distance

    • Cosine Similarity
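
A small sketch of the three measures listed above, for points given as equal-length numeric sequences. The function names and sample points are assumptions for illustration. Note that cosine similarity measures similarity rather than distance: larger values mean the vectors point in more similar directions.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance: sqrt(sum((a_i - b_i)^2))."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_distance(a, b):
    """City-block distance: sum(|a_i - b_i|)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Cosine of the angle between a and b: (a . b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical points: the differences are 3 and 4.
p, q = (1.0, 2.0), (4.0, 6.0)
d_euclid = euclidean_distance(p, q)      # sqrt(9 + 16)
d_manhattan = manhattan_distance(p, q)   # 3 + 4
sim_orthogonal = cosine_similarity((1.0, 0.0), (0.0, 1.0))
```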

3.4 Evaluation of Clustering

  • Cluster Validation

  • Cluster Quality Measures


4. Linear Regression

4.1 Introduction to Linear Regression

  • Prediction using Linear Regression

  • Regression Concepts

4.2 Gradient Descent

  • Cost Function

  • Gradient Descent Algorithm
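
A minimal sketch of batch gradient descent for one-variable linear regression, assuming the common squared-error cost J(θ0, θ1) = (1/2m) Σ (θ0 + θ1·x − y)². The data, learning rate, and iteration count are illustrative assumptions, not values prescribed by the syllabus.

```python
def cost(theta0, theta1, xs, ys):
    """Squared-error cost: (1 / 2m) * sum of squared prediction errors."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

def gradient_descent(xs, ys, learning_rate=0.05, iterations=5000):
    """Fit y ~ theta0 + theta1 * x by repeatedly stepping against the gradient."""
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(iterations):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the cost with respect to each parameter.
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Update both parameters simultaneously.
        theta0 -= learning_rate * grad0
        theta1 -= learning_rate * grad1
    return theta0, theta1

# Hypothetical data generated from y = 2x + 1 with no noise.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
theta0, theta1 = gradient_descent(xs, ys)
final_cost = cost(theta0, theta1, xs, ys)
```

On this noise-free data the parameters approach θ0 = 1 and θ1 = 2 and the cost approaches zero. A learning rate that is too large would make the updates diverge instead.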

4.3 Linear Regression Models

  • Linear Regression with One Variable

  • Linear Regression with Multiple Variables

4.4 Advanced Regression Concepts

  • Polynomial Regression

  • Feature Scaling

  • Feature Selection


5. Logistic Regression

5.1 Introduction to Logistic Regression

  • Classification using Logistic Regression

  • Logistic Regression vs Linear Regression

5.2 Logistic Regression Models

  • Logistic Regression with One Variable

  • Logistic Regression with Multiple Variables


6. Deep Learning

6.1 Introduction to Deep Learning

  • History of Deep Learning

  • Scope and Specifications

  • Why Deep Learning Now?

6.2 Neural Network Fundamentals

  • Building Blocks of Neural Networks

  • Neural Networks Overview

  • Units in Neural Networks

  • Layers in Neural Networks

  • Activation Functions

  • Normalization

6.3 Neural Network Architectures

  • Feedforward Neural Networks

  • Backward Neural Networks

  • XOR Model
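
To illustrate why the XOR model needs a hidden layer, here is a sketch of a tiny feedforward network with hand-picked weights and a step activation. The weights are one possible choice made for this example (one hidden unit acts as OR, the other as AND); they are not learned and are not taken from the syllabus.

```python
def step(z):
    """Threshold activation: 1 if the weighted input is positive, else 0."""
    return 1 if z > 0 else 0

def xor_network(x1, x2):
    """Forward pass through a 2-2-1 network that computes XOR."""
    # Hidden layer (weights and biases chosen by hand for illustration).
    h_or = step(x1 + x2 - 0.5)    # fires when at least one input is 1
    h_and = step(x1 + x2 - 1.5)   # fires only when both inputs are 1
    # Output layer: OR but not AND.
    return step(h_or - h_and - 0.5)

truth_table = {(a, b): xor_network(a, b) for a in (0, 1) for b in (0, 1)}
```

A single unit with a step activation can only separate inputs with one straight line, which is not enough for XOR; combining two hidden units makes the function representable.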

6.4 Model Optimization

  • Cost Function Estimation

    • Maximum Likelihood

  • Hyperparameter Tuning

6.5 Deep Learning Hardware

  • GPUs and TPUs

  • Deep Learning Hardware Requirements

6.6 Convolutional Neural Networks (CNN)

  • Introduction to CNN

  • CNN Architecture

  • Applications of CNN


7. Case Studies and Practical Applications

7.1 Classification and Prediction Datasets

  • Iris Dataset

  • Loan Dataset

  • Titanic Survival Dataset

7.2 Time Series and Real-World Datasets

  • Share Market Dataset

  • COVID-19 Dataset

7.3 Practical Data Science Workflow

  • Data Collection

  • Data Cleaning

  • Data Visualization

  • Model Building

  • Model Evaluation