Data Science with Python

Live Classroom
Duration: 14 days
Live Virtual Classroom
Duration: 14 days

Overview

Harvard Business Review has called data scientist the sexiest job of the 21st century. In recent years, Python has become one of the most preferred languages in the field of data science. When it comes to building machine learning systems, Python provides a powerful and flexible platform.

Through a comprehensive, hands-on approach, Cognixia’s Data Science with Python training program gives learners the opportunity to experiment with a wide variety of data science algorithms. The program integrates real-life exercises and activities throughout the training, helping you prepare for a promising career ahead.

What You'll Learn

  • Basics of data science and statistics
  • Advanced and applied statistics in data science
  • Python programming for data science
  • Applied statistics concepts in Python
  • Data visualization and data analytics in Python
  • Machine learning concepts
  • Real-world machine learning and data science use cases

Curriculum

  • Data Science Introduction
  • Data Science Project Lifecycle – CRISP-DM Model
  • Data Science Toolkit
  • Job outlook
  • Prerequisites & Target Audience

  • Introduction to Python, Anaconda, Spyder & Jupyter Notebook
  • Installation & Configuration
  • Basic Python Programming Concepts
  • Data Structures in Python
    • Lists
    • Tuples
    • Dictionaries

  • NumPy Arrays & Their Applications
  • Control Structures
  • Creating Custom Functions
  • Exception Handling

  • Random Variable
  • Types of Random Variables
    • Discrete & Continuous
    • Nominal
    • Ordinal
    • Interval
    • Ratio
  • Measures of Central Tendency
    • Mean
    • Mode
    • Median
  • Measures of Dispersion
    • Variance
    • Standard Deviation
  • Basic Statistics using NumPy

  • Introduction to Probability Theory
  • Probability Distribution Analysis
  • Probability Mass Function
  • Probability Density Function
  • Normal Distribution
  • Standard Normal Distribution
  • Covariance & Correlation

  • Pandas DataFrames & Their Applications
  • Importing tables from RDBMS
  • Analytics & Data Visualization using Matplotlib
  • Univariate & Bivariate Statistical Analysis using Matplotlib
    • Line Plot
    • Area Plot
    • Histogram
    • Box Plot
    • Scatter Plot

Sampling Analysis
  • Inferential Statistics
  • Sampling Distribution
  • Central Limit Theorem
  • Hypothesis Testing
  • One-Tailed and Two-Tailed Tests
  • Type I and Type II Errors
  • P-Value
  • Level of Significance
  • Confidence Interval
  • Statistical Analysis using Seaborn
    • KDE Plot
    • RegPlot
    • Joint Plot
    • Heatmap
  • Data Sampling
  • Simulating Normal Distribution
  • Calculating PDF & CDF
  • Hypothesis Testing – Case Study

  • Introduction to Machine Learning
  • Estimation Function
  • Reducible & Irreducible Errors
  • Supervised & Unsupervised ML Algorithms
  • ML Model Training & Testing
  • Parametric & Non-Parametric Algorithms
  • Regression Analysis
    • Simple Linear Regression
    • Multiple Linear Regression
  • Linear Regression Methods
    • Ordinary Least Squares
    • R-Squared Method
    • Adjusted R-Squared
  • Regression Evaluation Metrics – MSE, RMSE
  • Bias & Variance
  • Model Underfitting and Overfitting

  • Feature Engineering
  • Null Data Imputation Techniques
  • Outlier Analysis
  • Categorical Encoding
    • Label Encoding
    • One-Hot Encoding
  • Feature Selection Techniques
    • Correlation Analysis
    • Chi-Square Test
  • Machine Learning Case Study 1 – Multiple Linear Regression

  • Logistic Regression
    • Simple Logistic Regression
    • Multiple Logistic Regression
  • Logistic Regression Function
  • ROC AUC Analysis
  • Model Evaluation using Confusion Matrix
  • Accuracy, Precision, Recall & Specificity
  • Machine Learning Case Study 2 – Multiple Logistic Regression

  • Feature Scaling
  • Addressing Imbalanced Data using SMOTE/MSMOTE
  • Model Cross Validation using K-Fold Cross Validation
  • Classification Analysis
  • K Nearest Neighbor Classifier
  • Decision Trees
    • Classification and Regression Tree
  • Random Forest
  • Information Gain & Entropy
  • Machine Learning Case Study 3 – Classification Analysis using KNN, Decision Tree & Random Forest

  • Clustering Algorithms
    • K Means Clustering
    • Hierarchical Clustering
  • Elbow Curve Graph
  • Machine Learning Case Study 4 – Clustering Analysis using K-Means Clustering

  • Recommendation Engines
  • Collaborative Filtering & Types
  • Machine Learning Case Study 5 – Recommendation Engine using Collaborative Filtering

Who should attend

This training program is highly recommended for current and aspiring:
  • Data Analysts
  • Data Scientists
  • Financial Analysts
  • Software Developers
  • Programmers
  • Data Engineers
  • Python Developers
  • Data Architects
  • Software Engineers
  • Business Analytics Managers
  • Product Engineers
  • Data Analytics Engineers
  • Big Data Analysts

Prerequisites

There are no prerequisites for this program; however, a background in mathematics, statistics, or computer science could be helpful.

Certification

Participants will be awarded an exclusive certificate upon successful completion of the program. Every learner is evaluated based on attendance in the sessions and scores in the course assessments, projects, etc. The certificate is recognized by organizations all over the world and lends huge credibility to your resume.
