Overview
Data Science Certification
The Data Science course gives you the practical foundations you need to take up and execute Big Data and other analytics projects. The program covers topics ranging from Big Data to the Data Analytics Life Cycle. Understanding these topics helps you address business challenges that can be solved with Big Data.
The course covers both basic and advanced analytic methods and introduces participants to Big Data technologies through tools like MapR and Hadoop. Our state-of-the-art infrastructure lets students understand how these methods and tools are applied by gaining hands-on experience alongside practicing data scientists. The program takes an open approach and includes a final lab session, which works through various Big Data analytics challenges by applying the concepts covered during the program in the context of the Data Analytics Life Cycle.
Who can learn Data Science?
The course is designed for anyone who wishes to understand the concepts of Data Science from a Data Scientist’s perspective. Professionals who can benefit from this course include:
- Managers from any field, as analytics has become a key tool for managerial decision-making
- Business Analysts and Data Analysts who wish to upgrade their Data Analytics skills
- Database professionals who aspire to move into the field of Big Data by acquiring analytics skills
- Fresh graduates who wish to build a career in the field of Big Data or Data Science
What You'll learn
Curriculum
- What is Data Science?
- Skill set required
- Job opportunities
- Continuous vs. Categorical variables
- Mean, Median, Mode, Standard Deviation, Quartile, IQR
- Hypothesis testing, z-test, t-test
Installation of RStudio
- Overview of RStudio components
- Data Structures
  - Vector
  - List
  - Matrix
  - Data Frame
  - Factor
- Slicing and Subsetting
  - Vector
  - List
  - Matrix
  - Data Frame
Functions in R
- In-built functions
- User-defined functions
Loops in R
- while
- for
- break
- next
Data Import in R
Apply Family of Functions
- lapply
- sapply
- tapply
Data Manipulation Using dplyr
Data Visualization Using ggplot2
What is Machine Learning?
Supervised vs. Unsupervised Learning
Exploratory Data Analysis
- Univariate analysis
- Boxplot
- Bivariate analysis
- Scatterplot
- Correlation
- Outliers
- Duplicate removal
- Missing value imputation
Underfitting vs. Overfitting
Linear Regression
- Simple
- Multiple
- Assumptions of Linear Regression
- Evaluating model accuracy: k-fold cross-validation
Logistic Regression
- Confusion Matrix
- ROC Curve
Time Series Forecasting
- Moving Average
- Exponential smoothing
- Holt-Winters
- ARIMA
Naïve Bayes
Support Vector Machine
K-Nearest Neighbors
Decision Tree
Random Forest
K-Means Clustering
Introduction to Big Data
Overview of Hadoop & its Ecosystem
Introduction to NoSQL
Overview of Apache Spark
Prerequisites
Course features
FAQs
To attend the live virtual training, you need an internet connection of at least 2 Mbps.
Cognixia’s Virtual Machine can be installed on any local system, and Cognixia’s training team will assist you with the installation.
To install the Hadoop environment, your system needs 8 GB of RAM, a 64-bit OS, 100 GB of free hard disk space, and a processor with Virtualization Technology enabled.
The Hadoop Administration course at Cognixia is a 6-week course.
A recording of each class session will be available on the Learning Management System (LMS) for your reference. We also have a support team in case you need clarification on concepts or help with debugging or installation.
You will have lifetime access to the Learning Management System.