Data Science MicroMasters
What is Data Science?
Data science is a multidisciplinary approach to finding, extracting, and surfacing patterns in data through a fusion of analytical methods, domain expertise, and technology. It generally draws on the fields of data mining, forecasting, machine learning, predictive analytics, statistics, and text analytics. As data volumes grow rapidly, companies are racing to harness the insights in their data. The Data Science MicroMasters program on edX covers this whole process.
How does the Data Science process work?
Conceptually, the data science process is simple to understand and involves the following steps:
- Understand the business problem.
- Gather and integrate the raw data.
- Explore, transform, clean, and prepare the data.
- Create and select models based on the data.
- Test, tune, and deploy the models.
- Monitor, test, refresh and govern the models.
What will you learn from the Data Science MicroMasters?
In the Data Science MicroMasters program, you will learn how to perform the following tasks:
- Load and clean real-world data.
- Make reliable statistical inferences from noisy data.
- Use machine learning to learn models for data.
- Visualize complex data.
- Use Apache Spark to analyze data that does not fit within the memory of a single computer.
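As a rough illustration of the first two tasks above (loading and cleaning data, then making an inference from noisy values), here is a minimal sketch using only the Python standard library. The CSV contents, the column names, and the helper functions load_and_clean and mean_with_interval are hypothetical and are not taken from the course material.

```python
import csv
import io
import math
import statistics

# Hypothetical raw data: temperature readings with a gap and a malformed entry.
RAW_CSV = """station,temp_c
A,21.5
B,
C,19.0
D,n/a
E,22.5
F,20.0
"""

def load_and_clean(text):
    """Parse CSV text and keep only rows whose temp_c is a valid number."""
    readings = []
    for row in csv.DictReader(io.StringIO(text)):
        try:
            readings.append(float(row["temp_c"]))
        except (TypeError, ValueError):
            continue  # skip missing or non-numeric values
    return readings

def mean_with_interval(values, confidence=0.95):
    """Return the sample mean and a normal-approximation confidence interval."""
    mean = statistics.mean(values)
    std_err = statistics.stdev(values) / math.sqrt(len(values))
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, (mean - z * std_err, mean + z * std_err)

temps = load_and_clean(RAW_CSV)
mean, (low, high) = mean_with_interval(temps)
print(f"{len(temps)} clean readings, mean {mean:.2f}, 95% CI ({low:.2f}, {high:.2f})")
```

With only a handful of readings, the normal approximation is crude. A t-distribution interval would usually be a better fit for samples this small.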
What is a MicroMasters Program?
MicroMasters programs are a series of graduate-level courses from top universities designed to advance your career. MicroMasters program certificates demonstrate in-depth knowledge and in-demand skills to employers and can help you get started on a path toward completing an advanced degree.
Data Science MicroMasters Syllabus
There are four courses in the Data Science MicroMasters program:
The first course introduces you to a collection of powerful open-source tools needed to analyze data and conduct data science. Specifically, you will learn how to use Python, Jupyter notebooks, pandas, NumPy, Matplotlib, Git, and many other tools.
- Getting started with Data Science
- (Optional) Background in Python and Unix
- Jupyter Notebooks and Numpy
- Data Visualization
- Mini Project week
- Introduction to Machine Learning
- Working with Text and Databases
- Final Project in Part 1 and 2
The second course covers the foundations of probability and statistics. You will learn the mathematical theory and get hands-on experience applying it to actual data using Jupyter notebooks.
Concepts covered include random variables, dependence, correlation, regression, PCA, entropy, and MDL. You will also cover the following topics:
- Introduction to Probability and Statistics.
- Probability introduction.
- Conditional probability.
- Random Variables, Expectation, and Variance.
- Discrete and Continuous Distribution Families.
- Inequalities and Limit Theorems.
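To make a few of these topics concrete (conditional probability, expectation, and variance), here is a small sketch that enumerates the outcomes of rolling two fair dice. The dice example and the helper names probability, conditional, and expectation are illustrative assumptions, not part of the syllabus.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two fair dice.
OUTCOMES = list(product(range(1, 7), repeat=2))

def probability(event):
    """P(event) under the uniform distribution on OUTCOMES."""
    return Fraction(sum(1 for o in OUTCOMES if event(o)), len(OUTCOMES))

def conditional(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    return probability(lambda o: event(o) and given(o)) / probability(given)

def expectation(f):
    """E[f(X)] for a random variable f defined on OUTCOMES."""
    return sum(Fraction(f(o)) for o in OUTCOMES) / len(OUTCOMES)

def total(outcome):
    """Random variable: the sum of the two dice."""
    return outcome[0] + outcome[1]

mean = expectation(total)
# Var(X) = E[X^2] - (E[X])^2
variance = expectation(lambda o: total(o) ** 2) - mean ** 2
# Probability that the sum is 8, given that the first die is even.
p_sum8_given_first_even = conditional(
    lambda o: total(o) == 8, lambda o: o[0] % 2 == 0
)
print(mean, variance, p_sum8_given_first_even)  # prints: 7 35/6 1/6
```

Using exact fractions instead of floats keeps the results easy to check by hand against the textbook formulas.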
In the third course, you will learn a variety of supervised and unsupervised learning algorithms and the theory behind those algorithms.
Using real-world case studies, you will learn how to classify images, identify salient topics in a corpus of documents, partition people according to personality profiles, and automatically capture the semantic structure of words and use it to categorize documents. You will cover the following topics:
- Prediction problems.
- Generative modeling 1 & 2.
- Linear regression and probability estimation.
- Optimization and Geometry.
- Linear classification.
- Combining simple classifiers.
- Representation Learning 1 & 2.
- Deep learning.
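As one concrete instance of a prediction problem from the list above, the following sketch fits a one-variable linear regression by ordinary least squares in plain Python. The data (hours studied versus exam score) and the function names fit_line and predict are made up for illustration. The course itself may use different tools and datasets.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical training data: hours studied vs. exam score.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 58.0, 61.0, 67.0, 72.0]

slope, intercept = fit_line(hours, scores)

def predict(x):
    """Predict a score for x hours using the fitted line."""
    return slope * x + intercept

print(f"slope={slope:.2f}, intercept={intercept:.2f}, predict(6)={predict(6.0):.2f}")
```

The closed-form solution only works this simply for a single input variable. With many features, the same least-squares objective is typically solved with linear algebra routines or gradient-based optimization.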
In the fourth course, you will learn what the bottlenecks are in massively parallel computation and how to use Spark in big data analytics to minimize them. You will also learn how to perform supervised and unsupervised machine learning on massive datasets using the Machine Learning Library (MLlib). As in the other courses in this MicroMasters program, you will gain hands-on experience using PySpark within the Jupyter notebooks environment. You will cover the following topics:
- Introduction and Course information on Big Data Analytics Using Spark.
- Map-Reduce and Spark.
- PCA and Weather Analysis.
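The Map-Reduce idea listed above can be sketched without a cluster. The example below is a single-machine word count in plain Python that mimics the map, shuffle, and reduce phases. It is not Spark code, and the function names map_phase, shuffle, and reduce_phase as well as the sample lines are assumptions made for illustration.

```python
from collections import defaultdict
from functools import reduce

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(values):
    """Reduce: combine all values for one key with an associative operation."""
    return reduce(lambda a, b: a + b, values)

lines = ["Spark makes big data simple", "big data needs big clusters"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = {word: reduce_phase(values) for word, values in shuffle(pairs).items()}
print(counts["big"], counts["data"], counts["spark"])  # prints: 3 2 1
```

Because the reduce step is associative, a distributed engine can apply it to partial groups on different machines and merge the results. In PySpark, the analogous RDD operations are flatMap, map, and reduceByKey.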
- University of California, San Diego
- 3+ Months
- Paid Course (Paid certificate)
- Apache Spark, Jupyter Notebook
- Intermediate, Calculus, Linear Algebra, Previous Programming Experience
- Apache Spark Training, Big Data, Data Science, Data Science with Python, Machine Learning, Practical Statistics, Probability