Data Science: Data Wrangling

Product is rated as #58 in category Data Science
Learning Experience: 9

Data wrangling is a critical step for any data scientist. Knowing how to wrangle and clean data will enable you to make critical insights that would otherwise be hidden.

Last updated on August 9, 2022 2:13 am

About this course

In this course, part of the Professional Certificate Program in Data Science, you will cover several standard steps of the data wrangling process like importing data into R, tidying data, string processing, HTML parsing, working with dates and times, and text mining. Rarely are all these wrangling steps necessary in a single analysis, but a data scientist will likely face them all at some point.

Very rarely is data easily accessible in a data science project. It’s more likely for the data to be in a file, a database, or extracted from documents such as web pages, tweets, or PDFs. In these cases, the first step is to import the data into R and tidy the data, using the tidyverse package. The steps that convert data from its raw form to the tidy form are called data wrangling.

This process is a critical step for any data scientist. Knowing how to wrangle and clean data will enable you to make critical insights that would otherwise be hidden.
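
As a rough illustration of the raw-to-tidy conversion described above, the sketch below reshapes a small "wide" table into tidy form with the tidyr and dplyr packages from the tidyverse. The table, its column names, and its values are made up for this example and are not taken from the course materials.

```r
library(tidyr)
library(dplyr)

# Made-up "wide" data: one row per country, one column per year.
wide <- tibble(
  country = c("Country A", "Country B"),
  `1960`  = c(2.4, 6.2),
  `1961`  = c(2.5, 6.0)
)

# Tidy form: one row per observation (country, year), one column per variable.
tidy <- wide %>%
  pivot_longer(cols = -country, names_to = "year", values_to = "fertility") %>%
  mutate(year = as.integer(year))  # column names arrive as text, so convert

print(tidy)
```

In the tidy version each row is a single country-year observation, which is the shape most tidyverse functions expect.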

Prerequisites

This is the sixth course in the HarvardX Data Science Professional Certificate Series. It is strongly recommended that you take the first five courses in the series before taking this course. At a minimum, you should have taken Data Science: R Basics.

Do I have to take the courses in sequence?

The courses in the HarvardX Data Science Professional Certificate are designed to be taken in the following order:

  1. R Basics
  2. Visualization
  3. Probability
  4. Inference and Modeling
  5. Productivity Tools
  6. Wrangling
  7. Linear Regression
  8. Machine Learning
  9. Capstone

Each subsequent course assumes familiarity with the content in the preceding courses. Depending on your experience with data science generally and R specifically, you may be able to take the courses out of sequence if you choose.

What will you learn from this Data Wrangling course?

  • Importing data into R from different file formats.
  • Web scraping.
  • How to tidy data using the tidyverse to better facilitate analysis.
  • String processing with regular expressions (regex).
  • Wrangling data using dplyr.
  • How to work with dates and times.
  • Text mining.
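
To give a flavor of two of the topics listed above, here is a minimal sketch of string processing with a regular expression and of date parsing. It assumes the stringr and lubridate packages are installed; the input values are invented for illustration, and the course itself may use different functions or data.

```r
library(stringr)
library(lubridate)

# Invented raw entries: heights typed as feet'inches".
heights_raw <- c("5'10\"", "6'2\"", "5'7\"")

# Capture feet and inches with a regex, then convert to total inches.
parts  <- str_match(heights_raw, "^(\\d)'(\\d{1,2})\"$")
inches <- as.numeric(parts[, 2]) * 12 + as.numeric(parts[, 3])
print(inches)

# Invented dates stored as text: parse them, then extract the month.
dates_raw <- c("2017-09-20", "2017-10-01")
dates     <- ymd(dates_raw)
print(month(dates))
```

Here `str_match` returns a character matrix whose first column is the full match and whose remaining columns are the captured groups, which is why columns 2 and 3 are used.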

Syllabus

Introduction and Welcome

  • Welcome to Data Science: Wrangling!
  • Important Pre-Course Survey

Section 1: Data Import

1.1: Data Import

Section 2: Tidy Data

2.1: Reshaping Data

2.2: Combining Tables

2.3: Web Scraping

Section 3: String Processing

3.1: String Processing Part 1

3.2: String Processing Part 2

3.3: String Processing Part 3

Section 4: Dates, Times, and Text Mining

4.1: Dates, Times, and Text Mining

Comprehensive Assessment and Course Wrap-up

  • Comprehensive Assessment: Puerto Rico Hurricane Mortality

Note: Your review matters 

If you have already done this course, kindly drop your review in our reviews section. It would help others to get useful information and better insight into the course offered.


Free Course
Verified Certificate at $99.00

  • edX
  • Harvard University
  • Online Course
  • Self-paced
  • Beginner
  • 1-3 Months
  • Free Course (Affordable Certificate)
  • English
  • R
  • RStudio
  • No prerequisites
  • Data Analysis, Data Science, Data Science with 'R', Machine Learning, Probability, Web Scraping
PROS: Concise teaching of the main concepts, packed with great material. Good practical examples and exercises related to the course subjects. A vivid introduction to the subject that is very clear and easy to understand. This part of the Professional Certificate Program in Data Science covers several standard steps of the data wrangling process.
CONS: This is an entry-level course on the subject. Very lengthy preliminary content.
