Data Preprocessing for Data Science

University of Maryland Baltimore County

Learn how to prepare and transform data for analysis and machine learning. The course covers data cleaning, normalization, domain reduction, and dimensionality reduction methods such as PCA and t-SNE, which make data easier to use and to visualize.

6 hrs/week, 4 weeks, English, 473 enrolled

About this Course

The Data Preprocessing for Data Science course is a comprehensive introduction to the essential steps in preparing data for analysis and machine learning. It covers key techniques and tools for cleaning, transforming, and reducing data so that it is in good shape for building accurate, reliable models. You will gain practical experience using Python and popular libraries such as NumPy and scikit-learn.

What You'll Learn

  • Understand how to import datasets from various sources, focusing on CSV files and how to manage different file structures.
  • Understand the concepts of domain and range in data science.
  • Split data into training and testing sets.
  • Determine the accuracy of your machine learning models.
  • Apply min-max scaling and Z-score standardization.
  • Use domain reduction to reduce the size of your data's domain.
  • Use PCA for dimensionality reduction.
  • Find hidden patterns in your data using Factor Analysis.
  • Visualize high-dimensional data using t-SNE.
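
To give a feel for several of the topics above, here is a minimal sketch of a train/test split, min-max scaling, Z-score standardization, and PCA. It is not taken from the course materials: it uses only NumPy (the course may well use scikit-learn for these steps), and the toy dataset, the 80/20 split, and the choice of two components are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dataset: 100 samples, 3 features on very different scales.
X = rng.normal(loc=[0.0, 50.0, 1000.0], scale=[1.0, 10.0, 200.0], size=(100, 3))

# Split into training and testing sets (80/20) after shuffling the row indices.
idx = rng.permutation(len(X))
split = int(0.8 * len(X))
X_train, X_test = X[idx[:split]], X[idx[split:]]

# Min-max scaling: map each feature to [0, 1] using training-set statistics only.
col_min = X_train.min(axis=0)
col_max = X_train.max(axis=0)
X_train_minmax = (X_train - col_min) / (col_max - col_min)
X_test_minmax = (X_test - col_min) / (col_max - col_min)

# Z-score standardization: zero mean and unit variance per feature,
# again computed from the training set and reused on the test set.
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)
X_train_z = (X_train - mean) / std
X_test_z = (X_test - mean) / std

# PCA via SVD of the standardized (already centered) training data.
# Rows of Vt are the principal directions; keep the top 2.
U, S, Vt = np.linalg.svd(X_train_z, full_matrices=False)
components = Vt[:2]
X_train_pca = X_train_z @ components.T
X_test_pca = X_test_z @ components.T

# Share of total variance captured by each principal component.
explained_variance_ratio = S**2 / np.sum(S**2)
```

The scaling statistics and the principal directions are computed from the training set alone and then applied to the test set, so no information from the test data leaks into preprocessing. One consequence is that scaled test values can fall slightly outside the [0, 1] range.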

Prerequisites

  • Basic programming knowledge.
  • Familiarity with Python is recommended but not required.

Course Info

Platform: edX
Level: Beginner
Pacing: Unknown
Certificate: Available
Price: Free to Audit
