Basics of Data Science

RWTH Aachen University

"Basics of Data Science" gives a comprehensible overview of many fundamental concepts and tools of data science: data quality and data preprocessing, supervised and unsupervised learning techniques and their evaluation, frequent itemsets and association rules, sequence mining, process mining, text mining, and responsible data science.

8 hrs/week | 9 weeks | English | 1,494 enrolled

About this Course

Explore the fundamentals of data science with our online course! "Basics of Data Science" provides a comprehensive overview of the fundamental challenges, concepts, and tools of data science. The content is organized into three main areas.

First, the course gives a brief overview of data science infrastructure, which is concerned with volume and velocity. Topics include instrumentation, big data infrastructures and distributed systems, databases, and data management. The main challenge is to make things scalable and instant.

Second, the main focus of the course is data analysis, which is concerned with extracting knowledge from data. Key topics are data exploration and visualization, data preprocessing, data quality issues and transformations, various supervised learning techniques with a focus on their evaluation, unsupervised learning, clustering, pattern mining, process mining, and text mining. The main challenge of data analysis is to provide answers to known and unknown unknowns.

Finally, data science affects people, organizations, and society. The course concludes by discussing challenges and providing guidelines and techniques for applying data science responsibly, with a focus on confidentiality and fairness. Topics include ethics and privacy, IT law, human-technology interaction, operations management, business models, and entrepreneurship. The main challenge is to do all of the above in a responsible manner.

Throughout the course, the ideas and concepts conveyed in the videos are complemented by hands-on exercises in Python (Jupyter notebooks). Participants are guided to apply the presented techniques to artificial and real-life data sets to gain hands-on experience.

After the course, participants should have a good overview of the best practices, challenges, goals, and concepts of the broader data science field, providing a strong foundation for further study or professional development in this rapidly evolving field. Combined with hands-on experience with commonly used Python libraries, this enables participants to conceptualize and implement various basic data analysis techniques in their own projects and to accurately evaluate and interpret analysis results. Enroll now to start your journey into the world of data science!

What You'll Learn

  • Understanding of the role of data science in today’s society and businesses, including challenges and opportunities
  • Good general overview of a broad range of data science techniques
  • Ability to conceptualize and implement basic data analysis and accurately evaluate and interpret the outcomes
  • Understanding of the challenges of responsible data science (fairness, accuracy, confidentiality, transparency) and possible solutions
  • Understanding of the limitations of machine learning, data mining and AI techniques
  • Ability to write short Python programs and use mainstream Python libraries
  • In particular, understanding of and ability to apply the following data analysis concepts and techniques:
      • data visualization and exploration techniques
      • decision trees
      • linear and logistic regression (basic overview)
      • support vector machines (basic overview)
      • neural networks (basic overview)
      • naive Bayesian classification (basic overview)
      • evaluation and interpretation of the results obtained using supervised learning
      • clustering techniques
      • frequent item sets
      • association rules
      • sequence mining
      • process mining
      • text mining
      • data preprocessing, data transformation, spotting and handling of data quality problems
  • Application of data analysis techniques without violating confidentiality and fairness
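
To give a flavor of two of the techniques listed above, frequent item sets and association rules, here is a minimal sketch in plain Python. It is not taken from the course materials: the toy transactions, the function names (`support`, `frequent_itemsets`, `confidence`), and the brute-force enumeration are all assumptions made for illustration, and real exercises would more likely rely on a dedicated library and a smarter algorithm such as Apriori.

```python
from itertools import combinations

# Hypothetical toy transactions (made up for illustration, not course data).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force search: test every itemset with up to `max_size` items."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            s = support(candidate, transactions)
            if s >= min_support:
                result[candidate] = s
    return result


def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


frequent = frequent_itemsets(transactions, min_support=0.5)
rule_conf = confidence({"bread"}, {"milk"}, transactions)

for itemset, s in frequent.items():
    print(itemset, round(s, 2))
print("confidence(bread -> milk) =", round(rule_conf, 2))
```

Support measures how often an item set occurs across all transactions, while confidence measures how often the rule's consequent appears among the transactions that contain its antecedent. Brute-force enumeration is only practical for tiny examples like this one.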

Prerequisites

  • Anyone from any discipline with an interest in data science can start this course. Prior knowledge of mathematics (e.g., mathematical notation, linear algebra, stochastics, and statistics) is an advantage, but not mandatory.

Instructors

Prof. Dr. Wil van der Aalst

Head of the Chair for Process and Data Science

Lisa Luise Mannel

Doctoral student at the Process and Data Science (PADS) group

Topics

Jupyter
Pattern Mining
Association Rule Learning
Operations Management
Big Data
Supervised Learning
Entrepreneurship
Data Preprocessing
Process Mining
Text Mining
Data Management
Data Quality

Course Info

Platform: edX
Level: Intermediate
Pacing: Unknown
Certificate: Available
Price: Free to Audit

Skills

Jupyter
Pattern Mining
Association Rule Learning
Operations Management
Big Data
Supervised Learning
Entrepreneurship
Data Preprocessing
Process Mining
Text Mining
