Cloud Computing Applications, Part 2

University of Illinois Urbana-Champaign

Learn how cloud computing facilitates big data analytics through advanced tools and frameworks for processing diverse large-scale data efficiently.

5 weeks | English | 33,877 enrolled

About this Course

Welcome to Cloud Computing Applications, Part 2, the second course in a two-course series designed to give you a comprehensive view of the world of Cloud Computing and Big Data. This course explores how the cloud opens up analytics on huge volumes of data, whether static or streamed at high velocity, that represent an enormous variety of information. Cloud applications and data analytics represent a disruptive change in the ways society is informed by and uses information.

Week one begins by introducing major systems for data analysis, including Spark, and the major frameworks and distributions for analytics applications, including Hortonworks, Cloudera, and MapR. By the middle of the week we introduce HDFS, the robust distributed file system used by many applications such as Hadoop. We finish the week by exploring the MapReduce programming model and how distributed operating systems like YARN and Mesos support a flexible and scalable environment for Big Data analytics.

Week two introduces large-scale data storage and the difficulties of reaching consensus in enormous stores that span large numbers of processors, memories, and disks. We discuss eventual consistency, ACID, and BASE, along with the consensus algorithms and coordination services used in data centers, including Paxos and ZooKeeper. The course then presents distributed key-value stores and in-memory databases like Redis, which are used in data centers for performance. Next we present NoSQL databases and visit HBase, the scalable, low-latency database that supports database operations in applications that use Hadoop. We then show how Spark SQL can run SQL queries on huge data sets. We finish week two with a presentation on distributed publish/subscribe systems using Kafka, a distributed log messaging system that is widely used to connect Big Data and streaming applications into complex systems.

Week three moves to fast data and real-time streaming and introduces Storm, a technology widely used at companies such as Yahoo. We continue with Spark Streaming, the Lambda and Kappa architectures, and a presentation of the streaming ecosystem.

Week four focuses on graph processing, machine learning, and deep learning. We introduce the ideas of graph processing and present Pregel, Giraph, and Spark GraphX. We then move to machine learning with examples from Mahout and Spark, including k-means, Naive Bayes, and frequent pattern mining (FPM). Spark ML and MLlib continue the theme of programmability and application construction. The last topic in week four introduces deep learning technologies, including Theano, TensorFlow, CNTK, MXNet, and Caffe on Spark.

What You'll Learn

  • Understand big data analytics systems like Spark
  • Identify frameworks and distributions for data analytics
  • Explain cloud computing’s role in big data processing
  • Evaluate the impact of cloud data applications

Prerequisites

  • No deep prior experience is required, but basic computer and internet skills are helpful
  • Ability to read course instructions in English and complete short practice activities

Instructors

Reza Farivar

Data Engineering Manager at Capital One, Adjunct Research Assistant Professor of Computer Science

Roy H. Campbell

Professor of Computer Science

Topics

Computer Security and Networks
Computer Science
Apache Spark
Databases
Apache Hadoop
Data Processing
Machine Learning
Apache Kafka
Big Data
Distributed Computing

Course Info

Platform: Coursera
Level: Unknown
Pacing: Unknown
Price: Free

Skills

Computer Security and Networks
Computer Science
Apache Spark
Databases
Apache Hadoop
Data Processing
Machine Learning
Apache Kafka
Big Data
Distributed Computing
