
Quick Facts

Medium of Instruction | Mode of Learning | Mode of Delivery
English | Self Study | Video and Text Based

Course Overview

The Distributed Computing with Spark SQL certification course is a 14-hour course offered on the Coursera platform, with a syllabus curated by the University of California, Davis (UC Davis). It is part of the Learn SQL Basics for Data Science Specialization.

This Distributed Computing with Spark SQL training course is designed for candidates who already have some working knowledge of SQL and want to take the next step in their data journey. Its 4 modules build on one another. By the end of the fourth module, candidates will have learned how to work with Spark SQL and how to construct reliable data pipelines.

The Highlights

  • Online course
  • Shareable certificate
  • 14 hours for completion
  • Available in English

Programme Offerings

  • Flexible Deadlines
  • Short Programme
  • Subtitles in different languages

Courses and Certificate Fees

Certificate Availability | Certificate Providing Authority
Yes | Coursera

The Distributed Computing with Spark SQL certification fee is charged through monthly plans, mainly for 1 month, 3 months, or 6 months. Each plan states the number of learning hours it covers and includes a shareable certificate at the end of the course.

Distributed Computing with Spark SQL Fee Details

Description | Amount in INR
1 Month | Rs. 3,257/month


Eligibility Criteria

Certification Qualifying Details

  • The Distributed Computing with Spark SQL certification by Coursera is awarded once candidates have completed the course requirements.

What you will learn

SQL knowledge

Candidates will learn the following from the Distributed Computing with Spark SQL certification syllabus:

  • Using a collaborative workspace to write and run Spark SQL.
  • Inspecting the Spark UI to analyze query performance and identify bottlenecks.
  • Creating end-to-end pipelines that read data, transform it, and save the result.
  • Building a medallion architecture with bronze, silver, and gold layers to ensure performance, scalability, and reliability.
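
The read, transform, and save pipeline and the bronze, silver, and gold layering in the list above can be sketched roughly as follows. This is an illustrative stand-in, not course material: it uses Python's built-in sqlite3 module instead of Spark SQL and Delta Lake so that it runs without a cluster, and the table names, column names, and sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical raw records (assumption: not taken from the course).
# A missing amount is represented as None and is dropped in the silver layer.
RAW_EVENTS = [
    ("2024-01-01", "alice", "12.50"),
    ("2024-01-01", "bob", None),
    ("2024-01-02", "alice", "7.25"),
    ("2024-01-02", "carol", "3.00"),
]


def build_medallion(conn, raw_rows):
    """Load raw rows (bronze), clean them (silver), and aggregate them (gold)."""
    cur = conn.cursor()

    # Bronze: data stored as ingested, with every value kept as text.
    cur.execute(
        "CREATE TABLE bronze_events (event_date TEXT, customer TEXT, amount TEXT)"
    )
    cur.executemany("INSERT INTO bronze_events VALUES (?, ?, ?)", raw_rows)

    # Silver: rows with a missing amount are removed and the amount is typed.
    cur.execute(
        """
        CREATE TABLE silver_events AS
        SELECT event_date, customer, CAST(amount AS REAL) AS amount
        FROM bronze_events
        WHERE amount IS NOT NULL
        """
    )

    # Gold: business-level aggregate, here one total per day.
    cur.execute(
        """
        CREATE TABLE gold_daily_totals AS
        SELECT event_date, SUM(amount) AS total_amount, COUNT(*) AS n_events
        FROM silver_events
        GROUP BY event_date
        """
    )
    conn.commit()

    return cur.execute(
        "SELECT event_date, total_amount, n_events "
        "FROM gold_daily_totals ORDER BY event_date"
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    for row in build_medallion(connection, RAW_EVENTS):
        print(row)
```

In Spark SQL the same three stages would typically be written as CREATE TABLE ... AS SELECT statements over tables in a lakehouse; the point of the sketch is only the layering, where each layer is derived from the one before it and the raw data is never modified.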

Who it is for

Distributed Computing with Spark SQL is ideal for professionals such as data scientists and computer programmers.


Admission Details

To get admission to the Distributed Computing with Spark SQL classes, students can follow these steps:

Step 1: Go to the official course page: https://www.coursera.org/learn/spark-sql

Step 2: Find the ‘Enroll Now’ button and click on it.

Step 3: Create an account, or log in if you already have one, and then choose either the free mode or the paid mode.

Step 4: Enrollment in the course is completed according to the mode chosen in Step 3.

The Syllabus

Videos
  • Course Introduction
  • Why Distributed Computing?
  • Spark DataFrames
  • The Databricks Environment
  • SQL in Notebooks
  • Import Data
Readings
  • A Note From UC Davis
  • Readings and Resources
  • Assignment #1 - Queries in Spark SQL
Practice Exercises
  • Assignment #1 Quiz - Queries in Spark SQL
  • Module 1 Quiz

Videos
  • Module Introduction
  • Spark Terminology
  • Caching
  • Shuffle Partitions
  • Spark UI
  • Adaptive Query Execution (AQE)
Readings
  • Readings
  • Assignment #2 - Spark Internals
Practice Exercises
  • Assignment #2 Quiz - Spark Internals
  • Module 2 Quiz

Videos
  • Module Introduction
  • Spark as a Connector
  • Accessing Data
  • File Formats
  • JSON, Schemas and Types
  • Writing Data
  • Tables and Views
Readings
  • Readings
  • Assignment #3 - Engineering Data Pipelines
Practice Exercises
  • Assignment #3 Quiz - Engineering Data Pipelines (30m)
  • Module 3 Quiz

Videos
  • Module Introduction
  • Data Lakes vs. Data Warehouses
  • What is a Lakehouse?
  • Delta Lake
  • Delta Lake (Demo)
  • Delta Advanced Features (Demo)
  • Continuing with Spark and Data Science
  • Course Summary
Readings
  • Readings
  • Assignment #4 - Lakehouse
Practice Exercises
  • Assignment #4 Quiz - Lakehouse
  • Module 4 Quiz

Instructors

UC Davis

Frequently Asked Questions (FAQs)

1: The Distributed Computing with Spark SQL online course is part of which main programme?

The main programme is the ‘Learn SQL Basics for Data Science Specialization’.

2: Are the deadlines mentioned on the Coursera platform flexible?

Yes, the deadlines can be adjusted on the Coursera platform.

3: What is the level of the Distributed Computing with Spark SQL online course?

The level is intermediate.

4: Who are the tutors for the Distributed Computing with Spark SQL course?

There are two tutors: Brooke Wenig and Conor Murphy.

5: Which institution supports the online course on Distributed Computing with Spark SQL?

UC Davis is the partnering institution.
