This course is about big data. It is for students with SQL experience who want to take the next step in their data journey by learning distributed computing with Apache Spark. Students will gain a thorough understanding of this open-source standard for working with large datasets and of the fundamentals of data analysis using SQL on Spark, laying the foundation for combining data with advanced analytics at scale and in production environments. The four modules build on one another, and by the end of the course you will understand the Spark architecture, queries within Spark, common ways to optimize Spark SQL, and how to build reliable data pipelines.
This course is part of the Learn SQL Basics for Data Science Specialization.
What you will learn
Use the collaborative Databricks workspace to write scalable Spark SQL code that executes against a cluster of machines
Inspect the Spark UI to analyze query performance and identify bottlenecks
Create an end-to-end pipeline that reads data, transforms it, and saves the result
Build a medallion (bronze, silver, gold) lakehouse architecture with Delta Lake to ensure the reliability, scalability, and performance of your data
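As a rough illustration of the last two objectives, here is a minimal sketch of a read, transform, and save pipeline organized into bronze, silver, and gold layers. It is not course material. To stay self-contained it uses Python's built-in sqlite3 module as a stand-in for Spark SQL and Delta Lake, and the table names, columns, and sample rows are all hypothetical.

```python
import sqlite3


def run_pipeline(raw_rows):
    """Load raw rows, clean them, aggregate them, and return the gold-layer rows."""
    # Assumption: an in-memory SQLite database stands in for a Spark cluster
    # with Delta Lake tables; only the layering idea carries over.
    conn = sqlite3.connect(":memory:")

    # Bronze: raw data stored as ingested, including bad records.
    conn.execute("CREATE TABLE bronze (city TEXT, temp_c TEXT)")
    conn.executemany("INSERT INTO bronze VALUES (?, ?)", raw_rows)

    # Silver: cleaned and typed; rows with missing values are dropped.
    conn.execute("""
        CREATE TABLE silver AS
        SELECT TRIM(city) AS city, CAST(temp_c AS REAL) AS temp_c
        FROM bronze
        WHERE city IS NOT NULL AND temp_c IS NOT NULL AND temp_c != ''
    """)

    # Gold: aggregated and ready for reporting.
    conn.execute("""
        CREATE TABLE gold AS
        SELECT city, AVG(temp_c) AS avg_temp_c, COUNT(*) AS readings
        FROM silver
        GROUP BY city
    """)

    rows = conn.execute(
        "SELECT city, avg_temp_c, readings FROM gold ORDER BY city"
    ).fetchall()
    conn.close()
    return rows


# Hypothetical sample data with one empty reading and one missing city.
raw = [
    ("Lisbon", "18.5"),
    (" Lisbon ", "21.5"),
    ("Porto", "16.0"),
    ("Porto", ""),
    (None, "12.0"),
]
gold = run_pipeline(raw)
print(gold)
```

Keeping the raw bronze table untouched means the cleaning rules in the silver step can be changed and re-run without ingesting the data again.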
Skills you will gain
- Data Science
- Apache Spark
- Delta Lake
- SQL
Syllabus - What you will learn from this course
Introduction to Spark
Spark Core Concepts
Engineering Data Pipelines
Data Lakes, Warehouses and Lakehouses
Reviews
- 5 stars: 65.90%
- 4 stars: 23.32%
- 3 stars: 6.71%
- 2 stars: 1.94%
- 1 star: 2.12%
Top reviews from DISTRIBUTED COMPUTING WITH SPARK SQL
This course was more of an introduction to the Databricks platform but did introduce important concepts of Spark. I would have liked a somewhat longer course that goes into more detail.
A good introduction to Spark SQL.
A pity that the course assignment was not very challenging but more focused on understanding of Spark itself.
Great course! I loved the exercises, they were very helpful and I learned a lot about Spark and ML from this course.
A good course to learn the fundamentals of Databricks, distributed computing, and the Spark unified analytics platform.