Modern Big Data Analysis with SQL Specialization

Learn Data Analysis for Big Data. Master using SQL for data analysis on distributed big data systems.

Instructors: Glynn Durham

28,796 already enrolled

3 course series

Get in-depth knowledge of a subject

4.8

(1,175 reviews)

Beginner level

No prior experience required

1 month

at 10 hours a week

Flexible schedule

Learn at your own pace

Details to know

Shareable certificate

Add to your LinkedIn profile

Taught in English

Advance your subject-matter expertise

Learn in-demand skills from university and industry experts
Master a subject or tool with hands-on projects
Develop a deep understanding of key concepts
Earn a career certificate from Cloudera

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV

Share it on social media and in your performance review

Specialization - 3 course series

This Specialization teaches the essential skills for working with large-scale data using SQL.

Maybe you are new to SQL and you want to learn the basics. Or maybe you already have some experience using SQL to query smaller-scale data with relational databases. Either way, if you are interested in gaining the skills necessary to query big data with modern distributed SQL engines, this Specialization is for you.

Most courses that teach SQL focus on traditional relational databases, but today, more and more of the data that’s being generated is too big to be stored there, and it’s growing too quickly to be efficiently stored in commercial data warehouses. Instead, it’s increasingly stored in distributed clusters and cloud storage. These data stores are cost-efficient and scale to very large data volumes.

To query these huge datasets in clusters and cloud storage, you need a newer breed of SQL engine: distributed query engines, like Hive, Impala, Presto, and Drill. These are open-source SQL engines capable of querying enormous datasets. This Specialization focuses on Hive and Impala, the most widely deployed of these query engines.

This Specialization is designed to provide excellent preparation for the Cloudera Certified Associate (CCA) Data Analyst certification exam. You can earn this certification credential by taking a hands-on practical exam using the same SQL engines that this Specialization teaches—Hive and Impala.

Applied Learning Project

Each course in this Specialization includes a hands-on, peer-graded assignment. To earn the Specialization Certificate, you must successfully complete the hands-on, peer-graded assignment in each course. Unlike some other Coursera Specializations, this one has no separate Capstone Project.

Foundations for Big Data Analysis with SQL

Course 1 | 11 hours | 4.7 (1,087 ratings)

What you'll learn

Distinguish operational from analytic databases, and understand how these are applied in big data
Understand how database and table design provides structures for working with data
Appreciate how differences in volume and variety of data affect your choice of an appropriate database system
Recognize the features and benefits of SQL dialects designed to work with big data systems for storage and analysis

Skills you'll gain

Big Data, Data Analysis, Data Warehousing, Database (DBMS), SQL

Analyzing Big Data with SQL

Course 2 | 17 hours | 4.8 (528 ratings)

What you'll learn

Understand the basics of SELECT statements
Understand how and why to filter results
Explore grouping and aggregation to answer analytic questions
Work with sorting and limiting results
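
The four topics above (SELECT basics, filtering, grouping and aggregation, sorting and limiting) can be illustrated with a single query. The sketch below is not taken from the course materials: it uses Python's built-in SQLite module as a stand-in for Hive or Impala so that it runs without a cluster, and the `flights` table, its columns, and its rows are hypothetical sample data. The clauses shown are standard SQL, but details can differ between engines.

```python
import sqlite3

# Hypothetical sample data: (carrier, origin airport, delay in minutes).
ROWS = [
    ("AA", "SFO", 12),
    ("AA", "JFK", 45),
    ("UA", "SFO", 5),
    ("UA", "ORD", 30),
    ("DL", "ATL", 0),
    ("DL", "JFK", 25),
]


def average_delay_by_carrier(min_delay=1, top_n=2):
    """Return the top_n carriers by average delay, ignoring delays below min_delay."""
    # SQLite in-memory database used only as a local stand-in for a big data engine.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE flights (carrier TEXT, origin TEXT, delay INTEGER)"
        )
        conn.executemany("INSERT INTO flights VALUES (?, ?, ?)", ROWS)
        query = """
            SELECT carrier, AVG(delay) AS avg_delay  -- aggregation
            FROM flights
            WHERE delay >= ?                         -- filtering
            GROUP BY carrier                         -- grouping
            ORDER BY avg_delay DESC                  -- sorting
            LIMIT ?                                  -- limiting
        """
        return conn.execute(query, (min_delay, top_n)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for carrier, avg_delay in average_delay_by_carrier():
        print(carrier, avg_delay)
```

With the sample rows above, the default call keeps only delays of at least one minute, averages them per carrier, and returns the two carriers with the highest averages.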

Skills you'll gain

Big Data, Data Analysis, Apache Impala, SQL, Apache Hive

Managing Big Data in Clusters and Cloud Storage

Course 3 | 20 hours | 4.7 (296 ratings)

What you'll learn

Use different tools to browse existing databases and tables in big data systems
Use different tools to explore files in distributed big data filesystems and cloud storage
Create and manage big data databases and tables using Apache Hive and Apache Impala
Describe and choose among different data types and file formats for big data systems

Skills you'll gain

Big Data, Distributed File Systems, SQL, Cloud Storage, Data Management

Instructors

Glynn Durham

2 Courses | 53,273 learners

Offered by

Cloudera

Frequently asked questions

Should I take the courses in a specific order?

Yes, the courses in this Specialization are intended to be taken in order:

1. Foundations for Big Data Analysis with SQL
2. Analyzing Big Data with SQL
3. Managing Big Data in Clusters and Cloud Storage

Is there a fourth course in this Specialization?

A fourth course entitled Advanced SQL for Big Data Analysis is currently under development. When it is completed, it will be added to this Specialization.

What are the hardware and software requirements for the exercise environment?

To use the hands-on environment for the courses in this Specialization, you need to download and install a virtual machine and the software on which to run it. Before continuing, be sure that you have access to a computer that meets the following hardware and software requirements:

• Windows, macOS, or Linux operating system (iPads and Android tablets will not work)
• 64-bit operating system (32-bit operating systems will not work)
• 8 GB RAM or more
• 25 GB free disk space or more
• Intel VT-x or AMD-V virtualization support enabled (on Mac computers with Intel processors, this is always enabled; on Windows and Linux computers, you might need to enable it in the BIOS)
• For Windows XP computers only: You must have an unzip utility such as 7-Zip or WinZip installed (Windows XP’s built-in unzip utility will not work)

Successfully completing this Specialization confers a Coursera Specialization Certificate. This is different from the Cloudera Certified Associate (CCA) Data Analyst credential. You can earn the CCA Data Analyst credential by passing a 120-minute performance-based exam. For pricing and other details, see CCA Data Analyst. If you complete this Specialization, including the honors lessons, then you should be well prepared to take the certification exam, but we cannot guarantee that you will pass it and earn the certification credential.

Each course in this Specialization includes a hands-on, peer-graded assignment. To earn the Specialization Certificate, you must earn the Course Certificate for each course in this Specialization. This requires that you successfully complete the hands-on, peer-graded assignment in each course. Unlike some other Coursera Specializations, this one has no separate Capstone Project.

Yes! To get started, click the course card that interests you and enroll. You can enroll and complete the course to earn a shareable certificate, or you can audit it to view the course materials for free. When you subscribe to a course that is part of a Specialization, you’re automatically subscribed to the full Specialization. Visit your learner dashboard to track your progress.