About this course
4.3
659 ratings
141 reviews
Specialization
100% online

Start instantly and learn at your own schedule.
Flexible deadlines

Reset deadlines in accordance to your schedule.
Hours to complete

Approx. 21 hours to complete

Suggested: 4 weeks of study, 6-8 hours/week...
Available languages

English

Subtitles: English...

Skills you will gain

Relational Algebra, Python Programming, MapReduce, SQL

Syllabus - What you will learn from this course

Week 1
6 hours to complete

Data Science Context and Concepts

Understand the terminology and recurring principles associated with data science, and understand the structure of data science projects and emerging methodologies to approach them. Why does this emerging field exist? How does it relate to other fields? How does this course distinguish itself? What do data science projects look like, and how should they be approached? What are some examples of data science projects? ...
22 videos (125 min total), 4 readings, 1 quiz
Videos
Appetite Whetting: Extreme Weather (2 min)
Appetite Whetting: Digital Humanities (8 min)
Appetite Whetting: Bibliometrics (4 min)
Appetite Whetting: Food, Music, Public Health (5 min)
Appetite Whetting: Public Health cont'd, Earthquakes, Legal (4 min)
Characterizing Data Science (5 min)
Characterizing Data Science, cont'd (5 min)
Distinguishing Data Science from Related Topics (4 min)
Four Dimensions of Data Science (6 min)
Tools vs. Abstractions (7 min)
Desktop Scale vs. Cloud Scale (5 min)
Hackers vs. Analysts (2 min)
Structs vs. Stats (5 min)
Structs vs. Stats cont'd (5 min)
A Fourth Paradigm of Science (3 min)
Data-Intensive Science Examples (6 min)
Big Data and the 3 Vs (5 min)
Big Data Definitions (4 min)
Big Data Sources (6 min)
Course Logistics (7 min)
Twitter Assignment: Getting Started (14 min)
Readings
Supplementary: Three-Course Reading List (10 min)
Supplementary: Resources for Learning Python (10 min)
Supplementary: Class Virtual Machine (10 min)
Supplementary: Github Instructions (10 min)
Week 2
5 hours to complete

Relational Databases and the Relational Algebra

Relational databases are the workhorse of large-scale data management. Although originally motivated by problems in enterprise operations, they have proven remarkably capable for analytics as well. But most importantly, the principles underlying relational databases are universal in managing, manipulating, and analyzing data at scale. Even as the landscape of large-scale data systems has expanded dramatically in the last decade, relational models and languages have remained a unifying concept. For working with large-scale data, there is no more important programming model to learn....
24 videos (122 min total), 1 quiz
Videos
From Data Models to Databases (4 min)
Pre-Relational Databases (5 min)
Motivating Relational Databases (3 min)
Relational Databases: Key Ideas (4 min)
Algebraic Optimization Overview (6 min)
Relational Algebra Overview (4 min)
Relational Algebra Operators: Union, Difference, Selection (6 min)
Relational Algebra Operators: Projection, Cross Product (4 min)
Relational Algebra Operators: Cross Product cont'd, Join (6 min)
Relational Algebra Operators: Outer Join (4 min)
Relational Algebra Operators: Theta-Join (4 min)
From SQL to RA (6 min)
Thinking in RA: Logical Query Plans (4 min)
Practical SQL: Binning Timeseries (5 min)
Practical SQL: Genomic Intervals (6 min)
User-Defined Functions (3 min)
Support for User-Defined Functions (4 min)
Optimization: Physical Query Plans (5 min)
Optimization: Choosing Physical Plans (4 min)
Declarative Languages (5 min)
Declarative Languages: More Examples (4 min)
Views: Logical Data Independence (5 min)
Indexes (6 min)
Week 3
5 hours to complete

MapReduce and Parallel Dataflow Programming

The MapReduce programming model (as distinct from its implementations) was proposed as a simplifying abstraction for parallel manipulation of massive datasets, and remains an important concept to know when using and evaluating modern big data platforms. ...
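
As a rough illustration of the programming model itself, independent of any particular implementation, the following Python sketch runs a word count through map, shuffle, and reduce steps on a single machine. The helper names (map_fn, reduce_fn, run_mapreduce) and the sample documents are assumptions made for this example, not course material.

```python
from collections import defaultdict


def map_fn(doc_id, text):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in text.lower().split():
        yield word, 1


def reduce_fn(word, counts):
    """Reduce step: combine all values emitted for one key."""
    yield word, sum(counts)


def run_mapreduce(inputs, mapper, reducer):
    """Single-process stand-in for a MapReduce runtime (hypothetical helper)."""
    # Shuffle: group every intermediate value under its key.
    groups = defaultdict(list)
    for key, value in inputs:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)
    # Reduce: call the reducer once per distinct key.
    results = []
    for out_key in sorted(groups):
        results.extend(reducer(out_key, groups[out_key]))
    return results


# Sample input, invented for illustration.
docs = [
    ("d1", "the quick brown fox"),
    ("d2", "the lazy dog and the fox"),
]
word_counts = dict(run_mapreduce(docs, map_fn, reduce_fn))
print(word_counts)
```

In a real platform the map calls and the per-key reduce calls would run in parallel across machines; only the two user-supplied functions would stay the same.
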
26 videos (122 min total), 1 quiz
Videos
A Sketch of Algorithmic Complexity (5 min)
A Sketch of Data-Parallel Algorithms (5 min)
"Pleasingly Parallel" Algorithms (4 min)
More General Distributed Algorithms (4 min)
MapReduce Abstraction (4 min)
MapReduce Data Model (3 min)
Map and Reduce Functions (2 min)
MapReduce Simple Example (3 min)
MapReduce Simple Example cont'd (3 min)
MapReduce Example: Word Length Histogram (2 min)
MapReduce Examples: Inverted Index, Join (6 min)
Relational Join: Map Phase (4 min)
Relational Join: Reduce Phase (4 min)
Simple Social Network Analysis: Counting Friends (3 min)
Matrix Multiply Overview (5 min)
Matrix Multiply Illustrated (4 min)
Shared Nothing Computing (4 min)
MapReduce Implementation (5 min)
MapReduce Phases (6 min)
A Design Space for Large-Scale Data Systems (4 min)
Parallel and Distributed Query Processing (5 min)
Teradata Example, MR Extensions (5 min)
RDBMS vs. MapReduce: Features (6 min)
RDBMS vs. Hadoop: Grep (5 min)
RDBMS vs. Hadoop: Select, Aggregate, Join (3 min)
Week 4
3 hours to complete

NoSQL: Systems and Concepts

NoSQL systems are purely about scale rather than analytics, and are arguably less relevant for the practicing data scientist. However, they occupy an important place in many practical big data platform architectures, and data scientists need to understand their limitations and strengths to use them effectively....
36 videos (166 min total)
Videos
NoSQL Roundup (4 min)
Relaxing Consistency Guarantees (3 min)
Two-Phase Commit and Consensus Protocols (5 min)
Eventual Consistency (4 min)
CAP Theorem (4 min)
Types of NoSQL Systems (4 min)
ACID, Major Impact Systems (4 min)
Memcached: Consistent Hashing (2 min)
Consistent Hashing, cont'd (4 min)
DynamoDB: Vector Clocks (5 min)
Vector Clocks, cont'd (5 min)
CouchDB Overview (4 min)
CouchDB Views (3 min)
BigTable Overview (5 min)
BigTable Implementation (5 min)
HBase, Megastore (3 min)
Spanner (5 min)
Spanner cont'd, Google Systems (6 min)
MapReduce-based Systems (5 min)
Bringing Back Joins (4 min)
NoSQL Rebuttal (4 min)
Almost SQL: Pig (4 min)
Pig Architecture and Performance (3 min)
Data Model (3 min)
Load, Filter, Group (5 min)
Group, Distinct, Foreach, Flatten (5 min)
CoGroup, Join (3 min)
Join Algorithms (3 min)
Skew (5 min)
Other Commands (3 min)
Evaluation Walkthrough (3 min)
Review (6 min)
Context (3 min)
Spark Examples (5 min)
RDDs, Benefits (6 min)
2 hours to complete

Graph Analytics

Graph-structured data are increasingly common in data science contexts due to their ubiquity in modeling the communication between entities: people (social networks), computers (Internet communication), cities and countries (transportation networks), or corporations (financial transactions). Learn the common algorithms for extracting information from graph data and how to scale them up. ...
21 videos (91 min total)
Videos
Structural Analysis (4 min)
Degree Histograms, Structure of the Web (4 min)
Connectivity and Centrality (4 min)
PageRank (3 min)
PageRank in more Detail (3 min)
Traversal Tasks: Spanning Trees and Circuits (5 min)
Traversal Tasks: Maximum Flow (1 min)
Pattern Matching (6 min)
Querying Edge Tables (4 min)
Relational Algebra and Datalog for Graphs (4 min)
Querying Hybrid Graph/Relational Data (3 min)
Graph Query Example: NSA (6 min)
Graph Query Example: Recursion (4 min)
Evaluation of Recursive Programs (3 min)
Recursive Queries in MapReduce (4 min)
The End-Game Problem (3 min)
Representation: Edge Table, Adjacency List (4 min)
Representation: Adjacency Matrix (2 min)
PageRank in MapReduce (5 min)
PageRank in Pregel (5 min)

Top reviews

by HA, Jan 11th 2016

Great course that strikes a balance between teaching general principles and concepts, and providing hands-on technical skills and practice. The lessons are well designed and clearly conveyed.

by SL, May 28th 2016

I like the breadth of coverage of this class. Each of the exercise is a gem in that I get to learn something new also. I would highly recommend this even to experience practitioner also.

Instructors

Bill Howe

Director of Research
Scalable Data Analytics

About University of Washington

Founded in 1861, the University of Washington is one of the oldest state-supported institutions of higher education on the West Coast and is one of the preeminent research universities in the world....

About the Data Science at Scale Specialization

Learn scalable data management, evaluate big data technologies, and design effective visualizations. This Specialization covers intermediate topics in data science. You will gain hands-on experience with scalable SQL and NoSQL data management solutions, data mining algorithms, and practical statistical and machine learning concepts. You will also learn to visualize data and communicate results, and you’ll explore legal and ethical issues that arise in working with big data. In the final Capstone Project, developed in partnership with the digital internship platform Coursolve, you’ll apply your new skills to a real-world data science project....
Data Science at Scale

Frequently Asked Questions – FAQ

  • When you enroll for a Certificate, you will have access to all videos, quizzes, and programming assignments (if applicable). Peer-reviewed assignments can only be submitted and reviewed once your session has begun. If you choose to explore the course without purchasing it, you may not be able to access certain assignments.

  • When you enroll in the course, you get access to all of the courses in the Specialization, and you can earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page, and you can print it or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course at no cost.

More questions? Visit the Learner Help Center.