Course Schedule
Part 1: Resources
Week 1
Mon, Sep 1
Labor Day!
Week 2
Mon, Sep 8
Deployment (Linux Pipelines)
Read: Designing Data-Intensive Applications, Kleppmann ("Batch Processing with Unix Tools" section of Chapter 10)
Watch: Lecture
Slides: PDF
Wed, Sep 10
Deployment (Docker)
Release: P1 (Docker)
Watch: Lecture
Slides: PDF
Anki Flashcards: Deck
Online Quiz: week 1
Fri, Sep 12
Network Resources (Overview)
Read: Designing Data-Intensive Applications, Kleppmann (Chapter 4, "Encoding and Evolution")
Watch: Lecture
Slides: PDF
Anki Flashcards: Deck
Week 3
Mon, Sep 15
Network Resources (gRPC)
Read: gRPC Basics Tutorial
Watch: Lecture
Slides: PDF
Anki Flashcards: Deck
Wed, Sep 17
gRPC demo
Read: gRPC Basics Tutorial
Watch: Lecture
Slides: PDF
Anki Flashcards: Deck
Online Quiz: week 2 and before (cumulative)
Week 4
Mon, Sep 22
Memory Resources (Caching)
Read: Systems Performance, Gregg (6.2.2; "CPU Caches" and "Latency" subsections of 6.4.1)
Due: P1
Release: P2 (Network+Memory)
Watch: Lecture
Slides: PDF
Anki Flashcards: Deck
Week 5
Mon, Sep 29
Memory Resources (PyArrow)
Read: Gallery of Processor Cache Effects (Examples 1 and 2)
Watch: Lecture
Slides: PDF
Anki Flashcards: Deck
Wed, Oct 1
Compute Resources (Threads)
Read: Fluent Python, 2nd Edition ("What's New in This Chapter" through "A Bit of Jargon" in chapter 19, "Concurrency Models in Python")
Watch: Lecture
Slides: PDF
Anki Flashcards: Deck
Fri, Oct 3
Compute Resources (Locks)
Read: Mastering Concurrency in Python ("Working With Threads In Python" chapter)
Watch: Lecture
Slides: PDF
Anki Flashcards: Deck
Week 6
Mon, Oct 6
Storage Resources (File Systems)
Due: P2
Release: P3 (Compute+Storage)
Watch: Lecture
Slides: PDF
Anki Flashcards: Deck
Wed, Oct 8
Storage Resources (Formats and DBs)
Read: Designing Data-Intensive Applications, Kleppmann ("Transaction Processing or Analytics?" and "Column-Oriented Storage" sections of Chapter 3, "Storage and Retrieval")
Evening: Exam 1
- Regular exam: 5:45 to 6:45 pm; Location: TBD
- McBurney exam: 5:45 to 7:45 pm; Location: TBD
Watch: Lecture
Slides: PDF
Anki Flashcards: Deck
Fri, Oct 10
SQL Databases (MySQL)
Read: MySQL Crash Course, Silva (Chapters 3+5), Designing Data-Intensive Applications, Kleppmann ("The Meaning of ACID" section in Chapter 7, "Transactions")
Watch: Lecture
Slides: PDF
Anki Flashcards: Deck
Part 2: Clusters
Week 7
Wed, Oct 15
Hadoop
Read: Mastering Hadoop 3, Singh et al. ("Deep Dive Into the Hadoop Distributed File System" chapter)
Release: P4 (HDFS, Loans)
Due: P3
Watch: Lecture
Slides: PDF
Anki Flashcards: Deck
Online Quiz: week 6 and before (cumulative)
Fri, Oct 17
MapReduce
Read: Learning Spark, 2nd edition by Damji et al. (sections "The Importance of an Optimal Storage Solution", "Databases", and "Data Lakes" of chapter 9, "Building Reliable Data Lakes with Apache Spark")
Watch: Lecture
Slides: PDF
Anki Flashcards: Deck
Week 8
Wed, Oct 22
Spark RDDs
Read: Learning Spark, 2nd edition by Damji et al. (Chapter 4, "Spark SQL and DataFrames: Introduction to Built-in Data Sources")
Watch: Lecture
Anki Flashcards: Deck
Fri, Oct 24
Spark DataFrames
Read: Designing Data-Intensive Applications, Kleppmann ("Reduce-Side Joins and Grouping" section of Chapter 10, "Batch Processing")
Watch: Lecture
Anki Flashcards: Deck
Week 9
Mon, Oct 27
Spark SQL
Read: Learning Spark, 2nd edition by Damji et al. (Chapter 7, "Optimizing and Tuning Spark Applications")
Due: P4
Release: P5 (Spark, Loans)
Watch: Lecture
Slides: PDF
Anki Flashcards: Deck
Fri, Oct 31
Spark Internals and Performance
Read: Learning Spark, 2nd edition by Damji et al. (Chapter 10, "Machine Learning with MLlib")
Watch: Lecture
Slides: PDF
Anki Flashcards: Deck
Week 10
Wed, Nov 5
Spark Machine Learning API
Read: Cassandra, The Definitive Guide, by Carpenter et al. (Chapter 4, "The Cassandra Query Language")
Watch: Lecture
Slides: PDF
Anki Flashcards: Deck
Online Quiz: week 9 and before (cumulative)
Week 11
Mon, Nov 10
Wide Tables: HBase and Cassandra
Read: Cassandra, The Definitive Guide, by Carpenter et al. (sections "Data Centers and Racks" to "Hinted Handoff" of Chapter 6, "The Cassandra Architecture")
Watch: Lecture
Slides: PDF
Anki Flashcards: Deck
Fri, Nov 14
Cassandra Partitioning
Read: Kafka, The Definitive Guide, 2nd edition by Shapira et al. ("Enter Kafka" section of Chapter 1, "Meet Kafka")
Watch: Lecture
Slides: PDF
Anki Flashcards: Deck
Week 12
Wed, Nov 19
Exam 2 review
Review/Catchup
- Regular exam: 5:45 to 6:45 pm; Location: TBD
- McBurney exam: 5:45 to 7:45 pm; Location: TBD
Watch: Lecture
Anki Flashcards: Deck
Fri, Nov 21
Streaming: Kafka Concepts
Read: Kafka, The Definitive Guide, 2nd edition by Shapira et al. ("Enter Kafka" section of Chapter 1, "Meet Kafka")
Due: P6
Release: P7 (Kafka, Weather Stations)
Watch: Lecture
Slides: PDF
Anki Flashcards: Deck
Week 13
Wed, Nov 26
Streaming: Kafka demo (contd.)
Read: Kafka, The Definitive Guide, 2nd edition by Shapira et al. (Chapter 7, "Reliable Data Delivery")
Watch: Lecture
Anki Flashcards: Deck
Fri, Nov 28
Thanksgiving Break
Part 3: Cloud
Week 14
Wed, Dec 3
The Cloud + BigQuery 1: Basics
Watch: Lecture
Slides: PDF
Anki Flashcards: Deck
Online Quiz: week 13 and before (cumulative)
Fri, Dec 5
BigQuery 1: Basics
Read: Google BigQuery: The Definitive Guide, by Lakshmanan et al. ("BigQuery Geographic Information Systems" section of Chapter 8, "Advanced Queries")
Due: P7
Release: P8 (Cloud Services)
Watch: Lecture
Week 15
Mon, Dec 8
BigQuery 1: demo catch-up
Read: Google BigQuery: The Definitive Guide, by Lakshmanan et al. (Chapter 9, "Machine Learning in BigQuery")
Watch: Lecture
Slides: PDF
Fri, Dec 12
No Class
Due: P8