Basic Course Info

Course Documentation

Course TA Staff

Course Overview

This course provides an introduction to computer science and programming for data science. By the end of the course, students will be able to:

  1. Import and manipulate data in a variety of formats
  2. Discuss how data is managed within organizations
  3. Describe how computers work at a basic level and reason about the implications of these hardware details for how we build software
  4. Take advantage of productivity-enhancing features of development environments (VS Code and Jupyter)
  5. Perform basic operations using the command line (Bash)
  6. Version control their software (Git)
  7. Solve programming exercises (Python)
  8. Create data visualizations using dashboarding software (Superset)
  9. Describe the relational data model and devise a SQL schema appropriate to a given use case
  10. Set up a SQL database and write SQL queries to perform basic data manipulation tasks (PostgreSQL with Supabase)
  11. Discuss the advantages and disadvantages of NoSQL databases, and set up and use a NoSQL database (MongoDB)
  12. Solve exercises on data structures and algorithms (including abstract data types, asymptotic notation, sorting and binary search, graph algorithms, and database algorithms)
  13. Describe the paradigmatic use cases for graph databases (Neo4j) and streaming databases (Kafka), and perform basic tasks using those databases
  14. Build systems which can perform computations in parallel across multiple nodes (PySpark)
  15. Get data from the web via scraping or interacting with REST APIs
  16. Deploy a dashboard-style website which draws from a data source and updates live