Programming with data (Fall 2024 Schedule)

Essential information

Syllabus with course description, policies, etc.

All readings are available online, either on the open web or through the NYU Library EZproxy.

Please turn in homework using this Google form.

Unit A: Computational building blocks

Session 01 (2024-09-09): Introduction

Some interesting data-related projects in the worlds of art and design:

Places to look for interesting datasets:

Reading assigned: What is data?

Optional but recommended:

Before the next session:

Please install Python on your computer. Once Python is installed, use the pip command to install Jupyter Notebook. Note that this will probably require doing a little bit of work at the command line (Terminal on macOS, PowerShell on Windows). Make sure that you can launch Jupyter Notebook on your machine before we begin next session.

NOTE: If you already have a working installation of Python 3 on your computer, you don’t need to install it again! Just use the version that you already have installed.
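If you're not sure whether your existing installation is ready, a short script along these lines may help you check. This is only a sketch: the function name check_setup and the minimum version (3.8) are assumptions for illustration, not course requirements. It reports which Python interpreter is running, its version, and whether the notebook package (the one that pip installs for Jupyter Notebook) can be found.

```python
import importlib.util
import sys


def check_setup(min_version=(3, 8)):
    """Summarize whether this Python installation looks ready for class.

    min_version is an assumed threshold for illustration, not a course rule.
    """
    return {
        # Full path of the interpreter running this script
        "executable": sys.executable,
        # Version as a readable string, e.g. "3.12.1"
        "python_version": ".".join(str(n) for n in sys.version_info[:3]),
        # True if the (major, minor) version is at least min_version
        "python_ok": sys.version_info[:2] >= tuple(min_version),
        # True if the "notebook" package is importable from this interpreter
        "notebook_installed": importlib.util.find_spec("notebook") is not None,
    }


if __name__ == "__main__":
    for key, value in check_setup().items():
        print(f"{key}: {value}")
```

Save this as a .py file and run it with the same python command you plan to use for class. If notebook_installed comes back False, Jupyter Notebook is probably not yet installed for that particular interpreter.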

Here are two good tutorials on YouTube:

And here’s a more general tutorial for all platforms.

Linux users and users of other UNIX-alikes: In this class, you can probably get away with using your distribution’s default Python 3. However, you may want to research a tool like pyenv or asdf to make it easier to have multiple versions of Python available on your machine at once (e.g., your distribution’s Python alongside the latest version of Python).

Another option for many platforms is Anaconda (though please read the licensing terms).

Note that (as far as I know) there is no satisfactory option for installing Python on iOS or Android. If you only have access to iOS and/or Android, you may be better off using a web-hosted service like PythonAnywhere (you will need their $5/mo service, which includes access to Jupyter Notebook). You can also use Google Colab in a pinch.

Session 02 (2024-09-16): Python basics

Session 03 (2024-09-23): Strings, lists and loops

Exercise #1 assigned: Python basics. Download the notebook to your own computer, open it in Jupyter Notebook, and follow the instructions. We’ll review the exercise and discuss how to upload your completed notebook next week.

Unit B: Tools for data

Session 04 (2024-09-30): Dataframes with Pandas

Reading assigned: Forms of data.

Session 05 (2024-10-07): Pandas, continued

No homework this week, but if you’re looking for more practice with Python and Pandas, try Julia Evans’ Pandas Cookbook.

Session 06 (2024-10-15): Data structures

NOTE: This is a Tuesday! (Monday 2024-10-14 is “Fall Break”)

Exercise #2 assigned: Fun with Pandas.

Midterm project assigned (due Session 08).

Session 07 (2024-10-21): Regular expressions

Session 08 (2024-10-28): Midterm presentations

Session 09 (2024-11-04): Clustering and correlation

Reading assigned: Corpora and databases.

Unit C: Databases

Session 10 (2024-11-11): Structured Query Language (SQL)

Session 11 (2024-11-18): SQL, part 2

Exercise #3 assigned: SQL practice.

Session 12 (2024-11-25): Sharing data projects

Session 13 (2024-12-02): Final project presentations 1

Session 14 (2024-12-09): Final project presentations 2