Teaching

Responsible Data Science and Algorithmic Fairness (CS 516), Fall 2025

Course Description

This course views data-driven and algorithmic decision-making through the lens of safety, accountability, and fairness. We first explore the context in which a data-driven software solution operates, including the fundamentals of engineering and operating data-driven software, such as requirements and validation. The course then covers responsible data-driven software development. While a major focus of the course is Algorithmic Fairness, other aspects such as robustness, interpretability, accountability, explainability, transparency, and trust will also be covered.

Course Objectives

Upon completion of this course, students will be able to answer the following questions:

What are the key elements of data-driven software? How to reliably develop and maintain data-driven software?
Which qualities matter beyond a model’s prediction accuracy? How can we identify and measure important quality requirements, including learning and inference latency, operating cost, scalability, explainability, fairness, privacy, robustness, and safety?
How to test, debug, and repair production ML systems? How can we evaluate the quality of a model's predictions in production? How can we test the entire AI-enabled system, not just the model? What lessons can we learn from software testing, automated test case generation, simulation, and continuous integration when testing production machine learning?
What does it take to build responsible products? How to think about fairness of a production system at the model and system level? How to mitigate safety and security concerns? How can we communicate the reasons for an automated decision or explain uncertainty to users?
What are the key responsibility requirements in data-driven software? How to define and measure fairness? How to develop interpretability by design? How to apply post-hoc explainability techniques? Who is accountable for software failures?
What does it take to build responsible data-driven software? How to think about the fairness of such a system at the model and system level? What are the fairness issues in the pre-processing, in-processing, and post-processing stages of data-driven software solutions?

Course Topics

From Models to AI-Enabled Systems (Weeks 1-2)
Requirements, Model Quality, and Unit Testing (Weeks 3-4)
Responsibility Basics/Foundations: Data Sources, Detection Theory, Supervised Learning, and Causality (Weeks 5-6)
Responsibility Topics: Fairness, Distribution Shift, Interpretability and Explainability, and Transparency (Weeks 7-8)
Fairness in Machine Learning (Weeks 9-10)
Bias in Data and Fairness-aware Data Curation (Week 11)
Fair Algorithm Design (Week 12)
Fairness Testing and Post-Processing Techniques (Week 13)
Fairness in Ranking and Recommendation Systems (Week 14)
Fairness in Generative AI and LLMs (Week 15)
Project Demo and Presentations (Week 16)

Acknowledgment: Special thanks to Prof. Abolfazl Asudeh (UIC) for assistance and guidance in the preparation of this course. Some materials are adapted from the CMU AI Engineering course by Dr. Christian Kästner.