DS5063 Data Industry Practicum (DIP)

AUP - Master Human Rights and Data Science, Prof. Claudia Roda

Note the dates of feedback sessions. Make sure you mark your calendars with all dates. All reading lists are tentative and subject to change until 2 weeks before each workshop.

Introductory Lecture (January 12)

Tue 13.1, Q604, 9:00 - 10:20
Data Industry Practicum introduction, Claudia Roda

Neurodata, neurotechnology: last boundary of privacy?
Régis Chatellier is the Innovation & Foresight Project Manager at the Technologies and Innovation Department of the CNIL (French Data Protection Authority). His work is at the intersection of innovation, technology, digital humanities, society, regulation and ethics, and includes future studies and reports to actively participate in the data protection and data ethics debate and support the CNIL's future stance (on smart cities, design and privacy, civic tech, metaverses, ecology, AI, etc.)

According to the definition adopted by UNESCO, neurodata are “first-order data collected directly from a person's neural systems (including both the brain and nervous systems) and second-order inferences based directly on these data”, but also “non-neural data allowing mental states inference”. This data can be collected by invasive or semi-invasive devices, or by simple connected objects (wearables). Initially used in the medical field, these technologies can be integrated into consumer devices (smartphones, earpieces and augmented reality headsets), or any other device, used for commercial purposes, for comfort, education, gaming or in the workplace. The definition of such data, what can really be done based on it, and the potential need to set boundaries are hot topics for regulators and civil society.

Fri 16.1, Q604, 09:00 - 10:20
Workshop Intro, Claudia Roda
Before this lecture you should have studied the mandatory readings listed below.

Mon 19.1, Q609, 15:20 - 18:15
Neurodata, neurotechnology: last boundary of privacy? - Régis Chatellier

Mandatory readings:

The following are optional, relevant readings:

Tue 20.1, Q709, 15:20 - 18:15
Exploring Future(s) of data protection - Régis Chatellier

Thu 22.1 First version of assignment due

Fri 23.1, Q609, 16:30 - 18:15
Feedback session, Régis Chatellier

Mon 26.1 at 15h00 Final version of assignment due


From Analysis to Action: How Data leads to Senior-Level Decisions
Jessica Summers is lead strategist for data, digital and innovation at the UN Office of the High Commissioner for Human Rights. Prior to joining the Office, she worked on data and digital strategies in both private and public sectors, including in the Executive Office of the UN Secretary-General, where she provided support to several Secretary-General-led initiatives to engage the multilateral system on data and digital technology, including his Data Strategy, Scientific Advisory Board, and UN 2.0. Prior to joining the United Nations, Jessica worked with the Canadian government in New York, the Swedish government and World Food Programme in post-earthquake Haiti, as well as the Economic Commission for Africa in Ethiopia, among others. She holds a Master of Laws from the University of Kent. 

Effective data-driven decision-making at senior levels requires more than strong analytics – it demands judgment, consensus-building, and clear communication. This workshop invites students to reflect on who gets involved in high-stakes data products, how to recognize which data can or cannot be used, and how to navigate the dynamics of getting stakeholders to agree on appropriate use. We will also explore communications techniques for encouraging decision-makers to actually use the insights, using real-world scenarios that allow students to practice framing evidence in ways that are both responsible and persuasive.

Fri 23.1, Q604, 9:00 - 10:20
Workshop Intro, Claudia Roda
Before this lecture you should have studied the mandatory readings listed below.

Mon 26.1, Q609, 15:20 - 18:15
From Analysis to Action: How Data leads to Senior-Level Decisions, Jessica Summers

Mandatory readings:

TBD

Recommended:

TBD

Tue 27.1, Q709, 15:20 - 18:15
From Analysis to Action: How Data leads to Senior-Level Decisions, Jessica Summers

Thu 29.1 First version of assignment due

Fri 30.1, Q609, 15:20 - 18:15 (Speaker online)
Feedback session, Jessica Summers

Mon 2.2 at 15h00 Final version of assignment due

AI and Privacy in Finance
Pagona Tsormpatzoudi is Senior Vice President, Assistant General Counsel, Privacy, Artificial Intelligence & Data Protection at MasterCard. She is responsible for legal compliance, policy and regulatory engagement on privacy and data protection for MasterCard Cyber and Intelligence solutions which inform the overall MasterCard safety and security strategy. She often advises on new technologies, such as artificial intelligence and digital identity. Previously, she was a researcher at the Center for IT and IP Law (KU Leuven). 

Artificial Intelligence (AI) has revolutionized the way we do business, the way we work, the way we live our lives. Besides the many benefits, AI may also bring risks to fundamental human rights, including the rights to privacy and data protection. This workshop explores, through practical examples, regulatory issues on AI development and deployment.

Fri 30.1, Q604, 09:00 - 10:20
Workshop Intro, Claudia Roda
Before this lecture you should have studied the mandatory readings listed below.

Mon 2.2, Q609, 15:20 - 18:15
AI and Privacy in Finance, Pagona Tsormpatzoudi

Mandatory readings:

Tue 3.2, Q709, 15:20 - 18:15
AI and Privacy in Finance, Pagona Tsormpatzoudi

Thu 5.2 First version of assignment due

Fri 6.2, Q609 (speaker online), 15:20 - 18:15
Feedback session, Pagona Tsormpatzoudi

Mon 9.2 at 15h00 Final version of assignment due

Standardizing AI
Antonio Kung is a member of the executive board of Trialog, supporting its strategy, innovation and knowledge management. Co-founder of Trialog, he was CTO until 2018, and CEO until 2024. He has been active since 2015 on standardisation as well as on standardisation strategy. Antonio has initiated the development of more than twenty standards on architecture, interoperability and conformity in domains such as the internet of things, digital twins, security and privacy, and artificial intelligence. Antonio graduated from Harvard University and École Centrale de Paris. Antonio teaches regularly at AUP.

Workshop summary forthcoming

Fri 6.2, Q604, 09:00 - 10:20
Workshop Intro, Claudia Roda
Before this lecture you should have studied the mandatory readings listed below.

Mon 9.2, Q609, 15:20 - 18:15
AI standardization, Antonio Kung

Mandatory readings:

  • TBD

Tue 10.2, Q709, 15:20 - 18:15
AI standardization, Antonio Kung

Thu 12.2 First version of assignment due

Fri 13.2, Q609, 15:20 - 18:15
AI standardization, Antonio Kung

Mon 16.2 at 15h00 Final version of assignment due

How good is language technology for most of the world’s languages?
An AUP alumna, Anna Kazantseva is a Research Officer at the National Research Council of Canada. Her immediate work and research interests are in applications of Natural Language Technology for the revitalization of Indigenous languages in Canada. She has a long-term interest in making digitized literature easy to access, to find and to read: automatic summarization of literature, topical segmentation of literature, information retrieval in the context of literature, document similarity, modelling narrative structure, and so on.  

Most of us think of language as written, typed, transmitted through various media and recorded, with large repositories of data or books available. However, most of the world’s 7000 languages have recent or no writing systems, are mostly spoken rather than written, and are passed from person to person, not through media or books. They are also weakening and disappearing at an alarming rate.

In this workshop we will take a quick look at the state of the art in language technology and examine the gap between so-called “well-resourced” languages like English or French and “under-resourced” languages like Mayan, Mohawk or Inuktitut. We will look at why most modern language technology cannot be applied to these smaller languages, from both ethical and technical points of view.

We will also look at some of the tools that can be used for building language technology when little or no data is available and we will build one or two small applications.

Fri 13.2, Q604, 09:00 - 10:20
Workshop Intro, Claudia Roda
Before this lecture you should have studied the mandatory readings listed below.

Mon 16.2, Q609 (speaker online), 17:00 - 18:15
How good is language technology for most of the world’s languages? Anna Kazantseva

Mandatory readings:

  • TBD

Tue 17.2, Q709 (speaker online), 17:00 - 18:15
How good is language technology for most of the world’s languages? Anna Kazantseva

Fri 20.2, Q604, 9:00 - 10:20
DIP checkpoint, Claudia Roda

Fri 20.2, Q609 (speaker online), 17:00 - 18:15
How good is language technology for most of the world’s languages? Anna Kazantseva

Sun 11.3 First version of assignment due

Thu 19.3, online, 17:00 - 18:15
Feedback session, Anna Kazantseva

Fri 11.4 Final version of assignment due

Inclusive Data Visualisations
An AUP alumna, Alex Phuong Nguyen is Director of Product & Analytics at Ulula, a Canadian stakeholder technology company. In addition to product management, Alex leads efforts to derive value from data. Prior to joining Ulula, Alex contributed to developing and implementing the data strategy and innovation agenda in the Executive Office of the UN Secretary-General in New York. Her background is in international human and labour rights.

Data visualization is an important component of most data science projects, not only enabling better interpretation of the data, but also facilitating data scrubbing and exploration. People who design visualizations need to address questions related to what they should display and why, but also pay increased attention to how they should display it. This workshop explores how different visualization techniques may impact the information conveyed and introduces the concepts behind visual accessibility. Students will work on a small real-world project requiring them to apply the principles introduced during the workshop.

Mon 9.3, Q609, 15:20 - 18:15
Ethical Considerations in Data Visualization, Alex Phuong Nguyen
Before this lecture you should have studied the mandatory readings listed below.

Mandatory readings:

  1. Data Visualization in Data Science (local copy)
  2. Statistics, lies and the virus: five lessons from a pandemic
  3. How Deceptive are Deceptive Visualizations (pdf)

Recommended:

  1. A Great Way to Think About Data Science: The Bowtie
  2. How To Break A Scale (pdf)
  3. Practicing Good Ethics in Data Visualization
  4. Visual Arrangements Influence Comparisons (pdf)

Visualization exercise and Data for the visualization exercise (accessible only to students at the time of the workshop)

Tue 10.3, Q709, 15:20 - 18:15
Ethical Considerations in Data Visualization, Alex Phuong Nguyen

Thu 12.3 First version of assignment due

Fri 13.3, Q604, 9:00 - 10:20
Feedback session

Mon 16.3 at 15h00 Final version of assignment due

Privacy and Ethics in Data Sharing and Emerging Technologies
An AUP alumna, Maria-Martina Yalamova is VP Privacy Counsel at NBCUniversal Media, LLC. She is an international privacy lawyer experienced in the media and entertainment, and technology industries. She has advised leading Internet, technology and pharmaceutical companies on electronic communications, data privacy, security, e-commerce, online advertising, and intellectual property issues. Maria-Martina started her privacy career as a researcher and policy advisor at the London-based NGO, Privacy International, where she worked on privacy and human rights issues across the Asia Pacific region. 

The life-cycle of modern data-driven systems requires careful consideration of privacy and ethical issues, particularly when personal data is shared across platforms and stakeholders. This workshop will introduce students to key European regulations – GDPR, the Data Act, and the AI Act – and explore their interaction and practical application in real-world contexts. Through an interactive case study involving connected devices, students will analyse data flows, assess compliance challenges, and collaborate on drafting a Privacy Impact Assessment (PIA). The session emphasises problem-solving and teamwork to help future human rights and data science professionals integrate privacy by design into their work.

Mon 16.3, Q609, 15:20 - 18:15
Privacy and Ethics in Data Sharing and Emerging Technologies, Maria-Martina Yalamova
Before this lecture you should have studied the mandatory readings listed below.

Tue 17.3, Q604, 9:00 - 10:20
DIP checkpoint, Claudia Roda

Tue 17.3, Q709, 15:20 - 18:15
Privacy and Ethics in Data Sharing and Emerging Technologies, Maria-Martina Yalamova

Mandatory readings:

  • TBD

Recommended readings:

  • TBD

Thu 19.3 First version of assignment due

Fri 20.3, Q609 (speaker online), 15:20 - 18:15
Feedback session, Maria-Martina Yalamova

Mon 23.3 at 15h00 Final version of assignment due

AI, Chatbots, the Question of Intelligence, and the Notion of Relevance
Laurent Ach, former CTO of Qwant and of Rakuten France, started building applications with neural networks and virtual reality in the 1990s at Thales. He later worked for startup companies, where he developed innovative 3D streaming technologies and led the development of virtual assistants based on early chatbot technologies. With the rise of deep learning, he joined Rakuten to create and lead the European branch of the Rakuten Institute of Technology, where he built an applied research and engineering team in AI and machine learning that delivered practical innovations for the Group’s e-commerce ecosystem. As CTO of Rakuten France and later Qwant, he has led cross-functional teams to build complex, large-scale systems. At Qwant, he oversaw the development of a privacy-focused web search engine powered by multiple machine-learning pipelines, explored new approaches to search retrieval, and directed the integration of generative AI into search results. In addition to his technical work, he maintains an active interest in the differences between human and artificial intelligence, a topic he frequently discusses in podcasts and public conversations.

Chatbots are currently one of the most visible parts of AI technology, using advanced generative language models often combined with more traditional techniques and services. They have been, and are again, considered by some to be a stepping stone toward Artificial General Intelligence, machines with human-like cognitive abilities. Yet the very definition and evaluation of intelligence remain open questions, which is why some benchmarks attempt to capture tasks humans perform easily but AI finds difficult. The uncertainty and fuzziness around LLM intelligence make it important to use them in ways where answers are grounded in well-identified documents.
This workshop will explore the historical roots of AI and chatbots, explain their underlying principles, and focus on a usage that represents an evolution of web search engines, combining information retrieval and text generation. For this use case, the notion of relevance is crucial, since the documents retrieved from the web or any other corpus form the basis of the generated answer. We will address the concepts of embeddings and semantic similarity, which are essential for predicting relevance, and examine with concrete examples how they vary depending on the models and techniques used.

Fri 20.3, Q604, 09:00 - 10:20
Workshop Intro, Claudia Roda
Before this lecture you should have studied the mandatory readings listed below.

Mon 23.3, Q609, 15:20 - 18:15
AI, Chatbots, the Question of Intelligence, and the Notion of Relevance, Laurent Ach

Tue 24.3, Q709, 15:20 - 18:15
AI, Chatbots, the Question of Intelligence, and the Notion of Relevance, Laurent Ach

Mandatory readings:

  • TBD

Recommended readings:

  • TBD

Thu 26.3 First version of assignment due

Fri 27.3, Q609, 15:20 - 18:15
Feedback session, Laurent Ach

Mon 30.3 at 15h00 Final version of assignment due

Workshop Conclusion and Portfolio Presentation

Fri 7.4, Q604, 9:00 - 10:20
Review of draft version of portfolio (sample structure), Claudia Roda

Fri 24.4, Q604, 9:00 - 10:30
Portfolio final presentation, Claudia Roda

Please also reserve the following dates

All class periods: T, F period 1 (9:00 - 10:30)
All class periods: M, T, F periods 5 and 6

Other dates may be added during the semester