DS5063 Data Industry Practicum (DIP)

AUP - Master Human Rights and Data Science, Prof. Claudia Roda

Note the dates of feedback sessions. Make sure you mark your calendars with all dates. All reading lists are tentative and subject to change until 2 weeks before each workshop.

Introductory Lecture (January 28)

Tue 28.1, Q704, 9:00 - 10:20
Data Industry Practicum introduction, Claudia Roda (PPT)

Neurodata, neurotechnology: last boundary of privacy? (January 30 - February 7)
Régis Chatellier is the Innovation & Foresight Project Manager at the Technologies and Innovation Department of the CNIL (French Data Protection Authority). His work is at the intersection of innovation, technology, digital humanities, society, regulation and ethics. It includes future studies and reports that contribute to the data protection and data ethics debate and support the CNIL's future stance (on smart cities, design and privacy, civic tech, metaverses, ecology, AI, etc.).

According to the definition adopted by UNESCO and the OECD, neuro-data are ‘first-order data collected directly from a person's neural systems (including both the brain and nervous systems) and second-order inferences based directly on these data’. This data can be collected by invasive or semi-invasive devices, or by simple connected objects (wearables), and can be read as well as written. Initially used in the medical field, these technologies can be integrated into consumer devices (smartphones, earpieces and augmented reality headsets), used for commercial purposes, for comfort or in the workplace. Could this ability to 'read' and 'write' our brains breach the final frontier of intimacy?

Fri 31.1, Q704, 09:00 - 10:20
Workshop Intro, Claudia Roda
Before this lecture you should have studied the mandatory readings listed below.

Mon 3.2, Q609, 15:20 - 18:15
Neurodata, neurotechnology: last boundary of privacy? - Régis Chatellier (PPT)

Mandatory readings:

The following are optional, relevant readings:

Tue 4.2, Q609, 13:45 - 16:40
Exploring Future(s) of data protection - Régis Chatellier

Fri 7.2, workshop assignment due 9am

Fri 7.2, Q709, 12:10 - 14:10
Feedback session, Régis Chatellier

Mon 10.3 Policy brief due

Tue 25.3, Q704, 9:00 - 10:20
Feedback session, Régis Chatellier

Quantum Science and Technology, 2025 (February 5)

Register for the Opening Ceremony of the International Year of Quantum Science and Technology (IYQ), to be held on 4-5 February 2025 at UNESCO Headquarters. We will attend the second day (February 5).

The ceremony "marks the official commencement of a global initiative dedicated to advancing quantum science and its transformative applications. As the lead agency for IYQ, UNESCO aims to maximize the visibility of IYQ and the transformative potential of quantum science and technology in addressing critical global challenges. It will serve as a platform for the exchange of ideas, allowing participants to showcase best practices in quantum science education, research, and industry applications. It will provide an opportunity to inspire interdisciplinary and cross-regional cooperation and to address disparities between the Global North and South while inspiring inclusive innovation. The opening ceremony will also highlight the importance of integrating ethics and responsible innovation into the core of discussions."

Inclusive Data Visualisations (February 11 - 14)
An AUP alumna, Alex Phuong Nguyen is Director of Product & Analytics at Ulula, a Canadian stakeholder technology company. In addition to product management, Alex leads efforts to derive value from data. Prior to joining Ulula, Alex contributed to developing and implementing the data strategy and innovation agenda in the Executive Office of the UN Secretary-General in New York. Her background is in international human and labour rights.

Data visualization is an important component of most data science projects, not only enabling better interpretation of the data, but also facilitating data scrubbing and exploration. People who design visualizations need to address questions related to what they should display and why, but also pay increased attention to how they should display it. This workshop explores how different visualization techniques may impact the information conveyed and introduces the concepts behind visual accessibility. Students will work on a small real-world project requiring them to apply the principles introduced during the workshop.

Tue 11.2, Q704, 9:00 - 10:20
Workshop Intro, Claudia Roda
Before this lecture you should have studied the mandatory readings listed below.

Tue 11.2, Q609 (speaker online), 14:00 - 16:30
Ethical Considerations in Data Visualization, Alex Phuong Nguyen


Mandatory readings:

  1. Data Visualization in Data Science (local copy)
  2. Statistics, lies and the virus: five lessons from a pandemic
  3. How Deceptive are Deceptive Visualizations (pdf)

Recommended:

  1. A Great Way to Think About Data Science: The Bowtie
  2. How To Break A Scale (pdf)
  3. Practicing Good Ethics in Data Visualization
  4. Visual Arrangements Influence Comparisons (pdf)

Visualization exercise and Data for the visualization exercise (accessible only to students at the time of the workshop)

Fri 14.2, workshop assignment due 9:00

Fri 14.2, Q609 (speaker online), 14:00 - 15:00
Feedback session

Fri 21.2 Final version of assignment due

How good is language technology for most of the world’s languages? (February 14 - March 26)
An AUP alumna, Anna Kazantseva is a Research Officer at the National Research Council of Canada. Her immediate work and research interests are in applications of Natural Language Technology for the revitalization of Indigenous languages in Canada. She has a long-term interest in making digitized literature easy to access, to find and to read: automatic summarization of literature, topical segmentation of literature, information retrieval in the context of literature, document similarity, modelling narrative structure, and so on.  

Most of us think of language as written, typed, transmitted through various media, and recorded, with large repositories of data or books available. However, most of the world’s 7000 languages have recent or no writing systems; they are mostly spoken rather than written, and are passed from person to person, not through media or books. And they are weakening and disappearing at an alarming rate.

In this workshop we will take a quick look at the state of the art in language technology and examine the gap between so-called “well-resourced” languages like English or French and “under-resourced” languages like Mayan, Mohawk or Inuktitut. We will look at why most modern language technology cannot be applied to these smaller languages, from both ethical and technical points of view.

We will also look at some of the tools that can be used for building language technology when little or no data is available and we will build one or two small applications.

Fri 14.2, Q704, 09:00 - 10:20
Workshop Intro, Claudia Roda
Before this lecture you should have studied the mandatory readings listed below.

Mon 17.2, Q609 (speaker online), 17:00 - 18:15
How good is language technology for most of the world’s languages? Anna Kazantseva

Mandatory readings:

Wed 19.2, online, 17:00 - 18:30
How good is language technology for most of the world’s languages? Anna Kazantseva

Fri 21.2, online, 17:00 - 18:30
How good is language technology for most of the world’s languages? Anna Kazantseva

Sun 16.3 workshop assignment due

Wed 26.3, online, 17:00 - 18:30
Feedback session, Anna Kazantseva

Fri 11.4 Final version of assignment due

Tech and data tectonics: how geo-economics and geo-politics influence digital sovereignty and resilience (February 23 - 27)
An AUP alum, Yann Lechelle is co-founder and CEO of probabl.ai, a company that aims at democratizing data science and machine learning. Over the past three decades, from Paris to Los Angeles, via Cambridge UK and New York, and back, Yann has been leveraging software to help unlock business opportunities in many fields: financial markets, cartoon animation, yield management, digital art on new media, mobile app discovery and monetization, and applied AI to deploy state-of-the-art voice recognition on the edge. Before founding probabl.ai, Yann developed a small cloud company into a credible regional alternative to the three major cloud hyperscalers, growing the team threefold to 600 employees while managing complex change. Yann has played key roles as an entrepreneur, key shareholder, advisory board member, and angel investor. He is a strong advocate of the Paris tech ecosystem as a co-founding member of France Digitale and HUB AI Paris, as well as Entrepreneur-in-Residence at INSEAD Business School.

In these sessions, we'll explore the organization of digital technology on a large scale, focusing on a few major players who dominate the value chain. This dominance allows them to capture significant value, which could hinder innovation and our control over data processing and value creation, especially as artificial intelligence advances. We'll examine monopolies and oligopolies, antitrust efforts, and ways to achieve a more balanced playing field. We'll also consider the impact on society and democracy's future. Your assignment, should you choose to accept it, will be to conceive strategies, policies, and methods to enhance societal resilience and maximize technological benefits for humanity.

 

Mon 10.3, Online, 9:50 - 10:20
Workshop Intro, Claudia Roda
Before this lecture you should have studied the mandatory readings listed below.

Mon 10.3, Q609, 15:20 - 18:15
Tech and data tectonics, Yann Lechelle
Mandatory readings:

TBD

Optional readings:

TBD

Fri 14.3, workshop assignment due 9:00

Fri 14.3, Q609, 12:10 - 15:05
Feedback session - Tech and data tectonics, Yann Lechelle

Sat 22.3 Final version of assignment due

Data management in refugee protection and humanitarian settings (March 14 - April 18)
Wellington Pereira Carneiro is Senior Learning Development Officer at UNHCR, the UN Refugee Agency. He previously spent several years as Senior Protection Officer. A UN humanitarian worker and researcher, Wellington obtained his PhD in International Relations at the University of Brasilia, Brazil, in 2012, and holds two master’s degrees: an MSt in International Human Rights Law from the University of Oxford in the United Kingdom and an LLM in International Law from the University Drujby Narodov, Moscow (1996). Wellington has been a lecturer at the University do Vale do Paraiba and the University Center of Brasilia. In 2004 he joined the International Civil Service of the United Nations and has served in many conflict and troubled areas, including emergency humanitarian operations in Chad/Cameroon (2008) and Uganda (South Sudan emergency, 2017), and completed assignments in Brazil, Sudan, Kazakhstan, Colombia, Lebanon, Angola and the Russian Federation. He published a book on crimes against humanity in 2015 and has authored several articles on topics related to human rights and humanitarian action.

By June 2023, 110 million people had been forcibly displaced worldwide due to persecution, conflict, violence, human rights violations, or events seriously disturbing public order, including environmental disasters. Around 75% of the world’s refugees and others in need of international protection are hosted in low- or middle-income countries, which may be unable to cope with the amount of aid to be provided in their territories. That makes aid a multi-billion-dollar industry involving huge data management challenges: registration, provision of several types of assistance, family tracing, persons with specific needs, unaccompanied children and other issues involving sensitive data protection challenges.

Protecting individuals' personal data is an integral part of protecting their life and dignity. In situations of humanitarian crises or conflict, data protection may acquire a critical life-saving significance due to persecution, ethnic grievances, or discrimination. For humanitarian organizations data protection is of fundamental importance.

The workshop provides a solid overview of the international regime for refugees and forced displacement and its interaction with other human rights, including data protection policies and principles. The workshop will discuss the use of Artificial Intelligence in humanitarian operations, along with its impacts and risks. Case studies and concrete “Data protection impact assessments” by participants will provide a practical dimension and serve as a basis for a collective paper on AI in humanitarian operations.

Fri 14.3, Q704, 9:00 - 10:20
Workshop Intro, Claudia Roda
Before this lecture you should have studied the mandatory readings listed below.

Mon 17.3, Q609, 15:20 - 18:15
Data for Refugees Protection, Wellington Pereira Carneiro
Mandatory readings:

Tue 18.3, Q609, 13:45 - 16:40
Data for Refugees Protection, Wellington Pereira Carneiro

Sat 12.4, workshop assignment due

Fri 18.4, Q704 (speaker online), 9:00 - 10:20
Feedback session, Wellington Pereira Carneiro 

Fri 25.4 Final version of assignment due

Legal Process Automation: Building Accessible Legal Assistants (March 21 - 28)
Tomer Libal is a computer science researcher at the University of Luxembourg, specializing in legal informatics and dedicated to integrating cutting-edge technology into the legal field. As the co-founder of Enidia AI, he leads efforts to design trustworthy AI solutions for legal professionals, ranging from precise legal assistants to advanced document drafting tools. His work is driven by a dual mission: to empower lawyers with innovative technologies that enhance efficiency, accuracy, and reliability, while also developing accessible tools to improve access to justice for all.

This workshop will explore how to bridge the gap between complex legal systems and everyday individuals. It invites students to analyse a legal process and transform it into an automated legal assistant using cutting-edge process automation tools. Students will learn how to:

  • Extract and Analyze: examine multiple legal articles and relevant court judgments to identify and understand the intricacies of the legal process.
  • Abstract and Design: break legal procedures down into their logical components and create detailed flowcharts that map out each step.
  • Implement and Automate: use a commercial process automation tool (freely available through an academic license) to translate flowcharts into functional legal assistants.

By the end of the workshop, students will understand how AI can be used to develop a user-friendly legal assistant designed to provide laypeople with easy access to legal knowledge and support their pursuit of justice.

Fri 21.3, Q704, 9:00 - 10:20
Workshop Intro, Claudia Roda
Before this lecture you should have studied the mandatory readings listed below.

Mon 24.3, Q609, 15:20 - 18:15
Legal Process Automation: Building Accessible Legal Assistants, Tomer Libal
Mandatory readings:

  • TBD

Tue 25.3, Q609, 13:45 - 16:40
Legal Process Automation: Building Accessible Legal Assistants , Tomer Libal

Fri 28.3, workshop assignment due

Fri 28.3, Q609 (speaker online), 12:10 - 15:05
Feedback session, Tomer Libal 

Fri 11.4 Final version of assignment due
 

AI and Privacy in Finance (April 11 - 18)
Pagona Tsormpatzoudi is Senior Vice President, Assistant General Counsel, Privacy, Artificial Intelligence & Data Protection at MasterCard. She is responsible for legal compliance, policy and regulatory engagement on privacy and data protection for MasterCard Cyber and Intelligence solutions which inform the overall MasterCard safety and security strategy. She often advises on new technologies, such as artificial intelligence and digital identity. Previously, she was a researcher at the Center for IT and IP Law (KU Leuven). 

Artificial Intelligence (AI) has revolutionized the way we do business, the way we work, the way we live our lives. Besides the many benefits, AI may also bring risks to fundamental human rights, including the rights to privacy and data protection. This workshop explores, through practical examples, regulatory issues on AI development and deployment.

Fri 11.4, Q704, 09:00 - 10:20
Workshop Intro, Claudia Roda
Before this lecture you should have studied the mandatory readings listed below.

Mon 14.4, Q609, 15:20 - 18:15
AI and Privacy in Finance, Pagona Tsormpatzoudi

Mandatory readings:

  • TBD

Tue 15.4, Q609, 13:45 - 16:40
AI and Privacy in Finance, Pagona Tsormpatzoudi

Fri 18.4, workshop assignment due

Fri 18.4, Q609 (speaker online), 12:10 - 15:05
Feedback session, Pagona Tsormpatzoudi

Fri 25.4 Final version of assignment due

Workshop Conclusion and Portfolio Presentation

Fri 25.4, Q704, 9:00 - 10:20
Review of draft version of portfolio (sample structure), Claudia Roda

Tue 29.4, Q704, 9:00 - 10:30
Portfolio final draft, Claudia Roda

Tue 6.5, Q704, 9:00 - 12:00
Final exam period, portfolio presentation, Claudia Roda

Please also reserve the following dates

 

All class periods: T, F period 1 (9:00 - 10:30)
Thur 20.3, 13:45 - 16:45
Fri 21.3 12:10 - 15:05

Other dates may be added during the semester