Your Model _Probably_ Memorized the Training Data [PyCon DE & PyData Berlin 2024]
PyData

Published on Sep 26, 2024

🔊 Recorded at PyCon DE & PyData Berlin 2024, 22.04.2024
https://2024.pycon.de/program/BFF9VA/

🎓 Watch as privacy activist and data scientist Katharine Jarmul reveals the hidden truth about deep learning models memorizing training data and explores the legal risks and privacy implications associated with it.

Speakers:
Katharine Jarmul

Description:
Katharine Jarmul, a Principal Data Scientist at Thoughtworks, delivered a talk on how likely deep learning models are to memorize their training data. She emphasized that even large models, such as language models and multi-modal models, are not immune to this phenomenon. Jarmul discussed active research on deep learning memorization and its implications, highlighting scenarios where memorization can be beneficial as well as the legal and privacy risks it carries. The talk covered the mathematical rationale behind model memorization, successful attacks that extract memorized data from deep learning models, and the social and legal consequences of using such data. Jarmul also explored potential mitigations, including differential privacy, federated models, and distillation techniques. Ultimately, the talk aimed to raise awareness of model memorization and to advocate for thoughtful consideration of privacy and security within data science workflows.

⭐️ About PyCon DE & PyData Berlin:
The PyCon DE & PyData conference unites the Python, AI, and data science communities, offering a unique platform for collaboration and innovation. The PyCon DE & PyData Berlin 2024 conference, hosted in partnership with the local Berlin PyData chapter, provided an exceptional experience, fostering deeper connections within the Python community while showcasing advancements in AI and data science. Attendees enjoyed a diverse and engaging program, solidifying the event as a highlight for Python and AI enthusiasts nationwide.

Follow us:
• LinkedIn:   / 28908640  
• X: https://www.x.com/pyconde
• X: https://www.x.com/pydataberlin

Links:
• Conference website: http://pycon.de
• Related sessions: http://2024.pycon.de/program/categori...

The conference is organized by
• Python Softwareverband e.V.: http://pysv.org
• NumFOCUS Inc.: http://numfocus.org
• Pioneers Hub gemeinnützige GmbH: http://pioneershub.org


If you enjoyed this session, please like, comment, and subscribe to our channel for more insightful talks and discussions.
Share this video with your network to spread the knowledge!

Hashtags:
#Python #PyConDE #PyData #OpenSource #AI #DataScience #MachineLearning #SoftwareDevelopment #LLMs #Community

Acknowledgements:
Special thanks to all the volunteers and sponsors who made this event possible.

About:
Python Softwareverband e.V.:
PySV is a non-profit that promotes the use and development of Python in Germany through events, education, and advocacy, fostering an open Python community.

NumFOCUS Inc.:
NumFOCUS supports open-source scientific computing by providing financial and logistical support to key projects like NumPy and Jupyter, promoting sustainable development and collaboration.

Pioneers Hub gemeinnützige GmbH:
Pioneers Hub is a non-profit fostering innovation in AI and tech by connecting experts and promoting knowledge exchange through events and collaborative initiatives.
www.pydata.org

PyData is an educational program of NumFOCUS, a 501(c)(3) non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.
