In this workshop, participants will delve into the world of text mining, leveraging the power of transformer models to analyze, classify, and interpret (large-scale) text data. The workshop emphasizes a practical, hands-on approach: a lab session lets attendees work with real-world data sets.
The workshop deals with the following topics:
Participants should have a basic knowledge of NLP/ML/data science and know programming and scripting in Python.
Participants are requested to bring their own laptop for the lab meetings.
Chapter 10 of Jurafsky, D., & Martin, J. H. (2023). Speech and Language Processing, third edition. Available online.
Start time | End time | Type |
---|---|---|
09:00 | 10:15 | Lecture |
10:15 | 10:30 | Break |
10:30 | 11:30 | Lab |
11:30 | 12:00 | Discussion |
Bring a laptop to the workshop and make sure that you have an Internet connection so that you can use Python in Google Colab. If you are using PyCharm or Jupyter Notebook instead, also check that you have full write access and administrator rights on the machine. We will be programming and compiling code in this workshop, and some corporate laptops give their users only limited access; we therefore advise you to bring a personal laptop if you have one.
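If you plan to work locally rather than in Google Colab, a quick check like the one below can show whether your Python installation can write files in your working folder. This is only a minimal sketch and not part of the official workshop material: the function name `check_environment` is hypothetical, and the check covers write access only, not administrator rights.

```python
import sys
import tempfile


def check_environment(directory="."):
    """Report the Python version and whether `directory` is writable.

    Hypothetical helper for illustration; it does not test admin rights.
    """
    report = {
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        "can_write": False,
    }
    try:
        # Create a temporary file in the directory; it is deleted on close.
        with tempfile.NamedTemporaryFile(dir=directory) as handle:
            handle.write(b"ok")
        report["can_write"] = True
    except OSError:
        # Missing directories and permission problems both end up here.
        pass
    return report


# Check the current working directory.
print(check_environment())
```

If `can_write` is reported as `False` for the folder where you intend to keep your notebooks, consider switching to a different folder, a personal laptop, or Google Colab.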
Python is a general-purpose interpreted, interactive, object-oriented, and high-level programming language. It is a powerful environment for scientific computing.
We expect that many of you will have some experience with Python; for the rest of you, this section will serve as a quick crash course both on the Python programming language and on the use of Python in Google Colab:
Follow the tutorial on Python in Google Colab from the Applied Text Mining summer course.
This tutorial is largely based on the CS231n Python Tutorial With Google Colab.