NLP & LLMs: Harnessing the power of language for economic and related research

With Matthias Aßenmacher
Munich, September 5th 2024

Do you want to learn how to make use of text data for your research?

This in-person workshop is ideal for researchers who want to explore the potential of Natural Language Processing (NLP) and Large Language Models (LLMs) for economic and related research.

You will receive a broad overview of the theoretical foundations and practical concepts of NLP. The workshop closes with a Coding Lab in which you apply this knowledge to real-world problems.

Prerequisites

A basic understanding of Machine Learning and Deep Learning is required.

Basic Python skills are required. Familiarity with common text-processing modules and deep learning frameworks is recommended.

Schedule

09:30 – 11:00 NLP Basics (1-3)

11:15 – 12:45 Neural Nets & Transfer Learning (4-8)

13:45 – 15:45 Generative Models (9-End)

16:00 – 17:00 Coding Lab

Topics

1 Learning Paradigms

Understand the different learning paradigms, Relate the type of learning to the amount of labeled data required

2 NLP tasks

Understand the different types of tasks (low- vs. high-level), Distinguish purely linguistic tasks from more general classification tasks

3 Word Embeddings

Understand what word embeddings are, Learn the main methods for creating them
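To give a flavour of this topic, here is a minimal sketch of training word embeddings with gensim's Word2Vec; the toy corpus and all hyperparameters are purely illustrative.

```python
# Minimal sketch: word2vec embeddings on a toy corpus with gensim
# (assumes gensim >= 4.0; corpus and hyperparameters are illustrative only).
from gensim.models import Word2Vec

toy_corpus = [
    ["inflation", "raises", "consumer", "prices"],
    ["central", "banks", "raise", "interest", "rates"],
    ["interest", "rates", "affect", "inflation"],
]

model = Word2Vec(sentences=toy_corpus, vector_size=50, window=2, min_count=1, epochs=50)

# Each word is now a dense vector; related words end up close in the vector space.
print(model.wv["inflation"].shape)          # (50,)
print(model.wv.most_similar("inflation"))   # nearest neighbours in embedding space
```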

4 Recurrent Neural Networks

Understand the recurrent structure of RNNs, Learn the different types of RNNs
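As a small illustration, a minimal PyTorch sketch of an LSTM-based text classifier; vocabulary size, dimensions and the use of the last hidden state are illustrative choices, not the workshop's reference implementation.

```python
# Minimal sketch of a recurrent (LSTM) text classifier in PyTorch
# (all dimensions are illustrative).
import torch
import torch.nn as nn

class RNNClassifier(nn.Module):
    def __init__(self, vocab_size=10_000, embed_dim=100, hidden_dim=128, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids):                  # token_ids: (batch, seq_len)
        embedded = self.embedding(token_ids)       # (batch, seq_len, embed_dim)
        _, (last_hidden, _) = self.lstm(embedded)  # last_hidden: (1, batch, hidden_dim)
        return self.classifier(last_hidden[-1])    # logits: (batch, num_classes)

logits = RNNClassifier()(torch.randint(0, 10_000, (4, 20)))
print(logits.shape)  # torch.Size([4, 2])
```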

5 Attention

Understand the attention mechanism, Learn the different types of attention, The Transformer and self-attention
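The core operation can be sketched in a few lines; this is scaled dot-product attention with illustrative shapes, leaving out the multiple heads, masking and learned projections used in real Transformer models.

```python
# Minimal sketch of scaled dot-product attention, the core of the Transformer
# (shapes are illustrative; real models add heads, masking and projections).
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / d_k ** 0.5  # similarity of queries and keys
    weights = F.softmax(scores, dim=-1)            # attention distribution over positions
    return weights @ V                             # weighted sum of the values

# Self-attention: queries, keys and values all come from the same sequence.
x = torch.randn(1, 5, 64)                          # (batch, seq_len, model_dim)
out = scaled_dot_product_attention(x, x, x)
print(out.shape)                                   # torch.Size([1, 5, 64])
```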

6 The BERT Architecture

Use of the transformer encoder in this model, Understand the pre-training, Gain an understanding of the fine-tuning procedure, Differences between token- and sequence-level classification
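A minimal sketch of the starting point for fine-tuning with Hugging Face transformers; the model name, number of labels and example sentence are illustrative, and a real run would add a dataset, optimizer and training loop.

```python
# Minimal sketch: pre-trained BERT encoder with a sequence-classification head
# (model name, labels and input are illustrative; no training loop shown).
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

inputs = tokenizer("Central bank raises interest rates.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # one logit per class: torch.Size([1, 2])
```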

7 BERTology

Understand how impactful this architecture was, See how it changed research in the field, Get a glimpse into BERTology

8 Model distillation

Soft vs. hard targets, Understand how distillation works, DistilBERT, Other approaches to model compression
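The role of soft targets can be sketched as a loss that mixes the teacher's softened distribution with the usual cross-entropy; temperature and weighting below are illustrative, and DistilBERT itself combines further loss terms.

```python
# Minimal sketch of a distillation loss combining soft and hard targets
# (temperature and weighting are illustrative; DistilBERT adds further terms).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: match the teacher's temperature-softened distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * T ** 2
    # Hard targets: standard cross-entropy against the gold labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

loss = distillation_loss(torch.randn(4, 2), torch.randn(4, 2), torch.tensor([0, 1, 1, 0]))
print(loss.item())
```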

9 Towards a unified task format  

Developments of the post-BERT era, Reformulating classification tasks, Multi-task learning, Fine-tuning on task prefixes

10 GPT series

Use of the transformer decoder, Input modifications (and why they are useful), The concept of prompting
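To illustrate prompting with a decoder-only model, here is a minimal sketch using the Hugging Face text-generation pipeline; the model choice, prompt and generation settings are illustrative.

```python
# Minimal sketch of decoder-style generation and prompting with transformers
# (model, prompt and generation settings are illustrative).
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# Prompting: the task is expressed in the input text, the model simply continues it.
prompt = "Question: What does a central bank do?\nAnswer:"
result = generator(prompt, max_new_tokens=30, do_sample=False)
print(result[0]["generated_text"])
```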