Attention models

Tomasz Rembiasz and Juliusz Straszyński (Huawei).


Registration form:

Register here


part 1
part 2

Course summary:

Attention is a technique used in neural networks that mimics cognitive attention, letting the model focus on the most relevant parts of the input data. Attention-like mechanisms were first introduced in machine learning in the 1990s. Currently, most state-of-the-art (SotA) natural language processing (NLP) models used in commercial applications are based on attention. In these lectures, we are going to discuss different types of attention mechanisms (additive vs multiplicative attention; causal vs self-attention) and different architectures using attention (recurrent neural networks; purely attention-based models: Transformer, BERT and GPT), as well as their real-life applications in NLP, including neural machine translation and chatbots.
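As a rough illustration of the multiplicative (dot-product) attention mentioned above, here is a minimal sketch in plain Python. It is not taken from the course materials; the function names, the toy input and the scaling by the square root of the vector dimension are assumptions made for this example. Each query is scored against every key, the scores are turned into weights with a softmax, and the output is the weighted average of the values. The optional causal flag masks out later positions.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are non-negative and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot_product_attention(queries, keys, values, causal=False):
    """Scaled multiplicative attention over lists of equal-length vectors.

    With causal=True, position i may only attend to positions 0..i.
    """
    d = len(keys[0])
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        scores = []
        for j, k in enumerate(keys):
            if causal and j > i:
                scores.append(float("-inf"))  # masked: weight becomes 0
            else:
                scores.append(sum(a * b for a, b in zip(q, k)) / math.sqrt(d))
        weights = softmax(scores)
        # The output is the weighted average of the value vectors.
        out = [sum(w * v[c] for w, v in zip(weights, values))
               for c in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Toy self-attention: queries, keys and values all come from the same sequence.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = dot_product_attention(x, x, x, causal=True)
```

In this toy run the first position can only attend to itself, so its output equals its own value vector. Setting causal=False lets every position attend to the whole sequence.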

About the lecturers:

Tomasz Rembiasz is a Senior AI Researcher at Huawei Warsaw Research Center (HWRC) who specializes in NLP. In his daily work, he is responsible for developing the voice assistant Celia (used daily by millions of consumers), which incorporates SotA attention-based models. He conducts commercial research on Natural Language Understanding (NLU) and Natural Language Generation (NLG) problems, including intent detection and slot filling with multiple intents and ambiguity, multi-turn dialogue management, as well as transformer-based open-domain chatbots with proprietary evaluation methods. Tomasz Rembiasz received his Master's degree in Theoretical Physics from the Jagiellonian University in 2009 and his PhD in Computational Astrophysics from the Technical University of Munich/Max Planck Institute for Astrophysics in 2013.

Juliusz Straszyński is a Senior AI Developer at HWRC specializing in NLP. He received his Master’s degree in Computer Science from the University of Warsaw and is currently pursuing his PhD there. His academic research is focused on text processing algorithms, specifically identifying regularities and approximate pattern matching, which are applied at Huawei to NLU problems such as fuzzy searching. Juliusz conducts wide-ranging research on multi-turn dialogue management, attention models for intent recognition for the voice assistant Celia, and machine translation of natural language sentences into SQL queries. Juliusz is an active participant in AI and optimization competitions, notably taking 4th place at Google’s Hash Code in 2019 and becoming the European champion of Citadel’s Terminal in 2022. He supports the Polish Olympiad in Informatics by authoring and preparing tasks for participants.

Location and schedule:

Thursday, April 21st in 5440
14:15 - 15:45 lecture
16:15 - 17:15 class
Friday, April 22nd in 5440
14:15 - 15:45 lecture
16:15 - 17:15 class
Saturday, April 23rd in 2041 (labs)
10:15 - 11:45 lecture
12:15 - 13:15 class

You can find the assignment here. Please send it to the lecturers by the end of June 2022.