Deep Learning for Speech and Language

Winter Seminar UPC TelecomBCN (January 24-31, 2017)

The aim of this course is to train students in deep learning methods for speech and language. Recurrent Neural Networks (RNNs) will be presented and analyzed in detail to understand the potential of these state-of-the-art tools for time series processing. Engineering tips and scalability issues will be addressed to solve tasks such as machine translation, speech recognition, speech synthesis, or question answering. Hands-on sessions will provide development skills so that attendees can become competent in contemporary data analytics tools.

Course Instructors

Teaching assistants

Organizers

The social event was funded by Tradeheader. Computational resources were provided by the Amazon Web Services Educate Program. Code repos and website were provided by the GitHub Education Program.

If your company or organization is interested in sponsoring this course, please contact Professor Antonio Bonafonte.

Slides and Videos

| Session | Topic | Speaker | Slideshare | YouTube |
|---|---|---|---|---|
| D1L2 | The Perceptron | SP | Slides | Video |
| D1L3 | Convolutional Neural Networks | ES | Slides | |
| D1L4 | Basic Deep Architectures | XG | Slides | Video |
| D1L5 | Backpropagation | ES | Slides | Video |
| D1L6 | Training | ES | Slides | Video |
| D2L1 | Deep Belief Networks | ES | Slides | Video |
| D2L2 | Recurrent Neural Networks I | SP | Slides | Video |
| D2L3 | Recurrent Neural Networks II | SP | Slides | Video |
| D2L4 | Word Embeddings | AB | Slides | Video |
| D2L5 | Generative Adversarial Networks | SP | Slides | Video |
| D2L6 | Advanced Deep Architectures | XG | Slides | Video |
| D3L1 | Language Model | MR | Slides | Video |
| D3L2 | Speech Recognition I | AR | Slides | Video |
| D3L3 | Speaker Identification I | JH | Slides | Video |
| D3L4 | Neural Machine Translation I | MR | Slides | Video |
| D3L5 | Speech Synthesis I | AB | Slides | Video |
| D3L6 | Speech Recognition II | JA | Slides | Video |
| D4L1 | Speaker Identification II | JH | Slides | Video |
| D4L2 | Neural Machine Translation II | MR | Slides | Video |
| D4L3 | Speech Synthesis II: WaveNet | AB | Slides | Video |
| D4L4 | Multimodal Deep Learning | XG | Slides | Video |
| D5 | Music Data Processing | JP | Slides | Video |

Invited talks

This 2017 edition of the seminar will include two invited talks:

Title: Facts and myths about deep learning.

Abstract: Deep learning has revolutionized the traditional machine learning pipeline, with impressive results in domains such as computer vision, speech analysis, or natural language processing. The concept has gone beyond research and application environments and permeated the mass media, news blogs, job offers, startup investors, and big company executives’ meetings. But what is behind deep learning? Why has it become so mainstream? What can we expect from it? In this talk, I will highlight a number of facts and myths that will provide a shallow answer to the previous questions. While doing that, I will also highlight various applications we have worked on at our lab. Overall, the talk aims to establish a series of basic concepts while giving grounds for reflection and discussion on the topic.

Jordi Pons from the Music Technology Group of the Universitat Pompeu Fabra (UPF)

Title: Deep learning for Music Informatics Research

Abstract: A brief review of the state of the art in music informatics research and deep learning reveals that such models have achieved competitive results for several tasks in a relatively short amount of time. Due to these promising results, some researchers declare that it is time for a paradigm shift: from hand-crafted features and shallow classifiers to deep processing models. In the past, introducing machine learning for global modeling (i.e., classification) resulted in a significant advance in the state of the art. Now, some researchers think that another advance could be made by using data-driven feature extractors based on deep learning instead of hand-crafted features. However, deep learning for music informatics research is still in its early stages: current systems are based on solutions proposed for computer vision or speech. We will present our work describing how to adapt these technologies for the music case.

[Slides]

Student Projects

Master's and bachelor's students developed a practical project during the week of the course. Summary slides and source code are publicly available.

Master Students

| Team | Project | Web | Slides | Repo |
|---|---|---|---|---|
| Team 1 | Sentiment analysis of Movie Reviews | | Slides | Repo |
| Team 2 | Smart text | Web | Slides | Repo |
| Team 3 | Sentiment analysis for IMDB database | | Slides | Repo |
| Team 4 (award) | Text to Phonemes | | Slides | Repo |
| Team 5 | Phonetic Transcription | | Slides | |

Bachelor Students

| Team | Slides | Repo |
|---|---|---|
| Team 1 | Slides | Repo |
| Team 2 | Slides | Repo |
| Team 3 | Slides | Repo |
| Team 4 | Slides | Repo |
| Team 5 (award) | Slides | Repo |

Pics

Photo album available from Google Photos.

Schedule

| When | Tuesday 24 | Wednesday 25 | Thursday 26 | Friday 27 | Tuesday 31 |
|---|---|---|---|---|---|
| 10:00-10:20 | Welcome | DNN/DBN | LM | SpeakerID II | Project Expo 1 |
| 10:20-10:40 | Perceptron | Recurrent I | ASR | Translation II | Project Expo 2 |
| 10:40-11:00 | Convolutional | Recurrent II | | | Project Expo 3 |
| 11:00-11:20 | Architectures I | Embeddings | SpeakerID | Joan Serrà | Project Expo 4 |
| 11:20-11:40 | Backpropagation | Keras | Translation I | | Project Expo 5 |
| 11:40-12:00 | Training | Keras | | | |
| 12:00-12:20 | Keras | Keras | Synthesis I | Synthesis II | |
| 12:20-12:40 | Keras | Generative | ASR II | Multimodal | Jordi Pons |
| 12:40-13:00 | Keras | Architectures II | | | |
| 13:00-14:00 | Project (MSc) | Project (MSc) | Project (MSc) | Project (MSc) | Closing |

Practical

Course on Piazza.

Course code: 230362 (PhD & master) / 230325 (bachelor)

ECTS credits: 2.5 (PhD & master) / 2 (bachelor) (corresponds to full-time dedication during the course week)

Teaching language: English

The course is offered for both master and bachelor students, but under two study programmes adapted to each profile.

Class Dates: 24, 25, 26, 27 and 31 January 2017 (there are no sessions on January 30).

Class Schedule: 4 hours a day, from 10am until 2pm (you will need 6 extra hours a day for homework during the course week).

Capacity: 15 MSc/PhD students + 15 BSc students

Location: Campus Nord UPC, Module D5, Room 010

If you have any general questions about the course, please use the public issues section of this repo. Otherwise, you can send an e-mail to Xavier Giro-i-Nieto.

Our Computer Vision Seminar

If you liked this seminar, you may want to check out the Deep Learning for Computer Vision seminar we organised in 2016, as well as enrol in the new one we are organising in 2017.

Related courses