17 April 2019
LIT JINR, Conference Hall, 5th floor
Europe/Moscow timezone

The tutorial "Deep and Machine Learning methods for document clustering and classification" will be held by Priv.-Doz. Dr. Alexei I. Streltsov (Senior Data Scientist, SAP SE, Germany) and the HybriLIT heterogeneous computation team as part of the XXIII International Scientific Conference of Young Scientists and Specialists (AYSS-2019), on the basis of the developed ecosystem for ML/DL.

In this tutorial, we consider the complete workflow of a typical Data Science project dealing with text documents. We define a problem, generate data, analyze the data, and explore relevant features: we discuss several ways to extract and describe semantic information, and show how to augment it with additional non-semantic information (which might help to improve the results). Next, we construct and apply several standard Machine Learning (ML) models to describe our data, casting the task as classification and regression problems. Then, we analyze the efficiency of the ML methods as well as the role, impact and relevance of our semantic and non-semantic features. Next, we show how to apply Deep Learning (DL) methods to attack the same problem, considering simple DNN (Deep Neural Network) and CNN (Convolutional Neural Network) models. At the end, we contrast our ML and DL results and discuss their pluses and minuses: efficiency, required computational resources, and possible ways to improve them.
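
As a rough illustration of the feature-extraction and classification steps described above, here is a minimal sketch in pure Python. It is not the tutorial's own code: the toy corpus, the labels, the function names, the TF-IDF weighting, the document-length feature and the nearest-centroid classifier are all assumptions chosen to keep the example self-contained. Word weights stand in for the semantic features, and a scaled word count stands in for a non-semantic one.

```python
import math
from collections import Counter

# Hypothetical toy corpus (invented for illustration): (text, label) pairs.
TRAIN = [
    ("the detector measured particle collisions at high energy", "physics"),
    ("neutron beam energy and particle detector calibration", "physics"),
    ("quarterly revenue and invoice totals for the sales team", "business"),
    ("customer invoice payment and sales revenue report", "business"),
]

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()

def fit_idf(docs):
    """Smoothed inverse document frequency for every word seen in training."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}

def featurize(text, idf):
    """Semantic features (TF-IDF weights) plus one non-semantic feature."""
    tokens = tokenize(text)
    tf = Counter(t for t in tokens if t in idf)
    vec = {t: (c / len(tokens)) * idf[t] for t, c in tf.items()}
    # Non-semantic feature (an assumption): document length, scaled to [0, 1].
    vec["__length__"] = min(len(tokens) / 50.0, 1.0)
    return vec

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def fit_centroids(train, idf):
    """Average the feature vectors of each class (nearest-centroid model)."""
    sums, counts = {}, Counter()
    for text, label in train:
        counts[label] += 1
        acc = sums.setdefault(label, Counter())
        for k, v in featurize(text, idf).items():
            acc[k] += v
    return {
        label: {k: v / counts[label] for k, v in acc.items()}
        for label, acc in sums.items()
    }

def predict(text, idf, centroids):
    """Assign the label whose centroid is most similar to the document."""
    vec = featurize(text, idf)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))

idf = fit_idf([text for text, _ in TRAIN])
centroids = fit_centroids(TRAIN, idf)
print(predict("particle energy in the detector", idf, centroids))
print(predict("sales invoice and revenue", idf, centroids))
```

In a real project the same steps would more likely be done with an established ML library and a much larger corpus; the sketch only shows how semantic and non-semantic features can sit side by side in one feature vector.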

The tutorial supports both active and passive participation. I will use a live Jupyter Notebook presentation to describe, discuss and execute each and every block of the Python code required for the above workflow. The corresponding blocks will be shared on a dedicated Slack channel (HybriLIT subscription required: https://web-stc.jinr.ru). If you have a valid account on the HybriLIT cluster, you will be able to copy and paste them from the Slack channel and re-execute them online in your own Notebook via the GITLab (https://jhub.jinr.ru/) service. No extra work is needed on your side to install, tune or support the required Python packages: JHub has already done it for you.


IMPORTANT: Please bring your own laptops!


Step-by-step instructions:

  1. Please follow this link:
  2. Enter your e-mail address and verify it;
  3. Follow the link that was sent to your e-mail;
  4. Create an account.

Welcome to the mctdhb-lab channel!

P.S. In case of any problems, please contact: shushanik@jinr.ru

Registration for this event is currently open.