17 April 2019
LIT JINR, Conference Hall, 5th floor
Europe/Moscow timezone

The tutorial "Deep and Machine Learning methods for document clustering and classification" will be given by Priv.-Doz. Dr. Alexei I. Streltsov (Senior Data Scientist, SAP SE, Germany) and the HybriLIT heterogeneous computing team within the framework of the XXIII International Scientific Conference of Young Scientists and Specialists (AYSS-2019), on the basis of the ecosystem developed for ML/DL.

In this tutorial, we walk through the complete workflow of a typical Data Science project dealing with text documents. We define a problem, generate and analyze data, and explore relevant features: we discuss several ways to extract and describe semantic information, and show how to augment it with additional non-semantic information, which might help to improve the results. Next, we construct and apply several standard Machine Learning (ML) models to describe our data, casting the task as both a classification and a regression problem. Then we analyze the efficiency of the ML methods as well as the role, impact and relevance of our semantic and non-semantic features. Next, we show how to apply Deep Learning (DL) methods to the same problem, considering simple DNN (Deep Neural Network) and CNN (Convolutional Neural Network) models. Finally, we contrast our ML and DL results and discuss their pluses and minuses: efficiency, required computational resources, and possible ways to improve them.
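To give a rough feel for the feature-extraction and classification steps described above, here is a minimal sketch in plain Python. It is not the tutorial's actual code: the toy corpus, the labels, the helper names (`fit_idf`, `featurize`, `fit_centroids`, `predict`) and the choice of a nearest-centroid classifier are all assumptions made for illustration. It builds TF-IDF vectors as the semantic features, appends the token count as one non-semantic feature, and assigns a new document to the class whose centroid is most similar.

```python
import math
from collections import Counter, defaultdict

# Toy corpus, invented for illustration: (text, label) pairs.
TRAIN = [
    ("invoice payment due amount bank transfer", "finance"),
    ("bank account balance payment received", "finance"),
    ("neural network training loss gradient", "science"),
    ("gradient descent optimizes the network loss", "science"),
]


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency for every training term."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def featurize(text, idf):
    """Sparse L2-normalized TF-IDF vector plus one non-semantic feature."""
    tokens = tokenize(text)
    tf = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    vec = {t: v / norm for t, v in vec.items()}
    # Non-semantic feature: document length, scaled down (assumed scale).
    vec["__n_tokens__"] = len(tokens) / 100.0
    return vec


def fit_centroids(samples, idf):
    """Average the feature vectors of each class."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = Counter()
    for text, label in samples:
        counts[label] += 1
        for key, value in featurize(text, idf).items():
            sums[label][key] += value
    return {
        label: {k: v / counts[label] for k, v in vec.items()}
        for label, vec in sums.items()
    }


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def predict(text, idf, centroids):
    """Return the label of the most similar class centroid."""
    vec = featurize(text, idf)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


idf = fit_idf([text for text, _ in TRAIN])
centroids = fit_centroids(TRAIN, idf)
```

Calling `predict("payment to bank account", idf, centroids)` classifies a new document with the fitted model. In a real project the same steps would more likely be done with a dedicated ML library and a proper train/test split; the hand-rolled version only makes each stage visible.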

The tutorial supports both active and passive participation. I will use a live Jupyter Notebook presentation to describe, discuss and execute each and every block of the Python code required for the above workflow. The corresponding blocks will be shared on a dedicated Slack channel (HybriLIT subscription required: https://web-stc.jinr.ru). If you have a valid account on the HybriLIT cluster, you will be able to copy and paste them from the Slack channel and re-execute them online in your own Notebook via the GitLab (https://jhub.jinr.ru/) service. No extra work is needed on your side to install, tune or support the required Python packages: JHub has already done it for you.


IMPORTANT: Please bring your own laptops!


Step-by-step instructions:

  1. Please follow this link:
  2. Enter your e-mail address and verify it;
  3. Follow the link that was sent to your e-mail;
  4. Create an account.

Welcome to the mctdhb-lab channel!

P.S. In case of any problems, please contact: shushanik@jinr.ru

Registration for this event is currently open.