Learning Data Representation for Clustering

In conjunction with IJCAI-PRICAI 2020, July 11-17, 2020, Yokohama, Japan (due to COVID-19, the conference will be held at a later date, most likely during January 2021, Kyoto, Japan)

Workshop Overview

Clustering, the process of organizing similar objects into meaningful clusters, is a key tool for dealing with massive data. It is essential in many fields, including data science, information retrieval, bioinformatics and computer vision. Despite their success, most existing clustering methods are severely challenged by the data generated by modern applications, which are typically high dimensional, noisy, heterogeneous and sparse. Finding a suitable representation of high-dimensional data that can enhance clustering performance is therefore a fundamental problem. This has driven many researchers to investigate new clustering models to overcome these difficulties. One promising category of such models relies on learning data representations. Although specific domain knowledge can be used to help design representations, the quest for machine learning is motivating the design of more powerful representation-learning algorithms implementing such priors.

The idea is to learn new data representations of the objects of interest, e.g., images, that encode only the most relevant information characterizing the original data, which would, for example, reduce noise and sparsity. Since the representation learning process is not guaranteed to infer accurate representations that are suitable for the clustering task, it is important to perform both tasks jointly, as recommended by several authors, so as to let clustering govern feature extraction and vice versa. Within this framework, classical dimensionality reduction approaches, e.g., Principal Component Analysis (PCA), have been widely considered for the data representation task. However, the linear nature of such techniques makes it challenging to infer faithful representations of real-world data, which typically lie on highly non-linear manifolds. This motivates the investigation of deep representation learning models (e.g., auto-encoders, convolutional neural networks, etc.), which have so far proven successful in extracting highly non-linear features from complex data, such as text, images and graphs. While promising, work on performing deep representation learning and clustering simultaneously has only just begun. The marriage between “data representation” and “clustering” will bring huge opportunities as well as challenges to communities concerned with dimensionality reduction and clustering.

This workshop aims at discovering the recent advances in data representation for clustering under different approaches. The LDRC workshop is thereby an opportunity to:

present the recent advances in data representation based clustering algorithms,
outline potential applications that could inspire new data representation approaches for clustering,
explore benchmark data to better evaluate and study data representation based clustering models.
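
As a purely illustrative sketch of the classical two-step baseline mentioned in the overview (linear dimensionality reduction with PCA, followed by clustering of the reduced data), the following NumPy code may help newcomers situate the topic. It is not a method proposed by the workshop or its authors: the toy data, the function names, and the farthest-first initialization of k-means are all assumptions made for this example. Note that the two steps run sequentially here, which is precisely the limitation that joint representation learning and clustering aims to address.

```python
import numpy as np

def pca_project(X, n_components):
    """Project centered data onto its top principal directions (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def farthest_first_init(Z, k):
    """Deterministic init (an assumption of this sketch): start from the
    first point, then repeatedly add the point farthest from all centers."""
    centers = [Z[0]]
    for _ in range(1, k):
        d = np.min([((Z - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Z[d.argmax()])
    return np.array(centers)

def kmeans(Z, k, n_iter=50):
    """Plain Lloyd iterations on the reduced representation Z."""
    centers = farthest_first_init(Z, k)
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        d = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def make_toy_data(seed=0):
    """Hypothetical data: two well-separated Gaussian blobs in 20 dimensions."""
    rng = np.random.default_rng(seed)
    a = rng.normal(loc=0.0, scale=0.5, size=(50, 20))
    b = rng.normal(loc=5.0, scale=0.5, size=(50, 20))
    return np.vstack([a, b])

X = make_toy_data()
Z = pca_project(X, n_components=2)   # step 1: learn a low-dimensional representation
labels, centers = kmeans(Z, k=2)     # step 2: cluster in the reduced space
```

On real data that lie on non-linear manifolds, such a linear, sequential pipeline is expected to be much less reliable than on this toy example.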

The workshop is co-located with IJCAI-PRICAI 2020, the 29th International Joint Conference on Artificial Intelligence and the 17th Pacific Rim International Conference on Artificial Intelligence.

Important Dates

Due to the COVID-19 pandemic and many requests, the submission deadline has been extended to August 15, 2020.

Workshop papers submission - August 15, 2020
Author notification - September 20, 2020
Camera-ready due - September 30, 2020
IJCAI-PRICAI 2020 workshops - most likely during January 2021

Workshop Chairs

General Chair - Mohamed Nadif, Université de Paris
Program Chair - Lazhar Labiod, Université de Paris

Workshop Organizers

Mohamed Nadif - Mohamed.nadif@u-paris.fr
Lazhar Labiod - lazhar.labiod@u-paris.fr
Daniel Berrar - daniel.berrar@ict.e.titech.ac.jp

Call for papers

This workshop aims at discovering the recent advances in data representation for clustering under different approaches. The LDRC workshop is thereby an opportunity to:

present the recent advances in data representation based clustering algorithms,
outline potential applications that could inspire new data representation approaches for clustering,
explore benchmark data to better evaluate and study data representation based clustering models.

This workshop intends to promote research at the intersection of data representation learning and clustering, and its application to real-life data mining challenges. The workshop welcomes both high-quality academic (theoretical or empirical) and practical papers on unsupervised data representation learning for clustering and related work. For reference, here is a non-exclusive list of topics of interest:

Technical areas - Unsupervised data representation learning, Deep network embedding, Attributed graph embedding, Manifold learning, Dimensionality reduction, Spectral data embedding, Factorization, Tensor clustering, Co-clustering, Latent Block Models, Graph Laplacian Mixture Models, Subspace clustering, Visualization

To attract researchers from various communities, this workshop will encourage submissions on applications, especially those that motivate the development of powerful representation-learning models, such as

Application areas - Bioinformatics, Medicine, Recommendation Systems, Computer Vision, Text mining, Natural Language Processing

Program Committee

TBD...

Papers submission

Papers submitted to this workshop must not have been accepted for publication elsewhere or be under review for another workshop, conference or journal. All submissions are limited to a maximum of 8 pages for long papers and 5 pages for short papers, in the IJCAI format. After you have generated a PDF file of your paper, you can proceed to the workshop submission system. Submission will be through the EasyChair conference management system:

Submit here

Program

IJCAI-PRICAI workshop: Learning Data Representation for Clustering (W29)

LDRC workshop program: January 7, 12am-5am UTC

12am-12:45am -- Encouraging Neural Machine Translation to Satisfy Terminology Constraints. Melissa Ailem, Lingua Custodia
12:45am-1:15am -- Hierarchical Clustering using Auto-encoded Compact Representation for Time-series Analysis. Soma Bandyopadhyay, Anish Datta and Arpan Pal, TCS Research, TATA Consultancy Services, Kolkata, India
1:15am-1:45am -- Variational Autoencoders for Generating Diverse Samples. Tsubasa Takahashi, Tatsuya Komatsu and Koki Yamada, LINE Corporation - Tokyo University of Agriculture and Technology
1:45am-2:30am -- Bilateral Variational Autoencoder for Dyadic Data. Aghiles Salah, Singapore Management University