24th International Conference on Discovery Science

Halifax, Canada
11-13 October, 2021

DS’2021 provides an open forum for intensive discussions and exchange of new ideas among researchers working in the area of Discovery Science. The conference focus is on the use of artificial intelligence methods in science. Its scope includes the development and analysis of methods for discovering scientific knowledge, coming from machine learning, data mining, intelligent data analysis, and big data analytics, as well as their application in various domains.

We invite submissions of research papers addressing all aspects of discovery science. We encourage papers that focus on the analysis of different types of massive and complex data, including structured, spatio-temporal and network data, as well as heterogeneous, continuous or imprecise data. We also encourage papers in the fields of computational scientific discovery, mining scientific data, computational creativity and discovery informatics. We welcome papers addressing applications of artificial intelligence in different domains of science, including biomedicine and life sciences, materials science, astronomy, physics, chemistry, as well as social sciences.

Conference Format

After careful consideration of all pros and cons of the different alternatives, and given the current uncertainty concerning Covid-19 travel restrictions, we have decided that the conference will take place as a fully online event. The accepted papers will still be published by Springer and the special issue will proceed as announced. In these challenging times that the whole of humanity is going through, we hope that all of you are safe and remain healthy and positive.

News

2021/10/08 - Web access to the conference proceedings is now available in the program section of this web page!

2021/10/04 - The paper Neural Additive Vector Autoregression Models for Causal Discovery in Time Series by Bart Bussmann, Jannes Nys and Steven Latré was selected for the Best Student Paper Award sponsored by Springer! Congratulations!

2021/10/03 - Dr. Ross King will be one of the keynote speakers of DS’2021!

2021/09/15 - Dr. Rita Orji will be one of the keynote speakers of DS’2021!

2021/09/10 - Dr. Tanya Berger-Wolf will be one of the keynote speakers of DS’2021!

2021/08/07 - Conference registration is now open!

2021/08/03 - After careful consideration of all aspects related to this pandemic, we have decided that the conference will be 100% online.

2021/07/28 - Notifications sent to authors! 15 Long papers and 21 Short papers were accepted for DS’2021! Congratulations to all authors!

2021/07/20 - Due to the extension of the submission deadline, the notification and camera-ready dates were also pushed forward a bit.

2021/06/15 - Due to numerous requests, the abstract and full paper submission deadlines were extended by one further week!

2021/05/12 - Abstract and full paper submission deadlines extended!

2021/05/01 - Submission site on EasyChair is open

2021/03/03 - First version of the conference web site is live! Still a work in progress, obviously!

Keynote Speakers

Tanya Berger-Wolf

Human-machine partnership for conservation: AI and humans combatting extinction together

Ross King

Automating Science using Robot Scientists

Rita Orji

AI and Persuasive Technology to Promote Social Good Among Under-served Population

Accepted Papers

Access the Conference Proceedings HERE

Regular Papers

  • Incremental k-Nearest Neighbors Using Reservoir Sampling for Data Streams
    Maroua Bahri and Albert Bifet

  • Neural Additive Vector Autoregression Models for Causal Discovery in Time Series
    Bart Bussmann, Jannes Nys and Steven Latré
    BEST STUDENT PAPER AWARD

  • Leveraging Grad-CAM to Improve the Accuracy of Network Intrusion Detection Systems
    Francesco Paolo Caforio, Giuseppina Andresini, Gennaro Vessio, Annalisa Appice and Donato Malerba

  • Consensus Based Vertically Partitioned Multi-Layer Perceptrons for Edge Computing
    Haimonti Dutta, Saurabh Amarnath Mahindre and Nitin Nataraj

  • FHA: Fast Heuristic Attack against Graph Convolutional Networks
    Haoxi Zhan and Xiaobing Pei

  • Shapley-Value Data Valuation for Semi-Supervised Learning
    Christie Courtnage and Evgueni Smirnov

  • Automated Grading of Exam Responses: An Extensive Classification Benchmark
    Jimmy Ljungman, Vanessa Lislevand, John Pavlopoulos, Alexandra Farazouli, Zed Lee, Panagiotis Papapetrou and Uno Fors

  • Learning Time Series Counterfactuals via Latent Space Representations
    Zhendong Wang, Isak Samsten, Rami Mochaourab and Panagiotis Papapetrou

  • Combining Predictions under Uncertainty: The Case of Random Decision Trees
    Florian Peter Busch, Moritz Kulessa, Eneldo Loza Mencía and Hendrik Blockeel

  • Prioritization of COVID-19 literature via unsupervised keyphrase extraction and document representation learning
    Blaž Škrlj, Marko Jukič, Nika Eržen, Senja Pollak and Nada Lavrač

  • Ranking Structured Objects with Graph Neural Networks
    Clemens Damke and Eyke Hüllermeier

  • An Ensemble Hypergraph Learning framework for Recommendation
    Alireza Gharahighehi, Celine Vens and Konstantinos Pliakos

  • KATRec: Knowledge Aware aTtentive Sequential Recommendations
    Seyed Danial Mohseni Taheri, Mehrnaz Amjadi and Theja Tulabandhula

  • HTML-LSTM: Information Extraction from HTML Tables in Web Pages using Tree-Structured LSTM
    Kazuki Kawamura and Akihiro Yamamoto

  • Controlling BigGAN Image Generation with a Segmentation Network
    Aman Jaiswal, Harpreet Singh Sodhi, Mohamed Muzamil H, Rajveen Singh Chandhok, Sageev Oore and Chandramouli Shama Sastry

Short Papers

  • Multi-Scale Sentiment Analysis of Location-Enriched COVID-19 Arabic Social Data
    Tarek Elsaka, Imad Afyouni, Ibrahim Hashem and Zaher Al Aghbari

  • A Network Intrusion Detection System for Concept Drifting Network Traffic Data
    Giuseppina Andresini, Annalisa Appice, Corrado Loglisci, Vincenzo Belvedere, Domenico Redavid and Donato Malerba

  • An Analysis of Performance Metrics for Imbalanced Classification
    Jean-Gabriel Gaudreault, Paula Branco and João Gama

  • Local Interpretable Classifier Explanations with Self-generated Semantic Features
    Fabrizio Angiulli, Fabio Fassetti and Simona Nisticò

  • Automatic human-like detection of code smells
    Chitsutha Soomlek, Jan N. van Rijn and Marcello Bonsangue

  • Spatially-Aware Autoencoders for Detecting Contextual Anomalies in Geo-Distributed Data
    Roberto Corizzo, Michelangelo Ceci, Gianvito Pio, Paolo Mignone and Nathalie Japkowicz

  • Statistical Analysis of Pairwise Connectivity
    Georg Krempl, Daniel Kottke and Tuan Pham

  • Local Exceptionality Detection in Time Series Using Subgroup Discovery
    Dan Hudson, Travis Wiltshire and Martin Atzmueller

  • Elliptical Ordinal Embedding
    Aissatou Diallo and Johannes Fürnkranz

  • Calibrated Resampling for Imbalance and Long-Tails in Deep learning
    Colin Bellinger, Roberto Corizzo and Nathalie Japkowicz

  • Deriving a Single Interpretable Model by Merging Tree-based Classifiers
    Valerio Bonsignori, Riccardo Guidotti and Anna Monreale

  • Ensemble of Counterfactual Explainers
    Riccardo Guidotti and Salvatore Ruggieri

  • A Semi-Supervised Framework for Misinformation Detection
    Yueyang Liu, Zois Boukouvalas and Nathalie Japkowicz

  • GANs for tabular healthcare data generation: a review on utility and privacy
    João Almeida, Ricardo Correia and Pedro Pereira Rodrigues

  • Unsupervised Feature Ranking via Attribute Networks
    Urh Primožič, Blaž Škrlj, Sašo Džeroski and Matej Petković

  • A Sentence-level Hierarchical BERT Model for Document Classification with Limited Labelled Data
    Jinghui Lu, Maeve Henchion, Ivan Bacher and Brian Mac Namee

  • Sentiment Nowcasting during the COVID-19 Pandemic
    Ioanna Miliou, John Pavlopoulos and Panagiotis Papapetrou

  • Privacy risk assessment of individual psychometric profiles
    Giacomo Mariani, Anna Monreale and Francesca Naretto

  • Predicting reach to find persuadable customers: improving uplift models for churn prevention
    Théo Verhelst, Jeevan Shrestha, Denis Mercier, Jean-Christophe Dewitte and Gianluca Bontempi

  • The Case for Latent Variable vs Deep Learning Methods in Misinformation Detection: An Application to COVID-19
    Caitlin Moroney, Evan Crothers, Sudip Mittal, Anupam Joshi, Tulay Adali, Christine Mallinson, Nathalie Japkowicz and Zois Boukouvalas

  • Knowledge discovery of the delays experienced in reporting covid19 confirmed positive cases using time to event models
    Aleksandar Novakovic, Adele Marshall and Carolyn McGregor

Timetable


— Monday, Oct 11 —

9:00-16:25 ADT (UTC -3H)

09:00 Conference Opening - Carlos Soares, Luis Torgo

Preferences & Recommender Systems - Session Chair:
09:10 An Ensemble Hypergraph Learning framework for Recommendation - Alireza Gharahighehi, Celine Vens and Konstantinos Pliakos
09:35 KATRec: Knowledge Aware aTtentive Sequential Recommendations - Seyed Danial Mohseni Taheri, Mehrnaz Amjadi and Theja Tulabandhula
10:00 Unsupervised Feature Ranking via Attribute Networks - Urh Primožič, Blaž Škrlj, Sašo Džeroski and Matej Petković
10:20 Elliptical Ordinal Embedding - Aissatou Diallo and Johannes Fürnkranz

Short Break (10:40)


10:55 Keynote Talk (chair: Luis Torgo)

Human-machine partnership for conservation: AI and humans combatting extinction together, by Tanya Berger-Wolf


Neural Networks & Deep Learning - Session Chair:
11:55 Consensus Based Vertically Partitioned Multi-Layer Perceptrons for Edge Computing - Haimonti Dutta, Saurabh Amarnath Mahindre and Nitin Nataraj
12:20 A Sentence-level Hierarchical BERT Model for Document Classification with Limited Labelled Data - Jinghui Lu, Maeve Henchion, Ivan Bacher and Brian Mac Namee

Long Break (12:40)

Neural Networks & Deep Learning - Session Chair: Paula Branco
14:00 Neural Additive Vector Autoregression Models for Causal Discovery in Time Series - Bart Bussmann, Jannes Nys and Steven Latré (BEST STUDENT PAPER AWARD)
14:25 Spatially-Aware Autoencoders for Detecting Contextual Anomalies in Geo-Distributed Data - Roberto Corizzo, Michelangelo Ceci, Gianvito Pio, Paolo Mignone and Nathalie Japkowicz
14:45 Local Exceptionality Detection in Time Series Using Subgroup Discovery - Dan Hudson, Travis Wiltshire and Martin Atzmueller

Short Break (15:05)

Streams - Session Chair: Gjorgji Madjarov
15:20 Incremental k-Nearest Neighbors Using Reservoir Sampling for Data Streams - Maroua Bahri and Albert Bifet
15:45 A Network Intrusion Detection System for Concept Drifting Network Traffic Data - Giuseppina Andresini, Annalisa Appice, Corrado Loglisci, Vincenzo Belvedere, Domenico Redavid and Donato Malerba
16:05 Statistical Analysis of Pairwise Connectivity - Georg Krempl, Daniel Kottke and Tuan Pham

— Tuesday, Oct 12 —

9:00-16:25 ADT (UTC -3H)

09:00 Announcements

Classification - Session Chair: Michelangelo Ceci
09:10 Combining Predictions under Uncertainty: The Case of Random Decision Trees - Florian Peter Busch, Moritz Kulessa, Eneldo Loza Mencía and Hendrik Blockeel
09:35 Shapley-Value Data Valuation for Semi-Supervised Learning - Christie Courtnage and Evgueni Smirnov
10:00 A Semi-Supervised Framework for Misinformation Detection - Yueyang Liu, Zois Boukouvalas and Nathalie Japkowicz
10:20 An Analysis of Performance Metrics for Imbalanced Classification - Jean-Gabriel Gaudreault, Paula Branco and João Gama

Short Break (10:40)


10:55 Keynote Talk (chair: Carlos Soares)

Automating Science using Robot Scientists, by Ross King


Applications - Session Chair: Dino Ienco
11:55 Automated Grading of Exam Responses: An Extensive Classification Benchmark - Jimmy Ljungman, Vanessa Lislevand, John Pavlopoulos, Alexandra Farazouli, Zed Lee, Panagiotis Papapetrou and Uno Fors
12:20 Automatic human-like detection of code smells - Chitsutha Soomlek, Jan van Rijn and Marcello Bonsangue

Long Break (12:40)

The Steering Meeting will take place during this break

Applications - Session Chair: Vitor Cerqueira
14:00 HTML-LSTM: Information Extraction from HTML Tables in Web Pages using Tree-Structured LSTM - Kazuki Kawamura and Akihiro Yamamoto
14:25 Predicting reach to find persuadable customers: improving uplift models for churn prevention - Théo Verhelst, Jeevan Shrestha, Denis Mercier, Jean-Christophe Dewitte and Gianluca Bontempi
14:45 Knowledge discovery of the delays experienced in reporting covid19 confirmed positive cases using time to event models - Aleksandar Novakovic, Adele Marshall and Carolyn McGregor

Short Break (15:05)

COVID-19 - Session Chair:
15:20 Prioritization of COVID-19 literature via unsupervised keyphrase extraction and document representation learning - Blaž Škrlj, Marko Jukič, Nika Eržen, Senja Pollak and Nada Lavrač
15:45 Sentiment Nowcasting during the COVID-19 Pandemic - Ioanna Miliou, John Pavlopoulos and Panagiotis Papapetrou
16:05 Multi-Scale Sentiment Analysis of Location-Enriched COVID-19 Arabic Social Data - Tarek Elsaka, Imad Afyouni, Ibrahim Hashem and Zaher Al Aghbari

— Wednesday, Oct 13 —

9:00-16:25 ADT (UTC -3H)

09:00 Announcements

Graphs - Session Chair: Nathalie Japkowicz
09:10 Ranking Structured Objects with Graph Neural Networks - Clemens Damke and Eyke Hüllermeier
09:35 FHA: Fast Heuristic Attack against Graph Convolutional Networks - Haoxi Zhan and Xiaobing Pei
10:00 Deriving a Single Interpretable Model by Merging Tree-based Classifiers - Valerio Bonsignori, Riccardo Guidotti and Anna Monreale
10:20 Local Interpretable Classifier Explanations with Self-generated Semantic Features - Fabrizio Angiulli, Fabio Fassetti and Simona Nisticò

Short Break (10:40)


10:55 Keynote Talk (chair: Saso Dzeroski)

AI and Persuasive Technology to Promote Social Good Among Under-served Population, by Rita Orji


Responsible AI - Session Chair: Anna Monreale
11:55 Learning Time Series Counterfactuals via Latent Space Representations - Zhendong Wang, Isak Samsten, Rami Mochaourab and Panagiotis Papapetrou
12:20 Ensemble of Counterfactual Explainers - Riccardo Guidotti and Salvatore Ruggieri

Long Break (12:40)

The Community Meeting will take place during this break

Responsible AI - Session Chair:
14:00 Leveraging Grad-CAM to Improve the Accuracy of Network Intrusion Detection Systems - Francesco Paolo Caforio, Giuseppina Andresini, Gennaro Vessio, Annalisa Appice and Donato Malerba
14:25 Privacy risk assessment of individual psychometric profiles - Giacomo Mariani, Anna Monreale and Francesca Naretto
14:45 The Case for Latent Variable vs Deep Learning Methods in Misinformation Detection: An Application to COVID-19 - Caitlin Moroney, Evan Crothers, Sudip Mittal, Anupam Joshi, Tulay Adali, Christine Mallinson, Nathalie Japkowicz and Zois Boukouvalas

Short Break (15:05)

Spatiotemporal - Session Chair:
15:20 Controlling BigGAN Image Generation with a Segmentation Network - Aman Jaiswal, Harpreet Singh Sodhi, Mohamed Muzamil H, Rajveen Singh Chandhok, Sageev Oore and Chandramouli Shama Sastry
15:45 GANs for tabular healthcare data generation: a review on utility and privacy - João Almeida, Ricardo Correia and Pedro Pereira Rodrigues
16:05 Calibrated Resampling for Imbalance and Long-Tails in Deep learning - Colin Bellinger, Roberto Corizzo and Nathalie Japkowicz

Conference Registration

Each accepted paper should be accompanied by a processing fee of 150 Canadian dollars.

The deadline for paying the registration fee is August 25, 2021. After that date, papers without a paid fee will not be included in the proceedings.

Registration Details

  • Accepted paper processing fee: 150 CAD (Canadian Dollars)
  • Non-authors: Free with mandatory registration
  • Register below on Eventbrite

REGISTRATION FEES ARE NON-REFUNDABLE

Conference Organization

Program Chairs

  • Carlos Soares, University of Porto, Portugal
  • Luis Torgo, Dalhousie University, Canada

Program Committee Members

  • Alberto Cano, Virginia Commonwealth University
  • Albrecht Zimmermann, Université Caen Normandie
  • André L.D. Rossi, São Paulo State University (Unesp)
  • Anna Monreale, Computer Science Dep., University of Pisa
  • Bernhard Pfahringer, University of Waikato
  • Bruno Cremilleux, Universite de Caen Normandie
  • Catarina Oliveira, INESC TEC
  • Chedy Raïssi, INRIA
  • Colin Bellinger, NRC
  • Daniel Castro Silva, FEUP-DEI / LIACC
  • Dino Ienco, IRSTEA
  • Dragan Gamberger, Rudjer Boskovic Institute
  • Elio Masciari, Federico II University
  • Francesca Alessandra Lisi, University of Bari Aldo Moro
  • George Papakostas, Human-Machines Interaction (HMI) Laboratory, Department of Computer and Informatics Engineering, EMT Institute of Technology
  • Gianvito Pio, University of Bari Aldo Moro
  • Giuseppe Manco, ICAR-CNR
  • Gjorgji Madjarov, Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University
  • Herna Viktor, University of Ottawa
  • Ioannis Tsamardinos, Computer Science Department, University of Crete
  • Ivica Dimitrovski, Faculty of Computer Science and Engineering
  • Jaakko Hollmén, Aalto University
  • Jan Ramon, INRIA
  • Jerzy Stefanowski, Poznan University of Technology, Poland
  • Johannes Fürnkranz, Johannes Kepler University Linz
  • Kazumi Saito, University of Shizuoka
  • Kouichi Hirata, Kyushu Institute of Technology
  • Luis Teixeira, FEUP-DEI / INESC TEC
  • Makoto Haraguchi, Hokkaido University
  • Marina Sokolova, Faculty of Medicine, University of Ottawa and Institute for Big Data Analytics
  • Martin Atzmueller, Osnabrueck University
  • Michelangelo Ceci, Universita degli Studi di Bari
  • Mohamed Gaber, Birmingham City University
  • Nada Lavrač, Jozef Stefan Institute
  • Nathalie Japkowicz, American University
  • Nicola Di Mauro, Università di Bari
  • Pance Panov, Jozef Stefan Institute
  • Pascal Poncelet, LIRMM Montpellier
  • Pedro Larranaga, University of Madrid
  • Pedro Pereira Rodrigues, University of Porto
  • Rafael Gomes Mantovani, Federal Technology University of Paraná, campus Apucarana
  • Rita P. Ribeiro, University of Porto
  • Ruggero G. Pensa, University of Torino, Italy
  • Sageev Oore, Dalhousie University and Vector Institute
  • Saso Dzeroski, Jozef Stefan Institute
  • Stefan Kramer, Johannes Gutenberg University Mainz
  • Tomislav Lipic, Rudjer Boskovic Institute
  • Tomislav Smuc, Rudjer Boskovic Institute
  • Vincenzo Lagani, Ilia State University
  • Wouter Duivesteijn, Eindhoven University of Technology

Venue

The event will be fully online. Details on attending the conference will be sent later to registered participants.

Key Dates

Submit your paper here

Abstract submission (extended): June 18, 2021 (originally May 16, 2021)

Full paper submission (extended): June 20, 2021 (originally May 23, 2021)

Notification (extended): July 28, 2021 (originally July 20, 2021)

Camera ready version, author registration (extended): August 11, 2021 (originally August 8, 2021)

Conference: October 11-13, 2021

Call for Papers

Scope

The international conference on Discovery Science provides an open forum for intensive discussions and exchange of new ideas among researchers working in the area of Discovery Science. The conference focus is on the use of artificial intelligence methods in science. Its scope includes the development and analysis of methods for discovering scientific knowledge, coming from machine learning, data mining, intelligent data analysis, and big data analytics, as well as their application in various domains.

Topics

Possible topics include, but are not limited to:

  • Artificial intelligence (machine learning, knowledge representation and reasoning, natural language processing, statistical methods, etc.) applied to science
  • Machine learning: supervised learning (including ranking, multi-target prediction and structured prediction), unsupervised learning, semi-supervised learning, active learning, reinforcement learning, online learning, transfer learning, etc.
  • Knowledge discovery and data mining
  • Causal modelling
  • AutoML, meta-learning, planning to learn
  • Machine learning and high-performance computing, grid and cloud computing
  • Literature-based discovery
  • Ontologies for science, including the representation and annotation of datasets and domain knowledge
  • Explainable AI, interpretability of machine learning and deep learning models
  • Process discovery and analysis
  • Computational creativity
  • Anomaly detection and outlier detection
  • Data streams, evolving data, change detection, concept drift, model maintenance
  • Network analysis
  • Time-series analysis
  • Learning from complex data
    • Graphs, networks, linked and relational data
    • Spatial, temporal and spatiotemporal data
    • Unstructured data, including textual and web data
    • Multimedia data
  • Data and knowledge visualization
  • Human-machine interaction for knowledge discovery and management
  • Evaluation of models and predictions in discovery settings
  • Machine learning and cybersecurity
  • Applications of the above techniques in scientific domains, such as
    • Physical sciences (e.g., materials sciences, particle physics)
    • Life sciences (e.g., systems biology/systems medicine)
    • Environmental sciences
    • Natural and social sciences

Submission Guidelines

Papers must be written in English and formatted according to the Springer LNCS guidelines. Papers should be submitted in PDF form via the DS 2021 Online Submission System at EasyChair. Once a paper has been submitted to the conference, changes to the author list are not permitted.

Submitted papers should not exceed 15 pages (long papers) or 10 pages (short papers) in total, including references. All submissions will be subject to review by the DS 2021 Program Committee. The Program Committee reserves the right to offer acceptance as Short Papers (10 pages in the Proceedings) to some Long Paper submissions. All accepted papers will appear in the conference proceedings published in the Springer LNCS series and will have allocated time for oral presentation at the conference.

The reviews are single-blind. Authors do not need to anonymize their submission. Submitted papers may not have appeared in or be under consideration for another workshop, conference or journal. They may not be under review or submitted to another forum during the DS 2021 review process.

Guidelines for Accepted Papers

To be announced

Special Issue

The authors of a number of selected papers presented at DS 2021 will be invited to submit extended versions of their papers for possible inclusion in a special issue of the Machine Learning journal (published by Springer) on Discovery Science. Fast-track processing will be used to have them reviewed and published.

Award

There will be a Best Student Paper Award worth 555 EUR, sponsored by Springer.

Contact