Track 1

n2c2/OHNLP Track on Clinical Semantic Textual Similarity

Description

The wide adoption of electronic health records (EHRs) has provided a way to electronically document patients' medical conditions, thoughts, and actions. While the use of EHRs has improved the quality of healthcare, it has also introduced new challenges. One such challenge is the growing use of copy-and-paste and templates, which results in bloated, poorly organized, or erroneous documentation. EHRs are not optimized for tracking multiple complex medical problems or for maintaining continuity and quality of clinical decision-making. There is a growing need for automated methods that process patient data and reduce providers' cognitive burden in clinical decision-making. This problem is compounded by the distribution of patient data across multiple heterogeneous sources. Tools are needed that can aggregate data from diverse sources, minimize data redundancy, and organize and present the data in a user-friendly way, thereby reducing providers' cognitive burden.

A step in this direction is computing the semantic similarity of text snippets. In the general English domain, the SemEval Semantic Textual Similarity (STS) shared tasks have been organized since 2012 to develop automated methods for this purpose [1]. Clinical text contains highly domain-specific terminology; therefore, domain-specific NLP tools and resources are needed for the analysis, interpretation, and management of clinical text [2]. This task was previously tackled in the BioCreative/OHNLP 2018 ClinicalSTS task [3]. The 2019 n2c2/OHNLP shared task track on Clinical Semantic Textual Similarity (ClinicalSTS) will build on this experience and provide a venue for further evaluation of systems on previously unseen data.

Task Overview

ClinicalSTS provides pairs of clinical text snippets: de-identified sentences drawn from clinical notes. The task is to assign each sentence pair a numerical score indicating its degree of semantic similarity. Scores fall on an ordinal scale from 0 to 5, where 0 means the two snippets are completely dissimilar (no overlap in meaning) and 5 means they are completely semantically equivalent.
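For a concrete feel for the task, the sketch below scores a hypothetical sentence pair with a simple TF-IDF cosine baseline rescaled to the 0-to-5 range. This is purely illustrative: the sentences and the approach are assumptions of this write-up, not task data or an official baseline, and a practical system would fit its model on the released training pairs.

    # Purely illustrative baseline (an assumption, not the task's method):
    # score a pair by TF-IDF cosine similarity, rescaled to the 0-to-5 range.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def sts_score(sent_a: str, sent_b: str) -> float:
        # Fit on just the two sentences for the sketch; a real system
        # would fit on the released training data.
        tfidf = TfidfVectorizer().fit_transform([sent_a, sent_b])
        return 5.0 * float(cosine_similarity(tfidf[0], tfidf[1])[0, 0])

    # Hypothetical pair (not from the dataset):
    print(sts_score("The patient denies chest pain.",
                    "Patient reports no chest pain."))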

Evaluation Format

Evaluation will be conducted on withheld test data. Performance will be measured by the Pearson correlation coefficient between the predicted similarity scores and human judgments. An evaluation script will be made available to participants so that continuous evaluation on the training data can be conducted during system development.
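A minimal sketch of the metric is shown below, assuming plain-text files with one score per line; the file names are placeholders, and the official evaluation script may differ in its input format.

    # Sketch of the metric: Pearson correlation between system scores and
    # human judgments. File names are placeholders; one score per line.
    from scipy.stats import pearsonr

    with open("predictions.txt") as f:
        preds = [float(line) for line in f]
    with open("gold_scores.txt") as f:
        gold = [float(line) for line in f]

    r, _ = pearsonr(preds, gold)
    print(f"Pearson r = {r:.4f}")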

Participating teams are required to register and sign a Mayo Data Use Agreement to get access to the dataset.

Each team may submit up to 3 runs on the test data. Each run should contain one line per sentence pair, giving the similarity score assigned by the system as a floating-point number.
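Concretely, a run file could be produced as follows; the file name and the example scores are placeholders, since the task specifies only the one-score-per-line format.

    # Writing a run file: one floating-point score per line, in the same
    # order as the test sentence pairs. Name and scores are placeholders.
    scores = [3.5, 0.0, 4.25]

    with open("run1.txt", "w") as f:
        for s in scores:
            f.write(f"{s}\n")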

Dissemination

Participants are asked to submit a 500-word abstract describing their methodology. Abstracts may also include a graphical summary of the proposed architecture. The document should not exceed 2 pages (1.5 line spacing, 12 pt font). Authors of top-performing systems or particularly novel approaches will be invited to present or demonstrate their systems at the workshop. A journal venue will be organized following the workshop.

Contact

Please join the discussion group below for announcements. Questions about the challenge can be addressed to the organizers by posting to the group (via the New Topic button) or by emailing the address below.
Discussion Group: N2C2/OHNLP_2019_ClinicalSTS_Task
Email: n2c2ohnlp_2019_clinicalsts_task@googlegroups.com

Tentative Timeline

Registration: April 10, 2019
Training Data Release: May 27, 2019
Test Data Release: August 5, 2019
System Outputs Due: August 7, 2019 (11:59 pm Eastern Time)
Aggregate Results Release: August 12, 2019 (11:59 pm Eastern Time)
Abstract Submission: September 9, 2019

References

[1] Cer, D., Diab, M., Agirre, E., Lopez-Gazpio, I., & Specia, L. (2017). SemEval-2017 Task 1: Semantic textual similarity multilingual and crosslingual focused evaluation. arXiv preprint arXiv:1708.00055.

[2] Wang, Y., Afzal, N., Fu, S., Wang, L., Shen, F., Rastegar-Mojarad, M., & Liu, H. (2018). MedSTS: a resource for clinical semantic textual similarity. Language Resources and Evaluation, 1-16.

[3] Wang, Y., Afzal, N., Liu, S., Rastegar-Mojarad, M., Wang, L., Shen, F., ... & Liu, H. (2018). Overview of the BioCreative/OHNLP Challenge 2018 Task 2: Clinical Semantic Textual Similarity. Proceedings of the BioCreative/OHNLP Challenge, 2018.