CSpace: A concept embedding space for bio-medical applications

Category: Publications
Authors: Danilo Tomasoni, Luca Marchetti
Publication date: June 27, 2025

Published in Bioinformatics (Oxford University Press)

A new study by COSBI researchers Danilo Tomasoni and Luca Marchetti has been published in Bioinformatics (Oxford University Press). It presents CSpace, a compact, web-based concept-embedding model designed to support semantic discovery in the biomedical domain.

Read the article in Bioinformatics

While traditional keyword searches often miss relevant information due to synonyms, paraphrasing, or spelling variations, CSpace embeds over 30 million biomedical concepts and ontology identifiers, enabling researchers to retrieve related terms they wouldn’t think to search for directly.
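
The idea of retrieving related concepts from an embedding space can be illustrated with a minimal sketch. This is not CSpace's API: the three-dimensional vectors, the concept names, and the `nearest_concepts` helper are hypothetical, hand-written toy values, chosen only to show how cosine similarity can surface terms that share no keywords with the query.

```python
import math

# Hypothetical toy embeddings, hand-written for illustration only.
# A real concept-embedding model would supply learned, high-dimensional vectors.
TOY_EMBEDDINGS = {
    "myocardial infarction": [0.90, 0.10, 0.05],
    "heart attack": [0.88, 0.12, 0.07],
    "cardiac arrest": [0.70, 0.30, 0.10],
    "type 2 diabetes": [0.10, 0.90, 0.20],
    "insulin resistance": [0.15, 0.85, 0.30],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_concepts(query, embeddings, k=2):
    """Return the k concepts most similar to `query`, best match first."""
    q = embeddings[query]
    scored = [
        (name, cosine(q, vec))
        for name, vec in embeddings.items()
        if name != query
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    for name, score in nearest_concepts("myocardial infarction", TOY_EMBEDDINGS):
        print(f"{name}: {score:.3f}")
```

With these toy vectors, "heart attack" ranks first for the query "myocardial infarction" even though the two strings have no words in common, which is the kind of match a keyword search would miss.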

Key features:

  • Semantic expansion that reveals hidden connections between diseases, genes, and phenotypes
  • Compact and efficient: smaller than competing models, with sentence-similarity performance comparable to OpenAI's, as shown in Table 7 of the article
  • GPU-free: designed to run on everyday hardware without sacrificing performance

In a COSBI use case, integrating CSpace into a literature search pipeline increased coverage from 22% to 80%—demonstrating its value in supporting more complete, accurate scientific reviews.
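
How a coverage figure of this kind might be computed can be sketched as follows. Everything in the sketch is an assumption made for illustration: the four-document corpus, the `EXPANSIONS` table standing in for the related terms an embedding model would return, and the helper names. The values it produces are toy numbers and do not reproduce the 22% and 80% reported for the COSBI use case.

```python
# Hypothetical corpus: titles of papers already known to be relevant.
RELEVANT_DOCS = [
    "Outcomes after myocardial infarction in elderly patients",
    "Heart attack risk and statin therapy",
    "Biomarkers of acute MI in emergency care",
    "Long-term follow-up of myocardial infarction survivors",
]

# Hypothetical expansion table: stands in for the related terms that a
# concept-embedding model would suggest for each query.
EXPANSIONS = {
    "myocardial infarction": ["heart attack", "acute mi"],
}


def matches(doc, terms):
    """True if any term occurs in the document (naive substring match)."""
    text = doc.lower()
    return any(term in text for term in terms)


def coverage(docs, terms):
    """Fraction of the relevant documents retrieved by the given terms."""
    found = sum(1 for doc in docs if matches(doc, terms))
    return found / len(docs)


query = "myocardial infarction"
plain = coverage(RELEVANT_DOCS, [query])
expanded = coverage(RELEVANT_DOCS, [query] + EXPANSIONS[query])

print(f"keyword only: {plain:.0%}")
print(f"with expansion: {expanded:.0%}")
```

In this toy setup the plain keyword finds two of the four relevant titles, while the expanded term list finds all four, because the remaining two titles use a synonym and an abbreviation.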

This work reflects COSBI’s ongoing mission to develop accessible, high-impact tools for data-driven biomedical research.

Congratulations to Danilo Tomasoni and Luca Marchetti for this contribution to the field.
