semantic search with opensearch

Summary

Top 2 papers analyzed

Entity-oriented search has revolutionized search engines. Whether users express an information need through a keyword query, expecting documents and entities, or through a clicked entity, expecting related entities, there is an inherent need to combine corpora and knowledge bases to obtain an answer. We explore joint representation models of documents and entities, while taking a step towards the definition of a more general retrieval approach. Specifically, we propose that graphs should be used to incorporate explicit and implicit information derived from the relations between text found in corpora and entities found in knowledge bases. We also take advantage of this framework to elaborate a general model for entity-oriented search, proposing a universal ranking function for the tasks of ad hoc document retrieval (leveraging entities), ad hoc entity retrieval, and entity list completion. We begin by proposing the graph-of-entity, based on the relations between combinations of term and entity nodes. We introduce the entity weight as the corresponding ranking function, relying on the idea of seed nodes for representing the query, either directly through term nodes, or based on the expansion to adjacent entity nodes. The score is computed based on a series of geodesic distances to the remaining nodes, providing a ranking for the documents (or entities) in the graph. To improve on the low scalability of the graph-of-entity, we then redesigned this model in a way that reduced the number of edges in relation to the number of nodes, by relying on the hypergraph data structure. The resulting model, which we called hypergraph-of-entity, is the main contribution of this thesis. We evaluate the model in the TREC OpenSearch track and participate in the TREC Common Core track. Our experiments relied on the INEX 2009 Wikipedia collection. Results supported the viability of a general retrieval model, opening novel challenges in information retrieval, and proposing a new path towards generality in this area.

We present a Retrieval-Augmented Generation (RAG) based chatbot framework that uses Natural Language Processing (NLP) and state-of-the-art language models to analyze Multiple Myeloma (MM)-specific literature and provide personalized treatment recommendations based on patient-specific genomic data. Our framework integrates the BioMed-RoBERTa-base model for embedding generation and the Mistral-7B language model for question answering, enabling understanding and response to complex clinical queries. A data analysis pipeline provides insights into the MM research landscape, informing the chatbot's knowledge base. Deployed using Amazon Kendra, our RAG chatbot offers a scalable platform for accessing MM information. The framework aims to democratize precision medicine by providing clinicians with a tool for interpreting complex genomic data in MM, streamlining clinical workflows, and facilitating personalized treatment plans. This paper presents our RAG-based chatbot framework's conceptualization, development, and potential impact on MM treatment and precision medicine. The integration of AI, NLP, and domain-specific knowledge marks a new era of healthcare with highly personalized, data-driven, and effective treatment. Our framework advances precision medicine in MM and serves as a blueprint for similar systems in other diseases, improving patient outcomes.
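The retrieve-then-generate flow described in the RAG abstract above can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `embed` function is a toy bag-of-words stand-in for a learned embedding model such as BioMed-RoBERTa-base, the passages are made-up placeholders, and the final prompt would be handed to a language model (for example Mistral-7B), which is not called here. It is not the paper's implementation.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a learned embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, passages, k=2):
    """Return the k passages most similar to the query."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


def build_prompt(query, contexts):
    """Assemble the text that would be sent to the generator model."""
    context_block = "\n".join("- " + c for c in contexts)
    return (
        "Answer using only the context below.\n\n"
        "Context:\n" + context_block + "\n\n"
        "Question: " + query
    )


# Hypothetical placeholder passages, not taken from any real knowledge base.
passages = [
    "proteasome inhibitors are used in multiple myeloma treatment",
    "opensearch supports k-nearest-neighbor vector search",
    "genomic data can guide treatment selection in multiple myeloma",
]
query = "which treatment options exist for multiple myeloma"

top = retrieve(query, passages, k=2)
prompt = build_prompt(query, top)
print(prompt)
```

In a real deployment the toy `embed` would be replaced by a trained encoder and the linear scan by a vector index or managed search service; the overall shape (embed, rank, assemble context, generate) stays the same.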

A RAG Chatbot for Precision Medicine of Multiple Myeloma

We present an AI framework that provides personalized treatment recommendations for Multiple Myeloma patients. The chatbot integrates biomedical and language models to understand complex clinical queries and offer insights into precision medicine.

Published By:

M. A. Quidwai - medRxiv

2024

Graph-based entity-oriented search

We propose a graph framework that incorporates texts and entities. We evaluate our model on TREC datasets; the results show the viability of a general model.

Published By:

José Luís Devezas - SIGIR Forum

2021

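The graph-based abstract summarized above describes ranking by seed nodes and geodesic distances. A minimal sketch of that idea follows, under stated assumptions: the toy graph, the `t:` and `e:` node prefixes, and the inverse-distance scoring rule are all hypothetical choices made for illustration. The thesis defines its own entity weight function, which this sketch does not reproduce.

```python
from collections import deque


def geodesic_distances(graph, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def rank_entities(graph, query_terms):
    """Score each entity by summed inverse distance to the query seed nodes.

    The 1/d rule is an assumed simplification, not the thesis's entity weight.
    """
    seeds = [t for t in query_terms if t in graph]
    scores = {}
    for seed in seeds:
        for node, d in geodesic_distances(graph, seed).items():
            if node.startswith("e:") and d > 0:
                scores[node] = scores.get(node, 0.0) + 1.0 / d
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical undirected graph mixing term nodes (t:) and entity nodes (e:).
graph = {
    "t:semantic": ["e:OpenSearch"],
    "t:search": ["e:OpenSearch", "e:Lucene"],
    "e:OpenSearch": ["t:semantic", "t:search", "e:Lucene"],
    "e:Lucene": ["t:search", "e:OpenSearch", "e:Java"],
    "e:Java": ["e:Lucene"],
}

ranking = rank_entities(graph, ["t:semantic", "t:search"])
for entity, score in ranking:
    print(entity, round(score, 3))
```

Here the query terms act directly as seed nodes; the abstract also mentions expanding seeds to adjacent entity nodes, which could be added by seeding from the entity neighbors of each term instead.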