Semantic enrichment of database columns using generative language models for advanced query searches
Abstract
This study introduces a novel application of natural language generation (NLG) models to improve database table retrieval. Unlike prior work, which relies primarily on embeddings and natural language processing (NLP) models, this work uses NLG models to generate database column descriptions that enhance search accuracy. The evaluation has two parts: first, assessing how accurately AI-generated column descriptions match ground-truth descriptions; second, measuring how search accuracy changes when these descriptions are integrated into existing search models. Results indicate that generated descriptions align more closely with ground-truth descriptions than column names alone do, and that adding them improves the scores of established search models.