From Documents to Dialogue: Unlocking PDF Data with a Smart Chatbot

What you will learn

Data Preparation: Discover how to extract text from PDF documents, generate textual descriptions of images, and store both in CrateDB for vector and full-text searching.
Data Retrieval and Augmentation: Learn how natural language search queries are converted to vector embeddings and used in semantic and hybrid searches. You’ll also see how to augment a Large Language Model (LLM) prompt with data from the database.
Response Generation: In the final step of the pipeline, we’ll introduce techniques for generating coherent and fluent responses to users.
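
To make the retrieval and augmentation step more concrete, here is a minimal sketch in plain Python. It is not the pipeline's actual implementation: the chunks are held in an in-memory list instead of CrateDB, the "embedding" is a toy word-count vector standing in for a real embedding model, and the two rankings are merged with reciprocal rank fusion, which is one common way to build a hybrid search and is assumed here for illustration. All names (CHUNKS, embed, hybrid_rank, build_prompt) are hypothetical.

```python
import math
from collections import Counter

# Hypothetical stand-ins for text chunks extracted from PDFs.
CHUNKS = [
    "CrateDB stores vectors and supports k-nearest-neighbour search.",
    "Full-text search ranks documents by keyword relevance.",
    "The quarterly report shows revenue growth in Europe.",
]


def embed(text):
    """Toy embedding: a word-count vector. A real pipeline would call an embedding model."""
    cleaned = text.lower().replace(".", "").replace("?", "")
    return Counter(cleaned.split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def semantic_rank(query):
    """Rank chunk indices by vector similarity to the query."""
    q = embed(query)
    return sorted(
        range(len(CHUNKS)),
        key=lambda i: cosine(q, embed(CHUNKS[i])),
        reverse=True,
    )


def keyword_rank(query):
    """Rank chunk indices by the number of query terms they share (crude full-text search)."""
    terms = set(embed(query))
    return sorted(
        range(len(CHUNKS)),
        key=lambda i: len(terms & set(embed(CHUNKS[i]))),
        reverse=True,
    )


def hybrid_rank(query, k=60):
    """Merge both rankings with reciprocal rank fusion: score = sum of 1 / (k + rank)."""
    scores = {i: 0.0 for i in range(len(CHUNKS))}
    for ranking in (semantic_rank(query), keyword_rank(query)):
        for rank, idx in enumerate(ranking, start=1):
            scores[idx] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


def build_prompt(query, top_n=2):
    """Augment the user's question with the best-matching chunks."""
    context = "\n".join(CHUNKS[i] for i in hybrid_rank(query)[:top_n])
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    print(build_prompt("How does full-text search rank documents?"))
```

The string returned by build_prompt is what would be sent to the LLM in the response generation step. In the real pipeline, the two ranking functions would be replaced by vector and full-text queries against the database.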