Podcast episodes
Season 3
Louis Brandy - SQL meets Vector Search at Rockset
00:00 Intro
00:42 Louis's background
05:39 From Facebook to Rockset
07:41 Embeddings prior to deep learning / LLM era
12:35 What's Rockset as a product
15:27 Use cases
18:04 RocksDB as part of Rockset
20:33 AI capabilities: ANN index, hybrid search
25:11 Types of hybrid search
28:05 Can one learn the alpha?
30:03 Louis's prediction of the future of vector search
33:55 RAG and other AI capabilities
41:46 Call out to the Vector Search community
46:16 Vector Databases vs Databases
49:16 Question of WHY
Saurabh Rai - Growing Resume Matcher
Topics:
00:00 Intro - how do you like our new design?
00:52 Greetings
01:55 Saurabh's background
03:04 Resume Matcher: 4.5K stars, 800 community members, 1.5K forks
04:11 How did you grow the project?
05:42 Target audience and how to use Resume Matcher
09:00 How did you attract so many contributors?
12:47 Architecture aspects
15:10 Cloud or not
16:12 Challenges in maintaining OS projects
17:56 Developer marketing with Swirl AI Connect
21:13 What you (listener) can help with
22:52 What drives you?
Show notes:
- Resume Matcher: https://github.com/srbhr/Resume-Matcher (website: https://resumematcher.fyi/)
- Ultimate CV by Martin John Yate: https://www.amazon.com/Ultimate-CV-Cr...
- fastembed: https://github.com/qdrant/fastembed
- Swirl: https://github.com/swirlai/swirl-search
Season 2
Sid Probstein - Creator of SWIRL - Search in siloed data with LLMs
Topics:
00:00 Intro
00:22 Quick demo of SWIRL on the summary transcript of this episode
01:29 Sid’s background
08:50 Enterprise vs Federated search
17:48 How vector search covers for missing folksonomy in enterprise data
26:07 Relevancy from vector search standpoint
31:58 How ChatGPT improves programmer’s productivity
32:57 Demo!
45:23 Google PSE
53:10 Ideal user of SWIRL
57:22 Where SWIRL sits architecturally
1:01:46 How to evolve SWIRL with domain expertise
1:04:59 Reasons to go open source
1:10:54 How SWIRL and Sid interact with ChatGPT
1:23:22 The magical question of WHY
1:27:58 Sid’s announcements to the community
YouTube version: https://www.youtube.com/watch?v=vhQ5LM5pK_Y
Design by Saurabh Rai: https://twitter.com/_srbhr_
Check out his Resume Matcher project: https://www.resumematcher.fyi/
Atita Arora - Search Relevance Consultant - Revolutionizing E-commerce with Vector Search
Topics:
00:00 Intro
02:20 Atita’s path into search engineering
09:00 When it’s time to contribute to open source
12:08 Taking management role vs software development
14:36 Knowing what you like (and coming up with a Solr course)
19:16 Read the source code (and cook)
23:32 Open Bistro Innovations Lab and moving to Germany
26:04 Affinity to Search world and working as a Search Relevance Consultant
28:39 Bringing vector search to Chorus and Querqy
34:09 What Atita learnt from Eric Pugh’s approach to improving Quepid
36:53 Making vector search with Solr & Elasticsearch accessible through tooling and documentation
41:09 Demystifying data embedding for clients (and for Java based search engines)
43:10 Shifting away from generic to domain-specific in search+vector saga
46:06 Hybrid search: where it will be useful to combine keyword with semantic search
50:53 Choosing between new vector DBs and “old” keyword engines
58:35 Women of Search
1:14:03 Important (and friendly) People of Open Source
1:22:38 Reinforcement learning applied to our careers
1:26:57 The magical question of WHY
1:29:26 Announcements
See show notes on YouTube: https://www.youtube.com/watch?v=BVM6TUSfn3E
Connor Shorten - Research Scientist, Weaviate - ChatGPT, LLMs, Form vs Meaning
Topics:
00:00 Intro
01:54 Things Connor learnt in the past year that changed his perception of Vector Search
02:42 Is search becoming conversational?
05:46 Connor asks Dmitry: How will Large Language Models change Search?
08:39 Vector Search Pyramid
09:53 Large models, data, Form vs Meaning and octopus underneath the ocean
13:25 Examples of getting help from ChatGPT and how it compares to web search today
18:32 Classical search engines with URLs for verification vs ChatGPT-style answers
20:15 Hybrid search: keywords + semantic retrieval
23:12 Connor asks Dmitry about his experience with sparse retrieval
28:08 SPLADE vectors
34:10 OOD-DiskANN: handling the out-of-distribution queries, and nuances of sparse vs dense indexing and search
39:54 Ways to debug a query case in dense retrieval (spoiler: it is a challenge!)
44:47 Intricacies of teaching ML models to understand your data and re-vectorization
49:23 Local IDF vs global IDF and how dense search can approach this issue
54:00 Realtime index
59:01 Natural language to SQL
1:04:47 Turning text into a causal DAG
1:10:41 Engineering and Research as two highly intelligent disciplines
1:18:34 Podcast search
1:25:24 Ref2Vec for recommender systems
1:29:48 Announcements
For show notes, please check out the YouTube episode below.
This episode on YouTube: https://www.youtube.com/watch?v=2Q-7taLZ374
Podcast design: Saurabh Rai: https://twitter.com/srvbhr