Beyond Keywords: Image similarity search in Azure Cosmos DB for PostgreSQL | Python Data Science Day
3 likes
264 views
Mar 28, 2024
Vector search, also known as vector similarity search, finds similar items based on their content rather than on exact matches against properties such as keywords, tags, or other metadata, as keyword-based search systems do. It uses machine learning to capture the meaning of data. The key idea is to translate unstructured data, such as text, images, videos, and audio, into high-dimensional vectors (also known as embeddings) and then apply nearest neighbor algorithms to find similar data.

In this quickstart session, we will work together to build an image similarity search system using Python, Azure Cosmos DB for PostgreSQL, and pgvector, an open-source vector similarity search extension for PostgreSQL. We will generate vector embeddings with the Azure AI Vision multimodal embeddings API and enable the pgvector extension. We will then discuss exact and approximate nearest neighbor search and use Azure Cosmos DB for PostgreSQL to store and query vector data.

Chapters:
00:00 Image similarity search in Azure Cosmos DB for PostgreSQL
00:56 Why vector search?
01:49 Agenda
02:14 Turn data into vectors
03:02 Project the vectors onto the 2D vector space
03:37 How to measure if 2 vectors are similar
03:56 Vector search workflow
04:34 Vector search in PostgreSQL
05:01 Create a table to store embeddings
05:34 Query embeddings
06:01 Demo
07:00 Vector search strategies
08:21 Create an IVFFlat index in pgvector
09:30 Demo
10:01 Resources

Resources:
Project - GitHub Repository: https://github.com/sfoteini/vector-se...
pgvector - GitHub Repository: https://github.com/pgvector/pgvector
Vectors on Azure Cosmos DB for PostgreSQL: https://learn.microsoft.com/azure/cos...
Azure AI Vision multimodal embeddings APIs: https://learn.microsoft.com/azure/ai-...
Survey: https://aka.ms/Python/DataScienceDay/...
Python at Microsoft: https://aka.ms/python
Cloud Skills Challenge (through April 15, 2024): https://aka.ms/Python/DataScienceDay/CSC
GitHub Codespaces: https://github.com/codespaces
VS Code release notes: https://code.visualstudio.com/updates

Featuring: Foteini Savvidou, Software Engineer, Microsoft AI MVP (@SavvidouFoteini)
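
For readers who want a feel for the workflow before watching, here is a minimal sketch in plain Python of exact nearest neighbor search by cosine distance, the measure pgvector exposes through its `<=>` operator. The file names and 3-dimensional vectors are made up for illustration; real embeddings from an image model have many more dimensions. The SQL strings show what the pgvector side could look like, assuming a hypothetical `images` table and an assumed embedding size of 1024; they are not taken from the session's project and are not executed here.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def nearest_neighbors(query, items, k=3):
    """Exact search: compare the query against every stored vector."""
    ranked = sorted(items, key=lambda name: cosine_distance(query, items[name]))
    return ranked[:k]

# Toy "embeddings" (hypothetical data, 3 dimensions for readability).
images = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "kitten.jpg": [0.8, 0.2, 0.1],
    "car.jpg": [0.0, 0.2, 0.9],
}

top_two = nearest_neighbors([1.0, 0.0, 0.0], images, k=2)
print(top_two)

# Equivalent steps in pgvector, as SQL text only (not run by this script).
# Table name, column names, and vector(1024) are assumptions for illustration.
CREATE_TABLE_SQL = """
CREATE TABLE images (
    id bigserial PRIMARY KEY,
    filename text,
    embedding vector(1024)
);
"""

# Exact search: order all rows by cosine distance to the query vector.
QUERY_SQL = "SELECT filename FROM images ORDER BY embedding <=> %s LIMIT 5;"

# Approximate search: an IVFFlat index trades some recall for speed.
INDEX_SQL = """
CREATE INDEX ON images
USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
"""
```

The Python function scans every vector, which is what an unindexed pgvector query does as well; the IVFFlat index instead groups vectors into lists and searches only the closest ones, so results can differ slightly from the exact ranking.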


Visual Studio Code

755K subscribers