Application of Azure Computer Vision OCR Service - Extracting Text for NLP | Python Data Science Day
4 likes
518 views
Mar 28, 2024
Acquiring data for Natural Language Processing (NLP) tasks, such as sentiment analysis, can be a real challenge. Microsoft Azure offers tools that help here, notably the Computer Vision service, whose OCR capability supports collecting and engineering text data. Diverse sources, including text from papers, pictures, books, and more, provide ample opportunities for enriching datasets.

Participants will learn how to use Microsoft Azure tools and Visual Studio Code extensions to streamline data acquisition, storage, and use for NLP work. The goal is to give data scientists the resources they need to handle NLP data and build a more efficient data science workflow.

The storage technologies covered are NoSQL (MongoDB) and Azure Blob Storage. The session also looks at how to use the provided local files, bring in data from these sources, and use that data for NLP tasks.

Chapters:
00:00 Application of Azure Computer Vision OCR Service
00:34 Tools we'll be using
02:51 Azure Computer Vision overview
06:20 Today's goal
06:56 Python in NLP
07:40 Demo

Resources:
Survey https://aka.ms/Python/DataScienceDay/...
Python at Microsoft https://aka.ms/python
Cloud Skills Challenge - through April 15, 2024 https://aka.ms/Python/DataScienceDay/CSC
GitHub codespaces https://github.com/codespaces
VS Code Release notes https://code.visualstudio.com/updates

Featuring: Theophilus Owiti, Machine Learning Practitioner, Microsoft Learn Student Ambassador (@lincolnowiti)
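
The description above talks about extracting text with OCR and then using it for NLP. As a rough illustration of that hand-off (not the code shown in the demo), the sketch below flattens an OCR result into lines and tokens. It assumes a response shaped like the Computer Vision Read API JSON (status, analyzeResult, readResults, lines, text); the sample payload and the helper names extract_lines and tokenize are hypothetical, and no call to Azure is made.

```python
import re
from collections import Counter

# Hypothetical payload shaped like a Read (OCR) result.
# The text is made up for illustration; a real result comes from the service.
SAMPLE_RESULT = {
    "status": "succeeded",
    "analyzeResult": {
        "readResults": [
            {"page": 1, "lines": [
                {"text": "The service was great."},
                {"text": "Great food, slow delivery."},
            ]},
            {"page": 2, "lines": [
                {"text": "Would order again!"},
            ]},
        ]
    },
}


def extract_lines(result):
    """Return the recognized text lines, page by page, or [] if OCR has not succeeded."""
    if result.get("status") != "succeeded":
        return []
    pages = result.get("analyzeResult", {}).get("readResults", [])
    return [line["text"] for page in pages for line in page.get("lines", [])]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


lines = extract_lines(SAMPLE_RESULT)
tokens = [token for line in lines for token in tokenize(line)]
counts = Counter(tokens)

print(lines)
print(counts.most_common(3))
```

The extracted lines could then be stored (for example in MongoDB or Azure Blob Storage, as the session describes) or passed to a sentiment analysis step; the regex tokenizer here is only a stand-in for whatever NLP preprocessing is actually used.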


Visual Studio Code

722K subscribers