Simplifying Data Analysis & Visualization with Developer Tools & AI | Python Data Science Day
2 likes
410 views
Mar 28, 2024
Data analysis and visualization skills are increasingly important in the age of Large Language Models and generative AI. But how does a non-Python developer skill up rapidly on the tools and best practices needed to achieve project goals, without years of Python or data science experience? This is where the right developer tooling, with a little AI assistance, can help. In this talk, we'll go from identifying an open-source dataset, to analyzing it for insights and visualizing relevant outcomes, in 25 minutes, with just a GitHub account and an OpenAI endpoint. Along the way, we'll introduce you to a series of developer tools that make your journey easier:
  • Open Dataset: to "analyze" - from Kaggle, Hugging Face, or Azure
  • Data Wrangler: to "sanitize" data - extension from Visual Studio Code
  • Jupyter Notebook: to "record" process - for transferable learning
  • GitHub Codespaces: to "pre-build" environment - for consistent reuse
  • GitHub Copilot: to "explain/fix" code - for focused learning with AI help
  • Microsoft LIDA: to "suggest/build" visualization goals - for building your intuition with AI help
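
As a rough illustration of the dataset-to-insight flow described above (load, sanitize, analyze, visualize), here is a minimal, dependency-free Python sketch. The sample data, column names, and helper functions are all hypothetical and are not taken from the talk or its repo. In a notebook you would more likely reach for pandas and a plotting library, with tools such as Data Wrangler and LIDA assisting the sanitize and visualize steps.

```python
import csv
import io
import statistics
from collections import defaultdict

# Hypothetical sample standing in for an open dataset (not from the talk).
RAW_CSV = """city,year,temperature_c
Seattle,2023,11.5
Seattle,2023,
Austin,2023,20.0
austin ,2023,22.0
Seattle,2023,12.5
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def sanitize(rows):
    """Drop rows with a missing temperature and normalize city names."""
    clean = []
    for row in rows:
        value = row["temperature_c"].strip()
        if not value:
            continue  # skip rows that have no measurement
        clean.append({
            "city": row["city"].strip().title(),
            "year": int(row["year"]),
            "temperature_c": float(value),
        })
    return clean


def mean_by_city(rows):
    """Analyze: average temperature per city, rounded to 2 decimals."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["city"]].append(row["temperature_c"])
    return {city: round(statistics.mean(vals), 2) for city, vals in groups.items()}


def text_bar_chart(summary, width=30):
    """Visualize: one text bar per city, scaled to the largest value."""
    top = max(summary.values())
    lines = []
    for city, value in sorted(summary.items()):
        bar = "#" * round(width * value / top)
        lines.append(f"{city:<10} {bar} {value}")
    return "\n".join(lines)


rows = sanitize(load_rows(RAW_CSV))
summary = mean_by_city(rows)
print(text_bar_chart(summary))
```

Keeping each stage in its own small function mirrors how the steps would typically be split across notebook cells, so a different dataset can be swapped in by changing only the input and the column names.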
The talk comes with an associated repo that you can fork, then replace the dataset with your own to extend or experiment later. By the end of the talk you should have a sense of how to go from discovering a dataset to getting some visual insights about it, using existing tools with a little AI assistance.

Chapters:
00:00 Simplifying Data Analysis & Visualization with Developer Tools & AI
00:29 Follow along
00:54 Introduction - Data Analysis Challenges & Goals
04:44 GitHub Codespaces - Reusable environments
08:32 Jupyter Notebooks - Make it reproducible
11:18 GitHub Copilot - AI-assisted learning
14:43 Visual Studio Code - Productivity extensions
15:39 Open Datasets - Data Wrangler
19:15 Responsible AI toolkit - Model debugging for fairness
21:13 Project LIDA - AI-assisted intuition & visualization
25:24 Azure AI Studio - Paradigm shift to LLM Ops
25:47 Summary - Questions & Next Steps

Resources:
  • 30 Days of Data Science: https://30daysof.github.io/data-scien...
  • Data Science Recipes: https://aka.ms/2024/data-science-recipes
  • Workshops: https://aka.ms/workshops/python-data-...
  • Survey: https://aka.ms/Python/DataScienceDay/...
  • Python at Microsoft: https://aka.ms/python
  • Cloud Skills Challenge (through April 15, 2024): https://aka.ms/Python/DataScienceDay/CSC
  • GitHub Codespaces: https://github.com/codespaces
  • VS Code release notes: https://code.visualstudio.com/updates

Featuring: Nitya Narasimhan, PhD, Senior AI Advocate, Microsoft (@nitya)

Visual Studio Code
