E15: Unlocking the Internet's Treasure with Rich Skrenta

Practically Intelligent by Sinan Ozdemir and Akshay Bhushan

Episode notes

In this episode of Practically Intelligent, Sinan and Akshay sit down with Rich Skrenta, the Executive Director of the Common Crawl Foundation. Rich shares his extensive experience in data aggregation and AI and how it ties into the history, mission, and future of Common Crawl, a nonprofit organization responsible for one of the largest open-source web data repositories in the world. The three discuss the challenges and opportunities of expanding Common Crawl's global reach, the critical role of curated data in training large language models, and the importance of maintaining open access to the internet in the age of cutting-edge AI.

Key topics include:

  • The importance of curated data in AI training
  • Challenges of expanding Common Crawl globally
  • The future of open internet access in the AI era
Keywords
llm, chatgpt, gpt, rlhf, open source, ai