The complete guide to accessing web data
A series of self-contained webinars to take you from zero knowledge of web data to successfully starting your web data project.
Join us to get a better understanding of what web data is, how to get it, and best practices across use cases.
Course length | 5 sessions
Episode length | 30 mins + Q&A
Course fee | FREE
How to launch a large-scale web data extraction project
LexisNexis's web scraping journey from concept to iteration to running a large-scale project
Date: 29th March, 2023
Time: 2pm GMT | 10am ET | 7am PT
We've come a long way on this journey to success with web data: from defining data requirements that best serve the business requirements, to selecting the right way to access web data for your project, to scraping compliant, high-quality data.
With this final webinar, we will help you connect the dots between all the different stages and apply your learnings in practice to launch a web data project from conception and iteration to design and execution.
Join our special guests Eric Platow, Senior Director of Data Science at LexisNexis, and Neha Setia Nagpal, Web Data Evangelist at Zyte, as they talk about LexisNexis's journey to scrape 100k websites.
Learn the tips and tricks of extracting web data from older websites using traditional technologies as well as new-age websites built with sophisticated tech. Eric will also share a sneak peek of the web scraping process used by LexisNexis to scrape, clean, process, and consume web-extracted data.
In this webinar, learn how to:
- Launch a scalable web scraping project
- Ace the techniques to scrape data from all kinds of websites
- Define the rules and techniques of data extraction so you scrape efficiently
- Overcome challenges faced by LexisNexis while scraping 100k websites
Won't be able to make it to the live session? Register anyway and we'll send you the recording.
Web Data 101: Planning for success with web data
Whether you know nothing about web data or you’re an experienced data consumer, these 4 tips and tricks will help you get off to a great start by showing how your data requirements should match your business and technical requirements from the very beginning.
Learn how to set the foundation of any web data project on solid ground with Neha, our Web Data Evangelist, and David, Head of Solution Architecture at Zyte.
Understand what data you need, how to get it, and how to avoid some of the common mistakes made by rushing in without a plan.
In this webinar, set yourself up for success by learning to:
- Define business requirements vs. data requirements
- Understand what web scraping and Data as a Service are
- Select data attributes
- Understand the Web Data Maturity Model
Don't forget to grab your free checklist at the end of the webinar to help you start your web scraping project the right way.
Discovering the best way to access web data
In this webinar, we will work out the best way to access web data based on your requirements. A great way to find your best fit is to evaluate the scope triangle and assess the balance required between the cost, time, and quality of your web data extraction project.
Learn how to weigh your requirements and find your best fit to access web data with Neha Setia Nagpal, Web Scraping Evangelist at Zyte, and web scraping expert, Theresia Tanzil.
In this webinar, source the data you need by understanding:
- The scope of your project
- How to weigh your cost, time and quality requirements
- The different methods of extracting web data
- Pros and cons of each of the different web scraping methods
Conducting a web scraping legal compliance review
In this webinar, we’ll talk about web scraping laws and regulations around the world and share a few guidelines to follow when scraping the web so you know when you need to be cautious about the manner and type of data you scrape.
Join our Chief Legal Officer, Sanaea Daruwalla, and Web Scraping Evangelist, Neha Setia Nagpal, to learn about web scraping best practices so you can scrape the web with peace of mind.
You will learn:
- The laws and regulations governing web scraping
- What to look for before you start your project
- How to not harm the websites you scrape
- How to avoid GDPR and CCPA violations
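One common way to avoid harming the websites you scrape is to honor their robots.txt rules and pace your requests. The sketch below is a minimal illustration using Python's standard `urllib.robotparser`. The robots.txt content, the `my-bot` user agent, and the helper names are assumptions made for this example, not material from the webinar, and it is not legal advice.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. A real crawler would download this
# file from the target site before sending any other request.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a reusable parser object."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the site's rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


def polite_delay(parser: RobotFileParser, user_agent: str, default: float = 1.0) -> float:
    """Seconds to wait between requests: the site's Crawl-delay if set, else a default."""
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else default
```

A crawler built on this sketch would call `is_allowed` before each request and sleep for `polite_delay` seconds between requests to the same site.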
A comprehensive overview of web data quality assurance
With this webinar, we dive deeper into the web data quality assurance process to make sure the web data you extract meets the requirements of your business.
Join Neha Setia Nagpal, Web Data Evangelist at Zyte, Artur Sarduski, Data Scientist at Zyte, and Pierluigi Vinciguerra, Co-Founder and CTO at Re Analytics - Databoutique.com, as they share the secrets behind Zyte's best-in-class web data quality assurance process, which ensures that the data collected is reliable and of the highest quality.
Learn how to evaluate the data's accuracy, completeness, consistency, and timeliness, and get helpful tips on regularly monitoring the crawlers for quick identification and resolution of any issues that may arise.
In this webinar, ensure the highest web data quality by learning:
- Best practices to avoid QA issues
- Testing and monitoring your data to catch problems early
- Zyte's approach to high web data extraction quality
- Tips to fix web data QA issues
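To make the quality dimensions mentioned above more concrete, here is a minimal sketch of automated checks for completeness, consistency, and timeliness on scraped records. It is an illustration under stated assumptions, not Zyte's actual QA process: the field names (`name`, `price`, `url`, `scraped_at`), the rule that a price must be a positive number, and the function names are all hypothetical. Accuracy is left out because it usually requires comparing records against the source pages.

```python
from datetime import datetime, timedelta

# Hypothetical schema: fields every record is expected to carry.
REQUIRED_FIELDS = ("name", "price", "url", "scraped_at")


def check_record(record: dict, now: datetime, max_age: timedelta) -> list[str]:
    """Return a list of issue labels for one scraped record (empty if clean)."""
    issues = []

    # Completeness: every required field is present and non-empty.
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            issues.append(f"missing:{field}")

    # Consistency: a price, when present, must be a positive number.
    price = record.get("price")
    if price not in (None, "") and not (isinstance(price, (int, float)) and price > 0):
        issues.append("invalid:price")

    # Timeliness: the record must not be older than the allowed age.
    scraped_at = record.get("scraped_at")
    if isinstance(scraped_at, datetime) and now - scraped_at > max_age:
        issues.append("stale:scraped_at")

    return issues


def summarize(records: list[dict], now: datetime, max_age: timedelta) -> dict:
    """Count how often each issue occurs, for monitoring a crawl over time."""
    counts: dict = {}
    for record in records:
        for issue in check_record(record, now, max_age):
            counts[issue] = counts.get(issue, 0) + 1
    return counts
```

Running `summarize` after every crawl and alerting when a count jumps is one simple way to catch problems early, for example when a site redesign silently breaks a selector.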
Meet the experts in web data extraction
Neha Setia Nagpal
Web Data Evangelist
Neha is a storyteller who loves to weave stories that explain tech concepts in a funny yet relatable way. Want to know how baking cakes and Machine Learning are similar? Feel free to message her.
Subscribe to her weekly newsletters - Extract Data Community
Co-Founder and CTO
Pierluigi has 12+ years of expertise in data management, from web data integration and scraping to business intelligence. At Re Analytics, his team crawls 1+ billion price points every month to extract consumer and luxury goods data.
Join his substack - The Web Scraping Club
Senior Director of Data Science, LexisNexis
Eric is an expert in building scalable, reusable components that come together to turn our data stewards into superheroes. He leads a team of data scientists, data engineers, and data stewards through a transformation from fully manual data collection to automated data collection and processing.
Artur is a Data Scientist who works closely with quality and process optimization in web scraping. He thrives on developing innovative solutions for complex problems.
Ask him how he applies biology or financial analysis principles to web scraping.
Chief Legal Officer
Sanaea is one of the leading experts on web data extraction laws and has spoken about ethical web data extraction and legal compliance at many conferences including Extract Summit.
Head of Solution Architecture
David is accomplished at developing and managing solutions across web data. He is an expert at architecting scalable solutions and delivering quality projects.
Global New Business Sales Lead
Liam leads the New Business team at Zyte. Liam is an expert in understanding the role of web data in different projects. His New Business team matches Zyte's products and services with customers who need web data.
Web Scraping Expert
Theresia is a data strategist and knowledge management expert who helps businesses gain external competitive edge and reach internal clarity & focus through the process of transforming data into actionable knowledge.
Team Leader - Software Engineer
Paweł has several years of experience developing advanced crawling solutions using the Scrapy framework. He loves contributing to open source, is one of the authors of the ScrapyRT framework, and has made contributions to Splash.