Recently, while exploring AI tools from online resources, I came across a web-scraping Chrome extension. I thought: what if I could scrape this data and turn it into beautiful social media content?
Here is one of the most powerful web-scraping Chrome extensions.
What is web scraping?
Web scraping is a powerful tool for collecting data from websites quickly and efficiently. For marketers, web scraping can be especially useful for collecting data on potential leads, target audiences, or competitors. However, manually collecting data from websites can be time-consuming and tedious, particularly when dealing with websites that have large amounts of data.
The Challenge of Data Collection
Collecting data from a large website with over 300 AI tools can be a daunting task. Manually copying and pasting the data for each tool would take hours or even days. This is where web scraping comes in handy. With a web scraping tool, it's possible to automatically collect data from a website, even if there are hundreds or thousands of pages to scrape.
How to scrape data from any website
Here's a YouTube channel where you can learn how to use Bardeen to scrape data from any website.
After collecting the data in table form with the Bardeen extension, I transferred the table into Google Sheets by connecting my Google account.
Organizing Data with Google Sheets
Once the data was collected, I imported it into Google Sheets to organize it. Google Sheets is a powerful tool for data organization, allowing users to sort, filter, and analyze data in a variety of ways. I created separate columns for each data point and used filters to remove any unnecessary data.
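The same kind of cleanup can be sketched outside a spreadsheet. The snippet below is only an illustration: the post does not say which rows counted as unnecessary, so it assumes that means rows with no website link and duplicate links. The `name` column and the sample rows are hypothetical.

```python
def clean_rows(rows):
    """Drop rows without a website link and duplicate links, then sort by name.

    Assumption: "unnecessary data" means empty or repeated links.
    """
    seen = set()
    cleaned = []
    for row in rows:
        link = (row.get("website_link") or "").strip()
        if not link or link in seen:
            continue  # skip empty or already-seen links
        seen.add(link)
        cleaned.append({**row, "website_link": link})
    # Sort case-insensitively by the (hypothetical) name column.
    return sorted(cleaned, key=lambda r: r.get("name", "").lower())


# Hypothetical sample rows, standing in for the scraped table.
sample = [
    {"name": "Tool B", "website_link": "https://example.org/tool-b"},
    {"name": "Tool A", "website_link": " https://example.com/tool-a "},
    {"name": "Tool B copy", "website_link": "https://example.org/tool-b"},
    {"name": "No link", "website_link": ""},
]
cleaned = clean_rows(sample)
```

In Google Sheets the same result comes from the built-in filter and sort controls; the code only makes the rules explicit.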
Data extracted from the AI tools website using web scraping
- Website link
- Icon link
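To show what a scraper does when it pulls these two fields, here is a minimal sketch using only Python's standard library. It is not how Bardeen works internally. The HTML snippet is hypothetical, since the real page's markup is not shown in this post, and the sketch assumes each tool is a link wrapping an icon image.

```python
from html.parser import HTMLParser

# Hypothetical markup for a tools directory page.
SAMPLE_HTML = """
<div class="tool">
  <a href="https://example.com/tool-a"><img src="https://example.com/a.png" alt="Tool A"></a>
</div>
<div class="tool">
  <a href="https://example.org/tool-b"><img src="https://example.org/b.png" alt="Tool B"></a>
</div>
"""


class ToolCardParser(HTMLParser):
    """Collects one row per <img> found inside an <a> element."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._current_link = None  # href of the <a> we are currently inside

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            self._current_link = attrs.get("href")
        elif tag == "img" and self._current_link is not None:
            self.rows.append(
                {"website_link": self._current_link, "icon_link": attrs.get("src")}
            )

    def handle_endtag(self, tag):
        if tag == "a":
            self._current_link = None


def extract_rows(html):
    parser = ToolCardParser()
    parser.feed(html)
    return parser.rows


rows = extract_rows(SAMPLE_HTML)
```

Each entry in `rows` corresponds to one table row with a website link and an icon link.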
Next came an important step: I applied a few Google Sheets functions to create the columns I still needed:
- The domain name, extracted from the absolute URL
- The meta image, fetched from each AI tool's extracted URL with the IMPORTXML() function
The IMPORTXML function can also import a website's meta description, but in my case I already had the description of each AI tool.
Google Sheets functions used in the above process
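For readers curious about what such a lookup selects, the sketch below reads the same meta tags from a page's HTML in Python. It assumes the meta image is published as an Open Graph `og:image` tag, which is common but not guaranteed, and it uses a hypothetical HTML string instead of fetching a live page.

```python
from html.parser import HTMLParser

# Hypothetical <head> of an AI tool's landing page.
SAMPLE_PAGE = """
<html><head>
  <title>Example Tool</title>
  <meta name="description" content="An example AI tool.">
  <meta property="og:image" content="https://example.com/cover.png">
</head><body></body></html>
"""


class MetaTagParser(HTMLParser):
    """Stores the content of each <meta> tag, keyed by its property or name."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        key = attrs.get("property") or attrs.get("name")
        content = attrs.get("content")
        # Keep the first occurrence of each key.
        if key and content and key not in self.meta:
            self.meta[key] = content


def read_meta(html):
    parser = MetaTagParser()
    parser.feed(html)
    return {
        "image": parser.meta.get("og:image"),
        "description": parser.meta.get("description"),
    }


meta = read_meta(SAMPLE_PAGE)
```

A page without these tags yields `None` for the missing field, much as a spreadsheet lookup returns an error or an empty cell when the tag is absent.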
Get a domain from an absolute URL (https://abhidadhaniya.com to abhidadhaniya.com)
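As a point of comparison outside Google Sheets, the same transformation can be sketched in Python with the standard `urllib.parse` module. The helper name `domain_from_url` is made up for this illustration, and the sketch assumes the input is an absolute URL that includes its scheme.

```python
from urllib.parse import urlparse


def domain_from_url(url):
    """Return the host part of an absolute URL, or "" if none is found."""
    return urlparse(url).hostname or ""


domain = domain_from_url("https://abhidadhaniya.com")
with_path = domain_from_url("https://abhidadhaniya.com/blog/post?ref=home")
```

Both calls give `abhidadhaniya.com`. A value without a scheme, such as plain `abhidadhaniya.com`, is not treated as an absolute URL here and returns an empty string.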