Scraping Amazon Product Pages (PDP) Without Writing Code in 3 Steps

I will walk you through scraping Amazon product detail pages without setting up servers or writing a single line of code, using Integromat and Scrapezone.

By Alon Gehlber · DZone Core · Oct. 27, 20 · Tutorial

Nowadays, when eCommerce is booming, scraping eCommerce websites for “alternative data” has become essential to stay afloat in this competitive game. Some apply AI and text analysis to this data to extract consumer insights and competitive intelligence, while others rely on it to optimize their pricing.

In most cases, web scraping requires setting up a headless browser like Puppeteer or Selenium and configuring it to fetch the right content from the required pages. In one of my previous articles, I covered several Puppeteer tricks to avoid detection. This time, I will cover how to collect e-commerce data without coding.

In this tutorial, I will walk you through scraping Amazon product detail pages without setting up servers or writing a single line of code, using Integromat and Scrapezone.

As a basic exercise, this little program will scrape a list of Amazon product URLs daily and send the results to an email address of our choosing.

Step 1: Create Accounts

There are two fundamental tools for this process: Integromat and Scrapezone. Both are free to sign up for and can run the first few preliminary jobs at no cost.

Start with creating a free Integromat account here.

The next step is to create a free Scrapezone account. Register a new account here and copy your username/password details that appear in the API Information tab on the dashboard’s Home page.
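These username/password details act as HTTP Basic Auth credentials for every request to the Scrapezone API. If you want to sanity-check them from the command line before wiring anything up, a minimal authenticated request like the sketch below works (this only exercises authentication; the exact response Scrapezone returns for bad credentials is an assumption here):

Shell

# Hypothetical sanity check: replace username/MyScrapingPassword with the
# values copied from the API Information tab. curl's --user flag sends them
# as a standard "Authorization: Basic <base64>" header; wrong credentials
# should come back as an HTTP 401/403 rather than a normal response.
curl --user username:MyScrapingPassword \
  --request POST \
  --header "Content-Type: application/json" \
  --data '{"query":["https://amazon.com/dp/B08J65DST5"],"scraper_name":"amazon_product_display"}' \
  https://api.scrapezone.com/scrape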

Step 2: Create the Scrape Receiving Scenario

Now that you have an account in both tools, the next step is to create a scenario in Integromat. Log in to Integromat and select ‘Create a new scenario’:

Search for webhook, and click ‘Continue.’

Click on ‘Custom webhook,’ add a new webhook, and copy the hook address to the clipboard.

This webhook will receive the scrape results from ScrapeZone.

To allow Integromat to define the incoming data structure, let’s send a sample scrape request using this webhook.

Open a terminal and type the following (make sure to paste the webhook URL into the “callback_url” field):

Shell

curl --user username:MyScrapingPassword \
  --header "Content-Type: application/json" \
  --request POST \
  --data '{"query":["https://amazon.com/dp/B08J65DST5","https://amazon.com/dp/B08J65DST5"],"scraper_name":"amazon_product_display","callback_url":"<Paste the webhook URL here>"}' \
  https://api.scrapezone.com/scrape
Wait for 30-60 seconds for the results to be sent back to Integromat. The status should change to “Successfully determined.”

Now let’s send the results to our email address.

Click on ‘Add another module’ and select Email -> ‘Send an Email.’

You will be required to configure your email address as a connection; this is very straightforward.

Select your preferred email address for ‘To,’ and “Scrape Results” as the subject.

For content, select ‘parsed_results_csv’ to receive the CSV file with the scrape results.

If you prefer JSON, select ‘parsed_results_json’ to receive the results as a JSON file.
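For context, these two fields come from the webhook payload that Scrapezone posted back during the sample request earlier. Its exact schema isn't shown in this article, but based on the fields Integromat exposes, it looks roughly like the following (a hypothetical sketch: only parsed_results_csv and parsed_results_json are confirmed by this tutorial; every other field name and the placeholder URLs are assumptions):

JSON

{
  "scraper_name": "amazon_product_display",
  "status": "done",
  "parsed_results_csv": "https://<hypothetical-results-host>/results.csv",
  "parsed_results_json": "https://<hypothetical-results-host>/results.json"
}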

Click ‘Ok,’ rename the scenario to ‘Scrape Results,’ and click ‘Save.’

Now you can test the scenario by re-sending the curl request.

Step 3: Create the Scheduled Scraping Scenario

Since we want to create a daily scrape, we will create a scenario that sends an HTTP request to ScrapeZone to initiate a scraping task.

Select ‘HTTP’ from the menu and then ‘Make a Basic Auth request.’

For the credentials, click ‘Add’ and type your scraping username and password from the Scrapezone dashboard.

Fill in the following details:

  • URL: https://api.scrapezone.com/scrape
  • Body Type: Raw
  • Content Type: JSON (application/json)
  • Request Content:

JSON

{"query":["https://amazon.com/dp/B08J65DST5","https://amazon.com/dp/B08J65DST5"],"scraper_name":"amazon_product_display","callback_url":"<Paste the webhook URL here>"}
To schedule the task, click the clock icon and select your preferred scheduling.
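Integromat's clock is the no-code way to schedule this, but the same daily trigger can live anywhere that can run the Step 2 curl command. For example, a crontab entry on any Linux box would do the equivalent (a sketch; the script path, log path, and 07:00 schedule are placeholder choices):

Shell

# Hypothetical crontab line: fire the scrape request every day at 07:00.
# scrape-amazon.sh would contain the curl command from Step 2 unchanged.
0 7 * * * /home/me/scrape-amazon.sh >> /var/log/scrape-amazon.log 2>&1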

Getting the Data and Conclusion

That’s all! To test everything, make sure that both scenarios are turned on, go to the request-sending scenario, and click the blue ‘Play’ button.

You can follow the scrape’s progress in the Scrapezone dashboard.

As soon as the scrape is done, the results are sent to your email address and are also available to download from the dashboard.

This guide is an example of a pretty straightforward web scraping scenario, but the same approach can easily be applied to more complex procedures and periodic crawls. I hope it gives you some ideas of what can be done with this combination of tools.

Opinions expressed by DZone contributors are their own.
