Extract Wine Price Data

Extract Wine Price Data

Oct 19, 2021, 12:47:07 PM Business
Introduction

In this blog, we will discuss about how to build a web scraper that will get latest delivery status and price for liquor from local wine and different store.

At RetailGators, we can scrape the following data fields from total wine & wine store:

  • Name of Wine
  • Pricing of Wine
  • Size of Quantity
  • Stock of Liquor
  • Delivery Available or
  • URL of Website

data-field

We can save data in CSV or Excel format.

sample-data

It is Mandatory to Install-Package to Route Total Wine and Other Web Store Scraper

We can use Python 3 for libraries and this you can do in Cloud or VPS or a Raspberry Pi.

We can easily use these libraries: -

  • Python Request is for making various request to download HTML content. (http://docs.python-requests.org/en/master/user/install/)
  • Selectorlib for extracting data using the YAML file we have developed from different websites that we have downloaded.
  • Easily Install them with pip3.

Installing Request for pip3 selectorlib

Python Code

Contact us for full code which is use in this Blog.

https://www.retailgators.com/

You can make the file name products.py or you can paste the Python code is given in it.

from selectorlib import Extractor

import requests

import csv

e = Extractor.from_yaml_file('selectors.yml')

def scrape(url):

headers = {

'authority': 'www.totalwine.com',

'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.122 Safari/537.36',

'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',

'referer': 'https://www.totalwine.com/beer/united-states/c/001304',

'accept-language': 'en-US,en;q=0.9',

}

r = requests.get(url, headers=headers)

return e.extract(r.text, base_url=url)

with open("urls.txt",'r') as urllist, open('data.csv','w') as outfile:

writer = csv.DictWriter(outfile, fieldnames=["Name","Price","Size","InStock","DeliveryAvailable","URL"],quoting=csv.QUOTE_ALL)

writer.writeheader()

for url in urllist.read().splitlines():

data = scrape(url)

if data:

for r in data['Products']:

writer.writerow(r)

The Code can do below mention things: -

  • You can easily read the list of URLs and Wines from the file name urls.txt (This file contains URLs for TWM products page like Scotch, Beer, & Wines, etc.)
  • Using selectorlib YAML file, we can identify Total Wine pages’ data in the file name selectors.yml (Want to know more, how you can create the file you will come to know in this Blog).
  • Extract the Data
  • Download data in CSV Spreadsheet layout data.csv name.
Create A YAML File Name Selectors.Yml

Products:

from selectorlib import Extractor

css: article.productCard__2nWxIKmi

multiple: true

type: Text

children:

Price:

css: span.price__1JvDDp_x

type: Text

Name:

css: 'h2.title__2RoYeYuO a'

type: Text

Size:

css: 'h2.title__2RoYeYuO span'

type: Text

InStock:

css: 'p:nth-of-type(1) span.message__IRMIwVd1'

type: Text

URL:

css: 'h2.title__2RoYeYuO a'

type: Link

DeliveryAvailable:

css: 'p:nth-of-type(2) span.message__IRMIwVd1'

type: Text

Run Total Wine Store As Well As More Scraper

You need to add URL that require to extract the text file name URLs.txt with same folder.

In this urls.txt file,

https://www.totalwine.com/spirits/scotch/single-malt/c/000887?viewall=true&pageSize=120&aty=0,0,0,0

After that you need to route scraper in given command:

python3 products.py

You May Face Some Problem Using Code As Well As Other Tools Or Copied From The Internet

some-problem


  • For Example, a website validates its structure like CSS Selector we use for price in the selectors.yaml file name price_1 JvDDp_x will change as time passes or every day.
  • The Website may block an IP address or IPs from the Proxy provider.
  • The Website may block the design of retrieving the script’s uses.
  • The Site May Block a User-Agent.
  • The Site May add new data points or transforms a new one.

There are many reasons for taking full-service from enterprise companies like RetailGators work best than other tools, DIY scripts, and products. RetailGators is a one stop solution for eCommerce Data Scraping. If you need any help with Liquor Shop Data Scraping Services, then you can contact us for all your quotes.

Published by Retailgators

Comments

Reply heres...

Login / Sign up for adding comments.