You may need a list of all hotels in your city for one reason or another. Most of them can be found on booking.com (assuming it's a city in Europe).
If you need hotel names, ratings and/or hotel URLs for a city, you can crawl Booking for them. Coding it with Python and Selenium is pretty easy. Below is a script that collects hotel names, booking.com hotel URLs and ratings for the city of Vienna. The list is finally saved to a JSON file.
Go crazy with it…
#! /usr/bin/python
# coding: utf-8
__author__ = "selfconstruct3d"
__date__ = "$Jun 17, 2016 11:41:36 PM$"

# this script is used to collect basic hotel-info from booking.com
# hotel name, url, and user rating are extracted and saved to json file
import json

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()

# output dict, keyed by hotel url
hotelsDict = dict()
# pagination offset (the search url below returns 15 rows per page)
booking_list_offset = 0
CITY_NAME = "Vienna"

for i in range(1, 80):
    # just paste booking.com link with city entered. arrival and departure dates are not inserted
    driver.get('http://www.booking.com/searchresults.de.html?dcid=1&label=gen173nr-1DCAEoggJCAlhYSDNiBW5vcmVmaBKIAQGYAQe4AQrIAQzYAQPoAQGoAgM&lang=de&sid=e8b897b588f56aa2e25913117df47bcc&sb=1&src=searchresults&src_elem=sb&error_url=http%3A%2F%2Fwww.booking.com%2Fsearchresults.de.html%3Flabel%3Dgen173nr-1DCAEoggJCAlhYSDNiBW5vcmVmaBKIAQGYAQe4AQrIAQzYAQPoAQGoAgM%3Bsid%3De8b897b588f56aa2e25913117df47bcc%3Bdcid%3D1%3Bclass_interval%3D1%3Bdest_id%3D-1746443%3Bdest_type%3Dcity%3Bgroup_adults%3D2%3Bgroup_children%3D0%3Bhlrd%3D0%3Blabel_click%3Dundef%3Bno_rooms%3D1%3Boffset%3D0%3Breview_score_group%3Dempty%3Broom1%3DA%252CA%3Bsb_price_type%3Dtotal%3Bscore_min%3D0%3Bsrc%3Dindex%3Bsrc_elem%3Dsb%3Bss%3DBerlin%252C%2520Berlin%2520%2528Bundesland%2529%252C%2520Deutschland%3Bss_raw%3Dber%3Bssb%3Dempty%26%3B&ss=Wien%2C+Wien+%28Bundesland%29%2C+%C3%96sterreich&ssne=Berlin&ssne_untouched=Berlin&city=-1746443&room1=A%2CA&no_rooms=1&group_adults=2&group_children=0&ss_raw=wien&ac_popular_badge=1&ac_position=0&ac_langcode=de&dest_id=-1995499&dest_type=city&ac_pageview_id=d2db9ad66c2d0283&ac_suggestion_list_length=5&ac_suggestion_theme_list_length=1&rows=15&offset='+str(booking_list_offset))
    hotelUrls = driver.find_elements(By.CSS_SELECTOR, "a.hotel_name_link.url")
    # collected but not used below; the name is read from the link text instead
    hotelNames = driver.find_elements(By.CSS_SELECTOR, "span.sr-hotel__name")
    hotelRatings = driver.find_elements(By.CSS_SELECTOR, "span.average.js--hp-scorecard-scoreval")
    # zip pairs links and ratings by position, so this assumes every listed hotel shows a rating
    for hotelurl, hotelRating in zip(hotelUrls, hotelRatings):
        # get hotel name
        name = hotelurl.text
        # get url, without the query string
        url = hotelurl.get_attribute("href").split("?")[0]
        # get rating
        rating = hotelRating.text
        print("%s , %s , %s" % (url, name, rating))
        # set up dictionary structure
        hotelsDict[url] = {"name": name, "rating": rating}
    # increase offset to request the next page of results
    booking_list_offset += 15

# close the browser once all pages have been visited
driver.quit()

# save to json file
with open("crawlbooking-" + CITY_NAME + "-hotel-urls-ratings.json", "w") as f:
    json.dump(hotelsDict, f)
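
Once the JSON file exists, you will probably want to do something with it. Here is a minimal sketch that reads the file back and lists the best-rated hotels. The helper names (parse_rating, top_hotels, load_hotels) are made up for this example, and it assumes the ratings were saved as text such as "8,5" or "8.5" (the German results page may use a decimal comma); entries whose rating is not a number are skipped.

```python
import json


def parse_rating(text):
    """Turn a rating string such as '8,5' or '8.5' into a float, or None if it is not a number."""
    try:
        return float(text.strip().replace(",", "."))
    except (AttributeError, ValueError):
        return None


def top_hotels(hotels, n=10):
    """Return up to n (name, rating, url) tuples, highest rating first."""
    rows = []
    for url, info in hotels.items():
        rating = parse_rating(info.get("rating", ""))
        if rating is not None:
            rows.append((info.get("name", ""), rating, url))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:n]


def load_hotels(path):
    """Load the url -> {"name", "rating"} dictionary written by the crawler."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Example usage (file name as written by the script above):
# hotels = load_hotels("crawlbooking-Vienna-hotel-urls-ratings.json")
# for name, rating, url in top_hotels(hotels, n=5):
#     print(rating, name, url)
```

Converting the rating to a number at read time keeps the saved file exactly as it was scraped, so nothing is lost if the rating format turns out to be different from what is assumed here.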
