A web scraping program that opens Google, searches for "Python", and collects the links
from the first four pages of search results into a list.
Requirements to run the program:
- Python3
- Python Selenium package
- A WebDriver binary (this program uses ChromeDriver)
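The requirements above can be checked before running the scraper. The following is a minimal sketch, not part of the original program: the function name missing_requirements is hypothetical, and it assumes the driver binary is named chromedriver and sits on PATH (a driver kept in the working directory, as in the program below, would not be detected by this check).

```python
import importlib.util
import shutil
import sys


def missing_requirements(driver_name="chromedriver"):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    if sys.version_info < (3,):
        problems.append("Python 3 is required")
    # find_spec returns None when the package cannot be imported
    if importlib.util.find_spec("selenium") is None:
        problems.append("selenium package is not installed (pip install selenium)")
    # shutil.which searches PATH only (assumption: the driver is installed there)
    if shutil.which(driver_name) is None:
        problems.append(driver_name + " was not found on PATH")
    return problems


for problem in missing_requirements():
    print("Missing:", problem)
```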
Steps for web scraping:
- Import webdriver from selenium.
- Create a driver object, giving it the path to your WebDriver binary.
- Open the web page with the driver's get method.
- Find the Google search box with the find_element method and enter the keyword you want to search for.
- Finally, collect all the links in the Google search results.
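To see what the last step selects without opening a browser, the same kind of XPath can be tried offline. This is only an illustrative sketch: SAMPLE_PAGE is made-up, simplified markup (not real Google HTML), extract_result_links is a hypothetical helper, and the standard library's ElementTree is used as a stand-in for Selenium because it understands a small XPath subset.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for a results page (not real Google markup)
SAMPLE_PAGE = """
<html><body>
  <div class="r"><a href="https://www.python.org/">Python</a></div>
  <div class="r"><a href="https://pypi.org/">PyPI</a></div>
  <div class="other"><a href="https://example.com/ad">Ad</a></div>
</body></html>
"""


def extract_result_links(markup):
    """Return the href of every <a> that is a direct child of <div class="r">."""
    root = ET.fromstring(markup.strip())
    # The leading "." makes the search relative to the root element
    anchors = root.findall(".//div[@class='r']/a")
    return [a.get("href") for a in anchors if a.get("href")]


print(extract_result_links(SAMPLE_PAGE))
# ['https://www.python.org/', 'https://pypi.org/']
```

The link inside the div with class "other" is skipped, which is the same filtering the program relies on when it asks for anchors under div elements of class "r".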
Python Program
## Import statements
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

# Selenium 4 style: the driver path is passed through a Service object
driver = webdriver.Chrome(service=Service("./chromedriver"))  # Path to ChromeDriver
driver.get("https://www.google.com")  # Open Google in the Chrome browser
search_box = driver.find_element(By.XPATH, '//input[@name="q"]')  # Find the Google search box
search_box.send_keys("Python")  # Enter the keyword "Python"
search_box.send_keys(Keys.ENTER)  # Press Enter to submit the search
pages_links = []

## Get links from 4 pages of the Google search results
## (the class names in the XPaths come from Google's markup and may change)
for i in range(4):
    # Pagination links at the bottom of the results page
    page_no = driver.find_elements(By.XPATH, "//table[@class='AaVjTc']/tbody/tr/td/a")
    page_no[i].click()  # Go to that page
    value = driver.find_elements(By.XPATH, '//div[@class="r"]/a')  # Result links on that page
    ## Collect the href value of every link in the search results
    for each_val in value:
        link = each_val.get_attribute("href")
        pages_links.append(link)

print("Complete links of four pages", "\n".join(pages_links))
driver.quit()  # Close the browser window
Output:
DevTools listening on ws://127.0.0.1:64359/devtools/browser/9c3dcfc0-5d6c-419c-ac43-2e9998e503e4
Complete links of four pages https://developers.google.com/edu/python
https://github.com/python
https://github.com/python/cpython
https://www.coursera.org/specializations/python
https://www.coursera.org/learn/interactive-python-1
https://www.kaggle.com/learn/python
https://www.pluralsight.com/paths/python
https://marketplace.visualstudio.com/items?itemName=ms-python.python
https://www.youtube.com/watch?v=rfscVS0vtbw
https://www.geeksforgeeks.org/python-programming-language/
https://www.python.org/
https://en.wikipedia.org/wiki/Python_(programming_language)
https://www.w3schools.com/python/
https://codeinstitute.net/blog/what-is-python-used-for/
https://thehelloworldprogram.com/python/why-python-should-be-the-first-programming-language-you-learn/
https://medium.com/@trungluongquang/why-python-is-popular-despite-being-super-slow-83a8320412a9
https://support.datacamp.com/hc/en-us/articles/360038816113-Is-Python-free-
https://www.codecademy.com/learn/learn-python
https://www.tutorialspoint.com/python/index.htm
https://www.learnpython.org/
https://www.anaconda.com/products/individual
https://www.udacity.com/course/introduction-to-python--ud1110
https://www.infoworld.com/article/3204016/what-is-python-powerful-intuitive-programming.html
https://stackoverflow.com/questions/tagged/python
https://opensource.com/resources/python
https://pandas.pydata.org/
http://www.pythontutor.com/
https://devblogs.microsoft.com/python/
https://docs.python-guide.org/intro/learning/
https://www.activestate.com/products/python/
https://realpython.com/
https://www.programiz.com/python-programming/online-compiler/
https://aws.amazon.com/developer/language/python/
https://pypi.org/
https://www.datacamp.com/courses/intro-to-python-for-data-science
https://www.raspberrypi.org/documentation/usage/python
https://python.swaroopch.com/
https://www.edx.org/learn/python
https://google.github.io/styleguide/pyguide.html
https://www.udemy.com/topic/python/
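Once pages_links has been collected, it can be tidied with the standard library alone. The sketch below is an optional follow-up, not part of the original program: summarize_links is a hypothetical helper, and the sample list reuses a few links from the output above with one duplicate added on purpose for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit


def summarize_links(links):
    """Drop duplicate links (keeping first-seen order) and count links per host."""
    unique = list(dict.fromkeys(links))
    hosts = Counter(urlsplit(link).netloc for link in unique)
    return unique, hosts


# Sample input: links taken from the output above, plus one deliberate duplicate
sample = [
    "https://github.com/python",
    "https://github.com/python/cpython",
    "https://www.python.org/",
    "https://www.python.org/",
]
unique, hosts = summarize_links(sample)
print(len(unique))           # 3
print(hosts.most_common(1))  # [('github.com', 2)]
```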
