How to scrape data from a website using Python
Web scraping, also called web harvesting or web data extraction, is the automated extraction of data from websites. There are several ways to scrape a website, such as online services, APIs, or writing your own code. In this article, we’ll see how to implement web scraping with Python. We will use one of the websites I have built.
I will skip the installation of Python in this tutorial.
Using your preferred text editor, create a Python file and name it whatever you want. I'll name mine scrapper.py.
We'll import all the libraries that we'll need to build our scraper. A library is a collection of reusable routines that a program can use. In Python, libraries are organized into modules and packages that you load with an import statement.
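The import listing itself is not shown here, so the following is only a sketch of what it might look like. It assumes the third-party requests and beautifulsoup4 packages (both installable with pip) and the standard-library csv module, which match the steps described in the rest of the article.

```python
# Standard library: used later to write the place names to a CSV file.
import csv

# Third-party packages (assumed): pip install requests beautifulsoup4
import requests                 # fetches pages over HTTP
from bs4 import BeautifulSoup   # parses the HTML into a "soup" object
```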
Now let's get the URL of the website we want to scrape. In this case, we'll use https://windhoeknamibia.github.io
Using our URL, we'll now use the requests library to fetch data from the website.
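A minimal sketch of the fetch step, assuming the requests package is installed. The names URL and fetch_html, and the 10-second timeout, are illustrative choices rather than part of the original script.

```python
import requests

# URL taken from the article; the variable and function names are illustrative.
URL = "https://windhoeknamibia.github.io"


def fetch_html(url, timeout=10):
    """Download a page and return its HTML as text."""
    response = requests.get(url, timeout=timeout)
    # Raise an exception on 4xx/5xx responses instead of parsing an error page.
    response.raise_for_status()
    return response.text


# Example call (needs network access):
# html = fetch_html(URL)
```

Wrapping the request in a function keeps the network call in one place, so the parsing steps can be tried on saved or sample HTML without hitting the site again.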
Create a soup object to get the title of the website.
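As a rough illustration of the title step, the sketch below parses a small stand-in HTML string so that it runs offline. In the real scraper the string would be the text fetched with requests. The sample markup and variable names are assumptions.

```python
from bs4 import BeautifulSoup

# Stand-in HTML so the example runs offline; in the scraper this would be
# the page text returned by requests.
html = "<html><head><title>Sample Town</title></head><body></body></html>"

# "html.parser" is Python's built-in parser, so no extra install is needed.
soup = BeautifulSoup(html, "html.parser")

# soup.title is the <title> tag, or None if the page has no title.
title = soup.title.get_text(strip=True) if soup.title else None
print(title)
```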
Create a soup object to find the places listed on the page.
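The markup of the page is not shown in this article, so the sketch below is built on a guess: it assumes each place name sits in an h3 element with a hypothetical class "place-name". Inspect the real page in your browser's developer tools and adjust the selector to match.

```python
from bs4 import BeautifulSoup

# Stand-in HTML with placeholder names; the tag and class are assumptions,
# not the real structure of the site.
html = """
<div class="place"><h3 class="place-name">Place A</h3></div>
<div class="place"><h3 class="place-name">Place B</h3></div>
"""


def extract_place_names(html, selector="h3.place-name"):
    """Return the text of every element matching a CSS selector."""
    soup = BeautifulSoup(html, "html.parser")
    return [tag.get_text(strip=True) for tag in soup.select(selector)]


places = extract_place_names(html)
print(places)
```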
Write all the place names to a CSV file.
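One possible way to write the names out, using the standard-library csv module. The file name places.csv, the "place" header, and the placeholder names are assumptions made for the example.

```python
import csv


def write_places_csv(place_names, path="places.csv"):
    """Write one place name per row, under a single header column."""
    # newline="" stops the csv module from adding blank rows on Windows.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["place"])
        for name in place_names:
            writer.writerow([name])


# Placeholder data; in the scraper this would be the list of scraped names.
write_places_csv(["Place A", "Place B"])
```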
Let's create a soup object to help us get all the image src links.
Now let's print out all the image links.
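The two image steps might look roughly like this. The sample HTML is a stand-in so the code runs offline. Converting relative paths to absolute URLs with urljoin is an optional extra that is not mentioned in the original steps.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://windhoeknamibia.github.io"

# Stand-in HTML; the real page would come from the fetch step.
html = """
<img src="images/a.jpg" alt="A">
<img src="/images/b.png" alt="B">
<img alt="no src attribute">
"""

soup = BeautifulSoup(html, "html.parser")

# src=True keeps only the <img> tags that actually have a src attribute.
image_links = [img["src"] for img in soup.find_all("img", src=True)]

for link in image_links:
    # urljoin resolves relative paths against the site's base URL.
    print(urljoin(BASE_URL, link))
```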
I hope this article helped you understand web scraping and how to use Python libraries to scrape websites. You can continue with more practical examples using different websites.
Be careful not to scrape data from websites that do not give you permission to do so.
To know whether a website allows web scraping or not, you can look at the website’s “robots.txt” file. You can find this file by appending “/robots.txt” to the root URL of the website that you want to scrape.
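The same check can also be done from Python with the standard-library urllib.robotparser module. The sketch below parses a made-up set of rules so that it runs offline. The rules and the example.com URLs are illustrative only and are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Example rules for illustration only; not copied from a real robots.txt.
sample_robots = """
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(sample_robots.splitlines())

# can_fetch(user_agent, url) reports whether the rules allow fetching the URL.
allowed_home = parser.can_fetch("*", "https://example.com/index.html")
allowed_private = parser.can_fetch("*", "https://example.com/private/data.html")
print(allowed_home, allowed_private)

# For a live site, point the parser at the real file instead (needs network):
# parser = RobotFileParser("https://example.com/robots.txt")
# parser.read()
```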
To do!
Try to write all the image src links into your CSV file.
Happy coding!