Always wanted to get some data from the web in a programmatic way? Well, check out my recent post in the Domino Data Blog where I discuss how to get data with the help of Beautiful Soup.
The aim is to show how we can create a script that grabs the pages we are interested in and obtains the information we are after. In the post I cover how to complete these steps:
- Identify the webpage with the information we need
- Download the source code
- Identify the elements of the page that hold the information we need
- Extract and clean the information
- Format and save the data for further analysis
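
To give a flavour of what those steps can look like, here is a minimal sketch with Beautiful Soup. It is not the code from the post: the sample HTML, the `prices` table id, and the helper names `download_source`, `extract_rows` and `to_csv` are all made up for illustration. The download helper is defined but never called, so the example runs offline against the inline sample page.

```python
import csv
import io
from urllib.request import urlopen

from bs4 import BeautifulSoup

# Hypothetical sample page, standing in for a real downloaded source.
SAMPLE_HTML = """
<html><body>
<table id="prices">
  <tr><th>Item</th><th>Price</th></tr>
  <tr><td> Apples </td><td>$1.20</td></tr>
  <tr><td> Pears </td><td>$0.95</td></tr>
</table>
</body></html>
"""


def download_source(url):
    """Download the source code of a page. Defined for completeness, not called here."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def extract_rows(html):
    """Find the element holding the data, then extract and clean each row."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id="prices")  # assumed id, for illustration only
    rows = []
    for tr in table.find_all("tr")[1:]:  # skip the header row
        item, price = [td.get_text(strip=True) for td in tr.find_all("td")]
        rows.append({"item": item, "price": float(price.lstrip("$"))})
    return rows


def to_csv(rows):
    """Format the cleaned rows as CSV text, ready to be saved for analysis."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["item", "price"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = extract_rows(SAMPLE_HTML)
csv_text = to_csv(rows)
print(csv_text)
```

On a real page you would pass the result of `download_source(url)` to `extract_rows` and adapt the selectors to whatever elements hold the information you need.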