5. Getting Comfortable with Different Kinds of Data Sources
Activity 5.01: Reading Tabular Data from a Web Page and Creating DataFrames
Solution:
These are the steps to complete this activity:
- Import BeautifulSoup and pandas by using the following commands:
    from bs4 import BeautifulSoup
    import pandas as pd
- Open the Wikipedia file by using the following command:
    fd = open("../datasets/List of countries by GDP (nominal) "
              "- Wikipedia.htm", "r", encoding="utf-8")
    soup = BeautifulSoup(fd)
    fd.close()
Note
Don't forget to change the path of the dataset (highlighted) based on its location on your system.
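As a hedged alternative to the explicit open/close pair above, a with block closes the file even if parsing raises an error, and naming the parser avoids the warning BeautifulSoup emits when it has to guess one. The sketch below writes a tiny stand-in HTML file to a temporary directory so that it runs anywhere; the file name sample.htm, its contents, and the variable names are hypothetical and are not the Wikipedia page used in this activity.

```python
import tempfile
from pathlib import Path

from bs4 import BeautifulSoup

# Hypothetical stand-in for the saved Wikipedia page (not the real file).
sample_html = (
    "<html><head><title>Sample page</title></head>"
    "<body><table></table></body></html>"
)

with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = Path(tmp_dir) / "sample.htm"
    sample_path.write_text(sample_html, encoding="utf-8")

    # The with block closes the file automatically, even if parsing fails.
    with open(sample_path, "r", encoding="utf-8") as fd:
        # "html.parser" is Python's built-in parser; naming it explicitly
        # means BeautifulSoup does not have to guess which parser to use.
        sample_soup = BeautifulSoup(fd, "html.parser")

page_title = sample_soup.title.string
print(page_title)
```

With the real dataset, only the path passed to open would change; which parser gives the best results on the saved page is something to check rather than assume.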
- Count the tables in the page by using the following command:
    all_tables = soup.find_all("table")
    print("Total number of tables are {} ".format(len(all_tables)))
There are nine tables in total.
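To see how find_all arrives at a count like this, the following minimal sketch runs the same call on a small inline HTML string. The HTML, the id values, and the variable names are made up for illustration. It shows that find_all searches every descendant of the document, so a table nested inside another table is counted as well.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML with three tables, one nested inside another.
sample_html = """
<html><body>
  <table id="outer">
    <tr><td>
      <table id="inner"><tr><td>1</td></tr></table>
    </td></tr>
  </table>
  <table id="standalone"><tr><td>2</td></tr></table>
</body></html>
"""

sample_soup = BeautifulSoup(sample_html, "html.parser")

# find_all searches all descendants, so the nested table is counted too.
sample_tables = sample_soup.find_all("table")
table_ids = [table.get("id") for table in sample_tables]

print("Total number of tables: {}".format(len(sample_tables)))
print(table_ids)
```

The results come back in document order, which is why a total on a real page can be larger than the number of tables that are visually distinct.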
- Find the right table using the class attribute by using...