The Linux Rain Linux General/Gaming News, Reviews and Tutorials

Geolocation using Python

By Kalyani Rajalingham, published 09/02/2021 in Tutorials


Geolocating is the process of retrieving location-related information about a given IP address. And yes! It can be done using Python! So, let’s get right to it.

The first step is to retrieve the HTML of the lookup page:

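import requests
from bs4 import BeautifulSoup

# Page that hosts the geolocation lookup form
url = 'https://tools.keycdn.com/geo'
# .text returns the body of the response (the page's HTML) as a string
page_html = requests.get(url).text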

I chose this website because its lookup form submits a GET request, so the IP address can be passed directly in the URL. A website that expects a POST request would need different code. Next, we ask the user which IP address they want to look up:

ip = input("What IP do you want to enter?: ")

Next, we build a new URL. When you enter an IP address on the website, say 8.8.8.8, the site issues a GET request of the following form:

https://tools.keycdn.com/geo?host=8.8.8.8

Looking at this address, we see that the IP address comes right after “host=”. So let’s build the same kind of URL from the IP address the user just entered. Note that the variable is called ip rather than input, so that Python’s built-in input() function is not overwritten:

url_new = "https://tools.keycdn.com/geo?host=" + ip
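String concatenation works for a well-formed IPv4 address, but it passes typos straight into the URL. The sketch below is one possible way to validate the input first, assuming only IP literals (not hostnames) should be accepted. The names BASE_URL and build_geo_url are made up for this illustration, and it uses only the standard library.

```python
import ipaddress
from urllib.parse import urlencode

BASE_URL = "https://tools.keycdn.com/geo"  # endpoint used in this tutorial


def build_geo_url(raw_ip):
    """Return the lookup URL for raw_ip, or raise ValueError if it is not an IP."""
    # ip_address() accepts IPv4 and IPv6 literals and raises ValueError otherwise.
    ip = ipaddress.ip_address(raw_ip.strip())
    # urlencode() percent-encodes the value, which matters for the colons in IPv6.
    return BASE_URL + "?" + urlencode({"host": str(ip)})


print(build_geo_url("8.8.8.8"))
```

For 8.8.8.8 this produces the same URL as the concatenation above, while something like "not an ip" raises a ValueError before any request is made.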

Now that we’ve got the new URL, we can send the GET request:

session = requests.Session()
val = session.get(url_new).text
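The earlier remark about GET versus POST can be made concrete without touching the network. The sketch below only builds two request objects with the standard library's urllib (not the requests library used in this tutorial) and never sends them. The example.com address is a placeholder, not a real lookup service.

```python
from urllib.parse import urlencode
from urllib.request import Request

query = urlencode({"host": "8.8.8.8"})

# GET: the parameters travel in the URL itself.
get_req = Request("https://tools.keycdn.com/geo?" + query)

# POST: the same parameters travel in the request body instead.
# example.com is a placeholder, not a real lookup service.
post_req = Request("https://example.com/lookup", data=query.encode("ascii"))

# Nothing is sent; we only inspect what each request would carry.
print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.data)
```

With requests, the POST counterpart of session.get(url_new) would look roughly like session.post(some_url, data={"host": "8.8.8.8"}), where some_url and the field name depend entirely on the site in question.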

At this point, the whole HTML response page is stored in “val”. What does this mean? It means that the latitude, longitude, and other location information is all somewhere in “val”. We just need to extract it.

How do we retrieve it?

fin = BeautifulSoup(val, "html5lib")
fin2 = fin.find_all("dd", attrs={'class':'col-8 text-monospace'})

In this case, each piece of location-related information sits in a “dd” tag whose class attribute is “col-8 text-monospace” (that is, two classes: col-8 and text-monospace). So, I’m asking BeautifulSoup to find them all for me!
BeautifulSoup returns them as a list of tags:

[<dd class="col-8 text-monospace">United States (US)</dd>,
 <dd class="col-8 text-monospace">North America (NA)</dd>,
 <dd class="col-8 text-monospace">37.751 (lat) / -97.822 (long)</dd>,
 <dd class="col-8 text-monospace">2021-02-04 13:23:59 (America/Chicago)</dd>,
 <dd class="col-8 text-monospace">8.8.8.8</dd>,
 <dd class="col-8 text-monospace">dns.google</dd>,
 <dd class="col-8 text-monospace">GOOGLE</dd>,
 <dd class="col-8 text-monospace">15169</dd>]
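One detail worth knowing: when find_all() is given a class string with a space in it, BeautifulSoup compares it against the exact value of the class attribute, so the same two classes in a different order would not match. The sketch below illustrates the alternative rule (both classes present, in any order). It is not the tutorial's method: it uses the standard library's html.parser so it can run offline, and the HTML fragment is hand-made, with the second and third elements invented for the test.

```python
from html.parser import HTMLParser

WANTED = {"col-8", "text-monospace"}  # the two classes named in the tutorial


class DDMatcher(HTMLParser):
    """Record the class attribute of every <dd> that carries all WANTED classes."""

    def __init__(self):
        super().__init__()
        self.matched = []

    def handle_starttag(self, tag, attrs):
        class_attr = dict(attrs).get("class") or ""
        # Set comparison: true when every wanted class is present, in any order.
        if tag == "dd" and WANTED <= set(class_attr.split()):
            self.matched.append(class_attr)


# Hand-made fragment; only the first element is taken from the real result.
sample = (
    '<dd class="col-8 text-monospace">United States (US)</dd>'
    '<dd class="text-monospace col-8">dns.google</dd>'
    '<dd class="col-4">ignored</dd>'
)

matcher = DDMatcher()
matcher.feed(sample)
matcher.close()
print(matcher.matched)
```

In BeautifulSoup itself, a CSS selector such as fin.select("dd.col-8.text-monospace") should give the same order-independent behaviour, which may be more robust if the site ever reorders its classes.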

Now that’s great! But how do we get the text inside the tags? We can call the get_text() method on each tag.

for value in fin2:
    value = value.get_text()
    print(value)

This prints the following for 8.8.8.8:

United States (US)
North America (NA)
37.751 (lat) / -97.822 (long)
2021-02-04 13:26:23 (America/Chicago)
8.8.8.8
dns.google
GOOGLE
15169
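The printed values are plain strings, so anything numeric still has to be parsed. As a minimal sketch, assuming the coordinates always appear in the "... (lat) / ... (long)" form shown above, the latitude and longitude can be pulled out as floats with a regular expression. The names COORDS and find_coordinates are made up for this illustration.

```python
import re

# A few of the lines printed above for 8.8.8.8.
lines = [
    "United States (US)",
    "North America (NA)",
    "37.751 (lat) / -97.822 (long)",
    "8.8.8.8",
]

# Two optionally negative decimal numbers, labelled (lat) and (long).
COORDS = re.compile(r"(-?\d+(?:\.\d+)?) \(lat\) / (-?\d+(?:\.\d+)?) \(long\)")


def find_coordinates(values):
    """Return (latitude, longitude) as floats, or None if no line matches."""
    for value in values:
        match = COORDS.search(value)
        if match:
            return float(match.group(1)), float(match.group(2))
    return None


print(find_coordinates(lines))
```

Searching every line rather than relying on a fixed position keeps the sketch working even if the site returns a different number of fields for another IP address.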

And that’s how it’s done!

Happy Coding!



About the author

Kalyani Rajalingham is from Sri Lanka, lives in Canada, and is a Linux and code lover.

Tags: geo-location python scraping tutorials