Inverse Distance Weighting Interpolation in Python

This article will teach us how to do IDW interpolation in Python.

Pareekshith Katti

Updated Sep. 28, 22 · Tutorial

Likes (1)

Comment

Save

3.0K Views

Geospatial interpolation is a process used to estimate values for unknown points in a geographical area using known values.

Inverse Distance Weighting, or IDW for short, is one of the most popular methods used for geospatial data interpolation. This article will teach us how to do IDW interpolation in Python.

The IDW interpolation method assumes that the closer values are more related than the farther ones. IDW estimates the value at an unknown point by using the known values weighted by their distance from the point whose value is to be estimated.

We can also use another variable called "power" in IDW to control the influence of known values. Higher power will increase the influence of nearby values.

Here is a simple DIY implementation of IDW in Python.

    Python
   
   import numpy as np

def idw_custom(dist,val,power):

    numerator=np.sum(val/(dist**power))

    weights=np.sum(1/(dist**power))

    return numerator/weights

dist=np.array([1.38,3.04,7.36,7.59,9.67])

val=np.array([17.7,15.7,34.3,6.3,23.3])

print(idw_custom(dist,val,1))

The numerator contains the sum of the known values divided by distance, and the denominator includes the sum of the inverse of distances. The power can be used to control the influence of known values.

Using Built-in Libraries to Perform IDW Interpolation

In the above example, we used precalculated distances, but in most practical use cases, we have to calculate the distance ourselves. We can use the haversine distance to calculate it, but when we have a lot of points, this can be cumbersome. This is where we can use preexisting libraries that calculate the distance and perform the interpolation.

In our example, we will be interpolating PM2.5 values from air quality data for the city of Bengaluru.

    Python
   
   grid_space = 0.01
grid_lon = np.arrange(np.amin(lons), np.amax(lons), grid_space)
grid_lat = np.arrange(np.amin(lats), np.amax(lats), grid_space)

Let us first generate a grid of points for which we need to estimate values. We've set approximately 1 km of grid space.

"lons" contains a list of longitudes and "lats" contains a list of latitudes. We've generated the grid using the min and max of longitudes and latitudes.

    Python
   
 

   all_lats = np.meshgrid(grid_lon, grid_lat)[1].ravel()
all_lons = np.meshgrid(grid_lon, grid_lat)[0].ravel()
itrp=pd.DataFrame()
itrp['lat']=all_lats
itrp['lng']=all_lons
  

In the above code, we created a data frame containing the pairs of latitude and longitude for which we need to estimate values. We can also use ‘for loop’ to do the same, as shown in the code below.

    Python
   
 

   lat=[]
lng=[]
for i in range(len(grid_lat)):
    for j in range(len(grid_lon)):
        lat.append(grid_lat[i])
        lng.append(grid_lon[j])
itrp=pd.DataFrame()
itrp['lat']=lat
itrp['lng']=lng
  

We can use Sklearn's KNN implementation to simulate IDW. The code to do it is given below.

    Python
   
   x=sample[['lat','lng']]
y=sample['PM2.5']
from sklearn.neighbors import KNeighborsRegressor
model=KNeighborsRegressor(algorithm='kd_tree',n_neighbors=8,weights='distance').fit(x,y)

The "sample" dataframe contains the station air quality data for a single timestamp. We give latitude and longitude as explanatory variables and PM2.5 as the variable that needs to be interpolated. We should use "kd_tree" as the algorithm and set "n_neighbors" as the number of stations, which in this case is eight. We should also set "weights" as the distance to perform IDW.

    Python
   
   pred=model.predict(itrp[['lat','lng']])

We will use the predict method to estimate the values for our grid points, which are stored in the itrp dataframe.

We'll now load some shapefiles to help us visualize the interpolation results.

    Python
   
 

   data=gpd.read_file('Taluk_Boundary.shp')
data=data[data['KGISDistri']==20].iloc[0:4]
itrp=gpd.GeoDataFrame(itrp,geometry=gpd.points_from_xy(itrp.lng, itrp.lat))
stn=gpd.GeoDataFrame(stn,geometry=gpd.points_from_xy(stn.lng, stn.lat))
sample=gpd.GeoDataFrame(sample,geometry=gpd.points_from_xy(sample.lng, sample.lat))
sample.crs={'init' :'epsg:4326'}
sample=sample.to_crs(data.crs)
stn.crs={'init' :'epsg:4326'}
stn=stn.to_crs(data.crs)
itrp.crs={'init' :'epsg:4326'}
itrp=itrp.to_crs(data.crs)
  

The "data" contains the shapefile for the city of Bengaluru.

We converted "itrp," "sample," and "stn" to GeoDataFrame to plot the points.

Lastly, we set the coordinate reference system (CRS, for short) for all the newly created geodataframes. This should be the same as our shapefile "data.

The following is the result of the interpolation.

    Python
   
   ax=data.plot(color='white', edgecolor='black',figsize=(25,30))

itrp.plot(ax=ax, column='PM2.5', markersize=50,figsize=(25,30),legend=True)

sample.plot(ax=ax, marker='o', column='PM2.5', markersize=50,figsize=(25,30),label='SiteName')

for x, y, label in zip(sample.geometry.x, sample.geometry.y, sample.SiteName):

    ax.annotate(label.split(',')[0], xy=(x, y), xytext=(10, 10), textcoords="offset points")

We can also use other IDW implementations, one of which is shown below.

    Python
   
   from photutils.utils import ShepardIDWInterpolator as idw
vals=idw(sample[['lat','lng']].to_numpy(),sample['PM2.5'].to_numpy())
itrp['PM2.5']=vals(itrp[['lat','lng']])

Let us plot and see the results.

    Python
   
   ax=data.plot(color='white', edgecolor='black',figsize=(25,30))

itrp.plot(ax=ax, column='PM2.5', markersize=50,figsize=(25,30),legend=True)

sample.plot(ax=ax, marker='o', column='PM2.5', markersize=50,figsize=(25,30),label='SiteName')

for x, y, label in zip(sample.geometry.x, sample.geometry.y, sample.SiteName):

    ax.annotate(label.split(',')[0], xy=(x, y), xytext=(10, 10), textcoords="offset points")

As we can see, both methods' results are the same.

And that's the end of the tutorial. In this tutorial, we learned the basics of IDW by manually implementing IDW from scratch and using built-in libraries to perform IDW. Although there are a lot of other methods to perform interpolation, IDW is one of the easiest-to-understand and most robust methods. Hence, it's one of the most popular methods.

Interpolation Python (language)

Opinions expressed by DZone contributors are their own.

Related

Trending

Inverse Distance Weighting Interpolation in Python

This article will teach us how to do IDW interpolation in Python.

Using Built-in Libraries to Perform IDW Interpolation

Related

Partner Resources