Back in January, I gave a quick introduction on how to get started with PostGIS
In this post, I’m going to talk about how to do simple geospatial or geoprocessing tasks in PostGIS. With the upcoming native support coming in Tableau, I thought this would be a great opportunity to show the power and opportunity of PostGIS.
A little background
When I first got into geospatial 10+ years ago, the only software on the market was ESRI’s ArcMap. In the last few years, with the influx of open-source software, I have transitioned about 90% of my processes away from ESRI and to either QGIS or a database system.
When you are dealing with large datasets, you start to realize that ArcMap has performance issues when it comes to spatial joins and other geoprocessing tasks.
To get started, let’s look at some national park location data. I found a dataset on Kaggle that has the necessary location data.
First, you need to set up the table structure in PostGIS. I named the table national_parks and here’s how I set up the table:
Next, we have to load the data. Under tables, find your national_parks table and right click on it and follow the prompts. You after you import the data, the geom field will be empty.
To fill that geometry column in, we just have to run a simple update statement.
update national_parks set geom=st_SetSrid(st_MakePoint(“Longitude”, “Latitude”), 4326);
Now we have data in our database and we can start doing some basic PostGIS calculations.
Find places within x Distance
This is one of the most common calculations when it comes to analyzing data. We always want to know which places are within x location. With PostGIS this is pretty easy, it’s just a simple query. Let’s find the National parks within 1000 miles of Pittsburgh.
ST_DistanceSphere(‘SRID=4326;POINT(-80.0505401 40.431478)’::geometry,geom)/1609.344 as st_distanceSphere_in_miles
from public.national_parks a
where ST_DistanceSphere(‘SRID=4326;POINT(-80.0505401 40.431478)’::geometry,geom)/1609.344 <= 1000
order by st_distanceSphere_in_miles
The key part of this query is the ST_DistanceSphere function. This function calculates the distance in meters between two points. The 1609.344 is to convert meters to miles.
Point in Polygon
Let’s say that you are questioning which state each national park falls in. From the previous post, I showed how to import a shapefile, now we are going to do a special join to find which state the centroid falls in.
SELECT a.”Park_Name”, a.”State”, b.stusps, b.name
from public.national_parks a
inner join public.tl_2018_us_state b
on ST_WITHIN(a.geom, b.geom)
The ST_WITHIN function is the important part of this query. ST_WITHIN says when the point is within the state, then join the point with the state it falls in, else don’t.
Generate Random Points
Another cool thing you can do with PostGIS is to create random points within a state, county, block group, etc. The function is fairly simple:
SELECT stusps, name, geom, (ST_Dump(ST_GeneratePoints(geom, 1000))).geom::Geometry(point, 4326) AS geom_gen_points
where stusps = ‘PA’
What does this look like? (A little sneak peak to the Tableau 2019.2 Beta)
Or throwing them together on a dual-axis:
Why would you want to do this? A few reasons. Let’s say you wanted to create a dot density map, like Sarah Battersby did back in May 2018:
Or let’s say you have a 3 mile ring around a point, and you want to find the demographic attributes of that area. You could easily generate points, dump them, join them to the respective block group, etc, and then easily find the weighted demographics of that 3 mile ring.
This just scratches the surface of all the opportunity PostGIS. There are countless more use cases of PostGIS such as convex and concave hulls, and intersects and length. Maybe I’ll get into those next time.