Readit News
ronyfadel · 6 years ago
I'm currently working on a project that involves dataviz & mapping, and one of the things I've discovered along the way was WikiData's SPARQL query engine.

Mind = blown.

You can basically ask Wiki(pedia) almost anything you can think of (including the largest city in a bounding box) using the same query language.

Examples:

- Largest cities per country (https://w.wiki/UC4)

- Cities connected by the European route E40 (https://w.wiki/74E)

- Streets in France named after a woman (https://w.wiki/34K)
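
To make the "largest city in a bounding box" idea concrete, here is a minimal sketch of how such a query might be built. It is not one of the linked queries. It assumes the Wikidata identifiers Q515 (city), P31 (instance of), P279 (subclass of), P1082 (population) and P625 (coordinate location), plus the query service's `wikibase:box` helper. The function name is hypothetical, and the query has not been run here, so it may need tuning or may time out on large boxes.

```python
def largest_city_in_box_query(west, south, east, north):
    """Build a SPARQL query string for the most populous city in a lat/lon box.

    Corners are given in degrees; WKT points are written as "Point(lon lat)".
    """
    return f"""
SELECT ?city ?cityLabel ?population WHERE {{
  ?city wdt:P31/wdt:P279* wd:Q515 ;   # instance of (a subclass of) city
        wdt:P1082 ?population .       # population
  SERVICE wikibase:box {{
    ?city wdt:P625 ?location .        # coordinate location
    bd:serviceParam wikibase:cornerSouthWest "Point({west} {south})"^^geo:wktLiteral .
    bd:serviceParam wikibase:cornerNorthEast "Point({east} {north})"^^geo:wktLiteral .
  }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
ORDER BY DESC(?population)
LIMIT 1
""".strip()


if __name__ == "__main__":
    # The 10-degree box from 0 to 10 E and 50 to 60 N.
    print(largest_city_in_box_query(0, 50, 10, 60))
```

The printed text can be pasted into the query editor at query.wikidata.org. The sketch only builds the string and makes no network request.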

philshem · 6 years ago
You can even request a query from the community:

https://wikidata.org/wiki/Wikidata:Request_a_query

ronyfadel · 6 years ago
That's handy! To be honest, SPARQL is definitely not easy to use.
caf · 6 years ago
I quibble with the description of the shapes bounded by 10 degrees of longitude and latitude as "rectangles" - they're not even planar shapes, and their adjacent sides certainly aren't at right angles. Some of them don't even have four sides.

This bears considering when looking at the map, because some of those regions are much, much smaller than others.
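
For a sense of how much the regions differ, here is a small sketch that computes the area of a 10° by 10° cell at each latitude. It assumes a perfectly spherical Earth with a mean radius of 6371 km (an approximation) and uses the standard formula R² · Δλ · (sin φ_north − sin φ_south). The function and constant names are just for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; assumes a spherical Earth


def cell_area_km2(lat_south, lat_north, width_deg=10.0):
    """Area of a latitude/longitude cell on a sphere, in square kilometres."""
    dlon = math.radians(width_deg)
    return EARTH_RADIUS_KM ** 2 * dlon * (
        math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))
    )


if __name__ == "__main__":
    for south in range(0, 90, 10):
        area = cell_area_km2(south, south + 10)
        print(f"{south:2d} to {south + 10:2d} deg: {area:12,.0f} km^2")
```

Under these assumptions a cell touching the equator covers about 11 times the area of a cell touching a pole, and the polar cells have only three sides because their poleward edge collapses to a point.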

paulirwin · 6 years ago
Mostly an aside to your comment, but it's somewhat related. This post's divisions of 10° latitude/longitude is pretty close to the Maidenhead grid system, used heavily by amateur radio operators like myself.

The difference is that each top-level grid "square" (to your point, not actually square, but that's what they're called) is 20° longitude by 10° latitude represented by two letters. While the computation of a 4 or more character grid locator code is complex enough that most people can't quite do that in their head, because of them being treated as if they were coordinate "rectangles", it is simple enough to translate lat/lon coordinates to a grid code and vice versa with pen and paper if needed in the case of an emergency, if you know the algorithm or have a reference sheet. The 10° latitude size means that the parallels are the same in the Maidenhead system (for the first two letters of a code) as in this post. It also has the added benefit of knowing that DN is north of DM, which are both west of EM. I've gotten to where I can roughly place someone I hear on the radio based on a mental map of grid codes, and have memorized many grid codes of large population centers.

The beauty of the grid code system is that you can further refine an area by adding on subsequent numbers and letters, much like degrees/minutes/seconds in coordinates, but requiring significantly fewer characters to read to others over the radio. And, you can use phonetics for the letters, i.e. "delta mike seven niner" (DM79) is roughly the entire Denver metro area. Fort Collins, CO on this parent post falls under DN, above that 10°-sized parallel.

More info on the Maidenhead system: https://en.wikipedia.org/wiki/Maidenhead_Locator_System

Map with two-letter grid codes: https://www.mapability.com/ei8ic/maps/gridworld.php
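
The lat/lon to grid code translation described above can be sketched in a few lines. This is a minimal version covering only the first six characters of a locator. The function name is hypothetical and the sample coordinates for Denver and Fort Collins are approximate.

```python
def maidenhead(lat, lon, precision=4):
    """Return a Maidenhead locator with 2, 4, or 6 characters."""
    if precision not in (2, 4, 6):
        raise ValueError("precision must be 2, 4, or 6")
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("coordinates out of range")

    # Shift the origin to the South Pole / antimeridian so both values are
    # non-negative. The tiny clamp keeps lat=90 or lon=180 in the last cell.
    x = min(lon + 180.0, 360.0 - 1e-9)
    y = min(lat + 90.0, 180.0 - 1e-9)

    # Field: 18 x 18 cells of 20 deg longitude by 10 deg latitude, letters A-R.
    code = chr(ord("A") + int(x // 20)) + chr(ord("A") + int(y // 10))
    if precision >= 4:
        # Square: each field splits into 10 x 10 cells of 2 deg by 1 deg.
        code += str(int((x % 20) // 2)) + str(int(y % 10))
    if precision == 6:
        # Subsquare: each square splits into 24 x 24 cells, letters a-x.
        code += chr(ord("a") + int((x % 2) * 12)) + chr(ord("a") + int((y % 1) * 24))
    return code


if __name__ == "__main__":
    print(maidenhead(39.74, -104.99))       # Denver (approximate) -> DM79
    print(maidenhead(40.585, -105.084, 2))  # Fort Collins (approximate) -> DN
```

Longitude always comes first in each pair of characters, which is why the first letter tells you east/west and the second north/south.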

NopeNotToday · 6 years ago
He's not making a mathematical definition of the areas. 'area' or 'box' is a simple description that everyone can understand. And the map projection used does form squares.
chrisseaton · 6 years ago
> And the map projection used does form squares.

Some of those squares have zero-length edges.

beerandt · 6 years ago
Quadrangle is the proper term (at least for four-sided boundaries). Hence the common name for USGS topographic maps: quads.
tantalor · 6 years ago
> their adjacent sides certainly aren't at right angles

They certainly are. Parallels and meridians always meet at 90 degrees.

https://www.geogebra.org/m/kNtNZzsS
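
A quick numeric check of this, as a sketch: take the angle between two curves on a surface to be the angle between their tangent vectors at the crossing point (the usual differential-geometry definition), parameterise the unit sphere by latitude and longitude, and compare the two tangents. The function names are made up for illustration.

```python
import math


def tangent_vectors(lat_deg, lon_deg):
    """Tangents to the meridian and the parallel through a point on the unit sphere."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    # Derivative of (cos lat cos lon, cos lat sin lon, sin lat) with respect to lat:
    along_meridian = (
        -math.sin(lat) * math.cos(lon),
        -math.sin(lat) * math.sin(lon),
        math.cos(lat),
    )
    # ... and with respect to lon:
    along_parallel = (
        -math.cos(lat) * math.sin(lon),
        math.cos(lat) * math.cos(lon),
        0.0,
    )
    return along_meridian, along_parallel


def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


if __name__ == "__main__":
    for lat, lon in [(0, 0), (40.6, -105.1), (59.9, 10.8), (-33.9, 151.2)]:
        meridian, parallel = tangent_vectors(lat, lon)
        print(lat, lon, dot(meridian, parallel))  # zero up to rounding
```

The dot product vanishes everywhere away from the poles. At the poles themselves the parallel shrinks to a point, its tangent has zero length, and the angle is undefined.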

roelschroeven · 6 years ago
I'm not 100% sure about that. Are angles between circles on a sphere defined by the angles between their tangent lines? If it were true, you could for each point/meridian combination construct an infinite number of different circles that would all meet at right angles with that meridian (just vary the circle's radius). That doesn't feel quite right.

According to Wikipedia [1], "in spherical geometry, angles are defined between great circles". Meridians are great circles, but parallels are not. Possibly the angle is simply not defined for circles that are not great circles?

[1] https://en.wikipedia.org/wiki/Spherical_geometry

doersino · 6 years ago
Right angles with regard to the latitude-longitude coordinate system, but not with regard to the surface of the earth. Recall that all meridians intersect at the poles.
caf · 6 years ago
Thank you for the correction.
mjd · 6 years ago
Since you want to quibble, I'll just point out that the word “rectangles” does not appear in the article.
areyousure · 6 years ago
phreeza · 6 years ago
Maybe the author should have gone with the faces of a Disdyakis triacontahedron
dllu · 6 years ago
The dual of a geodesic polyhedron, e.g. an icosahedron whose faces have been subdivided is in fact a common and excellent choice for geospatial applications, with https://github.com/uber/h3 being a good implementation.

You can also subdivide the faces of a cube into smaller squares with a quadtree, e.g. https://s2geometry.io/

The Disdyakis triacontahedron's faces are probably too long and skinny for most geospatial applications though.

jsjohnst · 6 years ago
For anyone else not familiar with the exact shape:

https://en.wikipedia.org/wiki/Disdyakis_triacontahedron

lambdasquirrel · 6 years ago
Well it lends to an aliasing effect, that in Europe is even more comical. You completely lose the major cities of most countries like Ireland, Sweden, Denmark, and Poland. Central Europe is blotted out by Rome. The Romans finally conquer (most of) Germany. ;)
qw · 6 years ago
It also managed to include 5 Norwegian cities, but missed the largest city because it was in the same square as Berlin.
hammock · 6 years ago
You make a good point. What is an equal-area, tessellated shape that could be used in its place?

I guess a triangle. Or one of these:

https://upload.wikimedia.org/wikipedia/commons/thumb/6/67/Sp...

https://upload.wikimedia.org/wikipedia/commons/thumb/9/9b/Sp...

mjevans · 6 years ago
https://en.wikipedia.org/wiki/Geodesic_polyhedron

For just shapes the above is a nice visual start.

However a striking feature of the linked article is how frequently there's a major city right near a cell edge.

A more useful list might be constructed by some process involving assigning a "city" an imprint size based on population, and a surrounding 'metro area' based on absorbing any weaker cities that overlap until the process either repeats or is matched by a neighboring city.

A starting point for results that are already similar to that: https://en.wikipedia.org/wiki/Metropolitan_area

The US specific list: https://en.wikipedia.org/wiki/List_of_metropolitan_statistic...

joshvm · 6 years ago
The South Pole is a particularly interesting case. Clearly the Pole itself is 90 South and the choice of longitude is arbitrary there. However, Amundsen-Scott station, which is a hundred odd metres away, happens to be at ~140 E (89.997553 S!) and due to the map projection it appears as if it's in a totally different place.

It would be interesting to see an interactive version with a free choice of meridians.

oh_sigh · 6 years ago
But they are rectangles on the projection - they just aren't rectangles if you map the projection back onto the globe.

wcarey · 6 years ago
A fun thought experiment is to noodle through whether, given the method of choosing cities described, the choice of map projection would influence the resulting list of cities.
greggyb · 6 years ago
Latitude and longitude exist as boundaries outside the domain of map projections. A different projection may draw these lines or curves differently, but they still delineate the same geographic boundaries. E.g. a different projection may not show rectangles, but instead curved regions, but the cities existing in those regions would be the same.

If you are thinking of drawing arbitrary rectangles on arbitrary projections, then you'd have to come up with a rigorous way to define your rectangles. If you don't, then you're just drawing random shapes on other random shapes. Either way, yes, a different projection with the same rectangles drawn over it would yield different answers, and this should be immediately obvious.

jobigoud · 6 years ago
Even the exact same projection but a slightly shifted 0° meridian would yield a completely different list of cities.
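
A toy sketch of that effect: bin cities into 10° cells, keep the most populous one per cell, and then slide the grid's meridians by a few degrees. The function name, the `lon_offset` parameter and the population figures are illustrative assumptions (rough numbers, not taken from the article), and longitude wrap-around at ±180° is ignored.

```python
import math


def largest_city_per_cell(cities, cell_deg=10.0, lon_offset=0.0):
    """Return the sorted names of the most populous city in each grid cell.

    cities: iterable of (name, lat, lon, population).
    lon_offset: shifts where the grid's meridians fall, in degrees.
    """
    winners = {}
    for name, lat, lon, population in cities:
        cell = (math.floor(lat / cell_deg), math.floor((lon - lon_offset) / cell_deg))
        if cell not in winners or population > winners[cell][1]:
            winners[cell] = (name, population)
    return sorted(name for name, _ in winners.values())


# Rough, illustrative figures only.
CITIES = [
    ("Berlin", 52.52, 13.40, 3_600_000),
    ("Hamburg", 53.55, 9.99, 1_800_000),
    ("Stockholm", 59.33, 18.07, 1_000_000),
    ("Oslo", 59.91, 10.75, 700_000),
]

if __name__ == "__main__":
    print(largest_city_per_cell(CITIES))                  # ['Berlin', 'Hamburg']
    print(largest_city_per_cell(CITIES, lon_offset=5.0))  # ['Berlin', 'Stockholm']
```

With the grid on multiples of 10°, Oslo and Stockholm both lose to Berlin. Shift the meridians by 5° and Stockholm gets a cell of its own while Hamburg drops out.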

foota · 6 years ago
I think something more interesting might be like a Voronoi diagram of all cities such that the city is in like the top 5 within a 1000-mile radius or something (or just that but without the Voronoi part). This should preserve the fact that only the largest cities in some area are present, but eliminate the arbitrariness of the boundaries. I think you would want to compute this with a sweep line algorithm.
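
A minimal sketch of the "top 5 within 1000 miles" filter, without the Voronoi part. It uses a brute-force comparison of every pair of cities rather than the sweep line mentioned above, so it is quadratic in the number of cities. It also assumes a spherical Earth of radius 3958.8 miles and haversine distances. All names and defaults are illustrative.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean radius; assumes a spherical Earth


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def locally_dominant(cities, radius_miles=1000.0, top_k=5):
    """Keep cities that rank in the top_k by population within radius_miles.

    cities: list of (name, lat, lon, population). A city is kept when fewer
    than top_k strictly larger cities lie within the radius around it.
    """
    kept = []
    for name, lat, lon, population in cities:
        bigger_nearby = sum(
            1
            for _, other_lat, other_lon, other_pop in cities
            if other_pop > population
            and haversine_miles(lat, lon, other_lat, other_lon) <= radius_miles
        )
        if bigger_nearby < top_k:
            kept.append(name)
    return kept
```

Because each city is judged against its own neighbourhood, the result no longer depends on where any grid lines happen to fall.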
bnjmn · 6 years ago
This isn't exactly what you asked for, but it seems pretty relevant to your stated interests: https://www.jasondavies.com/maps/voronoi/capitals/
foota · 6 years ago
My head nearly exploded trying to position the globe in the rotation I wanted :-) Neat though, thanks for sharing!
incompatible · 6 years ago
A map projection that preserves area would be better, otherwise there's an excessive number of cities in the far northern hemisphere, and a lack of detail in the tropics.
toxik · 6 years ago
Meanwhile, neither Sweden’s nor Norway’s capital is in the map. But two smaller Norwegian cities are.

I think this just teaches you the issue with discretization.

Tempest1981 · 6 years ago
I like xkcd's description of Mercator: https://xkcd.com/977/
foota · 6 years ago
I think the idea is to show the most populous cities in some area, not the density of highly populated cities, otherwise I would agree.
mjd · 6 years ago
If you did it, I'd enjoy seeing it.
jl6 · 6 years ago
I wonder if London could be the largest city in two boxes. The zero meridian line goes through Greenwich, which leaves most of London in the western box, but a still sizeable chunk of East London in the eastern box.

Is East London bigger than Hamburg?

Are there any other large cities that straddle the bounding boxes?

colourgarden · 6 years ago
Brussels (c.1.2m) and Cologne (c.1.1m) are both in that box, I believe.

It seems the population of the parts of Greater London east of the Meridian is something around one million[1] so it would be close.

1. https://www.citypopulation.de/en/uk/greaterlondon/

sdflhasjd · 6 years ago
And Hamburg at 10°E is also split, so really the question is: Is East London bigger than West Hamburg
allengeorge · 6 years ago
I’m fairly sure Toronto is larger than Chicago...

EDIT: Yeah, back in 2013: https://www.thestar.com/news/city_hall/2013/03/05/torontos_p... - and I’m pretty sure it’s almost at or crossed the 3M mark (COVID may have tossed a wrench in the latter).

Ah. It’s because Toronto is in the same box as _New York_ - my bad!

griffinkelly · 6 years ago
Chicago is having significant population loss, so that gap will only grow. I believe Houston is about to pass Chicago.
hellofunk · 6 years ago
Why is Chicago losing people?
vikramkr · 6 years ago
A lot of comments are about metro areas and the like, such as how Jacksonville shows up because its city limits are larger than Atlanta's, but if you start getting into trying to define what a city's metro area is, you run into a lot of issues with giant powerful cities like NY. Can you really count the population of Newark, New Jersey as part of NYC, NY? Even though Newark is firmly within the NYC metro area, it's a separate city in a different state with its own government etc. And what do you do about the Bay? Is San Francisco part of San Jose? What about twin cities, like Dallas and Fort Worth? This probably makes the most sense as a way of doing this map, since at least it's clear-cut.
frogpelt · 6 years ago
Jacksonville is the largest U.S. city by area, in the contiguous 48 states.

Alaska has several cities with enormous boundaries.

Source: https://en.wikipedia.org/wiki/List_of_United_States_cities_b...

vikramkr · 6 years ago
Yes, that's exactly what I'm referring to as the problem - the city limits are arbitrary administrative lines. Areas that in Atlanta would not be counted, because they are simply suburbs, are counted towards the population of Jacksonville because its boundaries just happen to be huge. That means Jacksonville ends up larger than Atlanta, even though more people live in the Atlanta metro area, and would call themselves people from Atlanta, than live in the Jacksonville metro area.
thescriptkiddie · 6 years ago
One way to work around the arbitrariness of city limits would be to only count the population living within a fixed distance of the city center. I would suggest 5 or 6 km, which is approximately the distance that you can comfortably walk in 1 hour, and also (perhaps not coincidentally) the approximate radius of Paris. This would get you a list of dense cities, which is probably what people are imagining when they think of "large" cities.

Another way would be to count everyone as part of the population of whatever city they are physically closest to, without regard to political boundaries. But this would probably just get you a list of sprawling metropolitan areas.
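
Both counting rules can be sketched on toy data. This assumes the population is available as points (x, y, people) already projected to flat kilometre coordinates, which sidesteps great-circle distances. The function names and the 5 km default are illustrative.

```python
import math


def population_within(center, points, radius_km=5.0):
    """Sum the population of all points within radius_km of a city center.

    center: (x, y) in km. points: iterable of (x, y, population).
    """
    cx, cy = center
    return sum(pop for x, y, pop in points if math.hypot(x - cx, y - cy) <= radius_km)


def population_by_nearest_center(centers, points):
    """Assign every point to its closest city center, ignoring political boundaries.

    centers: dict of name -> (x, y) in km. Returns dict of name -> population.
    """
    totals = {name: 0 for name in centers}
    for x, y, pop in points:
        nearest = min(
            centers,
            key=lambda name: math.hypot(x - centers[name][0], y - centers[name][1]),
        )
        totals[nearest] += pop
    return totals
```

The first rule rewards density close to the center. The second hands every sprawling suburb to whichever center is closest, which is why it tends to produce a list of metropolitan areas.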

bitslayer · 6 years ago
A good way to define metropolitan areas is by commuter patterns. If a certain percentage of residents of a county or town all commute to the same adjoining larger town, then that gets counted.
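
As a rough sketch of that rule: given, for each town, the share of its residents who commute to each other town, keep every town whose share into the core meets a threshold. The 25% default is only a placeholder assumption, the function name is hypothetical, and the "adjoining" requirement is not modelled here.

```python
def commuter_metro(core, commute_share, threshold=0.25):
    """Return the core plus every town sending at least `threshold` of its residents to it.

    commute_share: dict of town -> {destination: fraction of residents commuting there}.
    """
    members = {core}
    for town, shares in commute_share.items():
        if town != core and shares.get(core, 0.0) >= threshold:
            members.add(town)
    return members


if __name__ == "__main__":
    # Made-up shares for three hypothetical towns around "Core City".
    shares = {
        "Northville": {"Core City": 0.40},
        "Eastfield": {"Core City": 0.10, "Northville": 0.05},
        "Southport": {"Core City": 0.25},
    }
    print(sorted(commuter_metro("Core City", shares)))
```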
shalmanese · 6 years ago
Not all cities have centers and some cities have multiple different centers that are quite far away from each other depending on what criterion you use.
alexhutcheson · 6 years ago
The choice of geographic unit depends what questions you're interested in answering. If you're trying to answer questions about government, tax base, city services, etc., then you definitely want to use legal/administrative divisions as your unit of analysis. If you're interested in answering demographic questions, like population growth or labor pool, then statistical areas like CBSAs are more useful, because differences in administrative boundaries would introduce inconsistencies in your data.

You can also define a region as an area of (relatively) continuous density - the Census does this and labels them "urbanized areas". Any one of the 3 approaches (administrative boundaries, commuting zones, density) can be reasonable, depending on what sort of questions you'd like to answer.

> Can you really count the population of Newark, New Jersey as part of NYC, NY?

If you're looking at metro areas (which are defined by commuting regions), then you definitely should, because a sizable fraction of the residents of Newark and the surrounding communities commute into NYC for work. The Census does define a sub-unit called metropolitan division, and Newark, NJ is one of these.

> Is San Francisco part of San Jose?

If your unit of analysis is MSA, then no - "San Francisco-Oakland-Hayward, CA" and "San Jose-Sunnyvale-Santa Clara, CA". This division recognizes that the two have separate (but overlapping) commuting zones - very few people commute from Richmond to Sunnyvale, or from Milpitas to San Francisco.

However, they are both within the "San Jose-San Francisco-Oakland, CA" Combined Statistical Area, which is a broader unit which recognizes that there are commuting relationships between the two areas - they are just weaker than the commuting relationships within the MSAs.

> What about twin cities, like Dallas and Fort Worth?

Dallas-Fort Worth-Arlington, TX is an MSA. For the purposes of maps like this, you can generally just take the name of the city that comes first in the MSA name and people will understand what you're referring to.

This slide deck from the US Census Bureau is pretty interesting for understanding their geographic taxonomy, and the trade-offs of different levels of observation: https://www.census.gov/content/dam/Census/data/developers/ge...

mytailorisrich · 6 years ago
Guangzhou is usually reported as larger than Shenzhen.

A specific issue with the population of Chinese cities, though, is that the administrative 'city' division in China can be considerably larger than what might be expected in the West.

horsawlarway · 6 years ago
A lot of these are pretty bad representations of real population, even in the US.

ex: Jacksonville is listed as the largest city in the southeast, and strictly speaking, this is true because it's a fairly dense population area. But it's a pretty misleading statement to say that it's the largest city.

Jacksonville's metro population is only about 1.5 million.

Atlanta is in the same box and Atlanta's metro population is nearly 4 times larger (5.9 million), Nashville also beats Jacksonville (1.9 million)

markphip · 6 years ago
I came to make the same comment. I was surprised to see that "technically" Jacksonville is nearly 2x larger than Atlanta, but digging deeper it just seems to be about city borders. Jacksonville is pretty large and not that dense. Atlanta is pretty small and fairly dense. When you look at the metropolitan areas, Atlanta is nearly 4x larger than the Jacksonville metro area, and that really just includes the area covered by MARTA, so it is all pretty much "Atlanta".
mc32 · 6 years ago
To me that’s not misleading in that they count city proper rather than metro areas.

Both have their places, but I wouldn’t say one is more misleading than the other. One is administrative and one is more demographic.

bryanrasmussen · 6 years ago
Lies, Damned Lies, and Geographic Visualizations.
nathancahill · 6 years ago
Used to live on Baseline Rd. It originally split Nebraska Territory to the north and Kansas Territory to the south, originally surveyed in 1859. The Colorado territory wasn't formed until two years later in 1861. (Utah territory originally began somewhere in the Rockies beyond Boulder).