I've been wondering whether such exhaustive datasets would make it possible to discover existing supply chain routes (which are usually closely guarded trade secrets). For that I would model the truck, ship and freight train traffic as edges, and mines/factories/companies as nodes in a graph. The weight of each edge would be proportional to the number of containers that flow through it per unit of time. The question is: would it be possible to statistically infer the interdependence among nodes?
I started it and so far I got nice visualizations of air and maritime traffic. I should resume it when I find some time.
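To make the graph idea concrete, here is a minimal sketch in Python. The movement records, node names and the `dependence` measure are all made up for illustration; real data would come from AIS, rail and trucking feeds, and a serious analysis would need proper statistics rather than simple flow shares.

```python
from collections import defaultdict

# Hypothetical movement records: (origin, destination, containers, days_observed).
movements = [
    ("mine_A", "port_X", 120, 30),
    ("mine_A", "port_X", 60, 30),
    ("port_X", "factory_B", 150, 30),
    ("port_X", "factory_C", 30, 30),
]

def build_flow_graph(records):
    """Return {origin: {destination: containers_per_day}} as edge weights."""
    totals = defaultdict(lambda: defaultdict(float))
    for origin, dest, containers, days in records:
        totals[origin][dest] += containers / days
    return {origin: dict(dests) for origin, dests in totals.items()}

def dependence(graph, node):
    """Share of a node's inbound flow that comes from each upstream node."""
    inbound = {o: dests[node] for o, dests in graph.items() if node in dests}
    total = sum(inbound.values())
    return {o: w / total for o, w in inbound.items()} if total else {}

graph = build_flow_graph(movements)
print(graph["port_X"])
print(dependence(graph, "factory_B"))
```

A node whose inbound flow is dominated by a single upstream node is a candidate "dependent" node, but that is only a first-order heuristic.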
At my previous employer we had two people working on data analytics for container barge traffic within the Rhine delta in the Netherlands and Germany. They were able to overlay their predictions with the partial barge planning data that we actually had, because we supplied software to inland container terminals.
The conclusion was that yes, you can predict the locations of the terminals, but it's quite hard to find and produce repetitive schedules from the data, because that requires ships to keep the same schedule for a long time, and reality is just too dynamic to properly predict this.
For seagoing vessels I expect that they have more predictable schedules, but the hard part there is knowing container volumes, because there is no way to predict from public data what the volume of containers loaded/discharged is at a specific terminal. The closest metric is measuring the turnaround time of the ship, but that is not a great predictor for number of containers moved because handling times per container will differ per port, as will the number of cranes assigned to the ship, as will the ratio between 20ft and 40ft boxes, as well as the capability per port to do twinning (picking up 2x20ft boxes in one handling) and dual-cycling (discharging and loading a container as part of the same crane cycle).
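A rough back-of-the-envelope sketch of why turnaround time alone is a weak predictor. All the numbers below are invented for illustration, and `boxes_per_move` is just my shorthand for the combined effect of twinning and dual-cycling:

```python
def boxes_moved(turnaround_hours, cranes, moves_per_crane_hour, boxes_per_move):
    """Very rough estimate: every crane works the whole turnaround window."""
    return turnaround_hours * cranes * moves_per_crane_hour * boxes_per_move

# Same 20-hour turnaround, two hypothetical terminals.
slow_port = boxes_moved(20, cranes=2, moves_per_crane_hour=20, boxes_per_move=1.0)
fast_port = boxes_moved(20, cranes=5, moves_per_crane_hour=30, boxes_per_move=1.5)

print(slow_port, fast_port, fast_port / slow_port)
```

With identical turnaround times the two estimates differ by more than a factor of five, and that is before accounting for the 20ft/40ft mix.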
Basically: it's a tough problem, with too many variables you cannot reasonably deduce.
(Also, even if you have the container volumes, I personally think the highest potential for optimization is in the hinterland, but that's a different story altogether)
> because there is no way to predict from public data what the volume of containers loaded/discharged is at a specific terminal
It's doable for a competitor. AIS includes the vessel's depth (draught). If you assume containers have a similar weight, you can monitor the vessel's depth as it discharges and then loads containers. You'll have to account for bunkering, empty containers, etc. Still, it's doable.
There's also some data (not sure if it's public, but it should be easy for any shipping company to get) for anything going to the US. You can use that data to further improve the previous estimates. See https://www.joc.com/regulation-policy/trade-data/united-stat.... Interestingly, the JOC is terrible at predicting future trends despite having some of the most detailed data on their trade.
Note: the method above will not give you exact numbers. You don't actually need the exact numbers; seeing the trends and the fluctuations/changes is already quite useful.
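As a sketch of the arithmetic, assuming you know (or can guess) the ship's tonnes-per-centimetre immersion (TPC) and an average container weight. Both values below are hypothetical, and the function ignores bunkering, ballast and empties entirely:

```python
def estimate_net_boxes(draught_before_m, draught_after_m,
                       tpc_tonnes_per_cm, avg_box_tonnes):
    """Net containers loaded (positive) or discharged (negative), very roughly."""
    delta_cm = (draught_after_m - draught_before_m) * 100
    net_tonnes = delta_cm * tpc_tonnes_per_cm
    return net_tonnes / avg_box_tonnes

# Hypothetical port call: draught goes from 12.0 m to 12.5 m.
net = estimate_net_boxes(12.0, 12.5, tpc_tonnes_per_cm=100, avg_box_tonnes=12.5)
print(round(net))
```

Since the draught in AIS is entered by the crew and only reported in 0.1 m steps, this is good for trends rather than exact counts.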
This is possible, although hard, and is being done at a number of businesses. Most of my experience in this space revolves around working with this business in a consulting capacity a few years ago - https://truebearinginsights.com. The short answer is that it gets easier when you focus on particular types and sizes of vessels.
Ships are classified into size-based categories. For example, a "capesize" vessel is a massive dry bulk ship of 100,000+ ton (and often much bigger) carrying capacity. Since they are the largest ships, there are many fewer of them in the world, they are size-restricted on the ports that they can visit, it is only economical to carry particular types of cargo in these volumes, and because of the volumes, the ports often sit close to the source.
Now, when you collect AIS data for capesize ships, you'll see trade lanes from Australia to China - these mostly carry iron ore from Australia's Pilbara region to steel/etc. factories in China (companies like Rio Tinto, BHP and FMG focus on this space). You'll see a similar trade lane from Brazil to China - this is also iron ore and led by a company called Vale. By analysing the trade lanes, cross-referencing that with the raw materials that come out of each company, using Google Maps to investigate what the ships might be loading, etc., you can infer what the vessels carry and begin to extrapolate these routes. The same applies for many types of commodities, and is not limited to dry bulk.
It can be easier in some cases to figure out the trading routes for ocean transportation of containers because the carriers often publish their routes in advance, but this obviously doesn't apply to the rail and trucking side of things. Containers are also often transported a fair distance before reaching a port, and they will often be trans-shipped (i.e., transferred from one vessel to another at a hub port like Singapore), so it will be difficult to find the root nodes here.
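A minimal sketch of the first step, assuming you have already turned raw AIS positions into a port-call log. The vessel IDs, timestamps and port names below are placeholders, not real data:

```python
from collections import Counter

# Hypothetical port-call log: (vessel_id, timestamp, port).
port_calls = [
    ("V1", 1, "Pilbara port"), ("V1", 2, "China port"), ("V1", 3, "Pilbara port"),
    ("V2", 1, "Brazil port"), ("V2", 2, "China port"),
    ("V3", 1, "Pilbara port"), ("V3", 2, "China port"),
]

def trade_lanes(calls):
    """Count consecutive (origin, destination) port pairs per vessel."""
    by_vessel = {}
    for vessel, _timestamp, port in sorted(calls):
        by_vessel.setdefault(vessel, []).append(port)
    lanes = Counter()
    for ports in by_vessel.values():
        for origin, dest in zip(ports, ports[1:]):
            if origin != dest:
                lanes[(origin, dest)] += 1
    return lanes

lanes = trade_lanes(port_calls)
print(lanes.most_common(3))
```

The busiest pairs are the candidate trade lanes; working out what is actually carried on them is the manual cross-referencing part described above.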
As you go down in vessel size, it gets harder. A small ship like a "handysize" could carry anything really - grain, potash, windmills, yachts (yachts often aren't insurable when crossing the open ocean so the owners typically have them shipped around the world instead). And there are tons of them. And their cargoes might be shipped long distances to the port. You can still find these nodes of course, but it will take a lot more work.
Fun fact - some ships turn off their AIS while at sea, and they are allowed to do so (non-commercial vessels), which is a pain if you are sailing single-handed or in bad weather conditions (fog).
I run one of the AIS stations that contributes to this. Of course, "RUN" is a bit of an overstatement, the little rPi unit basically just sits there unattended and feeds data received into the system without much need for any attention from me.
I am in the process of adding an uplink from my AIS650 on my boat to also send data received while underway, but need to finish up some other projects first.
I do not have much use for the commercial data side of this, but looking at past routes of various pleasure craft going to or from places I want to go can be informative for planning routes.
It's been a while since I have worked in this space but the underlying protocol (AIS) is incredibly lacking in terms of security due to the nature of plain text transmission & lack of authentication. I'm not sure if these issues have been addressed but below is some great research from 2014 on the matter [1][2].
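To illustrate the plain-text point: a few lines of Python are enough to pull the message type and MMSI out of a raw AIVDM sentence. This is only a sketch of the 6-bit payload unpacking as I remember it, using a sample sentence that is widely used in AIS decoder documentation; it does no checksum or multi-part handling.

```python
def payload_bits(payload):
    """Unpack the 6-bit 'armored' AIS payload characters into a bit string."""
    bits = []
    for ch in payload:
        value = ord(ch) - 48
        if value > 40:
            value -= 8
        bits.append(format(value, "06b"))
    return "".join(bits)

def decode_header(sentence):
    """Return message type and MMSI from a single-part AIVDM sentence."""
    fields = sentence.split(",")
    bits = payload_bits(fields[5])  # payload is the sixth comma-separated field
    return {"type": int(bits[0:6], 2), "mmsi": int(bits[8:38], 2)}

sample = "!AIVDM,1,1,,B,177KQJ5000G?tO`K>RA1wUbN0TKH,0*5C"
header = decode_header(sample)
print(header)
```

Nothing in the sentence proves who sent it: there is a checksum against transmission errors, but no signature or key, so a receiver cannot tell a genuine position report from a forged one.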
(note that if one has access to SWIFT data, one can not only watch the goods transported in one direction but also the payments flowing in the other.)
In other graphs, I believe California has (had?) superior petrol environmental standards to the rest of the US because its pipeline network is largely independent: https://www.api.org/~/media/Oil-and-Natural-Gas-images/Pipel...
In yet other graphs, Great War mobilisation was largely a function of the rail networks. To this day, it's possible to buy a "rail" ticket between Finland and Sweden that winds up "air gapped" via bus transfer.
[0]: https://www.vortexa.com/
[1]: https://www.kpler.com/
[2]: https://www.spinergie.com/
[3]: https://www.refinitiv.com/en/products/eikon-trading-software...
This is my space. A very crowded space, and incredibly chaotic, where data is outright wrong most of the time.
Personally, I don't think prediction should be the end goal of these analyses; tracking and anomaly prediction would give more insight to the traders etc.
But of course I would only say this because prediction is hard in this space.
Deep learning (DL) et al. don't really help. Traditional stats are more likely to be helpful in this case.
[1] https://www.blackhat.com/docs/asia-14/materials/Balduzzi/Asi...
[2] https://www.youtube.com/watch?v=5rt9dzu3I7U
The link is currently: https://marinetraffic24.com/pt/vesselfinder/
The English language page is: https://marinetraffic24.com/vesselfinder/
Hopefully a moderator like dang can edit the link.
Seems to work as expected in Chrome.