The OpenStreetMap rendering toolchain in January 2025: technical changes and data import notes

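It's 2025, another year gone by. At some point, importing offline OpenStreetMap data became one of my habitual computer tinkering rituals. It's the same idea as polishing collectible beads: when the time of year comes around, I get restless if I don't have a few bracelets to work on.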

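The last time I worked on this was at the same time of year in 2024. Truthfully, I have no real use for OpenStreetMap. If I had to name one, it gives my taskBus amateur radio project (another walnut on the bracelet) a bead of a different color for company, so that it won't be lonely. As a doomsday-survival enthusiast, I like to imagine a fictional world with only basic electricity, and the technology chain you would need to contact scattered human survivors: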

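  1. Global map data that is as complete as possible and runs on a single machine.
  2. An SDR (software-defined radio) system, as a supplement to amateur radio.
  3. A minimal toolchain: accomplish as much as possible with as few tools as possible.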

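Judging by how large language models are developing, in the not-too-distant future all of the above could be handled by a single offline LLM platform. Finally, as stated in the previous article, this data is for research and study only and must not be published externally as a WebGIS service.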

1. Changes in technical details

Compared with 2024, the OpenStreetMap renderer toolchain has changed significantly in 2025.

1.1 openstreetmap-carto switches to a Lua script

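The latest version of the carto project uses a Lua script as the style file for osm2pgsql imports. According to the documentation, this kind of style file is more flexible than the old .style files.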

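The flex style file is a Lua script, so you can use the full power of the Lua language. This gives you a lot of freedom to write complex preprocessing scripts, which may even use external libraries to extend what osm2pgsql can do. It also means that a badly written script can mess with any part of your system. Review what a script does, and only run Lua scripts from trusted sources.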

This is part of the default script, openstreetmap-carto-flex.lua:


-- This is the osm2pgsql configuration for the OpenStreetMap Carto map style
-- using the osm2pgsql flex output.

-- It is written in a way that it can be used with or without the Themepark
-- framework. For more about Themepark see https://osm2pgsql.org/themepark/ .

-- ---------------------------------------------------------------------------

-- CONFIGURATION

-- Prefix for all output table names.
--
-- (This used to be set with the --prefix command line option, but note the
-- trailing '_' letter which was not needed with the command line option.)
local PREFIX = 'planet_osm_'

-- Set this to the database schema.
--
-- (This used to be set with the --output-pgsql-schema command line option.)
local SCHEMA = 'public'

-- ---------------------------------------------------------------------------

-- Needed for use with the Themepark framework
local themepark = ...

-- ---------------------------------------------------------------------------

-- A list of columns per table in the order they will appear in the database
-- tables. Columns can either be
-- * a string ('highway') in which case they will be added as 'text' column or
-- * a Lua table with a column definition for the define_table() command.
local table_columns = {
    point = {
        'access',
        'addr:housename',
        'addr:housenumber',
        'admin_level',
        'aerialway',
        'aeroway',
        'amenity',
        'barrier',
        'boundary',
        'building',
        'highway',
        'historic',
        'junction',
        'landuse',
        { column = 'layer', type = 'int4' },
        'leisure',
        'lock',
        'man_made',
        'military',
        'name',
        'natural',
        'oneway',
        'place',
        'power',
        'railway',
        'ref',
        'religion',
        'shop',
        'tourism',
        'water',
        'waterway',
        { column = 'tags', type = 'hstore' },
    },

///....
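
The comment in the excerpt says a column entry is either a bare string (stored as a 'text' column) or a table with an explicit definition. As a rough illustration of that rule only, here is a small Python sketch. It is not the code from openstreetmap-carto-flex.lua; the names point_columns and normalize_columns are made up for this example, and the list is a shortened copy of the point columns above.

```python
from typing import Union

# Hypothetical, shortened Python mirror of the mixed 'point' column list above.
ColumnSpec = Union[str, dict]

point_columns: list[ColumnSpec] = [
    "access",
    "highway",
    {"column": "layer", "type": "int4"},
    "name",
    {"column": "tags", "type": "hstore"},
]


def normalize_columns(columns: list[ColumnSpec]) -> list[dict]:
    """Expand bare names into text columns; copy explicit definitions as-is."""
    normalized = []
    for col in columns:
        if isinstance(col, str):
            # A plain string means "text column with this name".
            normalized.append({"column": col, "type": "text"})
        else:
            # An explicit definition already carries its own type.
            normalized.append(dict(col))
    return normalized


if __name__ == "__main__":
    for definition in normalize_columns(point_columns):
        print(f"{definition['column']:<10} {definition['type']}")
```

The point is simply that most tags end up as plain text columns, while layer and tags get dedicated types.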

1.2 openstreetmap-carto adds extra SQL scripts

install.md describes in detail how to load the additional functions and indexes. With the 2025 versions of osm2pgsql and openstreetmap-carto, if you do not follow the new install.md guide:

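  • If you do not create the functions contained in functions.sql, Mapnik will report errors.
    As you can see, functions.sql defines a number of helper functions in the PostGIS database: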
/* Additional database functions for openstreetmap-carto */

/* Access functions below adapted from https://github.com/imagico/osm-carto-alternative-colors/tree/591c861112b4e5d44badd108f4cd1409146bca0b/sql/roads.sql */

/* Simplified 'yes', 'destination', 'no', 'unrecognised', NULL scale for access restriction 
  'no' is returned if the rendering for highway category does not support 'restricted'.
  NULL is functionally equivalent to 'yes', but indicates the absence of a restriction 
  rather than a positive access = yes. 'unrecognised' corresponds to an uninterpretable 
  access restriction e.g. access=unknown or motorcar=occasionally */
CREATE OR REPLACE FUNCTION carto_int_access(accessvalue text, allow_restricted boolean)
	RETURNS text
	LANGUAGE SQL
	IMMUTABLE PARALLEL SAFE
AS $$
SELECT
	CASE
		WHEN accessvalue IN ('yes', 'designated', 'permissive') THEN 'yes'
		WHEN accessvalue IN ('destination',  'delivery', 'customers') THEN
			CASE WHEN allow_restricted = TRUE  THEN 'restricted' ELSE 'yes' END
		WHEN accessvalue IN ('no', 'permit', 'private', 'agricultural', 'forestry', 'agricultural;forestry') THEN 'no'
		WHEN accessvalue IS NULL THEN NULL
		ELSE 'unrecognised'
	END
$$;

/* Try to promote path to cycleway (if bicycle allowed), then bridleway (if horse)
   This duplicates existing behaviour where designated access is required */
CREATE OR REPLACE FUNCTION carto_path_type(bicycle text, horse text)
	RETURNS text
	LANGUAGE SQL
	IMMUTABLE PARALLEL SAFE
AS $$
SELECT
	CASE
		WHEN bicycle IN ('designated') THEN 'cycleway'
		WHEN horse IN ('designated') THEN 'bridleway'
		ELSE 'path'
	END
$$;

/* Return int_access value which will be used to determine access marking.
   Return values are documented above for carto_int_access function.

   Note that the code handling the promotion of highway=path assumes that
   promotion to cycleway or bridleway is based on the value of bicycle or
   horse respectively. A more general formulation would be, for example,
   WHEN 'cycleway' THEN carto_int_access(COALESCE(NULLIF(bicycle, 'unknown'), "access"), FALSE) */
CREATE OR REPLACE FUNCTION carto_highway_int_access(highway text, "access" text, foot text, bicycle text, horse text, motorcar text, motor_vehicle text, vehicle text)
  RETURNS text
  LANGUAGE SQL
  IMMUTABLE PARALLEL SAFE
AS $$
SELECT
	CASE
		WHEN highway IN ('motorway', 'motorway_link', 'trunk', 'trunk_link', 'primary', 'primary_link', 'secondary',
					 'secondary_link', 'tertiary', 'tertiary_link', 'residential', 'unclassified', 'living_street', 'service', 'road') THEN
			carto_int_access(
				COALESCE(
					NULLIF(motorcar, 'unknown'),
					NULLIF(motor_vehicle, 'unknown'),
					NULLIF(vehicle, 'unknown'),
					"access"), TRUE)
		WHEN highway = 'path' THEN
			CASE carto_path_type(bicycle, horse)
				WHEN 'cycleway' THEN carto_int_access(bicycle, FALSE)
				WHEN 'bridleway' THEN carto_int_access(horse, FALSE)
				ELSE carto_int_access(COALESCE(NULLIF(foot, 'unknown'), "access"), FALSE)
			END
		WHEN highway IN ('pedestrian', 'footway', 'steps') THEN carto_int_access(COALESCE(NULLIF(foot, 'unknown'), "access"), FALSE)
		WHEN highway = 'cycleway' THEN carto_int_access(COALESCE(NULLIF(bicycle, 'unknown'), "access"), FALSE)
		WHEN highway = 'bridleway' THEN carto_int_access(COALESCE(NULLIF(horse, 'unknown'), "access"), FALSE)
		ELSE carto_int_access("access", TRUE)
	END
$$;

  • If you do not create the indexes contained in indexes.sql, Mapnik will be extremely slow.

Building the indexes defined in indexes.sql can greatly speed up rendering.
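
To make the access mapping in functions.sql easier to follow, here is a Python transliteration of the two small functions, carto_int_access and carto_path_type. It is only a reading aid that follows the CASE expressions as quoted above, with SQL NULL represented as None; Mapnik uses the SQL versions, not this.

```python
from typing import Optional


def carto_int_access(accessvalue: Optional[str], allow_restricted: bool) -> Optional[str]:
    """Collapse an OSM access value to 'yes', 'restricted', 'no', 'unrecognised' or None."""
    if accessvalue in ("yes", "designated", "permissive"):
        return "yes"
    if accessvalue in ("destination", "delivery", "customers"):
        # As in the quoted SQL: 'restricted' only when the caller allows it.
        return "restricted" if allow_restricted else "yes"
    if accessvalue in ("no", "permit", "private", "agricultural",
                       "forestry", "agricultural;forestry"):
        return "no"
    if accessvalue is None:
        # No restriction tagged at all (SQL NULL).
        return None
    return "unrecognised"


def carto_path_type(bicycle: Optional[str], horse: Optional[str]) -> str:
    """Promote highway=path to cycleway first, then bridleway, else keep path."""
    if bicycle == "designated":
        return "cycleway"
    if horse == "designated":
        return "bridleway"
    return "path"


if __name__ == "__main__":
    for value in ("permissive", "destination", "private", "unknown", None):
        print(repr(value), "->", repr(carto_int_access(value, True)))
    print(carto_path_type("designated", "designated"))
```

Note that when both bicycle and horse are 'designated', the bicycle check comes first, so the path is treated as a cycleway.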

1.3 osm2pgsql upgraded to 2.0

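To work with the carto project, osm2pgsql 2.0 supports styles in the flex Lua format. The flex output allows flexible configuration that tells osm2pgsql which OSM data to store in the database, and exactly where and how. It is configured through a Lua file that defines the structure of the output tables and the functions that map OSM data to the database format. Use the -S, --style=file option to give the name of the Lua file, together with -O flex or --output=flex to select the flex output. The old V1.x .style mode will be deprecated in the future.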

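It is also worth reading osm2pgsql's PostgreSQL tuning guidance in detail; it is very instructive for learning PostgreSQL too. A regular PostgreSQL server installation ships with a default configuration that is not suited to large databases. You should change these settings in postgresql.conf and restart PostgreSQL before running osm2pgsql, otherwise the system will be much slower than necessary. The following settings are geared towards a system with 128 GB of RAM and a fast SSD. The values in the second column are suggestions that give a good starting point for a typical setup; you may need to adjust them for your use case. The values in the third column are the defaults in PostgreSQL 15.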

[Table image not preserved: recommended PostgreSQL settings from the osm2pgsql documentation]
For the detailed explanations and additional notes that go beyond the table, please see the official website.

2. Import

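The import used the global PBF downloaded on 2025-01-06, which is over 80 GB, and ran on a workstation with 64 GB of RAM and a 4 TB SSD. A power outage occurred partway through and the VirtualBox host, running on UPS power, went into hibernation, which I only discovered the next day. As a result, the elapsed times below are no longer a useful reference.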

[user@map_virtualbox ~]$ osm2pgsql -c -s -O flex -S"./openstreetmap-carto/openstreetmap-carto-flex.lua" -C28000 -dgis --drop --flat-nodes "./cache/flat_node" './osm/planet-latest.osm.pbf' 
2025-01-06 21:23:30  osm2pgsql version 2.0.0
2025-01-06 21:23:30  Database version: 16.3
2025-01-06 21:23:30  PostGIS version: 3.4
2025-01-06 21:23:30  Initializing properties table '"public"."osm2pgsql_properties"'.
2025-01-06 21:23:30  Storing properties to table '"public"."osm2pgsql_properties"'.
2025-01-07 14:40:11  Reading input files done in 62201s (17h 16m 41s).                    
2025-01-07 14:40:11    Processed 9611853633 nodes in 8365s (2h 19m 25s) - 1149k/s
2025-01-07 14:40:11    Processed 1073625133 ways in 32736s (9h 5m 36s) - 33k/s
2025-01-07 14:40:11    Processed 12848375 relations in 21100s (5h 51m 40s) - 609/s
2025-01-07 14:40:11  No marked nodes or ways (Skipping stage 2).
2025-01-07 14:40:15  Dropping table 'planet_osm_nodes'
2025-01-07 14:40:15  Table 'planet_osm_nodes' dropped in 0s
2025-01-07 14:40:15  Dropping table 'planet_osm_ways'
2025-01-07 14:40:17  Table 'planet_osm_ways' dropped in 1s
2025-01-07 14:40:17  Dropping table 'planet_osm_rels'
2025-01-07 14:40:17  Table 'planet_osm_rels' dropped in 0s
2025-01-07 14:40:17  Done postprocessing on table 'planet_osm_nodes' in 0s
2025-01-07 14:40:17  Done postprocessing on table 'planet_osm_ways' in 0s
2025-01-07 14:40:17  Done postprocessing on table 'planet_osm_rels' in 0s
2025-01-07 14:40:17  Clustering table 'planet_osm_point' by geometry...
2025-01-07 14:40:17  Clustering table 'planet_osm_roads' by geometry...
2025-01-07 14:40:17  Clustering table 'planet_osm_line' by geometry...
2025-01-07 14:40:17  Clustering table 'planet_osm_polygon' by geometry...
2025-01-07 16:43:03  Creating index on table 'planet_osm_point' ("way")...
2025-01-07 16:52:31  Creating index on table 'planet_osm_roads' ("way")...
2025-01-07 17:13:23  Analyzing table 'planet_osm_roads'...
2025-01-07 17:51:19  Analyzing table 'planet_osm_point'...
2025-01-07 23:52:13  Creating index on table 'planet_osm_line' ("way")...
2025-01-08 01:24:13  Analyzing table 'planet_osm_line'...
2025-01-08 05:27:00  Creating index on table 'planet_osm_polygon' ("way")...
2025-01-08 07:51:23  Analyzing table 'planet_osm_polygon'...
2025-01-08 07:52:06  All postprocessing on table 'planet_osm_polygon' done in 61908s (17h 11m 48s).
2025-01-08 07:52:06  All postprocessing on table 'planet_osm_roads' done in 9258s (2h 34m 18s).
2025-01-08 07:52:06  All postprocessing on table 'planet_osm_point' done in 11468s (3h 11m 8s).
2025-01-08 07:52:06  All postprocessing on table 'planet_osm_line' done in 38747s (10h 45m 47s).
2025-01-08 07:52:06  Storing properties to table '"public"."osm2pgsql_properties"'.
2025-01-08 07:52:06  osm2pgsql took 124116s (34h 28m 36s) overall.
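
As a sanity check on the figures in the log, the short Python sketch below recomputes the per-stage throughput and the h/m/s durations. The counts and seconds are copied from the log above; the helper names format_duration and rate_per_second are my own. Because of the hibernation mentioned earlier, treat the rates as bookkeeping rather than a benchmark.

```python
# (object count, seconds) per stage, copied from the osm2pgsql log above.
stages = {
    "nodes": (9_611_853_633, 8_365),
    "ways": (1_073_625_133, 32_736),
    "relations": (12_848_375, 21_100),
}


def format_duration(seconds: int) -> str:
    """Render a number of seconds in the 'Xh Ym Zs' form used by the log."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}h {minutes}m {secs}s"


def rate_per_second(count: int, seconds: int) -> float:
    """Average number of objects processed per second."""
    return count / seconds


if __name__ == "__main__":
    for name, (count, seconds) in stages.items():
        rate = rate_per_second(count, seconds)
        print(f"{name:<10} {rate:>12,.0f}/s  {format_duration(seconds)}")

    # The three stage times should add up to the "Reading input files" line.
    reading_total = sum(seconds for _, seconds in stages.values())
    print("reading input:", reading_total, "s =", format_duration(reading_total))
    print("overall:", format_duration(124_116))
```

The three stage times add up to 62201 s, which matches the "Reading input files done" line. The rest of the 124116 s total is almost entirely postprocessing, dominated by the planet_osm_polygon table at 61908 s.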

3. Rendering

With the new PostgreSQL 16.3 and osm2pgsql 2.0, renderd feels 2-4 times more efficient than in 2024. That may be an illusion, though, since this time I also switched to a faster NVMe SSD.

4. Virtual machine image

For the virtual machine image, see the link in the previous article, as before.