Converting HTML to PDF with Python + Selenium via the browser's print function: a major pitfall I ran into

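This post shows how to call the browser's built-in print preview to convert an HTML page to a PDF file automatically. The code is very short and uses only one third-party library, selenium, but there is one major pitfall you really need to watch out for.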

Installation: run pip install selenium in a cmd window.

The complete code:

import os
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
import json
from time import sleep

save_path = os.getcwd()  # current working directory (where the script is launched from); the PDF is saved here
options = Options()
settings = {
    "recentDestinations": [{
        "id": "Save as PDF",
        "origin": "local",
        "account": ""
    }],
    "selectedDestinationId": "Save as PDF",
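    "version": 2,  # version of the print-preview settings format; keep at 2 for "Save as PDF"
    "isHeaderFooterEnabled": False,  # whether "Headers and footers" is ticked
    "isCssBackgroundEnabled": True,  # whether "Background graphics" is ticked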
    "mediaSize": {
        "height_microns": 297000,
        "name": "ISO_A4",
        "width_microns": 210000,
        "custom_display_name": "A4",
    },
}
prefs = {
    'printing.print_preview_sticky_settings.appState': json.dumps(settings),
    'savefile.default_directory': save_path,
}
options.add_argument('--enable-print-browser')  # tested: works with or without this line
options.add_argument('--kiosk-printing')  # silent printing: no need to click the confirm button in the print dialog
options.add_experimental_option('prefs', prefs)
service = Service(executable_path="D:\\chromedriver.exe")  # path to the Chrome driver (chromedriver)
driver = webdriver.Chrome(service=service, options=options)

url = "https://www.baidu.com/"
driver.get(url)
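# Option 1: custom PDF file name (the page title becomes the file name)
# driver.execute_script('document.title="custom_name.pdf";window.print();')
# Option 2: default PDF file name
driver.execute_script('window.print();')
# This sleep is critical: if it is too short, the browser closes before the PDF has been generated.
# For image-heavy pages, large PDFs, or a slow machine, make the wait longer.
sleep(3)
driver.quit()  # close the browser only after the wait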
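
The script loads a web URL, but the same approach works for an HTML file on disk as long as driver.get() receives a file:// URL. A minimal sketch of that conversion; the helper name html_file_to_url and the file name page.html are hypothetical and not part of the original code:

```python
from pathlib import Path


def html_file_to_url(html_path):
    """Turn a local HTML file path into a file:// URL (hypothetical helper)."""
    # resolve() makes the path absolute, which as_uri() requires.
    return Path(html_path).resolve().as_uri()


# Assumed usage with the driver from the script:
# driver.get(html_file_to_url("page.html"))
```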


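The pitfall: after the line driver.execute_script('window.print();') has run, you must add a wait such as sleep(3). Without it, the browser closes before the PDF has been generated and no file is saved. How long to wait depends on your machine and on the size of the PDF. I left the wait out and spent several hours hunting for the bug.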
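
A fixed sleep is a guess, so one possible alternative is to poll the save folder until a new PDF shows up and its size stops changing. The sketch below is only an illustration under that assumption: wait_for_pdf, known_files, timeout and poll_interval are hypothetical names, and the "size unchanged between two polls" check is a heuristic, not something Chrome guarantees.

```python
import os
import time


def wait_for_pdf(directory, known_files, timeout=30.0, poll_interval=0.5):
    """Return the path of a new PDF in `directory` once it looks complete.

    `known_files` is the set of file names that existed before printing.
    Raises TimeoutError if no stable new PDF appears within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    last_sizes = {}
    while time.monotonic() < deadline:
        for name in os.listdir(directory):
            if name in known_files or not name.lower().endswith(".pdf"):
                continue
            path = os.path.join(directory, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is not readable yet; retry later
            # Heuristic: non-empty and same size as on the previous poll.
            if size > 0 and last_sizes.get(name) == size:
                return path
            last_sizes[name] = size
        time.sleep(poll_interval)
    raise TimeoutError(f"no new PDF appeared in {directory!r} within {timeout} s")


# Assumed usage with the names from the script:
# before = set(os.listdir(save_path))
# driver.execute_script('window.print();')
# pdf_path = wait_for_pdf(save_path, before)
# driver.quit()
```

The timeout still has to be generous enough for large pages, but on a fast machine the script no longer waits longer than it needs to.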
