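For working with PDFs in Python, PyPDF2 is the obvious choice, except for... this:
Use the following code (there are plenty of copies of it online) to fill in the form fields of a PDF: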
# -*- coding: UTF-8 -*-
from PyPDF2 import PdfFileWriter, PdfFileReader

infile = "mb2.pdf"
outfile = "c.pdf"

pdf = PdfFileReader(open(infile, "rb"), strict=False)
pdf2 = PdfFileWriter()

# field name -> value to fill in
field_dictionary = {'idnumber': '11ddas111', 'now_year': '2018-2-1', 'name': '好的'}

pdf2.addPage(pdf.getPage(0))
pdf2.updatePageFormFieldValues(pdf2.getPage(0), field_dictionary)

with open(outfile, "wb") as outputStream:
    pdf2.write(outputStream)
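When the resulting PDF is opened in Acrobat Reader, the form field contents are not displayed. They appear only while the field is clicked, vanish again when it loses focus, and show permanently only after the value is copied and pasted back in:
After clicking the field, the value is displayed:
Quite a few people online have run into something similar:
https://stackoverflow.com/questions/47369740/pypdf2-appends-the-same-file-over-and-over (a question about modifying fields) mentions that form fields filled by PyPDF2 do not show in Acrobat Reader, and that checkboxes and radio buttons cannot be filled: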
PyPdf2 seems to be the best option despite all the bugs python packages have for pdfs such as fields not showing in acroreader and being unable to fill checkboxes or radio buttons. There does appear to be a bug (with pdfs generally? maybe) where the pdf file is not redrawn. If one clicks on the field one can see the new text that PyPDF2 entered however one then has to manually copy and paste in order to see that change permanently.
I finally found the solution in this issue:
https://github.com/mstamy2/PyPDF2/issues/355
Thanks to this great guy (https://github.com/ademidun), who provided the reference:
Okay, I think I have figured it out. If you read section 12.7.2 (page 431) of the PDF 1.7 specification, you will see that you need to set the NeedAppearances flag of the Acroform.
OK, we don't write the code, we just carry it over :D
Here is the solution:
# -*- coding: UTF-8 -*-
from PyPDF2 import PdfFileWriter, PdfFileReader
from PyPDF2.generic import BooleanObject, NameObject, IndirectObject


def set_need_appearances_writer(writer):
    # See 12.7.2 and 7.7.2 for more information:
    # http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf
    try:
        catalog = writer._root_object
        # get the AcroForm tree
        if "/AcroForm" not in catalog:
            writer._root_object.update({
                NameObject("/AcroForm"): IndirectObject(len(writer._objects), 0, writer)})

        need_appearances = NameObject("/NeedAppearances")
        writer._root_object["/AcroForm"][need_appearances] = BooleanObject(True)
        return writer

    except Exception as e:
        print('set_need_appearances_writer() catch : ', repr(e))
        return writer


infile = "mb2.pdf"
outfile = "c.pdf"

pdf = PdfFileReader(open(infile, "rb"), strict=False)
# set the flag on the source document if it already has an AcroForm
if "/AcroForm" in pdf.trailer["/Root"]:
    pdf.trailer["/Root"]["/AcroForm"].update(
        {NameObject("/NeedAppearances"): BooleanObject(True)})

pdf2 = PdfFileWriter()
# set the flag on the output document as well
set_need_appearances_writer(pdf2)
if "/AcroForm" in pdf2._root_object:
    pdf2._root_object["/AcroForm"].update(
        {NameObject("/NeedAppearances"): BooleanObject(True)})

# field name -> value to fill in
field_dictionary = {'idnumber': '11ddas111', 'now_year': '2018-2-1', 'name': '好的'}

pdf2.addPage(pdf.getPage(0))
pdf2.updatePageFormFieldValues(pdf2.getPage(0), field_dictionary)

with open(outfile, "wb") as outputStream:
    pdf2.write(outputStream)
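To confirm that the generated file really carries the flag, you can scan its raw bytes for the /NeedAppearances entry. This is only a rough sketch, not part of the original solution. The helper names has_need_appearances and file_has_need_appearances are made up for illustration, and the check assumes the dictionary is written uncompressed with a literal true value (it will miss a flag stored inside a compressed object stream or as an indirect reference).

```python
import re

# "/NeedAppearances true" as it would appear in an uncompressed PDF dictionary.
# Assumption: the value is a direct boolean, separated from the key by whitespace.
_NEED_APPEARANCES = re.compile(rb"/NeedAppearances\s+true\b")


def has_need_appearances(pdf_bytes):
    """Return True if the raw PDF bytes contain '/NeedAppearances true'."""
    return _NEED_APPEARANCES.search(pdf_bytes) is not None


def file_has_need_appearances(path):
    """Read a PDF file from disk and run the same byte-level check on it."""
    with open(path, "rb") as f:
        return has_need_appearances(f.read())
```

For example, file_has_need_appearances("c.pdf") can be called on the output file after running the script. If it returns False, the flag was not written in plain form, which would be the first thing to check if the fields still show up blank.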