Fixing PyPDF2 form fields that do not display after being filled

For working with PDFs in Python, PyPDF2 is the obvious choice, except for... this:

The following code (versions of it are all over the web) fills in the form fields of a PDF:

# -*- coding: UTF-8 -*- 
from PyPDF2 import PdfFileWriter, PdfFileReader

infile = "mb2.pdf"
outfile = "c.pdf"

pdf = PdfFileReader(open(infile, "rb"), strict=False)

pdf2 = PdfFileWriter()

field_dictionary = {'idnumber':'11ddas111','now_year':'2018-2-1','name':'好的'}

pdf2.addPage(pdf.getPage(0))
pdf2.updatePageFormFieldValues(pdf2.getPage(0), field_dictionary)

# Write the result and close the output file when done
with open(outfile, "wb") as outputStream:
    pdf2.write(outputStream)

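The resulting PDF does not show the field contents when opened in Acrobat Reader. A value appears only while its field is clicked and disappears again as soon as the field loses focus; it stays visible only after the text is copied and pasted back into the field.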

Quite a few people online have run into the same thing:

https://stackoverflow.com/questions/47369740/pypdf2-appends-the-same-file-over-and-over (a question about modifying fields) mentions that fields filled by PyPDF2 do not show in Acrobat Reader, and that checkboxes and radio buttons cannot be filled:
PyPdf2 seems to be the best option despite all the bugs python packages have for pdfs such as fields not showing in acroreader and being unable to fill checkboxes or radio buttons.
There does appear to be a bug (with pdfs generally? maybe) where the pdf file is not redrawn. If one clicks on the field one can see the new text that PyPDF2 entered however one then has to manually copy and paste in order to see that change permanently.

I finally found the solution in this issue:

https://github.com/mstamy2/PyPDF2/issues/355

Thanks to this great guy (https://github.com/ademidun), who pointed to the relevant reference:

Okay, I think I have figured it out. If you read section 12.7.2 (page 431) of the PDF 1.7 specification, you will see that you need to set the NeedAppearances flag of the Acroform.
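
As a rough illustration of what the quoted advice amounts to, the document catalog and its /AcroForm entry can be pictured as nested dictionaries, with /NeedAppearances as a boolean entry asking the viewer to rebuild each field's appearance from its value. The sketch below uses plain Python dicts rather than PyPDF2 objects, and `set_need_appearances` is a hypothetical helper name used only for this illustration.

```python
def set_need_appearances(catalog):
    """Set /NeedAppearances in a dict-based model of a PDF catalog.

    This is a plain-dict model for illustration only, not the PyPDF2 API.
    """
    # Create the /AcroForm dictionary if the catalog does not have one yet.
    acro_form = catalog.setdefault("/AcroForm", {})
    # Ask the viewer to regenerate field appearances when the file is opened.
    acro_form["/NeedAppearances"] = True
    return catalog


# A toy catalog that already has an /AcroForm entry with an empty field list.
catalog = {"/Type": "/Catalog", "/AcroForm": {"/Fields": []}}
set_need_appearances(catalog)
print(catalog["/AcroForm"])
```

The real fix has the same shape: make sure the catalog has an /AcroForm dictionary, then set the flag inside it.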

OK, we don't produce code, we just carry it around :D

Here is the solution:

# -*- coding: UTF-8 -*- 
from PyPDF2 import PdfFileWriter, PdfFileReader
from PyPDF2.generic import BooleanObject, NameObject, IndirectObject

def set_need_appearances_writer(writer):
    # See 12.7.2 and 7.7.2 for more information: http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf
    try:
        catalog = writer._root_object
        # get the AcroForm tree
        if "/AcroForm" not in catalog:
            writer._root_object.update({
                NameObject("/AcroForm"): IndirectObject(len(writer._objects), 0, writer)})

        need_appearances = NameObject("/NeedAppearances")
        writer._root_object["/AcroForm"][need_appearances] = BooleanObject(True)
        return writer

    except Exception as e:
        print('set_need_appearances_writer() catch : ', repr(e))
        return writer

infile = "mb2.pdf"
outfile = "c.pdf"

pdf = PdfFileReader(open(infile, "rb"), strict=False)
if "/AcroForm" in pdf.trailer["/Root"]:
    pdf.trailer["/Root"]["/AcroForm"].update(
        {NameObject("/NeedAppearances"): BooleanObject(True)})

pdf2 = PdfFileWriter()
set_need_appearances_writer(pdf2)
if "/AcroForm" in pdf2._root_object:
    pdf2._root_object["/AcroForm"].update(
        {NameObject("/NeedAppearances"): BooleanObject(True)})

field_dictionary = {'idnumber':'11ddas111','now_year':'2018-2-1','name':'好的'}

pdf2.addPage(pdf.getPage(0))
pdf2.updatePageFormFieldValues(pdf2.getPage(0), field_dictionary)

# Write the result and close the output file when done
with open(outfile, "wb") as outputStream:
    pdf2.write(outputStream)
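
One quick way to sanity-check the output without opening Acrobat Reader might be to scan the written file for the flag. The sketch below uses only the standard library and assumes the /AcroForm dictionary is stored uncompressed in the file, so that the text `/NeedAppearances true` appears literally in the bytes; if the dictionary sits inside a compressed object stream, this check will miss it. The helper names are hypothetical.

```python
import re

# Matches the flag as it would appear in an uncompressed PDF dictionary.
_FLAG_PATTERN = re.compile(rb"/NeedAppearances\s+true\b")


def has_need_appearances(pdf_bytes):
    """Return True if the raw bytes contain '/NeedAppearances true'."""
    return _FLAG_PATTERN.search(pdf_bytes) is not None


def file_has_need_appearances(path):
    """Read a PDF file from disk and scan its bytes for the flag."""
    with open(path, "rb") as f:
        return has_need_appearances(f.read())


# Small in-memory demonstration with a hand-written dictionary fragment.
sample = b"<<\n/AcroForm <<\n/NeedAppearances true\n>>\n>>"
print(has_need_appearances(sample))
```

After running the script above, `file_has_need_appearances("c.pdf")` would be the call to try against the generated file.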

Reposted from kissmett.iteye.com/blog/2410098