NeedAppearances=pdfrw.PdfObject('true') forces manual pdf save in Acrobat Reader

Question:

We have a pdf form file example.pdf which has 3 columns:

name_1,
company_1, and
client_1

Our data to fill is in Hebrew as well as English.
Our goal is a file that displays right-to-left (RTL) text correctly when opened in both a browser and Acrobat Reader.
This goal is met when we manually save the file exported by the following code, but we
would like to avoid saving it manually or, if there is no other option, to save it programmatically.

import pdfrw


INVOICE_TEMPLATE_PATH = 'example.pdf'
INVOICE_OUTPUT_PATH = 'output.pdf'


ANNOT_KEY = '/Annots'
ANNOT_FIELD_KEY = '/T'
ANNOT_VAL_KEY = '/V'
ANNOT_RECT_KEY = '/Rect'
SUBTYPE_KEY = '/Subtype'
WIDGET_SUBTYPE_KEY = '/Widget'


def write_fillable_pdf(input_pdf_path, output_pdf_path, data_dict):
    template_pdf = pdfrw.PdfReader(input_pdf_path)
    template_pdf.Root.AcroForm.update(pdfrw.PdfDict(NeedAppearances=pdfrw.PdfObject('true')))
    annotations = template_pdf.pages[0][ANNOT_KEY]
    for annotation in annotations:
        if annotation[SUBTYPE_KEY] == WIDGET_SUBTYPE_KEY:
            if annotation[ANNOT_FIELD_KEY]:
                key = annotation[ANNOT_FIELD_KEY][1:-1]
                if key in data_dict.keys():
                    annotation.update(
                        pdfrw.PdfDict(AP=data_dict[key], V='{}'.format(data_dict[key]), Ff=1)
                    )
    pdfrw.PdfWriter().write(output_pdf_path, template_pdf)



data_dict = {
    'name_1': 'עידו',
    'company_1': 'IBM',
    'client_1': 'אסם'
}

if __name__ == '__main__':
    write_fillable_pdf(INVOICE_TEMPLATE_PATH, INVOICE_OUTPUT_PATH, data_dict)

We figured that NeedAppearances has something to do with needing to save manually.
When the exported file is opened in Acrobat Reader, Acrobat Reader applies some processing to the file, which is why it asks on exit whether we would like to save it.
This operation is vital for us, but we need it to happen automatically.

What is this operation, and how can we perform it programmatically in our code, either before or after the export?

Asked By: SDIdo


Answers:

With pdfrw you can set NeedAppearances to True with the following code:

from pdfrw import PdfReader, PdfDict, PdfObject

def set_need_appearances(pdf_reader: PdfReader):
    pdf_reader.Root.AcroForm.update(PdfDict(NeedAppearances=PdfObject("true")))
    return pdf_reader

With PyPDF2 you can set need appearances with a built in method on the PdfWriter class:

from PyPDF2 import PdfWriter

pdf_writer = PdfWriter()
pdf_writer.set_need_appearances_writer()
Answered By: nihilok

I was facing the same issue when I had NeedAppearances set to true. The code below worked for my PDFs. Please let me know if it works for you.

from pikepdf import Pdf

with Pdf.open('source_pdf.pdf') as pdf:
    pdf.generate_appearance_streams()
    pdf.save('output.pdf')

I think generate_appearance_streams() generates the appearance streams itself instead of leaving that work to the PDF reader, hence no manual save is required when the file is opened with Adobe Acrobat Reader.
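To see whether a written file still asks the viewer to rebuild the field appearances, a crude byte-level check for the flag can help. This is only a sketch using the standard library: the helper name need_appearances_is_set is made up for illustration, and it assumes the AcroForm dictionary is stored uncompressed in the file. A file that keeps this dictionary inside a compressed object stream would hide the flag from this scan, and a file with incremental updates may still contain an older copy of it.

```python
import re

# Matches the key followed by the PDF boolean "true", separated by whitespace.
_NEED_APPEARANCES_TRUE = re.compile(rb"/NeedAppearances\s+true\b")


def need_appearances_is_set(pdf_path):
    """Return True if the raw file bytes contain '/NeedAppearances true'."""
    with open(pdf_path, "rb") as fh:
        return _NEED_APPEARANCES_TRUE.search(fh.read()) is not None
```

Calling need_appearances_is_set('output.pdf') on a file before and after a fix shows whether the flag that triggers the viewer-side regeneration is still present in the raw bytes.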

I have found a solution. Using pdfrw, I had already implemented setting NeedAppearances to true, but it still wasn’t saving. When I would open the file in Acrobat, I’d be asked if I wanted to save.

I tried saving the file with pdfrw and reopening with pikepdf and PyPDF2 with only more issues. I tried setting annotation["/AP"] = "" with some success, and del annotation["/AP"] with more success, but it wasn’t complete/absolute – I was still being asked if I wanted to save.

The solution is to have Adobe programmatically save the file. When I finally programmed that in, the values in form fields showed up everywhere (Explorer PDF previewer, SharePoint previewer, etc., in addition to the usual PDF viewers) and I was no longer asked to save the file.

This is for the current version of Acrobat. If you are still using an older version, you’ll need to use AcroExch.AVDoc and won’t need to create an instance of the PDDoc object:

from pdfrw import PdfReader, PdfWriter, PdfDict, PdfObject
import win32com.client

pdf = PdfReader("your path")
output_path = "your path"

# <actions on the PDF here>

pdf.Root.AcroForm.update(PdfDict(NeedAppearances=PdfObject("true")))
PdfWriter().write(output_path, pdf)

# Create an instance of the Acrobat Application object
acrobat_app = win32com.client.Dispatch("AcroExch.App")

# Create an instance of the PDDoc object
pdf_doc = win32com.client.Dispatch("AcroExch.PDDoc")

# Open the PDF file
pdf_doc.Open(output_path)

# Save the PDF
pdf_doc.Save(1, output_path)

# Close the PDF
pdf_doc.Close()

# Quit Acrobat
acrobat_app.Exit()

‘NeedAppearances = true’ can be set anywhere between reading and writing the PDF; it doesn’t seem to matter where. I’ve even tried removing it completely with this solution and it still works.
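The Acrobat steps above can also be wrapped in a function so that the document is closed and Acrobat exits even if opening or saving fails. This is a minimal sketch, assuming Windows with Acrobat and pywin32 installed. The names resave_with_acrobat and PD_SAVE_FULL and the dispatch parameter are hypothetical, introduced only for illustration (the parameter exists so the function can be exercised without Acrobat). Converting to an absolute path and checking the return values of Open() and Save() are precautions, not something the answer above requires.

```python
import os

PD_SAVE_FULL = 1  # the flag value passed to Save() in the answer above


def resave_with_acrobat(pdf_path, dispatch=None):
    """Open pdf_path in Acrobat via COM and save it back in place."""
    if dispatch is None:
        # Imported lazily: pywin32 is only available on Windows.
        import win32com.client
        dispatch = win32com.client.Dispatch

    full_path = os.path.abspath(pdf_path)
    acrobat_app = dispatch("AcroExch.App")
    try:
        pdf_doc = dispatch("AcroExch.PDDoc")
        # Assumption: Open() and Save() return a falsy value on failure.
        if not pdf_doc.Open(full_path):
            raise RuntimeError("Acrobat could not open " + full_path)
        try:
            if not pdf_doc.Save(PD_SAVE_FULL, full_path):
                raise RuntimeError("Acrobat could not save " + full_path)
        finally:
            pdf_doc.Close()
    finally:
        acrobat_app.Exit()
    return full_path
```

With this in place, the script would write the file with pdfrw as before and then call resave_with_acrobat(output_path).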

Answered By: Sassbearilla