Working with a picture in an AlternateContent tag
Question:
I need to move an element from one document to another using python-docx. The element is AlternateContent, which represents shapes and figures in Office Word. The problem is that one of these elements contains an image, like this:
<AlternateContent>
<Choice Requires="wpc">
<drawing>
<inline distT="0" distB="0" distL="0" distR="0" wp14_anchorId="0DCE320C" wp14_editId="0DCE320D">
<extent cx="5826587" cy="2494357" />
<effectExtent l="0" t="0" r="0" b="1270" />
<docPr id="1108" name="Zeichenbereich 5" />
<cNvGraphicFramePr>
<graphicFrameLocks xmlns_a="http://schemas.openxmlformats.org/drawingml/2006/main" noChangeAspect="1" />
</cNvGraphicFramePr>
<graphic xmlns_a="http://schemas.openxmlformats.org/drawingml/2006/main">
<graphicData uri="http://schemas.microsoft.com/office/word/2010/wordprocessingCanvas">
<wpc>
<pic xmlns_pic="http://schemas.openxmlformats.org/drawingml/2006/picture">
<nvPicPr>
<cNvPr id="687" name="Picture 28" />
<cNvPicPr>
<picLocks noChangeAspect="1" noChangeArrowheads="1" />
</cNvPicPr>
</nvPicPr>
<blipFill>
<blip r_embed="rId20">
<extLst>
<ext uri="{28A0092B-C50C-407E-A947-70E740481C1C}">
<useLocalDpi xmlns_a14="http://schemas.microsoft.com/office/drawing/2010/main" val="0" />
</ext>
</extLst>
</blip>
<srcRect />
<stretch>
<fillRect />
</stretch>
</blipFill>
</pic>
</wpc>
</graphicData>
</graphic>
</inline>
</drawing>
</Choice>
</AlternateContent>
What I did was extract the image by reading its rId from r:embed and saving it to disk, then re-add the image using add_picture() from the Run class. Sadly, this does not work because, in the example above, the <pic> tag is not inside a run.
So my question is: how can I save the AlternateContent element into a Python object and then re-add it to a Word document?
Answers:
This functionality is not fully supported by the python-docx API, and handling images is complicated because several parts must be kept in sync (ImagePart, Relationship, rId). The work therefore had to be done at a low level with lxml, in these steps:
- Save the images to disk when reading the source file.
- Get the rId of each image so it can be added separately.
- Build a pic:pic element using lxml functions (SubElement and Element).
- Repeat the process to handle all the pictures.
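The first two steps depend on finding which relationship ids the shape refers to. The following is a minimal sketch of that lookup, not the answerer's code: it uses only the standard library, and the helper name find_embed_ids and the trimmed SAMPLE fragment are made up for illustration. It assumes the XML carries its real namespace prefixes (a:blip, r:embed).

```python
import xml.etree.ElementTree as ET

A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"
R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PIC_NS = "http://schemas.openxmlformats.org/drawingml/2006/picture"


def find_embed_ids(xml_text):
    """Return the r:embed relationship id of every a:blip in an XML fragment."""
    root = ET.fromstring(xml_text)
    ids = []
    for blip in root.iter("{%s}blip" % A_NS):
        rel_id = blip.get("{%s}embed" % R_NS)
        if rel_id is not None:
            ids.append(rel_id)
    return ids


# Hypothetical, trimmed-down fragment with the namespace prefixes written out.
SAMPLE = (
    f'<pic:pic xmlns:pic="{PIC_NS}" xmlns:a="{A_NS}" xmlns:r="{R_NS}">'
    '<pic:blipFill><a:blip r:embed="rId20"/></pic:blipFill>'
    '</pic:pic>'
)

print(find_embed_ids(SAMPLE))  # ['rId20']
```

With python-docx, the part behind such an id can usually be reached through document.part.related_parts[rel_id], and its blob attribute holds the image bytes that can be written to disk.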
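import os

for image_elem in list_of_images:
    # Assumes each entry is a dict holding the path of an image saved earlier.
    image_path = image_elem["image_path"]
    # Adds the image part to the target document (if needed) and returns its relationship id.
    rel_id, _ = _run.part.get_or_add_image(image_path)
    image_name = os.path.basename(image_path)
    add_image_to_shape(shape_element, rel_id, image_name)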
The function add_image_to_shape is implemented like this:
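from docx.oxml.ns import qn
from lxml import etree

WPC_TAG = '{http://schemas.microsoft.com/office/word/2010/wordprocessingCanvas}wpc'

def add_image_to_shape(shape_element, rel_id, image_name):
    # Find the drawing canvas (wpc) that will hold the picture.
    shape = next(shape_element.iterdescendants(tag=WPC_TAG))
    # qn() expands a prefixed name such as 'pic:pic' into a namespace-qualified tag,
    # which the elements need in order to be recognised as DrawingML.
    image = etree.SubElement(shape, qn('pic:pic'))
    nvPicPr = etree.SubElement(image, qn('pic:nvPicPr'))
    cNvPr = etree.SubElement(nvPicPr, qn('pic:cNvPr'))
    cNvPr.set('id', '1')
    cNvPr.set('descr', image_name)
    cNvPicPr = etree.SubElement(nvPicPr, qn('pic:cNvPicPr'))
    picLocks = etree.SubElement(cNvPicPr, qn('a:picLocks'))
    picLocks.set('noChangeAspect', '1')
    picLocks.set('noChangeArrowheads', '1')
    blipFill = etree.SubElement(image, qn('pic:blipFill'))
    blip = etree.SubElement(blipFill, qn('a:blip'))
    # The relationship id goes into the namespaced r:embed attribute.
    blip.set(qn('r:embed'), rel_id)
    etree.SubElement(blipFill, qn('a:srcRect'))
    stretch = etree.SubElement(blipFill, qn('a:stretch'))
    etree.SubElement(stretch, qn('a:fillRect'))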
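A possible variation, which is not part of the answer above, is to keep the copied AlternateContent element as it is and only repoint each a:blip at the relationship id created in the target document. The sketch below uses the standard library so it runs on its own. The helper name copy_with_new_rel_ids, the rel_id_map argument and the SAMPLE fragment are hypothetical. In a real run the new ids would come from get_or_add_image, as in the loop above.

```python
import copy
import xml.etree.ElementTree as ET

A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"
R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
BLIP = "{%s}blip" % A_NS
EMBED = "{%s}embed" % R_NS


def copy_with_new_rel_ids(element, rel_id_map):
    """Deep-copy element and rewrite each a:blip r:embed id using rel_id_map."""
    clone = copy.deepcopy(element)
    for blip in clone.iter(BLIP):
        old_id = blip.get(EMBED)
        if old_id in rel_id_map:
            blip.set(EMBED, rel_id_map[old_id])
    return clone


# Hypothetical fragment standing in for a shape taken from the source document.
SAMPLE = f'<a:blip xmlns:a="{A_NS}" xmlns:r="{R_NS}" r:embed="rId20"/>'

source = ET.fromstring(SAMPLE)
# Assumed mapping: old id in the source document -> new id in the target document.
clone = copy_with_new_rel_ids(source, {"rId20": "rId7"})
print(clone.get(EMBED), source.get(EMBED))  # rId7 rId20
```

The source element is left untouched, so the original document can still be saved unchanged. lxml elements expose the same iter, get and set calls, so the helper should carry over to python-docx elements, though that is untested here.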