Shape numbers/indexes of each pptx slide within existing presentation

Question:

I am new to the python-pptx library and my question is: how can I define the list of shapes, the shape numbers/indexes (shape tree), and the shape types on each slide of an existing presentation using python-pptx? I would like to update an existing presentation, and it seems that the first step would be to locate the exact shape identifiers on each slide so I can access those shapes for the updates. Could you point me to an existing solution or some examples?

Asked By: Sveta


Answers:

I assume by "define" you mean something like "discover", since there’s not usually a good reason to change the existing values.

A good way to start is by looping through and printing some attributes:

from pptx import Presentation

prs = Presentation("my-deck.pptx")
for slide in prs.slides:
    for shape in slide.shapes:
        print("id: %s, type: %s" % (shape.shape_id, shape.shape_type))

You can get as elaborate as you want with this, using any of the slide and/or shape attributes listed in the API documentation here:
https://python-pptx.readthedocs.io/en/latest/api/shapes.html#shape-objects-in-general
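
As a slightly more elaborate sketch of the loop above, the helper below also records each shape's position within its slide's shape collection, along with its name and whether it can hold text. The name describe_shapes is hypothetical (it is not part of python-pptx); the code assumes only the shape_id, name, shape_type, and has_text_frame attributes that python-pptx documents for shapes, and "my-deck.pptx" is the same placeholder file name used above.

```python
def describe_shapes(prs):
    """Yield (slide_no, index, shape_id, name, shape_type, has_text) per shape."""
    # Slide numbers are 1-based here to match what PowerPoint displays.
    for slide_no, slide in enumerate(prs.slides, start=1):
        # enumerate() gives each shape's 0-based position in slide.shapes,
        # which is the order the shapes appear in the slide's shape tree.
        for index, shape in enumerate(slide.shapes):
            yield (slide_no, index, shape.shape_id, shape.name,
                   shape.shape_type, shape.has_text_frame)


# Usage (assumes python-pptx is installed and "my-deck.pptx" exists):
#     from pptx import Presentation
#     for row in describe_shapes(Presentation("my-deck.pptx")):
#         print(row)
```

Note that the index is only a position and can change when shapes are added, removed, or reordered, so the shape id or name is probably the safer handle for later updates.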

To look up a shape by id (or name) you need code like this:

def find_shape_by_id(shapes, shape_id):
    """Return shape by shape_id."""
    for shape in shapes:
        if shape.shape_id == shape_id:
            return shape
    return None

Or, if you are doing a lot of lookups, you can use a dict for that job:

shapes_by_id = {s.shape_id: s for s in shapes}

That then gives you all the handy dict operations, like:

>>> 7 in shapes_by_id
True
>>> shapes_by_id[7]
<pptx.shapes.Shape object at 0x...>
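
Since the goal in the question is to update an existing presentation, here is a minimal sketch of what the next step might look like once a shape has been located, this time by name. The helper name set_shape_text, the shape name "Title 1", and the output file name are all hypothetical; the code relies on the name, has_text_frame, and text_frame.text attributes of python-pptx shapes. Shape names are not necessarily unique on a slide, so this sketch simply updates the first match.

```python
def set_shape_text(shapes, name, new_text):
    """Replace the text of the first text-bearing shape called `name`.

    Returns True if a shape was updated, False otherwise.
    """
    for shape in shapes:
        # Shapes such as pictures have no text frame, so skip those.
        if shape.name == name and shape.has_text_frame:
            shape.text_frame.text = new_text
            return True
    return False


# Usage (assumptions: python-pptx is installed, "my-deck.pptx" exists, and
# its first slide contains a shape named "Title 1"):
#     from pptx import Presentation
#     prs = Presentation("my-deck.pptx")
#     set_shape_text(prs.slides[0].shapes, "Title 1", "Updated title")
#     prs.save("my-deck-updated.pptx")
```

Saving under a new file name keeps the original deck untouched while you check the result.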
Answered By: scanny

Here is my own function to get information on the key elements of a pptx presentation.

import pandas as pd
from pptx import Presentation


def presentation_elements(ppt_path):
    """Get all elements in a PowerPoint presentation.

    Parameters
    ----------
    ppt_path : str / Path
        full path to powerpoint file

    Returns
    -------
    elements : pd.DataFrame
        information on all elements in dataframe

    Notes
    -----
    Slide No follows the PowerPoint convention (1-based); use Slide Id to
    address a slide.

    To verify the correct shapes in PowerPoint, use:
    Home > Arrange > Selection Pane ...
    """

    ppt = Presentation(ppt_path)

    elements = []
    for num, slide in enumerate(ppt.slides):
        for shape in slide.shapes:
            shape_info = pd.Series(
                [num + 1, slide.name, slide.slide_id,
                 shape.name, shape.shape_id, shape.shape_type],
                index=['Slide No', 'Slide Name', 'Slide Id',
                       'Shape Name', 'Shape Id', 'Shape Type'])

            elements.append(shape_info)

    elements = pd.concat(elements, axis=1).T

    return elements
Answered By: UCCH