USPS Package Track API is not returning XML child elements for TrackSummary

Question:

Please see the temporary solution at the end.

Summary (added 12/24/22 for clarification):

USPS’s tracking API is not returning responses in the format shown in its documentation. The actual format makes it difficult to extract the event date because there is no EventDate XML element. Worst case, I can use regex, but I was wondering whether there is a way to receive API responses as shown in USPS’s documentation.

Details

In USPS’s Track and Confirm API documentation page 19, the sample response shows <TrackSummary> with child elements (<EventTime>, <EventDate>, etc.):


Here’s USPS’s sample response in text:

<TrackResponse>
 <TrackInfo ID=" XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX ">
 <GuaranteedDeliveryDate>June 24, 2022</GuaranteedDeliveryDate>
 <TrackSummary>
 <EventTime>9:00 am</EventTime>
 <EventDate>June 22, 2022</EventDate>
 <Event>Delivered, To Agent</Event>
 <EventCity>AMARILLO</EventCity>
 <EventState>TX</EventState>
 <EventZIPCode>79109</EventZIPCode>
 <EventCountry/>
 <FirmName/>
 <Name>RXXXXXX XXXXXXX</Name>
 <AuthorizedAgent>false</AuthorizedAgent>
 <DeliveryAttributeCode>23</DeliveryAttributeCode>
 <GMT>14:00:00</GMT>
 <GMTOffset>-05:00</GMTOffset>
 </TrackSummary>

However, when I perform the call, the actual XML response lacks these child elements within TrackSummary:

<?xml version="1.0" encoding="UTF-8"?>
<TrackResponse>
    <TrackInfo ID="9405511206213782679396">
        <TrackSummary>Your item departed our WEST PALM BEACH FL DISTRIBUTION CENTER destination facility on December 23, 2022 at 12:40 pm. The item is currently in transit to the destination.</TrackSummary>
        <TrackDetail>Arrived at USPS Regional Facility, December 23, 2022, 4:49 am, WEST PALM BEACH FL DISTRIBUTION CENTER</TrackDetail>
        <TrackDetail>In Transit to Next Facility, 12/22/2022, 9:41 pm</TrackDetail>
        <TrackDetail>In Transit to Next Facility, 12/22/2022, 1:36 pm</TrackDetail>
        <TrackDetail>Departed USPS Facility, 12/22/2022, 5:58 am, HARRISBURG, PA 17112</TrackDetail>
        <TrackDetail>Arrived at USPS Regional Origin Facility, 12/21/2022, 10:12 pm, HARRISBURG PA PACKAGE SORTING CENTER</TrackDetail>
        <TrackDetail>Departed Post Office, December 21, 2022, 4:34 pm, DALLASTOWN, PA 17313</TrackDetail>
        <TrackDetail>USPS picked up item, December 21, 2022, 2:37 pm, DALLASTOWN, PA 17313</TrackDetail>
        <TrackDetail>Shipping Label Created, USPS Awaiting Item, December 21, 2022, 2:16 pm, DALLASTOWN, PA 17313</TrackDetail>
    </TrackInfo>
</TrackResponse>

This can be reproduced with Lob’s USPS Postman workspace.

The problem I’m trying to solve is obtaining the date from the TrackSummary data, which now requires regex since USPS’s API is not returning an EventDate child element.

Is there an option when making the request to return these helpful XML child elements? I couldn’t find one in the documentation and the sample responses I’ve seen all contain these child elements.

I’ve tried forming the request in Python and with Lob’s USPS workspace and both XML responses lack the TrackSummary child elements.
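To make the difference between the two shapes concrete, here is a minimal sketch that reads a TrackSummary whether it arrives with child elements (as documented) or as a single sentence (as actually returned). The helper name `summarize` and the two trimmed sample documents are illustrative assumptions, not part of the USPS API.

```python
import xml.etree.ElementTree as ET


def summarize(track_info: ET.Element) -> dict:
    """Return the TrackSummary as a dict, whichever response shape came back."""
    summary = track_info.find("TrackSummary")
    if summary is None:
        return {}
    if len(summary):  # child elements present: the documented shape
        return {child.tag: (child.text or "") for child in summary}
    # plain-text shape: only a sentence is available
    return {"Text": (summary.text or "").strip()}


# Trimmed-down samples of the two shapes shown above (hypothetical IDs).
fields_xml = """<TrackResponse><TrackInfo ID="X"><TrackSummary>
<EventTime>9:00 am</EventTime><EventDate>June 22, 2022</EventDate>
</TrackSummary></TrackInfo></TrackResponse>"""

text_xml = """<TrackResponse><TrackInfo ID="Y">
<TrackSummary>Out for Delivery, December 13, 2021, 6:10 am, ARLINGTON, VA 22204.</TrackSummary>
</TrackInfo></TrackResponse>"""

for doc in (fields_xml, text_xml):
    info = ET.fromstring(doc).find("TrackInfo")
    print(summarize(info))
```

With the documented shape the date is available directly under the `EventDate` key; with the plain-text shape only the sentence is available and the date has to be parsed out of it.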

Long-term solution (in progress 12/26/22)

@Parfait pointed out that I should use the Package Tracking “Fields” API instead of the Package Track API.

Here’s how I’m currently forming the XML request with Package Track API:

from lxml import etree

def generate_url_tracking(tracking_numbers: list[str]) -> str:
    """generate the USPS tracking request url
    :param: tracking_numbers - list of strings of tracking numbers
    :return url: str tracking url for calling the USPS API
    """
    xml = generate_xml_tracking(tracking_numbers)
    url = f"{base_url}{url_vars['track']}{xml}"
    return url

def generate_xml_tracking(tracking_numbers: list[str]) -> str:
    """
    Generate USPS track and confirm API xml
    :param tracking_numbers: list of strings of tracking numbers
    :return: xml string
    """
    xml = etree.Element("TrackRequest", {"USERID": config("USPS_USER")})
    # loop through tracking numbers
    for tracking in tracking_numbers:
        etree.SubElement(xml, "TrackID", {"ID": tracking})
    xml_string = etree.tostring(xml, encoding="utf8", method="xml").decode()
    return xml_string

I’ll update this to the Package Tracking “Fields” API request when I get time.
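In the meantime, here is a rough sketch of what the “Fields” request might look like. The element names `TrackFieldRequest`, `Revision`, `ClientIp`, and `SourceId` are my reading of the USPS Track and Confirm documentation and have not been verified against a live call; the function name, the default values, and the user ID below are placeholders. It uses the standard-library `xml.etree.ElementTree` instead of `lxml` so that it runs without extra dependencies.

```python
import xml.etree.ElementTree as ET


def generate_xml_track_fields(
    tracking_numbers: list,
    user_id: str,
    client_ip: str = "127.0.0.1",   # placeholder value
    source_id: str = "example",     # placeholder value
) -> str:
    """Build a TrackFieldRequest XML string (element names are assumptions)."""
    root = ET.Element("TrackFieldRequest", {"USERID": user_id})
    ET.SubElement(root, "Revision").text = "1"
    ET.SubElement(root, "ClientIp").text = client_ip
    ET.SubElement(root, "SourceId").text = source_id
    # one TrackID element per tracking number, as in the Package Track request
    for tracking in tracking_numbers:
        ET.SubElement(root, "TrackID", {"ID": tracking})
    return ET.tostring(root, encoding="unicode")


xml = generate_xml_track_fields(["9405511206213782679396"], user_id="MY_USER_ID")
print(xml)
```

If the assumptions hold, the response to this request should contain the TrackSummary child elements shown in the documentation.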

Temporary Solution (12/25/22)

Until USPS’s actual responses match their API docs, this solution extracts the last-updated date from <TrackSummary> for several different statuses (pre-shipment, delivered, RTS, etc.).

The TRACK_SUMMARIES dict has the different statuses it’s tested against. Some statuses without dates (no_info, out_for_delivery_no_date) return None.

import re
from datetime import datetime
from typing import Optional

from dateutil.parser import ParserError, parse

TRACK_SUMMARIES = {
    "delivered": """Your
     item was delivered in or at the mailbox at 10:23 am on December 24, 2022 in HOBE SOUND, FL 33455.""",
    "out_for_delivery": "Out for Delivery, December 13, 2021, 6:10 am, ARLINGTON, VA 22204.",
    "out_for_delivery_no_date": "Out for Delivery, Expected Delivery Between 9:45am and 1:45pm",
    "arrived_at_post_office": """Arrived at Post Office,
     Arrived at USPS Regional Origin Facility, December 11, 2021, 9:23 pm, HARRISBURG PA PACKAGE SORTING CENTER""",
    "acceptance": "Acceptance, December 10, 2021, 12:54 pm, DALLASTOWN, PA 17313",
    "pre_shipment": "Pre-Shipment Info Sent to USPS, USPS Awaiting Item, December 27, 2021",
    "rts": """Your item was returned to the sender on January 31, 2022 at 9:14 am in YORK, PA 17402
     because of an incorrect address.""",
    "no_info": "The Postal Service could not locate the tracking information for your request",
    "label_prepared": "A shipping label has been prepared for your item at 10:47 am on December 16, 2021 in WINSTON",
    "forwarded": """Your item was forwarded to a different address at 5:13 pm on January 4, 2022
        in REDDING, CA. This was because of forwarding instructions or because the
        address or ZIP Code on the label was incorrect.
        """,
}

def get_last_updated(track_summary: str) -> Optional[datetime]:
    """Take the USPS TrackSummary string and return the last updated datetime"""
    # remove the 5-digit ZIP code since it interferes with the date parser
    track_summary = re.sub(r"\d{5}", "", track_summary)
    months_regex = "January|February|March|April|May|June|July|August|September|October|November|December"
    first_result = re.search(rf"(?={months_regex}).*", track_summary)
    # return early if there's no Month
    if not first_result:
        return
    first_result = first_result.group()
    # some summaries have am/pm and some don't
    result_for_parser = re.search(r".*(?<=am|pm)", first_result)
    if result_for_parser:
        result_for_parser = result_for_parser.group()
    else:
        result_for_parser = first_result
    try:
        # fuzzy parsing is required for dates in certain summaries
        result = parse(result_for_parser, fuzzy=True)
    except ParserError:
        return
    return result

Sources:

Using the dateutil parser
Regex for finding months

Asked By: Nathan Smeltzer


Answers:

xml.etree.ElementTree does a good job of finding a child element by XPath.

It provides only limited support for XPath expressions for locating elements in a tree, but that is enough to find the TrackSummary data.

To find the TrackSummary element anywhere below the root:

root.find(".//TrackSummary").text ->
Your item departed our WEST PALM BEACH FL DISTRIBUTION CENTER destination facility on December 23, 2022 at 12:40 pm. The item is currently in transit to the destination.
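The same limited XPath support also covers the TrackDetail lines. As a minimal sketch (the sample document is shortened from the question's response, and the variable names are only illustrative), `iter` walks every TrackInfo, `findtext` reads the summary, and `findall` collects the details:

```python
import xml.etree.ElementTree as ET

# Shortened copy of the response from the question.
document = """<TrackResponse>
    <TrackInfo ID="9405511206213782679396">
        <TrackSummary>Your item departed our WEST PALM BEACH FL DISTRIBUTION CENTER destination facility on December 23, 2022 at 12:40 pm. The item is currently in transit to the destination.</TrackSummary>
        <TrackDetail>Arrived at USPS Regional Facility, December 23, 2022, 4:49 am, WEST PALM BEACH FL DISTRIBUTION CENTER</TrackDetail>
        <TrackDetail>In Transit to Next Facility, 12/22/2022, 9:41 pm</TrackDetail>
    </TrackInfo>
</TrackResponse>"""

root = ET.fromstring(document)
for info in root.iter("TrackInfo"):
    tracking_id = info.get("ID")                       # the ID attribute
    summary = info.findtext("TrackSummary", default="")
    details = [d.text for d in info.findall("TrackDetail")]
    print(tracking_id, "has", len(details), "detail lines")
    print(summary)
```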

Here is a Python demo:

import xml.etree.ElementTree as ET
import datetime

document = """
<?xml version="1.0" encoding="UTF-8"?>
<TrackResponse>
    <TrackInfo ID="9405511206213782679396">
        <TrackSummary>Your item departed our WEST PALM BEACH FL DISTRIBUTION CENTER destination facility on December 23, 2022 at 12:40 pm. The item is currently in transit to the destination.</TrackSummary>
        <TrackDetail>Arrived at USPS Regional Facility, December 23, 2022, 4:49 am, WEST PALM BEACH FL DISTRIBUTION CENTER</TrackDetail>
        <TrackDetail>In Transit to Next Facility, 12/22/2022, 9:41 pm</TrackDetail>
        <TrackDetail>In Transit to Next Facility, 12/22/2022, 1:36 pm</TrackDetail>
        <TrackDetail>Departed USPS Facility, 12/22/2022, 5:58 am, HARRISBURG, PA 17112</TrackDetail>
        <TrackDetail>Arrived at USPS Regional Origin Facility, 12/21/2022, 10:12 pm, HARRISBURG PA PACKAGE SORTING CENTER</TrackDetail>
        <TrackDetail>Departed Post Office, December 21, 2022, 4:34 pm, DALLASTOWN, PA 17313</TrackDetail>
        <TrackDetail>USPS picked up item, December 21, 2022, 2:37 pm, DALLASTOWN, PA 17313</TrackDetail>
        <TrackDetail>Shipping Label Created, USPS Awaiting Item, December 21, 2022, 2:16 pm, DALLASTOWN, PA 17313</TrackDetail>
    </TrackInfo>
</TrackResponse>
"""

def find_between( s, first, last ):
    try:
        start = s.index( first ) + len( first )
        end = s.index( last, start )
        return s[start:end]
    except ValueError:
        return ""

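# strip() drops the leading newline; the XML declaration must be at the very start of the string
root = ET.fromstring(document.strip())

summary_text = root.find(".//TrackSummary").text
date_time_obj = datetime.datetime.strptime(find_between(summary_text, ' on ', '.'), '%B %d, %Y at %I:%M %p')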
print('Date:', date_time_obj.date())
print('Time:', date_time_obj.time())
print('Date-time:', date_time_obj)

Result

$ python track-summary.py
Date: 2022-12-23
Time: 12:40:00
Date-time: 2022-12-23 12:40:00

Update: regular-expression parsing

Based on the Temporary Solution (12/25/22) in your updated question,
I added a parsing step that uses the re library.

Code

import re
import numpy as np
from datetime import date, time, datetime

def get_date(date_string):
    months = np.array(['January','February','March','April','May','June','July','August','September','October','November','December'])
    pattern = re.compile(r'(January|February|March|April|May|June|July|August|September|October|November|December)\s(\d{2}|\d{1}),\s(\d{4})')
    match = re.search(pattern, date_string)
    if not match:
        d = None
    else:
        month_data = match.groups()[0]
        month = np.where(months==month_data)[0][0] + 1
        day = int(match.groups()[1])
        year = int(match.groups()[2])
        try:
            d = date(year, month, day)
        except ValueError:
            d = None  # or handle error in a different way
    return d

def get_hour_min(hour, min, am_pm):
    hour = int(hour)
    min = int(min)
    # convert the 12-hour clock to 24-hour: 12 am -> 0, 12 pm -> 12, 1 pm -> 13
    hour = hour % 12
    if am_pm == 'pm':
        hour += 12
    return [hour, min]

def get_time(date_string):
    pattern = re.compile(r'(\d{2}|\d{1}):(\d{2})\s*(am|pm)')
    matches = re.findall(pattern, date_string)
    if (len(matches) == 2):
        hour, min = get_hour_min(matches[0][0], matches[0][1], matches[0][2])
        start_t = time(hour, min, 0)
        hour, min = get_hour_min(matches[1][0], matches[1][1], matches[1][2])
        end_t = time(hour, min, 0)
        return [start_t, end_t]

    match = re.search(pattern, date_string)
    if not match:
        t = None
    else:
        hour, min = get_hour_min(match.groups()[0], match.groups()[1], match.groups()[2])
        try:
            t = time(hour, min, 0)
        except ValueError:
            t = None  # or handle error in a different way
    return [t, None]

TRACK_SUMMARIES = {
    "delivered": """Your
     item was delivered in or at the mailbox at 10:23 am on December 24, 2022 in HOBE SOUND, FL 33455.""",
    "out_for_delivery": "Out for Delivery, December 13, 2021, 6:10 am, ARLINGTON, VA 22204.",
    "out_for_delivery_no_date": "Out for Delivery, Expected Delivery Between 9:45am and 1:45pm",
    "arrived_at_post_office": """Arrived at Post Office,
     Arrived at USPS Regional Origin Facility, December 11, 2021, 9:23 pm, HARRISBURG PA PACKAGE SORTING CENTER""",
    "acceptance": "Acceptance, December 10, 2021, 12:54 pm, DALLASTOWN, PA 17313",
    "pre_shipment": "Pre-Shipment Info Sent to USPS, USPS Awaiting Item, December 27, 2021",
    "rts": """Your item was returned to the sender on January 31, 2022 at 9:14 am in YORK, PA 17402
     because of an incorrect address.""",
    "no_info": "The Postal Service could not locate the tracking information for your request",
    "label_prepared": "A shipping label has been prepared for your item at 10:47 am on December 16, 2021 in WINSTON",
    "forwarded": """Your item was forwarded to a different address at 5:13 pm on January 4, 2022
        in REDDING, CA. This was because of forwarding instructions or because the
        address or ZIP Code on the label was incorrect.
        """,
}

tracks = {}
# parsing and tuple list by key ( example : delivered, out_for_delivery and so on )
for key in TRACK_SUMMARIES:
    value = TRACK_SUMMARIES[key].replace("\n", "")
    found_date = get_date(value)
    start_time, end_time = get_time(value)
    tracks[key] = [ found_date, start_time, end_time, value ]
    # print(key, '->', value)
    # if (found_date != None):
    #     print('found date: ' + found_date.strftime("%m/%d/%Y"))
    # if (start_time != None):
    #     if(end_time == None):
    #         print('time: ' + start_time.strftime("%H:%M:%S"))
    #     else:
    #         print('start time: ' + start_time.strftime("%H:%M:%S") + ' end time: ' + end_time.strftime("%H:%M:%S"))
    # print('=========================================================================')

# decoding from tuple list by key ( tracks['delivered'], tracks['out_for_delivery'] and so on )
for key in tracks.keys():
    found_date, start_time, end_time, value = tracks[key]
    
    found_date = found_date.strftime("%m/%d/%Y") if found_date != None else None
    start_time = start_time.strftime("%H:%M:%S") if start_time != None else None
    end_time = end_time.strftime("%H:%M:%S") if end_time != None else None

    print(value)
    print(key)
    if (found_date != None):
        print('found date: ' + found_date)
    if (start_time != None):
        if(end_time == None):
            print('time: ' + start_time)
        else:
            print('start time: ' + start_time + ' end time: ' + end_time)
    print('------------------------------------------------------------------------')

Result

$ python reg-express.py
Your     item was delivered in or at the mailbox at 10:23 am on December 24, 2022 in HOBE SOUND, FL 33455.
delivered
found date: 12/24/2022
time: 10:23:00
------------------------------------------------------------------------
Out for Delivery, December 13, 2021, 6:10 am, ARLINGTON, VA 22204.
out_for_delivery
found date: 12/13/2021
time: 06:10:00
------------------------------------------------------------------------
Out for Delivery, Expected Delivery Between 9:45am and 1:45pm
out_for_delivery_no_date
start time: 09:45:00 end time: 13:45:00
------------------------------------------------------------------------
Arrived at Post Office,     Arrived at USPS Regional Origin Facility, December 11, 2021, 9:23 pm, HARRISBURG PA PACKAGE SORTING CENTER
arrived_at_post_office
found date: 12/11/2021
time: 21:23:00
------------------------------------------------------------------------
Acceptance, December 10, 2021, 12:54 pm, DALLASTOWN, PA 17313
acceptance
found date: 12/10/2021
time: 12:54:00
------------------------------------------------------------------------
Pre-Shipment Info Sent to USPS, USPS Awaiting Item, December 27, 2021
pre_shipment
found date: 12/27/2021
------------------------------------------------------------------------
Your item was returned to the sender on January 31, 2022 at 9:14 am in YORK, PA 17402     because of an incorrect address.
rts
found date: 01/31/2022
time: 09:14:00
------------------------------------------------------------------------
The Postal Service could not locate the tracking information for your request
no_info
------------------------------------------------------------------------
A shipping label has been prepared for your item at 10:47 am on December 16, 2021 in WINSTON
label_prepared
found date: 12/16/2021
time: 10:47:00
------------------------------------------------------------------------
Your item was forwarded to a different address at 5:13 pm on January 4, 2022        in REDDING, CA. This was because of forwarding instructions or because the        address or ZIP Code on the label was incorrect.
forwarded
found date: 01/04/2022
time: 17:13:00
------------------------------------------------------------------------

Date/time patterns

I extracted these date and time patterns from your TRACK_SUMMARIES dictionary.
Some entries have no date, and one has a "Between" time range.

10:23 am on December 24, 2022
December 13, 2021, 6:10 am
Between 9:45am and 1:45pm
December 10, 2021, 12:54 pm
December 27, 2021
January 31, 2022 at 9:14 am
at 10:47 am on December 16, 2021
at 5:13 pm on January 4, 2022

Date parsing

(January|February|March|April|May|June|July|August|September|October|November|December)\s(\d{2}|\d{1}),\s(\d{4})

The matched groups (month, day, year) are the ones used in the code.

Time parsing

(\d{2}|\d{1}):(\d{2})\s*(am|pm)

The matched groups (hour, minute, am/pm) are the ones used in the code.
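As an alternative to converting am/pm by hand, the matched pieces can be passed to `datetime.strptime` with `%I:%M %p`, which also handles 12 am and 12 pm. This is only a sketch: `find_times` is a hypothetical helper, and `%p` assumes an English (or C) locale.

```python
import re
from datetime import datetime, time
from typing import List

TIME_PATTERN = re.compile(r"(\d{1,2}):(\d{2})\s*(am|pm)", re.IGNORECASE)


def find_times(text: str) -> List[time]:
    """Return every am/pm time found in text, in order of appearance."""
    times = []
    for hour, minute, meridiem in TIME_PATTERN.findall(text):
        try:
            parsed = datetime.strptime(f"{hour}:{minute} {meridiem.upper()}", "%I:%M %p")
        except ValueError:  # e.g. "13:99 pm" is not a valid 12-hour time
            continue
        times.append(parsed.time())
    return times


print(find_times("Out for Delivery, Expected Delivery Between 9:45am and 1:45pm"))
print(find_times("Acceptance, December 10, 2021, 12:54 pm, DALLASTOWN, PA 17313"))
print(find_times("Pre-Shipment Info Sent to USPS, USPS Awaiting Item, December 27, 2021"))
```

Returning a list covers the "Between" case (two times), the usual case (one time), and summaries without a time (empty list) in the same way.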

References

Find string between two substrings

Converting Strings Using datetime

Regexper

regular expression 101

Answered By: Bench Vue