Cannot parse SQS JSON response

Question:

I have created a consumer to gather bounces and complaints in an SQS Queue. It is working for the majority of my messages, but there are some messages coming in that are throwing an Exception.

The JSON output that works:

{"notificationType":"Complaint","complaint":{"feedbackId":"0100018217177664-7d73c160-0be8-49aa-b9c4-bd1d1aebd362-000000","complaintSubType":null,"complainedRecipients":[{"emailAddress":"[email protected]"}],"timestamp":"2022-07-19T15:33:09.000Z","userAgent":"Yahoo!-Mail-Feedback/2.0","complaintFeedbackType":"abuse","arrivalDate":"2022-07-08T12:58:35.000Z"},"mail":{"timestamp":"2022-07-08T12:58:34.633Z","source":"[email protected]","sourceArn":"arn:aws:ses:us-east-1:669689702539:identity/example.com","sourceIp":"12.34.56.789","callerIdentity":"ses-smtp-user.20190509-104648","sendingAccountId":"1234567890","messageId":"01000181dde3fb09-77fa66a7-6425-4a17-974c-5b49c8ab930d-000000","destination":["[email protected]"],"headersTruncated":false,"headers":[{"name":"Received","value":"from mail-aws-va-1 ([12.34.56.789]) by email-smtp.amazonaws.com with SMTP (SimpleEmailService-d-9BN0NGUJI) id MfVVhu8nS86Hc4osVk6w for [email protected]; Fri, 08 Jul 2022 12:58:34 +0000 (UTC)"},{"name":"Received","value":"by mail-aws-va-1 (Postfix, from userid 111) id E1DA561FD2; Fri,  8 Jul 2022 07:57:50 -0500 (CDT)"},{"name":"Received","value":"from localhost.localdomain (some-server [192.168.20.2]) by mail-aws-va-1 (Postfix) with ESMTP id E5776621E0 for <[email protected]>; Fri,  8 Jul 2022 07:57:16 -0500 (CDT)"},{"name":"Date","value":"Fri, 8 Jul 2022 07:57:16 -0500"},{"name":"To","value":"[email protected]"},{"name":"From","value":""Some Gift Shop Inc." <[email protected]>"},{"name":"Reply-to","value":""Some Gift Shop Inc." 
<[email protected]>"},{"name":"Subject","value":"Some Subject"},{"name":"Message-ID","value":"<[email protected]>"},{"name":"X-Priority","value":"3"},{"name":"X-Mailer","value":"PHPMailer (phpmailer.sourceforge.net) [version 2.0.0 rc1]"},{"name":"X-Sender","value":"[email protected]"},{"name":"List-Unsubscribe","value":"<mailto:[email protected]>, <https://www.bedfordflorist.net/ecamp/unsubscribe/5789/592848>"},{"name":"MIME-Version","value":"1.0"},{"name":"Content-Transfer-Encoding","value":"8bit"},{"name":"Content-Type","value":"text/html; charset="iso-8859-1""}],"commonHeaders":{"from":[""Some Gift Shop Inc." <[email protected]>"],"replyTo":[""Some Gift Shop Inc." <[email protected]>"],"date":"Fri, 8 Jul 2022 07:57:16 -0500","to":["[email protected]"],"messageId":"<[email protected]>","subject":"Some Subject"}}}

I can grab all the objects I need with the following:

import json
from datetime import datetime

# queue is assumed to be a boto3 SQS Queue resource created elsewhere
for message in queue.receive_messages(WaitTimeSeconds=10, MaxNumberOfMessages=10):
    try:
        # Grab the json objects
        body = json.loads(message.body)
        headers = body['mail']['headers'][13]['value']
        email = body['complaint']['complainedRecipients'][0]['emailAddress']
        timestamp = body['complaint']['timestamp']
        formatted_time = datetime.strptime(timestamp, '%Y-%m-%dT%H:%M:%S.%fZ').strftime("%Y-%m-%d %H:%M:%S")
        subject = body['mail']['commonHeaders']['subject']
        region = body['mail']['sourceArn'].split(':')[3]
        ecamp_id = headers.split('|')[0]
        client_id = headers.split('|')[1]
        ecamp_source = headers.split('|')[2]
    except KeyError as e:
        print("Key Error: " + str(e))
    except IndexError as i:
        print("Index Error: " + str(i))

However, this other JSON is giving me problems:

{
  "Type" : "Notification",
  "MessageId" : "99acffbc-585e-5f26-a9d0-a4da29d03df7",
  "TopicArn" : "arn:aws:sns:us-west-2:669689702539:sns-oregon-complaint-topic",
  "Message" : "{"notificationType":"Complaint","complaint":{"feedbackId":"010101010101010101010-blahblah","complaintSubType":null,"complainedRecipients":[{"emailAddress":"[email protected]"}],"timestamp":"2022-08-05T01:15:58.000Z","userAgent":"Yahoo!-Mail-Feedback/2.0","complaintFeedbackType":"abuse","arrivalDate":"2022-08-05T00:51:34.000Z"},"mail":{"timestamp":"2022-08-05T00:51:32.988Z","source":"[email protected]","sourceArn":"arn:aws:ses:us-west-2:1234567890:identity/example.com","sourceIp":"12.34.56.789","callerIdentity":"mail-aws-or-1","sendingAccountId":"123456789","messageId":"010101826b7c6dfc-95c44e97-db49-467b-a65c-e27344874d74-000000","destination":["[email protected]"],"headersTruncated":false,"headers":[{"name":"Received","value":"from mail-aws-or-1 ([12.34.56.789]) by email-smtp.amazonaws.com with SMTP (SimpleEmailService-d-SPTLPQQAI) id 1uYbytwalUOhARFznQm0 for [email protected]; Fri, 05 Aug 2022 00:51:32 +0000 (UTC)"},{"name":"Received","value":"by mail-aws-or-1 (Postfix, from userid 111) id 48D951A294F; Thu,  4 Aug 2022 19:51:06 -0500 (CDT)"},{"name":"Received","value":"from localhost.localdomain (server_name [192.168.20.2]) by mail-aws-or-1 (Postfix) with ESMTP id AEA191A2932 for <[email protected]>; Thu,  4 Aug 2022 19:51:04 -0500 (CDT)"},{"name":"Date","value":"Thu, 4 Aug 2022 19:51:04 -0500"},{"name":"To","value":"[email protected]"},{"name":"From","value":"Some Business Name <[email protected]>"},{"name":"Reply-to","value":"Some Business Name <[email protected]>"},{"name":"Subject","value":"Some Subject'!"},{"name":"Message-ID","value":"<[email protected]>"},{"name":"X-Priority","value":"3"},{"name":"X-Mailer","value":"PHPMailer (phpmailer.sourceforge.net) [version 2.0.0 rc1]"},{"name":"X-Sender","value":"[email protected]"},{"name":"List-Unsubscribe","value":"<mailto:[email protected]>, 
<https://www.somebusinessname.com/ecamp/unsubscribe/7331/7388957>"},{"name":"Content-Description","value":"7331|347977|SomeClassName.class"},{"name":"MIME-Version","value":"1.0"},{"name":"Content-Transfer-Encoding","value":"8bit"},{"name":"Content-Type","value":"text/html; charset=\"iso-8859-1\""}],"commonHeaders":{"from":["Petal Perfect Flower Shop <[email protected]>"],"replyTo":["Some Business Name <[email protected]>"],"date":"Thu, 4 Aug 2022 19:51:04 -0500","to":["[email protected]"],"messageId":"<[email protected]>","subject":"Some Subject'!"}}}",
  "Timestamp" : "2022-08-05T01:15:58.917Z",
  "SignatureVersion" : "1",
  "Signature" : "xxsdflkj0983244r098ujsadflkjlkjawe0rjoisajdf09ui234rijlksadf",
  "SigningCertURL" : "https://sns.us-west-2.amazonaws.com/SimpleNotificationService-213847068461846841354.pem",
  "UnsubscribeURL" : "https://sns.us-west-2.amazonaws.com/?Action=Unsubscribe&SubscriptionArn=arn:aws:sns:us-west-2:669689702539:sns-oregon-complaint-topic:8c681b37-39c8-40b6-8b8e-16e58ef374c2"
}

I’m getting Key Error: 'mail'

When I check the type with print(type(body['Message'])) I’m getting <class 'str'>. So, in order to correctly parse these other types of messages, do I need to resort to regex? Is there a way to change this string to a dictionary so I can still access its elements through body['Message']?

Asked By: DevOpsSauce

||

Answers:

The reason is that the Message field of your second response is a string containing escaped JSON, so 'mail' is not a top-level key of body.

You should check for availability of keys which are optional. This can be done in several ways, here is one possible solution:

body = json.loads(message.body)
#...
if 'mail' in body :
    mail = body['mail']
#...

See also this question for more details and other options.
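As one of those other options, dict.get returns None (or a default you supply) instead of raising KeyError. The sketch below is only an illustration of that pattern; the helper name extract_subject is hypothetical and not part of the original code.

```python
def extract_subject(body):
    """Return the mail subject, or None when 'mail' or a nested key is absent."""
    mail = body.get("mail")
    if mail is None:
        # e.g. an SNS envelope, where 'mail' is not a top-level key
        return None
    # Chain .get with an empty-dict default so a missing level yields None
    return mail.get("commonHeaders", {}).get("subject")
```

With a body that has no 'mail' key this returns None, so the caller can decide whether to skip the message or handle it differently.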

To parse the escaped string, you need to load this string as JSON again:

message = json.loads(body['Message'])
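
Since the queue delivers both shapes, it may be convenient to normalise each body before reading any keys. The following is a minimal sketch under that assumption; unwrap_notification is a hypothetical helper name, and the payloads are heavily trimmed stand-ins for the ones in the question. The difference between the two shapes is presumably down to the raw message delivery setting on the SNS subscriptions, but that is a guess based on the samples.

```python
import json


def unwrap_notification(raw_body):
    """Return the SES notification dict from an SQS message body.

    Handles both shapes shown in the question: the bare SES notification
    and the SNS envelope whose 'Message' field holds it as a JSON string.
    """
    body = json.loads(raw_body)
    inner = body.get("Message")
    if body.get("Type") == "Notification" and isinstance(inner, str):
        # SNS envelope: decode the embedded JSON string a second time.
        return json.loads(inner)
    return body


# Hypothetical, trimmed payloads for illustration only.
bare = json.dumps({
    "notificationType": "Complaint",
    "mail": {"commonHeaders": {"subject": "Some Subject"}},
})
wrapped = json.dumps({"Type": "Notification", "Message": bare})

for raw in (bare, wrapped):
    notification = unwrap_notification(raw)
    print(notification["mail"]["commonHeaders"]["subject"])
```

In the consumer loop, body = unwrap_notification(message.body) would then replace the plain json.loads call, and the rest of the key lookups stay unchanged.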
Answered By: Fruchtzwerg