How to access file in /tmp directory of AWS Lambda
Question:
I have downloaded a file from a URL into the /tmp directory of AWS Lambda (as this is the only writable path in Lambda).
My goal is to create an Alexa skill that downloads a file from a URL, so I created a Lambda function.
How can I access the downloaded file in the /tmp folder from within Lambda?
My code is:
#!/usr/bin/python
# -*- coding: utf-8 -*-
from __future__ import print_function
import xml.etree.ElementTree as etree
from datetime import datetime as dt
import os
import urllib
import requests
from urllib.parse import urlparse


def lambda_handler(event, context):
    """ Route the incoming request based on type (LaunchRequest, IntentRequest,
    etc.) The JSON body of the request is provided in the event parameter.
    """
    print('event.session.application.applicationId='
          + event['session']['application']['applicationId'])
    # if (event['session']['application']['applicationId'] !=
    #         "amzn1.echo-sdk-ams.app.[unique-value-here]"):
    #     raise ValueError("Invalid Application ID")
    if event['session']['new']:
        on_session_started({'requestId': event['request']['requestId']},
                           event['session'])
    if event['request']['type'] == 'LaunchRequest':
        return on_launch(event['request'], event['session'])
    elif event['request']['type'] == 'IntentRequest':
        return on_intent(event['request'], event['session'])
    elif event['request']['type'] == 'SessionEndedRequest':
        return on_session_ended(event['request'], event['session'])


def on_session_started(session_started_request, session):
    """ Called when the session starts """
    print('on_session_started requestId='
          + session_started_request['requestId'] + ', sessionId='
          + session['sessionId'])


def on_launch(launch_request, session):
    """ Called when the user launches the skill without specifying what they
    want
    """
    print('on_launch requestId=' + launch_request['requestId']
          + ', sessionId=' + session['sessionId'])
    # Dispatch to your skill's launch
    return get_welcome_response()


def on_intent(intent_request, session):
    """ Called when the user specifies an intent for this skill """
    print('on_intent requestId=' + intent_request['requestId']
          + ', sessionId=' + session['sessionId'])
    intent = intent_request['intent']
    intent_name = intent_request['intent']['name']
    # Dispatch to your skill's intent handlers
    if intent_name == 'DownloadFiles':
        return get_file(intent, session)
    elif intent_name == 'AMAZON.HelpIntent':
        return get_welcome_response()
    else:
        raise ValueError('Invalid intent')


def on_session_ended(session_ended_request, session):
    """ Called when the user ends the session. Is not called when the skill returns should_end_session=true """
    print('on_session_ended requestId='
          + session_ended_request['requestId'] + ', sessionId='
          + session['sessionId'])
    # add cleanup logic here


# --------------- Functions that control the skill's behavior ------------------
def get_welcome_response():
    """ If we wanted to initialize the session to have some attributes we could add those here """
    session_attributes = {}
    card_title = 'Welcome'
    speech_output = (
        "Welcome to file download Application. Please ask me to download "
        "files by saying, Ask auto downloader for download")
    # If the user either does not reply to the welcome message or says something
    # that is not understood, they will be prompted again with this text.
    reprompt_text = (
        "Please ask me to download files by saying, "
        "Ask auto downloader for download")
    should_end_session = False
    return build_response(session_attributes,
                          build_speechlet_response(card_title,
                                                   speech_output, reprompt_text,
                                                   should_end_session))


def get_file(intent, session):
    """ Grabs the files from the path that have to be downloaded """
    card_title = intent['name']
    session_attributes = {}
    should_end_session = True
    # The session ends after this response, so there is nothing to reprompt.
    reprompt_text = None
    username = '*******'
    password = '*******'
    url = 'https://drive.google.com/drive/my-drive/abc.pdf'
    filename = os.path.basename(urlparse(url).path)
    # urllib.urlretrieve(url, "code.zip")
    r = requests.get(url, auth=(username, password), stream=True)
    if r.status_code == 200:
        with open("/tmp/" + filename, 'wb') as out:
            # iter_content() defaults to 1-byte chunks, so pass a larger size.
            for bits in r.iter_content(chunk_size=8192):
                out.write(bits)
        speech_output = 'The file ' + filename + ' has been downloaded'
    else:
        speech_output = 'The file ' + filename + ' could not be downloaded'
    return build_response(session_attributes,
                          build_speechlet_response(card_title,
                                                   speech_output, reprompt_text,
                                                   should_end_session))


# --------------- Helpers that build all of the responses ----------------------
def build_speechlet_response(
    title,
    output,
    reprompt_text,
    should_end_session,
):
    return {
        'outputSpeech': {'type': 'PlainText', 'text': output},
        'card': {'type': 'Simple',
                 'title': 'SessionSpeechlet - ' + title,
                 'content': 'SessionSpeechlet - ' + output},
        'reprompt': {'outputSpeech': {'type': 'PlainText',
                                      'text': reprompt_text}},
        'shouldEndSession': should_end_session,
    }


def build_response(session_attributes, speechlet_response):
    return {'version': '1.0', 'sessionAttributes': session_attributes,
            'response': speechlet_response}
Answers:
Just open the file as you would do normally:
with open('/tmp/' + filename, 'rb') as file:
    ...
This works all the time for me. Have you tried this and experienced any issues?
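As a slightly fuller illustration of the same idea, here is a minimal sketch of a read helper. The function name read_tmp_file and the tmp_dir parameter are hypothetical, not part of the question's code; the only thing it relies on is the standard open() call shown above.

```python
import os

TMP_DIR = "/tmp"  # Lambda's writable scratch directory


def read_tmp_file(filename, tmp_dir=TMP_DIR):
    """Return the bytes of a file previously written to tmp_dir.

    Hypothetical helper for illustration only.
    """
    # basename() keeps names such as "../abc.pdf" from escaping tmp_dir.
    path = os.path.join(tmp_dir, os.path.basename(filename))
    with open(path, "rb") as f:
        return f.read()
```

Inside get_file this could be called as read_tmp_file(filename) once the download has finished. For large files it is probably better to iterate over the open file object than to read everything into memory.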
Please note that Lambda runs in a container. Once you have downloaded the file, it stays in the /tmp folder for as long as that container lives. After a container is launched to serve your function, it typically lives for 10-30 minutes (it could be less or more; this is not an officially fixed time). So, instead of always downloading the file, you should first check whether it is already in the /tmp directory. If it is, you obviously don't have to download it again! 😉
To make this check use:
if not os.path.isfile('/tmp/' + filename):
    ...  # download the file here
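Putting the check and the download together might look roughly like the sketch below. The names fetch_to_path and ensure_downloaded are hypothetical, and the sketch uses the standard library's urllib.request instead of requests so that it has no third-party dependency; it also does not handle the basic authentication used in the question. Treat it as one possible shape under those assumptions, not a definitive implementation.

```python
import os
import shutil
import urllib.request


def fetch_to_path(url, path):
    """Stream the body of url into path (no authentication handled here)."""
    with urllib.request.urlopen(url) as resp, open(path, "wb") as out:
        shutil.copyfileobj(resp, out)


def ensure_downloaded(url, filename, tmp_dir="/tmp", fetch=fetch_to_path):
    """Return the local path of filename, downloading only on a cache miss.

    Hypothetical helper: `fetch` is any callable taking (url, path).
    """
    path = os.path.join(tmp_dir, os.path.basename(filename))
    if not os.path.isfile(path):
        # Write to a temporary name first so a failed download never
        # leaves a truncated file that later looks like a cache hit.
        partial = path + ".part"
        fetch(url, partial)
        os.replace(partial, path)
    return path
```

Because a later invocation may land in a fresh container with an empty /tmp, the function should always go through a check like this rather than assume the file is present.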
For anyone who ends up here and is using a Docker image as the Lambda function: AWS cleans out /tmp either when the Docker image is uploaded to ECR or when the Lambda function is executed.
This means that if you rely on anything being in /tmp (e.g. you copy a file to /tmp in your Dockerfile), the Docker image will run fine locally and contain the file in /tmp as expected, but you will get an error when you run the same image as a Lambda function in AWS, because the file is not in /tmp.
I would suggest putting the file in the LAMBDA_TASK_ROOT (these days that is the /var/task directory), which is read-only. If you need to modify this file, then I would read it from the LAMBDA_TASK_ROOT directory, and write it to /tmp.
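A minimal sketch of that read-from-task-root, write-to-/tmp pattern follows. The function name writable_copy is hypothetical, and the fallback to /var/task when LAMBDA_TASK_ROOT is unset is an assumption based on the path mentioned above.

```python
import os
import shutil


def writable_copy(filename, tmp_dir="/tmp"):
    """Copy a bundled read-only file into tmp_dir and return the new path.

    Hypothetical helper for illustration only.
    """
    # Assumption: fall back to /var/task if LAMBDA_TASK_ROOT is not set.
    task_root = os.environ.get("LAMBDA_TASK_ROOT", "/var/task")
    src = os.path.join(task_root, filename)
    dst = os.path.join(tmp_dir, os.path.basename(filename))
    if not os.path.isfile(dst):
        # copyfile() copies contents only, not permission bits, so the
        # copy does not inherit a read-only mode from the source.
        shutil.copyfile(src, dst)
    return dst
```

The returned path can then be opened for writing, while the original under the task root stays untouched.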