Best way to generate random file names in Python
Question:
In Python, what is a good (or the best) way to generate some random text to prepend to the name of a file that I'm saving to a server, just to make sure it does not overwrite an existing file? Thank you!
Answers:
Python has facilities to generate temporary file names, see http://docs.python.org/library/tempfile.html. For instance:
In [4]: import tempfile
Each call to tempfile.NamedTemporaryFile() results in a different temp file, and its name can be accessed with the .name attribute, e.g.:
In [5]: tf = tempfile.NamedTemporaryFile()
In [6]: tf.name
Out[6]: 'c:\blabla\locals~1\temp\tmptecp3i'
In [7]: tf = tempfile.NamedTemporaryFile()
In [8]: tf.name
Out[8]: 'c:\blabla\locals~1\temp\tmpr8vvme'
Once you have the unique filename it can be used like any regular file. Note: By default the file will be deleted when it is closed. However, if the delete parameter is False, the file is not automatically deleted.
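Full parameter set (Python 2 signature; Python 3 renames bufsize to buffering and adds further parameters such as encoding and newline):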
tempfile.NamedTemporaryFile([mode='w+b'[, bufsize=-1[, suffix=''[, prefix='tmp'[, dir=None[, delete=True]]]]]])
It is also possible to specify the prefix for the temporary file (one of the parameters that can be supplied when the file is created):
In [9]: tf = tempfile.NamedTemporaryFile(prefix="zz")
In [10]: tf.name
Out[10]: 'c:\blabla\locals~1\temp\zzrc3pzk'
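To tie this back to the question of saving a file on a server without overwriting anything, here is a minimal sketch. The function name save_with_unique_name and its default prefix and suffix are made up for illustration; only tempfile.NamedTemporaryFile and its mode, prefix, suffix, dir and delete parameters come from the standard library.

```python
import os
import tempfile


def save_with_unique_name(data: bytes, directory: str,
                          prefix: str = "upload_", suffix: str = ".bin") -> str:
    """Write data to a new, uniquely named file in directory and return its path.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    # delete=False keeps the file on disk after the with-block closes it.
    with tempfile.NamedTemporaryFile(
        mode="wb", prefix=prefix, suffix=suffix, dir=directory, delete=False
    ) as tf:
        tf.write(data)
        return tf.name
```

Because the library picks the name and creates the file in the same step, an existing file in the directory is never reused.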
You could use the UUID module for generating a random string:
import uuid
filename = str(uuid.uuid4())
This is a valid choice, given that a UUID generator is extremely unlikely to produce a duplicate identifier (a file name, in this case):
Only after generating 1 billion UUIDs every second for the next 100 years, the probability of creating just one duplicate would be about 50%. The probability of one duplicate would be about 50% if every person on earth owns 600 million UUIDs.
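Since the question asks for text to prepend to an existing name, a small sketch of that step may help. The helper name uuid_prefixed and the underscore separator are assumptions for illustration; uuid.uuid4().hex is the standard 32-character hex form of a random UUID.

```python
import os
import uuid


def uuid_prefixed(filename: str) -> str:
    """Return filename with a random UUID4 hex string prepended to its base name.

    Hypothetical helper; the "_" separator is an arbitrary choice.
    """
    directory, base = os.path.split(filename)
    # .hex drops the dashes, giving a fixed-length 32-character prefix.
    return os.path.join(directory, f"{uuid.uuid4().hex}_{base}")
```

Keeping the original base name at the end preserves the extension, so the file type stays recognisable.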
A common approach is to add a timestamp as a prefix/suffix to the filename to have some temporal relation to the file. If you need more uniqueness you can still add a random string to this.
import datetime
basename = "mylogfile"
suffix = datetime.datetime.now().strftime("%y%m%d_%H%M%S")
filename = "_".join([basename, suffix]) # e.g. 'mylogfile_120508_171442'
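The "add a random string" step mentioned above could look roughly like this. It is only a sketch: the function name timestamped_name, the extension parameter and the 4-byte token length are assumptions, while secrets.token_hex is a standard-library call.

```python
import datetime
import secrets


def timestamped_name(basename: str, extension: str = "") -> str:
    """Build 'basename_YYMMDD_HHMMSS_token' plus an optional extension.

    Hypothetical helper; the token length is an arbitrary choice.
    """
    stamp = datetime.datetime.now().strftime("%y%m%d_%H%M%S")
    # 4 random bytes rendered as 8 hex characters separate names
    # that are created within the same second.
    token = secrets.token_hex(4)
    return f"{basename}_{stamp}_{token}{extension}"
```

The timestamp keeps the names sortable by creation time, and the token covers the case where two files are created in the same second.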
If you want to preserve the original filename as part of the new filename, prefixes of uniform length can be generated from MD5 hashes of the current time:
from hashlib import md5
from time import localtime
def add_prefix(filename):
    prefix = md5(str(localtime()).encode('utf-8')).hexdigest()
    return f"{prefix}_{filename}"
Calls to add_prefix('style.css') generate a sequence like:
a38ff35794ae366e442a0606e67035ba_style.css
7a5f8289323b0ebfdbc7c840ad3cb67b_style.css
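One caveat with hashing localtime(): its value only changes once per second, so two calls within the same second hash the same input and yield the same prefix. A possible variant, sketched here under the assumption that mixing in random bytes is acceptable (the name add_unique_prefix is made up), keeps the uniform 32-character prefix:

```python
import os
import time
from hashlib import md5


def add_unique_prefix(filename: str) -> str:
    """Prepend a 32-character hex prefix that differs on every call.

    Hypothetical variant of add_prefix; not the original author's code.
    """
    # Nanosecond clock plus 16 random bytes, so back-to-back calls
    # do not hash the same input.
    seed = str(time.time_ns()).encode("utf-8") + os.urandom(16)
    prefix = md5(seed).hexdigest()  # always 32 hex characters
    return f"{prefix}_{filename}"
```

MD5 is used here only to get a fixed-length string, not for any security property.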
The OP asked how to create random filenames, not random files. Times and UUIDs can collide. If you are working on a single machine (not a shared filesystem) and your process/thread will not stomp on itself, use os.getpid() to get your own PID and use it as an element of a unique filename. No other process running at the same time will have the same PID. If you are multithreaded, get the thread ID as well. If a single thread or process could generate multiple different temp files, you might need another technique. A rolling index can work (as long as you aren't keeping the files so long, or using so many, that you would worry about rollover). Keeping a global hash/index of "active" files would suffice in that case.
So sorry for the longwinded explanation, but it does depend on your exact usage.
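The PID, thread ID and rolling index described in this answer can be combined roughly as follows. This is a sketch under the answer's own assumption (a single machine, no shared filesystem); the function name process_unique_name and the name layout are made up for illustration.

```python
import itertools
import os
import threading

# Process-wide rolling index; a Python int never overflows, so there is
# no rollover within a single process lifetime.
_counter = itertools.count()
_counter_lock = threading.Lock()


def process_unique_name(basename: str) -> str:
    """Return 'pid_threadid_index_basename' (hypothetical layout)."""
    with _counter_lock:
        index = next(_counter)
    return f"{os.getpid()}_{threading.get_ident()}_{index}_{basename}"
```

Note that PIDs are reused after a process exits, so names built this way are only distinct among files created by processes that are alive at the same time.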
Adding my two cents here:
In [19]: tempfile.mkstemp('.png', 'bingo', '/tmp')[1]
Out[19]: '/tmp/bingoy6s3_k.png'
According to the Python documentation for tempfile.mkstemp, it creates a temporary file in the most secure manner possible. Note that the file will exist after this call:
In [20]: os.path.exists(tempfile.mkstemp('.png', 'bingo', '/tmp')[1])
Out[20]: True
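mkstemp returns a tuple of an open OS-level file descriptor and the path, so taking only element [1] leaves the descriptor open. A minimal sketch of using both parts follows; the function name write_unique_file is an assumption, and the 'bingo' and '.png' defaults simply mirror the example above.

```python
import os
import tempfile


def write_unique_file(data: bytes, directory: str,
                      prefix: str = "bingo", suffix: str = ".png") -> str:
    """Create a uniquely named file with mkstemp, write data, return the path.

    Hypothetical helper built on the standard tempfile.mkstemp call.
    """
    fd, path = tempfile.mkstemp(suffix=suffix, prefix=prefix, dir=directory)
    # os.fdopen wraps the descriptor in a file object; leaving the
    # with-block closes the descriptor as well.
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return path
```

Unlike NamedTemporaryFile, mkstemp never deletes the file for you, so the caller is responsible for cleanup.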
You could use the random module:
import random
filename = str(random.random())  # random.random() returns a float, so convert it to a string
If you do not need a file path, but only a random string of a predefined length, you can use something like this:
>>> import random
>>> import string
>>> file_name = ''.join(random.choice(string.ascii_lowercase) for i in range(16))
>>> file_name
'ytrvmyhkaxlfaugx'
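A random string alone makes a clash unlikely but does not rule it out. One way to guarantee that nothing is overwritten is to create the file in exclusive mode and retry on a clash. The sketch below assumes a hypothetical helper name, a prefix_originalname layout and an arbitrary retry limit; the "x" open mode and secrets.choice are standard.

```python
import os
import secrets
import string


def create_without_overwrite(directory: str, original_name: str,
                             length: int = 16, attempts: int = 10) -> str:
    """Create an empty file named '<random>_<original_name>' and return its path.

    Hypothetical helper; the retry limit is an arbitrary choice.
    """
    alphabet = string.ascii_lowercase
    for _ in range(attempts):
        prefix = "".join(secrets.choice(alphabet) for _ in range(length))
        path = os.path.join(directory, f"{prefix}_{original_name}")
        try:
            # Mode "x" raises FileExistsError instead of overwriting.
            with open(path, "xb"):
                pass
        except FileExistsError:
            continue  # name already taken, try another random prefix
        return path
    raise RuntimeError("could not find an unused file name")
```

The returned path points at an empty file that this call created, so it can then be opened for writing without risk of clobbering someone else's data.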
I personally prefer my text to be not only random/unique but beautiful as well, which is why I like the hashids library, which generates nice-looking text from integers.
It can be installed with
pip install hashids
Snippet:
from hashids import Hashids
hashids = Hashids(salt="this is my salt")
print(hashids.encode(1, 2, 3))
>>> laHquq
Short Description:
Hashids is a small open-source library that generates short, unique, non-sequential ids from numbers.
Since the date and time only change once per second, you need to concatenate the date-time with a UUID (Universally Unique Identifier).
Here is the complete code:
import uuid
from datetime import datetime
imageName = '{}{:-%Y%m%d%H%M%S}.jpeg'.format(uuid.uuid4().hex, datetime.now())
import random
def Generate():  # generates a random 6-digit string
    code = ''
    for i in range(6):
        code += str(random.randint(0, 9))
    return code
print(Generate() + ".txt")
In some other cases, if you need the random file name to be sensible, use the faker module. This will produce "sensible" file names with common extensions. This method might produce name collisions after some time, so I think prepending a UUID is probably better.
pip install faker
Then,
from faker import Faker
fake = Faker()
for _ in range(10):
    print(fake.file_name())
Link to the faker documentation: https://faker.readthedocs.io/en/master/index.html