What does the random.sample() method in Python do?
Question:
I want to know what the random.sample() method does and what it returns. When should it be used, and what are some example usages?
Answers:
According to the documentation:
random.sample(population, k)
Return a k length list of unique elements chosen from the population sequence. Used for random sampling without replacement.
Basically, it picks k unique random elements, a sample, from a sequence:
>>> import random
>>> c = list(range(0, 15))
>>> c
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
>>> random.sample(c, 5)
[9, 2, 3, 14, 11]
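As a small sketch of what "without replacement" means in practice (the variable names are only illustrative): each position in the population can be picked at most once, so a sample can never be longer than the population, and asking for more raises ValueError.

```python
import random

population = list(range(15))

# Each position is picked at most once, so the sample has no repeats here.
picked = random.sample(population, 5)
print(len(picked), len(set(picked)))  # both 5, because the population has no duplicates

# Asking for more items than the population holds raises ValueError.
try:
    random.sample(population, 20)
except ValueError as exc:
    too_large_error = exc
    print("ValueError:", exc)
```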
random.sample also works directly on a range:
>>> c = range(0, 15)
>>> c
range(0, 15)
>>> random.sample(c, 5)
[12, 3, 6, 14, 10]
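In addition to sequences, random.sample accepted sets in older Python versions. That was deprecated in Python 3.9 and removed in 3.11, so convert the set to a sequence first:
>>> c = {1, 2, 4}
>>> random.sample(sorted(c), 2)
[4, 1]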
However, random.sample doesn’t work with arbitrary iterators (the exact wording of the error depends on the Python version):
>>> c = [1, 3]
>>> random.sample(iter(c), 5)
TypeError: Population must be a sequence or set. For dicts, use list(d).
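If the data only exists as an iterator or generator, one workaround is to materialize it into a list first. This is a minimal sketch, assuming the whole iterable fits in memory; the generator here is just a stand-in:

```python
import random

squares = (n * n for n in range(10))  # a generator, not a sequence

# list() consumes the generator and gives sample() the sequence it needs.
pool = list(squares)
picked = random.sample(pool, 3)
print(picked)
```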
random.sample() also works on text, because a string is a sequence of characters. Example:
>>> text = open("textfile.txt").read()
>>> random.sample(text, 5)
['f', 's', 'y', 'v', '\n']
\n is also seen as a character, so it can be returned too.
You could use random.sample() to return random words from a text file if you first use the split method. Example:
>>> words = text.split()
>>> random.sample(words, 5)
['the', 'and', 'a', 'her', 'of']
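One detail worth knowing when sampling words: "unique" refers to positions in the population, not to values. If a word occurs several times in the text, each occurrence can be selected, so the sample may contain the same word twice. The sketch below uses a made-up sentence in place of the file contents, and deduplicates with dict.fromkeys when distinct words are wanted:

```python
import random

text = "the cat and the dog and the bird"  # stand-in for the file contents
words = text.split()

# Each occurrence counts separately, so "the" may show up more than once.
with_repeats = random.sample(words, 5)

# dict.fromkeys drops duplicates while keeping first-seen order.
distinct_words = list(dict.fromkeys(words))
without_repeats = random.sample(distinct_words, 3)

print(with_repeats)
print(without_repeats)
```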
random.sample(population, k)
It is used to randomly draw a sample of length k from a population. It returns a k-length list of unique elements chosen from the population sequence (sets were also accepted before Python 3.11). It returns a new list and leaves the original population unchanged. The resulting list is in selection order, so all sub-slices are also valid random samples.
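The sub-slice property means a single call can be split into ranked groups. A small sketch with hypothetical raffle entrants:

```python
import random

entrants = ["ana", "ben", "cho", "dee", "eli", "fay", "gus", "hal"]

# Draw three winners at once; the list is in selection order.
winners = random.sample(entrants, 3)

# Any slice of the result is itself a valid random sample.
grand_prize = winners[:1]
runners_up = winners[1:]

print(grand_prize, runners_up)
print(entrants)  # the original list is left untouched
```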
Here is an example that splits a dataset randomly. The function takes x_train (the population) as an argument and returns the indices of a random 60% of the data, which can serve as D_test.
import random
def randomly_select_60_percent_of_indices(x_train):
    # Sample 60% of the valid indices 0 .. len(x_train) - 1, without repeats.
    return random.sample(range(0, len(x_train)), int(0.6 * len(x_train)))
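To turn sampled indices like these into an actual split, the indices that were not chosen form the other part. A minimal sketch, assuming the data is a plain list; split_by_fraction is a hypothetical helper name, not part of the random module:

```python
import random

def split_by_fraction(data, fraction=0.6):
    """Return (selected, rest), where selected holds about `fraction` of data."""
    k = int(fraction * len(data))
    # A set makes the membership checks below fast.
    chosen = set(random.sample(range(len(data)), k))
    selected = [item for i, item in enumerate(data) if i in chosen]
    rest = [item for i, item in enumerate(data) if i not in chosen]
    return selected, rest

x_train = list(range(100, 110))
selected, rest = split_by_fraction(x_train)
print(len(selected), len(rest))  # 6 and 4 for ten items
```

Both parts keep the original order of the data; shuffle them afterwards if order should not carry over.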
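from random import sample
# Draw two independent samples of 100 distinct numbers from 0..999.
lst1 = sample(range(0, 1000), 100)
lst2 = sample(range(0, 1000), 100)
print(lst1)
print(lst2)
# Each list has no repeats, but the two lists can still share values.
print(set(lst1).intersection(set(lst2)))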