Randomly seek a small sequence of a particular length in a larger sequence in Python

Question:

I want to randomly pick a contiguous subsequence of length 4 from a larger sequence.

I tried the following code:

    import sys
    import random

    X = 'ATGCATGCTAGCTAGTAAACGTACGTACGTACGATGCTAATATAGAGGGGCTTCGTACCCCTGA'
    Y = [random.choice(X) for i in range(4)]
    print(Y)

But it picks 4 separate characters from X, each chosen independently, rather than a contiguous run of length 4.

Asked By: Anuja Sawant


Answers:

Instead of choosing individual characters from X with random.choice, choose a random start index between 0 and len(X) - 4 (inclusive) and take the 4 elements starting at that index. random.randint includes both endpoints, so len(X) - 4 is the last start that still leaves room for 4 characters. Example:

>>> X = 'ATGCATGCTAGCTAGTAAACGTACGTACGTACGATGCTAATATAGAGGGGCTTCGTACCCCTGA'
>>> import random
>>> i = random.randint(0,len(X)-4)
>>> X[i:i+4]
'TGCA'
>>> i
1
Answered By: Anand S Kumar
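The inclusive upper bound is the easy part to get wrong. As a minimal sketch of the same idea, random.randrange can be used instead; it excludes its stop value, so the stop becomes len(seq) - n + 1. The name random_window is hypothetical and used only for illustration.

```python
import random

X = 'ATGCATGCTAGCTAGTAAACGTACGTACGTACGATGCTAATATAGAGGGGCTTCGTACCCCTGA'

def random_window(seq, n):
    # Hypothetical helper name, for illustration only.
    # randrange excludes its stop value, so a stop of len(seq) - n + 1
    # makes len(seq) - n the last possible start index.
    start = random.randrange(len(seq) - n + 1)
    return seq[start:start + n]

window = random_window(X, 4)
print(window)  # a random 4-character contiguous piece of X
```

Both forms draw the start index uniformly from the same set of positions, so choosing between them is a matter of taste.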

You could randomly select a starting index, then use slicing to extract that substring:

import random

def random_slice(s, n):
    # randint includes both endpoints, so len(s) - n is the last valid start
    index = random.randint(0, len(s) - n)
    return s[index:index + n]

>>> random_slice(X, 4)
'GCTA'
>>> random_slice(X, 4)
'CGTA'
>>> random_slice(X, 4)
'TATA'
>>> random_slice(X, 4)
'AGCT'
Answered By: Cory Kramer
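One edge case worth noting: if n is larger than len(s), random.randint is called with an empty range and raises a ValueError with a fairly generic message. A possible variant, sketched here under the assumption that an explicit error is preferred, checks the length first. The name random_slice_checked is hypothetical; the slicing logic is the same as in random_slice above, and it works for any sliceable sequence, not only strings.

```python
import random

def random_slice_checked(s, n):
    # Hypothetical variant of random_slice with an explicit length check.
    if not 0 < n <= len(s):
        raise ValueError(
            "n must be between 1 and len(s); got n=%d, len(s)=%d" % (n, len(s))
        )
    # randint includes both endpoints, so len(s) - n is the last valid start.
    index = random.randint(0, len(s) - n)
    return s[index:index + n]

print(random_slice_checked('ATGCATGCTAGC', 4))      # a random 4-character piece
print(random_slice_checked([1, 2, 3, 4, 5, 6], 3))  # also works on lists
```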