NameError when printing the result, even though I've assigned the variable

Question:

import sys

seq = 'TGCCTTGGGCACCATGCAGTACCAAACGGAACGATAGTG'

for nucleotide in seq:
    if nucleotide == 'A':
        a_nt = seq.count('A')
    elif nucleotide == 'G':
        g_nt = seq.count('G')
    elif nucleotide == 'C':
        c_nt = seq.count('C')
    elif nucleotide == 'T':
        t_nt = seq.count('T')
    elif nucleotide == 'N':
        n_nt = seq.count('N')
    else:
        sys.exit("Did not code")

print(a_nt, g_nt, c_nt, t_nt, n_nt)

Error:

NameError: name 'n_nt' is not defined. Did you mean: 'a_nt'?

If a nucleotide is not one of 'AGCTN', the script should stop with sys.exit().
Even if the count of N is zero, it should still be printed.

If I print only a_nt, g_nt, c_nt, and t_nt, it works fine, but printing n_nt fails.

Asked By: sleepbug


Answers:

In your code, each variable is assigned only when its branch of the if statement runs. Because seq contains no 'N', n_nt is never assigned, and that causes the NameError.
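A minimal sketch of the same failure, reduced to a single branch (the `failed` flag is only there for illustration): the name exists only if the branch that assigns it has actually run.

```python
seq = 'TGCCTTGGGCACCATGCAGTACCAAACGGAACGATAGTG'  # contains no 'N'

for nucleotide in seq:
    if nucleotide == 'N':
        n_nt = seq.count('N')  # never executed for this sequence

try:
    print(n_nt)
    failed = False
except NameError as exc:
    # n_nt was never bound, so looking it up raises NameError
    print("NameError:", exc)
    failed = True
```

Assigning every variable unconditionally, before or instead of the loop, avoids the problem.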

You can do something like this instead:

seq = 'TGCCTTGGGCACCATGCAGTACCAAACGGAACGATAGTG'
seq_count = {char: seq.count(char) for char in seq}
print(seq_count)

This creates a dictionary in which every key is a character that occurs in seq and the value is the number of times that character appears in seq.
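Note that a dictionary built this way only contains characters that actually occur, so 'N' would be absent rather than 0. A possible variant, assuming the allowed alphabet is 'AGCTN' as in the question (the names `allowed` and `unexpected` are made up here), iterates over the alphabet instead and checks for unexpected characters first:

```python
import sys

seq = 'TGCCTTGGGCACCATGCAGTACCAAACGGAACGATAGTG'
allowed = 'AGCTN'  # assumed alphabet, taken from the question

# Characters in seq that are not part of the allowed alphabet
unexpected = set(seq) - set(allowed)
if unexpected:
    sys.exit(f"Unexpected nucleotide(s): {sorted(unexpected)}")

# One entry per allowed nucleotide, including those with a count of zero
seq_count = {char: seq.count(char) for char in allowed}
print(seq_count)
```

For this sequence, seq_count['N'] is 0 instead of raising a KeyError.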

Answered By: Omer Dagry

Just count everything without the for loop; then all variables are set, even when a count is zero:

seq = 'TGCCTTGGGCACCATGCAGTACCAAACGGAACGATAGTG'

a_nt = seq.count('A')
g_nt = seq.count('G')
c_nt = seq.count('C')
t_nt = seq.count('T')
n_nt = seq.count('N')

print(a_nt, g_nt, c_nt, t_nt, n_nt)

# Or, more efficiently, count every character in a single pass
from collections import Counter
counts = Counter(seq)
for letter in 'AGCTN':
    print(counts[letter], end=' ')

Output:

11 11 10 7 0
11 11 10 7 0
Answered By: Mark Tolonen

I suggest using collections.Counter:

from collections import Counter

possible_nucleotides = ["A", "G", "C", "N", "T"]
seq = "TGCCTTGGGCACCATGCAGTACCAAACGGAACGATAGTG"

seq_counts = Counter(seq)

missing_nucleotides = {x: 0 for x in set(possible_nucleotides) - set(seq_counts.keys())}

seq_counts.update(missing_nucleotides)

seq_counts will then look like this:

Counter({'G': 11, 'A': 11, 'C': 10, 'T': 7, 'N': 0})

Keep in mind that updating the Counter is purely optional, because looking up a key that is not present returns 0 instead of raising an error.
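A short sketch of that behaviour, using the same sequence without the update step:

```python
from collections import Counter

seq = "TGCCTTGGGCACCATGCAGTACCAAACGGAACGATAGTG"
seq_counts = Counter(seq)

# Looking up a missing key returns 0 rather than raising KeyError
print(seq_counts["N"])

# The lookup does not add the key to the Counter
print("N" in seq_counts)
```

So the update is only needed if 'N' should show up when the whole Counter is printed or iterated.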

Answered By: w8eight