How can I check the probabilities of printing variables with random and count its frequency?

Question:

I would like to check the probability with which each value of the three variables is printed (each variable taken separately). For example, I would like to take many random draws and count the frequency of each value, or something similar. How can I do that?

import random

a = "Word A1", "Word A2", "Word A3", "Word A4"

b = "Word B1", "Word B2", "Word B3", "Word B4"

c = "Word C1", "Word C2", "Word C3", "Word C4"


a_random = random.choice(a)
b_random = random.choice(b)
c_random = random.choice(c)

sentence = a_random + ", " + b_random + ", " + c_random

print(sentence)
Asked By: Evangelos Dellas


Answers:

How to examine the properties of repeated randomness

I think you are puzzled about why the sequence of calls to the random number generator does not systematically cycle through all the options, one after the other.

If it did, it wouldn’t be random.

Try code like this, which shows you what happens on 4 calls of a random word A alone. I have called the 4 versions of word A, "P", "Q", "R", "S", for brevity, to avoid repeating "word_A" and to avoid using digits (which could be confused with the frequencies).

It runs this thousands of times and tabulates how often each sequence of word A occurs.

  • Which patterns come the most frequently?

  • Is PPPP more common than PQRS? (Try running the program many times)

  • If so, why; if not, why not?

  • In what proportion of cases does "P" not show up at all?

  • In what proportion of cases is the second symbol the same as the first?

The study of the answers to these is the basis of probability and statistics.

import random

n_repetitions = 10000
n_options = 4
length = 4

histo = {}
for _ in range(n_repetitions):
    # Build one sequence of `length` symbols drawn from "P", "Q", "R", "S"
    string = ""
    for _ in range(length):
        string += chr(ord("O") + random.randint(1, n_options))
    if string not in histo:
        histo[string] = 0
    histo[string] += 1

keys = sorted(histo.keys())
print("Seq  Frequency")
for key in keys:
    print(key, histo[key])

Example output

But each run is different!

Seq  Frequency
PPPP 48
PPPQ 36
PPPR 39
PPPS 47
PPQP 34
PPQQ 43
PPQR 44
PPQS 39
PPRP 32
PPRQ 36
PPRR 42
PPRS 36
PPSP 36
PPSQ 33
PPSR 38
PPSS 29
PQPP 38
PQPQ 30
PQPR 36
... etc, to SSSS
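
Two of the questions above can also be estimated directly, without tabulating every sequence. The following is only a sketch under the same setup (4 equally likely symbols, sequences of length 4); the function name `estimate` and the fixed seed are assumptions made for illustration. If the draws are independent and uniform, "P" should be absent in (3/4)^4 of the sequences, and the second symbol should match the first in 1/4 of them.

```python
import random

def estimate(n_repetitions=100_000, length=4, seed=0):
    """Estimate two proportions by simulation (illustrative helper)."""
    rng = random.Random(seed)  # seeded only so the sketch is repeatable
    symbols = "PQRS"
    no_p = 0      # sequences that contain no "P" at all
    repeat = 0    # sequences whose second symbol equals the first
    for _ in range(n_repetitions):
        seq = [rng.choice(symbols) for _ in range(length)]
        if "P" not in seq:
            no_p += 1
        if seq[1] == seq[0]:
            repeat += 1
    return no_p / n_repetitions, repeat / n_repetitions

p_no_p, p_repeat = estimate()
print(f"P absent:       simulated {p_no_p:.3f}, expected {(3 / 4) ** 4:.3f}")
print(f"2nd equals 1st: simulated {p_repeat:.3f}, expected {1 / 4:.3f}")
```

The simulated values should land close to the expected ones, and get closer as `n_repetitions` grows.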
Answered By: ProfDFrancis

for example i would like to take many random draws and count the frequency of each value, or something similar. How can I?

The simplest way to do that would be to write a loop and count how often each random draw occurred. Example:

import random
from collections import Counter

a = "Word A1", "Word A2", "Word A3", "Word A4"

b = "Word B1", "Word B2", "Word B3", "Word B4"

c = "Word C1", "Word C2", "Word C3", "Word C4"

a_counter = Counter()
for i in range(1000):
    a_random = random.choice(a)
    a_counter[a_random] += 1

b_counter = Counter()
for i in range(1000):
    b_random = random.choice(b)
    b_counter[b_random] += 1

c_counter = Counter()
for i in range(1000):
    c_random = random.choice(c)
    c_counter[c_random] += 1

print(a_counter)
print(b_counter)
print(c_counter)

The output shows how many times each word is selected.

Counter({'Word A1': 252, 'Word A4': 251, 'Word A2': 251, 'Word A3': 246})
Counter({'Word B1': 265, 'Word B4': 265, 'Word B3': 250, 'Word B2': 220})
Counter({'Word C1': 266, 'Word C3': 264, 'Word C4': 236, 'Word C2': 234})
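
As a possible shortcut, the loop can be replaced by a single call to `random.choices`, which draws with replacement. The sketch below does this for `a` only and also prints each count as a share of all draws; the name `n_draws` is an assumption introduced here.

```python
import random
from collections import Counter

a = "Word A1", "Word A2", "Word A3", "Word A4"
n_draws = 10_000

# random.choices draws with replacement; k is the number of draws
a_counter = Counter(random.choices(a, k=n_draws))

for word in a:
    share = a_counter[word] / n_draws
    print(f"{word}: {a_counter[word]} draws, {share:.1%} (expected {1 / len(a):.1%})")
```

With equally likely words, each share should be close to 25%, with small differences from run to run.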
Answered By: Nick ODell

I would do something like this (one liner!):

import numpy as np

# Change these as needed
groups = [["Word A1", "Word A2", "Word A3", "Word A4"], 
          ["Word B1", "Word B2", "Word B3", "Word B4"],
          ["Word C1", "Word C2", "Word C3", "Word C4"]]
m = 100

# Choose 1 word from each group of words (of which there are n), m times 
results = np.array([[np.random.randint(0, len(group)) for group in groups] for _ in range(m)])

This will produce an m x n array of choices. For example, one iteration of your example might produce an array [1,2,0], which would correspond to the sentence "Word A2, Word B3, Word C1". To find out how many times Word A2 had been selected, you could simply use np.unique:

import numpy as np

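# Word A2 has index 1 within group A, and group A is column 0 of results
unique, counts = np.unique(results[:, 0], return_counts=True)

# Maps each word index in group A to its count; the entry for key 1 is Word A2.
# The counts vary from run to run and sum to m.
dict(zip(unique, counts))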

Or to reconstruct sentences:

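def reconstruct(row):
    # Look up the chosen word in each group and join them into one sentence
    return ", ".join([groups[i][row[i]] for i in range(len(row))])

# The axis must be passed positionally, before the array
np.apply_along_axis(reconstruct, 1, results)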
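
To read the counts as words rather than indices, one option is `np.bincount` on each column. This is a sketch, not part of the approach above: it assumes the same `groups` layout, uses `np.random.default_rng` in place of `np.random.randint`, and picks `m = 1000` arbitrarily.

```python
import numpy as np

groups = [["Word A1", "Word A2", "Word A3", "Word A4"],
          ["Word B1", "Word B2", "Word B3", "Word B4"],
          ["Word C1", "Word C2", "Word C3", "Word C4"]]
m = 1000

rng = np.random.default_rng()
# One column per group, one row per draw; entries are word indices
results = np.column_stack([rng.integers(0, len(group), size=m) for group in groups])

for col, group in enumerate(groups):
    # minlength keeps a zero entry for any word that was never drawn
    counts = np.bincount(results[:, col], minlength=len(group))
    print({word: int(count) for word, count in zip(group, counts)})
```

Each printed dictionary covers one group, and its counts sum to `m`.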
Answered By: v0rtex20k