Generating a random number from highest count numbers in a list of tuples in Python

Question:

Let’s say I have a list of tuples like the one below. The second element of each tuple is a count of how many times the first element appeared in a dataset.

[(24, 11),
 (12, 10), (48, 10),
 (10, 9), (26, 9), (59, 9), (39, 9), (53, 9), (21, 9), (52, 9), (50, 9),
 (41, 8), (33, 8), (44, 8), (46, 8), (38, 8), (20, 8), (57, 8),
 (23, 7), (6, 7), (3, 7), (37, 7), (51, 7),
 (34, 6), (54, 6), (36, 6), (14, 6), (17, 6), (58, 6), (15, 6), (29, 6),
 (13, 5), (32, 5), (9, 5), (40, 5), (45, 5), (1, 5), (31, 5), (11, 5), (30, 5), (5, 5), (56, 5), (35, 5), (47, 5),
 (2, 4), (19, 4), (42, 4), (25, 4), (43, 4), (4, 4), (18, 4), (16, 4), (49, 4), (8, 4), (22, 4), (7, 4), (27, 4),
 (55, 3),
 (28, 2)]

Example

(24, 11) = (number, count)

As you can see, the same count appears in several tuples. Is there a way to collect the six highest counts and put them into another list?

For example, collect all the numbers with counts of 11, 10, 9, 8, 7 and so on, and then generate a selection of six numbers from that collection.

I am trying to generate a random number from the 6 most common numbers.

Update

This is how I managed to do it

def get_highest_lotto_count(data) -> list:
    """Takes the counts (2nd element of each tuple) and returns the 6 highest distinct counts"""
    numbers = data["lotto"]
    counts: list = [num[1] for num in numbers]
    # A set has no guaranteed order, so sort explicitly rather than reversing it
    high_count_nums = sorted(set(counts), reverse=True)

    return high_count_nums[:6]

data["lotto"] is the list provided above. I took the second element of each tuple (the counts) and converted them to a set to remove duplicates.

That gave me all the distinct counts. I then sorted them in descending order and took the first six.

import random


def common_lotto_generator() -> list:
    """
    Takes the top 6 counts from get_highest_lotto_count and builds a list
    of all the numbers whose count (2nd element) is one of them.

    Then picks 6 random numbers from that list.
    """
    # Load the data once and reuse it for both steps
    data = collect_duplicate_lotto_numbers()
    high_count_numbers = get_highest_lotto_count(data)
    numbers = data["lotto"]

    common_number_drawn: list = [
        num[0] for num in numbers if num[1] in high_count_numbers
    ]

    return random.sample(common_number_drawn, 6)

I then called get_highest_lotto_count to get the list of 6 counts and used the data to collect all the tuples whose 2nd element matched one of those 6 counts.
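
The two steps described above can be sketched as one self-contained function. This is only an illustration under assumptions: it takes the list of (number, count) tuples directly instead of going through data["lotto"] and collect_duplicate_lotto_numbers(), which are not shown in the question, and the names pick_from_top_counts, top, k and sample_pairs are made up for the example.

```python
import random


def pick_from_top_counts(pairs, top=6, k=6):
    """Return k random numbers whose count is among the `top` highest distinct counts."""
    # Distinct counts, highest first (a set on its own has no reliable order).
    top_counts = sorted({count for _, count in pairs}, reverse=True)[:top]
    # Every number whose count is one of those top counts.
    candidates = [number for number, count in pairs if count in top_counts]
    # random.sample draws k distinct items and raises ValueError if k > len(candidates).
    return random.sample(candidates, k)


# Shortened sample data in the same (number, count) shape as the question (hypothetical subset).
sample_pairs = [(24, 11), (12, 10), (48, 10), (10, 9), (41, 8), (23, 7), (34, 6), (13, 5), (2, 4)]
print(pick_from_top_counts(sample_pairs, top=3, k=2))
```

With top=3 the candidate pool is the numbers with counts 11, 10 and 9, so the two numbers printed always come from 24, 12, 48 and 10.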

Asked By: mrpbennett

Answers:

I’m not completely sure whether the solution below answers your question. I’m puzzled because the six most frequent tuples do not include any with a count of 7 or 8 (whereas you seem to hint at this in your last comment).

The code sorts the tuples by their second element (the count), keeps the six tuples with the highest counts, and then picks numbers from those six at random.

#!/usr/local/bin/python3
import random

# Tuplelist
TupleList = [(24, 11),
 (12, 10), (48, 10),
 (10, 9), (26, 9), (59, 9), (39, 9), (53, 9), (21, 9), (52, 9), (50, 9),
 (41, 8), (33, 8), (44, 8), (46, 8), (38, 8), (20, 8), (57, 8),
 (23, 7), (6, 7), (3, 7), (37, 7), (51, 7),
 (34, 6), (54, 6), (36, 6), (14, 6), (17, 6), (58, 6), (15, 6), (29, 6),
 (13, 5), (32, 5), (9, 5), (40, 5), (45, 5), (1, 5), (31, 5), (11, 5), (30, 5), (5, 5), (56, 5), (35, 5), (47, 5),
 (2, 4), (19, 4), (42, 4), (25, 4), (43, 4), (4, 4), (18, 4), (16, 4), (49, 4), (8, 4), (22, 4), (7, 4), (27, 4),
 (55, 3),
 (28, 2)]

# Sort tuples
TupleList.sort(key = lambda x: x[1])

# Select most frequent tuples
NumberOfMaxElements = 6
MaxElements = TupleList[-NumberOfMaxElements:]
print("Most frequent tuples:")
print(MaxElements)

# Random draws
print("Some random draws:")
NumberOfValues = 20
for _ in range(NumberOfValues):
    # random.choice picks one of the most frequent tuples uniformly at random
    RandomDraw = random.choice(MaxElements)[0]
    print(RandomDraw)
Answered By: Hanno Reuvers

Your question is a bit unclear. You could try:

from heapq import nlargest
from random import choice

pairs = [(24, 11),
 (12, 10), (48, 10),
 ...
 (28, 2)
]

top_counts = set(nlargest(6, set(count for _, count in pairs)))
top_counts_numbers = [
    number for number, count in pairs if count in top_counts
]
print(choice(top_counts_numbers))
  • The first part uses heapq’s nlargest() to get the 6 largest counts: {6, 7, 8, 9, 10, 11}. As mentioned by others, those aren’t exactly the counts you list, so your wording here is a bit fuzzy. You could also use sorted() to do that. (I’m converting the result into a set because sets provide fast lookup, which is used in the next step, but you don’t strictly need that.)
  • Selecting the corresponding numbers via a list comprehension:
    [24, 12, 48, 10, 26, 59, 39, 53, 21, 52, 50, 41, 33, 44, 46, 38, 20, 57, 23,
    6, 3, 37, 51, 34, 54, 36, 14, 17, 58, 15, 29]
    
  • Using choice() to select a random number from them.
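
The sorted() alternative mentioned in the first bullet might look like the sketch below. It is an illustration, not part of the code above: the shortened pairs list stands in for the elided data, it asks for the 3 largest counts instead of 6 so the small sample stays readable, and the names distinct_counts, via_heapq and via_sorted are made up for the comparison.

```python
from heapq import nlargest
from random import choice

# Shortened stand-in for the full list of (number, count) tuples.
pairs = [(24, 11), (12, 10), (48, 10), (10, 9), (41, 8), (23, 7), (55, 3), (28, 2)]

distinct_counts = {count for _, count in pairs}

# Both expressions give the 3 largest distinct counts, highest first.
via_heapq = nlargest(3, distinct_counts)
via_sorted = sorted(distinct_counts, reverse=True)[:3]
print(via_heapq == via_sorted)

# Same selection step as above, using the sorted() result.
top_counts = set(via_sorted)
top_counts_numbers = [number for number, count in pairs if count in top_counts]
print(choice(top_counts_numbers))
```

For a handful of counts like this, the two approaches are interchangeable, so the choice is mostly a matter of taste.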
Answered By: Timus