Why doesn't random.choices with weights match my expectation?

Question:

When I run this code, I expect both functions to return roughly 0.6. always_sunny returns approximately 0.6, as expected, but guess_with_weight returns approximately 0.52.

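import random

k = 1000000  # number of samples to draw

def sample():
    # Draw k outcomes: 'rain' with probability 0.4, 'sunny' with probability 0.6.
    return random.choices(population=['rain', 'sunny'], weights=[0.4, 0.6], k=k)

def always_sunny(xs):
    # Count how often the constant guess 'sunny' matches the outcome.
    return sum(1 for x in xs if x == 'sunny')

def guess_with_weight(xs):
    # Count how often an independent, identically weighted guess matches the outcome.
    return sum(1 for x, y in zip(xs, sample()) if x == y)

xs = sample()
print((
    always_sunny(xs) / k,
    guess_with_weight(xs) / k
))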

Why is that? If there is a 60% chance of it being sunny and I always guess ‘sunny’, I should be right 60% of the time, and the code agrees. If I instead guess ‘rain’ 40% of the time and ‘sunny’ 60% of the time, shouldn’t I also be right 60% of the time? Why does the code suggest I’m only right about 52% of the time?

Asked By: BasilTomato

||

Answers:

60% of the time, the right answer is "sunny". In 60% of that 60% (36%), you guess "sunny", and you’re right, and in 40% of that 60% (24%), you guess "rain", and you’re wrong.

40% of the time, the right answer is "rain". In 60% of that 40% (24%), you guess "sunny", and you’re wrong, and in 40% of that 40% (16%), you guess "rain", and you’re right.

Adding those two scenarios together, we get:

  • 36% + 16% = 52% right
  • 24% + 24% = 48% wrong
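
The same arithmetic can be checked exactly in a few lines. This is only a sketch: the names p_truth, p_guess, p_right and p_wrong are made up for illustration, and it assumes (as in the question's code) that the guess is drawn independently of the true outcome.

```python
from fractions import Fraction

# Weights from the question, as exact fractions to avoid rounding noise.
p_truth = {"rain": Fraction(4, 10), "sunny": Fraction(6, 10)}
# Assumption: the guesser samples from the same weights, independently of the truth.
p_guess = dict(p_truth)

# A guess is right when truth and guess agree; independence lets us multiply.
p_right = sum(p_truth[w] * p_guess[w] for w in p_truth)
p_wrong = 1 - p_right

print(float(p_right), float(p_wrong))  # prints: 0.52 0.48
```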

Because "sunny" is the most likely answer, and because the probability of any given element being "sunny" vs "rain" is independent of all the others, "sunny" is always the best guess; guessing "sunny" 100% of the time allows you to have a 100% success rate for the 60% of the time that "sunny" is the answer, for an overall success rate of 60%. Any strategy that doesn’t involve guessing "sunny" 100% of the time is therefore going to have a lower success rate than 60%.

Another way to look at it: guessing "sunny" all the time (the best strategy) gets you 60%, and guessing "rain" all the time (the worst strategy) gets you 40%. Any strategy that mixes the two guesses lands somewhere in between, in proportion to how often you make the best guess versus the worst one. You guess "sunny" 60% of the time, and 52% is exactly 60% of the way from 40% to 60%.
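
To make the "somewhere in between" point concrete, here is a small sketch of the expected success rate as a function of how often you guess "sunny". The function expected_accuracy and its parameter names are hypothetical, not part of the question's code, and it again assumes guesses are independent of the actual weather.

```python
def expected_accuracy(q_sunny, p_sunny=0.6):
    """Chance of a correct guess when 'sunny' is guessed with probability q_sunny."""
    # Right when (guess sunny and it is sunny) or (guess rain and it is rain).
    return q_sunny * p_sunny + (1 - q_sunny) * (1 - p_sunny)

# Always "rain", the weighted mix from the question, and always "sunny".
for q in (0.0, 0.6, 1.0):
    print(q, round(expected_accuracy(q), 2))
```

The result is a straight line in q_sunny running from 0.4 to 0.6, so with these weights it is highest when you always guess "sunny".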

Answered By: Samwise