How to generate random numbers within pre-defined ranges that sum to a fixed number in R/Python

Question:

I have a simple data generation question and would appreciate any help with code in R or Python. I am pasting the table first.

Total  Num1_betw_1_to_4  Num2_betw_1_to_3  Num3_betw_1_to_3
9      3                 3                 3
7      1                 3                 3
9      4                 3                 2
9      3                 3                 3
5      2                 2                 1
7      3                 2                 2
9      3                 3                 3
7      2                 3                 2
5
6
2
4
9

In the table above, the values in the first column are given. For each row, I want to generate three values, in columns 2, 3 and 4, that sum to the value in column 1. Each of those columns has a predefined range: the column 2 value must lie between 1 and 4, the column 3 value between 1 and 3, and the column 4 value between 1 and 3.

I have filled in the first 8 rows to illustrate. In the real case, only the "Total" column will be given; the other 3 columns will be blank, and their values have to be generated.

Any help with the code would be appreciated.

Asked By: Zico


Answers:

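Here is an algorithm that generates random numbers summing to a fixed total.
For example, with a total of 20, drawing each number from 0 up to whatever is still left:

import random

total = 20          # the fixed sum to reach
remaining = total   # how much is still left to distribute
res = []
while remaining != 0:
    # draw between 0 and what is left, so the running sum never overshoots
    res.append(random.randint(0, remaining))
    remaining -= res[-1]
print(res)
print(sum(res))

Note that this does not enforce the per-column ranges from the question, and the number of values drawn varies from run to run. I hope this helps a bit; try to develop the idea further.
Sorry, it's late and I have to go.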
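
One way the sequential idea could be adapted to per-column ranges is to narrow each draw so that the remaining columns can still reach the total. The following is only a minimal sketch under that assumption; `sample_row` is a hypothetical helper name, not something from the question, and the draws are not uniform over all valid combinations.

```python
import random

def sample_row(total, ranges, rng=random):
    """Draw one integer per (low, high) range so that the draws sum to total.

    Hypothetical helper: each draw is restricted to values that still leave
    a reachable remainder for the columns that follow.
    """
    lows = [lo for lo, _ in ranges]
    highs = [hi for _, hi in ranges]
    if not sum(lows) <= total <= sum(highs):
        raise ValueError(f"total {total} is not reachable with ranges {ranges}")
    row = []
    remaining = total
    for i, (lo, hi) in enumerate(ranges):
        rest_lo = sum(lows[i + 1:])   # least the later columns can add
        rest_hi = sum(highs[i + 1:])  # most the later columns can add
        a = max(lo, remaining - rest_hi)
        b = min(hi, remaining - rest_lo)
        value = rng.randint(a, b)
        row.append(value)
        remaining -= value
    return row

ranges = [(1, 4), (1, 3), (1, 3)]
totals = [9, 7, 9, 9, 5, 7, 9, 7]
for total in totals:
    print(total, sample_row(total, ranges))
```

A total outside the reachable span (below 3 or above 10 for these ranges) raises a `ValueError` instead of looping forever.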

Answered By: Fares_Hassen

This is straightforward in R.

First make a data frame of all possible allowed values of each column:

df <- expand.grid(Num1_1_to_4 = 1:4,
                  Num2_1_to_3 = 1:3,
                  Num3_1_to_3 = 1:3)

Now throw away any rows that don’t sum to 7:

df <- df[rowSums(df) == 7,]

Finally, sample this data frame:

df[sample(nrow(df), 1),]
#>    Num1_1_to_4 Num2_1_to_3 Num3_1_to_3
#> 19           3           2           2

Answered By: Allan Cameron

Here is a base R solution. The input ranges and totals must be in the formats below:

  • ranges is a list of integer vectors of length 2;
  • sums is a vector of sums.

The output is a matrix with as many rows as the length of the sums vector and with as many columns as the length of ranges.

rintsum <- function(ranges, sums) {
  f <- function(r, s) {
    n <- length(r)
    x <- integer(n)
    while(x[n] < r[[n]][1] || x[n] > r[[n]][2]) {
      for(i in seq_along(x)[-n]) {
        x[i] <- sample(r[[i]][1]:r[[i]][2], 1L)
      }
      x[n] <- s - sum(x[-n])
    }
    x
  }
  t(sapply(sums, \(s) f(ranges, s)))
}

Total <- c(9, 7, 9, 9, 5, 7, 9, 7)
ranges <- list(c(1, 4), c(1, 3), c(1, 3))

set.seed(2022)
rintsum(ranges, Total)
#>      [,1] [,2] [,3]
#> [1,]    4    3    2
#> [2,]    2    3    2
#> [3,]    4    3    2
#> [4,]    4    3    2
#> [5,]    1    2    2
#> [6,]    3    2    2
#> [7,]    4    3    2
#> [8,]    4    2    1

Created on 2022-10-23 with reprex v2.0.2
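
For readers who want Python, the retry loop in `rintsum` could be sketched as follows. This is an approximation of the idea, not a line-by-line port: `rintsum_py` is a hypothetical name, it returns a list of lists rather than a matrix, and the up-front reachability check is an added assumption so that an impossible sum raises an error instead of retrying forever.

```python
import random

def rintsum_py(ranges, sums, rng=random):
    """For each sum, draw all but the last value at random, derive the last
    value from the sum, and retry until the last value is inside its range."""
    lo_total = sum(lo for lo, _ in ranges)
    hi_total = sum(hi for _, hi in ranges)
    last_lo, last_hi = ranges[-1]
    rows = []
    for s in sums:
        # Assumption: fail fast on sums that no combination can reach.
        if not lo_total <= s <= hi_total:
            raise ValueError(f"sum {s} is not reachable with ranges {ranges}")
        while True:
            head = [rng.randint(lo, hi) for lo, hi in ranges[:-1]]
            last = s - sum(head)
            if last_lo <= last <= last_hi:
                rows.append(head + [last])
                break
    return rows

total = [9, 7, 9, 9, 5, 7, 9, 7]
ranges = [(1, 4), (1, 3), (1, 3)]
for row in rintsum_py(ranges, total):
    print(row)
```

Seeding Python's generator will not reproduce the R output above, since the two languages use different random streams.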

Answered By: Rui Barradas