In Rust, what is the proper way to replicate Python's "repeat" parameter in itertools.product?
Question:
In Python, I can do:
from itertools import product
k = 3
for kmer in product("AGTC", repeat=k):
    print(kmer)
In Rust, I can force the behavior of k=3 by:
#[macro_use] extern crate itertools;
for kmer in iproduct!("AGTC".chars(), "AGTC".chars(), "AGTC".chars()) {
    println!("{:?}", kmer);
}
But what if I wanted k=4 or k=5?
Answers:
Writing a proper generalisation for any type and any k would be hard, because the return type would have to be a tuple whose size depends on k. Since you only want to work on String, it’s much easier:
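use itertools::iproduct;

// Returns every length-k string over the characters of `seq`.
// Note: k = 0 gives an empty Vec here; Python's repeat=0 yields one empty tuple.
fn kproduct(seq: String, k: u32) -> Vec<String> {
    match k {
        0 => vec![],
        1 => seq.chars().map(|c| c.to_string()).collect(),
        2 => iproduct!(seq.chars(), seq.chars()).map(|(a, b)| format!("{}{}", a, b)).collect(),
        // Recursive case: extend each (k-1)-mer by one more character.
        _ => iproduct!(kproduct(seq.clone(), k - 1), seq.chars()).map(|(a, b)| format!("{}{}", a, b)).collect(),
    }
}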
I’m answering this question four years later, both because the accepted answer is too convoluted and because Python’s itertools.product is a generic function, whereas the accepted answer works only for Strings.
Moreover, notice that the kproduct function defined in the accepted answer is recursive, and Rust doesn’t guarantee tail-call optimization, so its stack depth grows with k.
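For comparison, here is a minimal non-recursive sketch that needs only the standard library. The name product_repeat_vec is hypothetical (it is not an itertools API), and it assumes you are fine with collecting everything eagerly. Returning Vec<Vec<T>> sidesteps the tuple-size problem, because the result type is the same for every k.

```rust
/// Eager, non-recursive cartesian power: every length-`repeat` sequence over `items`.
/// Hypothetical helper for illustration; uses only the standard library.
pub fn product_repeat_vec<T: Clone>(items: &[T], repeat: usize) -> Vec<Vec<T>> {
    // Start from the single empty sequence, as Python's product(..., repeat=0) does.
    let mut acc: Vec<Vec<T>> = vec![Vec::new()];
    for _ in 0..repeat {
        let mut next = Vec::with_capacity(acc.len() * items.len());
        // Extend every sequence built so far by each item in turn,
        // so the last position varies fastest.
        for prefix in &acc {
            for item in items {
                let mut seq = Vec::with_capacity(prefix.len() + 1);
                seq.extend_from_slice(prefix);
                seq.push(item.clone());
                next.push(seq);
            }
        }
        acc = next;
    }
    acc
}
```

Calling product_repeat_vec(&['A', 'G', 'T', 'C'], 3) should give 64 vectors, starting with ['A', 'A', 'A'] and ['A', 'A', 'G'], which is the same order Python uses.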
Using the third-party itertools crate, we can define a product_repeat function in two ways: either as a standard top-level function, or by adding a ProductRepeat trait implemented for all cloneable Iterators.
This is the top-level function:
use itertools::{Itertools, MultiProduct};
/// Rust version of Python's itertools.product().
/// It returns the cartesian product of the input iterables, and it is
/// semantically equivalent to `repeat` nested for loops.
///
/// # Arguments
///
/// * `it` - An iterator over a cloneable data structure
/// * `repeat` - Number of repetitions of the given iterator
pub fn product_repeat<I>(it: I, repeat: usize) -> MultiProduct<I>
where
    I: Iterator + Clone,
    I::Item: Clone,
{
    std::iter::repeat(it)
        .take(repeat)
        .multi_cartesian_product()
}
If you instead prefer augmenting the Iterator trait, you can do it as follows:
pub trait ProductRepeat: Iterator + Clone
where
    Self::Item: Clone,
{
    fn product_repeat(self, repeat: usize) -> MultiProduct<Self> {
        std::iter::repeat(self)
            .take(repeat)
            .multi_cartesian_product()
    }
}

impl<T: Iterator + Clone> ProductRepeat for T
where
    T::Item: Clone,
{}
Here is a demo in the Rust playground.
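If you want to see what "equivalent to repeat nested for loops" amounts to without pulling in the crate, the following is a rough std-only sketch of a lazy iterator that advances a vector of indices like an odometer. SliceProduct and slice_product are hypothetical names made up for this illustration. This is not how itertools implements MultiProduct, and it assumes the input is already available as a slice.

```rust
/// Lazy cartesian power over a slice, yielding one `Vec<T>` per combination.
/// Hypothetical type for illustration; not the itertools implementation.
pub struct SliceProduct<'a, T> {
    items: &'a [T],
    // One index per position; `None` once the iterator is exhausted.
    indices: Option<Vec<usize>>,
}

pub fn slice_product<T>(items: &[T], repeat: usize) -> SliceProduct<'_, T> {
    // With no items and at least one position there is nothing to yield.
    let indices = if items.is_empty() && repeat > 0 {
        None
    } else {
        Some(vec![0; repeat])
    };
    SliceProduct { items, indices }
}

impl<'a, T: Clone> Iterator for SliceProduct<'a, T> {
    type Item = Vec<T>;

    fn next(&mut self) -> Option<Vec<T>> {
        let items = self.items;
        let indices = self.indices.as_mut()?;
        // Build the current combination from the index vector.
        let current: Vec<T> = indices.iter().map(|&i| items[i].clone()).collect();
        // Advance like an odometer: bump the rightmost index, carrying leftwards.
        let mut pos = indices.len();
        let mut exhausted = true;
        while pos > 0 {
            pos -= 1;
            indices[pos] += 1;
            if indices[pos] < items.len() {
                exhausted = false;
                break;
            }
            indices[pos] = 0;
        }
        // If the carry ran past the leftmost position, this was the last combination.
        if exhausted {
            self.indices = None;
        }
        Some(current)
    }
}
```

Because nothing is collected up front, slice_product(&['A', 'G', 'T', 'C'], k) can be consumed one k-mer at a time in a for loop, which matters once 4^k no longer fits comfortably in memory.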