What does the redirection operator (>>) mean in Apache Beam (Python)?

Question:

In the Apache Beam Python SDK, I often see the '>>' operator in pipeline code.

https://beam.apache.org/documentation/programming-guide/#pipeline-io

lines = p | 'ReadFromText' >> beam.io.ReadFromText('path/to/input-*.csv')

What does this mean?

Asked By: Yu Watanabe

||

Answers:

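>> is the bitwise right-shift operator in Python. For an expression a >> b, Python first tries a.__rshift__(b); if the left operand does not support that, it falls back to the reflected dunder (double underscore) method on the right operand, b.__rrshift__(a). A str has no __rshift__(), so 'ReadFromText' >> beam.io.ReadFromText(...) ends up calling __rrshift__() on the transform.

The Python implementation of Apache Beam defines __rrshift__() on the PTransform class so that a name can be attached to a transform. This is ordinary operator overloading rather than special syntax. In your example, "ReadFromText" is the name of the transform. Because >> binds more tightly than |, the transform is named first and the named transform is then applied to p.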
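To see the mechanism without Beam installed, here is a minimal sketch. The classes Transform and NamedTransform are hypothetical stand-ins invented for illustration, not Beam's actual classes; only the use of __rrshift__() mirrors what the answer describes.

```python
class NamedTransform:
    """Hypothetical wrapper that pairs a label with a transform."""

    def __init__(self, label, transform):
        self.label = label
        self.transform = transform


class Transform:
    """Hypothetical stand-in for a Beam PTransform."""

    def __rrshift__(self, label):
        # Python calls this for `label >> self` because str does not
        # implement __rshift__, so the reflected method is tried.
        return NamedTransform(label, self)


read = Transform()
named = 'ReadFromText' >> read

print(type(named).__name__, named.label)
```

Here the string on the left becomes the label argument, and the result of the whole expression is the labeled wrapper rather than a shifted number.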

Reference:
https://github.com/apache/beam/blob/4844af152fface83961dd3f4e89022d1e5bef6d6/sdks/python/apache_beam/transforms/ptransform.py#L569-L570

Answered By: Andrew Nguonly