How to iterate over rows of a Frame in DataTable
Question:
With pandas I usually iterate over the rows of a DataFrame with itertuples or iterrows. How can I do this kind of iteration on a Frame from Python datatable?
Example of the pandas iteration that I need:
for row in df_.itertuples():
    print(row)
Answers:
From the datatable documentation:
A Frame is column-oriented in the sense that internally the data is stored separately for each column. Each column has its own name and type. Types may be different for different columns but cannot vary within each column.
That being said, you can access rows by index, by slice, or by condition:
from datatable import dt, f, by, g, join, sort, update, ifelse
data = {"A": [1, 2, 3, 4, 5],
        "B": [4, 5, 6, 7, 8],
        "C": [7, 8, 9, 10, 11],
        "D": [5, 7, 2, 9, -1]}
# Build a datatable Frame from the dict
DT = dt.Frame(data)
# Select a single row
print(DT[2, :])
# Select several rows by their indices
print(DT[[2,3,4], :])
# Select a slice of rows by position
print(DT[2:5, :])
# Select rows on multiple conditions, using OR
print(DT[(f.A>3) | (f.B<5), :])
For more row-selection examples and a comparison with pandas DataFrame, have a look at the comparison-with-pandas page in the official datatable documentation. If you run into problems, you can raise an issue on the official datatable GitHub repository.
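To make the column-oriented point from the quote concrete, here is a minimal sketch in plain Python. It does not call datatable at all: the dict of lists merely stands in for a Frame that keeps each column separately, and the variable names are illustrative. Rebuilding rows means zipping the columns together, which is why row-by-row loops are not the natural access pattern for this layout.

```python
# Column-oriented layout: one list per column (a stand-in for a Frame).
columns = {"A": [1, 2, 3, 4, 5],
           "B": [4, 5, 6, 7, 8],
           "C": [7, 8, 9, 10, 11],
           "D": [5, 7, 2, 9, -1]}

# A row is not stored as a single object; it has to be assembled by
# taking the i-th element from every column.
rows = list(zip(*columns.values()))

for row in rows:
    print(row)
```

Each printed tuple holds one value per column, in column order A, B, C, D.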
You can use .to_tuples() to achieve row iteration.
from datatable import dt, f, by, g, join, sort, update, ifelse
data = {"A": [1, 2, 3, 4, 5],
        "B": [4, 5, 6, 7, 8],
        "C": [7, 8, 9, 10, 11],
        "D": [5, 7, 2, 9, -1]}
DT = dt.Frame(data)
for row in DT.to_tuples():
    print(row)
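If you want rows with named fields, closer to what itertuples gives you in pandas, one option is to wrap each tuple in a namedtuple. The sketch below is plain Python: iter_named_rows is a hypothetical helper (not part of datatable), and the literal names and rows are assumed stand-ins for DT.names and DT.to_tuples() on the example data. Unlike pandas, no Index field is added.

```python
from collections import namedtuple

def iter_named_rows(names, rows, typename="Row"):
    """Yield each row tuple as a namedtuple with the given field names."""
    # rename=True swaps invalid field names (e.g. "my col") for positional
    # names such as _1 instead of raising an error.
    row_type = namedtuple(typename, names, rename=True)
    for row in rows:
        yield row_type(*row)

# Assumed stand-ins for DT.names and DT.to_tuples() on the data above.
names = ("A", "B", "C", "D")
rows = [(1, 4, 7, 5), (2, 5, 8, 7), (3, 6, 9, 2), (4, 7, 10, 9), (5, 8, 11, -1)]

named = list(iter_named_rows(names, rows))
for row in named:
    print(row.A, row.D)
```

With a real Frame this would be called as iter_named_rows(DT.names, DT.to_tuples()). Keep in mind that to_tuples() builds every row as Python objects, so for large frames column-wise expressions are usually the better fit.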