TypeError: Type Tuple cannot be instantiated; use tuple() instead


I have written a program with the following piece of code:

import pandas as pd
import numpy as np
from typing import Tuple

def split_data(self, df: pd.DataFrame, split_quantile: float) -> Tuple(pd.DataFrame, pd.DataFrame):
    '''Split data sets into two parts - train and test data sets.'''
    df = df.sort_values(by='datein').reset_index(drop=True)
    quantile = int(np.quantile(df.index, split_quantile))
    return (
        df[df.index <= quantile].reset_index(drop=True),
        df[df.index > quantile].reset_index(drop=True)
    )

The program raises the following error: TypeError: Type Tuple cannot be instantiated; use tuple() instead. I understand that I can fix my code by replacing Tuple(pd.DataFrame, pd.DataFrame) with tuple(), but then I lose the information that my tuple consists of two pandas data frames.

Could you please help me fix the error without losing that information?

Asked By: Jaroslav Bezděk



Use square brackets:

Tuple[pd.DataFrame, pd.DataFrame]

From the docs:

Tuple type; Tuple[X, Y] is the type of a tuple of two items with the first item of type X and the second of type Y. The type of the empty tuple can be written as Tuple[()].
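To see the difference between the two spellings, here is a minimal sketch that uses only the standard library. The names call_failed, Pair, and pair_args are made up for illustration; int and str stand in for the data frame types.

```python
from typing import Tuple, get_args

# Calling Tuple like a function tries to instantiate it,
# which is what raises the TypeError from the question.
try:
    Tuple(int, str)
    call_failed = False
except TypeError:
    call_failed = True

# Subscripting with square brackets builds a type annotation instead.
Pair = Tuple[int, str]

# get_args (Python 3.8+) returns the element types stored in the annotation.
pair_args = get_args(Pair)
```

Parentheses mean "call this object", while square brackets mean "parametrize this generic type", so only the bracket form is valid in an annotation.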

EDIT: With the release of Python 3.9, you can now do this with the builtins.tuple type rather than having to import typing. For example:

>>> tuple[pd.DataFrame, pd.DataFrame]
tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame]

You still have to use square brackets.
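Putting it together, the function from the question might look like the sketch below. It assumes a standalone function (the self parameter is dropped), and the sample frame with its value column is hypothetical data used only to exercise the code. Tuple from typing is used so the annotation also works before Python 3.9; on 3.9 and later, tuple[pd.DataFrame, pd.DataFrame] should work the same way.

```python
from typing import Tuple

import numpy as np
import pandas as pd


def split_data(df: pd.DataFrame, split_quantile: float) -> Tuple[pd.DataFrame, pd.DataFrame]:
    '''Split a data set into two parts - train and test data sets.'''
    # Sort chronologically so the split is by time, then renumber the rows.
    df = df.sort_values(by='datein').reset_index(drop=True)
    # Row position at the requested quantile of the index.
    quantile = int(np.quantile(df.index, split_quantile))
    return (
        df[df.index <= quantile].reset_index(drop=True),
        df[df.index > quantile].reset_index(drop=True),
    )


# Hypothetical sample data: ten daily rows.
sample = pd.DataFrame({
    'datein': pd.date_range('2020-01-01', periods=10),
    'value': range(10),
})
train, test = split_data(sample, 0.8)
```

The annotation is only a hint for readers and type checkers; Python does not enforce it at run time.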

Answered By: rassar