Uses sklearn.datasets.make_classification to create a classification problem
and returns train and test DataFrames.
Keyword arguments are parameters in sklearn.datasets.make_classification and
sklearn.model_selection.train_test_split.
Args:
n_samples (int, optional): The number of samples. Defaults to 100.
n_features (int, optional): The number of features. Defaults to 20.
stratify (bool, optional): Whether to stratify the train-test split by the output variable y. Defaults to False.
shuffle_features (bool, optional): Whether to shuffle the samples and the features. Defaults to False.
shuffle_split (bool, optional): Whether or not to shuffle the data before doing the train-test split. Defaults to True.
random_state (int or None, optional): Seed for the randomness used when creating and splitting the synthetic data set. Defaults to None.
Returns:
train, test: The training and test sets of the classification problem, as DataFrames.
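Example:
A minimal sketch of how a helper matching this description might be written. The function name `classification_frames`, the `feature_i` column labels, and the `target` column are assumptions made for illustration; only `make_classification` and `train_test_split` come from the text above.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split


def classification_frames(n_samples=100, n_features=20, stratify=False,
                          shuffle_features=False, shuffle_split=True,
                          random_state=None):
    """Hypothetical helper: build a synthetic classification set and split it."""
    # Generate the synthetic features X and class labels y.
    X, y = make_classification(n_samples=n_samples, n_features=n_features,
                               shuffle=shuffle_features,
                               random_state=random_state)
    # Column names are an assumption; the source does not specify them.
    data = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(n_features)])
    data["target"] = y
    # Stratify on y only when requested; train_test_split expects None otherwise.
    train, test = train_test_split(data, shuffle=shuffle_split,
                                   stratify=y if stratify else None,
                                   random_state=random_state)
    return train, test


train, test = classification_frames(random_state=0)
print(train.shape, test.shape)
```

With the defaults, `train_test_split` holds out 25% of the rows for the test set. Note that scikit-learn rejects `stratify` when `shuffle=False`, so in this sketch `stratify=True` only works together with `shuffle_split=True`.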
Uses sklearn.datasets.make_regression to create a regression problem
and returns train and test DataFrames.
Keyword arguments are parameters in sklearn.datasets.make_regression and
sklearn.model_selection.train_test_split.
Args:
n_samples (int, optional): The number of samples. Defaults to 100.
n_features (int, optional): The number of features. Defaults to 20.
stratify (bool, optional): Whether to stratify the train-test split by the output variable y. Defaults to False.
shuffle_features (bool, optional): Whether to shuffle the samples and the features. Defaults to False.
shuffle_split (bool, optional): Whether or not to shuffle the data before doing the train-test split. Defaults to True.
random_state (int or None, optional): Seed for the randomness used when creating and splitting the synthetic data set. Defaults to None.
Returns:
train, test: The training and test sets of the regression problem, as DataFrames.
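Example:
A minimal sketch of the regression counterpart, showing that a fixed `random_state` reproduces both the data and the split. The function name `regression_frames` and the column names are assumptions for illustration. The `stratify` option is left out here because stratifying on a continuous target typically fails in `train_test_split`.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split


def regression_frames(n_samples=100, n_features=20, shuffle_features=False,
                      shuffle_split=True, random_state=None):
    """Hypothetical helper: build a synthetic regression set and split it."""
    # Generate the synthetic features X and continuous target y.
    X, y = make_regression(n_samples=n_samples, n_features=n_features,
                           shuffle=shuffle_features,
                           random_state=random_state)
    # Column names are an assumption; the source does not specify them.
    data = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(n_features)])
    data["target"] = y
    # The same seed drives both data generation and the split.
    train, test = train_test_split(data, shuffle=shuffle_split,
                                   random_state=random_state)
    return train, test


first_train, first_test = regression_frames(random_state=42)
second_train, second_test = regression_frames(random_state=42)
print(first_train.equals(second_train), first_test.equals(second_test))
```

Leaving `random_state` as None gives a different data set and split on each call, which is usually what you want for quick experiments but not for tests.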