finol.data_layer.DatasetLoader
- class finol.data_layer.DatasetLoader
Class to load different types of datasets.
Methods
access_data(folder_path) – Load raw data files from a specified folder path and return a list of DataFrames.
augment_data(df) – Augment the provided DataFrame based on the configuration.
calculate_zscore(df) – Calculate the z-scores for numeric features in the provided DataFrame.
clean_data(df) – Clean the DataFrame by removing rows with missing values.
feature_engineering(df) – Perform feature engineering on the input DataFrame to generate various types of features.
load_dataset() – Load the raw data, perform data pre-processing operations, and prepare DataLoaders for training, validation, and testing.
make_label(raw_df, df) – Generate labels, i.e. price relatives.
normalize_data(df, zscore) – Normalize all numeric features in DataFrame.
plot_single_candlestick(index, row)
split_data(df) – Split the DataFrame into train, validation, and test sets.
- access_data(folder_path)
Load raw data files from a specified folder path and return a list of DataFrames.
- Parameters:
folder_path (str) – Path to the folder containing raw data files.
- Returns:
List of DataFrames containing the loaded raw data.
- Return type:
List[DataFrame]
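As an illustration of the behaviour described above, here is a minimal sketch of loading every file in a folder into a list of DataFrames. It assumes the raw files are CSVs, which the source does not state, and `load_folder` is a hypothetical helper, not FinOL's implementation.

```python
import glob
import os
from typing import List

import pandas as pd


def load_folder(folder_path: str) -> List[pd.DataFrame]:
    """Hypothetical helper: read every CSV in folder_path into a DataFrame."""
    # Assumption: raw data files are CSVs. Sorting the paths keeps the
    # order of the returned list deterministic across runs.
    paths = sorted(glob.glob(os.path.join(folder_path, "*.csv")))
    return [pd.read_csv(path) for path in paths]
```

Each element of the returned list corresponds to one file, in sorted filename order.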
- augment_data(df)
Augment the provided DataFrame based on the configuration.
- Parameters:
df (DataFrame) – Input DataFrame to be augmented.
- Returns:
Tuple containing the augmented DataFrame with window data and an integer.
- Return type:
Tuple[DataFrame, int]
- calculate_zscore(df)
Calculate the z-scores for numeric features in the provided DataFrame.
- Parameters:
df (DataFrame) – DataFrame containing the data for z-score calculation.
- Returns:
Z-score object for the numeric features in the DataFrame, or None in some cases.
- Return type:
Optional[object]
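The source does not say what the z-score object is. As a rough sketch of the idea only, the statistics could be fitted once and then reused for normalization. `fit_zscore` and `apply_zscore` are hypothetical helpers, and the plain dictionary of means and standard deviations is an assumption rather than FinOL's actual return value.

```python
import pandas as pd


def fit_zscore(df: pd.DataFrame) -> dict:
    """Hypothetical helper: per-column mean and std of the numeric features."""
    numeric = df.select_dtypes(include="number")
    # ddof=0 gives the population standard deviation.
    return {"mean": numeric.mean(), "std": numeric.std(ddof=0)}


def apply_zscore(df: pd.DataFrame, stats: dict) -> pd.DataFrame:
    """Hypothetical helper: standardize the numeric columns with fitted stats."""
    out = df.copy()
    cols = stats["mean"].index
    # Subtraction and division align the Series with the DataFrame by column name.
    out[cols] = (df[cols] - stats["mean"]) / stats["std"]
    return out
```

Fitting the statistics separately from applying them allows the same mean and standard deviation to be reused on other data, for example on validation and test splits.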
- clean_data(df)
Clean the DataFrame by removing rows with missing values.
- Parameters:
df (DataFrame) – Input DataFrame to be cleaned.
- Returns:
DataFrame with rows containing any missing values removed.
- Return type:
DataFrame
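The implementation is not shown in the source, but in plain pandas the described behaviour corresponds to `DataFrame.dropna`. A small sketch with made-up values (the column names are illustrative):

```python
import numpy as np
import pandas as pd

# Illustrative data: rows 1 and 2 each contain one missing value.
df = pd.DataFrame({"OPEN": [1.0, np.nan, 3.0], "CLOSE": [1.1, 2.2, np.nan]})

# how="any" drops a row if at least one of its values is missing.
cleaned = df.dropna(how="any")
```

Only the first row survives here, because every other row has at least one missing value.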
- feature_engineering(df)
Perform feature engineering on the input DataFrame to generate various types of features.
- Parameters:
df (DataFrame) – Input DataFrame to be engineered.
- Returns:
Tuple containing the engineered DataFrame, detailed feature list, and number of features in each category.
- Return type:
Tuple[DataFrame, List[str], Dict[str, int]]
- load_dataset()
Load the raw data, perform data pre-processing operations, and prepare DataLoaders for training, validation, and testing.
- Returns:
Dictionary containing various data loaders and information about the dataset.
- Return type:
Dict
- make_label(raw_df, df)
Generate labels, i.e. price relatives.
- Parameters:
raw_df (DataFrame) – Raw DataFrame containing ‘CLOSE’ prices.
df (DataFrame) – DataFrame to merge the labels with.
- Returns:
DataFrame containing the generated labels.
- Return type:
DataFrame
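For readers unfamiliar with the term, a price relative is commonly defined as the ratio of consecutive closing prices, x_t = p_t / p_{t-1}. The sketch below assumes that definition; whether FinOL aligns the label with the current or the next period is not stated in the source. `price_relatives` and the `LABEL` name are hypothetical.

```python
import pandas as pd


def price_relatives(raw_df: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: CLOSE_t / CLOSE_{t-1} for each row."""
    close = raw_df["CLOSE"]
    # shift(1) moves each price down one row, so the first ratio is NaN
    # (there is no previous close to divide by).
    return (close / close.shift(1)).rename("LABEL")
```

A value above 1 means the closing price rose relative to the previous period, and a value below 1 means it fell.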