modbot.training.data_handling.DataGenerator
- class modbot.training.data_handling.DataGenerator(df, **kwargs)
Bases: Sequence
Generator class for batching
Methods
get_deterministic_batch(i) – Get batches of texts and targets for the chunk at index i
get_random_batch(_) – Get batches of randomly selected texts and targets with specified weights
on_epoch_end() – Method called at the end of every epoch.
Attributes
batch_chunks – Get index chunks for batching.
chunk_size – Get chunk size for splitting data loading
chunks – Get dataset chunks.
indices – Indices for data samples
n_batches – Get number of batches based on batch size
n_chunks – Get number of chunks to divide full dataset into for smaller reads
n_samples – Number of data samples
- property batch_chunks
Get index chunks for batching. Each chunk corresponds to one batch.
- property chunk_size
Get chunk size for splitting data loading
- property chunks
Get dataset chunks. Used to keep only part of the full dataset in memory at a time.
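To make the chunking idea concrete, here is a minimal sketch of splitting sample indices into consecutive chunks. The helper `make_index_chunks` and its arguments are hypothetical and written only for illustration; modbot's actual implementation may differ.

```python
def make_index_chunks(n_samples, chunk_size):
    """Split the indices 0..n_samples-1 into consecutive chunks.

    Hypothetical helper for illustration only; not part of modbot.
    """
    indices = list(range(n_samples))
    # Step through the indices chunk_size at a time; the last chunk may be shorter.
    return [indices[start:start + chunk_size]
            for start in range(0, n_samples, chunk_size)]


index_chunks = make_index_chunks(n_samples=10, chunk_size=4)
print(index_chunks)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

The same splitting can serve both purposes described above: a small chunk size gives per-batch index chunks, while a larger one gives dataset chunks for partial reads.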
- get_deterministic_batch(i)
Get batches of texts and targets for the chunk at index i
- Parameters
i (int) – Index of the chunk used to select a slice of the full dataframe
- Returns
arrs – List of data batches
- Return type
list
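As an illustration of index-based selection, the sketch below returns the texts and targets belonging to the i-th index chunk. The function `select_batch` and the sample data are hypothetical, and plain lists stand in for the dataframe; this is not modbot's code.

```python
def select_batch(texts, targets, batch_chunks, i):
    """Return [batch_texts, batch_targets] for the i-th index chunk.

    Hypothetical sketch; plain lists stand in for the dataframe columns.
    """
    idx = batch_chunks[i]
    # The same chunk index always yields the same samples.
    return [[texts[j] for j in idx], [targets[j] for j in idx]]


texts = ["a", "b", "c", "d", "e"]
targets = [0, 1, 0, 1, 1]
batch_chunks = [[0, 1], [2, 3], [4]]

arrs = select_batch(texts, targets, batch_chunks, 1)
print(arrs)  # [['c', 'd'], [0, 1]]
```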
- get_random_batch(_)
Get batches of randomly selected texts and targets with specified weights
- Returns
arrs – List of randomly selected data batches
- Return type
list
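One way weighted random selection can work is sketched below with the standard library. The function `sample_weighted_indices`, the example weights, and the choice to sample with replacement are all assumptions made for illustration; the source does not say how modbot applies its weights.

```python
import random


def sample_weighted_indices(n_samples, weights, batch_size, seed=None):
    """Draw batch_size sample indices with probability proportional to weights.

    Hypothetical sketch; samples with replacement, which is an assumption.
    """
    rng = random.Random(seed)
    return rng.choices(range(n_samples), weights=weights, k=batch_size)


# Index 0 has zero weight, so it is never drawn; index 3 is drawn most often.
weights = [0.0, 1.0, 1.0, 4.0]
batch_indices = sample_weighted_indices(4, weights, batch_size=8, seed=0)
```

Weighting of this kind is commonly used to oversample a rare class so that each batch is more balanced.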
- property indices
Indices for data samples
- property n_batches
Get number of batches based on batch size
- property n_chunks
Get number of chunks to divide full dataset into for smaller reads
- property n_samples
Number of data samples
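The count properties reduce to simple arithmetic. The sketch below assumes a trailing partial batch or chunk is still counted, which gives ceiling division; the source does not confirm whether modbot rounds up or down, and `count_groups` is a hypothetical name.

```python
import math


def count_groups(n_samples, group_size):
    """Number of groups needed to cover n_samples (hypothetical helper).

    Assumes a final partial group is counted, hence the ceiling.
    """
    return math.ceil(n_samples / group_size)


n_samples = 1050
n_batches = count_groups(n_samples, 64)   # 17 batches of up to 64 samples
n_chunks = count_groups(n_samples, 500)   # 3 chunks of up to 500 samples
```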
- on_epoch_end()
Method called at the end of every epoch.
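A common use of an end-of-epoch hook in Sequence-style generators is to reshuffle the sample order. The sketch below shows that pattern with a hypothetical stand-in class; whether modbot's `on_epoch_end` shuffles, or does something else, is not stated in the source.

```python
import random


class ShufflingIndexGenerator:
    """Hypothetical stand-in for a Sequence-style generator; not modbot's class."""

    def __init__(self, n_samples, seed=0):
        self.indices = list(range(n_samples))
        self._rng = random.Random(seed)

    def on_epoch_end(self):
        # Assumption: reshuffle so each epoch visits samples in a new order.
        self._rng.shuffle(self.indices)


gen = ShufflingIndexGenerator(n_samples=6)
for epoch in range(2):
    # One pass over the batches would happen here.
    gen.on_epoch_end()
```

Shuffling in place keeps every sample index present exactly once, so each epoch still covers the full dataset.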