modbot.training.data_handling.DataGenerator
- class modbot.training.data_handling.DataGenerator(df, **kwargs)
 Bases: Sequence
Generator class for batching
Methods
 get_deterministic_batch(i) – Get batches of texts and targets from the slice of the full dataframe selected by chunk index i
 get_random_batch(_) – Get batches of randomly selected texts and targets with specified weights
 on_epoch_end() – Method called at the end of every epoch.
Attributes
 batch_chunks – Get index chunks for batching.
 chunk_size – Get chunk size for splitting data loading
 chunks – Get dataset chunks.
 indices – Indices for data samples
 n_batches – Get number of batches based on batch size
 n_chunks – Get number of chunks to divide full dataset into for smaller reads
 n_samples – Number of data samples
- property batch_chunks
 Get index chunks for batching. Each chunk corresponds to one batch.
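
 As a rough illustration of what index chunks could look like, the sketch below splits a range of sample indices into consecutive chunks of at most batch_size items. The helper make_batch_chunks and the ceiling-division batch count are assumptions made for this example; they are not taken from the modbot source.

```python
import math

def make_batch_chunks(n_samples, batch_size):
    """Split indices 0..n_samples-1 into consecutive chunks (hypothetical helper)."""
    indices = list(range(n_samples))
    # Assumed: the number of batches rounds up, so the last batch may be short.
    n_batches = math.ceil(n_samples / batch_size)
    return [indices[k * batch_size:(k + 1) * batch_size] for k in range(n_batches)]

chunks = make_batch_chunks(n_samples=10, batch_size=4)
print(chunks)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

 Under these assumptions, the number of chunks plays the role of n_batches, and each inner list holds the sample indices for one batch.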
- property chunk_size
 Get chunk size for splitting data loading
- property chunks
 Get dataset chunks. Used to keep only part of the full dataset in memory at a time.
- get_deterministic_batch(i)
 Get batches of texts and targets from the slice of the full dataframe selected by chunk index i
- Parameters
 i (int) – Index of chunk used to select slice of full dataframe
- Returns
 arrs – List of data batches
- Return type
 list
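
 A minimal sketch of deterministic batch selection, assuming the batch is built by looking up a precomputed index chunk. Plain Python lists stand in for the dataframe columns, and the function name deterministic_batch and the toy data are hypothetical.

```python
def deterministic_batch(texts, targets, batch_chunks, i):
    """Return [texts, targets] for the i-th index chunk (hypothetical helper)."""
    idx = batch_chunks[i]  # the same i always yields the same indices
    return [[texts[j] for j in idx], [targets[j] for j in idx]]

# Toy data standing in for dataframe columns (assumed layout).
texts = ["hi", "spam", "hello", "buy now", "bye"]
targets = [0, 1, 0, 1, 0]
batch_chunks = [[0, 1], [2, 3], [4]]

arrs = deterministic_batch(texts, targets, batch_chunks, 1)
print(arrs)  # [['hello', 'buy now'], [0, 1]]
```

 Calling the helper twice with the same index returns the same batch, which is the property that distinguishes it from random sampling.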
- get_random_batch(_)
 Get batches of randomly selected texts and targets with specified weights
- Returns
 arrs – List of randomly selected data batches
- Return type
 list
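
 The page does not say how the weights are specified. The sketch below assumes one weight per sample and sampling with replacement, using random.choices from the Python standard library. The function name random_batch, the seed argument, and the toy data are hypothetical.

```python
import random

def random_batch(texts, targets, weights, batch_size, seed=None):
    """Draw a weighted random batch of [texts, targets] (hypothetical helper)."""
    rng = random.Random(seed)
    # Assumed: one weight per sample, drawn with replacement.
    idx = rng.choices(range(len(texts)), weights=weights, k=batch_size)
    return [[texts[j] for j in idx], [targets[j] for j in idx]]

texts = ["hi", "spam", "hello"]
targets = [0, 1, 0]

# Give the positive class three times the weight of each negative sample.
batch = random_batch(texts, targets, weights=[1, 3, 1], batch_size=4, seed=0)
print(len(batch[0]), len(batch[1]))  # 4 4
```

 Weighting of this kind is one common way to oversample a rare class; whether DataGenerator uses it for that purpose is not stated here.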
- property indices
 Indices for data samples
- property n_batches
 Get number of batches based on batch size
- property n_chunks
 Get number of chunks to divide full dataset into for smaller reads
- property n_samples
 Number of data samples
- on_epoch_end()
 Method called at the end of every epoch.