modbot.preprocessing
Preprocessing methods
Functions
|
Check line for specified phrases |
|
Check phrases for reclassification |
|
Check messages for reclassification if they have a non-wholesome probability above the minimum threshold |
|
Check whether message contains a link |
|
Sanitize messages for easier classification |
|
Sanitize message. |
|
Filter emotes from all texts |
|
Filter emotes from a single line |
|
Filter log. |
|
Populate info dictionary with info from IRC line |
|
Join words back together after splitting |
|
Lemmatize words |
|
Tokenize string |
|
Tokenize texts |
|
Read CSV file and create DataFrame |
|
Remove stop words from word list |
|
Segment message into words |
|
Separate messages with and without links |
|
Separate to_check from all other messages |
|
Write data to outfile |
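A few of the steps summarized above (link detection, tokenization, stop-word removal, rejoining words, and separating messages with and without links) can be sketched as follows. This listing does not show the actual function names or signatures, so every name below (contains_link, tokenize, remove_stop_words, join_words, separate_links), the link pattern, and the tiny stop-word set are assumptions made for illustration, not the modbot.preprocessing API.

```python
import re
from typing import Iterable, List, Tuple

# All names here are hypothetical; they only mirror the summaries in the table above.
LINK_PATTERN = re.compile(r"(https?://|www\.)\S+", re.IGNORECASE)  # assumed link heuristic
STOP_WORDS = frozenset({"a", "an", "the", "is", "to", "and"})      # tiny illustrative set


def contains_link(message: str) -> bool:
    """Return True if the message appears to contain a URL."""
    return LINK_PATTERN.search(message) is not None


def tokenize(message: str) -> List[str]:
    """Lowercase the message and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9']+", message.lower())


def remove_stop_words(words: Iterable[str]) -> List[str]:
    """Drop tokens that appear in the stop-word set."""
    return [word for word in words if word not in STOP_WORDS]


def join_words(words: Iterable[str]) -> str:
    """Join tokens back into a single space-separated string."""
    return " ".join(words)


def separate_links(messages: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split messages into (with links, without links), preserving order."""
    with_links: List[str] = []
    without_links: List[str] = []
    for message in messages:
        if contains_link(message):
            with_links.append(message)
        else:
            without_links.append(message)
    return with_links, without_links
```

In this sketch, a message would typically pass through tokenize, remove_stop_words, and join_words in that order before classification, while separate_links runs beforehand so that messages containing links can be handled separately.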
Classes
|
Class to handle different types of log cleaning |
|
Class to store messages and message info |
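As a rough illustration of how a message container and the "populate info dictionary with info from IRC line" step might fit together, here is a minimal sketch. The class name Message, the helper parse_irc_line, the dictionary keys, and the assumption that log lines follow the standard IRC PRIVMSG layout (":nick!user@host PRIVMSG #channel :text") are all hypothetical; the real classes in modbot.preprocessing may be structured differently.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Optional

# Assumed line layout: ":nick!user@host PRIVMSG #channel :message text"
IRC_PRIVMSG = re.compile(
    r"^:(?P<user>[^!\s]+)!\S+ PRIVMSG (?P<channel>#\S+) :(?P<text>.*)$"
)


def parse_irc_line(line: str) -> Optional[Dict[str, str]]:
    """Return an info dictionary for a PRIVMSG line, or None for other lines."""
    match = IRC_PRIVMSG.match(line.strip())
    return match.groupdict() if match else None


@dataclass
class Message:
    """Hypothetical container pairing a message's text with its metadata."""
    text: str
    info: Dict[str, str] = field(default_factory=dict)

    @classmethod
    def from_irc_line(cls, line: str) -> Optional["Message"]:
        """Build a Message from a raw IRC line, or return None if it does not parse."""
        info = parse_irc_line(line)
        if info is None:
            return None
        return cls(text=info["text"], info=info)
```

Returning None for lines that are not chat messages (for example server PING lines) keeps the caller in charge of deciding whether to skip or log them.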