Available Modules¶
List of all files, classes and methods available in the library.
dataset.py¶
-
class
keras_wrapper.dataset.
Data_Batch_Generator
(set_split, net, dataset, num_iterations, batch_size=50, normalization=False, normalization_type=None, data_augmentation=True, wo_da_patch_type='whole', da_patch_type='resize_and_rndcrop', da_enhance_list=None, mean_substraction=False, predict=False, random_samples=-1, shuffle=True, temporally_linked=False, init_sample=-1, final_sample=-1)¶ Batch generator class. Retrieves batches of data.
-
generator
()¶ Gets and processes the data :return: generator with the data
-
-
class
keras_wrapper.dataset.
Dataset
(name, path, pad_symbol='<pad>', unk_symbol='<unk>', null_symbol='<null>', silence=False)¶ Class for defining instances of databases adapted for Keras. It includes several utility functions for easily managing data splits, image loading, mean calculation, etc.
-
apply_label_smoothing
(y, discount, vocabulary_len, discount_type='uniform')¶ Applies label smoothing to a one-hot encoded vector.
Parameters: - y – Input to smooth
- discount – Discount to apply
- vocabulary_len – Length of the one-hot vectors
- discount_type – Type of smoothing. Supported types:
‘uniform’: Subtract a ‘label_smoothing_discount’ from the label and distribute it uniformly among all labels.
Returns:
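The ‘uniform’ smoothing described above can be illustrated with a small pure-Python sketch. This is not the library's implementation: `smooth_one_hot` is a hypothetical helper, and it assumes the discount is removed from the active label and shared equally among all `vocabulary_len` labels (the active one included).

```python
def smooth_one_hot(y, discount, vocabulary_len):
    """Uniform label smoothing for one one-hot vector (hypothetical helper).

    Assumption: `discount` is taken from the active label and split
    equally among all `vocabulary_len` labels.
    """
    if len(y) != vocabulary_len:
        raise ValueError("y must have vocabulary_len entries")
    share = discount / vocabulary_len
    # Scale the original mass down, then add the uniform share to every label.
    return [value * (1.0 - discount) + share for value in y]


smoothed = smooth_one_hot([0.0, 1.0, 0.0, 0.0], discount=0.1, vocabulary_len=4)
print(smoothed)
```

Under this assumption the smoothed vector still sums to 1, so it can be used as a target distribution.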
-
build_bpe
(codes, merges=-1, separator='@@', vocabulary=None, glossaries=None)¶ Constructs a BPE encoder instance. Currently, the vocabulary and glossaries options are not implemented.
Parameters: - codes – File with BPE codes (created by learn_bpe.py)
- separator – Separator between non-final subword units (default: ‘@@’)
- vocabulary – Vocabulary file. If provided, this script reverts any merge operations that produce an OOV.
- glossaries – The strings provided in glossaries will not be affected by the BPE (i.e. they will neither be broken into subwords, nor concatenated with other subwords).
Returns: None
-
build_moses_detokenizer
(language='en')¶ Constructs a Moses detokenizer instance. :param language: Detokenizer language. :return: None
-
build_moses_tokenizer
(language='en')¶ Constructs a Moses tokenizer instance. :param language: Tokenizer language. :return: None
-
build_vocabulary
(captions, data_id, do_split=True, min_occ=0, n_words=0, split_symbol=' ', use_extra_words=True, use_unk_class=False, is_val=False)¶ Vocabulary builder for data of type ‘text’
Parameters: - use_extra_words –
- captions – Corpus sentences
- data_id – Dataset id of the text
- do_split – Split sentence by words or use the full sentence as a class.
- split_symbol – symbol used for separating the elements in each sentence
- min_occ – Minimum occurrences of each word to be included in the dictionary.
- n_words – Maximum number of words to include in the dictionary.
- is_val – Set to True if the input ‘captions’ are values and we want to keep them sorted
Returns: None.
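How `min_occ`, `n_words`, `do_split` and `split_symbol` interact can be sketched as follows. This is an illustrative approximation only: `build_vocab` is a hypothetical function, it assumes that `n_words=0` means "no limit", and it ignores the special symbols (pad, unk, null) that the Dataset class handles.

```python
from collections import Counter


def build_vocab(captions, min_occ=0, n_words=0, split_symbol=" ", do_split=True):
    """Map each kept word to an index (hypothetical helper, not the library code)."""
    counts = Counter()
    for sentence in captions:
        sentence = sentence.strip()
        # Either count individual words or treat the whole sentence as one class.
        tokens = sentence.split(split_symbol) if do_split else [sentence]
        counts.update(tokens)
    # Most frequent first; ties are broken alphabetically for reproducibility.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    kept = [word for word, count in ranked if count >= min_occ]
    if n_words > 0:  # assumption: 0 means "keep everything"
        kept = kept[:n_words]
    return {word: index for index, word in enumerate(kept)}


vocab = build_vocab(["a dog runs", "a cat sleeps", "a dog sleeps"], min_occ=2)
print(vocab)
```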
-
calculateTrainMean
(data_id)¶ Calculates the mean of the data belonging to the training set split in each channel.
-
convert_3DLabels_to_bboxes
(predictions, original_sizes, threshold=0.5, idx_3DLabel=0, size_restriction=0.001)¶ Converts a set of predictions of type 3DLabel to their corresponding bounding boxes.
Parameters: - idx_3DLabel –
- size_restriction –
- predictions – 3DLabels predicted by Model_Wrapper. If type is list it will be assumed that position 0 corresponds to 3DLabels
- original_sizes – original sizes of the predicted images width and height
- threshold – minimum overlapping threshold for considering a prediction valid
Returns: predicted_bboxes, predicted_Y, predicted_scores for each image
-
static
convert_GT_3DLabels_to_bboxes
(gt)¶ Converts a GT list of 3DLabels to a set of bboxes.
Parameters: gt – list of Dataset output of type 3DLabels Returns: [out_list, original_sizes], where out_list contains a list of samples with the following info [GT_bboxes, GT_Y], and original_sizes contains the original width and height for each image
-
static
detokenize_bpe
(caption, separator='@@')¶ Reverts BPE segmentation (https://github.com/rsennrich/subword-nmt) :param caption: Caption to detokenize. :param separator: BPE separator. :return: Detokenized version of caption.
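As a rough illustration of what reverting BPE segmentation involves, the sketch below removes every separator that is followed by a space (or that ends the string), mirroring the `sed` expression suggested in the subword-nmt README. `revert_bpe` is a hypothetical name, not the library method.

```python
import re


def revert_bpe(caption, separator="@@"):
    """Join subword units marked with `separator` (hypothetical helper)."""
    sep = re.escape(separator)
    # Remove "<sep> " between subwords and a dangling "<sep>" at the end.
    pattern = "(" + sep + " )|(" + sep + " ?$)"
    return re.sub(pattern, "", caption)


print(revert_bpe("the un@@ believ@@ able story"))
```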
-
detokenize_moses
(caption, language='en', lowercase=False, return_str=True, unescape=True)¶ Applies the Moses detokenization, relying on sacremoses’ implementation of the Moses detokenizer.
Parameters: - caption – Sentence to detokenize
- language – Language (will build the detokenizer for this language)
- lowercase – Whether to lowercase the sentence or not
- agressive_dash_splits – Option to trigger dash split rules.
- return_str – Return string or list
- unescape – Unescape HTML special chars
Returns:
-
static
detokenize_none
(caption)¶ Dummy function: Keeps the caption as it is. :param caption: String to de-tokenize. :return: Same caption.
-
static
detokenize_none_char
(caption)¶ Character-level detokenization. Respects all symbols. Joins chars into words. Words are delimited by the <space> token. If a special character is found, it is converted to the escaped char. List of escaped chars (by the Moses tokenizer):
& -> &amp;
| -> &#124;
< -> &lt;
> -> &gt;
' -> &apos;
" -> &quot;
[ -> &#91;
] -> &#93;
Parameters: caption – String to de-tokenize. Returns: Detokenized version of caption.
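The "joins chars into words" step can be sketched in a few lines. This is only an illustration under the assumption that the input is a string of space-separated characters with `<space>` marking word boundaries; `join_chars` is a hypothetical helper and the handling of escaped special characters is deliberately left out.

```python
def join_chars(caption, space_token="<space>"):
    """Rebuild words from space-separated characters (hypothetical helper)."""
    # Each chunk between <space> tokens holds the characters of one word.
    words = caption.split(space_token)
    # Drop the whitespace between characters, then join the words with spaces.
    return " ".join("".join(word.split()) for word in words)


print(join_chars("a <space> d o g <space> r u n s"))
```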
-
getClassID
(class_name, data_id)¶ Returns: the class data_id (int) for a given class string.
-
getFramesPaths
(idx_videos, data_id, set_name, max_len, data_augmentation)¶ Recovers the paths from the selected video frames.
-
getImageFromPrediction_3DSemanticLabel
(img, n_classes)¶ Get the segmented image from the prediction of the model using the semantic classes of the dataset together with their corresponding colours.
Parameters: - img – Prediction of the model.
- n_classes – Number of semantic classes.
Returns: out_img: The segmented image with the class colours.
-
getX
(set_name, init, final, normalization_type='(-1)-1', normalization=False, meanSubstraction=False, dataAugmentation=False, wo_da_patch_type='whole', da_patch_type='resize_and_rndcrop', da_enhance_list=None, get_only_ids=False)¶ Gets all the data samples stored between the positions init and final.
Parameters: - set_name – ‘train’, ‘val’ or ‘test’ set
- init – initial position in the corresponding set split. Must be greater than or equal to 0 and smaller than final.
- final – final position in the corresponding set split.
# ‘raw-image’, ‘video’, ‘image-features’ and ‘video-features’-related parameters
Parameters: - normalization – indicates if we want to normalize the data.
# ‘image-features’ and ‘video-features’-related parameters
Parameters: - normalization_type – indicates the type of normalization applied. See available types in self.__available_norm_im_vid for ‘raw-image’ and ‘video’ and self.__available_norm_feat for ‘image-features’ and ‘video-features’.
# ‘raw-image’ and ‘video’-related parameters
Parameters: - meanSubstraction – indicates if we want to subtract the training mean from the returned images (only applicable if normalization=True)
- dataAugmentation – indicates if we want to apply data augmentation to the loaded images (random flip and cropping)
Returns: X, list of input data variables from sample ‘init’ to ‘final’ belonging to the chosen ‘set_name’
-
getXY
(set_name, k, normalization_type='(-1)-1', normalization=False, meanSubstraction=False, dataAugmentation=False, wo_da_patch_type='whole', da_patch_type='resize_and_rndcrop', da_enhance_list=None, get_only_ids=False)¶ Gets the [X,Y] pairs for the next ‘k’ samples in the desired set.
Parameters: - set_name – ‘train’, ‘val’ or ‘test’ set
- k – number of consecutive samples retrieved from the corresponding set.
# ‘raw-image’, ‘video’, ‘image-features’ and ‘video-features’-related parameters
Parameters: - normalization – indicates if we want to normalize the data.
# ‘image-features’ and ‘video-features’-related parameters
Parameters: - normalization_type – indicates the type of normalization applied. See available types in self.__available_norm_im_vid for ‘raw-image’ and ‘video’ and self.__available_norm_feat for ‘image-features’ and ‘video-features’.
# ‘raw-image’ and ‘video’-related parameters
Parameters: - meanSubstraction – indicates if we want to subtract the training mean from the returned images (only applicable if normalization=True)
- dataAugmentation – indicates if we want to apply data augmentation to the loaded images (random flip and cropping)
Returns: [X,Y], list of input and output data variables of the next ‘k’ consecutive samples belonging to the chosen ‘set_name’
-
getXY_FromIndices
(set_name, k, normalization_type='(-1)-1', normalization=False, meanSubstraction=False, dataAugmentation=False, wo_da_patch_type='whole', da_patch_type='resize_and_rndcrop', da_enhance_list=None, get_only_ids=False)¶ Gets the [X,Y] pairs for the samples in positions ‘k’ in the desired set.
Parameters: - set_name – ‘train’, ‘val’ or ‘test’ set
- k – positions of the desired samples
# ‘raw-image’, ‘video’, ‘image-features’ and ‘video-features’-related parameters
Parameters: - normalization – indicates if we want to normalize the data.
# ‘image-features’ and ‘video-features’-related parameters
Parameters: - normalization_type – indicates the type of normalization applied. See available types in self.__available_norm_im_vid for ‘raw-image’ and ‘video’ and self.__available_norm_feat for ‘image-features’ and ‘video-features’.
# ‘raw-image’ and ‘video’-related parameters
Parameters: - meanSubstraction – indicates if we want to subtract the training mean from the returned images (only applicable if normalization=True)
- dataAugmentation – indicates if we want to apply data augmentation to the loaded images (random flip and cropping)
Returns: [X,Y], list of input and output data variables of the samples identified by the indices in ‘k’ belonging to the chosen ‘set_name’
-
getX_FromIndices
(set_name, k, normalization_type='(-1)-1', normalization=False, meanSubstraction=False, dataAugmentation=False, wo_da_patch_type='whole', da_patch_type='resize_and_rndcrop', da_enhance_list=None, get_only_ids=False)¶ Gets the [X] samples in positions ‘k’ in the desired set.
Parameters: - set_name – ‘train’, ‘val’ or ‘test’ set
- k – positions of the desired samples
# ‘raw-image’, ‘video’, ‘image-features’ and ‘video-features’-related parameters
Parameters: - normalization – indicates if we want to normalize the data.
# ‘image-features’ and ‘video-features’-related parameters
Parameters: - normalization_type – indicates the type of normalization applied. See available types in self.__available_norm_im_vid for ‘raw-image’ and ‘video’ and self.__available_norm_feat for ‘image-features’ and ‘video-features’.
# ‘raw-image’ and ‘video’-related parameters
Parameters: - meanSubstraction – indicates if we want to subtract the training mean from the returned images (only applicable if normalization=True)
- dataAugmentation – indicates if we want to apply data augmentation to the loaded images (random flip and cropping)
Returns: X, list of input data variables of the samples identified by the indices in ‘k’ belonging to the chosen ‘set_name’
-
getY
(set_name, init, final, dataAugmentation=False, get_only_ids=False)¶ Gets the [Y] samples for the FULL dataset
Parameters: - set_name – ‘train’, ‘val’ or ‘test’ set
- init – initial position in the corresponding set split. Must be greater than or equal to 0 and smaller than final.
- final – final position in the corresponding set split.
Returns: Y, list of output data variables from sample ‘init’ to ‘final’ belonging to the chosen ‘set_name’
-
getY_FromIndices
(set_name, k, dataAugmentation=False, return_mask=True, wo_da_patch_type='whole', da_patch_type='resize_and_rndcrop', da_enhance_list=None, get_only_ids=False)¶ Gets the [Y] samples in positions ‘k’ in the desired set.
Parameters: - set_name – ‘train’, ‘val’ or ‘test’ set
- k – positions of the desired samples
- dataAugmentation – indicates if we want to apply data augmentation to the loaded images (random flip and cropping)
Returns: Y, list of output data variables of the samples identified by the indices in ‘k’ belonging to the chosen ‘set_name’
-
keepTopOutputs
(set_name, id_out, n_top)¶ Keep the most frequent outputs from a set_name. :param set_name: Set name to modify. :param id_out: Id. :param n_top: Number of elements to keep. :return:
-
static
load3DLabels
(bbox_list, nClasses, dataAugmentation, daRandomParams, img_size, size_crop, image_list)¶ Loads a set of outputs of the type 3DLabel (used for detection)
Parameters: - bbox_list – list of bboxes, labels and original sizes
- nClasses – number of different classes to be detected
- dataAugmentation – are we applying data augmentation?
- daRandomParams – random parameters applied on data augmentation (vflip, hflip and random crop)
- img_size – resize applied to input images
- size_crop – crop size applied to input images
- image_list – list of input images used as identifiers to ‘daRandomParams’
Returns: 3DLabels with shape (batch_size, width*height, classes)
-
load3DSemanticLabels
(labeled_images_list, nClasses, classes_to_colour, dataAugmentation, daRandomParams, img_size, size_crop, image_list)¶ Loads a set of outputs of the type 3DSemanticLabel (used for semantic segmentation TRAINING)
Parameters: - labeled_images_list – list of labeled images
- nClasses – number of different classes to be detected
- classes_to_colour – dictionary relating each class id to their corresponding colour in the labeled image
- dataAugmentation – are we applying data augmentation?
- daRandomParams – random parameters applied on data augmentation (vflip, hflip and random crop)
- img_size – resize applied to input images
- size_crop – crop size applied to input images
- image_list – list of input images used as identifiers to ‘daRandomParams’
Returns: 3DSemanticLabels with shape (batch_size, width*height, classes)
-
loadBinary
(y_raw, data_id)¶ Load a binary vector. May be of type ‘sparse’ :param y_raw: Vector to load. :param data_id: Id to load. :return:
-
static
loadCategorical
(y_raw, nClasses)¶ Converts a class vector (integers) to binary class matrix. From utils. :param y_raw: class vector to be converted into a matrix (integers from 0 to num_classes). :param nClasses: total number of classes. :return:
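The conversion from a class vector to a binary class matrix can be pictured with the following pure-Python sketch. `to_one_hot` is a hypothetical stand-in, not the library function, and it returns nested lists rather than an array.

```python
def to_one_hot(y_raw, n_classes):
    """Turn integer class labels into one-hot rows (hypothetical helper)."""
    matrix = []
    for label in y_raw:
        row = [0] * n_classes
        row[label] = 1  # mark the position of the class
        matrix.append(row)
    return matrix


print(to_one_hot([0, 2, 1], 3))
```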
-
loadFeatures
(X, feat_len, normalization_type='L2', normalization=False, loaded=False, external=False, data_augmentation=True)¶ Loads and normalizes features.
Parameters: - X – Features to load.
- feat_len – Length of the features.
- normalization_type – Normalization to perform to the features (see: self.__available_norm_feat)
- normalization – Whether to normalize or not the features.
- loaded – Flag that indicates if these features have been already loaded.
- external – Boolean indicating if the paths provided in ‘X’ are absolute paths to external images
- data_augmentation – Perform data augmentation (with mean=0.0, std_dev=0.01)
Returns: Loaded features as numpy array
-
loadImages
(images, data_id, normalization_type='(-1)-1', normalization=False, meanSubstraction=False, dataAugmentation=False, daRandomParams=None, wo_da_patch_type='whole', da_patch_type='resize_and_rndcrop', da_enhance_list=None, useBGR=False, external=False, loaded=False)¶ Loads a set of images from disk.
Parameters: - images – list of image string names or list of matrices representing images (only if loaded==True)
- data_id – identifier in the Dataset object of the data we are loading
- normalization_type – type of normalization applied
- normalization – whether we are applying a ‘0-1’ or ‘(-1)-1’ normalization to the images
- meanSubstraction – whether we are removing the training mean
- dataAugmentation – whether we are applying data augmentation (random cropping and horizontal flip)
- daRandomParams – dictionary with results of random data augmentation provided by self.getDataAugmentationRandomParams()
- external – if True the images will be loaded from an external database; in this case the list of images must be absolute paths
- loaded – set this option to True if images is a list of matrices instead of a list of strings
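The two image normalization types named above, ‘0-1’ and ‘(-1)-1’, could look like the sketch below. It is an assumption that the inputs are 8-bit pixel values in the range 0 to 255 and that the ranges are reached by simple linear scaling; `normalize_pixels` is a hypothetical helper and the library may compute this differently.

```python
def normalize_pixels(pixels, normalization_type="(-1)-1"):
    """Linearly rescale 8-bit pixel values (hypothetical helper)."""
    if normalization_type == "0-1":
        # Map [0, 255] onto [0, 1].
        return [p / 255.0 for p in pixels]
    if normalization_type == "(-1)-1":
        # Map [0, 255] onto [-1, 1].
        return [p / 127.5 - 1.0 for p in pixels]
    raise ValueError("unknown normalization type: " + normalization_type)


print(normalize_pixels([0, 255], "0-1"))
print(normalize_pixels([0, 255], "(-1)-1"))
```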
-
loadMapping
(path_list)¶ Loads a mapping of Source – Target words. :param path_list: Pickle object with the mapping :return: None
-
loadText
(X, vocabularies, max_len, offset, fill, pad_on_batch, words_so_far, loading_X=False)¶ Text encoder: Transforms samples from a text representation into a numerical one. It also masks the text.
Parameters: - X – Text to encode.
- vocabularies – Mapping word -> index
- max_len – Maximum length of the text.
- offset – Shifts the text to the right, adding null symbol at the start
- fill – ‘start’: the resulting vector will be filled with 0s at the beginning. ‘end’: it will be filled with 0s at the end. ‘center’: the vector will be surrounded by 0s, both at beginning and end.
- pad_on_batch – Whether we get sentences with length of the maximum length of the minibatch or sentences with a fixed (max_text_length) length.
- words_so_far – Experimental feature. Use with caution.
- loading_X – Whether we are loading an input or an output of the model
Returns: Text as sequence of numbers. Mask for each sentence.
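The effect of the `fill` option and of the returned mask can be illustrated with a small sketch. It assumes the text has already been mapped to indices, that padding uses index 0, and that ‘center’ places any odd leftover padding at the end; `pad_sequence` is a hypothetical helper, not part of the library.

```python
def pad_sequence(ids, max_len, fill="end", pad_id=0):
    """Pad (or truncate) a list of word indices and build its mask (hypothetical)."""
    ids = ids[:max_len]
    n_pad = max_len - len(ids)
    if fill == "end":
        left = 0
    elif fill == "start":
        left = n_pad
    elif fill == "center":
        left = n_pad // 2  # assumption: extra padding goes to the right
    else:
        raise ValueError("fill must be 'start', 'end' or 'center'")
    right = n_pad - left
    padded = [pad_id] * left + ids + [pad_id] * right
    # The mask is 1 on real words and 0 on padding positions.
    mask = [0] * left + [1] * len(ids) + [0] * right
    return padded, mask


for mode in ("start", "end", "center"):
    print(mode, pad_sequence([5, 6, 7], 5, fill=mode))
```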
-
loadTextFeatures
(X, max_len, pad_on_batch, offset)¶ Text encoder: Transforms samples from a text representation into a numerical one. It also masks the text.
Parameters: - X – Encoded text.
- max_len – Maximum length of the text.
- pad_on_batch – Whether we get sentences with length of the maximum length of the minibatch or sentences with a fixed (max_text_length) length.
Returns: Text as sequence of numbers. Mask for each sentence.
-
loadTextFeaturesOneHot
(X, vocabulary_len, max_len, pad_on_batch, offset, sample_weights=False, label_smoothing=0.0)¶ Text encoder: Transforms samples from a text representation into a one-hot. It also masks the text.
Parameters: - X – Encoded text.
- vocabulary_len – Length of the vocabulary (size of the one-hot vector)
- sample_weights – If True, we also return the mask of the text.
- vocabularies – Mapping word -> index
- max_len – Maximum length of the text.
- offset – Shifts the text to the right, adding null symbol at the start
- fill – ‘start’: the resulting vector will be filled with 0s at the beginning. ‘end’: it will be filled with 0s at the end. ‘center’: the vector will be surrounded by 0s, both at beginning and end.
- pad_on_batch – Whether we get sentences with length of the maximum length of the minibatch or sentences with a fixed (max_text_length) length.
- words_so_far – Experimental feature. Use with caution.
- loading_X – Whether we are loading an input or an output of the model
Returns: Text as sequence of one-hot vectors. Mask for each sentence.
-
loadTextOneHot
(X, vocabularies, vocabulary_len, max_len, offset, fill, pad_on_batch, words_so_far, sample_weights=False, loading_X=False, label_smoothing=0.0)¶ Text encoder: Transforms samples from a text representation into a one-hot. It also masks the text.
Parameters: - vocabulary_len –
- sample_weights –
- X – Text to encode.
- vocabularies – Mapping word -> index
- max_len – Maximum length of the text.
- offset – Shifts the text to the right, adding null symbol at the start
- fill – ‘start’: the resulting vector will be filled with 0s at the beginning. ‘end’: it will be filled with 0s at the end. ‘center’: the vector will be surrounded by 0s, both at beginning and end.
- pad_on_batch – Whether we get sentences with length of the maximum length of the minibatch or sentences with a fixed (max_text_length) length.
- words_so_far – Experimental feature. Use with caution.
- loading_X – Whether we are loading an input or an output of the model
Returns: Text as sequence of one-hot vectors. Mask for each sentence.
-
loadVideoFeatures
(idx_videos, data_id, set_name, max_len, normalization_type, normalization, feat_len, external=False, data_augmentation=True)¶ Parameters: - idx_videos – indices of the videos in the complete list of the current set_name
- data_id – identifier of the input/output that we are loading
- set_name – ‘train’, ‘val’ or ‘test’
- max_len – maximum video length (number of frames)
- normalization_type – type of data normalization applied
- normalization – Switch on/off data normalization
- feat_len – length of the features about to load
- external – Switch on/off data loading from external dataset (not sharing self.path)
- data_augmentation – Switch on/off data augmentation
Returns:
-
loadVideos
(n_frames, data_id, last, set_name, max_len, normalization_type, normalization, meanSubstraction, dataAugmentation)¶ Loads a set of videos from disk. (Untested!)
Parameters: - n_frames – Number of frames per video
- data_id – Id to load
- last – Last video loaded
- set_name – ‘train’, ‘val’, ‘test’
- max_len – Maximum length of videos
- normalization_type – Type of normalization applied
- normalization – Whether we apply a 0-1 normalization to the images
- meanSubstraction – Whether we are removing the training mean
- dataAugmentation – Whether we are applying data augmentation (random cropping and horizontal flip)
-
loadVideosByIndex
(n_frames, data_id, indices, set_name, max_len, normalization_type, normalization, meanSubstraction, dataAugmentation)¶ Get videos by indices. :param n_frames: Indices of the frames to load from each video. :param data_id: Data id to be processed. :param indices: Indices of the videos to load. :param set_name: Set name to be processed. :param max_len: Maximum length of each video. :param normalization_type: Normalization type applied to the frames. :param normalization: Normalization applied to the frames. :param meanSubstraction: Mean subtraction applied to the frames. :param dataAugmentation: Whether apply data augmentation. :return:
-
load_GT_3DSemanticLabels
(gt, data_id)¶ Loads a GT list of 3DSemanticLabels in a 2D matrix and reshapes them to an Nx1 array (EVALUATION)
Parameters: - gt – list of Dataset output of type 3DSemanticLabels
- data_id – id of the input/output we are processing
Returns: out_list: containing a list of label images reshaped as an Nx1 array
-
merge_vocabularies
(ids)¶ Merges the vocabularies from a set of text inputs/outputs into a single one.
Parameters: ids – identifiers of the inputs/outputs whose vocabularies will be merged Returns: None
-
preprocess3DSemanticLabel
(path_list, data_id, associated_id_in, num_poolings)¶ Preprocess 3D Semantic labels
-
preprocessBinary
(labels_list, data_id, sparse)¶ Preprocesses binary classes.
Parameters: - data_id –
- labels_list – Binary label list given as an instance of the class list.
- sparse – indicates if the data is stored as a list of lists with class indices, e.g. [[4, 234],[87, 222, 4568],[3],…]
Returns: Preprocessed labels.
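The sparse format mentioned above (a list of lists with class indices) can be expanded into dense binary vectors as in this sketch. `sparse_to_binary` and its `n_classes` argument are hypothetical; how the library stores or expands sparse labels internally is not shown in this reference.

```python
def sparse_to_binary(labels_list, n_classes):
    """Expand lists of active class indices into binary vectors (hypothetical)."""
    vectors = []
    for indices in labels_list:
        row = [0] * n_classes
        for idx in indices:
            row[idx] = 1  # every listed class is active for this sample
        vectors.append(row)
    return vectors


print(sparse_to_binary([[0, 2], [1], []], 3))
```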
-
preprocessCategorical
(labels_list, data_id, sample_weights=False)¶ Preprocesses categorical data.
Parameters: - data_id –
- sample_weights –
- labels_list – Label list. Given as a path to a file or as an instance of the class list.
Returns: Preprocessed labels.
-
preprocessFeatures
(path_list, data_id, set_name, feat_len)¶ Preprocesses features. We should give a path to a text file where each line must contain a path to a .npy file storing a feature vector. Alternatively “path_list” can be an instance of the class list.
Parameters: - path_list – Path to a text file where each line must contain a path to a .npy file storing a feature vector. Alternatively, instance of the class list.
- data_id – Dataset id
- set_name – Used?
- feat_len – Length of features. If all features have the same length, given as a number. Otherwise, list.
Returns: Preprocessed features
-
static
preprocessIDs
(path_list, data_id, set_name)¶ Preprocess ID outputs: Strip and put each ID in a line.
-
preprocessImages
(path_list, data_id, set_name, img_size, img_size_crop, use_RGB)¶ Image preprocessing function. :param path_list: Path to the images. :param data_id: Data id. :param set_name: Set name. :param img_size: Size of the images to process. :param img_size_crop: Size of the image crops. :param use_RGB: Whether use RGB color encoding. :return:
-
static
preprocessReal
(labels_list)¶ Preprocesses real classes.
Parameters: labels_list – Label list. Given as a path to a file or as an instance of the class list. Returns: Preprocessed labels.
-
preprocessText
(annotations_list, data_id, set_name, tokenization, build_vocabulary, max_text_len, max_words, offset, fill, min_occ, pad_on_batch, words_so_far, bpe_codes=None, separator='@@', use_unk_class=False)¶ Preprocess ‘text’ data type: Builds vocabulary (if necessary) and preprocesses the sentences. Also sets Dataset parameters.
Parameters: - annotations_list – Path to the sentences to process.
- data_id – Dataset id of the data.
- set_name – Name of the current set (‘train’, ‘val’, ‘test’)
- tokenization – Tokenization to perform.
- build_vocabulary – Whether we should build a vocabulary for this text or not.
- max_text_len – Maximum length of the text. If max_text_len == 0, we treat the full sentence as a class.
- max_words – Maximum number of words to include in the dictionary.
- offset – Text shifting.
- fill – Whether we pad with zeros at the beginning or at the end of the sentences.
- min_occ – Minimum occurrences of each word to be included in the dictionary.
- pad_on_batch – Whether we get sentences with length of the maximum length of the minibatch or sentences with a fixed (max_text_length) length.
- words_so_far – Experimental feature. Should be ignored.
- bpe_codes – Codes used for applying BPE encoding.
- separator – BPE encoding separator.
- use_unk_class – Add a special class for the unknown word when max_text_len == 0.
Returns: Preprocessed sentences.
-
preprocessTextFeatures
(annotations_list, data_id, set_name, tokenization, build_vocabulary, max_text_len, max_words, offset, fill, min_occ, pad_on_batch, words_so_far, bpe_codes=None, separator='@@', use_unk_class=False)¶ Preprocess ‘text’ data type: Builds vocabulary (if necessary) and preprocesses the sentences. Also sets Dataset parameters.
Parameters: - annotations_list – Path to the sentences to process.
- data_id – Dataset id of the data.
- set_name – Name of the current set (‘train’, ‘val’, ‘test’)
- tokenization – Tokenization to perform.
- build_vocabulary – Whether we should build a vocabulary for this text or not.
- max_text_len – Maximum length of the text. If max_text_len == 0, we treat the full sentence as a class.
- max_words – Maximum number of words to include in the dictionary.
- offset – Text shifting.
- fill – Whether we pad with zeros at the beginning or at the end of the sentences.
- min_occ – Minimum occurrences of each word to be included in the dictionary.
- pad_on_batch – Whether we get sentences with length of the maximum length of the minibatch or sentences with a fixed (max_text_length) length.
- words_so_far – Experimental feature. Should be ignored.
- bpe_codes – Codes used for applying BPE encoding.
- separator – BPE encoding separator.
Returns: Preprocessed sentences.
-
preprocessVideoFeatures
(path_list, data_id, set_name, max_video_len, img_size, img_size_crop, feat_len)¶ Preprocess already extracted features from video frames. :param path_list: path to all features in all videos :param data_id: Data id to be processed. :param set_name: Set name to be processed. :param max_video_len: Maximum number of subsampled video features. :param img_size: Size of each frame. :param img_size_crop: Size of each image crop. :param feat_len: Length of each feature. :return:
-
preprocessVideos
(path_list, data_id, set_name, max_video_len, img_size, img_size_crop)¶ Preprocess videos. Subsample and crop frames. :param path_list: path to all images in all videos :param data_id: Data id to be processed. :param set_name: Set name to be processed. :param max_video_len: Maximum number of subsampled video frames. :param img_size: Size of each frame. :param img_size_crop: Size of each image crop. :return:
-
removeInput
(set_name, id='label', type='categorical')¶ Deletes an input from the dataset. :param set_name: Set name to remove. :param id: Input to remove id. :param type: Type of the input to remove. :return:
-
removeOutput
(set_name, id='label', type='categorical')¶ Deletes an output from the dataset. :param set_name: Set name to remove. :param id: Output to remove id. :param type: Type of the output to remove. :return:
-
replaceInput
(data, set_name, data_type, data_id)¶ Replaces the data in a certain set_name and for a given data_id
-
resetCounters
(set_name='all')¶ Resets some basic counter indices for the next samples to read.
-
resize_semantic_output
(predictions, ids_out)¶ Resize semantic output.
-
setClasses
(path_classes, data_id)¶ Loads the list of classes of the dataset. Each line must contain a unique identifier of the class.
Parameters: - path_classes – Path to a text file with the classes or an instance of the class list.
- data_id – Dataset id
Returns: None
-
setInput
(path_list, set_name, type='raw-image', id='image', repeat_set=1, required=True, overwrite_split=False, normalization_types=None, data_augmentation_types=None, add_additional=False, img_size=None, img_size_crop=None, use_RGB=True, max_text_len=35, tokenization='tokenize_none', offset=0, fill='end', min_occ=0, pad_on_batch=True, build_vocabulary=False, max_words=0, words_so_far=False, bpe_codes=None, separator='@@', use_unk_class=False, feat_len=1024, max_video_len=26, sparse=False)¶ Loads a list which can contain all samples from either the ‘train’, ‘val’, or ‘test’ set splits (specified by set_name).
# General parameters
Parameters: - use_RGB –
- path_list – can either be a path to a text file containing the paths to the images or a python list of paths
- set_name – identifier of the set split loaded (‘train’, ‘val’ or ‘test’)
- type – identifier of the type of input we are loading (accepted types can be seen in self.__accepted_types_inputs)
- id – identifier of the input data loaded
- repeat_set – repeats the inputs given (useful when we have more outputs than inputs). Int or array of ints.
- required – flag for optional inputs
- overwrite_split – indicates that we want to overwrite the data with id that was already declared in the dataset
- normalization_types – type of normalization applied to the current input if we activate the data normalization while loading
- data_augmentation_types – type of data augmentation applied to the current input if we activate the data augmentation while loading
- add_additional – adds additional data to an already existent input ID
# ‘raw-image’-related parameters
Parameters: - img_size – size of the input images (any input image will be resized to this)
- img_size_crop – size of the cropped zone (when dataAugmentation=False the central crop will be used)
# ‘text’-related parameters
Parameters: - tokenization – type of tokenization applied (must be declared as a method of this class) (only applicable when type==’text’).
- build_vocabulary – whether a new vocabulary will be built from the loaded data or not (only applicable when type==’text’). A previously calculated vocabulary will be used if build_vocabulary is an ‘id’ from a previously loaded input/output
- max_text_len – maximum text length, the rest of the data will be padded with 0s (only applicable if the output data is of type ‘text’).
- max_words – a maximum of ‘max_words’ words from the whole vocabulary will be chosen by number of occurrences
- offset – number of timesteps that the text is shifted to the right (for sequential conditional models, which take as input the previous output)
- fill – selects whether padding is added before or after the sequence
- min_occ – minimum number of occurrences allowed for the words in the vocabulary. (default = 0)
- pad_on_batch – the batch timesteps size will be set to the length of the largest sample +1 if True, max_len will be used as the fixed length otherwise
- words_so_far – if True, each sample will be represented as the complete set of words until the point defined by the timestep dimension (e.g. t=0 ‘a’, t=1 ‘a dog’, t=2 ‘a dog is’, etc.)
- bpe_codes – Codes used for applying BPE encoding.
- separator – BPE encoding separator.
# ‘image-features’ and ‘video-features’- related parameters
Parameters: - feat_len – size of the feature vectors for each dimension. We must provide a list if the features are not vectors.
# ‘video’-related parameters
Parameters: - max_video_len – maximum video length, the rest of the data will be padded with 0s (only applicable if the input data is of type ‘video’ or ‘video-features’).
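The text-related options above can be illustrated with a minimal, self-contained sketch. It does not call keras_wrapper: the helper names (batch_length, pad_sample, prefixes_so_far) are hypothetical, and the handling of fill (padding after the tokens for 'end', before them otherwise) and the truncation of long samples are assumptions rather than the library's confirmed behaviour.

```python
def batch_length(samples, max_text_len, pad_on_batch=True):
    """Timestep dimension of a batch, following the pad_on_batch description."""
    if pad_on_batch:
        # Length of the largest sample in the batch, plus one
        return max(len(s) for s in samples) + 1
    # Otherwise a fixed length is used
    return max_text_len


def pad_sample(ids, length, fill='end', pad_id=0):
    """Pad (or truncate) a list of word indices to `length` (assumed behaviour)."""
    ids = list(ids)[:length]
    padding = [pad_id] * (length - len(ids))
    # Assumption: fill='end' pads after the sequence, anything else pads before it
    return ids + padding if fill == 'end' else padding + ids


def prefixes_so_far(tokens):
    """Cumulative prefixes, as in the words_so_far example (t=0 'a', t=1 'a dog', ...)."""
    return [' '.join(tokens[:t + 1]) for t in range(len(tokens))]


samples = [[5, 8, 2], [7, 3]]
length = batch_length(samples, max_text_len=35)
padded = [pad_sample(s, length) for s in samples]
print(padded)
print(prefixes_so_far(['a', 'dog', 'is']))
```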
-
setLabels
(labels_list, set_name, type='categorical', id='label')¶ DEPRECATED
-
setList
(path_list, set_name, type='raw-image', id='image')¶ DEPRECATED
-
setListGeneral
(path_list, split=None, shuffle=True, type='raw-image', id='image')¶ DEPRECATED
-
setOutput
(path_list, set_name, type='categorical', id='label', repeat_set=1, overwrite_split=False, add_additional=False, sample_weights=False, label_smoothing=0.0, tokenization='tokenize_none', max_text_len=0, offset=0, fill='end', min_occ=0, pad_on_batch=True, words_so_far=False, build_vocabulary=False, max_words=0, bpe_codes=None, separator='@@', use_unk_class=False, associated_id_in=None, num_poolings=None, sparse=False)¶ Loads a set of output data.
# General parameters
Parameters: - path_list – can either be a path to a text file containing the labels or a python list of labels.
- set_name – identifier of the set split loaded (‘train’, ‘val’ or ‘test’).
- type – identifier of the type of output we are loading (accepted types can be seen in self.__accepted_types_outputs).
- id – identifier of the output data loaded.
- repeat_set – repeats the outputs given (useful when we have more inputs than outputs). Int or array of ints.
- overwrite_split – indicates that we want to overwrite the data with id that was already declared in the dataset
- add_additional – adds additional data to an already existent output ID
- sample_weights – switch on/off sample weights usage for the current output
- label_smoothing – epsilon value for label smoothing. See arxiv.org/abs/1512.00567.
# ‘text’-related parameters
- tokenization – type of tokenization applied (must be declared as a method of this class) (only applicable when type==’text’).
- build_vocabulary – whether a new vocabulary will be built from the loaded data or not (only applicable when type==’text’).
- max_text_len – maximum text length, the rest of the data will be padded with 0s (only applicable if the output data is of type ‘text’). Set to 0 if the whole sentence will be used as an output class.
- max_words – a maximum of ‘max_words’ words from the whole vocabulary will be chosen by number of occurrences
- offset – number of timesteps that the text is shifted to the right (for sequential conditional models, which take as input the previous output)
- fill – selects whether padding is added before or after the sequence
- min_occ – minimum number of occurrences allowed for the words in the vocabulary. (default = 0)
- pad_on_batch – the batch timesteps size will be set to the length of the largest sample +1 if True, max_len will be used as the fixed length otherwise
- words_so_far – if True, each sample will be represented as the complete set of words until the point defined by the timestep dimension (e.g. t=0 ‘a’, t=1 ‘a dog’, t=2 ‘a dog is’, etc.)
- bpe_codes – Codes used for applying BPE encoding.
- separator – BPE encoding separator.
# ‘3DLabel’ or ‘3DSemanticLabel’-related parameters
- associated_id_in – id of the input ‘raw-image’ associated to the inputted 3DLabels or 3DSemanticLabel
- num_poolings – number of pooling layers used in the model (used for calculating output dimensions)
# ‘binary’-related parameters
- sparse – indicates if the data is stored as a list of lists with class indices, e.g. [[4, 234],[87, 222, 4568],[3],…]
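As a rough illustration of the label_smoothing parameter, the sketch below applies uniform label smoothing to one-hot targets following the formulation of the referenced paper (arxiv.org/abs/1512.00567). The function name smooth_labels is hypothetical, and the library's own routine may differ in detail.

```python
import numpy as np


def smooth_labels(y, epsilon):
    """Uniform label smoothing: keep (1 - epsilon) of each target and
    spread epsilon evenly over all classes."""
    y = np.asarray(y, dtype=float)
    n_classes = y.shape[-1]
    return y * (1.0 - epsilon) + epsilon / n_classes


one_hot = [[0, 0, 1, 0]]
smoothed = smooth_labels(one_hot, epsilon=0.1)
print(smoothed)
```

Each smoothed row still sums to 1, so it remains a valid target distribution for a cross-entropy loss.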
-
setRawInput
(path_list, set_name, type='file-name', id='raw-text', overwrite_split=False)¶ Loads a list which can contain all samples from either the ‘train’, ‘val’, or ‘test’ set splits (specified by set_name).
# General parameters
Parameters: - path_list – path to a text file containing the paths to the images or a python list of paths
- set_name – identifier of the set split loaded (‘train’, ‘val’ or ‘test’)
- type – identifier of the type of input we are loading (see self.__accepted_types_inputs for accepted types)
- id – identifier of the input data loaded
- overwrite_split –
-
setRawOutput
(path_list, set_name, type='file-name', id='raw-text', overwrite_split=False, add_additional=False)¶ Loads a list which can contain all samples from either the ‘train’, ‘val’, or ‘test’ set splits (specified by set_name).
# General parameters
Parameters: - overwrite_split –
- add_additional –
- path_list – can either be a path to a text file containing the paths to the images or a python list of paths
- set_name – identifier of the set split loaded (‘train’, ‘val’ or ‘test’)
- type – identifier of the type of input we are loading (accepted types can be seen in self.__accepted_types_inputs)
- id – identifier of the input data loaded
-
setSemanticClasses
(path_classes, data_id)¶ Loads the list of semantic classes of the dataset together with their corresponding colours in the GT image. Each line must contain a unique identifier of the class and its associated RGB colour representation separated by commas.
Parameters: - path_classes – Path to a text file with the classes and their colours.
- data_id – input/output id
Returns: None
-
setSilence
(silence)¶ Changes the silence mode of the ‘Dataset’ instance.
-
setTrainMean
(mean_image, data_id, normalization=False)¶ Loads a pre-calculated training mean image, ‘mean_image’ can either be:
- numpy.array (complete image)
- list with a value per channel
- string with the path to the stored image.
Parameters: - mean_image –
- normalization –
- data_id – identifier of the type of input whose train mean is being introduced.
-
shuffleTraining
()¶ Applies a random shuffling to the training samples.
-
static
tokenize_CNN_sentence
(caption)¶ Tokenization employed in the CNN_sentence package (https://github.com/yoonkim/CNN_sentence/blob/master/process_data.py#L97).
Parameters: caption – String to tokenize Returns: Tokenized version of caption
-
static
tokenize_aggressive
(caption, lowercase=True)¶ - Aggressive tokenizer for the input/output data of type ‘text’:
- Removes punctuation
- Optional lowercasing
Parameters: - caption – String to tokenize
- lowercase – Whether to lowercase the caption or not
Returns: Tokenized version of caption
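The two steps listed for the aggressive tokenizer (punctuation removal and optional lowercasing) can be sketched as follows. This is only an approximation under stated assumptions: which characters count as punctuation, and the fact that a whitespace-joined string is returned, are guesses and not taken from the library; the function name is hypothetical.

```python
import re


def tokenize_aggressive_sketch(caption, lowercase=True):
    """Remove punctuation and optionally lowercase (illustrative only)."""
    # Assumption: anything that is neither a word character nor whitespace
    # is treated as punctuation and replaced by a space
    text = re.sub(r'[^\w\s]', ' ', caption)
    if lowercase:
        text = text.lower()
    # Collapse repeated whitespace and strip the ends
    return ' '.join(text.split())


print(tokenize_aggressive_sketch('Hello, World! How are you?'))
```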
-
static
tokenize_basic
(caption, lowercase=True)¶ - Basic tokenizer for the input/output data of type ‘text’:
- Splits punctuation
- Optional lowercasing
Parameters: - caption – String to tokenize
- lowercase – Whether to lowercase the caption or not
Returns: Tokenized version of caption
-
tokenize_bpe
(caption)¶ Applies BPE segmentation (https://github.com/rsennrich/subword-nmt).
Parameters: caption – Caption to tokenize Returns: Encoded version of caption
-
static
tokenize_icann
(caption)¶ - Tokenization used for the icann paper:
- Removes some punctuation (. , “)
- Lowercasing
Parameters: caption – String to tokenize Returns: Tokenized version of caption
-
static
tokenize_montreal
(caption)¶ - Similar to tokenize_icann
- Removes some punctuation
- Lowercase
Parameters: caption – String to tokenize Returns: Tokenized version of caption
-
tokenize_moses
(caption, language='en', lowercase=False, aggressive_dash_splits=False, return_str=True, escape=False)¶ Applies the Moses tokenization. Relying on sacremoses’ implementation of the Moses tokenizer.
Parameters: - caption – Sentence to tokenize
- language – Language (will build the tokenizer for this language)
- lowercase – Whether to lowercase or not the sentence
- aggressive_dash_splits – Option to trigger dash split rules.
- return_str – Return string or list
- escape – Escape HTML special chars
Returns:
-
static
tokenize_none
(caption)¶ Does not tokenize the sentence. Only strips it.
Parameters: caption – String to tokenize Returns: Tokenized version of caption
-
static
tokenize_none_char
(caption)¶ Character-level tokenization. Respects all symbols. Separates chars. Inserts the <space> symbol for spaces. If an escaped char is found (e.g. the “&apos;” symbol), it is converted to the original one.
List of escaped chars (by the Moses tokenizer): & -> &amp;, | -> &#124;, < -> &lt;, > -> &gt;, ' -> &apos;, " -> &quot;, [ -> &#91;, ] -> &#93;
Parameters: caption – String to tokenize Returns: Tokenized version of caption
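A character-level tokenization of this kind might look like the sketch below. It is not the library's code: the function name is hypothetical, Python's html.unescape is swapped in to restore Moses-style escaped characters, and returning a space-joined string is an assumption.

```python
import html


def tokenize_chars_sketch(caption, space_symbol='<space>'):
    """Split a caption into characters, marking spaces explicitly."""
    # Restore escaped characters such as &apos; or &#124; to their originals
    text = html.unescape(caption.strip())
    # Every character becomes a token; real spaces become the <space> symbol
    return ' '.join(space_symbol if ch == ' ' else ch for ch in text)


print(tokenize_chars_sketch('a dog'))
print(tokenize_chars_sketch('it&apos;s'))
```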
-
static
tokenize_questions
(caption)¶ - Basic tokenizer for VQA questions:
- Lowercasing
- Splits contractions
- Removes punctuation
- Numbers to digits
Parameters: caption – String to tokenize Returns: Tokenized version of caption
-
static
tokenize_soft
(caption, lowercase=True)¶ - Tokenization used for the icann paper:
- Removes very little punctuation
- Lowercase
Parameters: - caption – String to tokenize
- lowercase – Whether to lowercase the caption or not
Returns: Tokenized version of caption
-
-
class
keras_wrapper.dataset.
Homogeneous_Data_Batch_Generator
(set_split, net, dataset, num_iterations, batch_size=50, joint_batches=20, normalization=False, normalization_type=None, data_augmentation=True, wo_da_patch_type='whole', da_patch_type='resize_and_rndcrop', da_enhance_list=None, mean_substraction=False, predict=False, random_samples=-1, shuffle=True)¶ Batch generator class. Retrieves batches of data.
-
generator
()¶ Gets and processes the data.
Returns: Generator with the data
-
reset
()¶ Resets the counters.
-
retrieve_maxibatch
()¶ Gets a maxibatch of self.params[‘joint_batches’] * self.batch_size samples.
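The maxibatch idea can be pictured with a small sketch: take joint_batches * batch_size samples at once and cut them into minibatches. Sorting the maxibatch by sample length before splitting is an assumption suggested by the "Homogeneous" class name, not something stated in this reference, and split_maxibatch is a hypothetical helper.

```python
def split_maxibatch(samples, batch_size, joint_batches):
    """Take one maxibatch and split it into length-homogeneous minibatches."""
    # A maxibatch holds joint_batches * batch_size samples
    maxibatch = samples[:joint_batches * batch_size]
    # Assumption: group samples of similar length together
    maxibatch = sorted(maxibatch, key=len)
    return [maxibatch[i:i + batch_size]
            for i in range(0, len(maxibatch), batch_size)]


data = ['aaa', 'a', 'aa', 'aaaa', 'a', 'aa']
minibatches = split_maxibatch(data, batch_size=2, joint_batches=2)
print(minibatches)
```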
-
-
class
keras_wrapper.dataset.
Parallel_Data_Batch_Generator
(set_split, net, dataset, num_iterations, batch_size=50, normalization=False, normalization_type=None, data_augmentation=True, wo_da_patch_type='whole', da_patch_type='resize_and_rndcrop', da_enhance_list=None, mean_substraction=False, predict=False, random_samples=-1, shuffle=True, temporally_linked=False, init_sample=-1, final_sample=-1, n_parallel_loaders=1)¶ Batch generator class. Retrieves batches of data.
-
generator
()¶ Gets and processes the data.
Returns: Generator with the data
-
-
keras_wrapper.dataset.
dataLoad
(process_name, net, dataset, max_queue_len, queues)¶ Parallel data loader. Risky and untested!
Parameters: - process_name –
- net –
- dataset –
- max_queue_len –
- queues –
Returns:
-
keras_wrapper.dataset.
loadDataset
(dataset_path)¶ Loads a previously saved Dataset object.
Parameters: dataset_path – Path to the stored Dataset to load Returns: Loaded Dataset object
-
keras_wrapper.dataset.
saveDataset
(dataset, store_path)¶ Saves a backup of the current Dataset object.
Parameters: - dataset – Dataset object to save
- store_path – Saving path
Returns: None
cnn_model.py¶
callbacks_keras_wrapper.py¶
beam_search_ensemble.py¶
utils.py¶
-
class
keras_wrapper.utils.
MultiprocessQueue
(manager, multiprocess_type='Queue')¶ Wrapper class for encapsulating the behaviour of some multiprocessing communication structures.
See how Queues and Pipes work in the following link https://docs.python.org/2/library/multiprocessing.html#multiprocessing-examples
-
keras_wrapper.utils.
bbox
(img, mode='max')¶ Returns a bounding box covering all the non-zero area in the image.
Parameters: - img – Image for which the bounding box is computed
- mode – “width_height” returns width in [2] and height in [3], “max” returns xmax in [2] and ymax in [3]
Returns:
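A bounding box over the non-zero area can be computed with NumPy roughly as sketched below. The function name is hypothetical, and two details are assumptions: that positions [0] and [1] hold the minimum x and y, and that width and height are the plain differences between the maximum and minimum coordinates.

```python
import numpy as np


def bbox_sketch(img, mode='max'):
    """Bounding box of the non-zero region of a 2D array (illustrative only).

    Raises IndexError if the image contains no non-zero pixel.
    """
    rows = np.any(img, axis=1)   # rows containing at least one non-zero pixel
    cols = np.any(img, axis=0)   # columns containing at least one non-zero pixel
    y_min, y_max = np.where(rows)[0][[0, -1]]
    x_min, x_max = np.where(cols)[0][[0, -1]]
    if mode == 'width_height':
        return int(x_min), int(y_min), int(x_max - x_min), int(y_max - y_min)
    return int(x_min), int(y_min), int(x_max), int(y_max)


img = np.zeros((6, 8))
img[2:5, 3:7] = 1
print(bbox_sketch(img, mode='max'))
print(bbox_sketch(img, mode='width_height'))
```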
-
keras_wrapper.utils.
build_OneVsAllECOC_Stage
(n_classes_ecoc, input_shape, ds, stage1_lr)¶ Parameters: - n_classes_ecoc –
- input_shape –
- ds –
- stage1_lr –
Returns:
-
keras_wrapper.utils.
build_OneVsOneECOC_Stage
(n_classes_ecoc, input_shape, ds, stage1_lr=0.01, ecoc_version=2)¶ Parameters: - n_classes_ecoc –
- input_shape –
- ds –
- stage1_lr –
- ecoc_version –
Returns:
-
keras_wrapper.utils.
build_Specific_OneVsOneECOC_Stage
(pairs, input_shape, ds, lr, ecoc_version=2)¶ Parameters: - pairs –
- input_shape –
- ds –
- lr –
- ecoc_version –
Returns:
-
keras_wrapper.utils.
build_Specific_OneVsOneECOC_loss_Stage
(net, input_net, input_shape, classes, ecoc_version=3, pairs=None, functional_api=False, activations=None)¶ Parameters: - net –
- input_net –
- input_shape –
- classes –
- ecoc_version –
- pairs –
- functional_api –
- activations –
Returns:
-
keras_wrapper.utils.
build_Specific_OneVsOneVsRestECOC_Stage
(pairs, input_shape, ds, lr, ecoc_version=2)¶ Parameters: - pairs –
- input_shape –
- ds –
- lr –
- ecoc_version –
Returns:
-
keras_wrapper.utils.
checkParameters
(input_params, default_params, hard_check=False)¶ Validates a set of input parameters and uses the default ones if not specified.
Parameters: - input_params – Input parameters.
- default_params – Default parameters
- hard_check – If True, raise exception if a parameter is not valid.
Returns:
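The validation described above can be approximated as follows. This is a sketch under assumptions: a parameter is treated as "not valid" when its name is missing from default_params, invalid parameters are dropped when hard_check is False, and the exception type is arbitrary. The function name is hypothetical.

```python
def check_parameters_sketch(input_params, default_params, hard_check=False):
    """Merge input parameters over the defaults, rejecting unknown names."""
    unknown = [key for key in input_params if key not in default_params]
    if unknown and hard_check:
        raise ValueError('Invalid parameters: %s' % ', '.join(sorted(unknown)))
    # Start from the defaults and override with the valid inputs only
    params = dict(default_params)
    params.update((k, v) for k, v in input_params.items() if k in default_params)
    return params


defaults = {'batch_size': 50, 'shuffle': True}
params = check_parameters_sketch({'batch_size': 32, 'typo_param': 1}, defaults)
print(params)
```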
-
keras_wrapper.utils.
decode_categorical
(preds, index2word, verbose=0)¶ Decodes predictions.
Parameters: - preds – Predictions codified as the output of a softmax activation function.
- index2word – Mapping from word indices into word characters.
Returns: List of decoded predictions.
-
keras_wrapper.utils.
decode_multilabel
(preds, index2word, min_val=0.5, get_probs=False, verbose=0)¶ Decodes predictions.
Parameters: - preds – Predictions codified as the output of a softmax activation function.
- index2word – Mapping from word indices into word characters.
- min_val – Minimum value needed for considering a positive prediction.
- get_probs – additionally return probability for each predicted label
- verbose – Verbosity level, by default 0.
Returns: List of decoded predictions.
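The thresholding behind min_val can be sketched as below. The function name is hypothetical, the get_probs option is left out, and whether the comparison is inclusive (>=) is an assumption.

```python
def decode_multilabel_sketch(preds, index2word, min_val=0.5):
    """Keep, for every sample, the labels whose score reaches min_val."""
    decoded = []
    for scores in preds:
        # Assumption: a score equal to min_val counts as a positive prediction
        decoded.append([index2word[i] for i, p in enumerate(scores) if p >= min_val])
    return decoded


index2word = {0: 'cat', 1: 'dog', 2: 'bird'}
preds = [[0.9, 0.2, 0.7], [0.1, 0.1, 0.1]]
print(decode_multilabel_sketch(preds, index2word))
```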
-
keras_wrapper.utils.
decode_predictions
(preds, temperature, index2word, sampling_type, verbose=0)¶ Decodes predictions.
Parameters: - preds – Predictions codified as the output of a softmax activation function.
- temperature – Temperature for sampling.
- index2word – Mapping from word indices into word characters.
- sampling_type – ‘max_likelihood’ or ‘multinomial’.
- verbose – Verbosity level, by default 0.
Returns: List of decoded predictions.
-
keras_wrapper.utils.
decode_predictions_beam_search
(preds, index2word, glossary=None, alphas=None, heuristic=0, x_text=None, unk_symbol='<unk>', pad_sequences=False, mapping=None, verbose=0)¶ Decodes predictions from the BeamSearch method.
Parameters: - preds – Predictions codified as word indices.
- index2word – Mapping from word indices into word characters.
- alphas – Attention model weights: Float matrix with shape (I, J) (I: number of target items; J: number of source items).
- heuristic – Replace unknown words heuristic (0, 1 or 2)
- x_text – Source text (for unk replacement)
- unk_symbol – Unknown words symbol
- pad_sequences – Whether we should make a zero-pad on the input sequence.
- mapping – Source-target dictionary (for unk_replace heuristics 1 and 2)
- verbose – Verbosity level, by default 0.
Returns: List of decoded predictions
-
keras_wrapper.utils.
decode_predictions_one_hot
(preds, index2word, pad_sequences=True, verbose=0)¶ Decodes predictions following a one-hot codification.
Parameters: - preds – Predictions codified as one-hot vectors.
- index2word – Mapping from word indices into word characters.
- verbose – Verbosity level, by default 0.
Returns: List of decoded predictions
-
keras_wrapper.utils.
flatten
(l)¶ Flattens a list (more general than flatten_list_of_lists, but also less efficient).
Parameters: l – List to flatten Returns: Flattened list
-
keras_wrapper.utils.
flatten_list_of_lists
(list_of_lists)¶ Flattens a list of lists.
Parameters: list_of_lists – List of lists Returns: Flattened list
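The difference between the two flattening helpers can be illustrated with the sketch below: one handles exactly one level of nesting, the other recurses through arbitrary nesting at a higher cost. Both functions are hypothetical stand-ins, not the library's implementations.

```python
from itertools import chain


def flatten_list_of_lists_sketch(list_of_lists):
    """Flatten exactly one level of nesting."""
    return list(chain.from_iterable(list_of_lists))


def flatten_sketch(l):
    """Recursively flatten arbitrarily nested lists and tuples."""
    flat = []
    for item in l:
        if isinstance(item, (list, tuple)):
            flat.extend(flatten_sketch(item))
        else:
            flat.append(item)
    return flat


print(flatten_list_of_lists_sketch([[1, 2], [3], []]))
print(flatten_sketch([1, [2, [3, [4]]], (5, 6)]))
```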
-
keras_wrapper.utils.
indices_2_one_hot
(indices, n)¶ Converts a list of indices into one hot codification
Parameters: - indices – list of indices
- n – integer. Size of the vocabulary
Returns: numpy array with shape (len(indices), n)
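A conversion with the documented output shape (len(indices), n) can be written in a few lines of NumPy. The sketch below uses a hypothetical function name and an arbitrary integer dtype; the last line shows how an argmax recovers the original indices.

```python
import numpy as np


def indices_to_one_hot_sketch(indices, n):
    """Return an array of shape (len(indices), n) with a single 1 per row."""
    one_hot = np.zeros((len(indices), n), dtype=np.int8)
    # Row i gets a 1 in column indices[i]
    one_hot[np.arange(len(indices)), indices] = 1
    return one_hot


one_hot = indices_to_one_hot_sketch([2, 0, 4], n=5)
print(one_hot)
print(np.argmax(one_hot, axis=-1))
```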
-
keras_wrapper.utils.
key_with_max_val
(d)¶ - create a list of the dict’s keys and values;
- return the key with the max value
-
keras_wrapper.utils.
loadGoogleNetForFood101
(nClasses=101, load_path='/media/HDD_2TB/CNN_MODELS/GoogleNet')¶ Parameters: - nClasses –
- load_path –
Returns:
-
keras_wrapper.utils.
one_hot_2_indices
(preds, pad_sequences=True, verbose=0)¶ Converts a one-hot codification into an index-based one.
Parameters: - preds – Predictions codified as one-hot vectors.
- pad_sequences – Whether we should pad sequence or not
- verbose – Verbosity level, by default 0.
Returns: List of converted predictions
-
keras_wrapper.utils.
prepareECOCLossOutputs
(net, ds, ecoc_table, input_name, output_names, splits=None)¶ Parameters: - net –
- ds –
- ecoc_table –
- input_name –
- output_names –
- splits –
Returns:
-
keras_wrapper.utils.
prepareGoogleNet_Food101
(model_wrapper)¶ Prepares the GoogleNet model after its conversion from Caffe.
Parameters: model_wrapper – Returns:
-
keras_wrapper.utils.
prepareGoogleNet_Food101_ECOC_loss
(model_wrapper)¶ Prepares the GoogleNet model for inserting an ECOC structure after removing the last part of the net.
Parameters: model_wrapper – Returns:
-
keras_wrapper.utils.
prepareGoogleNet_Food101_Stage1
(model_wrapper)¶ Prepares the GoogleNet model for serving as the first Stage of a Staged_Network.
Parameters: model_wrapper – Returns:
-
keras_wrapper.utils.
prepareGoogleNet_Stage2
(stage1, stage2)¶ Removes the second part of the GoogleNet for inserting it into the second stage.
Parameters: - stage1 –
- stage2 –
Returns:
-
keras_wrapper.utils.
print_dict
(d, header='')¶ Formats a dictionary for printing.
Parameters: d – Dictionary to print. Returns: String containing the formatted dictionary.
-
keras_wrapper.utils.
replace_unknown_words
(src_word_seq, trg_word_seq, hard_alignment, unk_symbol, glossary=None, heuristic=0, mapping=None, verbose=0)¶ Replaces unknown words from the target sentence according to some heuristic. Borrowed from: https://github.com/sebastien-j/LV_groundhog/blob/master/experiments/nmt/replace_UNK.py
Parameters: - src_word_seq – Source sentence words
- trg_word_seq – Hypothesis words
- hard_alignment – Target-Source alignments
- glossary – Hard-coded substitutions.
- unk_symbol – Symbol in trg_word_seq to replace
- heuristic – Heuristic (0, 1, 2)
- mapping – External alignment dictionary
- verbose – Verbosity level
Returns: trg_word_seq with replaced unknown words
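The unknown-word replacement can be sketched as follows. The meaning given to each heuristic is an assumption: 0 copies the aligned source word, 1 looks the source word up in the mapping and falls back to copying, 2 uses the mapping only when the source word starts with a lowercase letter. The function name is hypothetical, hard_alignment is taken to be one source position per target word, and the glossary option is left out.

```python
def replace_unk_sketch(src_words, trg_words, hard_alignment,
                       unk_symbol='<unk>', heuristic=0, mapping=None):
    """Replace unk_symbol in trg_words using the aligned source words."""
    mapping = mapping or {}
    replaced = []
    for j, word in enumerate(trg_words):
        if word != unk_symbol:
            replaced.append(word)
            continue
        # Source word aligned with target position j
        src = src_words[hard_alignment[j]]
        if heuristic == 0:
            replaced.append(src)                    # plain copy
        elif heuristic == 1:
            replaced.append(mapping.get(src, src))  # dictionary, else copy
        elif heuristic == 2:
            # dictionary only for lowercase-initial source words, else copy
            replaced.append(mapping.get(src, src) if src[:1].islower() else src)
        else:
            raise ValueError('Unknown heuristic: %r' % heuristic)
    return replaced


src = ['das', 'Haus', 'ist', 'klein']
trg = ['the', '<unk>', 'is', '<unk>']
alignment = [0, 1, 2, 3]
dictionary = {'Haus': 'house', 'klein': 'small'}
print(replace_unk_sketch(src, trg, alignment, heuristic=1, mapping=dictionary))
```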
-
keras_wrapper.utils.
sampling
(scores, sampling_type='max_likelihood', temperature=1.0)¶ Samples words (each sample is drawn from a categorical distribution) or picks the words that maximize the likelihood.
Parameters: - scores – array of size #samples x #classes; every entry determines a score for sample i having class j
- sampling_type – ‘max_likelihood’ or ‘multinomial’
- temperature – Predictions temperature. The higher, the flatter the probabilities, hence more random outputs.
Returns: set of indices chosen as output, a vector of size #samples
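The two sampling modes and the effect of the temperature can be sketched with NumPy as below. This is an illustration under assumptions: scores are taken to be probabilities (e.g. softmax outputs), the temperature is applied to their logarithms, and the rng argument and function name are hypothetical additions for reproducibility.

```python
import numpy as np


def sampling_sketch(scores, sampling_type='max_likelihood', temperature=1.0, rng=None):
    """Pick one class index per row of a (#samples x #classes) score array."""
    scores = np.asarray(scores, dtype=float)
    if sampling_type == 'max_likelihood':
        return np.argmax(scores, axis=-1)
    if sampling_type == 'multinomial':
        if rng is None:
            rng = np.random.default_rng()
        # Higher temperature flattens the distribution, lower sharpens it
        logits = np.log(scores + 1e-12) / temperature
        logits -= logits.max(axis=-1, keepdims=True)   # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=-1, keepdims=True)
        return np.array([rng.choice(len(p), p=p) for p in probs])
    raise ValueError('Unknown sampling_type: %r' % sampling_type)


scores = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]
print(sampling_sketch(scores))
print(sampling_sketch(scores, 'multinomial', temperature=2.0,
                      rng=np.random.default_rng(0)))
```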
-
keras_wrapper.utils.
simplifyDataset
(ds, id_classes, n_classes=50)¶ Parameters: - ds –
- id_classes –
- n_classes –
Returns:
-
keras_wrapper.utils.
to_categorical
(y, num_classes=None)¶ Converts a class vector (integers) to binary class matrix.
E.g. for use with categorical_crossentropy.
Parameters: - y – class vector to be converted into a matrix (integers from 0 to num_classes).
- num_classes – total number of classes.
Returns: A binary matrix representation of the input.