I know that similar threads exist, but I have not been able to find an appropriate answer.
I have seen solutions for the cases where the images are stored based on certain classes, but this is not what I am looking for. I would like to know how to properly split the image dataset into train, test and validation sets, where each image has a corresponding annotation (labeling) file.
The images and their respective annotations (Pascal VOC / YOLO Darknet formats) are available both in a shared directory and in separate directories, so either layout can be used, whichever is more convenient for splitting.
Thanks in advance.
I think this might be the fastest solution (the same approach can later be applied to obtain the validation set as well):
import torch

# 80% train, 20% test
train_size = int(0.8 * len(full_dataset))
test_size = len(full_dataset) - train_size
train_dataset, test_dataset = torch.utils.data.random_split(full_dataset, [train_size, test_size])
or let's say:
train, val, test = torch.utils.data.random_split(dataset, [1000, 100, 100])
given there are 1200 images in total (the lengths must sum to len(dataset)).
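If you would rather split the files themselves before building any dataset object, here is a minimal sketch using only the standard library. It assumes each image and its annotation share a file stem (for example img001.jpg and img001.xml). The helper names pair_files and split_pairs, the extension set, the 80/10/10 ratios and the seed are all hypothetical choices for illustration, not part of any library.

```python
import random
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png"}  # assumed image extensions


def pair_files(image_dir, label_dir, label_ext=".xml"):
    """Return sorted (image, annotation) path pairs that share a file stem."""
    pairs = []
    for img in sorted(Path(image_dir).iterdir()):
        if img.suffix.lower() not in IMAGE_EXTS:
            continue
        label = Path(label_dir) / (img.stem + label_ext)
        if label.exists():  # images without an annotation are skipped
            pairs.append((img, label))
    return pairs


def split_pairs(pairs, train=0.8, val=0.1, seed=42):
    """Shuffle the pairs and cut them into train/val/test lists."""
    pairs = list(pairs)  # copy so the caller's list is not reordered
    random.Random(seed).shuffle(pairs)  # fixed seed gives a repeatable split
    n_train = int(train * len(pairs))
    n_val = int(val * len(pairs))
    return (pairs[:n_train],
            pairs[n_train:n_train + n_val],
            pairs[n_train + n_val:])  # whatever is left becomes the test set
```

Something like train, val, test = split_pairs(pair_files("images", "annotations")) would then give three lists of path pairs, which can be copied into per-split folders with shutil.copy2. For YOLO labels, pass label_ext=".txt"; image_dir and label_dir may point to the same directory. Because each image travels together with its annotation as one tuple, the two can never end up in different splits.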