How to split the images and annotations into train, test and validation sets for an object detection task?

I know that similar threads exist, but I have not been able to find an appropriate answer.

I have seen solutions for the cases where the images are stored based on certain classes, but this is not what I am looking for. I would like to know how to properly split the image dataset into train, test and validation sets, where each image has a corresponding annotation (labeling) file.

The images and their respective annotations (Pascal VOC/YOLO Darknet formats) are available both in a single shared directory and in separate directories, so either layout can be used, whichever is more convenient for splitting.

Thanks in advance.

Edit:

I think this might be the fastest solution. The same call can later be applied again to obtain the validation set as well:

import torch

# 80/20 train/test split; full_dataset is the complete detection dataset
train_size = int(0.8 * len(full_dataset))
test_size = len(full_dataset) - train_size
train_dataset, test_dataset = torch.utils.data.random_split(full_dataset, [train_size, test_size])
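To get the validation set as well, the same call can be applied a second time to the training portion. This is only a minimal sketch: a `TensorDataset` of 1200 dummy items stands in for the real `full_dataset`, the 10% validation share is an assumed value, and the seeded generator is added just to make the split repeatable.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Stand-in for the real detection dataset (assumption: 1200 samples).
full_dataset = TensorDataset(torch.arange(1200))

# Fixed seed so the same split is produced on every run.
generator = torch.Generator().manual_seed(42)

# First split: 80% train+val, 20% test.
train_val_size = int(0.8 * len(full_dataset))
test_size = len(full_dataset) - train_val_size
train_val_dataset, test_dataset = random_split(
    full_dataset, [train_val_size, test_size], generator=generator
)

# Second split: carve the validation set out of the training portion.
val_size = int(0.1 * len(train_val_dataset))
train_size = len(train_val_dataset) - val_size
train_dataset, val_dataset = random_split(
    train_val_dataset, [train_size, val_size], generator=generator
)

print(len(train_dataset), len(val_dataset), len(test_dataset))
```

Because each result is a `Subset` of the original dataset, image and annotation stay together as long as the dataset's `__getitem__` returns both for a given index.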

or let's say:

train, val, test = torch.utils.data.random_split(dataset, [1000, 100, 100]) 

given there are 1200 total images.
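If the split has to exist on disk rather than inside a PyTorch dataset, one possible approach is to pair each image with its annotation by file stem and copy the pairs into separate folders. The sketch below uses only the standard library; the function name `split_pairs`, the `train`/`val`/`test` folder names, the 80/10/10 ratios and the `.jpg`/`.txt` extensions are all assumptions for illustration, not part of any library.

```python
import random
import shutil
from pathlib import Path


def split_pairs(image_dir, label_dir, out_dir, ratios=(0.8, 0.1, 0.1),
                image_ext=".jpg", label_ext=".txt", seed=42):
    """Copy image/annotation pairs into train/val/test folders under out_dir."""
    image_dir, label_dir, out_dir = Path(image_dir), Path(label_dir), Path(out_dir)

    # Keep only images that have an annotation file with the same stem.
    pairs = []
    for image_path in sorted(image_dir.glob(f"*{image_ext}")):
        label_path = label_dir / (image_path.stem + label_ext)
        if label_path.exists():
            pairs.append((image_path, label_path))

    # Shuffle once with a fixed seed so the split is repeatable.
    random.Random(seed).shuffle(pairs)

    n_train = int(ratios[0] * len(pairs))
    n_val = int(ratios[1] * len(pairs))
    splits = {
        "train": pairs[:n_train],
        "val": pairs[n_train:n_train + n_val],
        "test": pairs[n_train + n_val:],  # remainder goes to test
    }

    # Copy every pair so an image and its annotation land in the same split.
    for name, items in splits.items():
        target = out_dir / name
        target.mkdir(parents=True, exist_ok=True)
        for image_path, label_path in items:
            shutil.copy2(image_path, target / image_path.name)
            shutil.copy2(label_path, target / label_path.name)

    return {name: len(items) for name, items in splits.items()}
```

For Pascal VOC annotations, `label_ext=".xml"` would be passed instead, and when images and annotations share one directory the same path can be given for both `image_dir` and `label_dir`.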



Read more here: https://stackoverflow.com/questions/63139017/how-to-split-the-images-and-annotations-into-train-test-and-validation-sets-for

Content Attribution

This content was originally published by Karen at Recent Questions - Stack Overflow, and is syndicated here via their RSS feed. You can read the original post over there.
