How to create my own dataset #37

Open
shenwuyue2022 opened this issue May 31, 2024 · 4 comments

@shenwuyue2022

How can I create the text, mask, and sketch annotations for my own images?

@ziqihuangg
Owner

Hi, for our work, we use the multi-modal labels provided by datasets built on top of CelebA, for example CelebA-Dialog, CelebAMask-HQ, and Multi-Modal-CelebA-HQ. If you wish to extract new multi-modal labels, you can use off-the-shelf extractors. For example, CelebAMask-HQ provides a face parsing network.

@shenwuyue2022
Author

> Hi, for our work, we use the multi-modal labels provided by datasets that are built on top of CelebA, for example CelebA-Dialog, CelebAMask-HQ, and Multi-Modal-CelebA-HQ. If you wish to extract new multi-modal labels, you can use some off-the-shelf extractors. For example, there is a face parsing network provided by CelebAMask-HQ.

Hello! I have a question about the mask and sketch parts of the CelebA data used in this project. The masks in the dataset are provided as images, but the program expects them as .pt files, and the sketches are likewise converted into .pt files. Could you please share the code used for this conversion? Also, based on the description of the mask folder and the final .pt shape of [19, 1024], does 19 correspond to the 19 categories, and does 1024 correspond to the mask downsampled to 32×32 and flattened?

@ziqihuangg
Owner

ziqihuangg commented Jul 16, 2024

> Also, regarding the description of the folder under mask, combined with the final .pt shape being [19,1024], does 19 correspond to 19 categories, and does 1024 correspond to the downsampled 32*32?

Yes that's right.
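
To make the shape concrete, here is a minimal sketch (not the repository's own code) of how a 32×32 label map with 19 classes maps to a [19, 1024] tensor and back. The random label map, the variable names, and the row-major flattening order are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

num_classes = 19      # number of face-parsing categories
height = width = 32   # downsampled mask resolution

# Placeholder label map: each pixel holds a class index in [0, 18].
label_map = torch.randint(0, num_classes, (height, width))

# One-hot encode: [32, 32] -> [32, 32, 19] -> [1024, 19] -> [19, 1024].
# Flattening is row-major here (an assumption about the expected layout).
one_hot = F.one_hot(label_map, num_classes=num_classes)
one_hot = one_hot.reshape(height * width, num_classes).t().float()

print(one_hot.shape)

# Recover the label map: argmax over the class axis, then reshape to 32x32.
recovered = one_hot.argmax(dim=0).reshape(height, width)
```

Each of the 1024 columns holds a single 1 in the row of that pixel's class, which is why the tensor has 19 rows.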

@shenwuyue2022
Author

> > Also, regarding the description of the folder under mask, combined with the final .pt shape being [19,1024], does 19 correspond to 19 categories, and does 1024 correspond to the downsampled 32*32?
>
> Yes that's right.

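Hello, here is my understanding of the process for converting mask image files into .pt files. Could you please check whether there are any issues?

```python
import os

import numpy as np
import torch
from PIL import Image
from torchvision import transforms


def resize_and_convert_to_tensor(file_path, output_dir, num_classes=19):
    # Nearest-neighbour resizing keeps class indices intact (no blended labels).
    resize = transforms.Resize((32, 32), interpolation=transforms.InterpolationMode.NEAREST)

    # Assumes the mask is a single-channel image whose pixel values are class indices.
    image = Image.open(file_path).convert('L')
    downsampled_image = resize(image)

    # Read the raw integer labels. transforms.ToTensor() is not used because it
    # rescales pixels to [0, 1], so int(pixel_value) would be 0 for every class.
    downsampled_map = np.array(downsampled_image, dtype=np.int64).reshape(-1)

    one_hot_tensor = torch.zeros((num_classes, 1024), dtype=torch.float32)

    # Set a 1 at (class, position) for every pixel with a valid class index.
    for idx, pixel_value in enumerate(downsampled_map):
        class_index = int(pixel_value)
        if class_index < num_classes:
            one_hot_tensor[class_index, idx] = 1

    base_name = os.path.splitext(os.path.basename(file_path))[0]
    tensor_file_path = os.path.join(output_dir, f"{base_name}.pt")
    torch.save(one_hot_tensor, tensor_file_path)

    return tensor_file_path
```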
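
One way to sanity-check a converted file is to load it and confirm that it has the expected shape and that every position carries exactly one class. The sketch below is only an illustration under the [19, 1024] layout confirmed above; `check_mask_tensor` is a hypothetical helper, not part of the repository, and the demo mask is synthetic.

```python
import os
import tempfile

import torch


def check_mask_tensor(tensor_file_path, num_classes=19, num_positions=1024):
    """Load a saved mask, validate it, and return (label_map, present_classes)."""
    one_hot = torch.load(tensor_file_path)
    assert one_hot.shape == (num_classes, num_positions), one_hot.shape
    # Every spatial position should have exactly one active class.
    assert torch.all(one_hot.sum(dim=0) == 1), "a position has zero or several classes"
    side = int(num_positions ** 0.5)
    label_map = one_hot.argmax(dim=0).reshape(side, side)
    # Classes that occur at least once; useful for spotting degenerate masks.
    present_classes = torch.unique(label_map).tolist()
    return label_map, present_classes


# Self-contained demo with a synthetic mask in which every pixel is class 0.
demo = torch.zeros((19, 1024), dtype=torch.float32)
demo[0, :] = 1
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.pt")
    torch.save(demo, demo_path)
    label_map, present_classes = check_mask_tensor(demo_path)

print(label_map.shape, present_classes)
```

If every real file reports only class 0, the pixel values were probably rescaled or truncated before indexing. A failed one-class-per-position assertion points to pixel values outside the 0 to 18 range.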
