Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VideoSwin Backbone #1808

Open
wants to merge 8 commits into
base: keras-hub
Choose a base branch
from
Open

Conversation

ushareng
Copy link
Collaborator

@ushareng ushareng commented Sep 4, 2024

No description provided.

divyashreepathihalli and others added 8 commits August 12, 2024 17:17
* Agg Vgg16 backbone

* update names

* update tests

* update test

* add image classifier

* incorporate review comments

* Update test case

* update backbone test

* add image classifier

* classifier cleanup

* code reformat

* add vgg16 image classifier

* make vgg generic

* update doc string

* update docstring

* add classifier test

* update tests

* update docstring

* address review comments

* code reformat

* update the configs

* address review comments

* fix task saved model test

* update init

* code reformatted
* Add ResNetV1 and ResNetV2

* Address comments
* Add CSP DarkNet

* Add CSP DarkNet

* snake_case function names

* change use_depthwise to block_type
…Backbone` (keras-team#1769)

* Add FeaturePyramidBackbone and update ResNetBackbone

* Simplify the implementation

* Fix CI

* Make ResNetBackbone compatible with timm and add FeaturePyramidBackbone

* Add conversion implementation

* Update docstrings

* Address comments
* Add DenseNet

* fix testcase

* address comments

* nit

* fix lint errors

* move description
@ushareng
Copy link
Collaborator Author

ushareng commented Sep 4, 2024

The input variable names are not fixed yet. For image classifier backbones image_shape fits right, what should we keep it here video_shape or you want to go with generic one

@ushareng
Copy link
Collaborator Author

ushareng commented Sep 4, 2024

Also, in the classifier I have used imageclassifier, which needs to be updated to video classifier
Shall I put that change in this PR or how?

@mattdangerw
Copy link
Member

We can ignore nightly failure, but please for ./shell/format.sh and ./shell/api_gen.sh to fix the linter errors.

@mattdangerw
Copy link
Member

Also, in the classifier I have used imageclassifier, which needs to be updated to video classifier
Shall I put that change in this PR or how?

Any way is fine, but it might be easiest to start with the backbone, and just add the task as a follow up.

Can you include some colab showing that we have correct numerics? E.g. load some reference implementation, assign the weights over, show our implementation matches?

Copy link
Member

@mattdangerw mattdangerw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some initial comments. Make sure to take time to check for the style of things in the repo and do some rewriting of the source code as we bring it over. We want to make sure this is a proper port that leaves us with a uniform UX for all models we have checked in.

include_rescaling=False,
image_shape=(32, 224, 224, 3),
embed_dim=96,
patch_size=[2, 4, 4],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably call the arguments that vary per layer either layerwise_patch_size or stackwise_patch_size, etc, for consistency with other models.

def __init__(
self,
include_rescaling=False,
image_shape=(32, 224, 224, 3),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this model is capable of taking in different numbers of frames or different image sizes, this should probably be (None, None, None, 3).


def __init__(
self,
include_rescaling=False,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove defaults that are specific to a given model size. patch_size, window_size, depths and num_heads should all have no default here. The defaults will be in the individual preset configs.

):

# === Functional Model ===

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove empty newline

image_shape (tuple[int], optional): The size of the input video in
`(depth, height, width, channel)` format.
Defaults to `(32, 224, 224, 3)`.
include_rescaling (bool, optional): Whether to rescale the inputs. If
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

format lines to 80 characters. This looks like we are going over.

self.embed_dim = embed_dim
self.norm_layer = norm_layer

def __compute_padding(self, dim, patch_size):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why no one underscore? Double underscore is usually reserved for python builtins

**kwargs,
):
super().__init__(**kwargs)
# variables
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

config? variables usually refers to trainable weights (these are not)

**kwargs,
):
super().__init__(**kwargs)
# variables
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, these aren'te variables

super().__init__(**kwargs)
self.output_dim = output_dim
self.hidden_dim = hidden_dim
self._activation_identifier = activation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this in a few places, we should not do it.

in init
self.activation = keras.activations.get(activation)

in get_config
"activation": keras.activations.serialize(self.activation),

patch_size=(4, 4, 4), embed_dim=96
)
config = patch_embedding_model.get_config()
assert isinstance(config, dict)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rewrite all of these tests to match the repo style. We do not use assert anywhere.

@mattdangerw mattdangerw force-pushed the keras-hub branch 3 times, most recently from 753047d to a5e5d8f Compare September 13, 2024 20:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants