
R code creation #467

Open
wants to merge 27 commits into
base: develop


PaulJonasJost
Collaborator

@PaulJonasJost PaulJonasJost commented Feb 14, 2025

This PR follows these principles:

  1. Every function stores the information it needs in a list (function_infos.R). This information is:
  • The function and its name (used to find its parameters)
  • Whether the function belongs in util or not
  • Any additional self-defined functions it uses
  • Input and output mapping, which describes how non-default values (mostly "data") are passed to the function and how the output is named for further use. These mappings link the functions together in the script
  2. Every plot (we can adjust this later for any general download) contains a list of function infos
  3. The original data is saved as CSV files for later upload. This also keeps the generated code consistent with how the user experiences the app
  4. Preprocessing is always added to the pipeline, as well as some constants
  5. The necessary libraries are detected automatically from the functions passed to the pipeline
  • Todo: Check whether this is enough or whether we also need to pass on the additional functions
  6. Only par_tmp is passed. In no test case did this result in a zip file larger than 1 MB (previously up to 500 MB)
  7. Because the functions are parsed directly, we now have several advantages:
  • Whenever we change something in a function, the change automatically carries over to the script
  • Tests are in place to check whether the code generation still works after breaking changes (e.g. renaming non-default input variables)
  • We can create par_tmp objects for different situations, run the code generation on them, and test whether the generated code runs through. This makes GitHub tests for the general pipeline possible, leaving only the UI to be tested separately
  8. It does require the following code guidelines for the future:
  • Any parameter used in a function is stored under the same name in par_tmp
  • Any function needs a clear return statement at the end
  • Any plotting needs to be defined as its own function (except for very basic functions, e.g. plot_enrichment_results_info)
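
To make the function-info idea more concrete, here is a minimal sketch of what one entry and the resulting script line could look like. It is not the code from this PR: `scale_data`, `build_call`, and all field names (`fun`, `name`, `is_util`, `additional_functions`, `input_mapping`, `output_name`) are hypothetical and chosen only for illustration. The only assumption taken from the description is that mapped inputs are passed by variable name and every other parameter is looked up under the same name in par_tmp.

```r
# Hypothetical sketch: every name here except par_tmp is an illustrative assumption.

# Stand-in analysis function with a clear return statement at the end.
scale_data <- function(data, center = TRUE) {
  scaled <- scale(data, center = center, scale = FALSE)
  return(scaled)
}

# One function-info entry with the fields listed in the description.
function_infos <- list(
  scale_data = list(
    fun = scale_data,
    name = "scale_data",
    is_util = FALSE,
    additional_functions = character(0),
    input_mapping = list(data = "raw_data"), # non-default input and its variable
    output_name = "scaled_data"              # name used by later steps
  )
)

# Build one line of the generated script from a function-info entry.
# Mapped arguments are passed by variable name; all other arguments are
# read from par_tmp under the same name as the function parameter.
build_call <- function(info) {
  arg_names <- names(formals(info$fun))
  arg_values <- vapply(arg_names, function(arg) {
    mapped <- info$input_mapping[[arg]]
    if (!is.null(mapped)) mapped else paste0("par_tmp$", arg)
  }, character(1))
  paste0(
    info$output_name, " <- ", info$name, "(",
    paste0(arg_names, " = ", arg_values, collapse = ", "), ")"
  )
}

script_line <- build_call(function_infos$scale_data)
cat(script_line, "\n")
# scaled_data <- scale_data(data = raw_data, center = par_tmp$center)

# The generated line runs once par_tmp and the mapped input exist.
par_tmp <- list(center = TRUE)
raw_data <- matrix(1:6, nrow = 3)
eval(parse(text = script_line))
```

Because the call is derived from `formals()`, a change to the function signature shows up in the generated line without touching the generator, which is the advantage the description points to.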

Any prettification of the code I would put in a separate issue. The important part for now is testing the general functionality.
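
As an illustration of the kind of check mentioned above (catching a renamed non-default input before code generation breaks), a test could look roughly like the following sketch. `check_function_info`, `normalize`, and the entry fields are hypothetical names and not taken from this PR. The check assumes only the stated guideline that every function parameter is either covered by the input mapping or stored under the same name in par_tmp.

```r
# Hypothetical sketch: function and field names are illustrative assumptions.

# Verify that a function-info entry still matches the function signature.
check_function_info <- function(info, par_tmp) {
  arg_names <- names(formals(info$fun))
  mapped <- names(info$input_mapping)
  # Mapping keys that no longer exist as parameters (e.g. after a rename).
  unknown_inputs <- setdiff(mapped, arg_names)
  # Parameters that have no value source in the mapping or in par_tmp.
  missing_params <- setdiff(arg_names, c(mapped, names(par_tmp)))
  list(
    ok = length(unknown_inputs) == 0 && length(missing_params) == 0,
    unknown_inputs = unknown_inputs,
    missing_params = missing_params
  )
}

# Stand-in analysis function and its entry.
normalize <- function(counts, method = "log") {
  result <- if (method == "log") log1p(counts) else counts
  return(result)
}
info <- list(fun = normalize, input_mapping = list(counts = "raw_data"))
par_tmp <- list(method = "log")

ok_result <- check_function_info(info, par_tmp)

# Simulate a breaking change: the non-default input is renamed.
normalize_renamed <- function(count_matrix, method = "log") {
  result <- if (method == "log") log1p(count_matrix) else count_matrix
  return(result)
}
info_broken <- info
info_broken$fun <- normalize_renamed
broken_result <- check_function_info(info_broken, par_tmp)

stopifnot(ok_result$ok)
stopifnot(!broken_result$ok)
```

A check like this needs no UI, so it could run as part of the GitHub tests for the general pipeline.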

  • Enrichment analysis with translation caused problems, as I did not manage to save the ensembl object to RDS. If @LeaSeep can fix that, it would work even better.
  • For now, batch correction is not included (this requires a small discussion).
  • Double plots (e.g. Volcano and preprocessing) are downloadable as a single script, which can then be copied and adjusted.
