Reduce memory use #140

Open
xuebingjie1990 opened this issue Apr 20, 2021 · 2 comments
Labels
enhancement New feature or request

Comments

@xuebingjie1990
Member

xuebingjie1990 commented Apr 20, 2021

Tissue specificity (calculation + plot), Neighbor distance, and GC content all require a lot of memory for large BED files (ENCFF303ZGP, at 3.7GB, is currently the largest file in the Bedbase database).

For testing, I submitted a separate Slurm job for each step in bedstat and used the sacct command to collect the run time and memory use for BED files of different sizes. The results are in the table below.

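| Step | BED file | Size | Time (hh:mm:ss) | Memory (K) |
|---|---|---|---|---|
| TSS | ENCFF950CZM | 2.1MB | 00:00:43 | 867192 |
| TSS | ENCFF610FVD | 41.9MB | 00:00:32 | 1198232 |
| TSS | ENCFF349STD | 744.1MB | 00:03:46 | 4207212 |
| TSS | ENCFF303ZGP | 3.7GB | 00:03:38 | 12977208 |
| chrom_bin | ENCFF950CZM | 2.1MB | 00:00:55 | 794532 |
| chrom_bin | ENCFF610FVD | 41.9MB | 00:00:50 | 1114972 |
| chrom_bin | ENCFF349STD | 744.1MB | 00:04:10 | 4008000 |
| chrom_bin | ENCFF303ZGP | 3.7GB | 00:03:38 | 11061980 |
| GC_content | ENCFF950CZM | 2.1MB | 00:01:22 | 883388 |
| GC_content | ENCFF610FVD | 41.9MB | 00:05:24 | 3893472 |
| GC_content | ENCFF349STD | 744.1MB | 01:34:21 | 44189576 |
| GC_content | ENCFF303ZGP | 3.7GB | 05:42:47 | 127170468 |
| partitions (+Expected partition, Cumulative partition) | ENCFF950CZM | 2.1MB | 00:00:46 | 861968 |
| partitions (+Expected partition, Cumulative partition) | ENCFF610FVD | 41.9MB | 00:00:38 | 1125128 |
| partitions (+Expected partition, Cumulative partition) | ENCFF349STD | 744.1MB | 00:03:56 | 9159448 |
| partitions (+Expected partition, Cumulative partition) | ENCFF303ZGP | 3.7GB | 00:16:29 | 54185920 |
| Qthist | ENCFF950CZM | 2.1MB | 00:00:42 | 880560 |
| Qthist | ENCFF610FVD | 41.9MB | 00:00:26 | - |
| Qthist | ENCFF349STD | 744.1MB | 00:00:39 | 1542740 |
| Qthist | ENCFF303ZGP | 3.7GB | 00:01:44 | 8402252 |
| Neighbor distance | ENCFF950CZM | 2.1MB | 00:00:46 | 867540 |
| Neighbor distance | ENCFF610FVD | 41.9MB | 00:03:32 | 3529856 |
| Neighbor distance | ENCFF349STD | 744.1MB | 01:27:17 | 44840124 |
| Neighbor distance | ENCFF303ZGP | 3.7GB | 04:52:45 | 140088948 |
| Tissue specificity | ENCFF950CZM | 2.1MB | 00:01:11 | 3229968 |
| Tissue specificity | ENCFF610FVD | 41.9MB | 00:02:45 | 10113384 |
| Tissue specificity | ENCFF349STD | 744.1MB | 00:23:21 | 134680844 |
| Tissue specificity | ENCFF303ZGP | 3.7GB | 00:33:05 | 148197020 |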
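
For anyone who wants to reproduce measurements like these, here is a minimal sketch of how the time and memory values could be pulled from sacct and converted to more readable units. It is not the script used for the table. It assumes the memory column corresponds to sacct's `MaxRSS` field and that `K` means KiB; the job ID `12345` and the sample output are made up for illustration.

```python
import subprocess

# Multipliers to convert a sacct memory suffix to KiB (assumes binary units).
_UNIT_TO_KIB = {"K": 1, "M": 1024, "G": 1024**2, "T": 1024**3}


def parse_elapsed(text):
    """Convert a Slurm elapsed string ([DD-][HH:]MM:SS) to seconds."""
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in text.split(":")]
    while len(parts) < 3:  # pad a missing hours field
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def parse_max_rss(text):
    """Convert a MaxRSS string such as '867192K' to KiB; empty means no data."""
    text = text.strip()
    if not text:
        return None
    suffix = text[-1].upper()
    if suffix in _UNIT_TO_KIB:
        return float(text[:-1]) * _UNIT_TO_KIB[suffix]
    return float(text) / 1024  # no suffix: treat the value as bytes


def parse_sacct(output):
    """Parse 'JobID|Elapsed|MaxRSS' lines as produced with --parsable2."""
    rows = []
    for line in output.strip().splitlines():
        job_id, elapsed, max_rss = line.split("|")
        rows.append({
            "job_id": job_id,
            "elapsed_s": parse_elapsed(elapsed),
            "max_rss_kib": parse_max_rss(max_rss),
        })
    return rows


def query_sacct(job_id):
    """Run sacct for one job (requires a Slurm cluster) and parse the result."""
    cmd = ["sacct", "-j", str(job_id), "--format=JobID,Elapsed,MaxRSS",
           "--noheader", "--parsable2"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_sacct(result.stdout)


def kib_to_gib(kib):
    """Convert KiB to GiB."""
    return kib / 1024**2


# Hypothetical sacct output; the allocation line has no MaxRSS of its own.
sample = "12345|00:03:38|\n12345.batch|00:03:38|12977208K\n"
rows = parse_sacct(sample)
```

Under the KiB assumption, `kib_to_gib(148197020)` puts the largest Tissue specificity run at about 141 GiB, which is well beyond what a typical compute node offers.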
@xuebingjie1990
Member Author

I double-checked with profvis: for the 41.9MB file, the memory use of plotOpenSignal(op) is 11841MB, which is close to the number I got from Rivanna.
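
To make "close" concrete, here is a quick sketch of the comparison. It assumes the Rivanna number in question is the Tissue specificity row for ENCFF610FVD (10113384 K), that `K` means KiB, and that the profvis figure is in MiB; none of these are confirmed above.

```python
def relative_difference(a, b):
    """Absolute difference between two values as a fraction of the larger one."""
    return abs(a - b) / max(a, b)


# Assumed inputs: the sacct value from the table and the profvis value.
sacct_kib = 10113384
profvis_mib = 11841

sacct_mib = sacct_kib / 1024  # KiB -> MiB
rel = relative_difference(profvis_mib, sacct_mib)

print(f"sacct: {sacct_mib:.0f} MiB, profvis: {profvis_mib} MiB, "
      f"relative difference: {rel:.0%}")
```

Under those assumptions the two measurements agree to within about 20%, so the plot step alone would account for most of the memory reported for that job.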

@nsheff
Member

nsheff commented Jan 8, 2022

What's the status of this issue to reduce memory use here?
