"obs_names" error in snapatac2.tl.add_cor_scores #240

Open
MubasherMohammed opened this issue Feb 15, 2024 · 7 comments

Comments

@MubasherMohammed

MubasherMohammed commented Feb 15, 2024

Hi, and thanks for developing this very useful tool!
When I tried to use snapatac2.tl.add_cor_scores(network, peak_mat=data), it gave me the error "'NoneType' object has no attribute 'obs_names'".
data: an AnnData object of scATAC data processed with snapatac2.
The previous step, building the network, works fine: it created nodes and edges.
Any idea how to work around this?
snapatac2: I installed it with pip install snapatac2[all].
Many thanks in advance.

@kaizhang
Owner

It will be helpful if you can provide me with a small example h5ad file and a few lines of code to reproduce this error. Thank you!

@MubasherMohammed
Author

MubasherMohammed commented Feb 18, 2024

Thanks for the reply!
I have shared a subset of my h5ad object, processed with snapatac2: https://drive.google.com/drive/folders/1-Wv6njVe8hXney4QpPuBivi7SzMUcN8D?usp=sharing
My code:
%%time
snap.tl.macs3(data, groupby='Astro_GEX_SNctrl')

%%time
peaks = snap.tl.merge_peaks(data.uns['macs3'], snap.genome.hg19)
peaks.head()

%%time
peak_mat = snap.pp.make_peak_matrix(data, use_rep=peaks['Peaks'])
peak_mat

%%time
marker_peaks = snap.tl.marker_regions(peak_mat, groupby='Astro_GEX_SNPD', pvalue=0.01)

motifs = snap.tl.motif_enrichment(
    motifs=snap.datasets.cis_bp(unique=True),
    regions=marker_peaks,
    genome_fasta=snap.genome.hg19,
)

all_regions = [
    'chr1:1618389-1618890',
    'chr1:1851469-1851970',
    'chr1:4139898-4140399',
    'chr1:4821118-4821619',
    'chr1:6732994-6733495',
    'chr1:840499-841000',
]

net = snap.tl.init_network_from_annotation(
    all_regions,
    anno_file=snap.genome.hg19,
    upstream=250000,
    downstream=250000,
    id_type='gene_name',
    coding_gene_only=True,
)
net = snap.tl.add_cor_scores(net, gene_mat=None, peak_mat=count, select=None, overwrite=False)

`---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[138], line 1
----> 1 net = snap.tl.add_cor_scores(net,gene_mat=None, peak_mat=count, select=None, overwrite=False)

File ~/miniconda3/envs/epi/lib/python3.9/site-packages/snapatac2/tools/_network.py:137, in add_cor_scores(network, gene_mat, peak_mat, select, overwrite)
134 from tqdm import tqdm
136 key = "cor_score"
--> 137 if list(peak_mat.obs_names) != list(gene_mat.obs_names):
138 raise NameError("gene matrix and peak matrix should have the same obs_names")
139 if select is not None:

AttributeError: 'DataFrame' object has no attribute 'obs_names'`

I can see that the network nodes and edges are created with genes, but the call fails with the 'obs_names' error.
In addition, running net = snap.tl.add_tf_binding(net, genome_fasta = snap.genome.hg19, motifs = motifs) fails with:
`2024-02-15 20:13:44 - INFO - Fetching 48 sequences ...
2024-02-15 20:13:44 - INFO - Searching for the binding sites of 10 motifs ...
0%| | 0/10 [00:00<?, ?it/s]

AttributeError Traceback (most recent call last)
Cell In[106], line 1
----> 1 net = snap.tl.add_tf_binding(net, genome_fasta = snap.genome.hg19, motifs = motifs)

File ~/miniconda3/envs/epi/lib/python3.9/site-packages/snapatac2/tools/_network.py:261, in add_tf_binding(network, motifs, genome_fasta, pvalue)
259 logging.info("Searching for the binding sites of {} motifs ...".format(len(motifs)))
260 for motif in tqdm(motifs):
--> 261 bound = motif.with_nucl_prob().exists(sequences, pvalue=pvalue)
262 if any(bound):
263 name = motif.id if motif.name is None else motif.name

AttributeError: 'str' object has no attribute 'with_nucl_prob'`

Many thanks for your help!

@kaizhang
Owner

You need to specify gene_mat as well in snap.tl.add_cor_scores. And the obs_names between gene_mat and peak_mat must be the same. I'll write a tutorial for this later.
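
A minimal sketch of that requirement, pending the tutorial. It mirrors the equality check visible in the traceback (`list(peak_mat.obs_names) != list(gene_mat.obs_names)`) using plain Python lists in place of the real `obs_names`. The helper names `shared_barcodes` and `check_inputs` and the toy barcodes are hypothetical and not part of snapatac2.

```python
def shared_barcodes(gene_obs_names, peak_obs_names):
    """Return barcodes present in both matrices, in gene-matrix order."""
    peak_set = set(peak_obs_names)
    return [bc for bc in gene_obs_names if bc in peak_set]


def check_inputs(gene_obs_names, peak_obs_names):
    """Mirror the equality check shown in the traceback.

    Pass None for a missing matrix; returns a short diagnostic string.
    """
    if gene_obs_names is None or peak_obs_names is None:
        return "missing matrix: both gene_mat and peak_mat are required"
    if list(gene_obs_names) == list(peak_obs_names):
        return "ok"
    n_shared = len(shared_barcodes(gene_obs_names, peak_obs_names))
    return f"mismatch: {n_shared} shared barcodes"


# Toy barcodes standing in for gene_mat.obs_names and peak_mat.obs_names
# (assumption: the real values come from the two AnnData objects).
gene_names = ["AAAC-1", "AAAG-1", "AACT-1"]
peak_names = ["AACT-1", "AAAC-1", "TTTG-1"]

print(check_inputs(None, peak_names))
print(check_inputs(gene_names, peak_names))
print(shared_barcodes(gene_names, peak_names))
```

If both matrices come from the same cells (for example, a multiome experiment), subsetting each AnnData to the shared barcodes in the same order, e.g. `gene_mat[shared]` and `peak_mat[shared]`, should make the two `obs_names` lists equal. This assumes in-memory AnnData objects and has not been checked against the reporter's data.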

@MubasherMohammed
Author

Thanks for the swift reply!
My issue is that my gene_mat does not have the same cell barcodes as peak_mat. Is there a workaround that lets me add scores from a GEX AnnData whose obs_names differ, or can I use only peak_mat to add the score?
Thanks again.

@kaizhang
Owner

The short answer is yes, if you have a sensible grouping that can be applied to both RNA and ATAC, e.g., grouping the cells by cell type. More details will be provided once I finish the tutorial.
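
Until the tutorial is available, here is a rough sketch of what such a grouping could look like. It is an assumption, not the maintainer's method: each modality is averaged within cell-type labels, so both end up with one row per group and identical row names. The function `mean_by_group`, the use of a plain mean, and the toy matrices are all hypothetical.

```python
def mean_by_group(rows, labels):
    """Average the rows of a cells-by-features matrix within each group label.

    Returns (group_names, group_means), with groups sorted by name so that two
    modalities aggregated on the same labels come out in the same order.
    """
    if len(rows) != len(labels):
        raise ValueError("need exactly one label per row")
    sums = {}
    counts = {}
    for row, label in zip(rows, labels):
        if label not in sums:
            sums[label] = [0.0] * len(row)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], row)]
        counts[label] += 1
    names = sorted(sums)
    means = [[s / counts[name] for s in sums[name]] for name in names]
    return names, means


# Toy data: 3 RNA cells x 2 genes and 4 ATAC cells x 2 peaks, with no shared
# barcodes but a shared cell-type annotation.
rna_rows = [[2.0, 0.0], [4.0, 2.0], [1.0, 1.0]]
rna_types = ["Astro", "Astro", "Neuron"]
atac_rows = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]]
atac_types = ["Neuron", "Neuron", "Astro", "Astro"]

rna_groups, rna_means = mean_by_group(rna_rows, rna_types)
atac_groups, atac_means = mean_by_group(atac_rows, atac_types)

print(rna_groups == atac_groups)  # both are ['Astro', 'Neuron']
print(rna_means)
print(atac_means)
```

The group names would then play the role of the shared obs_names if the aggregated matrices were wrapped back into AnnData objects. With only a handful of groups, any correlation computed across them would be noisy, so a finer grouping may be preferable.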

@MubasherMohammed
Author

Okay, thanks for the advice. Looking forward to the tutorial.

Regards

@TingTingShao

Will the tutorial be published soon? :D
