bug fix: string truncation in AtomArray annotations #755

davegrays · 2025-02-14T04:50:20Z

The structure.array() method truncates string-based annotations to an arbitrary length. For example, you can end up with:

atoms[1000] --> Atom(..., str_annot="111", ...)
atoms[1000].str_annot.dtype --> dtype('<U3')

array = structure.array(atoms)
array[1000] --> Atom(..., str_annot="1", ...)
array[1000].str_annot.dtype --> dtype('<U1')

The culprit is this line:
array.add_annotation(name, dtype=type(atoms[0]._annot[name]))
It uses the first atom in the list, which often has the shortest string, to determine the dtype of the whole array. This is particularly problematic when processing CIF columns that contain string representations of IDs. Instead, we need to find the maximum string length across the atom list to ensure no truncation occurs.
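
To illustrate the effect described above, here is a minimal sketch using plain NumPy. It is not the code from this PR or from biotite itself; the `values` list is a made-up stand-in for one string annotation collected from a list of atoms. It shows that assigning into a fixed-width Unicode array silently truncates longer strings, and that sizing the dtype from the longest value avoids this.

```python
import numpy as np

# Hypothetical annotation values (assumption: not taken from the PR).
values = ["1", "22", "111"]

# Size the array from the first value only: "<U1" holds a single character.
narrow = np.zeros(len(values), dtype=np.array(values[0]).dtype)
narrow[:] = values  # longer strings are silently truncated on assignment

# Size the array from the longest value instead.
max_len = max(len(v) for v in values)
wide = np.zeros(len(values), dtype=f"U{max_len}")
wide[:] = values  # every string fits, nothing is cut off

print(narrow.tolist())
print(wide.tolist())
```

Scanning all values costs one extra pass over the atom list, which is the trade-off for not losing characters.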

codspeed-hq · 2025-02-14T05:17:46Z

CodSpeed Performance Report

Merging #755 will not alter performance

Comparing davegrays:fix/atomarray-string-annotations (1effc13) with main (d7df095)

Summary

✅ 59 untouched benchmarks

padix-key

Thanks for the fix 👍!

padix-key · 2025-02-15T21:20:38Z

The currently failing tests are caused by a change on the RCSB side, which will be addressed in #759. Hence, I will merge this PR.

fix string lengths in AtomArray annotations

1effc13

davegrays force-pushed the fix/atomarray-string-annotations branch from c67b22b to 1effc13 on February 14, 2025 04:56

padix-key approved these changes Feb 14, 2025

padix-key merged commit 89f9510 into biotite-dev:main Feb 15, 2025
27 of 28 checks passed
