[CONSULT-469] add initial code for field parsing #377

gvaradarajan · 2025-01-08T23:44:01Z

First of ~4 PRs for the NMEA 2000 parsing library. Introduces a macro for defining enums that represent the various lookup fields present in messages, plus readers that parse specific data types from a byte slice. These readers will be used by the code generated with the proc macro appearing in the next PR.

Note: Going forward, to better visualize the code generated by the macros, I recommend using cargo expand (cargo install cargo-expand to install, cargo expand to run) in the micro-rdk-nmea directory. You can use it in this PR to see how the lookup macro expands, but it will be more important in the subsequent PR introducing the proc macros.

gvaradarajan · 2025-01-08T23:45:15Z

micro-rdk-nmea/Cargo.toml

+# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
+
+[dependencies]
+micro-rdk = { path = "../micro-rdk", features = ["native"] }


To my great annoyance, we've gone back to a state where micro-rdk cannot be built as a feature-less library. I'll make a ticket for this and link it here

EDIT: Ticket filed (https://viam.atlassian.net/browse/RSDK-9710)

acmorrow

Done with a first pass. I haven't taken too close a look at the implementation details of the bit bashing yet, as I'd like to fully understand how the types work together and I'm not quite there yet, but here are my notes so far.

acmorrow · 2025-01-10T18:35:37Z

Cargo.toml

@@ -9,6 +9,7 @@ members = [
  "micro-rdk-ffi",
  "examples/modular-drivers",
  "etc/ota-dev-server",
+  "micro-rdk-nmea",


Can we sort this list?

acmorrow · 2025-01-10T18:37:47Z

micro-rdk-nmea/Cargo.toml

+# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
+
+[dependencies]
+micro-rdk = { path = "../micro-rdk", features = ["native"] }


I think you can say micro-rdk = { workspace = true, features = ... } here; the top-level Cargo.toml imports micro-rdk.

acmorrow · 2025-01-10T18:39:59Z

micro-rdk-nmea/README.md

@@ -0,0 +1,5 @@
+# The micro-RDK NMEA 2000 Message Parser (WIP)
+
+This is a library that will eventually supply logic to parse NMEA 2000 messages from byte data into


Maybe just "This library supplies ..."

acmorrow · 2025-01-10T18:41:00Z

micro-rdk/src/common/app_client.rs

@@ -63,7 +63,6 @@ pub enum AppClientError {
    AppConfigHeaderDateMissingError,
    #[error(transparent)]
    AppGrpcClientError(#[from] GrpcClientError),
-    #[cfg(feature = "data")]


Intentional? I don't see any other references to this type in the review.

Discussed offline. I found this issue when I tried to build featureless micro-RDK: config monitor now uses this error, so building without the data feature fails.

acmorrow · 2025-01-10T18:43:02Z

micro-rdk-nmea/src/parse_helpers/enums.rs

@@ -0,0 +1,328 @@
+pub trait Lookup: Sized {
+    fn from_value(val: u32) -> Self;
+    fn to_string(&self) -> String;


For to_string, could/should we use the ToString trait?

For from_value is there another trait that could be used?

yes, I meant to make that change and forgot

I could implement From<u32>, is that desired?
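
For illustration, a minimal sketch of what the standard-trait version discussed here could look like: From<u32> replaces from_value, and Display (which provides to_string for free) replaces the custom to_string. ExampleLookup, its numeric values, and the fallback string are assumptions made up for this sketch, not the PR's generated code.

```rust
use std::fmt;

// Hypothetical lookup enum; the name and the fallback variant are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExampleLookup {
    Integrated,
    Surveyed,
    Galileo,
    CouldNotParse,
}

// Infallible conversion: unrecognized raw values map to the fallback variant.
impl From<u32> for ExampleLookup {
    fn from(val: u32) -> Self {
        match val {
            6 => Self::Integrated,
            7 => Self::Surveyed,
            8 => Self::Galileo,
            _ => Self::CouldNotParse,
        }
    }
}

// Implementing Display gives `to_string()` through the blanket ToString impl.
impl fmt::Display for ExampleLookup {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let s = match self {
            Self::Integrated => "integrated",
            Self::Surveyed => "surveyed",
            Self::Galileo => "Galileo",
            // Assumed string for the fallback; not taken from the PR.
            Self::CouldNotParse => "could not parse",
        };
        f.write_str(s)
    }
}
```

With this shape, callers can write ExampleLookup::from(raw) or raw.into(), and format the value with {} directly.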

acmorrow · 2025-01-10T18:45:25Z

micro-rdk-nmea/src/parse_helpers/enums.rs

+/// For generating a lookup data type found in an NMEA message. The first argument is the name of the
+/// enum type that will be generated. Each successive argument is a tuple with
+/// (raw number value, name of enum instance, string representation)
+macro_rules! lookup {


Same feeling on the name of this macro as for the trait.

acmorrow · 2025-01-10T18:55:35Z

micro-rdk-nmea/src/parse_helpers/parsers.rs

+                type FieldType = $t;
+
+                fn read_from_data(&self, data: &[u8], start_idx: usize) -> Result<(usize, Self::FieldType), NmeaParseError> {
+                    let type_size = std::mem::size_of::<$t>();


Does FieldType work here instead of $t, including in the usages below? I think it is easier to follow if the macro parameters are used in fewer places.

acmorrow · 2025-01-10T18:59:05Z

micro-rdk-nmea/src/parse_helpers/parsers.rs

+        &self,
+        data: &[u8],
+        start_idx: usize,
+    ) -> Result<(usize, Self::FieldType), NmeaParseError>;


Could this take self rather than &self?

Could it return a new FieldReader at the advanced position rather than returning the offset?

is there a reason we want it consumed?

I'm not convinced this is easy to accomplish because something that implements the FieldReader trait is unaware of the data type of the next field

acmorrow · 2025-01-10T19:40:23Z

micro-rdk-nmea/src/parse_helpers/parsers.rs

+    }
+}
+
+macro_rules! number_field {


Maybe generate_number_field_readers or similar.

acmorrow · 2025-01-10T20:14:35Z

micro-rdk-nmea/src/parse_helpers/parsers.rs

+        &self,
+        data: &[u8],
+        start_idx: usize,
+    ) -> Result<(usize, Self::FieldType), NmeaParseError> {


Same thoughts as above on signature here regarding self and the usize return.

npmenard

Done with first pass;
I don't like having the function signature be Result<(usize, Self::FieldType), ...>; it means we need logic to keep track of where we are in the stream of bytes, which I believe would be error prone.
Did you consider using a reader at the bit level? Perhaps https://docs.rs/bitreader/latest/bitreader/ could be an option.
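
To make the suggestion concrete, here is a rough sketch of a reader that owns its position, so callers never thread a usize offset through return values. It is hand-rolled rather than built on the bitreader crate, and BitCursor, read_bits, and NotEnoughData are hypothetical names. It also assumes bits are consumed least-significant-bit first within each byte, which may not match the wire format this PR targets.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct NotEnoughData;

// Hypothetical bit-level reader that tracks its own position.
pub struct BitCursor<'a> {
    data: &'a [u8],
    bit_position: usize,
}

impl<'a> BitCursor<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        Self { data, bit_position: 0 }
    }

    /// Reads up to 64 bits, LSB first (an assumption), and advances the cursor.
    /// On error the position is left unchanged.
    pub fn read_bits(&mut self, bits: usize) -> Result<u64, NotEnoughData> {
        let end = self.bit_position.checked_add(bits).ok_or(NotEnoughData)?;
        if bits > 64 || end > self.data.len() * 8 {
            return Err(NotEnoughData);
        }
        let mut value = 0u64;
        for i in 0..bits {
            let pos = self.bit_position + i;
            // Extract the single bit at absolute bit offset `pos`.
            let bit = (self.data[pos / 8] >> (pos % 8)) & 1;
            value |= (bit as u64) << i;
        }
        self.bit_position = end;
        Ok(value)
    }
}
```

Because read_bits takes &mut self, two reads can never start from the same position, and the bounds check lives in exactly one place.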

npmenard · 2025-01-13T15:52:00Z

micro-rdk-nmea/src/parse_helpers/parsers.rs

+        let mut end_idx = start_idx;
+        let field_reader: NumberField<T> = Default::default();
+        for i in 0..N {
+            let (bytes_read, next_elem) = field_reader.read_from_data(data, end_idx)?;


I think this will panic if N * size_of(T) > len(data); I can't see where a bounds check would occur.
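
A sketch of the kind of up-front check meant here, written as a standalone function rather than against the PR's ArrayField type. read_u16_array and NotEnoughData are hypothetical names, and the element type is fixed to u16 to keep the example short.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct NotEnoughData;

/// Reads N little-endian u16 values starting at `start_idx`.
/// Returns an error instead of panicking when the buffer is too short.
pub fn read_u16_array<const N: usize>(
    data: &[u8],
    start_idx: usize,
) -> Result<[u16; N], NotEnoughData> {
    // Total bytes needed, guarding against arithmetic overflow.
    let needed = N
        .checked_mul(std::mem::size_of::<u16>())
        .ok_or(NotEnoughData)?;
    let end = start_idx.checked_add(needed).ok_or(NotEnoughData)?;
    // `get` returns None for an out-of-range window, so no indexing can panic.
    let window = data.get(start_idx..end).ok_or(NotEnoughData)?;

    let mut out = [0u16; N];
    for (slot, chunk) in out.iter_mut().zip(window.chunks_exact(2)) {
        *slot = u16::from_le_bytes([chunk[0], chunk[1]]);
    }
    Ok(out)
}
```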

npmenard · 2025-01-13T15:54:10Z

micro-rdk-nmea/src/parse_helpers/parsers.rs

+    }
+}
+
+impl<T, const N: usize> FieldReader for ArrayField<T, N>


Having N as a parameter in the template will lead to unique codegen for each pair of (T, N). I am not sure N is extremely important here; can we get away with a reference to a slice? Otherwise we can leave it as is, I am just concerned about overall code size.

I don't believe that's true, I think it will only code gen for each actually referenced (T, N), but I can double-check

npmenard · 2025-01-13T16:08:01Z

micro-rdk-nmea/src/parse_helpers/enums.rs

+    (6, Integrated, "integrated"),
+    (7, Surveyed, "surveyed"),
+    (8, Galileo, "Galileo"),
+    CouldNotParse


CouldNotParse feels like it should be an error rather than an actual field. If we want to represent an unknown lookup field, then I would suggest naming it UnknownLookupField or similar.

npmenard · 2025-01-13T16:20:02Z

micro-rdk-nmea/src/parse_helpers/parsers.rs

+            }
+            x if x < 32 => {
+                let shift = 32 - x;
+                let raw_val = u32::from_le_bytes(data_slice.try_into()?);


I think this will not work for bit_size in [16, 23], where (self.bit_size / 8) == 2, or for bit_size == 32, where (self.bit_size / 8) == 4. The same applies to bit_size == 16, where (self.bit_size / 8) == 2.
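
One way to sidestep the slice-length mismatch, sketched under assumptions: copy whatever bytes are available into a zeroed 4-byte buffer before calling u32::from_le_bytes, then keep only the low bit_size bits. lookup_raw_value is a hypothetical helper, and masking the low bits (instead of the shift in the diff) is an assumption about where the meaningful bits sit.

```rust
/// Decodes the low `bit_size` bits of a little-endian value held in `data_slice`.
/// Returns None for widths over 32 bits or when the slice cannot hold `bit_size` bits.
pub fn lookup_raw_value(data_slice: &[u8], bit_size: usize) -> Option<u32> {
    if bit_size == 0
        || bit_size > 32
        || data_slice.len() > 4
        || data_slice.len() * 8 < bit_size
    {
        return None;
    }
    // Zero-pad to exactly 4 bytes so from_le_bytes always gets a [u8; 4].
    let mut buf = [0u8; 4];
    buf[..data_slice.len()].copy_from_slice(data_slice);
    let raw = u32::from_le_bytes(buf);

    // `1 << 32` would overflow, so the full-width case is handled separately.
    let mask = if bit_size == 32 {
        u32::MAX
    } else {
        (1u32 << bit_size) - 1
    };
    Some(raw & mask)
}
```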

npmenard · 2025-01-13T16:22:42Z

micro-rdk-nmea/src/parse_helpers/parsers.rs

+                let raw_val = u32::from_le_bytes(data_slice.try_into()?);
+                raw_val >> shift
+            }
+            _ => unreachable!("lookup field raw value cannot be more than 32 bits"),


should not instantiate LookupField if bit_size > 32

acmorrow

Took a pass through. It mostly makes sense, but I'll need to spend some more time on generate_number_field_readers to really get what is going on there. Some questions and suggestions as well.

acmorrow · 2025-01-16T22:00:54Z

micro-rdk-nmea/src/parse_helpers/enums.rs

+            $default
+        }
+
+        impl From<u32> for $name {


Should this be TryFrom? Then you wouldn't need the $default:ident or associated enum member and could just return an error.

I had this be the behavior because the gonmea library does not fail on an unrecognized lookup value
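
For comparison, a sketch of the TryFrom shape being asked about. It drops the fallback variant and reports the unrecognized raw value as an error, which is a behavior change relative to the gonmea-compatible approach described above. ExampleLookup and UnknownLookupValue are hypothetical names.

```rust
use std::convert::TryFrom;

// Hypothetical error carrying the raw value that had no matching variant.
#[derive(Debug, PartialEq, Eq)]
pub struct UnknownLookupValue(pub u32);

// Hypothetical lookup enum with no fallback variant.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExampleLookup {
    Integrated,
    Surveyed,
    Galileo,
}

impl TryFrom<u32> for ExampleLookup {
    type Error = UnknownLookupValue;

    fn try_from(val: u32) -> Result<Self, Self::Error> {
        match val {
            6 => Ok(Self::Integrated),
            7 => Ok(Self::Surveyed),
            8 => Ok(Self::Galileo),
            other => Err(UnknownLookupValue(other)),
        }
    }
}
```

A caller that wants the lenient behavior can still recover it at the call site, for example by mapping the error to a default, so the choice is mostly about where that policy lives.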

acmorrow · 2025-01-16T22:03:01Z

micro-rdk-nmea/src/parse_helpers/parsers.rs

+/// child-instances of `DataRead` or calling the `advance` function
+pub struct DataCursor<'a> {
+    data: &'a [u8],
+    bit_position: Rc<AtomicUsize>,


I don't understand why atomics are making an appearance.

I had a reason, but I have since forgotten it. I'll just remove it, it's not like the rest of the code is thread-safe anyway

Ok I've remembered again. Rust will not let me mutate the content of an Rc in the drop for DataRead because it doesn't consider it safe to mutate it when the reference count isn't 0. I have to get around this with the interior mutability of AtomicUsize
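
If the only requirement is mutating shared state from inside Drop, std::cell::Cell gives single-threaded interior mutability without atomics. The following is a simplified sketch of that alternative, not the PR's code: the structs are trimmed down, position is a hypothetical accessor, and read returns Option instead of the PR's error type.

```rust
use std::cell::Cell;
use std::rc::Rc;

pub struct DataCursor<'a> {
    data: &'a [u8],
    bit_position: Rc<Cell<usize>>,
}

// Trimmed-down stand-in for the PR's DataRead: it only remembers how far to advance.
pub struct DataRead {
    bit_size: usize,
    cursor_position: Rc<Cell<usize>>,
}

impl<'a> DataCursor<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        Self { data, bit_position: Rc::new(Cell::new(0)) }
    }

    pub fn position(&self) -> usize {
        self.bit_position.get()
    }

    pub fn read(&self, bits: usize) -> Option<DataRead> {
        let end = self.bit_position.get().checked_add(bits)?;
        if end > self.data.len() * 8 {
            return None;
        }
        Some(DataRead {
            bit_size: bits,
            cursor_position: Rc::clone(&self.bit_position),
        })
    }
}

impl Drop for DataRead {
    fn drop(&mut self) {
        // Cell::set works through a shared reference, even while other Rc clones exist.
        let current = self.cursor_position.get();
        self.cursor_position.set(current + self.bit_size);
    }
}
```

This only addresses the atomics question; it has the same hazard as the original if two DataRead values are alive at once.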

acmorrow · 2025-01-16T22:06:39Z

micro-rdk-nmea/src/parse_helpers/parsers.rs

+        }
+    }
+
+    fn read(&self, bits: usize) -> Result<DataRead, NmeaParseError> {


If you call read twice without dropping the DataRead object that resulted from the first read before the second call to read, what happens?

It will be very bad. They could both read from the same starting place but potentially different ending places. Even if they do read to the same end position, when they're both dropped the bit position will advance twice as far as it should. Should I protect against this?

Ideally, you would leverage the type system to make such a situation impossible. Just as a for instance, you could do something similar to how ScopeGuard works, where calling read on a DataCursor consumes the DataCursor and hands you a DataRead object that holds an inner DataCursor, which you must recover via into_inner or similar, which in turn consumes the DataRead and restores the DataCursor. That way, you go back and forth between them, but you can't end up with two DataRead objects obtained on the same DataCursor object. That might also let you get rid of the Rc? I'll bet @npmenard would have some even better ideas about how to structure it. This is just me thinking aloud.

I think read should consume data and update the Cursor in place; if we need to read part of a byte, then a peek function would be more suitable.
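
A rough sketch of the consume-and-restore idea floated in this thread, under the assumption that read takes the cursor by value and into_inner hands it back at the advanced position. Field names, bit_range, position, and the choice to return the cursor in the Err case are all hypothetical.

```rust
#[derive(Debug)]
pub struct DataCursor<'a> {
    data: &'a [u8],
    bit_position: usize,
}

#[derive(Debug)]
pub struct DataRead<'a> {
    cursor: DataCursor<'a>,
    start: usize,
    bit_size: usize,
}

impl<'a> DataCursor<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        Self { data, bit_position: 0 }
    }

    pub fn position(&self) -> usize {
        self.bit_position
    }

    /// Consumes the cursor. On failure the untouched cursor is returned in Err.
    pub fn read(self, bits: usize) -> Result<DataRead<'a>, DataCursor<'a>> {
        match self.bit_position.checked_add(bits) {
            Some(end) if end <= self.data.len() * 8 => {
                let start = self.bit_position;
                Ok(DataRead { cursor: self, start, bit_size: bits })
            }
            _ => Err(self),
        }
    }
}

impl<'a> DataRead<'a> {
    /// Half-open bit range covered by this read.
    pub fn bit_range(&self) -> (usize, usize) {
        (self.start, self.start + self.bit_size)
    }

    /// Consumes the read and returns the cursor advanced past it.
    pub fn into_inner(mut self) -> DataCursor<'a> {
        self.cursor.bit_position = self.start + self.bit_size;
        self.cursor
    }
}
```

Since the cursor is moved into the DataRead, a second read on the same cursor is a compile error until into_inner gives it back, and neither Rc nor a Drop impl is needed.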

acmorrow · 2025-01-16T22:08:48Z

micro-rdk-nmea/src/parse_helpers/parsers.rs

+        }
+    }
+
+    pub fn advance(&self, bits: usize) -> Result<(), NmeaParseError> {


Could advance be implemented by calling read and throwing away the result?

It could, is that preferred?

I think so, because then advance is very very simple, and the complexity lives only in read.

acmorrow · 2025-01-16T22:14:44Z

micro-rdk-nmea/src/parse_helpers/parsers.rs

+        let enum_value = match self.bit_size {
+            x if x <= 8 => Ok::<u32, NmeaParseError>(u8::try_from(data_read)? as u32),
+            x if x <= 16 => Ok::<u32, NmeaParseError>(u16::try_from(data_read)? as u32),
+            x if x <= 32 => Ok::<u32, NmeaParseError>(u16::try_from(data_read)? as u32),


Should this be u32::try_from?

You're right that that should be equivalent. It wasn't, which revealed a bug in number parsing. Adding another number field test and changing this to u32::try_from

npmenard

Done with first pass. I have some questions about the bit manipulations, as I don't fully understand what's going on.

npmenard · 2025-01-17T16:49:10Z

micro-rdk-nmea/src/parse_helpers/parsers.rs

+            impl TryFrom<DataRead<'_>> for $t {
+                type Error = NumberFieldError;
+                fn try_from(value: DataRead) -> Result<Self, Self::Error> {
+                    let max_size = std::mem::size_of::<Self>();


We need to add a check, because the function will panic if max_size * 8 < value.bit_size.
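
Something along these lines, as a sketch only: reject the conversion before touching the bytes when the field is wider than the target type. u16_from_bits is a hypothetical free function standing in for the macro-generated try_from, and the NumberFieldError variants shown are assumptions.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum NumberFieldError {
    // Hypothetical variants; the PR's actual error enum may differ.
    FieldTooWide { bit_size: usize, max_bits: usize },
    NotEnoughData,
}

/// Converts the low `bit_size` bits of little-endian `data` into a u16.
pub fn u16_from_bits(data: &[u8], bit_size: usize) -> Result<u16, NumberFieldError> {
    let max_bits = std::mem::size_of::<u16>() * 8;
    // The check requested in review: fail early instead of panicking later.
    if bit_size > max_bits {
        return Err(NumberFieldError::FieldTooWide { bit_size, max_bits });
    }
    if bit_size > data.len() * 8 {
        return Err(NumberFieldError::NotEnoughData);
    }
    // Copy only the bytes that carry field bits into a zeroed buffer.
    let byte_len = (bit_size + 7) / 8;
    let mut buf = [0u8; 2];
    buf[..byte_len].copy_from_slice(&data[..byte_len]);
    let raw = u16::from_le_bytes(buf);

    let mask = if bit_size == max_bits {
        u16::MAX
    } else {
        (1u16 << bit_size) - 1
    };
    Ok(raw & mask)
}
```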

npmenard · 2025-01-17T16:53:48Z

micro-rdk-nmea/src/parse_helpers/errors.rs

+    #[error(transparent)]
+    TryFromSliceError(#[from] std::array::TryFromSliceError),
+    #[error("end of buffer exceeded")]
+    EndOfBufferExceeded,


nit: NotEnoughData?

npmenard · 2025-01-17T17:38:10Z

micro-rdk-nmea/src/parse_helpers/parsers.rs

+        }
+    }
+
+    fn read(&self, bits: usize) -> Result<DataRead, NmeaParseError> {


I think read should consume data and update the Cursor in place; if we need to read part of a byte, then a peek function would be more suitable.

npmenard · 2025-01-17T18:33:56Z

micro-rdk-nmea/src/parse_helpers/parsers.rs

+                        // incomplete byte so it can be padded with zeros, then we reverse the bits
+                        // of the now complete last byte for the proper bit order
+                        let last_bit_start = bit_vec.len() - (value.bit_size % 8);
+                        let _ = &bit_vec[last_bit_start..].reverse();


What is the purpose of this operation? For example, if I want to read 12 bits of a 16-bit number, then last_bit_start would be 8, so the reverse operation will affect bits [8..12]. I think this ends up doing something like shifting the leftmost bits right, overwriting on the way.
Also, different values of bit_size will change the number of bits reversed; can that affect reading numbers?

That is what I'm trying to do, yes. The reversing is just to be able to pad the bits with 0 from the front. So, if the overflow bits are 1110, I want to make it so the last byte is 00001110. To do that I reverse just the last 4 bits in place (0111), add the padding (01110000), then reverse the last byte to restore the bit order (00001110)

Are you suggesting I could just right shift the last byte by however many bits are remaining?
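
As a sketch of the shift-based alternative: the reverse, pad, reverse sequence described above amounts to right-aligning the leftover bits with zero padding in front, which a shift or a mask does directly. Which of the two applies depends on whether the leftover bits sit at the high or the low end of the last byte, and that is an assumption here. Both helper names are hypothetical.

```rust
/// Right-aligns the `bits` most-significant bits of `byte`, zero-padding the front.
/// Example: the overflow bits 1110 stored as 1110_0000 become 0000_1110.
pub fn align_high_bits(byte: u8, bits: u32) -> u8 {
    match bits {
        0 => 0,
        1..=7 => byte >> (8 - bits),
        _ => byte, // 8 or more: the whole byte is kept
    }
}

/// Keeps the `bits` least-significant bits of `byte` and zeroes the rest.
pub fn keep_low_bits(byte: u8, bits: u32) -> u8 {
    match bits {
        0 => 0,
        1..=7 => byte & ((1u8 << bits) - 1),
        _ => byte,
    }
}
```

Either helper replaces the in-place reversal with a single operation and needs no bit vector.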

npmenard · 2025-01-17T18:34:12Z

micro-rdk-nmea/src/parse_helpers/parsers.rs

+}
+
+struct DataRead<'a> {
+    data: &'a [u8],


should own the buffer

[CONSULT-469] add initial code for field parsing

740695e

gvaradarajan requested a review from a team as a code owner January 8, 2025 23:44

gvaradarajan commented Jan 8, 2025

acmorrow requested changes Jan 10, 2025

gvaradarajan added 3 commits January 10, 2025 15:18

bug fix

05c5886

address some review comments

f8856c2

name change

e10fb75

gvaradarajan requested review from acmorrow and npmenard January 13, 2025 14:45

npmenard requested changes Jan 13, 2025

gvaradarajan added 2 commits January 15, 2025 15:39

cursor based redesign

30edd34

handle some edge cases

f940523

gvaradarajan requested a review from npmenard January 15, 2025 21:00

gvaradarajan added 3 commits January 15, 2025 16:05

move dependency

feb9871

oops

65c6b9a

ok now the tests pass again

d443cea

acmorrow reviewed Jan 16, 2025

bugfix

189157e

gvaradarajan requested a review from acmorrow January 17, 2025 16:58

npmenard reviewed Jan 17, 2025

use shifting and remove bitvec dependency

84ad67e
