-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CONSULT-469] add initial code for field parsing #377
base: main
Are you sure you want to change the base?
Conversation
micro-rdk-nmea/Cargo.toml
Outdated
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html | ||
|
||
[dependencies] | ||
micro-rdk = { path = "../micro-rdk", features = ["native"] } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To my great annoyance, we've gone back to a state where micro-rdk cannot be built as a feature-less library. I'll make a ticket for this and link it here
EDIT: Ticket filed (https://viam.atlassian.net/browse/RSDK-9710)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done with a first pass. I haven't taken too close a look at the implementation details of the bit bashing yet, as I'd like to fully understand how the types work together and I'm not quite there yet, but here are my notes so far.
Cargo.toml
Outdated
@@ -9,6 +9,7 @@ members = [ | |||
"micro-rdk-ffi", | |||
"examples/modular-drivers", | |||
"etc/ota-dev-server", | |||
"micro-rdk-nmea", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we sort this list?
micro-rdk-nmea/Cargo.toml
Outdated
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html | ||
|
||
[dependencies] | ||
micro-rdk = { path = "../micro-rdk", features = ["native"] } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can say micro-rdk = { workspace = true, features= ...
here - the top level Cargo.toml imports micro-rdk.
micro-rdk-nmea/README.md
Outdated
@@ -0,0 +1,5 @@ | |||
# The micro-RDK NMEA 2000 Message Parser (WIP) | |||
|
|||
This is a library that will eventually supply logic to parse NMEA 2000 messages from byte data into |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just "This libraries supplies ..."
@@ -63,7 +63,6 @@ pub enum AppClientError { | |||
AppConfigHeaderDateMissingError, | |||
#[error(transparent)] | |||
AppGrpcClientError(#[from] GrpcClientError), | |||
#[cfg(feature = "data")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Intentional? I don't see any other references to this type in the review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline, I found this issue when I tried to build featureless micro-RDK. The issue is that config monitor now uses this error, so trying to build without the data feature fails
@@ -0,0 +1,328 @@ | |||
pub trait Lookup: Sized { | |||
fn from_value(val: u32) -> Self; | |||
fn to_string(&self) -> String; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- For
to_string
, could/should we use theToString
trait? - For
from_value
is there another trait that could be used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- yes, I meant to make that change and forgot
- I could implement
From<u32>
, is that desired?
/// For generating a lookup data type found in an NMEA message. The first argument is the name of the | ||
/// enum type that will be generated. Each successive argument is a tuple with | ||
/// (raw number value, name of enum instance, string representation) | ||
macro_rules! lookup { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same feeling on the name of this macro as for the trait.
type FieldType = $t; | ||
|
||
fn read_from_data(&self, data: &[u8], start_idx: usize) -> Result<(usize, Self::FieldType), NmeaParseError> { | ||
let type_size = std::mem::size_of::<$t>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does FieldType
here work instead of $t? In the below usages? I just think it is easier if there are fewer places the macro parameters are used.
&self, | ||
data: &[u8], | ||
start_idx: usize, | ||
) -> Result<(usize, Self::FieldType), NmeaParseError>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Could this take
self
rather than&self
? - Could it return a new FieldReader at the advanced position rather than returning the offset?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- is there a reason we want it consumed?
- I'm not convinced this is easy to accomplish because something that implements the FieldReader trait is unaware of the data type of the next field
} | ||
} | ||
|
||
macro_rules! number_field { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe generate_number_field_readers
or similar.
&self, | ||
data: &[u8], | ||
start_idx: usize, | ||
) -> Result<(usize, Self::FieldType), NmeaParseError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same thoughts as above on signature here regarding self
and the usize
return.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done with first pass;
I don't like having function signature be like Result<(usize, Self::FieldType),...>
it means wee need logic to keep track where we are in the stream of bytes which i believe would be error prone.
Did you consider using a Reader at the bit level? perhaps https://docs.rs/bitreader/latest/bitreader/ can be an option
let mut end_idx = start_idx; | ||
let field_reader: NumberField<T> = Default::default(); | ||
for i in 0..N { | ||
let (bytes_read, next_elem) = field_reader.read_from_data(data, end_idx)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this will panic if N*sizeonf(T)>len(data)
i can't see where a bound check would occur
} | ||
} | ||
|
||
impl<T, const N: usize> FieldReader for ArrayField<T, N> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
having N as parameter in the template will lead to unique codegen for each pair of (T,N) I am not sure if the N is extremely important here can we get away with a reference slice? Otherwise we can leave it as is I am just concerned about overall code size
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe that's true, I think it will only code gen for each actually referenced (T, N), but I can double-check
(6, Integrated, "integrated"), | ||
(7, Surveyed, "surveyed"), | ||
(8, Galileo, "Galileo"), | ||
CouldNotParse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CouldNotParse
feels like it should be an error vs an actual field. If we want to represent an Unknown LookupField then I would suggest naming it UnknowLookupField
or alike
} | ||
x if x < 32 => { | ||
let shift = 32 - x; | ||
let raw_val = u32::from_le_bytes(data_slice.try_into()?); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this will not work for bit_size in [16;23]
as (self.bit_size / 8) == 2 and bit_size ==32
as (self.bit_size / 8) == 4. And for bit_size ==16` as (self.bit_size / 8) == 2
let raw_val = u32::from_le_bytes(data_slice.try_into()?); | ||
raw_val >> shift | ||
} | ||
_ => unreachable!("lookup field raw value cannot be more than 32 bits"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should not instantiate LookupField if bit_size > 32
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took a pass through. It mostly makes sense, but I'll need to spend some more time on generate_number_field_readers
to really get what is going on there. Some questions and suggestions as well.
$default | ||
} | ||
|
||
impl From<u32> for $name { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be TryFrom
? Then you wouldn't need the $default:ident
or associated enum member and could just return an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had this be the behavior because the gonmea library does not fail on an unrecognized lookup value
/// child-instances of `DataRead` or calling the `advance` function | ||
pub struct DataCursor<'a> { | ||
data: &'a [u8], | ||
bit_position: Rc<AtomicUsize>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why atomics are making an appearance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a reason, but I have since forgotten it. I'll just remove it, it's not like the rest of the code is thread-safe anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I've remembered again. Rust will not let me mutate the content of an Rc
in the drop for DataRead
because it doesn't consider it safe to mutate it when the reference count isn't 0. I have to get around this with the interior mutability of AtomicUsize
} | ||
} | ||
|
||
fn read(&self, bits: usize) -> Result<DataRead, NmeaParseError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you call read
twice without dropping the DataRead
object that resulted from the first read
before the second call to read
, what happens?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be very bad. They could both read from the same starting place but potentially different ending places. Even if they do read to the same end position, when they're both dropped the bit position will advance twice as far as it should. Should I protect against this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, you would leverage the type system to make such a situation impossible. Just as a for instance, you could do something similar to how ScopeGuard works, where calling read
on a DataCursor
consumes the DataCursor
and hands you a DataRead
object that holds an inner DataCursor
, which you must recover via into_inner
or similar, which in turn consumes the DataRead
and restores the DataCursor
. That way, you go back and forth between them, but you can't end up with two DataRead
objects obtained on the same DataCursor
object. That might also let you get rid of the Rc
? I'll bet @npmenard would have some even better ideas about how to structure it. This is just me thinking aloud.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think read
should consume data and update the Cursor in place if we need to read part of a byte then a peek
function would be more suitable.
} | ||
} | ||
|
||
pub fn advance(&self, bits: usize) -> Result<(), NmeaParseError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could advance
be implemented by calling read
and throwing away the result?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could, is that preferred?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so, because then advance
is very very simple, and the complexity lives only in read
.
let enum_value = match self.bit_size { | ||
x if x <= 8 => Ok::<u32, NmeaParseError>(u8::try_from(data_read)? as u32), | ||
x if x <= 16 => Ok::<u32, NmeaParseError>(u16::try_from(data_read)? as u32), | ||
x if x <= 32 => Ok::<u32, NmeaParseError>(u16::try_from(data_read)? as u32), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be u32::try_from
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right that that should be equivalent. It wasn't, which revealed a bug in number parsing. Adding another number field test and changing this to u32::try_from
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done with first pass, i have some questions about the bit manipulations as I don't understand fully what's going on.
impl TryFrom<DataRead<'_>> for $t { | ||
type Error = NumberFieldError; | ||
fn try_from(value: DataRead) -> Result<Self, Self::Error> { | ||
let max_size = std::mem::size_of::<Self>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need to add a check because if max_size*8 < value.bit_size the function will panic
#[error(transparent)] | ||
TryFromSliceError(#[from] std::array::TryFromSliceError), | ||
#[error("end of buffer exceeded")] | ||
EndOfBufferExceeded, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: NotEnoughData?
} | ||
} | ||
|
||
fn read(&self, bits: usize) -> Result<DataRead, NmeaParseError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think read
should consume data and update the Cursor in place if we need to read part of a byte then a peek
function would be more suitable.
// incomplete byte so it can be padded with zeros, then we reverse the bits | ||
// of the now complete last byte for the proper bit order | ||
let last_bit_start = bit_vec.len() - (value.bit_size % 8); | ||
let _ = &bit_vec[last_bit_start..].reverse(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the purpose of this operation? for example if I want to read 12bit on a 16bit number then last_bit_start would be 8 so the reverse operation will affect bits [8..12]. I think this ends up doing something like shift the left most bits right overwriting on the way.
Also different value of bit_size will change the amount of bits reversed can that affect reading numbers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is what I'm trying to do, yes. The reversing is just to be able to pad the bits with 0 from the front. So, if the overflow bits are 1110, I want to make it so the last byte is 00001110. To do that I reverse just the last 4 bits in place (0111), add the padding (01110000), then reverse the last byte to restore the bit order (00001110)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are suggesting I could just right shift the last byte by however many bits are remaining?
} | ||
|
||
struct DataRead<'a> { | ||
data: &'a [u8], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should own the buffer
First of ~4 PRs for the NMEA 2000 parsing library. Introduces a macro for defining enums representing the various lookup fields present in messages and readers that parse specific data types from a byte slice. These readers will be used by the code generated with the proc macro appearing in the next PR.
Note: Going forward, to better visualize the code generated by the macros, I recommend using cargo expand (
cargo install cargo-expand
to install,cargo expand
to run) in the micro-rdk-nmea directory. You can use it in this PR to see how thelookup
macro expands but it will be more important in the subsequent PR introducing the proc macros