-
Notifications
You must be signed in to change notification settings - Fork 380
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prover SDK tracking issue #148
Comments
|
I'm amenable to really taking the time to think through the api design in a more comprehensive way than the #146 approach, which is really an attempt to get programmatic usage "out the door" in an at least somewhat constrained and opinionated way. I do think it makes sense though to have an independent public facing API from the prover API that is also consumed internally though. |
This is relevant because we already have Nova(seq), Nova(pcd), Nova(pcd+spartan), HyperNova soon, maybe Jolt which is going to be different. |
Here's a rough early sketch of a proposal for the overall SDK architecture. In terms of code, the idea will be to have an API that is consumed by both the SDK and CLI, providing two routes to similar functionality (albeit with the CLI being a simplified, streamlined version). ObjectsThe central object of the SDK/CLI will be an engine. Each engine will correspond to an ISA + arithmetization method, which takes in a complied program using the Each engine will be configurable with respect to its memory model and prover backend -- as checked by a compatibility layer -- and will also expose a set of properties indicating whether it supports key features (e.g., a private input tape, 32-bit vs. 64-bit architecture, various precompiles when they become available, etc.). Each engine will also be opinionated about its configuration, so that the vast majority of users should need to do no more than The prover backend will be the secondary object of the SDK/CLI. It itself will be configurable, e.g., with choice of whether to operate in sequential or PCD mode, as well as with its commitment scheme (pair) and curve (pair). Similarly, it will have compatibility checks and be opinionated about what defaults should be. Backends will also support a set of helpful property checks (e.g., whether they are recursive, Ethereum verifiable, support compression, etc.). InterfaceAs for the user API exposed by the SDK, the goal is to support operation chaining using the let vm = {NameOf}Engine::default()
.load('/path/to/binary/')?
.set_input<T>(input)
.execute()?
assert!(vm.started());
assert!(vm.halted());
assert!(!vm.proven());
check(vm.output)?; // some user function returning Result<(), Error> checking that the output looks right
vm.rewind(); // do not clear loaded binary or input tape, as compared to vm.reset()
assert!(!vm.started());
assert!(!vm.halted());
vm.enable_prover_mode();
vm.prover.enable_sequential(); // likely to be the default
assert!(vm.is_proving());
vm.generate()?
.execute()?; // alternatively, .prove()?
.verify()?
assert!(vm.started());
assert!(vm.halted());
assert!(vm.proven());
assert!(vm.verified());
if vm.prover.can_compress() {
vm.compress()?
assert!(vm.compressed());
}
vm.proof I/OFor I/O (into engines that support it), we will use postcard to support serializing/deserializing arbitrary structs that implement Serde::Serialize and Serde::Deserialize. Essentially, the API (and so SDK) will support two interfaces to each of the public and private tapes: -- Internal to the engines that support I/O, these are mirrored: -- The Compile-time ConfigurationThe last major piece of the early design is a compilation hook. This will be an (entirely optional) function that allows the user to trigger compilation + linkage of their program programmatically, through cargo. The main use case of this function will be to dynamically support emitting the linker, thereby enabling setting the memory limit through the SDK (per #152). The goal here will be to allow the user to provide a callback to do arbitrary compilation steps, with a default callback that just initiates the compilation, similar to: https://github.com/a16z/jolt/blob/c9bfe3226b8be6d5012495a7e2c72605980c6ae1/jolt-core/src/host/mod.rs#L93 |
The proposed Highlighting a couple points that resonate with me:
Yes, there are so many potential configurations, both now and in the future, and it's not a simple matter to swap them in and out.
Definitely, it makes sense that we'd want both. General princples we seem aligned on
To make that a bit more concrete, a likely implementation strategy is that (1) involves traits and generic parameters and (2) provides wrapper types that handle selecting generic type arguments and the construction of those internal values. Additional proposed principles
For example, workshopping on the I/O concept to make flexible internal abstractions: extern crate alloc;
use alloc::vec::Vec;
enum Error {
// TODO
}
trait RawInputTape {
fn read_byte(&mut self) -> Option<u8>;
}
trait RawOutputTape {
fn write_byte(&mut self, byte: u8) -> Result<(), Error>;
}
impl RawInputTape for Vec<u8> {
fn read_byte(&mut self) -> Option<u8> {
self.pop()
}
}
impl RawOutputTape for Vec<u8> {
fn write_byte(&mut self, byte: u8) -> Result<(), Error> {
self.push(byte);
// TODO: handle failed memory allocation
Ok(())
}
}
trait RawProof {}
trait RawVirtualMachine<I, O>
where
I: RawInputTape,
O: RawOutputTape,
{
type Proof: RawProof;
fn run(&mut self, input: I) -> Result<(O, Self::Proof), Error>;
} This allows us to keep the engine's interfaces as minimal as possible to reduce the cost of adding more implementations. For ergonomic conveniences like input serialization and output deserialization (which could happen outside proof machinery) we could wrap those abstractions in high-level ones. Implementations of these higher-level abstractions would automatically work with all of the engines implementing the core abstractions they consume: trait InputValue {
fn serialize<I: RawInputTape>(&self) -> Result<I, Error>;
}
trait OutputValue {
fn deserialize<O: RawOutputTape>(tape: &O) -> Result<Box<Self>, Error>;
}
struct UserInput;
impl InputValue for UserInput {
fn serialize<I: RawInputTape>(&self) -> Result<I, Error> {
todo!();
}
}
struct UserOutput;
impl OutputValue for UserOutput {
fn deserialize<O: RawOutputTape>(tape: &O) -> Result<Box<Self>, Error> {
todo!();
}
}
trait VirtualMachine<I, O>
where
I: InputValue,
O: OutputValue,
{
type Input: RawInputTape;
type Output: RawOutputTape;
type Proof: RawProof;
type Core: RawVirtualMachine<Self::Input, Self::Output, Proof = Self::Proof>;
fn core(&mut self) -> &mut Self::Core;
fn run(&mut self, input: I) -> Result<(O, Self::Proof), Error> {
let input: Self::Input = input.serialize()?;
let (output, proof) = self.core().run(input)?;
let output = O::deserialize(&output)?;
Ok((*output, proof))
}
} There's a lot more we could do with traits to model valid machine state transitions through UX interactions @sjudson alludes to in their sample code and @slumber's mention of "stepped CLI output". EDIT: I realize now, and confirmed with @sjudson, that the point of |
Here is the current sketch of the basic SDK structure. The core object of the SDK is an engine, which captures a vm + arithmetization. At time of writing, we will have two engines: NVMEngine { ... }
JoltEngine { ... } Each implements the pub trait Engine: Default + Clone {
/// The memory model to be used by the VM.
type Memory: Memory;
// setup
/// Load the ELF file at `path` into the VM.
fn load(&mut self, path: &PathBuf) -> Result<&mut Self, NexusError>;
/// Load the indicated test machine into the VM.
fn load_test_machine(&mut self, machine: &String) -> Result<&mut Self, NexusError>;
// configuration
/// Load VM configuration from a `NexusVMConfig` object.
fn from_config(&mut self, config: NexusVMConfig) -> Result<&mut Self, NexusError>;
/// Print debugging trace of the VM execution.
fn enable_debug_execution_mode(&mut self) -> Result<&mut Self, NexusError>;
/// When executing the VM, generate a proof.
fn enable_prover_mode(&mut self) -> Result<&mut Self, NexusError>;
/// Set choice of prover, using its defaults.
fn set_defaulted_prover(&mut self, prover: &ProverImpl) -> Result<&mut Self, NexusError>;
/// Set choice of prover.
fn set_prover(&mut self, prover: &impl Prover) -> Result<&mut Self, NexusError>;
/// Set `k` parameter that captures how many VM cycles should be proven per step.
fn set_k(&mut self, k: usize) -> Result<&mut Self, NexusError>;
// input
/// Serialize the input of type `T` for reading as private input in the VM.
fn set_input<T: Serialize>(&self, input: T) -> Result<&mut Self, NexusError>;
/// Serialize the input byte slice for reading as private input in the VM.
fn set_input_bytes(&self, input: &[u8]) -> Result<&mut Self, NexusError>;
/// Serialize the input of type `T` for reading as public input in the VM.
fn set_public_input<T: Serialize>(&self, input: T) -> Result<&mut Self, NexusError>;
/// Serialize the input byte slice for reading as public input in the VM.
fn set_public_input_bytes(&self, input: &[u8]) -> Result<&mut Self, NexusError>;
// running
/// Generate public parameters for proving the VM execution.
fn generate(&self) -> Result<&mut Self, NexusError>;
/// Execute the VM.
fn execute(&self) -> Result<&mut Self, NexusError>;
/// Execute the VM and generate a proof of its execution.
fn prove(&self) -> Result<&mut Self, NexusError>;
/// Verify a proven VM execution.
fn verify(&self) -> Result<&mut Self, NexusError>;
/// Rewind the VM, without clearing the loaded program or inputs.
fn rewind(&self) -> Result<&mut Self, NexusError>;
/// Reset the VM, including clearing the loaded program and inputs.
fn reset(&self) -> Result<&mut Self, NexusError>;
// status
/// Indicates whether configured to generate a proof when executing.
fn proving(&self) -> bool;
/// Indicates whether configured to print debug output for program execution.
fn debugging_execution(&self) -> bool;
/// Indicates whether program execution has started.
fn started(&self) -> bool;
/// Indicates whether program execution has halted.
fn halted(&self) -> bool;
/// Indicates whether program execution has been proven.
fn proven(&self) -> bool;
/// Indicates whether proof of program execution has been verified.
fn verified(&self) -> bool;
} Note that functions which update the state of the engine return let vm = NVMEngine::default()
.load(PathBuf::from(r"/path/to/binary/"))?
.set_input<T>(input)?
.execute()? Internally, an engine is configured with a #[derive(Default)]
enum ProverArch {
#[default]
Local,
}
pub trait Prover: Default + Clone {
/// The proving architecture used to prove the trace.
type Arch: ProverArch;
/// The memory proof form used in the trace.
type MemoryProof: MemoryProof;
/// Generate public parameters for proving.
fn generate(&self, k: usize) -> Result<(), NexusError>;
/// Prove a VM trace.
fn prove(&self, trace: Trace<Self::MemoryProof>) -> Result<(), NexusError>;
/// Verify a proven VM execution.
fn verify(&self) -> Result<(), NexusError>;
} The The set of compatible provers for a given engine is defined by an engine specific trait. For example, At present there is no generic support: specific provers are concrete in both proving architectures + other proof specific generics (such as choice of curve), as is specified by the |
This looks great |
// It shouldn't be hidden within a trait impl how to load from fs
// returns &mut Self?
fn load(&mut self, path: &PathBuf) -> Result<&mut Self, NexusError>;
// Probably should be omitted as well, or maybe autoimplemented.
fn load_test_machine(&mut self, machine: &String) -> Result<&mut Self, NexusError>;
// All these methods could go into NexusVMConfig
fn enable_debug_execution_mode(&mut self) -> Result<&mut Self, NexusError>;
fn enable_prover_mode(&mut self) -> Result<&mut Self, NexusError>;
fn set_defaulted_prover(&mut self, prover: &ProverImpl) -> Result<&mut Self, NexusError>;
fn set_k(&mut self, k: usize) -> Result<&mut Self, NexusError>;
// Either of 2, otherwise this is like trying to bypass Rust rules for conflicting implementations
// because [u8] also implements Serialize.
fn set_input<T: Serialize>(&self, input: T) -> Result<&mut Self, NexusError>;
fn set_input_bytes(&self, input: &[u8]) -> Result<&mut Self, NexusError>;
// This is present in the prover trait, which one to use?
fn generate(&self) -> Result<&mut Self, NexusError>;
// Doesn't `enable_prover_mode` eliminate 1 of 2?
fn execute(&self) -> Result<&mut Self, NexusError>;
fn prove(&self) -> Result<&mut Self, NexusError>;
// and so on IMO as with the RPC pr, trying to put every possible configuration and generic into the first implementation does more bad than good. Usual software engineering practice is to build from small code, introducing such huge traits puts a maintenance burden. Like start with a single trait that only knows how to prove/verify, merge it, add some vm configurations, make it further configurable with memory layout etc. At this point it looks like I keep blocking work on sdk, so feel free to unblock #170 |
I broadly agree with @slumber's feedback. Here are some additional thoughts:
It may be helpful to tackle #165 and iterate on cleaning up that test until it feels both ergonomic and sufficiently configurable and extensible. |
Responding directly to @mx00s since they were a bit more concrete, but this also applies to some of your comments too @slumber.
Sure.
Also sure.
Sorry yes, that's a typo (I accidentally reverted a search + replace before copying).
I debated this. The flow I was concerned with enabling looks like: let engine = ...;
if check_something() {
engine.enable_prover_mode();
}
engine.execute(); // generates proof depending on what check_something() returns Alternatively, if the user necessarily knows they will always want a proof generated, they can just do engine.prove(); without having to think about it twice. If you are really opposed to having two ways to do the same thing like this, then I'd probably err on the side of getting rid of
These are roughly the same question:
Sure.
This would require breaking the function call chaining paradigm (which we can, of course, do). I find the call chaining paradigm to be more misuse resistant: the user never actually has to touch, e.g., the public parameters or carry them through to the right place as an argument, but instead just needs to specify the order of operations (which they'd have to do either way).
These were formulated with the idea that an engine can be in a variety of states (has [no] program, has [no] input, has [not] generated proof, has [not] generated parameters, etc.), and we don't want to be presumptive about if and when the calling code will want to programmatically check which state it is in. The calling context being |
I'm not sure I follow the benefit of that flow over something like this: let engine = ...;
if check_something() {
engine.prove();
} else {
engine.execute();
} The way the user inspects the state of the returned/mutated engine would also need to vary depending on the value of
I agree on the desire for misuse resistance. I also find method chaining convenient, and tend to use it a lot when constructing objects with a broad spectrum of configurations. However, object construction is typically more of a concern for concrete types whereas Typestate is a tactic that enforces operational constraints at compile-time and thereby improves misuse resistance. Here's an illustrative article that includes examples with method chaining. I think is worth revisiting an important point @slumber made:
Simple interfaces will lower the barrier to adding new implementations. Constraining the behaviors of those interfaces through types and generic tests will also help everyone be confident things will work when swapping one implementation for another. I'm curious how far we could go with an interface like this, for instance: trait Prover {
type Program;
type PublicInput;
type PrivateInput;
type Output;
type Proof: Verifiable;
type Error;
fn prove(
program: &Self::Program,
public_input: &Self::PublicInput,
private_input: &Self::PrivateInput,
) -> Result<(Self::Output, Self::Proof), Self::Error>;
}
trait Verifiable {
type Error;
fn verify(&self) -> Result<(), Self::Error>;
} ...or perhaps even this assumes to much in general, e.g. will inputs--both public and private--definitely be supported on all prover/verifier implementations we ever offer? EDIT: Here's a Rust playground with fake implementations of those traits and a happy-path usage example. I personally find it helpful to motivate trait design by writing tests from the consumer's perspective. |
Resolved by #208 |
The main prover functionality is gathered in
nexus-prover
crate, which does all sort of stuff: generate public parameters, print nice CLI output, compress proofs etc.I propose to remove it in favor of
nexus-sdk
which provides abstracted interface likeAnd then use it
This SDK is then integrated into
nexus-tools
to be reused with CLI and pretty output.The text was updated successfully, but these errors were encountered: