Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

safeguarding code changes done by wasm binary #17

Open
ChALkeR opened this issue Jul 26, 2024 · 18 comments
Open

safeguarding code changes done by wasm binary #17

ChALkeR opened this issue Jul 26, 2024 · 18 comments
Labels
stale PRs or issues that have been stale

Comments

@ChALkeR
Copy link
Member

ChALkeR commented Jul 26, 2024

Perhaps adding an additional safeguard on the transform output could increase trust in the shipped base64 wasm file

PoC on top of the package public API (but embedding that inside the package might avoid double conversion)

const transformedBuf = Buffer.from(transformed)
if (sourceBuf.length !== transformed.length) throw new Error('length mismatch')
for (let i = 0; i < transformedBuf.length; i++) {
  // should match either the source buffer or spaces or semicolon: https://github.com/swc-project/swc/issues/9331
  const val = transformedBuf[i]
  if (val !== sourceBuf[i] && val !== 0x20 && val !== 0x3b) throw new Error('result mismatch')
}

That seems to work on simple examples, I wonder how something like this (minus the extra buffer<->string conversions) would affect perf

@ChALkeR ChALkeR changed the title safeguarding code changes safeguarding code changes done by wasm binary Jul 26, 2024
@marco-ippolito
Copy link
Member

I mean we compile rust and build the wasm manually, so if there is anything suspicious we see it in the rust code when we update swc https://github.com/nodejs/amaro/tree/main/deps/swc

This is the wasm build: https://github.com/nodejs/amaro/blob/main/tools/build-wasm.js

@ChALkeR
Copy link
Member Author

ChALkeR commented Jul 26, 2024

trusting/verifying the build process and verifying the result on the fly are different levels of guarantees, which are not mutually exclusive and should be complementing each other, ideally

so the main question is if there is a measurable perf hit on this / what would be the costs of such a safeguard

@marco-ippolito
Copy link
Member

I wonder what is safeguarding against specifically? Can you describe an use case?

@ChALkeR
Copy link
Member Author

ChALkeR commented Jul 26, 2024

@marco-ippolito e.g. a supply chain attack on upstream swc embedding malicious js into generated output which gets past source audit when updating the binary

this check would make it significantly harder to do anything bad with that

@marco-ippolito
Copy link
Member

marco-ippolito commented Jul 26, 2024

@marco-ippolito e.g. a supply chain attack on upstream swc embedding malicious js into generated output which gets past source audit when updating the binary

this check would make it significantly harder to do anything bad with that

If a supply chain attack happens in swc, when we update swc, we have a pr with the changed rust source code. We dont use swc javascript. Then someone has to manually compile the wasm and open a pr to update the generated js in amaro. The inline wasm is generated by us from the rust source code, not from upstream. If a malicious attacked opened a pr with malicious wasm, the safeguard would not be useful

@ChALkeR
Copy link
Member Author

ChALkeR commented Jul 26, 2024

cc @nodejs/security-wg (also perhaps @nodejs/security-tsc)

@ChALkeR
Copy link
Member Author

ChALkeR commented Jul 26, 2024

If a supply chain attack happens in swc, when we update swc, we have a pr with the changed rust source code.

Yes, and I'm not entirely sure we can give a 100% guarantee that each line in each change in rust code will be properly audited each time we update swc (we can't give a 100% guarantee to anything actually)

(Was the current impl audited btw?)

We dont use swc javascript

The issue is with rust swc possibly injecting malicious js which node then executes

Then someone has to manually compile the wasm and open a pr to update the generated js in amaro

Not a safeguard which is strong enough

The inline wasm is generated by us from the rust source code, not from upstream

See issue above (about changes audit process)

If a malicious attacked opened a pr with malicious wasm, the safeguard would not be useful

Why? This is a very clean/minimal check which could be additionally covered with codeowners
Which interfaces are available to the wasm binary?

What are the actual downsides? (if perf won't be significantly affected?)

@ChALkeR
Copy link
Member Author

ChALkeR commented Jul 27, 2024

it looks like there are more whitespace chars possible to mask multi-byte symbols, but still a very limited set

@marco-ippolito
Copy link
Member

marco-ippolito commented Jul 30, 2024

to be honest we have several other dependencies that we use as wasm:

We build the wasm from the source code in the organization, so we are well aware of the source code.
Just like we compile dependency code, in the deps folder, code changes are carefully reviewed.
SWC is trusted and also powers other runtimes and projects such as Deno, rspack etc...
I understand your concerns but I believe this is an issue that should be discussed with the @nodejs/security-wg and it would be quite unreasonable to block v0.0.5 based on this concern, expecially given that this is an early stage feature that requires bootstrap work.

@aduh95
Copy link

aduh95 commented Jul 30, 2024

IIUC, sourceBuf would be the user-code (e.g. the content of index.ts) and transformed is the output of SWC stripping types out of sourceBuf, correct? I like this idea, as long as --experimental-stip-types does not apply to node_modules files, we can assume it's only used for developing, in which case it's OK to have it a bit slower if that improves the confidence.

@aduh95
Copy link

aduh95 commented Jul 30, 2024

it would be quite unreasonable to block v0.0.5 based on this concern

I definitely agree, it should block the unflagging of the feature (or worst case scenario the landing of the experimental flag on a release line), but there's no reason to block updating from 0.0.4 to 0.0.5 as the concern applies to both versions.

@ChALkeR
Copy link
Member Author

ChALkeR commented Jul 31, 2024

@aduh95 I was not suggesting to block 0.0.4 -> 0.0.5 update based on this, that one is blocked on build process not being reproducable or transparent

This is a separate issue

@ChALkeR
Copy link
Member Author

ChALkeR commented Jul 31, 2024

Filed a PR at nodejs/node#54141

@bcheidemann
Copy link

to be honest we have several other dependencies that we use as wasm:

As an upstream user of Node, it's concerning to hear that this is common/accepted practice. Particularly in light of the xz attack.

We build the wasm from the source code in the organization, so we are well aware of the source code.

This is not something I can verify as an end user.

Just like we compile dependency code, in the deps folder, code changes are carefully reviewed.

Also not something I can verify. This can change over time as well.

SWC is trusted and also powers other runtimes and projects such as Deno, rspack etc...

I think the point is that it's possible for a modified (malicious) version of the SWC WASM to be merged to main, even if SWC is clean. The Binary is updated and committed manually, and there's no automated verification step before publishing.

FWIW, the way Deno is shipping SWC is very different to this. It is, I believe, much less prone to supply chain attacks.

@marco-ippolito
Copy link
Member

marco-ippolito commented Aug 1, 2024

@bcheidemann I agree there is a lot of work to do to improve security against supply chain attacks, in this case we build the wasm from sources and with cargo.lock, but there is obviously a lot more to do, like to start an automated build step. Any help is very welcome

@bcheidemann
Copy link

@bcheidemann I agree there is a lot of work to do to improve security against supply chain attacks, in this case we build the wasm from sources and with cargo.lock, but there is obviously a lot more to do, like to start an automated build step. Any help is very welcome

That's great to hear. I have a long flight next Thursday and will need something something to fill my time 😅 Perhaps, with your blessing, I could take a look at automating the build step?

@mhdawson
Copy link
Member

mhdawson commented Aug 1, 2024

@bcheidemann to add some additonal context, for most dependencies there is an update script in https://github.com/nodejs/node/tree/main/tools/dep_updaters which is used to generate PRs to update dependencies.

So automating the build step along with automating the PR update generation are two related tasks that help would be welcome with. From my perspective for the dependencies with WASM the end goal should be that we build from source in the deps/amaro directory in the node.js project.

@bcheidemann
Copy link

@marco-ippolito @mhdawson I have submitted a draft PR (#48) which I believe would address some of the concerns discussed in this issue. It's not fully tested because my plane wifi is bad and not playing ball with Docker. However, I've included a detailed writeup of the changes I've made and why, so any feedback on this would be greatly appreciated 🙂

From my perspective for the dependencies with WASM the end goal should be that we build from source in the deps/amaro directory in the node.js project.

@mhdawson I had a look at the current dep updaters (thanks for the link!). I think I agree that building from source, as you suggest, would be preferable. However, at this point I can think of a few different ways that we could achieve this goal, and I'm not sure which would be best. Anyway, I'd be happy to help how I can with this, and I'm sure discussion on #48 will help illuminate the correct path.

@marco-ippolito marco-ippolito added the stale PRs or issues that have been stale label Oct 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale PRs or issues that have been stale
Projects
None yet
Development

No branches or pull requests

5 participants