An Update on Plugin Security

cantrevealname · on Oct 3, 2019

I'm pleasantly surprised to read a fairly detailed, apparently honest analysis of their security decisions and trade-offs from a company selling proprietary software services. I don't have an opinion of whether technology X they used in the past is better Y, but this kind of detailed disclosure is rare and they should be thanked.

In the non-software world, how often do we get any kind of explanation? What was the mechanical problem that delayed my flight for 14 hours? I've never gotten even a two-sentence explanation; sometimes they lie (the announcement says "it's the weather" when the pilot has already admitted it's mechanical). I can list a hundred examples involving banks, government, and utilities where I've sought an explanation for a weird failure, but gotten absolutely nothing. The software world is leading the way in transparency compared to pretty much every other industry.

CJefferson · on Oct 3, 2019

Running a javascript interpreter, written in C and cross compiled to WASM, in a browser, does feel like a joke. However, it probably is most simple and effective way of running user submitted code in a sandbox which they can't escape from.

bsaul · on Oct 3, 2019

Indeed. The more i read about wasm technology, the more i agree with people that think it's going to change everything.

pjmlp · on Oct 3, 2019

I just get déjà-vu from all virtual machines that have been designed since the 60's.

bsaul · on Oct 3, 2019

Except it’s already proven to run in the most hostile environment : the browser, with just random code from internet. This is where the JVM stumbled.

pjmlp · on Oct 4, 2019

It is not proven at all. Black hat hackers are yet to be bothered to attack it.

Those WASM related CVEs will start coming when they do.

For starters it lacks bounds checking on memory access inside the sandbox, which is an easy target if the module happened to be originally written in C, C style C++, or any other systems language without bounds checking enabled by default.

Not even its Harvard architecture based design can help there.

The JVM stumbled in library logical errors implementation and FOSS devs hate.

While WASM for whatever reason has become the new hip child, with many acting as if it hasn't never been done before.

Plenty of IoT devices with sandbox based VMs for download of hostile code have been designed throughout the years. WASM is just yet another one.

I think the biggest punishment for all that cheering will be the return of Flash, ActiveX, Java, ..., using Canvas/WebGL/WebGPU and now you cannot turn it off.

Welcome to the new Web experience.

bsaul · on Oct 4, 2019

you seem concerned about security (which was a big factor against the flash VM), but i was more thinking about performance and integration with the environment. Java applets were slow, couldn’t run on most mobile browsers and were sticking out like a sore thumb on each page they ran into.

I was talking about the javascript VM (not specifically wasm), which has been the target of hackers for years and run fine on even modest mobile phone.

i understand running VM assembly code directly may change a few things relative to security, but other than that we pretty much have the guarantee that wherever javascript is able to run today, wasm will be, and the experience is going to be better (don’t see why wasm couldn’t call the DOM once browser decide it)

AgentME · on Oct 4, 2019

>For starters it lacks bounds checking on memory access inside the sandbox

Right, but you can't use that to escape the sandbox.

pjmlp · on Oct 4, 2019

What matters is subverting the sandbox behavior, no need to escape it.

jml7c5 · on Oct 4, 2019

>Running a javascript interpreter, written in C and cross compiled to WASM, in a browser, does feel like a joke.

Every day I see more clearly how prophetic "The Birth & Death of Javascript"[1] (2014) was. I'd love to pluck 1996 Brendan Eich into the future and show him how far his little programming language would go.

[1]: https://www.destroyallsoftware.com/talks/the-birth-and-death...

BrendanEich · on Oct 7, 2019

I was closely involved as CTO and then SVP of Engineering at Mozilla from the inceptions of both WebGL (originally Canvas 3D) and asm.js (see http://asmjs.org/ for docs), which led to the 4 day port via Emscripten or Unreal Engine 3 and Unreal Tournament from native to web and in Firefox at 60fps. This prefigured WebAssembly, which came in 2015 after it was clear from MS and the V8 team that one VM (not two as for Dart or PNaCL) would win.

Gary added the insight that system call overhead is higher than VM safety in one process (he may have exaggerated just a little) to predict migration of almost all native code. The general idea of a safe language VM+compiler being smaller and easier to verify than a whole OS+browser-native-codebase, as well as having lower security check overhead, I first heard articulated by Michael Franz of UCI, and it inspired my agenda at Mozilla that led to the current portable/multiply implemented JS+WebAssembly VM standard.

underbluewaters · on Oct 3, 2019

I was really skeptical when they announced the realms approach, and feel the same way about this one. The simple solution (and one that relies on proven browser-provided apis) is to isolate such code within an iframe. Yes there is some overhead when communicating over the postMessage api but it is not great.

aboodman · on Oct 3, 2019

I wonder if it's because they didn't want user code to have access to browser APIs, to avoid inadvertently growing dependencies on them.

edit: they explain all the rationale here and it makes a lot of sense to me:

https://www.figma.com/blog/how-we-built-the-figma-plugin-sys...

tiborsaas · on Oct 3, 2019

It's pretty ironic how they titled their announcement post "How to build a plugin system on the web and also sleep well at night"

naringas · on Oct 3, 2019

it is not that ironic, they did such a throrough job that they managed to swap out their problematic solution into their plan b easily enough.

rstupek · on Oct 3, 2019

I think the point was they swapped out one problematic solution for another

AriaMinaei · on Oct 3, 2019

They explained the trade-offs in the original blog post. They had considered this class of security holes and had a contingency plan in case one would be detected. They simply executed that plan. I'd say they handled it very well.

AgentME · on Oct 3, 2019

The Realms polyfill solution was inherently hacky: it tries to shim a sandbox into an environment that didn't have one, and it makes it too easy to accidentally leak capabilities into the sandbox.

WebAssembly was built from the start as completely sandboxed (if you want the sandbox to communicate with the world, you have to write each bridge yourself), and it's not as easy to directly share references to host objects into the sandbox, so it's unlikely to happen by accident.

It's a blacklist (Realms polyfill) vs whitelist (WebAssembly) kind of situation. Whitelist approaches are almost always easier to secure and verify.

rubbingalcohol · on Oct 3, 2019

Yeah it was a bit too smug. Having a bit of humility is never a bad thing. It's a good skill to learn, and experiences like this can help.

AriaMinaei · on Oct 3, 2019

Downvoted because clearly you either haven't thoroughly read the original blog post or do not remember it very well.

In the post, they clearly mentioned that such security holes might be found, and they explained their contingency plan, which they now executed.

renke1 · on Oct 3, 2019

I am really thankful the Figma team is talking about this stuff (especially with regard to security). I think providing a plugin API so that other can extend their application is pretty smart and in general brings web applications closer to traditional desktop applications where plugins are way more common.

welder · on Oct 3, 2019

> Figma seeks to ensure that plugins can only be run by an explicit user action.

> Figma seeks to ensure that plugins can’t run by themselves.

What about triggering plugins from implicit user actions? [1]

[1] https://www.figma.com/plugin-docs/whats-supported/#trigger-p...

IncludeSecurity · on Oct 3, 2019

I gotta say, the Figma team's blog has been on fire with quality posts about their sandbox.

They've posted all details about their sandbox implementation, then all the details about this dependency vulnerability that they were on top of, they showed how they reviewed logs for exploitation traces to find there was none, and finally described their strategy for future proactive protection.

Kudos all around!

gfodor · on Oct 3, 2019

It's non-obvious to me why a security incident like this should dictate switching engines over to Plan B. Does the incident reveal any new posterior information about the security of using Realms, that wasn't available at the outset?

The reason Realms seems like a smarter move overall is because in relative terms Realms has a wide variety of stakeholders, is (at least so far) on the ECMA track, and generally speaking seems like the 'right' solution to this problem. I can empathize with not wanting to be the product pushing the envelope and finding these holes, but at the same time having someone like Figma validating things seems like it could push the path dependencies towards more organizations investing in Realms for this kind of thing, and hence cause acceleration towards maturity -- the net effect being Figma's plugin platform being more secure, more quickly than if they back off from Realms.

The VM-in-WASM approach, while in theory fundamentally less likely to be exploited, seems risky from the perspective that now Figma basically is on their own for finding any places where it does in fact have a hole -- I doubt anyone else but them is going to be auditing the various touch points with QuickJS for any kind of sandboxing escaping. QuickJS is also a very new engine, written by a single individual which is also a factor (who by all accounts seems to be capable of 100 man-hours per hour, but nonetheless.) Insofar as Figma is a juicy target, you can be sure that probing QuickJS for potential issues will now be a worthwhile endeavor to hackers, and it seems likely that unless others start doing this approach that Figma will be the only party who will be incentivized to try to keep ahead of identifying exploits.

I acknowledge it's a bit of a catch-22 -- they want to ship safe functionality now, but also don't want to use tech that won't be the safest, most adopted standard and best option in the long run -- but that choice, today, seems like a security risk. I'm curious if there is any other additional context that warranted pulling the escape lever. I understand they can switch back at any time, but it does seem that this event increased their priors on future vulnerabilities in Realms.

nicoburns · on Oct 3, 2019

> I doubt anyone else but them is going to be auditing the various touch points with QuickJS for any kind of sandboxing escaping.

I don't think this is relevant. QuickJS-in-wasm doesn't have access to the host environment in the first place. They're not relying on QuickJS for sandboxing, they're relying on WASM. And the WASM runtimes have plenty of eyeballs.

gfodor · on Oct 3, 2019

There's still interop code though, isn't part of the challenge with this ensuring that there's no security holes in the communication layer between the VM and the outer application? Perhaps they have that concern regardless, but I think a standard set of patterns + shims will eventually converge for Realms.

azakai · on Oct 3, 2019

There is interop code between the wasm and its JS runtime, but that's very easy to audit for exactly what the wasm is given and what it returns.

Also that interop code will be a combination of standard runtime code - very heavily tested, unless they are using something very new - and code they write themselves for their specific application - which they control.

So I think this is the safer route for security. It does have a speed penalty atm, but sounds like it's worth it for Figma.

rvz · on Oct 3, 2019

> We now use QuickJS, a JavaScript VM written in C and cross-compiled to WebAssembly. This was our backup plan in case the Realms shim approach didn't work out. We were able to activate our backup plan very quickly thanks to a swappable architecture.

I had high hopes for Figma for their sensible security choices before this blog post. But reading that they are using QuickJS even when it is unstable and they have cross-compiled to WASM doesn't improve the security prospects. Sandbox escapes are still a thing in JS VM and the WASM VM these days and using it alone still won't solve these issues. Lite-mode V8 might have made more sense to embed.

Having a plugin system and avoiding malicious code-execution was always going to be a tricky situation, especially on the web. Some form of isolation must exist between the VM and the code that disallows this better than a sandbox. As for choosing JS engines, I don't think choosing QuickJS was a sensible choice in terms of security.

hackcasual · on Oct 3, 2019

Another downside to the WASM approach is they'll need to allow unsafe-eval in their CSP to support it on Chrome.

megous · on Oct 3, 2019

Can't they just run the script in a stripped down worker context?

krzkaczor · on Oct 3, 2019

It's explained here: https://www.figma.com/blog/how-we-built-the-figma-plugin-sys...

megous · on Oct 4, 2019

It's not. There's no mention of web workers.

AgentME · on Oct 4, 2019

They talked about external-origin iframes, which are very similar to web workers (besides that iframes get a DOM). All communication with them is asynchronous.

Sephr · on Oct 3, 2019

Maybe. I explored this concept with JSandbox[1] a decade ago.

1. https://github.com/eligrey/jsandbox

dmackerman · on Oct 3, 2019

My guess is they explored this, but realized how limited workers are?

megous · on Oct 4, 2019

They did not write about it though.

Web Workers were used for running untrusted plugins in the past. They run off-the-main-thread, are terminateable, don't have access to DOM, etc.

You'd still need to cleanup their environment, before running untrusted code there. It's some challenge, but not too hard.

The only trouble is that they're hard to monitor for abuses of CPU/Memeory use. But that will at most crash your browser.