New Application implementation for Emscripten

If you build your Magnum apps for the web, you can now make use of a new feature-packed, smaller and more power-efficient application implementation. It uses the Emscripten HTML5 APIs directly instead of going through compatibility layers.

Until now, Platform::Sdl2Application was the go-to solution for most platforms, including the web and mobile. However, not everybody needs all the features SDL provides and, especially on Emscripten, it doesn’t really add anything on top apart from simplifying porting. On the contrary, the additional layer of translation between the HTML5 and SDL APIs increases the executable size and makes some features unnecessarily hard to access.

To solve that, the new Platform::EmscriptenApplication, contributed in mosra/magnum#300 by @Squareys, uses the Emscripten HTML5 APIs directly, opening new possibilities while making the code smaller and more efficient.

“SDL2” vs SDL2

Since there’s some confusion about SDL among Emscripten users, let’s clarify that first. Using SDL in Emscripten is actually possible in two ways. The implicit support, implemented in library_sdl.js, gives you a slightly strange hybrid of SDL1 and SDL2 in a relatively small package. Not all SDL2 APIs are present there, but it has enough of SDL2 to make it a viable alternative to the SDL2 everyone is used to. This is what Platform::Sdl2Application is using.

The other way is a “full SDL2”, available if you pass -s USE_SDL=2 to the linker. Two years ago we tried to remove all Emscripten-specific workarounds from Platform::Sdl2Application by switching to this full SDL2, but quickly realized it was a bad decision: in total it removed 30 lines of code, but caused the resulting code to be almost 600 kB larger. The very minor improvements in code maintainability didn’t warrant such a serious size increase. For the record, the original pull request is archived at mosra/magnum#218.

The SDL-free EmscriptenApplication

All application implementations in Magnum strive for almost full API compatibility, with the goal of making it possible to use an implementation optimal for the chosen platform and use case. This was already the case with Platform::GlfwApplication and Platform::Sdl2Application, where switching from one to the other is in 90% of cases just a matter of using a different #include and passing a different component to CMake’s find_package(). The new Platform::EmscriptenApplication continues in this fashion, and we ported all existing examples and tools that formerly used Platform::Sdl2Application to it to ensure it works in a broad range of use cases.

Apart from that, the new implementation fixes some long-standing issues, such as miscalculated event coordinates on mobile web browsers or the Delete key leaking through text input events.

Only two widely used APIs are missing from it right now: Platform::Sdl2Application::tickEvent() and Platform::Sdl2Application::setSwapInterval(). The former will get added together with an equivalent in the GLFW application, while the latter will be exposed differently, making it possible to use the extended browser APIs. Right now it’s enough to #ifdef around it, as browsers, unlike most desktop platforms, enable VSync by default.
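A minimal sketch of such an #ifdef follows. To stay self-contained it assumes a hypothetical StubApplication class in place of the real application class, and it checks Emscripten's predefined __EMSCRIPTEN__ macro; in a Magnum project, Corrade's CORRADE_TARGET_EMSCRIPTEN macro would play the same role.

```cpp
#include <cassert>

// Hypothetical stand-in for an application class, only here to keep the
// sketch self-contained. It mirrors the situation described above: the
// swap interval setter is simply not there on Emscripten.
class StubApplication {
    public:
        #ifndef __EMSCRIPTEN__
        bool setSwapInterval(int interval) {
            assert(interval >= 0);
            _swapInterval = interval;
            return true;
        }
        #endif

        int swapInterval() const { return _swapInterval; }

    private:
        // -1 means "left at the platform default"
        int _swapInterval = -1;
};

// Application setup shared by all platforms
void setup(StubApplication& app) {
    #ifndef __EMSCRIPTEN__
    // Desktop: VSync may be off by default, so request it explicitly
    app.setSwapInterval(1);
    #else
    // Browser: redraws are VSync'd already, so there's nothing to call
    static_cast<void>(app);
    #endif
}
```

The rest of the application code stays identical on both platforms; only the one call that has no counterpart yet is compiled out.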

Power-efficient idle behavior

Since the very beginning, all Magnum application implementations default to redrawing only when needed in order to save power. Because Magnum is not just for games that have to animate something every frame, it doesn’t make sense to use up all system resources by default. While this is simple to implement efficiently on desktop, where the application has full control over the main loop (and thus can block indefinitely waiting for an input event), it’s harder in the callback-based browser environment.

The original Platform::Sdl2Application makes use of emscripten_set_main_loop(), which periodically calls window.requestAnimationFrame() in order to maintain a steady frame rate. For apps that redraw only when needed, this means the callback gets called 60 times per second only to be a no-op. While that’s still significantly more efficient than drawing everything each time, the browser still has to wake up 60 times per second to do nothing.

Platform::EmscriptenApplication instead makes use of requestAnimationFrame() directly. The next animation frame is implicitly scheduled, but cancelled again after the draw event if the app doesn’t wish to redraw again immediately. That takes the best of both worlds: redraws are still VSync’d, but the browser is not looping needlessly if the app just wants to wait with a redraw until the next input event.

To give you some numbers, below is a ten-second output of Chrome’s performance monitor comparing the SDL and Emscripten app implementations waiting for an input event. You can reproduce this with the Magnum Player: no matter how complex an animated scene you throw at it, if you pause the animation it will use as much CPU as a plain static text web page.

[Figures: Idle Sdl2Application; Idle EmscriptenApplication]
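To make the difference between the two scheduling strategies more tangible, here is a toy model that only counts how often the animation frame callback fires. It is a simplified sketch, not the actual implementation: the WakeupCounts struct and the simulate() function are hypothetical names, and input events are reduced to "a redraw was requested during this display refresh".

```cpp
#include <cassert>
#include <set>

// How many times the animation frame callback fired under each strategy
// and how many of those calls actually drew something
struct WakeupCounts {
    int polling;
    int onDemand;
    int draws;
};

// Toy model: `frames` is the number of display refreshes to simulate,
// `inputFrames` the refreshes during which an input event arrived and
// asked for a redraw
WakeupCounts simulate(int frames, const std::set<int>& inputFrames) {
    assert(frames >= 0);
    WakeupCounts counts{0, 0, 0};
    bool redrawRequested = false;
    for(int frame = 0; frame != frames; ++frame) {
        if(inputFrames.count(frame)) redrawRequested = true;

        // Polling main loop: the callback fires on every refresh, even if
        // it then finds out there's nothing to draw
        ++counts.polling;

        // On-demand scheduling: the callback fires only if a frame is
        // scheduled, and no further frame is scheduled after the draw
        if(redrawRequested) {
            ++counts.onDemand;
            ++counts.draws;
            redrawRequested = false;
        }
    }
    return counts;
}
```

For ten idle seconds at 60 Hz, simulate(600, {}) reports 600 callbacks for the polling loop and none for on-demand scheduling. With two input events, as in simulate(600, {10, 300}), the on-demand count is two while the polling loop still fires 600 times.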

DPI awareness revisited

Arguably to simplify porting, the Emscripten SDL emulation recalculates all input event coordinates to match framebuffer pixels. The actual DPI scaling (or device pixel ratio) is then exposed through dpiScaling(), making it behave the same as Linux, Windows and Android on high-DPI screens.

In contrast, the HTML5 APIs behave like macOS / iOS and Platform::EmscriptenApplication follows that behavior: framebufferSize() thus matches device pixels, while windowSize() (to which all events are related) is smaller on HiDPI systems. For more information, check out the DPI awareness docs.

It’s important to note that even though different platforms expose DPI awareness in different ways, Magnum APIs are designed in a way that makes it possible to have the same code behave correctly everywhere. The separation into dpiScaling(), framebufferSize() and windowSize() properties is mainly for more fine-grained control where needed.
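As one illustration of code that behaves the same on every platform, an event position can be converted from window coordinates to framebuffer pixels by scaling it with the ratio of the two sizes. The sketch below assumes a hypothetical Vec2 type and windowToFramebuffer() helper instead of the Magnum math types, purely to stay self-contained.

```cpp
#include <cassert>

// Hypothetical two-component vector, standing in for a real math type
struct Vec2 {
    float x, y;
};

// Maps a position given in window coordinates (the units events are
// reported in) to framebuffer pixels. Where both sizes are equal, the
// position comes back unchanged.
Vec2 windowToFramebuffer(Vec2 position, Vec2 windowSize, Vec2 framebufferSize) {
    assert(windowSize.x > 0.0f && windowSize.y > 0.0f);
    return {position.x*framebufferSize.x/windowSize.x,
            position.y*framebufferSize.y/windowSize.y};
}
```

With an 800x600 window backed by a 1600x1200 framebuffer, a click at (400, 300) maps to (800, 600). With both sizes at 800x600, the same click stays at (400, 300), so the calling code doesn't need any platform-specific branches.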

Executable size savings

Because we didn’t end up using the heavyweight “full SDL2” in the first place, the difference in executable size is nothing extreme. In total, in a Release WebAssembly build, the JS size got smaller by about 20 kB, while the WASM file stays roughly the same.

[Chart: download size (*.js, *.wasm) for Sdl2Application with -s USE_SDL=2, Sdl2Application with -s USE_SDL=1 and EmscriptenApplication, in that order. JS: 111.9 kB, 74.4 kB, 52.1 kB. WASM: 731.2 kB, 226.3 kB, 226.0 kB.]

Minimal runtime, or brain surgery with a chainsaw

On the other hand, since the new application doesn’t use any of the emscripten_set_main_loop() APIs from library_browser.js, it’s a good candidate for playing with the relatively recent MINIMAL_RUNTIME feature of Emscripten. Now, while Magnum is moving in the right direction, it’s not yet in a state where this would “just work”. Supporting MINIMAL_RUNTIME requires either moving fast and breaking lots of things, or having the APIs slowly evolve into a state that makes it possible. Because reliable backwards compatibility and a painless upgrade path are valuable assets in our portfolio, we chose the latter. It will eventually happen, but not right now. Another reason is that while Magnum itself can be highly optimized to be compatible with the minimal runtime, the usual application code is not able to satisfy those requirements without removing and rewriting most third-party dependencies.

That being said, why not spend one afternoon with a chainsaw and try demolishing the code to see what could come out? It’s however important to note that MINIMAL_RUNTIME is still a very fresh feature and thus it’s very likely that a lot of code will simply not work with it. All the discovered problems are listed below because at this point there are no results at all when googling them, so hopefully this helps other people stuck in similar places:

- Using std::getenv() or the environ variable (used by Utility::Arguments) results in writeAsciiToMemory() being called, which is right now explicitly disabled for the minimal runtime (and thus you either get a failure at runtime or the Closure Compiler complaining about these names being undefined). Since Emscripten’s environment is just a bunch of hardcoded values and Magnum is using Node.js APIs to get the real values for command-line apps anyway, the solution is to simply not use those functions.

- Right now, Magnum is using C++ iostreams in three isolated places (Utility::Debug being the most prominent) and those uses are gradually being phased out. On Emscripten, using anything that even remotely touches them will make the backend emit calls to llvm_stacksave() and llvm_stackrestore(). The JavaScript implementations then call stackSave() and stackRestore(), which however do not get pulled in with MINIMAL_RUNTIME, again resulting in an error either at runtime every time you call into JS (so also all emscripten_set_mousedown_callback() functions) or when you use the Closure Compiler. After wasting a few hours trying to convince Emscripten to emit these two by adding _llvm_stacksave__deps: ['$stackSave'], the ultimate solution was to kill everything stream-related. Considering everyone who’s interested in MINIMAL_RUNTIME probably did that already, it explains why this is another ungoogleable error.

- If you use C++ streams, the generated JS driver file contains a full JavaScript implementation of strftime() and the only way to get rid of it is removing all stream usage as well. Grep your JS file for Monday. If it’s there, you have a problem.

- JavaScript Emscripten APIs like dynCall() or allocate() are not available, and putting them into either EXTRA_EXPORTED_RUNTIME_METHODS or RUNTIME_FUNCS_TO_IMPORT either didn’t do anything or moved the error to a different place. For the former it was possible to work around it by directly calling one of its specializations (in that particular case dynCall_ii()); the latter resulted in a frustrated tableflip and the relevant piece of code getting cut off.

Below is a breakdown of various optimizations on a minimal application that does just a framebuffer clear, each step chopping another bit off the total download size. All sizes are uncompressed, built in Release mode with -Oz, --llvm-lto 1 and --closure 1. Later on in the process, the experimental WebAssembly support in Bloaty McBloatFace was used to discover which functions contribute the most to the final code size.
| Operation | JS size | WASM size |
| --- | --- | --- |
| Initial state | 52.1 kB | 226.3 kB |
| Enabling minimal runtime | 36.3 kB | 224.5 kB |
| Additional slimming flags | 35.7 kB | 224.5 kB |
| Disabling filesystem | 19.4 kB | 224.5 kB |
| Chopping off all C++ stream usage | 14.7 kB | 83.6 kB |
| Enabling CORRADE_NO_ASSERT | 14.7 kB | 75.4 kB |
| Removing a single use of std::sort() | 14.7 kB | 69.3 kB |
| Removing one std::unordered_map | 14.7 kB | 62.6 kB |
| Using emmalloc instead of dlmalloc | 14.7 kB | 56.3 kB |
| Removing all printf() usage | 14.7 kB | 44 kB (estimate) |

[Chart: download size (*.js, *.wasm) after each step listed in the table]

While all of the above size reductions were done in a hack-and-slash manner, the final executable still initializes and executes properly, clearing the framebuffer and reacting to input events. For reference, check out diffs of the chainsaw-surgery branches in corrade and magnum.

The above is definitely not all that can be done. Especially considering that removing two uses of semi-heavy STL APIs led to a saving of almost 20% in code size, there is most probably more such low-hanging fruit. The above tasks were added to mosra/magnum#293 (if not there already) and will get gradually integrated into master.
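The relative savings behind the size breakdown can be checked with a few lines of arithmetic. The reductionPercent() helper below is a hypothetical one written just for this illustration; the inputs are the sizes reported in the breakdown, with the final 44 kB being an estimate.

```cpp
#include <cassert>

// Relative size reduction in percent when going from `before` to `after`,
// both given in the same unit (kB here)
double reductionPercent(double before, double after) {
    assert(before > 0.0);
    return 100.0*(before - after)/before;
}
```

For the two STL removals, reductionPercent(75.4, 62.6) comes out at roughly 17% of the WASM file, which is presumably the saving referred to as almost 20%. Over the whole exercise, reductionPercent(226.3, 44.0) gives roughly 81% for the WASM file and reductionPercent(52.1, 14.7) roughly 72% for the JS file.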