By Narann, Sunday, August 24, 2014, 19:14 - Mes coups de coeur - Permalink

As this kind of niche book is rare in the CGI world, let's talk about it!

This book was born out of the Multithreading and VFX course at Siggraph 2013, where several companies presented how they use multithreading in their tools. The results were so interesting that they decided to gather them into a book.

Chapter 1: Introduction and Overview (by James Reinders)

The prologue is an ode to multithreading (what a surprise). James Reinders is an engineer at Intel, and we can say he knows his job!

What I really liked is that multiple facets of multithreading are presented (multitasking vs. multithreading, vectorization, etc.), with technical explanations of how they work and the pros and cons of each.
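To make that distinction concrete, here is a rough, library-free sketch (mine, not from the book; all names are hypothetical): task parallelism launches independent jobs concurrently, while data parallelism splits a single loop across worker threads.

```cpp
#include <algorithm>
#include <cstddef>
#include <future>
#include <numeric>
#include <thread>
#include <vector>

// Task parallelism: independent jobs run concurrently.
long long sum_range(long long lo, long long hi) {
    long long s = 0;
    for (long long i = lo; i < hi; ++i) s += i;
    return s;
}

long long task_parallel_sum(long long n) {
    // Two independent tasks, each summing half the range.
    auto a = std::async(std::launch::async, sum_range, 0LL, n / 2);
    auto b = std::async(std::launch::async, sum_range, n / 2, n);
    return a.get() + b.get();
}

// Data parallelism: one loop, split into chunks across worker threads.
long long data_parallel_sum(long long n) {
    unsigned workers = std::max(1u, std::thread::hardware_concurrency());
    std::vector<long long> partial(workers, 0);
    std::vector<std::thread> pool;
    long long chunk = n / workers + 1;
    for (unsigned t = 0; t < workers; ++t)
        pool.emplace_back([&, t] {
            long long lo = t * chunk;
            long long hi = std::min<long long>(n, lo + chunk);
            for (long long i = lo; i < hi; ++i) partial[t] += i;
        });
    for (auto& th : pool) th.join();
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

Both produce the same result; the difference is how the work is cut up, which is exactly the trade-off the chapter discusses.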

I must talk about how TBB is presented. In general, TBB is considered a serious choice for multithreaded programs (almost every program in the book uses it), and from what I've read of the documentation, it deserves it. To be honest, TBB makes me want to master C++ templates, and I totally understand why you would choose it if you are familiar with them.

But I wish James had spent more time and love presenting the alternatives. I know he's an Intel guy, but TBB clearly receives a lot of emphasis, and it totally overshadows the others. I think that's quite a shame, because a lot of other chapters talk about TBB too. Alternatives are "presented", but not in as much depth as TBB.

Anyway, you get the message: TBB is good, eat some!

Chapter 2: Houdini: Multithreading Existing Software (by Jeff Lait)

A lot of the multithreading information you can find here and there, even in books (including this one), is very academic. It explains how you should write code.

This chapter is interesting because it takes the opposite approach: how to convert old code to modern multithreading. Jeff presents various concrete problems, how they chose to solve them, what they lost in doing so and how, when possible, they could still save what could be saved.

What I really love in this chapter is how unacademic it is. Some of the example code solutions make you realize: "Yeah! I could do that... It's weird, but it works pretty well!". The best example is how Houdini handles thread creation time (because creating a thread is actually slow) when the data to compute isn't worth the thread creation cost. Easy trade-off: if data_count < 5000, compute in the main thread!

You will never see this kind of solution in an academic book, but you will never find anything more efficient than that! :D
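A minimal sketch of that trade-off (the 5000 threshold comes from the chapter; the function names and the per-element work are hypothetical): only spin up threads when the element count makes the startup cost worth it.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical per-element work.
void smooth(float& v) { v = std::sqrt(v * v + 1.0f); }

// Below this element count, thread startup costs more than it saves.
constexpr std::size_t kParallelThreshold = 5000;

void process(std::vector<float>& data) {
    if (data.size() < kParallelThreshold) {
        // Cheap job: stay on the main thread.
        for (float& v : data) smooth(v);
        return;
    }
    // Big job: split into chunks across worker threads.
    unsigned workers = std::max(1u, std::thread::hardware_concurrency());
    std::size_t chunk = data.size() / workers + 1;
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < workers; ++t)
        pool.emplace_back([&, t] {
            std::size_t lo = t * chunk;
            std::size_t hi = std::min(data.size(), lo + chunk);
            for (std::size_t i = lo; i < hi; ++i) smooth(data[i]);
        });
    for (auto& th : pool) th.join();
}
```

Ugly in an academic sense, but both paths produce identical results and the small case never pays the threading tax.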

Each case is interesting. While not all of them will be relevant to your own situation, reading them all trains your brain on why you should avoid doing things a certain way, and how to react when you see them.

I must confess I had read the original Side Effects paper after Siggraph 2013. This made me realize some work has been done to reorganize and better explain the examples. I remember being pretty lost with some examples in the original paper, but it was not a problem with the book, so I guess those examples were better explained (or my general programming skills have improved since then).

This chapter was frustrating for me because I wanted to know more! More examples, more tricky problems to solve, more "here, boys, is what you get if you write code this way".

If you want to know more about how Houdini works under the hood, you can find this very interesting post.

Chapter 3: The Presto Execution System: Designing for Multithreading (by George ElKoura)

Interactivity is the key to this chapter.

The first part is not very interesting. George explains the technical goals and constraints they had. This is good, but he spends too much time on things you often already know, without explaining their specific implications in Presto ("vectorization is important because blah blah", "you should organize your data because blah blah"). It's kind of redundant with the first chapter. I would have put all the general multithreading material in the first chapter and let the case study focus on implementation.

However, the second part is very interesting. Here are the smart choices they had to make. The way they handle threads in Presto (especially how graph computation is separated) is very different from an "academic" graph network. Indeed, Presto has to deal with "time", because time is the animator's toy. He also talks about interactivity and the relationship between "batch" threads (responsible for math computation) and UI threads, and how they tried to avoid any kind of lag in interactivity (to interrupt or not to interrupt, that is the question). It's very well explained, though not detailed enough for my taste. They finish with a small (two-page) concrete code example of how a "naive" geometric deformer can be multithreaded. It's an interesting code snippet because it presents a general problem (but he gives a "general" solution).
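The book's snippet isn't reproduced here, but the shape of the problem is easy to sketch (a hypothetical deformer of my own invention): each point's displacement is independent of the others, so the point loop is trivially parallel.

```cpp
#include <cstddef>
#include <vector>

struct Point { float x, y, z; };

// Hypothetical "push" deformer: offset every point along Y.
// Each point is independent, so the loop can be parallelized naively;
// the pragma is simply ignored when compiled without OpenMP support.
void push_deform(std::vector<Point>& pts, float amount) {
    #pragma omp parallel for
    for (long i = 0; i < static_cast<long>(pts.size()); ++i)
        pts[i].y += amount;
}
```

Real deformers are rarely this friendly (neighborhood lookups, shared state), which is exactly why the chapter's "general" solution to the general problem is worth reading.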

If you want an overview of how Presto works under the hood, you will like this chapter. "Overview" is the right word here, because George only focuses on multithreading (it's what this book is supposed to do, after all), but he often implicitly refers to things that (I guess) you could only understand if you were familiar with Presto (what does a rig look like in Presto?!). This "abstraction" of some technical details can be frustrating.

I also felt an "everything was fine and rosy" tone, as if there were no complications, no multithreading-specific "problems", no "what a silly idea we had there" moments. I mean, in development, there are always moments where the vision and the abstract concepts face reality and involve serious trade-offs, moments where the academic approach can't be applied as-is. This is even more glaring when you compare it to the following chapter.

Chapter 4: LibEE: Parallel Evaluation of Character Rigs (by Martin Watt)

This one surprised me a lot! It shines in many places!

From the first pages of the chapter, I was not very excited. LibEE is presented as a multithreaded dependency graph... You could almost read: "Maya's DG is not relevant anymore, so there was no alternative: we had to write our own". I'm not a big fan of the classic "dependency graph" pattern. I'm not sure it's a good approach for multithreading, and the more I read the chapter, the more it confirmed my humble opinion (but you know, I'm not working at DreamWorks, so there are maybe a lot of valid reasons...). While the DreamWorks team seems to have done a big job bypassing some of the limitations a dependency graph can have, many sections made me realize "why" a DG is not straightforward for massive multithreading. I tend to prefer a compiled approach like VEX, or what the Presto team chose, but reading the rest of the chapter, I'm not sure the DreamWorks R&D team had all the needed discretion (and resources?) to choose such an approach.

And this is one point where this chapter is amazing: you "feel" the production. I mean, the deadlines, the riggers ranting over the new tools because-it-was-better-to-use-the-old-shitty-Maya-DG-before. I'm a big fan of the various short excerpts of discussions between the various people involved (the copies of rigger emails made me ROFL). It also explains how they had to train riggers to think multithreaded, and the various benchmark tools they provided to riggers to find where the critical paths were and where to "cut" them (yet another reason to stay away from the DG, IMHO). LibEE follows the typical VFX situation where tools are developed and used at the same time (unlike the Presto team, which seems to have worked far from production before release, even if this is not clearly stated). Handling such a situation is an art from a development perspective, as goals often change, and this chapter explains (a little) how to handle it.

The second impressive point is the experience they share from the tests they did in production. I was sad that the Presto chapter only mentioned a few technical points to be aware of in general (NUMA, bandwidth, Hyper-Threading, etc.) without providing any graphs, while the DreamWorks team not only explains, but shows where these points are problematic, how they benchmarked them, and what they observed. And they went very far on this point. Once again, this is not academic; these are true use cases, dissected.

So, a very cool and clear chapter! One of the most valuable in this book. I would have loved the same level of in-depth discussion for Presto.

Chapter 5: Fluids: Simulation on the CPU (by Ronald D. Henderson)

lol, this one is massive. If you love code, math and computer science, you will love it!

There are some interesting OpenMP/TBB comparison code samples (with TBB using C++11 lambdas; it was the first time I'd read such a thing!). If you are not comfortable with C++, it's going to be hard (and kind of disorienting).
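To give the flavor of such a comparison (not the book's code): OpenMP annotates the loop with a pragma, while TBB takes the loop body as a C++11 lambda. Since real TBB code needs the library installed (it would call `tbb::parallel_for` over a `tbb::blocked_range`), the second version below uses a hypothetical stand-in helper with a similar call shape, built on `std::thread`.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// OpenMP style: annotate the loop (pragma is ignored without -fopenmp).
void scale_omp(std::vector<float>& v, float s) {
    #pragma omp parallel for
    for (long i = 0; i < static_cast<long>(v.size()); ++i) v[i] *= s;
}

// TBB style: pass the body as a lambda. This helper only mimics the
// call shape of tbb::parallel_for using plain std::thread.
void parallel_for(std::size_t n,
                  const std::function<void(std::size_t, std::size_t)>& body) {
    unsigned workers = std::max(1u, std::thread::hardware_concurrency());
    std::size_t chunk = n / workers + 1;
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < workers; ++t)
        pool.emplace_back([&, t] {
            body(t * chunk, std::min(n, (t + 1) * chunk));
        });
    for (auto& th : pool) th.join();
}

void scale_tbb_style(std::vector<float>& v, float s) {
    parallel_for(v.size(), [&](std::size_t lo, std::size_t hi) {
        for (std::size_t i = lo; i < hi; ++i) v[i] *= s;
    });
}
```

Same loop, two idioms: the pragma hides the threading entirely, while the lambda version makes the range splitting explicit, which is roughly the stylistic contrast the chapter's samples show.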

About the math, I must admit I smiled a lot every time I realized "I don't get any of this!" after reading a few lines.

In other chapters of the book, a lot of the explanation is presented with sentences like: "We measure core efficiency by dividing the benchmark time by the number of cores" (you know, something humanly readable). In this chapter, such sentences are presented as a tiny mathematical formula followed by: "Where p is the relational infrastructure derived from the inverse logarithm of the number of cores and n is the benchmark time expressed in seconds (ISU-CIPM 1967)". Yeah, I exaggerate of course, but as I'm not used to mathematical formulations, this is exactly how I felt reading this kind of sentence for the first time. And I laughed even more when, after a few minutes of digging, I realized all of it could be said with a simple sentence. Haha!
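For the record, the humanly readable version really is that short. With T1 the time on one core and Tp the time on p cores (symbols mine, not the chapter's):

```cpp
// The "simple sentence" version of the usual scaling metrics:
//   speedup             S = T1 / Tp
//   parallel efficiency E = S / p   (1.0 means perfect scaling)
double speedup(double t1, double tp) { return t1 / tp; }

double efficiency(double t1, double tp, int p) {
    return speedup(t1, tp) / p;
}
```

For example, 10 s serial and 2.5 s on 8 cores gives a 4x speedup, i.e. 50% efficiency: the extra cores are only half used.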

But I've had the opportunity to work with mathematicians (or people who love math in general), and they tend to use math equations to model anything (even things as trivial as the average time people spend at lunch, or arriving late to a meeting), so it reminded me of all those funny moments.

I also really loved the quick OpenVDB presentation. I had often heard about OpenVDB but never knew how it worked under the hood. This tiny presentation was perfect. There are a lot of widely used open source libraries in CGI, but they lack easy-to-get presentations, and there are a lot of CG artists who, while not true developers, are very technical and would benefit from advanced presentations of such libraries in their day-to-day work. So, VFX artists, this presentation is for you!

For the rest, it was too mathematical for me. If you find it hard to read math equations, you will be lost in this chapter. I did, so I can't really comment on how valuable it is. Actually, I think you will be fine with it if you are familiar with all the "fluid/liquid" simulation math.

Chapter 6: Bullet Physics: Simulation with OpenCL (by Erwin Coumans)

What a cool chapter! I was not expecting this one.

I'm not that interested in physics in general, but I must admit Erwin did a great job dissecting his nice library. It's also the only chapter talking about OpenCL and low-level GPU computation. While it's a very fast overview, it's well described, and every step of the various collision detection/response computations of the Bullet library that have been parallelized is well highlighted. This gives a very interesting overview of how a physics engine can work, which is a pleasant thing to know in VFX.
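To situate one of those steps, here is a toy sketch of the broad phase (my own simplification, not Bullet's code): cheap bounding-box tests filter out pairs before the expensive narrow-phase contact computation, and since each pair test is independent, this stage maps well to GPU parallelization, Bullet's real implementation using much smarter structures than this brute force.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Axis-aligned bounding box: the broad phase works on these cheap
// proxies instead of the full geometry.
struct Aabb { float min[3], max[3]; };

bool overlaps(const Aabb& a, const Aabb& b) {
    // Boxes are disjoint if they are separated on any axis.
    for (int axis = 0; axis < 3; ++axis)
        if (a.max[axis] < b.min[axis] || b.max[axis] < a.min[axis])
            return false;
    return true;
}

// Brute-force broad phase: collect candidate pairs for the narrow
// phase. Every (i, j) test is independent of the others, which is
// what makes this stage friendly to massive parallelism.
std::vector<std::pair<std::size_t, std::size_t>>
broad_phase(const std::vector<Aabb>& boxes) {
    std::vector<std::pair<std::size_t, std::size_t>> pairs;
    for (std::size_t i = 0; i < boxes.size(); ++i)
        for (std::size_t j = i + 1; j < boxes.size(); ++j)
            if (overlaps(boxes[i], boxes[j]))
                pairs.emplace_back(i, j);
    return pairs;
}
```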

There are also a lot of tiny and very clear diagrams. I tend to prefer many tiny, single-problem diagrams over bigger ones; you can see, in the Presto chapter, places where some extra diagrams would have been interesting.

It finishes with some more advanced principles.

One of the clearest and most pleasant chapters to read.

Chapter 7: OpenSubdiv: Interoperating GPU Compute and Drawing (by Manuel Kraemer)

I was waiting for this one impatiently, and I've not been disappointed: a very cool chapter. It's composed of two parts.

The first part talks about iterative Catmull-Clark subdivision: its history, why a "universal-matching" subdivision is needed, how it works, and how they multithreaded it. If you are not familiar with hard-edge geometry data structures, you will certainly be lost. They also give some shy benchmarks and conclude that the iterative subdivision design, while a necessity for offline rendering, does not fit realtime constraints well. I was surprised they didn't talk about how they organize memory to improve caching and benefit from vectorization. I'm not an expert, but after reading the previous chapters, I wondered if there weren't other things to do to improve performance. I guess they are just not presented because the benefits were not that interesting, who knows... But having distantly followed OpenSubdiv's development, I have the impression they jumped quickly to the GPU side, as it was maybe a more important step for the studio (Presto seems to rely on it), leaving the CPU code as "good enough".

The second part talks about how OpenSubdiv uses the GPU (and OpenGL) to offer realtime subdivision, using the realtime tessellation shaders available in OpenGL. Once again, I enjoyed the GPU presentation and how you are supposed to organize your batches to benefit from it (and deal with strong GPU limitations). Edge creases (a pivotal feature of OpenSubdiv) seem to have been a problem (as does the triangle/quad/polygon question), but I really like how they handled it: they separate the geometry into sets and apply different algorithms to each. The separation process is done once, as the topology doesn't change, and the GPU simply recomputes the tessellation shader according to the face set. An interesting and pragmatic approach for applying the most efficient algorithm to various parts of a single geometry (though I guess it must complicate the code).
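The separation step could look something like this (a hypothetical sketch of the idea, not OpenSubdiv's actual classification or API): faces are bucketed once by the features that decide which kernel/shader applies, and the buckets are reused every frame since the topology is static.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical face description: just enough to pick a kernel.
struct Face {
    int vertex_count;  // 4 for a quad, anything else is irregular
    bool has_crease;   // face touches a creased edge
};

enum class FaceSet { RegularQuad = 0, Creased = 1, Irregular = 2 };

// One-time partition (topology doesn't change between frames): each
// returned set can then be drawn with the algorithm best suited to it.
std::vector<std::vector<std::size_t>>
partition_faces(const std::vector<Face>& faces) {
    std::vector<std::vector<std::size_t>> sets(3);
    for (std::size_t i = 0; i < faces.size(); ++i) {
        FaceSet s = faces[i].has_crease        ? FaceSet::Creased
                  : faces[i].vertex_count == 4 ? FaceSet::RegularQuad
                                               : FaceSet::Irregular;
        sets[static_cast<std::size_t>(s)].push_back(i);
    }
    return sets;
}
```

Paying the classification cost once, and letting the per-frame GPU work stay uniform within each set, is the pragmatic part I liked.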

Off topic: I wonder if creases were a good idea for an open subdivision library. They fit very well with the limit surface concept dear to RenderMan, but:

This concept doesn't seem to fit well with raytracers, which need true geometry to raycast against.

It seems that you need high subdivision levels to benefit from strong (but still smooth) crease edges, where a simple bevel or double edge could do the job (though it certainly makes the geometry more complex).

It seems to complicate and fragment the subdivision code.

Fragmented code? Fragmented data structures? What impact could that have on memory (efficient access and size)?

If you can't subdivide like hell, a last option could be to represent creases by modifying the normal, but then you would have to deal with shading stuff, and this is not what OpenSubdiv is supposed to do (besides all the trouble you would have "merging" with the other shading normals).

So I wonder if creases weren't something Pixar (and mainly Pixar) absolutely needed, which would explain why such an "exotic" concept ended up in OpenSubdiv.

Note that the next OpenSubdiv is coming and seems to include contributions from other studios. It's a good thing, as it seems to be becoming a strong part of Maya. :)

Conclusion

Well... Good book! :D

There are inconsistencies between chapters. Some are very deep (LibEE, Fluids, Bullet), and others just touch the surface and finish too soon (Houdini, Presto).

So this book is exactly what it says: Multithreading for Visual Effects. It focuses on how multithreading can be (and is) used in VFX. But as the chosen software packages are very interesting (seriously: Side Effects, Pixar, DreamWorks!), it can sometimes be frustrating!

The book is very nice (hardcover!) while a bit pricey (hardcover?). But it covers a specific domain, so it's OK (is there any other book that talks about such software?). Most of the time it will be your studio buying it anyway... :P

Is it worth it? If you have to code on VFX software, don't hesitate. You will not find massive "ready to use" resources, but it's as if you were having a talk with these guys sharing their experience: they explain how their babies work, what is important and what is a mess. Not sure you can find such info anywhere else.

If you are an artist who enjoys technical stuff, I wouldn't recommend this book. It's very focused on development; you will not find multithreading tricks that will enhance your work. Anyway, I encourage you to pitch it to your R&D team and see for yourself! :P

After reading this book, I'm sure I would pay for books (even softcover!) that focus on the in-depth architecture of a specific VFX software package from a dev point of view (Houdini, Presto, Maya, Nuke). How data is managed, what the data structures look like, how specific kinds of operations are done, etc. And not just an overview!

Ever since my time on the artistic side, I've tried to understand how software works technically (because most of the time it was working badly / not as expected / I DIDN'T SAVE MY WORK!!!, etc.). I'm now able to understand the "logic" part of it, so I want to read more of this high-end code!

Unfortunately, I missed a lot of valuable information because of my lack of math skills. I'm more familiar with computer science in general. I mean, I know what 4x4 matrices are and what they are used for, I know what a dot/cross product is and how it can be used to achieve various things, but that's pretty much all. I was angry at myself for not being able to read and understand some advanced parts of the code (I skipped a lot of the Fluids chapter. :( ).

Gathering interesting Siggraph courses into a book is also an interesting idea. I really like it, though I would suggest favoring courses that go more in depth, or that come with a lot of implementation studies (once again, Presto is very frustrating). I also tend to prefer books over PDFs.

I don't know how successful this book is, but I've heard many people are interested in it, so we could hope for... let's say... "Alembic/OpenVDB/OpenEXR code review and use cases", with a lot of in-depth explanation of the foundations and infrastructure of each of these libraries. :D

Hope you enjoyed this humble review! I would be very interested to hear your opinion if you have also read this book, so feel free to comment. :)

Dorian