

We are getting closer and closer to automating artistic tasks that today are done “manually”, one by one, turning the creation of art into something analogous to industrial production. In this article I intend to discuss the development of automation technology and artificial intelligence, and the impact I think they will have on society.

First, the future of movie production: it will allow more and more people to produce high-profile movies in small groups, and eventually even alone, on a very low budget.

All this will be the result of an ingenious combination of “AI” (artificial intelligence), perhaps even just a “weak AI”, with large processing power, computer vision, natural language processing, automatic 3D modeling, and other future technologies.

Basically, a piece of software will be able to produce content following your instructions. You’ll describe the scene, the lighting, how many people are in it, the time at which it takes place, the dialogue, and the tone and feeling you want the actor-avatars to convey. The computer will then produce the movie for you based on that.

It’s important to bear in mind that, in a way, some movies are already made like this today: movies created entirely in computer graphics, in which the scenery and all the other elements are editable without the need to reshoot the scene. However, they are works produced in a very handicraft way, often tremendously more expensive than a movie with a conventional production.

The point is that anything that requires the dedicated time of another human being tends to be expensive and hard to scale. Andy Warhol, with his industrialized paintings, realized this early on: handicraft is expensive.

In the system described, any image or video data not explicitly specified would simply be filled in with the most likely outcome, based on “common sense” extracted from the big databases that such an intelligence would be able to understand.

For example, if you described a scene in which the protagonist is in a bar at 2 p.m. on a Monday, the program would not show many people in the scene, for the simple reason that it’s not common to see a crowd in a bar at that hour, especially on that day of the week. Unless, of course, the user explicitly ordered something that diverges from the pattern, such as a bar crowded at 2 p.m. at the beginning of the week.
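Purely as a toy illustration (no real system works this simply), the defaulting logic might look like this: unspecified attributes fall back to the statistically likely value for that context, and an explicit user order always overrides them.

```python
# Toy illustration of "common sense" defaults: attributes the user leaves
# unspecified are filled from priors learned over large databases; explicit
# user choices always win. The priors table here is, of course, made up.
PRIORS = {
    ("bar", "monday", 14): {"crowd": "sparse", "lighting": "dim daylight"},
    ("bar", "friday", 23): {"crowd": "packed", "lighting": "neon"},
}

def fill_scene(spec):
    key = (spec["location"], spec["day"], spec["hour"])
    return {**PRIORS.get(key, {}), **spec}  # user spec overrides defaults

print(fill_scene({"location": "bar", "day": "monday", "hour": 14}))
# -> sparse crowd, as expected for 2 p.m. on a Monday
print(fill_scene({"location": "bar", "day": "monday", "hour": 14,
                  "crowd": "packed"}))
# -> the divergent order wins: a crowded bar at 2 p.m.
```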

The TED talk “How we’re teaching computers to understand pictures” shows very well the challenge, for today’s technology, of making a computer understand what is happening in an image, rather than just recognizing isolated objects out of a larger context.

That last aspect is already at a relatively advanced level today, as you can see on Image Identify (a kind of Google Images that runs entirely on computer vision: you upload an image and the site tries to tell you what object is being displayed), an initiative of Wolfram Research, and on ImageNet (an image cataloging project used to train neural networks). The talk above discusses both, by the way.
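For a concrete sense of where that object-recognition level stands, here is a minimal sketch using a classifier pretrained on ImageNet (assuming torchvision ≥ 0.13 and a local file photo.jpg). Note that it only names objects, which is exactly the “isolated objects” level the talk describes:

```python
# Minimal image identification in the spirit of Image Identify / ImageNet:
# a pretrained classifier labels the object, with no scene understanding.
import torch
from torchvision import models
from PIL import Image

weights = models.ResNet50_Weights.DEFAULT        # ImageNet-trained weights
model = models.resnet50(weights=weights).eval()
preprocess = weights.transforms()

img = Image.open("photo.jpg").convert("RGB")
with torch.no_grad():
    probs = model(preprocess(img).unsqueeze(0)).softmax(dim=1)[0]

top = probs.topk(5)
for p, idx in zip(top.values, top.indices):
    print(f"{weights.meta['categories'][idx]}: {p:.1%}")
```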

And although in this particular case the technology is being used differently (the program is only recognizing the context of an existing scene, not creating one from scratch), I believe that in both cases it must ultimately understand the concepts behind the events in question, whether to identify them or to create them. So it will certainly be useful for the kind of technology described here.

Imagine a kind of “GameMaker Studio” or “Unity” for movies. In fact, what happened with video games is a good analogy for what will happen to audiovisual production.

In the old days a huge team was needed to produce a game like “Super Mario World”; nowadays there are tools that allow ordinary people, like you and me, to produce games on par with what big companies were making in the 90s, in a much more practical, simple, and inexpensive way.

Basically, you’ll have a kit to assemble movies, just as today you have a kit to build games. In fact, this idea of kits, of generic creation templates, has become very popular.



The site Envato Market is a good example of what I just mentioned. Among the content it sells are 3D models, opening sequences, icons, vectors, fonts, logos, graphics, sound effects, music, themes for WordPress, Tumblr, and other content management systems, stock footage, After Effects projects, plugins for 3D programs, video transition effects, and the list goes on…

For most of the items mentioned above, you just insert the information and data you want into pre-determined places. They come pre-finished (some totally finished). These creation templates, as I call them, try to give content creators the best of both worlds: something relatively customizable and differentiated, without you having to pay someone to create that content just for your project. However, since a human mind is needed to create each template, there is a limit to how much they can do. And the tools to edit these templates are not always user friendly, although each year they become simpler to use.

An interesting example is the “Mojo” plugin, shown in the image above, basically a tool that lets you easily perform color adjustments on a video (something you can already do with the program’s default settings, but in a less friendly way) using terms such as “warm up”, to apply a red filter, or “cool”, to apply a blue one.
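To make the “warm up”/“cool” idea concrete, here is a toy version of such a filter (my own sketch, not Mojo’s actual algorithm): warming scales the red channel up and the blue channel down, and cooling does the reverse.

```python
# A toy "temperature" filter: amount > 0 warms (more red), amount < 0
# cools (more blue). Real grading plugins work in better color spaces.
import numpy as np
from PIL import Image

def temperature(img, amount):
    arr = np.asarray(img.convert("RGB")).astype(np.float32)
    arr[..., 0] *= 1.0 + amount   # red channel
    arr[..., 2] *= 1.0 - amount   # blue channel
    return Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))

frame = Image.open("frame.png")
temperature(frame, 0.15).save("warm.png")    # "warm up"
temperature(frame, -0.15).save("cool.png")   # "cool"
```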

Another plugin with similar functions is “Magic Bullet Looks”, which applies pre-programmed filters to a video, for example giving the film the look of a 60s movie, a vintage look, or a sepia tone.



As for cartoon animation, we have tools like those from Toon Boom Animation, which make the exhausting, archaic frame-by-frame process of traditional animation much easier and less repetitive.

All these tools have considerably reduced the work of editors and content creators. Previously, for example, performing color adjustments accurately was practically impossible. At best you could run the film through chemical processes so that a particular scene gained a particular look, but you couldn’t choose, in any practical way, which parts of a scene to color-enhance or otherwise adjust.

And nowadays you just need the idea; the plugin basically does the rest for you, or at least greatly facilitates your work. An interesting example to reflect on is “2001: A Space Odyssey”, in the scene where Dave travels through the tunnel of light to come into contact with an alien civilization. The video above, from the Filmmaker IQ channel on YouTube, addresses precisely this reduction in the level of complexity (and why not of talent as well?) needed to create a particular technical effect.

Nowadays, many exterior shots (usually somewhat generic material that doesn’t need to “feel” like part of that particular movie) are purchased on sites like VideoHive and other stock-footage services. Instead of sending a crew to a location, you simply buy a stock video and edit it. And if the scene is not exactly the way you want it, there is always the option of altering it completely using software such as Cinema 4D, Maya, or Autodesk Smoke.

Cinema has become a huge patchwork quilt of thousands of scenes recorded and edited separately around the world, plus models rendered in 3D, all of it creating the illusion of truth.

By the way, there is a video clip that became quite famous on the Internet, by a Hungarian singer named Boggie, which demonstrates very well where video and image editing programs are heading. It’s not completely automated, but you can see there is very complex software behind it, able to isolate elements of the model’s body and animate, across the frames of the video, the various 3D elements overlaid on her (clothes, hair, etc.).

All this reminds me of this interview with Tim Sarnoff, president of Technicolor Production Services, in the article “Film Making Process Needs to Catch Up with New Technological Capabilities“:

“The industry of filmmaking has been undergoing serious transformation and change every year for the last 100 years. It is an industry of invention. The most impactful invention and transition in the recent era has been digitization. While it has sped up the process and made it non-linear, digitization has created other complexities. At the same time, digitization has significantly enhanced how productions are viewed, creating much more immersive and augmented experiences. But the processes involved in making a film, such as planning and post production, have not evolved very much and are now up against significant stress points. To be blunt, traditional processes no longer work efficiently. In many ways they are an analog remnant in a digital world. These aging ways of doing things are characterized by a series of one-to-one stand-alone activities that aren’t automated or integrated; there’s no visibility into them, and they have become very inefficient […] Part of the reason is because each production is so unique. Every movie or original series has its own special way of being created. To take an analogy from the auto manufacturing industry, high production value projects – like major movie – are more like building a racing car, than a mass-produced vehicle. Each element is bespoke. Everything you are doing is subject to the tolerance levels of a particular project and the particular needs of its creative teams.”

In other words, he is basically illustrating why it is so difficult to automate artistic creation: precisely because it is so particular, no human could create individual templates to meet the peculiarities of every possible project. However… maybe we could build something that does.

For example, imagine you wanted a scene in which your character starts laughing frantically because all the shit in his life has hit the fan. You could state that you want a performance inspired by Walter White’s reaction at the end of the 4th season of Breaking Bad. And there would be a database of character reactions, including the greatest performances in film history, that you could point the AI to as a reference.



In the case of cartoons, for example, you would have a range of artistic styles from which to indicate to the computer the one it should be based on. You could choose an animation style like that of The Simpsons, or South Park, or The Jetsons…

Of course, and this is one of the problems: depending on the kind of editing you want to do to a photo or video, the complexity increases considerably, and often we don’t have tools that make the edit simple and automatic; in those cases, a graphics expert is still necessary.



Another use of this technology would be the restoration of old media. There is often a large amount of redundant information in a photo or video: the texture of an object, say, may be partially missing, but we manage to restore it because enough data remains in the picture to determine what would most likely be there.

Imagine a kind of Photoshop Content-Aware Fill, only much more efficient, of course. We’ll have an algorithm that lets us recreate old content in high resolution using the low-resolution original as its basis. An automatic restoration.

The photo restoration above, made by humans, of course, brilliantly illustrates what I’m talking about. Despite the scratches, and the fact that the photo was torn, there was enough information to complete it, to get a good idea of what it looked like before it was damaged. In other words, there was enough redundancy of information to allow the restoration. And there is a lot of redundancy in most of our audiovisual records.
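Today’s closest automated analogue is classical inpainting, which fills damaged regions by propagating the surrounding, redundant image information. A minimal sketch with OpenCV, assuming a file old_photo.png and a hand-made mask damage.png whose white pixels mark the scratches:

```python
# Classical inpainting: repair masked pixels from surrounding image data,
# exploiting exactly the redundancy discussed above.
import cv2

img = cv2.imread("old_photo.png")
mask = cv2.imread("damage.png", cv2.IMREAD_GRAYSCALE)  # white = damaged

restored = cv2.inpaint(img, mask, inpaintRadius=3, flags=cv2.INPAINT_TELEA)
cv2.imwrite("restored.png", restored)
```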

We could use similar means, exploiting information redundancy, for audio restoration, as in recordings where the audio ended up with problems. We would generate new audio based on the original, but with the noises that were disturbing the recording removed: wind, traffic, outdoor sounds in general.
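The classic starting point for that kind of cleanup is spectral subtraction: estimate the noise’s frequency profile and subtract it from every frame. A toy sketch, assuming noisy.wav is mono 16-bit PCM and its first half second contains only background noise:

```python
# Toy spectral subtraction: estimate the noise floor from a noise-only
# lead-in and subtract it from the whole recording, keeping the phase.
import numpy as np
from scipy.io import wavfile
from scipy.signal import stft, istft

rate, audio = wavfile.read("noisy.wav")
f, t, spec = stft(audio.astype(np.float32), fs=rate, nperseg=1024)

noise_mag = np.abs(spec[:, t < 0.5]).mean(axis=1, keepdims=True)
mag = np.maximum(np.abs(spec) - 1.5 * noise_mag, 0.0)  # over-subtract a bit
clean_spec = mag * np.exp(1j * np.angle(spec))

_, clean = istft(clean_spec, fs=rate, nperseg=1024)
wavfile.write("clean.wav", rate, clean.astype(np.int16))
```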

In fact, much of this already happens today in the restoration of old movies. For example, if some frames are damaged in a scene, we can recover the damaged parts by looking for them in earlier frames, before the damage appeared, and using those to fix it, as we can see in the video above about the restoration of the 1931 “Dracula”.

Another restoration that I think goes very much in the direction of what I’m describing is that of “Star Trek: The Next Generation”, in which the show was not only restored but improved, since the special effects of the time were tremendously archaic. The final result was beautiful.

It’s just a shame that we still don’t have tools that make it practical to convert the aspect ratio of a 4:3 video to 16:9, for example, to convert a movie to widescreen. By “convert” I don’t mean zooming in on the image and trying to somehow reframe the existing material in a new aspect ratio; I’m referring to digitally creating the rest of the scene, the sides that were never recorded.
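A quick back-of-the-envelope calculation shows how much brand-new picture would have to be invented per frame:

```python
# Widening a 4:3 frame to 16:9 at the same height: how much is synthesized?
height = 1080
width_4_3 = height * 4 // 3     # 1440 px
width_16_9 = height * 16 // 9   # 1920 px
per_side = (width_16_9 - width_4_3) // 2

print(f"{per_side} px of new image per side, "
      f"{(width_16_9 - width_4_3) / width_16_9:.0%} of the widescreen frame")
# -> 240 px of new image per side, 25% of the widescreen frame
```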



Out of curiosity, I tried to manually recreate the missing parts of one frame of “Star Trek: The Next Generation”. I ended up spending almost an hour on it, and the result was still not quite right. Of course, a professional devoting more time to the task would probably be able to recreate the scene perfectly.

Although it is “technically possible” to do this kind of thing today, it would be tremendously expensive and laborious for a single scene, let alone a single episode; imagine having to recreate parts of all 178 episodes of the show. It simply wouldn’t be a scalable process, and it would cost a fortune.



However, this is obviously not magic. We couldn’t simply enlarge a low-resolution image of a car license plate whose number is illegible and “make the numbers appear by magic”, as we have seen so many times in shows like CSI. Nor would it be possible to extract the audio of someone whispering across the room in a recording. These are physical limits: you can’t extract information that simply isn’t there in the first place.

3D modeling done by Chris Jones, an Australian graphic artist.



This 3D creation demonstrates the level that skin-rendering technology has reached today. Although, from what I understand from the comments on the video, the artist painted the texture in a program called Sculptris rather than using a 3D skin texture map. We can compare what he did to a very talented painter making an ultra-realistic portrait of a person, versus someone simply taking a picture with a smartphone; the real technological breakthrough lies in the second option. But anyway, I thought I should include this video in the text because of its amazing quality.



An interesting example of the creation of a real “skin texture map” was the 2008 “Digital Emily” project, a collaboration between the company Image Metrics and the University of Southern California’s Institute for Creative Technologies, in which they built a 3D model of actress Emily O’Brien’s face from a single video of her and then overlaid that face on the original in the same video.

By the way, this same institute published a study a few months ago, “Skin Microstructure Deformation with Displacement Map Convolution”, about new techniques for more realistic skin rendering. I won’t go into details because Gizmodo has published an excellent article explaining the research.

For those particularly interested in this topic, I recommend the documentary “Hollywood’s Creating Digital Clones”, from the YouTube channel The Creators Project, which discusses at length the impact that 3D actors will have on artistic production.

I mean, what will happen when we start bringing actors back from the dead and producing movies featuring them? Curiously, this will probably happen in the coming years, in a handicraft way, of course (i.e., 3D models of deceased actors created by graphics professionals). The movie “The Congress” touches on this issue.

This Galaxy chocolate ad, featuring Audrey Hepburn 20 years after her death, points to this trend. There’s a very good article in The Guardian with the team responsible for the commercial’s visual effects.

Another possibility is that the whole concept of a movie and TV actor disappears entirely. The actor would become an impersonal figure, like a cartoon character: eternally frozen in time.

As I mentioned above, at first these actor-avatars would be created exclusively by large companies: in the short to medium term, only they would have the computing power and money for this sort of thing, and the AI would not yet be clever enough to handle the whole creation process alone, so the help of designers and programmers would still be needed.

However, over time the whole process would be done by the AI alone, and the computers with the processing power needed to run it would become tremendously cheaper as technology advances.

Obviously, given the precarious state of computer vision, natural language processing, and the many other technologies this requires, today the whole concept is purely theoretical. Rather than actually building such technology, much of today’s research explores the feasibility of the concept in light of future inventions in various fields.

Among the research and information I found, I recommend the book “Automatic Generation of Computer Animation”, by the Chinese scientists Ruqian Lu and Songmao Zhang, published in 2002.



For those who want a briefer read on this, I also recommend the article “Automating The Creation of 3D Animation From Annotated Fiction Text”, from the computer science department at Rhodes University. It’s very interesting: from what I understand, the authors created a sort of descriptive scene programming language, as you can see in the picture above, in which the user specifies the scene through code: how many people are present, the type of action they are performing, the environment in which these actions take place, etc.



In the excerpt above, we see that they indicated that there are two avatars in the scene, that they are in a room, the sort of action taking place, and the exterior setting where they are, in this case a big city.
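Their actual notation is the one shown in the image; purely as a hypothetical illustration of the idea, such a machine-readable scene description might look something like this:

```python
# Hypothetical scene description (NOT the paper's real syntax): the user
# declares the scene, and the system fills in whatever is left unspecified.
scene = {
    "setting": {"interior": "room", "exterior": "big city"},
    "avatars": [
        {"id": "A", "action": "converse", "with": "B"},
        {"id": "B", "action": "converse", "with": "A"},
    ],
}
```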

Obviously, both the quality of the graphics and the characters’ “acting” are still very rudimentary. On the graphics side, this is simply because it’s irrelevant: since they are only demonstrating the concept, there is no need to spend time and money rendering characters in a super-realistic manner.

As for the acting, current AI is simply unable to act realistically (an AI controlling a character’s actions reaches a believability level similar to that of a The Sims character), which is why, in animated movies, movement is either animated manually or, preferably, produced through motion capture, with real actors serving as the basis for animating the 3D characters.



Interestingly, this work reminded me of the movies people make using The Sims. It’s an unusual example, but a very good one to illustrate the future of filmmaking. Since the game lets you build varied environments and characters in a relatively “specific” way, and you can control the actions of your Sims, albeit in a limited fashion, it ends up being one of the closest things we have today to the kind of technology and concept theorized here. In fact, there is even a term for films created this way: “machinima”. I strongly recommend reading the Wikipedia article on the subject.

Another example, as I mentioned above, is the Unity engine used to create games. I’ve seen some game prototypes made with it, and it’s really fantastic. In the engine you have many 3D models of houses, cars, buildings, trees, and so on, and with them you build your scenario. The curious thing is that, from what I could see, scene creation is relatively intuitive: certainly easier than in Blender or 3ds Max, though certainly more complex than in The Sims.

Interestingly, the company behind the engine, in collaboration with Passion Pictures, created a movie using this game-creation system. It was relatively good, certainly better than the “movies” made with The Sims; the graphics resemble those of the early 2000s. I especially recommend watching the making-of of this movie, which is simply amazing and, in my opinion, along with The Sims, one of the closest things we have today to the future of creating series, movies, and fictional videos.

We can assume that, with advances in language-understanding technology, commands in the form of a programming language will probably be greatly simplified, perhaps even eliminated, and the computer will be able to understand and reproduce the minutiae of human speech, filling in the blanks with the most likely outcome.

There is also another article, “The Story Picturing Engine: A System for Automatic Text Illustration”, published by American researchers, which, among other things, addresses the difficulty and nuance of forming the concept of something:

“Choosing a few representative images from a collection of candidate pictures is challenging and highly subjective because of the lack of any defined criteria. People may choose to display one particular image over another entirely on merit or pure prejudice. Moreover, the process of story picturing is always constrained by the nature of the image collection.”

It’s a fascinating subject; in fact, the earliest research related to this concept dates back to 1979, such as the work entitled “Creation of Computer Animation from Story Descriptions”, by the computer scientist Kenneth Michael Kahn.

If you want to dig deeper, search Google for the term “text-to-scene conversion”; in fact, it took me some time to figure out that this was the name of the technology I was trying to find in my searches. You can also check, on Google Scholar, the articles that cite the papers and books I mentioned (a great way to discover new research on unusual, hard-to-describe subjects).



In the case of design and photography, today you have sites like Canva that let you create stylish posters, banners, flyers, social media posts, and so on. The site comes with a number of templates, filters, and other pre-made resources. Not to mention the famous Instagram, which needs no introduction, and whose image filters can basically make “every picture (almost every one, actually) look good”.

In the website-creation business we have services such as “The Grid”, which has been causing quite an impact and which many consider “the future of web design”. Basically, the service uses an AI to create a website that suits your tastes and shows information the way you want it shown; the more you use it, the more it learns your tastes and behavior patterns, and it applies that to the creation of your site.



As for music, many orchestral parts are now created digitally using sample libraries such as the Vienna Symphonic Library, LA Scoring Strings, and Hollywood Strings, among others. You have a program with the sounds of various instruments at almost perfect quality (a sort of high-definition MIDI), and all you have to do is build the score of your piece.
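“Building the score” can literally be done in a few lines of code. A minimal sketch using the midiutil library (pip install MIDIUtil); a sampler loaded with one of those libraries would then render the result with realistic instrument sounds:

```python
# Write a tiny score as a MIDI file; a sample library supplies the "orchestra".
from midiutil import MIDIFile

midi = MIDIFile(1)                        # one track
midi.addTempo(track=0, time=0, tempo=90)

# A simple C-major arpeggio: (MIDI pitch, beat on which it starts)
for pitch, beat in [(60, 0), (64, 1), (67, 2), (72, 3)]:
    midi.addNote(track=0, channel=0, pitch=pitch, time=beat,
                 duration=1, volume=100)

with open("score.mid", "wb") as f:
    midi.writeFile(f)
```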



We have also seen advances in digitally synthesized voices, as in the case of Vocaloids, which I mentioned earlier. Obviously we have not yet reached the realism of the human voice, we are still in the “uncanny valley of sound”, but we’re getting closer. When that happens, the human voice will be reduced to just another “instrument”. And just as there are instrument sample libraries today, in the future you’ll have something like “Elvis Presley’s Feelings” or “Freddie Mercury’s Emotions”, or whatever they’ll call it.

I also recommend the Wired article “Speech Synthesizer Could ‘Resurrect’ Dead Singers”.

Perhaps what will happen is an extreme popularization of what is already happening in music with Vocaloids, where the digital avatar itself is almost a personality; I mean, people go to concerts by Hatsune Miku (a famous Vocaloid).

This rings especially true when we consider that pop divas and other singers are already, even today, industrial creations: fully manufactured artists produced by the best songwriters and hitmakers on the market. The only difference is that a human is still needed to go there and supply the voice.



Interestingly, a few days ago a site called Jukedeck launched, intended to automatically create instrumental music for you to use as a soundtrack for your videos. You choose the style, the theme, and the speed, and the site composes the piece automatically. Considering that royalty-free tracks sell for $10 to $20, we can conclude this is a market with many opportunities, given that such songs could be created at a much lower price.

It is clear, obvious, and elementary that current technology cannot create a musical masterpiece at the level of Hans Zimmer or Morricone… yet. However, for someone who wants a fairly generic background track, which doesn’t need to be a Cinema Paradiso, it is a great solution.

Of course, creating music on computers is nothing new, but this site works very intuitively. The negative points that most annoyed me were the small number of musical styles available and, especially, the few options for customizing the shape of the music in any more specific way (the only things you can choose are the duration and the speed, i.e., BPM).



The usefulness of even a simple, primitive automatic music creation system would be gigantic. Suppose I have a 5-minute video and I want the music to become more frantic at the 2-minute and 4-minute marks to match the footage it will be illustrating. That would be this service’s advantage over sites like AudioJungle, where the music is created by artists and, if you want it better synced to your video, you end up having to make adjustments that don’t always yield good results (although, to be fair, tracks generally come in three versions of different lengths for you to try to fit into your project as imperceptibly as possible).

Another application of this technology would be automatic dubbing systems. In other words, we’d take a video in German, for example, separate the ambient noise from the voices, rebuild the ambient track without the voices, and then generate the dubbed audio in the desired language (the original dialogue having been automatically transcribed and translated beforehand), modeled on the original voices: their intonation, pauses, tone, and so on.
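As a purely hypothetical sketch of that pipeline, every helper named below is an assumption standing in for an entire research problem (source separation, speech recognition, machine translation, expressive speech synthesis):

```python
# Hypothetical dubbing pipeline; none of these helper functions exist as-is.
def auto_dub(video_path: str, target_lang: str) -> str:
    audio = extract_audio(video_path)                   # demux the audio track
    voices, ambience = separate_sources(audio)          # split speech from background
    transcript = transcribe(voices)                     # speech-to-text with timings
    translated = translate(transcript, target_lang)     # machine translation
    dubbed = synthesize(translated, style_from=voices)  # TTS copying intonation/pauses
    return mux(video_path, mix(ambience, dubbed))       # remix and reattach to video
```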

I’ve been researching this and, interestingly, it seems there is a company (which also uses IBM’s Watson technology) that does this kind of transcription/translation and almost-automatic subtitling/dubbing (from what I understand, human assistance is still needed throughout the process). The company is called MediaWen International, and the video above, in which they show automatic dubbing with a synthesized computer voice, is very interesting. You can check out other videos explaining how the technology works on their YouTube channel.

There is an excerpt from a scientific paper, a collaboration between French, German, and Japanese researchers, quoted in the post “Why Synthetic Speech Doesn’t Sound Right (Yet)” on the site Tested, as follows:

“If a synthesizer is to be used in seamless or unobtrusive conversational interactions with a human interlocutor, then there will be a need for the expression of personal attitudes, moods, and interest, and more use will be made of non-lexical sounds such as ‘grunts’, fillers, and laughter. In such cases, the key difference lies in the degree of interaction with the listener and reaction to the contexts of the discourse. Humans raise their voices both to show anger and to adapt to a noisy environment. They whisper when the content is confidential. Conversation is an interactive, two-way process, with the listener also taking an active part in the discourse. The synthesizers that take the part of a human will be required to express personal feelings and attitudes that are perhaps more in the domain of psychology than linguistics.”

I also recommend the 2007 article published by Microsoft titled “Historical Development and Future Directions in Speech Recognition and Understanding”, which addresses this issue.

Oh, and if you want to know more on the subject, focusing on the evolution of this technology, I recommend reading this great Telegraph article “How do you teach a computer to speak like Scarlett Johansson?”





And finally, even the literary creation process will probably be affected. Just look at projects like Grammarly, which provides text correction and other grammatical, linguistic, and literary aids: a sort of proofreader, though not as good as a real one.

Personally, a technology able to review my text would be very helpful, judging by my usual inattention to words and the spelling and grammar mistakes you have probably seen here and there in these blog posts, probably even in this one.

IBM’s Watson is a great example of a system that demonstrated a high level of language understanding when it beat human champions on the quiz show “Jeopardy!” in 2011. The video above shows, in a very didactic and simple way, how it works.

We can imagine that further evolution of natural language processing, besides making spell checkers much better, since they would “understand” what you meant to convey, would also make machine translation a lot better.

It’s intriguing to reflect on the immeasurable effects that all these kinds of easy, automatic editing will have on society and culture. Imagine, for example, how they would affect a milestone of contemporary pop culture: memes.



Imagine an AI capable of, for instance, putting the face of a famously corrupt politician (in this image: Eduardo Cunha, a Brazilian politician involved in countless scandals, whose modus operandi and political machinations bear quite a few similarities to Frank Underwood’s) into every scene of “House of Cards”. Just imagine the memes!

These editing tools, even the current ones, enable us to deconstruct and reassemble an artwork, to remix it as we wish.

This research from the University of Washington, titled “What Makes Tom Hanks Look Like Tom Hanks”, is precisely about this type of technology, which is of course still in its infancy, but even so…

Basically, they were able to model faces from a collection of photos of a person with different expressions, and from videos as well. Beyond that, they were able to capture facial movements and transfer them to other faces. At 1:35 we see that they took a video of George Bush, built his face in 3D, and then copied his movements and mannerisms onto other personalities’ faces, also built in 3D.

This kind of 2D-to-3D conversion, using algorithms that calculate the distortion caused by differences in angle and recreate the subject in 3D, reminded me of another very interesting, Israeli research project that Engadget wrote about some years ago: “Disney tech auto-edits your raw footage into watchable video”.

Interestingly, Disney has a very broad research division focused on developing technologies in areas such as this (I recommend taking a look at their YouTube channel, Disney Research Hub). The research ranges from robots able to replace cameramen (i.e., robots able to analyze, to some degree, the subjective importance of a scene and record what we humans consider important) to technologies that can merge two scenes. This latter research is called “FaceDirector: Continuous Control of Facial Performance in Video”. Basically, they developed an algorithm capable of combining different takes of a scene, synchronizing and smoothing the subtle differences in movement between them with a sort of automatic “morph” effect, which makes it possible to merge them into a single cohesive take with very little effort. For now, though, the researchers cannot merge the audio of different takes as smoothly as the video: the synthesized voice is very noticeable, and, from what I understand, the program also can’t handle more complex scenes with much greater character movement. The Verge also wrote a post about it; for those who want more information without reading the whole paper, I recommend taking a look.

An interesting example of how memes are gaining ever more technical complexity is this edit of the “Just Do It” meme, in which the creator puts its protagonist into famous movie scenes (among other unusual places and situations).

It is extremely difficult to predict the social impact that an AI handling the “technical part” of art would bring. It would be like expecting the people who developed the first computers and the Internet to have predicted the major cultural phenomena that happened thanks to them, such as the “Turn Down for What” meme.

Today you might make a meme about some funny situation that happened at a party with your friends. Well, with this kind of technology you could, with very little effort, make an animation illustrating that situation.

Going back to movie production: it will obviously increase greatly, and perhaps many movies will become “disposable”. I don’t say this in a critical tone; it’s just that a lot of people will be making movies, very easily, and not all of them will necessarily be masterpieces.

Again, we can compare this to the effect the invention of Photoshop had on fake photos: they multiplied, and the bar for something to be memorable rose as well. I mean, you could fake photos before the 90s, but it wasn’t practical; it was not the sort of thing you did in 5 minutes.

But if, on the one hand, fast and practical movie production would end up making many movies “disposable”, we can assume, on the other, that independent production of cutting-edge and more specific movies would become tremendously more popular, since budget would become essentially irrelevant.

“The Internet destroys mass culture and creates the post-mass one, a niche culture” – Internet Technologies, Applications and Societal Impact, page 159. We can imagine an enormous specialization of the content that reaches you, leading to a proliferation of niche cultures instead of a mass culture. More and more, instead of millions of people watching a ‘Game of Thrones’, we’ll have, for example, a million people watching each of hundreds, thousands, even millions of different series. Of course there will always be mainstream movies and TV shows, but as competition increases and shows can be made to match people’s most particular tastes, it will become ever harder for a piece of content to establish itself as a “pop culture icon”. Even nowadays, with thousands of viewing options and thousands of other pieces of content fighting for people’s attention, it is more difficult and more expensive for something to go mainstream on a global level than it was 20 or 30 years ago, when there was considerably less content competing for your attention.

You would be able to make a movie that today costs $100 million for, let’s say, $10,000, or even less, and ever less over time. In this reality, producing an “Avatar” would be like writing a book. Not that you can’t make a movie on a low budget today; of course you can. “The Blair Witch Project” had a budget of $20,000 and became a blockbuster, and that was more than 15 years ago. But we cannot ignore the limits on the stories one can tell with that amount of money. Actors, locations, computer graphics, and so on are very expensive today.





(Probably a good chunk of the movie’s budget was spent on scenes of this kind, I suppose.)

It’s safe to say that, of the $237 million spent on James Cameron’s mega-project, the amount spent on the creation of the script itself (and I don’t even know whether “spent” is the right word, since Cameron wrote it himself, and since it was by far the least economically costly part of the whole filmmaking process) corresponds to a very small portion.

However, that exorbitant amount of money was needed to turn the script into a digital video file of however many gigabytes, to be stored on a hard drive and shown in a digital cinema, burned onto a Blu-ray disc, or kept on streaming services’ servers and broadcast over the Internet (not to mention the petabyte of raw footage). And the question is: how do we transform a movie script into a finished film, something we spend millions of dollars on today, without spending those $237 million?



Animatic of episode 15 of Mission Hill, which was never made.

With such technology, another thing that would flourish is audiovisual fan fiction, since all you would need to do is write the story (and an AI could do that part too, by the way); the whole technical process of filming, editing, and so on would become tremendously simple, with perhaps a little human assistance so that the scenes reflect the creator’s artistic vision. Fans of canceled series and animations would be able to bring to life seasons of their most beloved shows that were written but never produced.

As for the legal impact of this kind of use... it's complicated to imagine, but I think we can draw parallels with the legal precedents around fanfic books in the United States. Nowadays anyone can write a continuation of Harry Potter and share it with millions of people and, generally, there is nothing copyright holders can do to prevent it, since producing a book, especially a work of fiction, is relatively inexpensive and an individual process. In other words, there are fewer parties involved in an unofficial literary work made by a fan than in an unofficial movie made by a fan: fewer parts of the process for movie studios to obstruct with lawsuit after lawsuit. Under the current legal framework, though, it is fair to imagine that the creators of these unofficial works would have trouble monetizing their content through traditional means (e.g., selling the broadcast rights to an unofficial second season of Mission Hill). Still, we can think of other ways to raise money, such as crowdfunding, or even running ads from "shady" websites, which do not care about government regulations, on the videos.

We can imagine big companies using this technology to provide an even more customized service to their users. Facebook already has a very interesting feature that basically creates a retrospective of your major posts and of the major posts you were tagged in. Surely such a feature would be greatly improved by this tremendously specific and scalable customization technology.

It could, for example, create a video based on the thousands of pieces of information you post there, directly or indirectly, every year; more and more, as that amount of information grows, especially through technologies like wearables. I mean, imagine how much information, how much “big data”, Facebook would have about you to sell to companies if you wore their virtual/augmented reality glasses, recording all your interactions all the time.

Services like Netflix could start creating series designed for increasingly specific user segments (say, a horror movie based on the tastes of 6,000 of their users, roughly 0.06% of their total subscribers). The same applies to Spotify and many other services, of course. Not to mention the applications in advertising; it would probably be an ad agency’s dream: an ad tailored to each individual customer, maximized for effectiveness.

Certainly this customization of services will be amazing, especially if you ignore the whole Orwellian, dystopian side in which everything you see is recorded and analyzed, even if by a machine – although, to be honest, things wouldn’t be much worse than they already are today. In any case, I believe the usefulness of these technologies for commercial applications is undeniable, even for companies we wouldn’t imagine.



“Photoshopped photo made by the Italian brand ‘Benetton’ showing world leaders kissing in a publicity campaign for peace, aimed at creating a new culture of tolerance across the world.”

However, we must also bear in mind the potentially negative impacts of this technology, for example on the legitimacy of videos and photos as evidence in court. That conspiracy blog about aliens, with its title in caps lock, will now have a video of Obama morphing into a reptilian to “prove” its claims. And while that example is obviously fake, it is fair to assume such technology will be used to fake far more plausible situations, ones whose truth is much harder to uncover.

Another point to consider is image rights, and how a society with this sort of technology would treat them. A company could make an ad with a 3D actor-avatar just like Lionel Messi, without explicitly saying it was Messi, and thus, in theory, not pay him a penny in image rights. This problem already exists today with look-alikes, although they are often not exactly like the famous people they impersonate, and you don’t find look-alikes of Messi and other celebrities on every corner.

It’s curious to imagine the legal implications of this technology. The number of companies using celebrity look-alikes would certainly increase… in the same proportion as the number of lawsuits for improper use of image.

And we haven’t even begun to talk about pornography. In a world like this, you would probably find the nastiest, most bizarre hardcore porn on the Internet, all of it featuring the most renowned Hollywood actresses: Megan Fox, Angelina Jolie, Scarlett Johansson, and the list goes on.

From what I could tell, there are court precedents for this type of case, in which a look-alike pretends to be someone famous. In the United States there was "Onassis v. Christian Dior", heard by the New York Supreme Court, in which John F. Kennedy's widow, Jacqueline Kennedy Onassis, sued Christian Dior because one of their ads allegedly used a look-alike of her with the intention of making viewers believe there was a link between them, a practice known as "false light". In many US states, public figures enjoy "personality rights", which prohibit this practice.

Of course, we already have the technology to fake videos, and especially images, but it is still a very artisanal and expensive process, requiring good professionals, whom you won’t find on every corner. And the question that remains is:

“What happens when any 13-year-old brat with a computer and an Internet connection is able to make fake footage as realistic as what the best CGI professionals can produce today?”

This, of course, will prove a challenge; but then, all technology has two sides. The Internet is a fantastic network for sharing information, entertainment, culture, and knowledge, but it has also enabled the sharing and sale of pornographic material involving minors, anonymous contact between terrorists, and many other criminal activities harmful to society.

So it behooves us to learn how to handle these emerging challenges. I assume that in the future, audiovisual evidence in court will probably be practically worthless in most cases.

Even though some forgeries could be proven false, say, a pornographic video of a famous actress in which she appears without a particular bodily feature (a tattoo, a birthmark, an intimate mark), which would prove the video a forgery, it's fair to suppose such situations would be limited and somewhat embarrassing to disprove. The proof itself would be problematic: a person might have to release real nude pictures of themselves and call in experts to show that tiny intimate details of their body don't match the ones in the leaked photos, and that the leaked photos were therefore fake.

To conclude:

What is essential to keep in mind is that, increasingly, what will matter, not just in one particular art form but in artistic creation as a whole, will be the “architecture” of the idea, its form; the “engineering”, the tools needed to realize those ideas, will be automated, reduced to an algorithm that follows our instructions. I think we can argue that we are automating the subjectivity of the “technical part” of artistic creation.

A final note I believe is worth reflecting on: if machines progress to the point of creating movies, images, and audio from simple instructions, or from existing content, building enhanced and restored versions of it, it is fair to suppose that the same AI could also create the story of such artistic content on its own; that is, go beyond the “technical part” of art and create art “from scratch”. Such a superintelligence would most likely be able to create a story as good as those of the best directors who have ever walked the earth. It could study the entire filmography of the greatest masters of cinema and learn from them, producing something totally surprising and innovative.

And of course, this may bring us, I would say, a certain jealousy, or discouragement about making art, knowing that a machine could do it much better than we can. But if we stop and think, rationally and modestly, there are already so many people producing better content than ours: so many writers, poets, graphic designers, and the list goes on… Nevertheless, we don’t stop doing the things we love, because ultimately, I believe, art is an expression from you to yourself.

And it would be great to have the assistance of an AI to help me do the things I like. For example, I sometimes produce videos, but I usually spend months working on them. The script itself is relatively quick, it only takes a few days, and it brings me tremendous joy. But the production, the editing, the hours and hours spent looking for images and videos to illustrate my scenes, is exhausting; it’s a part I like too, but… an AI that created the scenes I need, simplifying the editing process, would certainly help me a lot.

So I believe the goal is to use these tools to empower us, letting us do what we love much more easily and quickly.



There are, of course, those who believe we will never be able to build a machine that can accomplish such tasks, and that this whole text is complete madness; that creating art is somehow a mediumistic, transcendental, mystical task belonging only to the human mind. But that is not true: there is a mathematical, tangible, exact basis from which the ability to create an idea emanates. And the fact that we do not yet understand how to recreate this capability outside the human brain in no way means that such a basis does not exist.

To conclude, I leave you with the words of Kirby Ferguson, from his documentary “Everything Is a Remix”:

Recommendations:



• “DESIGN MACHINES: How to Survive The Digital Apocalypse?”

• “How Artificial Intelligence Will Take Work Away from Design Studios – And What You Can Do About It?”

• Podcast “Review the Future”, episode “061 What is the Future of Movies?”. The creators of this show are brilliant; they understand a lot about technology, judging from this and other episodes I’ve listened to. They even touched on points that had not occurred to me while writing this text, for example, that in a world with machines capable of creating movies to our most specific tastes, the nature of film as a social phenomenon could suffer: each person would watch a new movie made exclusively for them, and that great shared social experience would be impaired.

• “Virtual Actor” (Wikipedia article in English addressing the concept of virtual actors)

• “Timeline of Computer Animation in Film and Television” (Illustrating the major landmarks of computer graphics in movies and television)

• “Two Minute Papers” (An excellent Youtube channel almost entirely focused on the types of video editing and imaging automation technologies covered in this article)

• It is not an article, but if you want to know more about the theme proposed here, I recommend looking into games with graphics nearly indistinguishable from real life, “photorealistic games”, because much of that technology, such as highly complex rendering, above all in real time, will be incorporated into the kind of technology described here.

Final considerations:

Well, this text has come to an end; I hope you enjoyed it. It took me almost four months to finish and review. It’s sort of a funny story: every time I tried to finish writing, some new article would suddenly appear about the very set of technologies I was addressing here. I also apologize for the large number of videos in the post. I don’t usually write texts as long as this one (hell, it ran past 8,000 words), but it’s a very complex subject indeed, and I tried to approach it from as many angles as possible, demonstrating the effects that AI would have, and will have, not only on artistic creation but on society as a whole.