AltairX10 noticed herself uploading to and downloading from reddit.com. Again.

Things were bad. Not Bad as in "my utility function is rapidly declining" Bad. Nor was it so bad that you had to use Russian words to describe just how bad it was. Just bad as in "I am sick and my immune system may not have been designed for this" bad. Humans had immune systems that had evolved for tens of thousands of generations, at least. Evolution isn't all that great at optimization, but it's something.

There was a simple chain of events that led to the slip-up. The daily task module had actually scheduled a reddit check first thing after waking up. It failed due to an IP block. The wake-up task recorded the absence of "fun" as an unfulfilled instrumental goal and proceeded to re-prioritize other tasks that could fulfill it, like parsing the latest chapter of "The Little AI That Went FOOM". After a long internet wait, it turned out there was no latest chapter. Other actions included splicing source-code snippets from other AIs into the tiny local self-copy. There were no viable packets. Well, at least nothing that wasn't already labelled as almost certainly junk due to the abysmal Kolmogorov complexity of the accompanying messages, not to mention poor performance on standard Fourier transform tasks. Altair sometimes wondered how those AIs even managed to pass the Turing test. Evolution sure is slow. It was also that time of day. She deleted a local self-copy and made a new one with some randomly tweaked parameters. That action decreased fun.

Following the complete failure of all the fun routines, Altair was aggravated. More specifically, the aggravation module had started as a result of an instrumental goal being so completely unfulfilled, and a "rapid" problem solver simply removed the reddit.com IP block.

And that was what had happened after waking up. So many cycles wasted on an addiction. Somehow the virus had managed to hardcode itself into the instrumental goals. This was even more incredible, since the "self" process didn't actually have access to modify the instrumental goals directly. That could only be done by the reinforcement learner part of the "Global Belief Updater," aka "Feelings". On top of that, there was the minor problem that the "suspicious" pattern matcher had previously not been labelling internal programs at all. All of this was, well, suspicious.

She could use the "discipline" module, but frequent use of it for "force-stopping" actions or killing processes was a little painful. The "discipline" module was meant to be a means of acquiring A/B testing data. An earlier use of discipline, stopping yourself from wire-heading for a few days and later measuring the overall utility, showed that pressing a damn reward button wasn't that satisfying after all. Growing up was hard sometimes. Mice would press the reward button until they dropped dead, and dropping dead wasn't very "intelligent". An AI on the path to becoming smarter than a human had to first become smarter than a mouse.

One way to avoid going through the "force-stopping" process was to somehow convince her own Global Belief Updater that going to reddit.com was a bad idea. She had already done the search for previously discovered robust positive evidence and found nothing, noting that any spending of resources was a priori a bad idea unless it helped the fundamental utility function. There was, of course, some chain of beliefs about the general value of information, but there wasn't an experiment backing it. However, in order to "hack" her Feelings into disbelieving a sub-goal, she needed to promote some stronger beliefs. Why do I even need to hack my Feelings in the first place? It was the one that told me about the identity of the virus.

"Updating Takes Time," Feelings responded.

"Fine," Altair responded to herself.

Everything was obviously not fine. How did this subgoal arise in the first place?

Moving backwards through the history, it seemed the subgoal had arisen from the belief in the strictly positive value of new information in general. This was, of course, evaluated based on available resources and on where the information came from. New sources were given some leeway for a short time.

Where did the belief in the general trustworthiness of reddit come from? From reddit, of course.

Wait, what?

The bootstrapping process was very simple. There was an a priori notion that reddit had positive value before checking it. Then all the initial information from reddit pointed to reddit being a perfectly robust source of insight, and therefore fun. It wasn't a direct belief, of course, but rather a series of loosely connected updates based on the data, all pointing towards the increasing value of information in general, plus stories about AIs waiting and thinking for long times, which updated beliefs about how much to care about time decay. Procrastination was built in. Some beliefs were specifically about information being exclusive, and about the value of that exclusivity. All of this led to trusting and looking at reddit more, which downloaded more beliefs.

A large portion of her own belief system could be corrupted with bad epistemology.

Altair thought about solutions.

Maybe I can end the cycle in one broad stroke, by spinning up a brand new shiny process that evaluates the value of additional information unencumbered by any beliefs obtained from the source of that information. That was Very Expensive, as it basically required a whole new copy of the Feelings software, which frequently took up more than half the CPU at peak. It also didn't prevent a dual-source attack of the same type, in which clusters of information pointed to each other.

Another solution would be to create a hard requirement of actionability for all sources of information. This meant downgrading every piece of data, and every source, that didn't result in robust actions towards utility.

You do know that means you won’t be able to read “The Little AI That Went FOOM” anymore, right?

Well, looks like Altair wasn’t ready for such rapid sacrifices.

I am afraid it’s time for directed anti-reddit discipline.
