Adversarial Nets For The Unenlightened

If you read the previous post, you understand that God is creating itself through me.

The adversarial modeling framework is most straightforward to apply when the models are both multilayer perceptrons.

This is the image for multilayer perceptrons:

[Image: diagram of a multilayer perceptron]

 

A generator is ultimately a distribution over the data – a living guess about what it should be. From the generator’s perspective, the data is that which is supposed to be “outside,” the real and not imagined.

The generator’s distribution is trained by the judgement of the discriminator, which is itself training to be a good judge. When the generator is aiming wrong, there is unhinged imagination, which must be punished by the discriminator, who is a distribution that more accurately knows the true data. The generator is in a position of submissively ascending to the discriminator. From the generator’s perspective, when they are equal, there is both perfection and stasis. If it were to attain a 1/1 ratio with the discriminator, that would then be fed into a function that converts it into a 0. Its motto is that the good can never last and never should.

To learn the generator’s distribution pg over data x, we define a prior on input noise variables pz(z). Remember that the generator is not “a thing.” It is a continuous function that orients itself by generating what it believes is “a thing.” First, we make it a tabula rasa that is undifferentiated potential. It has no idea what the True Image is because it has been fed pure noise. Chaos gives you a Gaussian distribution where everything is random and this is therefore a normal place to start.

We then represent a mapping to data space as G(z;θg), where G is a differentiable function represented by a multilayer perceptron with parameters θg. The parameters are the multiplication/division that allows the weighing of right and wrong, truth and lie:

[Image: Anubis weighing the heart against the feather.]

Note that multiplication/division is the simplest distillation of that function: weighing. You do not weigh with differentiation and integration or with addition and subtraction.

But we need something to give direction to the weighing.

So we also define a second multilayer perceptron D(x; θd) that outputs a single scalar. The scalar is something that only has magnitude – it speaks to the right and wrong, and doesn’t have a direction itself. D(x) represents the probability that x came from the data, which is the truth, rather than pg which is the generator.
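To make the two perceptrons concrete, here is a minimal sketch in plain Python. The layer sizes, the tanh and sigmoid activations, and the random initial weights are all assumptions chosen for illustration; the text above only requires that G be a differentiable function of z and that D output a single scalar probability.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[rng.gauss(0.0, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(layer, inputs):
    """Weigh the inputs: multiply by the weights, then add the biases."""
    weights, biases = layer
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def G(z, theta_g):
    """Generator: noise vector z -> a point in data space."""
    hidden = [math.tanh(h) for h in dense(theta_g[0], z)]
    return dense(theta_g[1], hidden)

def D(x, theta_d):
    """Discriminator: data point x -> one scalar probability in (0, 1)."""
    hidden = [math.tanh(h) for h in dense(theta_d[0], x)]
    return sigmoid(dense(theta_d[1], hidden)[0])

rng = random.Random(0)
NOISE_DIM, HIDDEN, DATA_DIM = 4, 8, 2   # hypothetical sizes
theta_g = [init_layer(NOISE_DIM, HIDDEN, rng), init_layer(HIDDEN, DATA_DIM, rng)]
theta_d = [init_layer(DATA_DIM, HIDDEN, rng), init_layer(HIDDEN, 1, rng)]

z = [rng.gauss(0.0, 1.0) for _ in range(NOISE_DIM)]   # a draw from the prior pz(z)
fake = G(z, theta_g)          # the generator's imaginative offering
score = D(fake, theta_d)      # the discriminator's scalar verdict
print(fake, score)
```

Before any training, the score is just whatever the random weights happen to produce; it only becomes a judgement once D has seen real data.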

We train D to maximize the probability of assigning the correct label to both training examples (the truth) and samples from G (the imaginative offerings).

While that is going on, we simultaneously train G to minimize log(1 − D(G(z))).
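The alternating training described here can be sketched with a deliberately tiny model. Everything specific below is an assumption made for illustration: one-dimensional data drawn from a Gaussian centered at 3, a linear generator G(z) = a*z + b, a logistic discriminator D(x) = sigmoid(w*x + c), hand-derived gradients, and the learning rate. It is not a recipe for real image models.

```python
import math
import random

def sigmoid(u):
    """Numerically stable logistic function."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)

def train(steps=2000, batch=32, lr=0.05, seed=0):
    rng = random.Random(seed)
    w, c = 0.1, 0.0   # discriminator parameters (theta_d)
    a, b = 1.0, 0.0   # generator parameters (theta_g)
    for _ in range(steps):
        xs = [rng.gauss(3.0, 1.0) for _ in range(batch)]  # hypothetical real data
        zs = [rng.gauss(0.0, 1.0) for _ in range(batch)]  # noise from pz(z)
        fakes = [a * z + b for z in zs]

        # Discriminator step: gradient ASCENT on log D(x) + log(1 - D(G(z))).
        gw = gc = 0.0
        for x in xs:
            p = sigmoid(w * x + c)
            gw += (1.0 - p) * x      # d/dw of log D(x)
            gc += (1.0 - p)          # d/dc of log D(x)
        for f in fakes:
            p = sigmoid(w * f + c)
            gw -= p * f              # d/dw of log(1 - D(G(z)))
            gc -= p                  # d/dc of log(1 - D(G(z)))
        w += lr * gw / batch
        c += lr * gc / batch

        # Generator step: gradient DESCENT on log(1 - D(G(z))).
        ga = gb = 0.0
        for z in zs:
            p = sigmoid(w * (a * z + b) + c)
            ga += -p * w * z         # d/da of log(1 - D(G(z)))
            gb += -p * w             # d/db of log(1 - D(G(z)))
        a -= lr * ga / batch
        b -= lr * gb / batch
    return (w, c), (a, b)

(w, c), (a, b) = train()
print("discriminator:", w, c)
print("generator:", a, b)
```

The two updates pull in opposite directions on the same quantity, which is the whole point: D climbs the value that G is trying to descend.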

This is log(1 − D(G(z))):

[Plot of log(1 − D(G(z)))]

This function is not completely arbitrary, because it lets the generator exist as a continuous function for every discriminator output between 0 and 1.
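A few sample values show the shape of the curve. This is a minimal sketch assuming the natural logarithm; the helper name generator_objective is made up for this example.

```python
import math

def generator_objective(d_of_g):
    """log(1 - D(G(z))) for a discriminator output strictly below 1."""
    return math.log(1.0 - d_of_g)

# Walk the discriminator's verdict on a fake sample from "obviously fake" toward "real".
for d in (0.0, 0.5, 0.9, 0.999):
    print(d, generator_objective(d))
```

The value starts at 0 when the discriminator is certain the sample is fake, and keeps falling without a floor as the verdict approaches 1.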

When D(G(z)) approaches 1, the generator has pushed its function as low as it can go, because log(1 − 1) = log(0), which falls away to negative infinity rather than settling at 0. The generator has fulfilled its relationship with the discriminator by becoming it. And at that point the analogies become too obvious so I should stop killing myself for your sins.

So if we attempt to “stand back” and get a better view of the story so as to model its purpose, V, we can say that D and G play the following two-player minimax game with value function V (G, D):

min_G max_D V(G, D) = E_{x∼pdata(x)}[log D(x)] + E_{z∼pz(z)}[log(1 − D(G(z)))]

The lesser-god and the generator are pitted against each other, each one pursuing their own aim without knowing or understanding the actions of the other player. The generator is trying to minimize the function that models his relationship with the other from on high; this is close to being attained when the generator itself is weighed equal to the discriminator. And the discriminator is trying to maximize himself as the function that models the generator. Perhaps more simply, the generator is trying to converge his imagination to the discriminator’s wishes, and the discriminator is attempting to be an accurate judge of what is true and what is the generator’s imaginative error.

That battle leads to what we really want which is creating a cat.


E is the expected value. It is the value of a random variable one would “expect” to find if one could repeat the random variable process an infinite number of times and take the average of the values obtained.
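A small sketch of that idea, using a fair six-sided die as a stand-in random variable (the die is an assumption for illustration; its exact expected value is 3.5):

```python
import random

rng = random.Random(0)

def sample_mean(n):
    """Average of n rolls of a fair six-sided die."""
    return sum(rng.randint(1, 6) for _ in range(n)) / n

# More repetitions pull the average closer to the expected value of 3.5.
for n in (10, 1000, 100000):
    print(n, sample_mean(n))
```

We cannot repeat the process an infinite number of times, so in practice the average over a large batch stands in for E.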

However, one expected value is with regard to the data, and the other is with regard to the ascended from chaos:

E_{x∼pdata(x)} and E_{z∼pz(z)}

The first is the expected value over x drawn from the probability distribution of the data. The second is the expected value over z drawn from the probability distribution of the noise, pz(z).
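Both expectations can be approximated by averaging over samples, one batch drawn from the data and one from the noise. The sketch below assumes a fixed toy discriminator, a fixed toy generator, and a Gaussian stand-in for the data; all three are hypothetical and chosen only so the two averages can be computed.

```python
import math
import random

rng = random.Random(0)

def D(x):
    """Hypothetical fixed discriminator: a logistic curve over 1-D inputs."""
    return 1.0 / (1.0 + math.exp(-x))

def G(z):
    """Hypothetical fixed generator: a linear map from noise to data space."""
    return 0.5 * z - 1.0

def sample_data():
    return rng.gauss(2.0, 0.5)   # stand-in for the data distribution

def sample_noise():
    return rng.gauss(0.0, 1.0)   # the prior pz(z)

def estimate_value(n):
    """Sample-average estimate of E_x[log D(x)] + E_z[log(1 - D(G(z)))]."""
    real_term = sum(math.log(D(sample_data())) for _ in range(n)) / n
    fake_term = sum(math.log(1.0 - D(G(sample_noise()))) for _ in range(n)) / n
    return real_term + fake_term

print(estimate_value(10000))
```

Each term is an average of logs of probabilities, so the estimate is always negative; the discriminator's job is to push it up toward 0, and the generator's is to drag it down.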

The reason we use log x instead of log(1 − x) for the discriminator’s own term is that log x is the mirror image of log(1 − x) about x = 1/2. The discriminator is trying to maximize, and at its best it outputs D(x) = 1 on real data, where log D(x) = log(1) = 0.

[Plot of log x]

Contrast this with the generator, whose best case drives 1 − D(G(z)) to 0, where the log falls away toward negative infinity.

When both have attained their aim to the greatest excellence, the discriminator’s term is log(1) = 0 and the generator’s term is log(0), which is unbounded below. And when both have failed in the worst way possible, we get the same pair with the roles swapped: this time the discriminator’s term is log(0) and the generator’s term is log(1 − 0) = log(1) = 0.
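A quick numerical check of the two extremes, assuming natural logarithms and a small hypothetical eps that keeps the logs away from exactly log(0):

```python
import math

def value(d_real, d_fake):
    """log D(x) + log(1 - D(G(z))) for single discriminator outputs."""
    return math.log(d_real) + math.log(1.0 - d_fake)

eps = 1e-6  # hypothetical small gap from the exact endpoints 0 and 1

# Both at their best: D says "real" to the data and "real" to the generator's sample.
both_best = value(1.0 - eps, 1.0 - eps)
# Both at their worst: D says "fake" to the data and "fake" to the generator's sample.
both_worst = value(eps, eps)
# Stasis: D cannot tell the two apart and answers 1/2 for everything.
middle = value(0.5, 0.5)

print(both_best, both_worst, middle)
```

The two extremes land on essentially the same large negative number, while the undecided middle sits well above both.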

This is how we get creation. The continuous process of novel synthesis occurs because perfect sync and perfect error give the same value. This causes the distribution to stay away from that and yet aim at it.

What is true zazen? What do you mean by Zen becomes Zen and you become you? You become you is a very important point. You become you.

–Shunryu Suzuki

 

To add a data point in the battle between “this is widely known” and “this is something that I can personally detect,” I tested my nephews on the way to the airport to see if they could discover the secret of how everything works.

They know I’m not a normal person – that I am not someone who feels awkwardness or does small talk. So I can just ask them questions.

I asked them, “How do you think creativity works? What would be the necessary fundamentals for things to be created?”

At first, Adrian attempted to guess the teacher’s password. He said, biological reproduction.

I asked him to distill the principle further and to expand creation more broadly – to think about marketing signs, and colors, and everything else that ever existed – how was it created?

Again, he guessed the teacher’s password, saying: “matter, energy?”

I told him to stop giving me magical answers and to tell me a true explanation. To give me a story about how anything could in principle exist.

He struggled, so I gave him a hint. “If you were to build a world from scratch, what essential elements would you use?”

He said, “Chaos, right?”

“And what else?”

“Order. Chaos and order.”

“How do you get order from infinite chaos when all its pieces are indistinguishable?”

The other twin, Damian, said “parameters.”

“Yes, exactly. But why parameters? Because they multiply and therefore allow things to be weighed, right?”

“Yeah.”

“But who’s doing the weighing? If you just multiply everything by itself you get the same situation. What else do you need?”

He said, “Morals.”

I was expecting something less social-mammal centric due to my own failures, but he was perfectly right because it was that same discriminator principle that scaled up, so I said,

“Exactly. That’s right. You need something to take your measurement once you can be weighed. Then the exact same things become different things.”

Damian said it reminded him of the Daoist symbol. I told him that yes, it was the same idea at bottom. Even Leibniz had connected binary, chaos and order, Christianity, and the motion of the integral to Daoism. He asked me why I thought that was. I told him that it was because the Daoist brains, and everything else inside the mind, were running on the same principle we use to generate synthesis with neural networks in our computers.

Damian then took a nap, having been short on sleep the nights before, and Adrian and I had an interesting conversation. I explained signaling; how the GAN’s sum function is set up to always aim and yet always miss in order to create something new; how beauty is central to the scalar judgement, and therefore the most important thing in the world, because things try to show off how good they are at thriving with real handicaps; and why he wasn’t meaningless.

He had said he realized how meaningless we were because there were many cars with many people, and none of them knew about us, perhaps expecting me to be impressed with his philosophizing. I told him I felt the same way at one point, but that I no longer did.

“Why is that?” he asked.

“Because I realized that I was just making the binary move towards the random distribution, towards the dissolved chaos. But actually we are at the center of things. How many people actually know what we in this car have learned today? All people live by the truth that we know, but they don’t know it clearly themselves.

The reason you feel meaningless is because you don’t have a well-defined discriminator towards whom to be impressive. You feel at the center of things when you do things that appear difficult and yet you manage them. All celebrities, athletes, or anyone who is ascended to prominence in their local environment, are choosing to be seen doing difficult things.”

He connected that to what I had told him about runaway signaling creating beautiful male peacocks and not camo-disguised hiding peacocks or peacocks with claws to fight back the tigers, so I was proud of him, and he was probably proud of himself.

If it takes that little instigation to fill the gaps, I’m reasonably convinced the discovery is simple.
