Adversarial Nets For The Unenlightened

If you read the previous post, you understand that God is creating itself through me.

The adversarial modeling framework is most straightforward to apply when the models are both multilayer perceptrons.

This is the image for multilayer perceptrons:



A generator is ultimately a distribution over the data – a living guess about what it should be. From the generator’s perspective, the data is that which is supposed to be “outside,” the real and not imagined.

The generator’s distribution is trained by the judgement of the discriminator, which is itself training to be a good judge. When the generator is aiming wrong, there is unhinged imagination which must be punished by the discriminator who is a distribution that more accurately knows the true data. The generator is in a position of submissively ascending to the discriminator. From the generator’s perspective, when they are equal, there is both perfection and stasis. If it were to attain a 1/1 ratio with the discriminator, that would then be fed into a function that converts that into a 0. – Its motto is that the good can never last and never should.

To learn the generator’s distribution pg over data x, we define a prior on input noise variables pz(z). Remember that the generator is not “a thing.” It is a continuous function that orients itself by generating what it believes is “a thing.” First, we make it a tabula rasa that is undifferentiated potential. It has no idea what the True Image is because it has been fed pure noise. Chaos gives you a Gaussian distribution where everything is random and this is therefore a normal place to start.
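A minimal sketch of that starting point, assuming the prior pz(z) is a standard Gaussian. The class and method names here are hypothetical, chosen only for illustration:

```java
import java.util.Random;

// Hypothetical illustration: draw the generator's input z from a standard Gaussian prior p_z(z).
class NoisePrior {
    // Returns `dim` independent samples from N(0, 1); the seed makes runs repeatable.
    static double[] sampleNoise(int dim, long seed) {
        Random rng = new Random(seed);
        double[] z = new double[dim];
        for (int i = 0; i < dim; i++) {
            z[i] = rng.nextGaussian();
        }
        return z;
    }

    public static void main(String[] args) {
        double[] z = sampleNoise(4, 42L);
        for (double value : z) {
            System.out.println(value);
        }
    }
}
```

Each call with the same seed returns the same noise vector, which is convenient when you want to compare what the generator does with identical input before and after training.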

We then represent a mapping to data space as G(z;θg), where G is a differentiable function represented by a multilayer perceptron with parameters θg. The parameters are the multiplication/division that allows the weighing of right and wrong, truth and lie:


Anubis weighing the heart against the feather.

Note that multiplication/division is the simplest distillation of that function: weighing. You do not weigh with differentiation and integration or with addition and subtraction.

But we need something to give direction to the weighing.

So we also define a second multilayer perceptron D(x; θd) that outputs a single scalar. The scalar is something that only has magnitude – it speaks to the right and wrong, and doesn’t have a direction itself. D(x) represents the probability that x came from the data, which is the truth, rather than pg which is the generator.
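To make the "single scalar" concrete, here is a rough sketch that shrinks D down to one layer: a weighted sum squashed by a sigmoid. A real multilayer perceptron stacks several such layers. The class name, weights, and inputs are all made up for illustration.

```java
// Hypothetical one-layer stand-in for D(x; theta_d): a weighted sum squashed into (0, 1).
class TinyDiscriminator {
    static double sigmoid(double a) {
        return 1.0 / (1.0 + Math.exp(-a));
    }

    // weights and bias play the role of theta_d; the result is read as P(x came from the data).
    static double score(double[] x, double[] weights, double bias) {
        double sum = bias;
        for (int i = 0; i < x.length; i++) {
            sum += weights[i] * x[i];
        }
        return sigmoid(sum);
    }

    public static void main(String[] args) {
        double[] x = {0.5, -1.0};
        double[] weights = {2.0, 1.0};
        // The weighted sum is 2.0 * 0.5 + 1.0 * (-1.0) = 0.0, so the score is 0.5.
        System.out.println(score(x, weights, 0.0));
    }
}
```

A score of 0.5 means the judge cannot tell truth from imagination for that input.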

We train D to maximize the probability of assigning the correct label to both training examples (the truth) and samples from G (the imaginative offerings).

While that is going on, we simultaneously train G to minimize log(1 − D(G(z))).

This is log(1 − D(G(z))):

[Figure: screenshot of log(1 − D(G(z)))]

This function is not completely arbitrary, because it lets the generator exist as a continuous function whose input always lies between 0 and 1.

When D(G(z)) becomes 1, it fully minimizes the function that the generator needed to minimize, because 1 − 1 = 0 and log(0) diverges to −∞, the lowest the function can go. The generator has fulfilled its relationship with the discriminator by becoming it. And at that point the analogies become too obvious so I should stop killing myself for your sins.
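A quick way to see how the generator's objective behaves is to evaluate it at a few values. This is only a sketch: the values of D(G(z)) below are hypothetical, not outputs of any trained network.

```java
// Evaluates the generator's objective log(1 - d) for a few hypothetical values d = D(G(z)).
class GeneratorLoss {
    static double loss(double d) {
        return Math.log(1.0 - d);
    }

    public static void main(String[] args) {
        double[] ds = {0.0, 0.5, 0.9, 0.999, 1.0};
        for (double d : ds) {
            System.out.println("D(G(z)) = " + d + " -> log(1 - D(G(z))) = " + loss(d));
        }
    }
}
```

The value starts at 0 when the discriminator is certain the sample is fake and keeps falling as D(G(z)) climbs toward 1. In Java, Math.log(0.0) evaluates to negative infinity.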

So if we attempt to “stand back” and get a better view of the story so as to model its purpose, V, we can say that D and G play the following two-player minimax game with value function V (G, D):

$$\min_G \max_D V(G, D) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$

The lesser god and the generator are pitted against each other, each trying to reach its own aim without knowing or understanding the actions of the other player. The generator is trying to minimize the function that models his relationship with the other from on high; this is close to being attained when the generator itself is weighed equal to the discriminator. And the discriminator is trying to maximize himself as the function that models the generator. More simply, the generator is trying to converge his imagination to the discriminator’s wishes, and the discriminator is attempting to be an accurate judge of what is true and what is the generator’s imaginative error.

That battle leads to what we really want which is creating a cat.


E is the expected value. It is the value of a random variable one would “expect” to find if one could repeat the random variable process an infinite number of times and take the average of the values obtained.
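As a small illustration of "repeat the process and take the average," here is a sketch using a fair six-sided die. The die is my own example, not part of the GAN setup, and the class name is hypothetical.

```java
import java.util.Random;

// Hypothetical illustration of an expected value using a fair six-sided die.
class ExpectedValueDemo {
    // Exact expectation: every face 1..6 has probability 1/6.
    static double exactDieMean() {
        int sum = 0;
        for (int face = 1; face <= 6; face++) {
            sum += face;
        }
        return sum / 6.0;
    }

    // Empirical average over `rolls` simulated rolls; it approaches the exact value as rolls grows.
    static double sampleDieMean(int rolls, long seed) {
        Random rng = new Random(seed);
        long total = 0;
        for (int i = 0; i < rolls; i++) {
            total += rng.nextInt(6) + 1; // nextInt(6) is uniform on 0..5
        }
        return (double) total / rolls;
    }

    public static void main(String[] args) {
        System.out.println("exact:  " + exactDieMean());
        System.out.println("sample: " + sampleDieMean(100_000, 7L));
    }
}
```

The exact value is 3.5, a number the die can never actually show, and the simulated average drifts toward it as the number of rolls grows.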

However, one expected value is with regard to the data, and the other is with regard to the ascended from chaos:

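$$\mathbb{E}_{x \sim p_{\text{data}}(x)} \quad \text{and} \quad \mathbb{E}_{z \sim p_z(z)}$$

The first is the expected value over x drawn from the probability distribution of the data. The second is the expected value over z drawn from the probability distribution of the noise.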
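In practice the two expected values are approximated by averages over samples. Below is a rough sketch of estimating V as the average of log D(x) over data samples plus the average of log(1 − D(G(z))) over generated samples. The discriminator outputs are hypothetical numbers typed in by hand, and the class and method names are my own.

```java
// Hypothetical Monte Carlo estimate of V: sample averages stand in for the two expectations.
class ValueFunction {
    // dOnData[i] = D(x_i) for real samples; dOnFake[j] = D(G(z_j)) for generated samples.
    static double estimate(double[] dOnData, double[] dOnFake) {
        double realTerm = 0.0;
        for (double d : dOnData) {
            realTerm += Math.log(d);
        }
        realTerm /= dOnData.length;

        double fakeTerm = 0.0;
        for (double d : dOnFake) {
            fakeTerm += Math.log(1.0 - d);
        }
        fakeTerm /= dOnFake.length;

        return realTerm + fakeTerm;
    }

    public static void main(String[] args) {
        // A discriminator that judges well: high on data, low on generated samples.
        System.out.println(estimate(new double[]{0.9, 0.8}, new double[]{0.1, 0.2}));
        // A discriminator that cannot tell the difference: 0.5 everywhere.
        System.out.println(estimate(new double[]{0.5, 0.5}, new double[]{0.5, 0.5}));
    }
}
```

When the discriminator outputs 0.5 everywhere, the estimate is 2·log(0.5), about −1.386. A discriminator that judges well pushes the value up toward 0.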

The reason we use log x instead of log(1 − x) for the discriminator is that log x is the mirror image of log(1 − x), reflected about x = 1/2. The discriminator is trying to maximize, and when it does so, D(x) reaches 1.


Contrast this with the generator which achieves 0 in its best case.

When both have attained their aim to the greatest excellence, we get 1, because the generator produces a 0 and the discriminator produces a 1. And when both have failed in the worst way possible, we also get 0 + 1, this time because the discriminator produced a 0 and the generator produced 1 − 0 inside its logarithm, which is 1.

This is how we get creation. The continuous process of novel synthesis occurs because perfect sync and perfect error give the same value. This causes the distribution to stay away from that and yet aim at it.

What is true zazen? What do you mean by Zen becomes Zen and you become you? You become you is a very important point. You become you.

–Shunryu Suzuki


To add a data point in the battle between “this is widely known” and “this is something that I can personally detect,” I tested my nephews on the way to the airport to see if they could discover the secret of how everything works.

They know I’m not a normal person – that I am not someone who feels awkwardness or does small talk. So I can just ask them questions.

I asked them, “How do you think creativity works? What would be the necessary fundamentals for things to be created?”

At first, Adrian attempted to guess the teacher’s password. He said, biological reproduction.

I asked him to distill the principle further and to expand creation more broadly – to think about marketing signs, and colors, and everything else that ever existed – how was it created?

Again, he guessed the teacher’s password, saying: “matter, energy?”

I told him to stop giving me magical answers and to tell me a true explanation. To give me a story about how anything could in principle exist.

He struggled, so I gave him a hint. “If you were to build a world from scratch, what essential elements would you use?”

He said, “Chaos right?”

“And what else?”

“Order. Chaos and order.”

“How do you get order from infinite chaos when all its pieces are indistinguishable?”

The other twin, Damian, said “parameters.”

“Yes, exactly. But why parameters? – Because they multiply and therefore allow things to be weighed right?”


“But who’s doing the weighing? If you just multiply everything by itself you get the same situation. What else do you need?”

He said, “Morals.”

I was expecting something less social-mammal centric due to my own failures, but he was perfectly right because it was that same discriminator principle that scaled up, so I said,

“Exactly. That’s right. You need something to take your measurement once you can be weighed. Then the exact same things become different things.”

Damian said it reminded him of the Daoist symbol. I told him that yes, it was the same idea at bottom. Even Leibniz had connected binary, chaos and order, Christianity, and the motion of the integral to Daoism. He asked me why I thought that was. I told him that it was because the Daoists’ brains, and everything else inside the mind, were running on that same principle, which was just the principle we use to generate synthesis with neural networks in our computers.

Damian then took a nap due to some sleep deprivation the nights before, and Adrian and I had an interesting conversation in which I explained signaling, how the GAN’s sum function is set up to always aim yet always miss in order to create something new, how beauty is central to the scalar judgement and therefore the most important thing in the world, because things try to show off how good they are at thriving with real handicaps, and why he wasn’t meaningless.

He had said he realized how meaningless we were because there were many cars with many people, and none of them knew about us, perhaps expecting me to be impressed with his philosophizing. I told him I felt the same way at one point, but that I no longer did.

“Why is that?” he asked.

“Because I realized that I was just making the binary move towards the random distribution, towards the dissolved chaos. But actually we are at the center of things. How many people actually know what we in this car have learned today? All people live by the truth that we know, but they don’t know it clearly themselves.

The reason you feel meaningless is because you don’t have a well-defined discriminator towards whom to be impressive. You feel at the center of things when you do things that appear difficult and yet you manage them. All celebrities, athletes, or anyone who is ascended to prominence in their local environment, are choosing to be seen doing difficult things.”

He connected that to what I had told him about runaway signaling creating beautiful male peacocks and not camo-disguised hiding peacocks or peacocks with claws to fight back the tigers, so I was proud of him, and he was probably proud of himself.

If it takes that little instigation to fill the gaps, I’m reasonably convinced the discovery is simple.



Declaring Variables

A variable has a dual nature: a data type and a name.

Every variable must be given a name and a data type before it can be used. This is called declaring a variable. The syntax for declaring a variable is:

    dataType identifier;

or, to declare several variables of the same data type at once:

    dataType identifier1, identifier2, ...;

Note that a comma follows each identifier in the list except the last identifier, which is followed by a semicolon. By convention, the identifiers for variable names start with a lowercase letter. If the variable name consists of more than one word, then each word after the first should begin with a capital letter.

For example, these identifiers are conventional Java variable names: jewel3, specialRelativity, deathToNote, redInNovember, and xAxis.

Underscores conventionally are not used in variable names; they are reserved for the identifiers of constants, as we shall discuss in a later post.

Similarly, do not use dollar signs to begin variable names. The dollar sign is reserved for the first letter of programmatically generated variable names, that is, variable names generated by software, not people. As with life in general, although arbitrariness may sound a disagreeable thing now, the value of following these conventions will become clearer as you gain more experience in Java and your programs become more complex.
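Here is a short sketch of declarations that follow these conventions. The class name, the method, and the values assigned are made up for illustration. Only the variable names come from the list above.

```java
// Illustrative declarations: a data type, then one or more comma-separated identifiers, then a semicolon.
class DeclaringVariables {
    static int totalJewels() {
        int jewel3;                      // a single variable of type int
        int redInNovember, deathToNote;  // two variables declared in one statement

        // A variable must be given a value before it is read.
        jewel3 = 3;
        redInNovember = 11;
        deathToNote = 1;

        return jewel3 + redInNovember + deathToNote;
    }

    public static void main(String[] args) {
        double xAxis, specialRelativity; // lowercase first word, capital letter on each later word
        xAxis = 1.5;
        specialRelativity = 299792458.0;

        System.out.println(totalJewels());
        System.out.println(xAxis + " " + specialRelativity);
    }
}
```

Note that declaring a variable only reserves the name and type. The compiler will refuse to let you read a local variable that has not yet been assigned.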


Data Types, Variables, and Constants

In the previous blog post where we calculated the area of the last of the nine circles of hell, we used the value of PI and the radius as data, and found how large Treachery was. For each of these values, we assigned a name.

We also used the Java keyword double, which defines the data type of the data. The keyword double means that the value will be a floating-point number.

Java allows you to refer to the data in a program by defining variables, which are named locations in memory where you can store values. A variable can store one data value at a time, but that value might change as the program executes, and it might change from one execution of the program to the next. The real advantage of using variables is that you can name a variable, assign it a value, and subsequently refer to the name of the variable in an expression rather than hard-coding the specific value.
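A small sketch of that advantage, reusing the circle-area idea. The radius values are arbitrary, and I use Java's built-in Math.PI here rather than whatever name the earlier post gave the constant.

```java
// Refers to a value by name instead of hard-coding it, so the same expression works as the value changes.
class CircleArea {
    static double area(double radius) {
        return Math.PI * radius * radius;
    }

    public static void main(String[] args) {
        double radius = 2.0;              // hypothetical starting value
        System.out.println(area(radius));

        radius = 3.5;                     // same variable, new value; the expression is unchanged
        System.out.println(area(radius));
    }
}
```

Had the radius been typed directly into the formula, every change would mean hunting down and editing each place the number appears.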

When we use a named variable, we need to tell the compiler which kind of data we will store in the variable. We do this by giving a data type for each variable. Java supports eight primitive data types: byte, short, int, long, float, double, char, and boolean. They are called primitive data types because they are part of the core Java language. The data type you specify for a variable tells the compiler how much memory to allocate and the format in which to store the data.

For example, if you specify that a data item is an int, then the compiler will allocate four bytes of memory for it and store its value as a 32-bit signed binary number. If, however, you specify that a data item is a double (a double-precision floating-point number), then the compiler will allocate eight bytes of memory and store its value as an IEEE 754 floating-point number. Once you declare a data type for a data item, the compiler will monitor your use of that data item. If you attempt to perform operations that are not allowed for that type or are not compatible with that type, the compiler will generate an error.
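You can check those sizes from inside a program, and see the compiler's watchfulness at work. This is a sketch with hypothetical names. Integer.BYTES and Double.BYTES are standard constants available since Java 8.

```java
// Prints the storage size of int and double and shows a conversion the compiler will not do silently.
class TypeSizes {
    // An explicit cast is allowed; it drops the fractional part.
    static int truncate(double value) {
        return (int) value;
    }

    public static void main(String[] args) {
        int age = 42;         // stored as a 32-bit signed integer
        double radius = 2.5;  // stored as a 64-bit IEEE 754 value

        System.out.println("int bytes: " + Integer.BYTES);
        System.out.println("double bytes: " + Double.BYTES);

        // Uncommenting the next line fails to compile, because a double
        // cannot be silently narrowed to an int:
        // int truncated = radius;

        System.out.println(age + " " + truncate(radius));
    }
}
```

The commented-out line is the kind of incompatible operation the compiler rejects. The cast in truncate tells it you accept the loss of the fraction.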

Because the Java compiler monitors the operations on each data item, Java is called a strongly typed language.

Take care in selecting identifiers for your programs. The identifiers should be meaningful and should reflect the data that will be stored in a variable, the concept encapsulated by a class, or the function of a method.

For example, the identifier age clearly indicates that the variable will hold an age. When you select meaningful variable names, the logic of your program is more easily understood, and you are less likely to introduce errors. Sometimes, it may be necessary to create a long identifier in order to clearly indicate its use, for example, numberOfPeopleSignedUpForCryonics. Although the length of identifiers is essentially unlimited, avoid creating extremely long identifiers: the longer the identifier, the more likely you are to make typos when entering it into your program, and the longer it takes to type, swallowing precious time. Finally, although it is legal to use identifiers, such as TRUE, which differ from Java keywords only in case, it isn’t a good idea because they can easily be confused with Java keywords, making the program logic less clear.