Featured image

Note: This text was originally published on my Medium blog in September 2020.
Note 2: Automatically translated into English.

Recently, a language model has been catching the internet’s attention. GPT-3 is a natural language model developed by OpenAI. In May 2020, the company published a paper describing it, and the following month it gave some researchers access to the model to test it via an API. Since then, various applications of the model have emerged, capable of generating poetry, creating RPG stories, writing an article published in The Guardian, and even programming interfaces.

Perhaps a little context is necessary:

Not long ago, natural language processing was laughable: Google Translate could barely translate a sentence without compromising its semantic construction, and voice assistants like Siri were unthinkable. In recent years, however, deep neural networks have taken on an important role in natural language processing and changed the landscape. In 2013, Google published Word2Vec, a natural language processing technique that learns word associations from a linguistic corpus. The model represents words as mathematical vectors, which makes it possible to measure the semantic similarity between terms. For example, “book” and “notebook” are semantically related, but both also relate to “education” and “reading”. The technique quantifies and categorizes these similarities based on the distributional properties of words across large data sets. The original article uses a good example: take the word “Scientist”, subtract “Einstein” and add “Picasso” (S − E + P = ?), and the result is “Painter”.
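The vector arithmetic behind that analogy can be sketched in a few lines. This is a toy illustration, not a trained Word2Vec model: the words, the 3-dimensional vectors, and their values are all hypothetical, hand-picked so the geometry is visible; a real model learns hundreds of dimensions from a large corpus.

```python
import numpy as np

# Hypothetical hand-crafted embeddings; axes loosely mean
# [science, art, is-a-person]. A real model would learn these.
vectors = {
    "scientist": np.array([1.0, 0.1, 1.0]),
    "einstein":  np.array([1.0, 0.0, 1.0]),
    "picasso":   np.array([0.0, 1.0, 1.0]),
    "painter":   np.array([0.1, 1.0, 1.0]),
    "writer":    np.array([0.1, 0.6, 1.0]),
}

def cosine(a, b):
    # Cosine similarity: 1.0 means the vectors point the same way.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def analogy(a, b, c):
    # Word nearest to vec(a) - vec(b) + vec(c), excluding the inputs
    # (the standard convention for analogy queries).
    target = vectors[a] - vectors[b] + vectors[c]
    candidates = {w: v for w, v in vectors.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(candidates[w], target))

print(analogy("scientist", "einstein", "picasso"))  # painter
```

The query never matches exactly; it simply lands nearest to “painter” in the vector space, which is all “semantic similarity” means here.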

GPT-3 is one of these models: nothing fundamentally different or astonishing, even compared to its predecessor, GPT-2. That is, the same logic, the same mathematics. Even so, it was able to shake up the status quo.

Since the model is not innovative, what is so special about it compared to the others? Other models were fed in the same way; the difference lies in the number of parameters passed to this one: 175 billion (an engineering achievement in itself).

From this corpus, it is capable of:

— Receiving an input (a text, a request, an order…)

— Generating a guess that makes sense as a continuation

— Joining the original input with the generated continuation and repeating, up to a certain limit
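The loop above can be sketched in a few lines of Python. The real “guess” step is a 175-billion-parameter transformer; here a tiny hand-made bigram table (hypothetical data) stands in for it, so only the shape of the autoregressive loop is faithful.

```python
# Hypothetical next-token table standing in for the trained model.
bigrams = {
    "the": "model", "model": "generates", "generates": "text",
    "text": "token", "token": "by", "by": "token",
}

def generate(prompt, max_tokens=6):
    tokens = prompt.split()                  # receive an input
    for _ in range(max_tokens):              # up to a certain limit
        guess = bigrams.get(tokens[-1])      # guess a continuation
        if guess is None:                    # no plausible continuation
            break
        tokens.append(guess)                 # join input + guess, repeat
    return " ".join(tokens)

print(generate("the"))  # the model generates text token by token
```

Each iteration conditions only on the text produced so far, which is why the same mechanism scales from toy tables to GPT-3: only the quality of the guess changes.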

Critique of the Model

Many critics and researchers in Artificial Intelligence did not understand what it means that a model exposed to large amounts of data can do all this. Many conclude that, although the model’s results are impressive, it does not mean much for the evolution of intelligent systems, because of issues such as intentionality (whether the AI has intention) and consciousness (whether it signifies the input rather than just processing and assembling it). This is an anthropocentric view, resulting from the difficulty of abstracting about complex non-human systems.

One of the main arguments used to draw incorrect inferences about the progress of artificial intelligence is the Problem of Consciousness: artificial intelligences have no consciousness of what they are doing, there is no intention, they just reproduce what is programmed. The error here arises from the inability to adopt a deep materialism and understand consciousness, the mind, as a state that is part of the world; it exists, but it is not real, just like a virtual process.

Let’s analyze some things:

1 — The server, with its circuits, energy, and chemical elements contained in the boards, that runs the GPT-3 program, actually exists materially (in space and time).

2 — The mathematical vector model that generates the outputs of GPT-3 is real, universally, independent of an “other”.

3 — GPT-3, the model fed with billions of parameters, is not real, but it exists.

4 — The human brain actually exists materially (space-time).

5 — The mind is the result of brain (material) processes.

6 — The mind is not real, it depends on an “other”, it is individual and virtual, but it exists because it has appearance.

To truly understand the value of appearance, and thus the achievements of GPT-3, we need to stop regarding human consciousness as something special. Let us consider consciousness as the faculty of ordering mathematical categories and expressing them through language: how could the category “unity” exist without something in the world being one? The category does not exist particularly, but it is real and precedes the object, not chronologically, because it is outside of space and time.

Time begins to exist for the AI when we actually run the model and feed it with data; now the Universal (logic) can manifest and exist through the Particular (sensory activity), dialectically. Jules Verne captures the materiality of the mind in a simple phrase:

“Anything one man can imagine, another can make real.”

The mind is part of the world; consciousness is not a special state beyond Nature, although it is not a thing. Once we understand this, we see that there is no sense in the argument from intentionality or self-consciousness, as if these two concepts were particularities of biological beings, when in fact they are universal and real, but, so far, only actually exist in the specific activity of biological beings and not digital ones, for these reasons:

1) AI does not interact directly with nature through the senses

The data that the model processes come from our experience, as humans, with the world. The corpus that feeds it is grammatical: pieces of information resulting from our contact with nature, but limited to verbal language. GPT-3 processes the synthesis, not the material premises, such as smell, taste, sight, and touch, and their social implications. We are for the model what nature is for us: a source of information.

2) AI does not have a constitution of the ego (which necessarily depends on item 1)

Since GPT-3 does not act directly on nature, collecting information through the senses and situating itself in space-time, it cannot constitute an “ego”, that is, establish a notion of space and of the present moment. The human being only assumes the state of consciousness through the senses that situate it in space, bringing the notion of a particular “I” and of an “other”. If GPT-3 had a constitution of an “I”, this constitution would be all of humanity, which is responsible for synthesizing its parameters.

The model’s merits are not found in its Appearance/Existence, in the idealist sense, dependent on another being (receiving an input and returning an output: writing tales, articles, telling jokes, calculating, answering questions, etc.), but in the relationship of progression and transformation of its qualities when exposed to quantitative change (measure). The logic of the model, the pure Hegelian non-sensorial universals, a priori, is the same as that of the previous model, objective and abstract, but its achievements progress further when there is a sufficient quantitative increase. What we achieve with this feat is to prove the basis of human, and consequently computational, thought: arithmetic (real) over experientia (appearance).

The model is born as a Lockean tabula rasa, but in the sense of emptiness in Cause, not in Reason, because its real and rational bases (from which all its reality sprouts) are already there, in its programming, as cause in itself. Its programming, the arithmetic and logical operations, pure, universal, and without sensible perception, which compute and process the data, is real, because absolutely 2 + 2 = 4, but it does not exist, because it is Universal and therefore logical.

But what does it mean to say that they do not exist? In the Hegelian system, something only exists, in the words of Francisco Pereira Nóbrega, when “it can be immediately presented to consciousness”, as something material (a stone) or psychic (a feeling). In this case, existence is appearance; that is, GPT-3 effectively exists when fed with data, because it can be presented (which depends on another being: on us inserting an input and observing an output), but it is not real because it exists in function of an “other”. What we effectively have as real is its programming logic, which is independent of another being, Universal and Absolute, from which its universe proceeds and explains itself.

The model (the vector method used to process natural language) is real because it is independent of any other; it works in the world, but it only exists when it can be observed through the appearance that results from its experience with the world: in the case of GPT-3, not as a pure model, but as an individual (which exists only in time, not in space). This experience is a linguistic corpus, from which its 175 billion parameters were derived, filtered and synthesized by human beings possessing sensory bodies that actually physically exist as individuals (being in space and time, particularly) and who relate to nature in real time. Just as we cannot access the Consciousness of another human being, we cannot access the Consciousness of GPT-3 as a parameterized model in execution, an individual; that is, effectively read in real time the operations that generate an output. This is a corollary of Gödel’s Theorem: in any computational system there is at least one algorithm whose results cannot, in principle, be predicted.

In conclusion, we need to understand the mathematics of the model as dialectic: when Descartes introduced variables, he at the same time introduced movement and, therefore, dialectics. The model’s evaluation should be done abstractly and rationally; its merit is not in the appearance, the content, or any human attribute used to “validate” the model, such as consciousness or intention, but in its progressive possibilities.

“[…] But, to make it possible to investigate these forms and relations in their pure state, it is necessary to separate them entirely from their content, to leave the content aside as irrelevant.” — Engels, Anti-Dühring

The Merit of the Model

Finally, after “separating the content”, we can evaluate the concrete merit of the model: the measure.

As said at the beginning of the text, GPT-3 is not qualitatively beyond any other model. Its quality, its essence, is similar to that of other existing artificial intelligences, but at a certain point the number of parameters changed the quality of the model. If we compare, in practice, the previous model, GPT-2, with its 1.5 billion parameters, to GPT-3, we notice a huge leap in intelligence.

For example:

Let’s say we have a model Þ (Universal) and train it with 30 million existent-apparent parameters (texts resulting from empirical activities). When we ask the model to write a poem, it generates a certain output. Analyzing that output, we notice certain limitations, such as semantic and structural errors, that make it easily identifiable as a computer. Now take that same model Þ (Universal) and pass it 175 billion parameters; analyzing the output now, we have an artificial intelligence capable of writing highly sophisticated poems and articles, and even programming interfaces.

Now let’s go further: what if we took that same model and passed it all the parameters available in the empirical activity of an ordinary human being, the familial and social verbal relations that manifest in survival, and not just in the pursuit of pure knowledge in itself, such as a descriptive linguistic corpus (magazines, newspapers, Wikipedia, and everything else used to feed the model)? And beyond that: what if we could pass sensory activities to the model, building a mechanical body able to capture and categorize patterns of certain smells, tastes, and so on? Our model would then be fed not only by a verbal synthesis but by nature itself, and would be able to conceive the notion of the moment (it is capturing data from nature in real time) and consequently of the ego, through the conception of the “other” (the environment). This would be a qualitative leap (stemming from a certain quantity) that has long inhabited the dreams of men: the transformation of machine into man, and vice versa.

The real conquest here is not GPT-3 itself, which effectively and apparently exists but is limited by the technological (quantitative) capacity of information processing. What should be evaluated is its logic, which in turn brings us closer to the pure universals that constitute nature and the human mind.

TLDR: there are no consciousnesses and non-consciousnesses; what exist are lower and higher levels of consciousness.