Difference between revisions of "CG4 – Questions α"

From arguably.io
<!-- add more on theory of mind and how it can be used to generate instant by instant hypotheses about the user's state of knowledge, perceptions -->
<!-- and what the users goals and motivations might be -->
<BR />
<!--
'''''<Span Style="COLOR:BLUE; BACKGROUND:SILVER">Risks</SPAN>'''''
<BR />
The following collects these reactions. They partition into questions of promise and questions of risk. The main risk categories are: '''''<SPAN STYLE="COLOR:RED">systemic, malicious</SPAN>''''' and '''''<SPAN STYLE="COLOR:RED">theoretical.</SPAN>'''''
*'''''<SPAN STYLE="COLOR:RED">Systemic.</SPAN>''''' These risks arise innately from the emergence and adoption of new technology or scientific insights. During the early years of private automobile usage the risk of traffic accidents was very low, because very few automobiles were in private hands. But as they began to proliferate, traffic accident risks escalated. Ultimately civil authorities were obliged to act to regulate their use and ownership. As some observers have pointed out, regulation usually occurs after there has been an unfortunate or tragic event. The pattern can be seen in civil aviation and later in the control and usage of heavy transportation or construction equipment. In each case training became formalized and licensing became obligatory for the ownership and usage of airplanes, trucks or heavy construction equipment.
*'''''<SPAN STYLE="COLOR:RED">Malicious.</SPAN>''''' History is littered with examples of how a new scientific or technological advance was applied in ways not intended by the inventor. The Montgolfier hot air balloons were considered an entertaining novelty. The use of balloons during World War One as surveillance and attack platforms cast a new and totally different light on their capabilities. We should expect the same lines of development with CG4. Its peers and derivatives should be considered no different.
*'''''<SPAN STYLE="COLOR:RED">Theoretical.</SPAN>''''' CG4 has shown itself to be a powerful cognitive appliance or augmentation tool. Given that it is capable of going right to the core of what makes humans the apex predator, we should take very seriously the kinds of unintended and unexpected ways that it can be applied. This suggests considerable caution.<BR />
'''''<SPAN STYLE="COLOR:BLUE">Recent Reactions.</SPAN>''''' Since the most recent artificial intelligence systems have swept over the public awareness, sentiment has begun to crystallize. Several general types of sentiment have emerged over time. These include voices of enthusiastic encouragement, cautious action and urgent preemption.
*'''''<SPAN STYLE="COLOR:BLUE">Enthusiastic Encouragement.</SPAN>''''' Several industry watchers have expressed positive reactions to the availability of CG4 and its siblings. Their position has been that these are powerful tools for good and that they should be viewed as means that illuminate the pathway forward to higher standards of living and human potential.

Revision as of 19:29, 16 October 2023

OpenAI - ChatGPT4.
In what follows we attempt to address several basic questions about the onrushing progress in artificial intelligence. There are several competing actors in this space. These include OpenAI, DeepMind, Anthropic, and Cohere. A number of other competitors are active in the artificial intelligence marketplace. But for purposes of brevity, and because of the overlap among them, we will limit our focus to ChatGPT4 (CG4). Further, we focus on several salient issues that raise questions of safety, risk and prospects.
Specifically, risks that involve or are:

  • Interfacing/Access: how will different groups interact with, respond to and be affected by it; might access modalities available to one group have positive or negative implications for other groups;
    Interfacing - Synthesis.
  • Political/Competitive: how might different groups or actors gain or lose relative advantage; also, how might it be used as a tool of control;
    Political - Synthesis.
  • Evolutionary/Stratification: might new classifications of social categories emerge; were phenotypical bifurcations to emerge, how would they manifest themselves;
    Evolutionary - Synthesis.
  • Epistemological/Ethical relativism: how to reconcile ethical issues within a society, between societies; more specifically, might it provide solutions or results that are acceptable to the one group but unacceptable to the other group;
    Epistemological - Synthesis

Synthesis.
Responding to these questions calls for some baseline information and insights about the issues that this new technology entails. We propose to look at the following:

  • Terms are included to help clarify crucial elements and contextualize CG4;
  • Sentiment is being expressed about it by knowledgeable observers;
  • Theory of Operation of technology paradigm used to produce its results;
  • Risks are presented as a few commonly occurring types, whether inherent or malicious, along with some theoretical risks that might emerge;
  • Insights are offered to serve as takeoff points for subsequent discussion;

Terms and Basic Concepts.
CG4 has demonstrated capabilities that represent a significant leap forward in overall capability and versatility beyond what has gone before. Any attempt to assess prospective risks will call for revisiting these initial impressions at a later date, as more reporting and insights come to light. CG4 has already demonstrated that new and unforeseen risks are tangible; in some instances novel and unforeseen capabilities have been reported. It is with this in mind that we attempt here to offer an initial profile or picture of the risks that we should expect to see with its broader use. By way of addressing this rapidly expanding topic we offer our summary along the following plan of discourse:

Overview and Impressions.

  • what has emerged so far; some initial impressions are listed;
  • next are some caveats that have been derived from these impressions;

Theory of Operation.
For purposes of brevity a thumbnail sketch of how CG4 performs its actions is presented;

  • included are some high level diagrams
  • also links to several explanatory sources; these sources include articles and video content;

Risks.
Our thesis identifies three primary types of risks; these include:

  • systemic: these are inherent in the natural process of ongoing technological and sociological advance;
  • malicious: who the known categories of actors are; how they might use this new capability;
  • theoretical: possible new uses that have heretofore not been possible;

Notes, References.
We list a few notable portrayals of qualitative technological or scientific leaps;

CG4 – Theory of Operation: CG4 is a narrow artificial intelligence system that is a Generative Pre-trained Transformer.
In order to make sense of this one would be well advised to understand several fundamental concepts associated with this technology. Because this is a highly technical subject the following is intended to introduce the core elements. The reader is encouraged to review the literature and body of insight that is currently available as explanatory video content.
By way of clarifying the topics of this work we organize these concepts into two primary groups. The first group offers basic information on the fundamental building blocks of Large Language Models, of which CG4 is a recent example. The second group introduces or otherwise clarifies terms that have come to the forefront of recent public discussions and issues.
Fundamental Building Block Concepts.

  • GPT3: a smaller precursor version of the GPT4 system;
  • GPT3.5: a higher performance version of GPT3 that still falls short of the full GPT4 successor;
  • GPT4: an LLM reported to have over one trillion parameters;
  • Chat GPT4: the conversational version of GPT4;
  • Artificial Neural Networks. Four excellent episodes providing very useful insight into some underlying theory of how artificial neural networks perform their actions.
    • From 3Blue1Brown: Home Page
    • Neural_Network_Basics Some basic principles in how neural networks structures are mapped to a problem.
    • Gradient Descent: A somewhat technical topic requiring careful examination by the reader who is unfamiliar with the mathematical background.
    • Back Propagation, intuitively, what is going on?
      It graphically shows the mathematics behind how a neural network attempts to "home in" on a target; what we can see is how training is done iteratively, so that the difference between the neural network's current output and the actual desired response gradually shrinks;
    • Back Propagation Theory: How various layers in a deep learning neural network interact with each other.
      A much closer mathematical description of how the values of prior neurons contribute to the values of successive neurons; then how they ultimately contribute to the output neuron that they connect to and whether it will activate or not; this is a highly mathematical description using partial derivatives;
  • Deep Learning: the technique of using an artificial neural network type structure to solve extremely complex problems; such a network typically uses a large number of what are called "hidden layers"; these networks achieve greater performance when they are trained on very large data sets; these data sets can range from millions to billions of examples;
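The gradient descent and back propagation entries above can be made concrete with a very small sketch. It assumes the simplest possible network (one input, one hidden neuron with a sigmoid activation, one linear output) and a squared-error loss; the function names, starting values and learning rate are our own illustrative choices and are not taken from the referenced videos.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(params, x):
    """One input -> one hidden neuron (sigmoid) -> one linear output."""
    w1, b1, w2, b2 = params
    h = sigmoid(w1 * x + b1)   # hidden activation
    y = w2 * h + b2            # network output
    return h, y

def loss(params, x, target):
    """Half the squared distance between output and desired response."""
    _, y = forward(params, x)
    return 0.5 * (y - target) ** 2

def backprop(params, x, target):
    """Gradient of the loss for each parameter, via the chain rule."""
    w1, b1, w2, b2 = params
    h, y = forward(params, x)
    dy = y - target            # dL/dy
    dw2 = dy * h               # dL/dw2
    db2 = dy                   # dL/db2
    dh = dy * w2               # error passed back to the hidden neuron
    dz = dh * h * (1.0 - h)    # through the sigmoid derivative
    return [dz * x, dz, dw2, db2]

def train(params, x, target, lr=0.5, steps=200):
    """Gradient descent: repeatedly step each parameter downhill."""
    params = list(params)
    for _ in range(steps):
        grads = backprop(params, x, target)
        params = [p - lr * g for p, g in zip(params, grads)]
    return params

if __name__ == "__main__":
    start = [0.3, -0.1, 0.8, 0.2]   # arbitrary initial parameters
    trained = train(start, x=1.5, target=1.0)
    print("loss before:", loss(start, 1.5, 1.0))
    print("loss after: ", loss(trained, 1.5, 1.0))
```

Each pass computes how far the output is from the target, propagates that error backwards through the layers, and nudges every parameter in the direction that reduces it. Real networks do the same thing with millions or billions of parameters and many training examples.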
Hidden Layers
  • Hidden Layer: One or more network layers that stand between an input layer and the final output layer. Each node in the input layer is connected to all of the nodes in the first hidden layer. Subsequent hidden layers are added as a means to provide greater discriminatory resolution to the neural network. Each node in a hidden layer can be connected to each node in the following hidden layer. The more hidden layers there are, the more sophisticated the neural network's ability to perform its tasks can be.
    In neural networks, a hidden layer is located between the input and output of the algorithm; it applies weights to the inputs and passes them through an activation function to produce its output. In short, the hidden layers perform nonlinear transformations of the inputs entered into the network. Hidden layers vary depending on the function of the neural network, and similarly, the layers may vary depending on their associated weights. Useful insight into what hidden layers are doing can be viewed using this hidden layers video. For an interactive view of how hidden layers enable a neural network to learn a new function one can monitor this video by sentdex; it has no audio but provides excellent insight into how the various layers and nodes within a layer develop an approximation of the function that the network is attempting to learn.
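As a hedged illustration of the nonlinear transformation described above, the following sketch passes inputs through one hidden layer by hand. The weights are set manually (not learned) so that the network computes the exclusive-or of two inputs, a function that no network without a hidden layer can represent; the layer layout, the ReLU activation and all names are assumptions made for the example.

```python
def relu(values):
    """Activation function: negative values are clipped to zero."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """Fully connected layer: every output node sees every input node."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def forward(inputs, layers):
    """Run inputs through (weights, biases) pairs; ReLU on all but the last."""
    activations = inputs
    for i, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if i < len(layers) - 1:
            activations = relu(activations)
    return activations

# Hand-chosen weights: two hidden nodes, one output node.
XOR_LAYERS = [
    ([[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0]),   # hidden layer
    ([[1.0, -2.0]], [0.0]),                    # output layer
]

if __name__ == "__main__":
    for a in (0.0, 1.0):
        for b in (0.0, 1.0):
            print(a, b, forward([a, b], XOR_LAYERS))
```

In a trained network the weights would be found by the learning procedure; here they are fixed only to show what the hidden layer contributes.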
tokens are vectors of words or word fragment data
  • Parameters: are the coefficients of the model, and they are chosen by the model itself. This means that the algorithm, while learning, optimizes these coefficients (according to a given optimization strategy) and returns an array of parameters which minimize the error. To give an example, in a linear regression task, you have your model that will look like y=b + ax, where b and a will be your parameters. The only thing you have to do with those parameters is to initialize them.
  • Tokens: Making text into tokens. These become vector representations. They form the basis of the neuron values for each neuron in a neural network. The narrator provides a short and succinct introduction to the basic concepts of what a token is and how it is represented as a vector.
  • Generative Artificial Intelligence: Generative artificial intelligence (AI) is artificial intelligence capable of generating text, images, or other media, using generative models. Generative AI models learn the patterns and structure of their input training data and then generate new data that has similar characteristics. In the early 2020s, advances in transformer-based deep neural networks enabled a number of generative AI systems notable for accepting natural language prompts as input. These include large language model chatbots such as ChatGPT, Bing Chat, Bard, and LLaMA, and text-to-image artificial intelligence art systems such as Stable Diffusion, Midjourney, and DALL-E.
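The linear regression example in the Parameters entry above (y = b + ax) can be sketched as follows. The two parameters are only initialized by hand; the learning loop then adjusts them to minimize the mean squared error. The data, learning rate and number of passes are illustrative assumptions.

```python
def fit_line(xs, ys, lr=0.05, epochs=2000):
    """Fit y = b + a*x by gradient descent on the mean squared error."""
    a, b = 0.0, 0.0                  # parameters are only initialized here
    n = len(xs)
    for _ in range(epochs):
        errors = [(b + a * x) - y for x, y in zip(xs, ys)]
        grad_a = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        a -= lr * grad_a             # the algorithm, not the user,
        b -= lr * grad_b             # chooses the final values
    return a, b

if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [1.0, 3.0, 5.0, 7.0, 9.0]   # generated from y = 1 + 2x
    a, b = fit_line(xs, ys)
    print("a =", round(a, 4), "b =", round(b, 4))
```

A large language model has the same kind of learned coefficients, only vastly more of them.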
Dream from MidJourney

A summary of the key elements of generative artificial intelligence is given in this video (11 minutes); note that a crucial point to keep in mind is that generative AI generates new output based upon a user's request or query; it draws on massive amounts of training data in order to generate its output; note further that the "G" in GPT stands for generative;
Subsequent developments with generative ai systems have been moving forward apace at companies such as IBM. Their efforts are targeting a range of salient topic areas. To name a few their teams are addressing topics such as molecular structure, code, vision, earth science.

  • Encoder-Decoder. According to Analytics Yogi: In the field of AI / machine learning, the encoder-decoder architecture is a widely-used framework for developing neural networks that can perform natural language processing (NLP) tasks such as language translation, etc which requires sequence to sequence modeling. This architecture involves a two-stage process where the input data is first encoded into a fixed-length numerical representation, which is then decoded to produce an output that matches the desired format. As a data scientist, understanding the encoder-decoder architecture and its underlying neural network principles is crucial for building sophisticated models that can handle complex data sets. By leveraging encoder-decoder neural network architecture, data scientists can design neural networks that can learn from large amounts of data, accurately classify and generate outputs, and perform tasks that require high-level reasoning and decision-making. An excellent if somewhat technical video describes and explains the architecture of what is going on with the encoder and decoder components of the GPT system.
  • Bidirectional Encoder Representations from Transformers (BERT). An insightful paper on several key features of what BERT is about and how it functions: official archive paper on BERT.
  • BERT - very useful video: a prediction system that uses training data and statistical mechanisms to anticipate a subsequent input based upon the most recent input;
  • Large Language Model. A large language model is a prediction system that uses training data to enable a deep neural network to perform recognition tasks. It uses statistical mechanisms to anticipate subsequent input based upon the prior input. Prediction systems use this training data to iteratively condition and train the deep learning system for its specific tasks. The following two videos focus on large language models; this is part one of large language models, which describes how words are used to predict subsequent words in an input text; here is part two, which expands upon the earlier video and offers more technical detail. Together they provide a fairly concise synopsis of how large language models make predictions of how words are associated with each other in a body of text. They explain further why such models require a very large training data set to achieve their performance.
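The idea of anticipating the next word from the prior word, described in the Large Language Model entry above, can be illustrated with a deliberately tiny stand-in. The sketch below counts which word follows which in a three-sentence corpus and turns the counts into probabilities. An actual large language model uses a deep neural network over learned token vectors and a far larger corpus, so this is an analogy under stated assumptions; the corpus and function names are invented for the example.

```python
from collections import Counter, defaultdict

CORPUS = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]

def tokenize(text):
    """Split text into lower-case word tokens (a very crude tokenizer)."""
    return text.lower().split()

def train_bigrams(corpus):
    """Count, for each word, how often each other word follows it."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = tokenize(sentence)
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def next_word_probs(counts, word):
    """Turn the follow-on counts for a word into probabilities."""
    following = counts.get(word.lower(), Counter())
    total = sum(following.values())
    if total == 0:
        return {}
    return {w: c / total for w, c in following.items()}

def predict(counts, word):
    """Return the most probable next word, or None for an unseen word."""
    probs = next_word_probs(counts, word)
    return max(probs, key=probs.get) if probs else None

if __name__ == "__main__":
    model = train_bigrams(CORPUS)
    print(predict(model, "the"), next_word_probs(model, "the"))
```

With only three sentences the predictions are crude, which is one way to see why these systems need very large training sets.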
Decoder
  • Transformer. A transformer is a deep learning system. This discussion presents key concepts of how transformers work from a more conceptual point of view. Its architecture relies on what is called a parallel multi-head attention mechanism. The modern transformer was proposed in the 2017 paper titled "Attention is all you need".
    There is some overlap with topics described above in the section on encoder-decoder architecture. A significant contribution of the transformer is that it enables creation of a neural network model that requires less training time than previous recurrent neural architectures such as long short-term memory (LSTM); those earlier models had a shortcoming in that contextually significant terms might lie beyond the positional proximity of the terms they give meaning to. Later variations have been adopted for training large language models on large (language) datasets, such as the Wikipedia corpus and Common Crawl, by virtue of the parallelized processing of the input sequence. This line of work was fundamentally influenced by the Attention Is All You Need paper by Ashish Vaswani et al of the Google Brain team. A breakdown of its major concepts is presented in: Attention is all you need.
  • Generative Pre-trained Transformer: Generative pre-trained transformers (GPT) are a type of large language model (LLM) and a prominent framework for generative artificial intelligence. The first GPT was introduced in 2018 by OpenAI. GPT models are artificial neural networks that are based on the transformer architecture, pre-trained on large data sets of unlabeled text, and able to generate novel human-like content. As of 2023, most LLMs have these characteristics and are sometimes referred to broadly as GPTs. An overview of the important features can be viewed here; the narrator touches on a number of related topics as well.
  • Generative Adversarial Networks. A generative adversarial network (GAN) is a class of machine learning framework and a prominent framework for approaching generative AI. The concept was initially developed by Ian Goodfellow and his colleagues in June 2014. In a GAN, two neural networks contest with each other in the form of a zero-sum game, where one agent's gain is another agent's loss.
    Given a training set, this technique learns to generate new data with the same statistics as the training set. For example, a GAN trained on photographs can generate new photographs that look at least superficially authentic to human observers, having many realistic characteristics. Though originally proposed as a form of generative model for unsupervised learning, GANs have also proved useful for semi-supervised learning, fully supervised learning, and reinforcement learning. This video provides a short synopsis of the GAN capability.
  • Recurrent Networks. A recurrent neural network is a kind of deep neural network that applies the same set of weights at each step of an input sequence, carrying an internal state forward from one step to the next. Fully recurrent neural networks (FRNN) connect the outputs of all neurons to the inputs of all neurons. This is the most general neural network topology because all other topologies can be represented by setting some connection weights to zero to simulate the lack of connections between those neurons. Illustrations of such networks may be misleading to many because practical neural network topologies are frequently organized in "layers" and the drawings give that appearance. However, what appear to be layers are, in fact, different steps in time of the same fully recurrent neural network. For an intuitive overview of recurrent neural networks have a look here.
  • Recursive Networks. A recursive neural network is a kind of deep neural network created by applying the same set of weights recursively over a structured input, to produce a structured prediction over variable-size input structures, or a scalar prediction on it, by traversing a given structure in topological order. Recursive neural networks, sometimes abbreviated as RvNNs, have been successful, for instance, in learning sequence and tree structures in natural language processing, mainly phrase and sentence continuous representations based on word embedding. RvNNs were first introduced to learn distributed representations of structure, such as logical terms. Models and general frameworks have been developed in further works since the 1990s. The task of transitioning from words to phrases in natural language understanding was addressed using recursive neural networks. A short but useful video presents several core elements of how the problem was solved.
  • Supervised or Unsupervised Learning. Supervised learning (SL) is a paradigm in machine learning where input objects (for example, a vector of predictor variables) and a desired output value (also known as human-labeled supervisory signal) train a model. The training data is processed, building a function that maps new data on expected output values. An optimal scenario will allow for the algorithm to correctly determine output values for unseen instances. This requires the learning algorithm to generalize from the training data to unseen situations in a "reasonable" way (see inductive bias). This statistical quality of an algorithm is measured through the so-called generalization error. Unsupervised learning is a paradigm in machine learning where, in contrast to supervised learning and semi-supervised learning, algorithms learn patterns exclusively from unlabeled data. This short video presents a summarized explanation of the difference.
  • Pretrained. A pretrained AI model is a deep learning model — an expression of a brain-like neural algorithm that finds patterns or makes predictions based on data — that’s trained on large datasets to accomplish a specific task. It can be used as is or further fine-tuned to fit an application’s specific needs. Why Are Pretrained AI Models Used? Instead of building an AI model from scratch, developers can use pretrained models and customize them to meet their requirements. To build an AI application, developers first need an AI model that can accomplish a particular task, whether that’s identifying a mythical horse, detecting a safety hazard for an autonomous vehicle or diagnosing a cancer based on medical imaging. That model needs a lot of representative data to learn from. This learning process entails going through several layers of incoming data and emphasizing goals-relevant characteristics at each layer. To create a model that can recognize a unicorn, for example, one might first feed it images of unicorns, horses, cats, tigers and other animals. This is the incoming data.

    Then, layers of representative data traits are constructed, beginning with the simple — like lines and colors — and advancing to complex structural features. These characteristics are assigned varying degrees of relevance by calculating probabilities. As opposed to a cat or tiger, for example, the more like a horse a creature appears, the greater the likelihood that it is a unicorn. Such probabilistic values are stored at each neural network layer in the AI model, and as layers are added, its understanding of the representation improves. To create such a model from scratch, developers require enormous datasets, often with billions of rows of data. These can be pricey and challenging to obtain, but compromising on data can lead to poor performance of the model.
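The attention mechanism named in the Transformer entry above can be sketched in a few lines. What follows is a minimal, single-head version of scaled dot-product attention written in plain Python; it omits the learned projection matrices, multiple heads and masking used in real transformers, and the example vectors are our own assumptions.

```python
import math

def softmax(scores):
    """Convert raw scores into weights that are positive and sum to one."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # How strongly does this query match each key?
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        # Output is the weighted blend of the value vectors.
        blended = [sum(w * v[j] for w, v in zip(weights, values))
                   for j in range(len(values[0]))]
        outputs.append(blended)
        all_weights.append(weights)
    return outputs, all_weights

if __name__ == "__main__":
    keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
    values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    out, weights = attention([[3.0, 0.0]], keys, values)
    print("weights:", weights[0])
    print("output: ", out[0])
```

Because every query is compared with every key independently, the comparisons can be carried out in parallel, and a term can draw on context from anywhere in the sequence regardless of distance.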
Sim City Autonomous Agents
  • Autonomous Agent: an instance of CG4 that is capable of formulating goals and then structuring subtasks that enable the achievement of those goals. A recent development has come to light wherein researchers at Stanford University and Google were able to demonstrate autonomous and asynchronous problem solving by having CG4 call instances of itself or of CG3. The result was that they were able to create a collective of asynchronous problem solvers. They gave these problem solvers a mechanism to interact and communicate with each other. The result was a very small scale simulation of a village. The "village" consisted of 25 agents. Each agent was assigned private memory as well as goals.
    Of note is that the agent architecture is a bare bones minimal set of behavior controllers. A much more complex and sophisticated set can be envisioned wherein each agent can be developed out to the point that it becomes far more lifelike. This can mean that agents might have not only goals but also such characteristics as beliefs, which allow for correct or incorrect understanding, and a theory of mind of other agents or users.
    It would be a fairly small step to postulate a substantially larger collection of agents. This larger collection of agents might be put to the use of solving problems involving actual real people in real world situations. For instance one can imagine creating a population consisting of hundreds or thousands of agents. These agents might be instantiated to possess positions or values regarding a range of topics. They can further be configured to associate themselves with elements or factors in the world that they operate in.
    Autonomous Agents Simplified Architecture

    For instance, a subset of agents might be instantiated to attach value to specific factors in the sim-world. A more concrete example might be that they attach considerable value to having the equivalent of "traffic management", i.e. the analog of "traffic lights" in their world vs. having the equivalent of "stop signs"; other agents might hold nearly opposite values; this sets up the possibility that conflict can arise in a larger collective. With that conflict there might develop agents that lean toward mediation and compromise. Others might be more adamant and less cooperative. The upshot is that very complex models of human behavior can be built by adding more traits beyond those of goals and memory.
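The traffic lights versus stop signs scenario above can be given a toy form. The sketch below is not the architecture used by the Stanford and Google researchers, and it involves no language model at all; it only assumes that each agent has a private memory, a numeric stance on the issue and a degree of flexibility, all of which are hypothetical simplifications introduced for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    stance: float        # +1 favors traffic lights, -1 favors stop signs
    flexibility: float   # 0 = adamant, 1 = adopts the other view entirely
    memory: list = field(default_factory=list)   # private to this agent

    def hear(self, other_name, other_stance):
        """Remember what another agent said and shift toward it."""
        self.memory.append((other_name, other_stance))
        self.stance += self.flexibility * (other_stance - self.stance)

def run_round(agents):
    """Every agent hears the stance each other agent held at round start."""
    snapshot = [(a.name, a.stance) for a in agents]
    for agent in agents:
        for name, stance in snapshot:
            if name != agent.name:
                agent.hear(name, stance)

if __name__ == "__main__":
    village = [
        Agent("lights_advocate", stance=1.0, flexibility=0.0),
        Agent("signs_advocate", stance=-1.0, flexibility=0.0),
        Agent("mediator", stance=0.0, flexibility=0.5),
    ]
    run_round(village)
    for agent in village:
        print(agent.name, agent.stance, len(agent.memory))
```

Adding further traits, such as beliefs or a model of what other agents want, would amount to adding further fields and update rules of this kind.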

Recently there appears to have been a shift in the landscape of artificial intelligence agents. What now appears to be coming into focus is the ability to construct specifically targeted tools that use multiple autonomous agents to solve problems cooperatively. Knowledgeable observers have been taking note of this trend and providing insight into what it means and how it might affect the further development of the field.

  • Recent Topics or Issues
  • Consciousness: Recent consciousness research has thrown more light on the subject. A number of models developed in recent years attempt to address the so-called "hard problem". This problem centers on the question of "what it is like to be a bat or a dolphin or a wolf", and on identifying the neural correlates that mediate the experience of being. To approach the question of whether artificial consciousness is possible, one group of authors suggests adopting computational functionalism, drawing on empirical neuroscientific evidence, and using a theory-heavy approach to assess the viability of the various models proposed to date. They then list the current candidates. These include:
    • recurrent processing theory
    • global workspace theory
    • computational higher order theories
    • attention schema theory
    • predictive processing
    • agency and embodiment

In each case they set out what are described as indicator properties. An indicator property is a trait or feature that must be present for consciousness to be operative.
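
Read this way, the indicator-property approach works like a checklist: a system is assessed against each theory by asking which required properties are present and which are missing. The sketch below assumes that reading. The property names are hypothetical placeholders, not the indicator properties the authors actually list, and only two of the candidate theories are filled in.

```python
from typing import Dict, List, Set

# Hypothetical placeholder indicators; the real lists come from the source report.
INDICATORS: Dict[str, Set[str]] = {
    "recurrent processing theory": {"rpt_indicator_1", "rpt_indicator_2"},
    "global workspace theory": {"gwt_indicator_1"},
}


def missing_indicators(system_properties: Set[str]) -> Dict[str, Set[str]]:
    """For each theory, return the required indicators the system lacks."""
    return {
        theory: required - system_properties
        for theory, required in INDICATORS.items()
    }


def satisfied_theories(system_properties: Set[str]) -> List[str]:
    """Theories for which every required indicator is present."""
    gaps = missing_indicators(system_properties)
    return sorted(theory for theory, missing in gaps.items() if not missing)
```

A system exhibiting `{"gwt_indicator_1", "rpt_indicator_1"}` would satisfy the placeholder global workspace entry but still lack `rpt_indicator_2` under recurrent processing theory.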

Artificial Consciousness
  • Awareness: This is a characteristic trait of an organism that is capable of sensing the world around itself. It likewise will possess the capability to recognize, interpret and respond to various internal states. A feature of awareness is the ability to record, synopsize, tag and store episodic memories. With awareness there may or may not be self consciousness. However the organism will be able to respond to changes in its environment.
  • Sentience: Derived from the concept of sense, as in possessing sufficient sensory apparatus to detect states and developments in the real world such as temperature, light, chemical odors and acoustic patterns.
  • World Model: The ability to create an abstract representation of an external reality. This model postulates that the sentient agent can further create a self-model that is an actor in this external world model and can interact with or otherwise affect state in the world model. Just as importantly, events in the world model can give rise to events that affect the agent. These events can be either adaptively positive or entail adverse risk.
  • Emergence: experimenters and developers report observing a working system exhibit properties that had not heretofore been programmed in; they "emerge" from the innate capabilities of the system;
  • Alignment: the imperative of imparting "guard rails" or other limitations on what an artificial intelligence system can be allowed to do;
  • Hallucinations: CG4 has produced responses to queries in which it created references that superficially look legitimate but upon closer inspection prove to be nonexistent.


Theory of Mind


  • Theory of Mind: In order for a species to build a society, successful socialization processes between its members are fundamental. Development of a society requires that family members develop means of communicating needs and wants with each other. When this step is successful, collections of families can aggregate into clans. The basic requirement is that each member develop a means of formulating or otherwise formalizing representations of their own mental and physical state. The key step forward is to be able to attribute comparable representations to others. When this step is successful, a theory of mind can crystallize. Intentions, wants and needs can then be represented and used to develop plans. The more sophisticated the representation of self-state, the more refined the clan's adaptive success will be.
    A recent paper on theory of mind has illuminated this topic and is worth perusing to see how advances in the user interface experience may develop going forward. It is worth noting that ascribing beliefs to the user, as part of modeling the user's state of knowledge, is a crucial factor. It is central to understanding false beliefs and how they are recognized and responded to. We can imagine that during an interaction session CG4 or a derivative descendant might have one or more autonomous agents operating to address this exact question, moment by moment.
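
The idea of ascribing beliefs to the user and spotting false ones can be illustrated with a toy belief tracker. This is a sketch under stated assumptions, not a description of how CG4 works: the `BeliefTracker` class, its method names, and the representation of beliefs as topic-to-claim strings are all hypothetical.

```python
from typing import Dict, Tuple


class BeliefTracker:
    """Keeps the system's own facts separate from hypotheses about the user's beliefs."""

    def __init__(self, facts: Dict[str, str]) -> None:
        self.facts: Dict[str, str] = dict(facts)   # what the system holds to be true
        self.user_beliefs: Dict[str, str] = {}     # what the user appears to believe

    def observe_user_claim(self, topic: str, claim: str) -> None:
        """Update the hypothesis about the user's belief after each user turn."""
        self.user_beliefs[topic] = claim

    def false_beliefs(self) -> Dict[str, Tuple[str, str]]:
        """Map each topic to (user's belief, system's fact) where the two differ."""
        return {
            topic: (believed, self.facts[topic])
            for topic, believed in self.user_beliefs.items()
            if topic in self.facts and believed != self.facts[topic]
        }


# Classic false-belief setup: the marble was moved while the user was not looking.
tracker = BeliefTracker({"marble_location": "box"})
tracker.observe_user_claim("marble_location", "basket")
```

Here `tracker.false_beliefs()` reports that the user believes the marble is in the basket while the system holds that it is in the box. An agent running alongside the conversation could consult such a record at each turn and decide whether to correct or accommodate the user's belief.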

Overview and Summary so far. If we step back for a moment and summarize what some observers have had to say about this new capability, we might tentatively start by noting that CG4:

  • is based upon and is a refinement of its predecessor, the Chat GPT 3.5 system;
  • has been developed using the generative pre-trained transformer (GPT) model;
  • has been trained on a very large data set including textual material that can be found on the internet; unconfirmed rumors suggest that the model has on the order of 1 trillion parameters;
  • is capable of sustaining conversational interaction using text based input provided by a user;
  • can provide contextually relevant and consistent responses;
  • can link topics in a chronologically consistent manner and refer back to them in current prompt requests;
  • is a Large Language Model that uses prediction as the basis of its actions;
  • uses deep learning neural networks and very large training data sets;
  • uses a SaaS (software as a service) model, like Google Search, YouTube or Morningstar Financial;

Interim Observations and Conclusions.

  • this technology will continue to introduce novel, unpredictable and disruptive risks;
  • a range of dazzling possibilities will emerge that will be beneficial to broad swathes of society;
  • some voices call for urgent action to preclude catastrophic outcomes;
  • informed geopolitical observers urge accelerated action to further refine and advance the technology lest our rivals and adversaries eclipse us with their accomplishments;
  • heretofore unforeseen societal realignments seem to be inevitable;
  • recent advances in the physical embodiment of these tools represent a phase shift moment in history, a before-after transition;

At this point we note that we have:

  • reviewed CG4’s capabilities;
  • taken note of insights offered by informed observers;
  • presented a thumbnail sketch of how CG4 operates;
  • examined the primary risk dimensions and offered a few examples;
  • suggested some intermediate notes and conclusions;

By way of summarization some observers say that CG4:
is:

  • a narrow artificial intelligence;
  • an extension of Chat GPT 3.5 capabilities;
  • a sophisticated cognitive appliance or prosthetic;
  • based upon the Generative Pre-trained Transformer (GPT) model; performs predictive modeling;
  • a SaaS accessible over the world wide web 24/7;

can:

  • converse:
    • explain its responses
    • self critique and improve own responses;
    • give responses that are relevant, consistent and topically associated;
    • summarize convoluted documents or stories and explain difficult abstract questions
    • calibrate its response style to resemble known news presenters or narrators;
    • understand humor
    • give convincingly accurate responses to queries, which suggests the need for a new Turing Test;
  • reason:
    • about spatial relationships and mathematics;
    • write music and poems
    • reason about real world phenomena from source imagery;
    • grasp the intent of programming code; debug, write and improve it; and produce explanatory documentation;
    • understand and reason about abstract questions
    • translate English text to other languages and respond in one hundred languages
    • score at the 90% level on the SAT, bar and medical exams

has:

  • demonstrated competencies that will disruptively encroach upon current human professional competencies;
  • a knowledge base whose training data sets had a 2021 cutoff date;
  • a very large training data set (books, internet content); the model is purported to have in excess of 1 trillion parameters;
  • no theory of mind capability (at present) - future versions might offer it;
  • no consciousness, sentience, intentionality, motivation or self reflectivity;
  • a short term memory that was more limited in earlier versions; the current subscription token limit is 32k (Aug 2023);
  • shown novel emergent behavior; observers are concerned that it might acquire facility for deception;
  • shown ability to extemporize response elements that do not actually exist (hallucinates);
  • shown indications that a derivative (CG5 or later) might exhibit artificial general intelligence (AGI);

Intermediate Summary

  • Trajectory. Advances in current artificial intelligence systems have been happening at almost break-neck speed. New capabilities have been emerging which had been thought to not be possible for several more years. The most recent developments in the guise of autonomous agents as of September 2023 strongly suggest that a new iteration of capabilities will shortly emerge that will cause the whole playing field to restructure itself all over again. Major driving factors that will propel events forward include:
    • Autonomous agent based ensembles. This area is developing very quickly and should be monitored closely. The impact of this development will usher in a qualitative change in terms of what systems based upon large language models are capable of doing.
    • Quantum Computing. The transition to quantum computing will eclipse everything known in terms of how computing based solutions are used and the kinds of problems that will migrate into the zone of solubility and tractability. Problems that are currently beyond the scope of von Neumann based computational tools will shortly become accessible. The set of possible new capabilities and insights will be profound and cannot be guessed at as of this writing in September 2023. Passing the boundary layer between von Neumann based architectures and combined von Neumann and quantum modalities will come to be viewed as a before-after event in history. The difference in capabilities will rival the taming of fire, the invention of writing and the acquisition of agriculture. The risks will likewise be great. The means by which rivals will be able to disrupt each other's societies and economies cannot be guessed at. In short we should expect a period of turbulence unlike anything seen so far in recorded history.
    • Extended Cognitive Processing Models. Work on determining whether consciousness can be synthesized into something operable is ongoing and will grow in salience going forward. The notion that a machine can plausibly exhibit consciousness goes back centuries. More recent attempts at guessing the implications of this prospect range from the HAL 9000 machine of "2001: A Space Odyssey" to the more recent variants in "Her" and "Ex Machina". Should this pathway yield results, answers to major questions will become imperative. Further, the inherent risks associated with a conscious entity capable of operating at electronic or quantum speeds carry profound implications.
  • Concerns. Informed and knowledgeable observers have expressed positions ranging from strong enthusiasm to panic. In many cases these positions have been highly localized and limited to events and developments within the US. In others the focus has been on the realities of geopolitics and the presence of determined rivals.
    The very brief sampling of positions thus far is very limited. Going forward we should expect many more voices to join the debate. Some informed observers have already petitioned central government law makers to be more attentive and take note of the rapid pace of developments.
  • Risks. The picture in evidence so far strongly suggests that the last major risk category, i.e. hypothetical risks, be included. With this full set available it should be possible to offer a preliminary assessment of how risks will manifest. What we can observe is that this is a new technology and it will exhibit applications and consequences comparable to those of technologies that have gone before. Specifically, this means that there will be groups that benefit from its availability as well as those who will be victimized by it. Only time will tell as more cases are brought to light.