Darwin2049/chatgpt4 version03

From arguably.io
Revision as of 22:43, 30 July 2023 by Darwin2049 (talk | contribs)

{20230729: This element requires a few more passes; at minimum they should include:

• demarcate the major components into self-contained sections; sections should follow logically, one after the other;
• do a sanity check on each section: each section should summarize what its purpose in life is, pick up from the previous section and "state its case";
• verify that the examples provided (writers' strike, Luddites, humans) speak to the specific actual problem as presented;
• in the risks section tie the examples together a bit more closely, specifically the paper presented by the leaders of O.AI and D.M.; add a crucial recently identified risk: emergence and its opacity;
• verify that all images have sourcing information; in the discussion and synthesis section characterize each in terms of positive/negative impact/risk;
• add a conclusion section that spotlights the impact of quantum computing on all of the above;
• once this has been tightened and "closed up", connect it to "Questions-Part-2.0": why does discourse stop with interface questions? What about evolution, political and epistemological?}

Interface Questions. This is presented as a multifaceted question. Its focus is on the risks associated with how target audiences access and use the system. The list of focal topics as currently understood, which may grow over time, includes:

• how this new technology will respond to and interact with different communities;
• how these different communities will interact with this new technology;
• what limitations or "guard rails", if any, are in evidence or should be considered depending upon the usage focus area;
• whether one access modality might inherit certain privileges and capabilities considered safe for one group but risky for other groups; if so, how the problem of "leakage" might be addressed;
• in the event of an unintended "leakage" (i.e. a "leaky interface"), what the implications of the exposed insights, results and capabilities might be.

Overview. In the following we try to analyze and contextualize the current known facts surrounding the OpenAI ChatGPT4 (CG4) system.

• CG4 – What is it: we offer a summary of how OpenAI describes it; put simply, what is CG4?
• Impressions: our focus then moves to examine what some voices of concern are saying;
• Risks and Impact: we shift focus to the ways we expect it to be used, either constructively or maliciously; here we focus on how CG4 might be used in expected and unexpected ways.

Fundamentals. Starting with the basics, here are links to videos that explain how a neural network learns. From 3Blue1Brown:

• Neural_Network_Basics
• Gradient Descent
• Back Propagation, intuitively, what is going on?
• Back Propagation Theory

CG4 – What is it: CG4 is a narrow artificial intelligence system based upon what is known as a Generative Pre-trained Transformer. According to Wikipedia: Generative pre-trained transformers (GPT) are a type of Large Language Model (LLM) and a prominent framework for generative artificial intelligence. The first GPT was introduced in 2018 by the American artificial intelligence (AI) organization OpenAI.

GPT models are artificial neural networks that are based on the transformer architecture, pretrained on large data sets of unlabeled text, and able to generate novel human-like content. As of 2023, most LLMs have these characteristics and are sometimes referred to broadly as GPTs. Generative pre-trained language models are fundamentally prediction algorithms: they attempt to predict the next token or element of an input from the previous or some prior element. An illustrative video describes how the prediction process works. Google Search attempts to predict what a person is about to type; generative pre-trained language models attempt to do the same thing, but they require a very large corpus of language to work with in order to arrive at a high probability that they have made the right prediction.
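The "predict what comes next" idea can be illustrated with a toy sketch. The snippet below is only an illustration, with an invented corpus and function names: it predicts the next word from simple bigram counts, whereas a real GPT learns this mapping with a trained neural network rather than a lookup table.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count how often each word follows each other word in the corpus."""
    counts = defaultdict(Counter)
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the continuation seen most often after `word`, or None."""
    if word not in counts:
        return None
    return counts[word].most_common(1)[0][0]

corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)
print(predict_next(model, "the"))  # "cat": it follows "the" twice, "mat" once
```

The same principle, scaled up to billions of parameters and trained on internet-sized corpora, is what makes the predictions feel fluent rather than mechanical.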

GooglePredict.jpg Large Language Models attempt to predict the next token, or word fragment, from an input text. In part one the narrator describes how an input is transformed using a neural network to predict an output. In the case of language models the prediction process attempts to predict what should come next based upon the word or token that has just been processed. However, in order to generate accurate predictions, very large bodies of text are required to pre-train the model.

• Part One. In this video the narrator describes how words are used to predict subsequent words in an input text.
• Part Two. Here the narrator expands on how the transformer network is constructed by combining the next-word network with the attention network to create context vectors that use various weightings to attempt to arrive at a meaningful result.

Note: this is a more detailed explanation of how a transformer is constructed. It details how each term in an input text is encoded using a context vector; the narrator then explains how the attention network passes the set of context vectors associated with each word or token to the next-word prediction network to attempt to match the input with the closest matching output text.

Transformer.png Generative pre-trained transformers are implemented using a deep learning neural network topology. This means that they have an input layer, a set of hidden layers and an output layer. With more hidden layers, the capability of the deep learning system increases. The number of hidden layers in CG4 has not been disclosed but is speculated to be very large. A generic example of how hidden layers are implemented can be seen as follows.
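The input/hidden/output structure just described can be sketched in a few lines. This is a minimal, generic illustration (layer sizes and names are invented for the example, and real systems learn their weights rather than drawing them at random): data enters at the input layer, passes through each hidden layer with a nonlinearity, and emerges from the output layer.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    """Nonlinearity applied inside each hidden layer."""
    return np.maximum(0.0, x)

def forward(x, weights):
    """Send the input through every hidden layer, then the output layer."""
    for W in weights[:-1]:        # hidden layers
        x = relu(W @ x)
    return weights[-1] @ x        # output layer, left linear here

# input layer of 4 units, two hidden layers of 8 units, output layer of 2
sizes = [4, 8, 8, 2]
weights = [rng.normal(size=(m, n)) for n, m in zip(sizes, sizes[1:])]
y = forward(rng.normal(size=4), weights)
print(y.shape)  # (2,)
```

Adding entries to `sizes` adds hidden layers; the speculation about CG4 amounts to saying that its equivalent of this list is enormously long and wide.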

The Generative Pre-trained Transformer accepts some text as input. It then attempts to predict the next word in order, based upon this input, to generate an output. It has been trained on a massive corpus of text on which it bases its predictions. The basics of how tokenization is done can be found here.


Tokenization is the process of mapping words or word fragments to their positions in the input text. The training step enables a deep neural network to learn language structures and patterns. The neural network is then fine-tuned for improved performance. In the case of CG4 the size of the corpus of text used for training has not been revealed; the model itself is rumored to have over one trillion parameters.
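A toy version of this mapping is easy to sketch. The fragment vocabulary and the greedy longest-match rule below are invented for illustration; production tokenizers use learned schemes such as byte-pair encoding, but the output has the same shape: fragments paired with their positions in the sequence.

```python
def tokenize(text, vocab):
    """Greedily split `text` into the longest fragments found in `vocab`,
    recording each token together with its position in the sequence.
    Unknown single characters become tokens of their own."""
    tokens, i, pos = [], 0, 0
    while i < len(text):
        for j in range(len(text), i, -1):      # try the longest match first
            piece = text[i:j]
            if piece in vocab or j == i + 1:
                tokens.append((piece, pos))
                i, pos = j, pos + 1
                break
    return tokens

vocab = {"token", "ization", "un", "related"}
print(tokenize("tokenization", vocab))  # [('token', 0), ('ization', 1)]
```

Note how "tokenization" splits into two fragments: this is why a token can be a whole word or only part of one.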

Tokens00.png

These models work by accepting text as input and assigning several parameters to each token that is created: the token itself, which can be a whole word or part of a word, and the position of that word or word fragment in the text. The Graphics in Five Minutes channel provides a very concise description of how words are converted to tokens and how tokens are then used to make predictions.

  • Transformers (basics, BERT, GPT)[1] This is a lengthy and very detailed explanation of the BERT and GPT transformer models for those interested in specific details.
  • Words and Tokens This video provides a general and basic explanation of how words or tokens are predicted using the large language model.
  • Context Vectors, Prediction and Attention. In this video the narrator expands upon how words and tokens are mapped to input text positions; it is an excellent description of how words are assigned probabilities. Based upon the probability of word frequency, an expectation can be computed that predicts what the next word will be.
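The attention mechanism the videos describe can be condensed into a short sketch. This is the scaled dot-product attention of the "Attention Is All You Need" paper in its simplest single-head form; the random vectors stand in for learned context vectors and are invented for the example.

```python
import numpy as np

def softmax(x):
    """Turn raw scores into probabilities along the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: each output row mixes the value
    vectors, weighted by how strongly that token's query matches every
    token's key."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = softmax(scores)      # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(1)
Q, K, V = (rng.normal(size=(3, 4)) for _ in range(3))  # 3 tokens, 4-dim
out, w = attention(Q, K, V)
print(out.shape)  # (3, 4)
```

The softmax rows are exactly the "various weightings" mentioned above: a probability distribution over which earlier tokens each token should pay attention to.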

DeepLearning.jpg (image source: IBM). Hidden Layers

ChatGPT4 is a Large Language Model system. Informal assessments suggest that it has over one trillion parameters, but these suspicions have not been confirmed. If this speculation is true then CG4 is the largest large language model to date. According to Wikipedia: A Large Language Model (LLM - Wikipedia) is a Language Model consisting of a Neural Network with many parameters (typically billions of weights or more), trained on large quantities of unlabeled text using Self-Supervised Learning or Semi-Supervised Learning. LLMs emerged around 2018 and perform well at a wide variety of tasks. This has shifted the focus of Natural Language Processing research away from the previous paradigm of training specialized supervised models for specific tasks.

It uses what is known as the Transformer Model. The Turing site also offers useful insight into how the transformer model constructs a response from an input. Because the topic is highly technical we leave it to the interested reader to examine the detailed processing steps.

The transformer model is a neural network that learns context and understanding as a result of sequential data analysis. The mechanics of how a transformer model works are beyond the technical scope of this summary, but a good summary can be found here.

If we use the associated diagram as a reference model, we can see that when we migrate to a deep learning model with a large number of hidden layers, the capability of the deep learning neural network escalates. Examining the facial images at the bottom of the diagram closely, we can see a number of faces. Included in the diagram is a blow-up of a selected feature from one of the faces; in this case it comes from the image of George Washington. With a deep learning system of billions to hundreds of billions of parameters, we should expect the model to discern extremely fine detail in recognition tasks. Which is in fact exactly what happens.


We can see in this diagram the main processing steps that take place in the transformer. The two main processing cycles are encoder processing and decoder processing. As this is a fairly technical discussion we will defer examination of the internal processing actions to a later iteration.
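Although we defer the encoder and decoder internals, the outermost loop of generation is simple to sketch. In this toy illustration (the function names and the lookup table are invented; a real decoder is a neural network) the model is asked for the most likely next token, the token is appended, and the longer sequence is fed back in until a stop token appears. This is known as greedy decoding.

```python
def greedy_decode(predict, prompt, max_new=5, stop="<end>"):
    """Repeatedly ask the model for the most likely next token and append
    it, feeding the growing sequence back in until a stop token appears."""
    seq = list(prompt)
    for _ in range(max_new):
        nxt = predict(seq)
        if nxt == stop:
            break
        seq.append(nxt)
    return seq

# stand-in for a trained decoder: a fixed lookup keyed on the last token
table = {"the": "cat", "cat": "sat", "sat": "<end>"}
print(greedy_decode(lambda seq: table.get(seq[-1], "<end>"), ["the"]))
```

Production systems usually sample from the probability distribution instead of always taking the single top token, which is why the same prompt can yield different responses.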


Transformer00.png The following references offer an overview of the basic steps taken to train and fine-tune a GPT system.

"Attention is all you need" Transformer model: processing

Training and Inferencing a Neural Network

Fine Tuning GPT

General Fine Tuning

An Overview. If we step back for a moment and summarize what some observers have had to say about this new capability, then we might tentatively start with the following. CG4:

• is based upon and is a refinement of its predecessor, the ChatGPT 3.5 system;
• has been developed using the generative pre-trained transformer (GPT) model;
• has been trained on a very large data set including textual material that can be found on the internet; unconfirmed rumors suggest one trillion parameters;
• is capable of sustaining conversational interaction using text-based input provided by a user;
• can provide contextually relevant and consistent responses;
• can link topics in a chronologically consistent manner and refer back to them in current prompt requests;
• is a Large Language Model that uses prediction as the basis of its actions;
• uses deep learning neural networks and very large training data sets;
• uses a SaaS model, like Google Search, YouTube or Morningstar Financial.

Some Early Impressions

• possesses no consciousness, sentience, intentionality, motivation or self-reflectivity;
• is a narrow artificial intelligence;
• is available to a worldwide 24/7 audience;
• can write, debug, correct and provide explanatory documentation for code;
• can explain its responses;
• can write music and poems;
• can translate English text to other languages;
• can summarize convoluted documents or stories;
• can score at the 90% level on the SAT, Bar and Medical exams;
• can provide answers to homework, and critique and improve its own responses;
• can provide explanations to difficult abstract questions;
• can calibrate its response style to resemble known news presenters or narrators;
• provides convincingly accurate responses to Turing Test questions.

As awareness of how extensive CG4's capabilities are came to light, several common impressions were articulated. Some that seemed to resonate were favorable, but in many instances the impressions were less so. Following are a few of those that can be seen leading many discussions of the system and its capabilities.

Favorable.

• Convincingly human: has demonstrated performance that suggests it can pass the Turing Test;
• Possible AGI precursor: a CG4 derivative such as a CG5 could exhibit artificial general intelligence (AGI) capability;
• Emergent capabilities: recent experiments with multi-agent systems demonstrate unexpected skills;
• Language skills: is capable of responding in one hundred languages;
• Real world: is capable of reasoning about spatial relationships and performing mathematical reasoning.

Concerns.

• Knowledge gaps: inability to provide meaningful or intelligent responses on certain topics;
• Deception: might be capable of evading human control, replicating, and devising an independent agenda to pursue;
• Intentionality: possibility of agenda-driven actions hazardous or inimical to human welfare;
• Economic disruption: places jobs at risk because it can now perform some tasks previously defined within a job description;
• Emergence: unforeseen, possibly latent capabilities;
• "Hallucinations": solutions and answers not grounded in the real world.

Contemporaneous with the impressions that led many discussions were expressions of concern that this new capability brought inherent risks. In discussions of risk, three main categories emerged that received much public attention. These broke down into possible new ways the system could be used for malicious purposes, while other discussions focused on more theoretical risks.

In other words, things that might become possible when using this tool. As with any new technological development, there were necessarily other risks that did not fall into either category, i.e. neither deliberate malicious use nor possible or imagined uses that could represent either a benefit or a risk to society or various elements of it.

These might be considered more systemic risks: risks that arise innately as a result of the use or adoption of a new technology. A case in point is the risk of traffic accidents when automobiles began to proliferate. Prior to their presence there was no systematized and government-sanctioned method of traffic management and control.

One had to face the risk of dealing with what were often very chaotic traffic conditions. Only after unregulated traffic behavior became recognized as a problem did various civil authorities impose controls on how automobile operators could operate.

Going further, as private ownership of automobiles increased, vehicle identification and registration became common practice. Further still, automobile operators became obliged to meet certain basic standards of operating competence and pass exams that verified it.

The impetus to regulate a new technology recurs in most cases where that technology can be used positively or negatively. Operating an aircraft requires considerable academic and practical, hands-on training.

Only after the minimum training that the civil authorities demand can a prospective pilot apply for a pilot's license. We see the same thing with operators of heavy equipment such as long-haul trucks, road repair vehicles and comparable specialized equipment.

Anyone familiar with recent events in the US and in various European countries will be aware that private vehicles have been used with malicious intent, resulting in severe injury and death to innocent bystanders and pedestrians. We recognize, in other words, that even though powered vehicles such as cars or trucks require licensing and carry usage restrictions, they have still been repurposed as weapons.

Recent Reactions. Since the most recent artificial intelligence systems swept over the public consciousness, sentiment has begun to crystallize along three primary paths: the voice of caution, the voice of action and the voice of preemption.

Columbus00.jpg The Voice of Caution. Elon Musk and over a thousand knowledgeable industry observers and participants signed an open letter urging caution and regulation in the rapidly advancing areas of artificial intelligence. They voiced concern that the risks were very high for the creation and dissemination of false or otherwise misleading information that can incite various elements of society to action. They also expressed concern that the rapid pace of adoption of artificial intelligence tools could very quickly lead to job losses across a broad cross-section of currently employed individuals. This topic is proving dynamic and fast-changing; it therefore merits regular review for the most recent insights and developments.

{20230718: Somehow the wording is not quite expressing the risks of doing nothing when faced with a relentless adversary such as the CCP. Make this part more focused: going up against the CCP after having stopped AI progress would be like the Santa Maria going up against the Gerald Ford aircraft carrier; moreover, worse, the CCP will come at the West like King Kong going after an ant hill.}


The Voice of Action. In recent weeks and months various sources have signaled that the use of this new technology will have a substantial beneficial impact on their activities and outcomes; they therefore wish to see it advance as quickly as possible. Not doing so would place the advances made in the West and the US at risk, which could mean forgoing the most capable system possible for dealing with external threats such as those posed by the CCP.

The Voice of Preemption. Those familiar with geopolitics and national security hold the position that any form of pause would be suicidal for the US, because known competitors and adversaries would race ahead to advance their artificial intelligence capabilities at warp speed, in the process obviating anything that the leading companies in the West might accomplish. They argue that any kind of pause cannot even be contemplated.

Ford00.jpg Some voices go so far as to point out that deep learning systems require large data sets for training. They note that the PRC has a population of roughly 1.4 billion people, and a recent report indicates that in 2023 there were 827 million WeChat users in the PRC.

Further, the PRC makes use of data collection systems such as WeChat and Tencent, systems that capture messaging information from hundreds of millions of PRC residents on any given day, along with a comparably large amount of financial and related information. The result is that the PRC has an almost bottomless sea of data to work with when training its deep learning systems. Furthermore, it has no legislation focused on privacy. The major artificial intelligence efforts there thus benefit from a totally unrestricted pathway to developing the most advanced and sophisticated artificial intelligence systems on the planet, and in very short time frames.

Viewed from the national security perspective, it is clear that an adversary capable of advancing the breadth and sophistication of an artificial intelligence tool such as ChatGPT4 or DeepMind's systems will have an overarching advantage over a power that is almost an order of magnitude smaller in population and faces innumerable legislative restrictions and roadblocks to continued research and development.

The Voice of King Kong. A studied scrutiny of the available reports suggests that there is very little clear awareness of how quantum computing will be used in the area of artificial intelligence. Simply put, very little attention seems to be focused on the advent of quantum computing and how it will impact artificial intelligence progress and capabilities.

Kingkong00.jpg What has been reported is that even with the very limited quantum computing capabilities currently available, these systems have proven orders of magnitude faster than even the most powerful classical supercomputer ensembles, at least on certain narrowly defined problems.

If we take a short step into the near-term future, we may be obliged to assimilate and rationalize developments happening on a daily basis, any or all of which could have transformative implications. The upshot is that as quantum computing becomes more prevalent, the field of deep learning will take another "quantum leap" forward, literally, putting those who possess it at an incalculable advantage. It would be like going after King Kong with a Spad S. firing BB pellets.

The central problem in these most recent developments arises from the well-recognized human inability to process changes that happen in a nonlinear fashion. If change is introduced relatively linearly, at a slow to moderate pace, most humans are able to adapt and accommodate it. But when change happens geometrically, as we see in the areas of deep learning, it is much more difficult to adapt.
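The gap between linear and geometric change is easy to quantify with toy arithmetic (the starting value and rates here are invented purely for illustration): a process that adds a fixed amount each period is soon dwarfed by one that multiplies each period.

```python
def linear(start, step, n):
    """Value after n periods of constant additive growth."""
    return start + step * n

def exponential(start, factor, n):
    """Value after n periods of repeated multiplicative growth."""
    return start * factor ** n

# ten periods of steady +10 growth versus ten doublings from the same start
for n in (1, 5, 10):
    print(n, linear(10, 10, n), exponential(10, 2, n))
```

After ten periods the additive process has merely climbed from 10 to 110, while the doubling process has reached 10240, which is the intuition gap the paragraph above describes.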


20230730 1545
SECTION 01: INTRODUCTION

The following tract takes a brief look at the technology known as Large Language Models. Our focus is on the OpenAI product known as ChatGPT 4 (CG4). The intent is to examine some central questions as to how we might expect it to impact social, political, economic and personal discourse. We begin by posing a small set of questions as to how we may expect new risks to emerge as a result of this technology. The questions briefly focus on:

• Interfacing: how various user groups and communities will interact with CG4 and its brethren, and how various communities might respond to it and to each other;
• Political: how various groups might be able to use this technology to shift public sentiment in various directions;
• Evolutionary: here we offer some thoughts on how adept groups might gain advantage relative to each other by having access to this new capability;
• Epistemological: these questions spotlight how we might expect this new capability to be used where different social groups or communities with deeply held values and positions are at variance with each other; then, in a broader context, how to expect this new capability to manifest its behavior in one society with its values relative to another society with different values.

By way of responding to these questions we approach the topic as follows. Impressions: we note that the range of impressions has varied over time; earlier impressions expressed guardedly favorable sentiment, while later impressions have been sounding alarms over the risks that this new reality can bring. Risks: to speak to the issue of risks we suggest a set of risk types, namely systemic, malicious and theoretical; we take each in turn and offer examples of each. Our observations so far suggest that we should expect a historical repeat of how societies respond to new and potentially disruptive scientific or technological developments. Discussion: we conclude with a section that offers some speculations on how this new technology can be expected to be applied. As of this writing our belief is that CG4 and related technology intrinsically offers both positive and negative outcomes; it is inherently a dual-use technology and we should prepare ourselves for surprises and disruptions. Questions. Interface: presented above as a multifaceted question, its focus is on the risks associated with how target audiences access and use the system. Political.
Here we examine a few issues that can arise within the political sphere regarding how CG4 might impact existing political processes and dynamics. Evolutionary. Our focus at this point is to consider how, or whether, there might be selective developmental or evolutionary processes at work that might preference some communities relative to others. Epistemological. We then take a few instances of how systems such as CG4 might respond under various conditions, specifically conditions wherein the social and cultural values of one population might prescribe one course of action instead of another, or, looked at from a larger perspective, where one society's values may be at variance with another's.

