Complex text is used in two main areas, legislation and Defence specifications, and, to a lesser extent, in V&V. Both main areas are vulnerable because the people charged with administering the documents do not understand them well enough. Humans have a Four Pieces Limit – that is, they cannot keep more than four things “live” in their heads. Everything else is clumped into a constant, and the effect of changing any of the live things on those things held constant is ignored (even though that effect would probably bounce back and change one of the other live things). Add in the limited time for training (experts in avionics have little understanding of legal niceties), and we have the current situation. A large complex document is only understood in part by those entrusted to create or administer it, and there is minimal overlap between the different parts that are understood. Without more appropriate tools, costly mistakes are inevitable.
[Figure: in greatly simplified form, the areas missed when people stick to their specialty]
What can be done about it? We are suggesting that the complex text not be stored just as text, but rather in a densely annotated form, which can be activated. Many words have multiple meanings – the annotations show the particular meaning the word has at that position in the text, helping someone who is not an expert to understand the meaning of that part of the document. The dense annotation of the document is not just more words, but can also be used to find errors or gaps – it becomes a piece of machinery.
We are using a vocabulary of around 50,000 words, with around 100,000 definitions. There are also about 10,000 wordgroups (groups of words in general use – ambient temperature, a bridge too far, gravitational field, and groups of words specific to the particular document – Director General of National Intelligence, entrusted public official). The number of definitions per word or wordgroup ranges from one (about 10,000 words) to 72 for “set” and 82 for “run”.
“Set” is a good example – a chess set, a movie set, a movie set in Hawaii, a set of tennis, the rain set in, he is set in his ways – many meanings and collocations – he set off on a journey, he set off a bomb. Each new document that is read will typically have a few words that need to be added (a few minutes work).
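The idea of a word carrying its particular meaning at a particular position can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny DICTIONARY, the wording of its definitions, and the names AnnotatedWord and annotate are all hypothetical, and the senses are chosen by hand rather than worked out from the surrounding words.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical mini-dictionary: word -> list of (part of speech, definition).
# The wording of each definition is illustrative only.
DICTIONARY: Dict[str, List[Tuple[str, str]]] = {
    "set": [
        ("noun", "a group of things that belong together"),
        ("noun", "the scenery and props used for filming"),
        ("verb", "to place a story in a particular location"),
    ],
}


@dataclass(frozen=True)
class AnnotatedWord:
    """A word in the text plus the sense it has at this position."""
    text: str
    position: int
    sense_index: Optional[int] = None  # None means "not annotated"

    def part_of_speech(self) -> Optional[str]:
        if self.sense_index is None:
            return None
        return DICTIONARY[self.text][self.sense_index][0]

    def definition(self) -> Optional[str]:
        if self.sense_index is None:
            return None
        return DICTIONARY[self.text][self.sense_index][1]


def annotate(sentence: str, chosen_senses: Dict[int, int]) -> List[AnnotatedWord]:
    """Attach a hand-chosen sense index to the words at the given positions."""
    words = sentence.lower().split()
    return [AnnotatedWord(w, i, chosen_senses.get(i)) for i, w in enumerate(words)]


# The same word at the same position in two phrases, with different senses.
chess = annotate("a chess set", {2: 0})
movie = annotate("a movie set in Hawaii", {2: 2})
print(chess[2].part_of_speech(), "-", chess[2].definition())
print(movie[2].part_of_speech(), "-", movie[2].definition())
```

The point of the sketch is that the annotation lives with the word's position in the text, so a reader who is not an expert can ask what “set” means here without having to work it out again.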
Simple, common words usually have the most definitions – “about” has three parts of speech – adverb, preposition and used as a verb form – with a total of twenty-two definitions. They sound easy, but there is often a sting in the tail – “do something about something” is defined as “do something so as to affect something” – so a long and uncertain causal chain can be dropped into a statement just by using “about”.
Many words have several parts of speech, with a word being both a noun and a verb as the most common. Some words can have up to six parts of speech (off, up, double).
The dictionary uses an additional layer, called Subcategorisation. A verb may have several different forms, recognisable in the text – transitive, intransitive, ditransitive, infinitive – about a hundred all told.
Some words can be adverbs and prepositions – he turned on the light, he turned on a dime. Even some nouns can fill multiple roles: a noun supporting a clause (the idea that the world is flat) or an infinitive (the need to eat), a noun with both a singular form and an aggregate form (a nickel and the metal nickel), or a noun whose plural form is the same as the singular form (fish, aircraft).
Some words carry a very powerful effect (we will be using examples from Anti-Money Laundering Legislation).
if the person does so reckless as to whether:
- the other person is a shell bank;
Words like “reckless” (AML) or “best efforts” (Robodebt) can have far-reaching effects, or can be ignored with disastrous consequences because there is no definition of what to do. A hidden but accessible annotation can change that.
Many words can be treated as independent – “a large black car” – the adjectives are addressing different attributes of the car. Other words need to be clumped into an object before that object is operated on by other adjectives or prepositional phrases.
Some adjectives clump better with the noun alone; others clump better after the prepositional phrase has been clumped with the noun.
an (indigenous person) of Hawaii
a sudden (loss of consciousness)
One definition of “plane” – an imaginary flat surface of unlimited extent.
The “flat surface of unlimited extent” is what is imaginary, so the adjective, object and prepositional phrase need to be clumped into a single object before “imaginary” is applied.
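The clumping examples above can be written down as nested objects. The following is a minimal sketch, assuming a hypothetical Clump class invented for illustration; the grouping is entered by hand, and nothing here decides which clumping is correct.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Clump:
    """A group of words (or smaller clumps) treated as a single object."""
    parts: List[Union[str, "Clump"]]

    def words(self) -> List[str]:
        """Flatten the clump back into its words, in text order."""
        out: List[str] = []
        for part in self.parts:
            out.extend(part.words() if isinstance(part, Clump) else [part])
        return out

    def bracketed(self) -> str:
        """Show the clumping with brackets around every clump."""
        inner = " ".join(
            part.bracketed() if isinstance(part, Clump) else part
            for part in self.parts
        )
        return f"({inner})"


# The adjective clumps with the noun first, then the prepositional phrase is added.
indigenous = Clump([Clump(["indigenous", "person"]), "of", "Hawaii"])

# The noun and prepositional phrase clump first, then the adjective operates on the clump.
sudden = Clump(["sudden", Clump(["loss", "of", "consciousness"])])
plane = Clump(["imaginary", Clump(["flat", "surface", "of", "unlimited", "extent"])])

for clump in (indigenous, sudden, plane):
    print(clump.bracketed())
```

Once the clumping is stored this way it does not need to be rediscovered by each reader, and the object that “imaginary” operates on is available directly as the inner clump.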
The subject of a prepositional phrase can be particularly difficult to determine in complex text. “He put the money on the table” seems straightforward, with a latch to keep the money on the table when “put” has completed, but “He put the money on the table in his office into his bank account” is more difficult. In complex text, prepositional phrases can be far more complicated than that, with a preposition jumping over a chain of prepositions preceding it, and with a severe cost penalty if someone gets it wrong. In “He put the money from the bank in Fresno on the table in his office, with a note explaining its provenance”, the “on” jumps back to the “money”. The link from the “its” to what it is referring to can be worked out each time the text is read, or it can be saved and shown on demand, as can all the other links and clumping.
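Saving the links so they can be shown on demand might look something like the sketch below. The data layout and the function names are assumptions made for illustration. Only the “on” to “money” link is stated in the text; the other links (including “its” referring to the money) are one plausible reading entered by hand.

```python
from typing import Dict, List

words: List[str] = [
    "He", "put", "the", "money", "from", "the", "bank", "in", "Fresno",
    "on", "the", "table", "in", "his", "office",
    "with", "a", "note", "explaining", "its", "provenance",
]

PREPOSITIONS = {"from", "in", "on", "with"}

# Saved links: index of a preposition or pronoun -> index of the word it links back to.
links: Dict[int, int] = {
    4: 3,    # "from" -> "money"  (assumed)
    7: 6,    # "in"   -> "bank"   (assumed)
    9: 3,    # "on"   -> "money"  (stated in the text)
    12: 11,  # "in"   -> "table"  (assumed)
    19: 3,   # "its"  -> "money"  (assumed)
}


def describe(index: int) -> str:
    """Show a saved link on demand."""
    target = links[index]
    return f'"{words[index]}" (word {index}) links back to "{words[target]}" (word {target})'


def jumped_prepositions(index: int) -> List[str]:
    """List the prepositions lying between a word and the word it links back to."""
    target = links[index]
    return [words[i] for i in range(target + 1, index) if words[i] in PREPOSITIONS]


print(describe(9))
print(jumped_prepositions(9))
print(describe(19))
```

Here the “on” has to jump over “from” and “in” to reach “money”. With the link stored, that jump is made once and can be checked by anyone, rather than being remade (possibly differently) by each reader.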
Complex text can start out fairly simple, and then have conditions imposed on it, and conditions on the conditions, so it becomes very difficult to see a clear line to an object. The effort to define something exactly becomes largely self-defeating, given the same thing is happening to the objects around it.
People may have spent considerable time familiarising themselves with a piece of legislation or specification. Then small changes are made – they seem small – not enough to trigger refamiliarisation – but over time change the structure of what is described in the text. The legislation or specification can be run as a simulation to assess the effect of the changes.
An English Lesson?
We are attempting to show just how complex the text of legislation or specifications can be.
There is a deeper point. The words are no longer marks on paper or on a screen, to be handled by the reader’s unconscious mind, but objects in an undirected network which can be activated, and display their expected behaviour (in an abstract sense). That is, inconsistencies and incoherence can be found, using the words and clumps of words as pieces of machinery which can radiate what they do, or operate on what other words radiate.
Words in a complex text document can carry their (not immediately visible) specific meaning, based on the words around them. The result is a much denser structure.
Here is the expanded definition of imprudent (“not prudent” would have been easier, but the definitions are a bit different):
Some more examples.
The verb Pay supports a BiTransInfinitive form: (of person) pay (someone) (money) (to do something) with the subject of the infinitive verb depending on the verb.
The Paid relation puts out a logical to activate ToMow.
John is expected to mow the lawn. Fred’s mower is faulty, and John is injured.
The subject of the infinitive relation shifts from John to Fred as the verb changes. This is an example of where meaning is crucial to understanding.
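The dependence of the infinitive's subject on the main verb can be illustrated with a small lookup. This is a hedged sketch: the table and function names are hypothetical, the sentence “Fred paid John to mow the lawn” is an assumed instance of the pay form above, and “promise” is chosen here as a contrasting verb because it is a standard example where the subject of the infinitive is the subject of the main verb. It is not necessarily the verb used in the original example.

```python
from typing import Dict

# Which argument of the main verb supplies the subject of the infinitive.
# "pay": the person paid does the mowing. "promise": the person promising does it.
# The "promise" entry is an assumption added for contrast.
INFINITIVE_SUBJECT: Dict[str, str] = {
    "pay": "object",
    "promise": "subject",
}


def infinitive_subject(verb: str, subject: str, obj: str) -> str:
    """Return who performs the infinitive action for '<subject> <verb> <obj> to mow the lawn'."""
    role = INFINITIVE_SUBJECT[verb]
    return subject if role == "subject" else obj


# "Fred paid John to mow the lawn": John is expected to mow.
print(infinitive_subject("pay", "Fred", "John"))
# "Fred promised John to mow the lawn": Fred is expected to mow.
print(infinitive_subject("promise", "Fred", "John"))
```

The sentence shape is identical in both cases, so only the meaning of the verb tells us who is expected to mow, and therefore who is involved if the mower turns out to be faulty.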
Why do we need to go into such detail?
Because you, and everyone else who reads a piece of complex text, are going into deep detail in the Unconscious Mind. If all our Unconscious Minds were exactly the same, and our interests and training were the same, it would all work well. But our understanding of words, and our interests, are not the same. There will be differences in understanding, and some areas of the complex text will seem unfathomable or irrelevant and be ignored.
(Aside – we are far from being able to fully emulate the Unconscious Mind – see the article on Lane Following).
The obvious solution is to use a machine to do the dog work. It needs a large vocabulary, the ability to clump words into objects, and the analytic ability to find errors and gaps. It has to closely emulate the workings of the Unconscious Mind, while operating at a level where we can see what is happening.
The benefit is that we have an English Language interface to a machine, and can describe our problem to it. It can’t do everything, but it can hold a large and complex piece of legislation or specification in its head, with everything connected – something ten or a hundred pages away in the text is only a link away in the network.
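The claim that something a hundred pages away is only a link away can be shown with a toy undirected network. This is a minimal sketch under stated assumptions: the Network class, the node names and the page numbers are all invented for illustration and do not come from any real document.

```python
from collections import deque
from typing import Dict, Optional, Set


class Network:
    """A toy undirected network of objects taken from a document."""

    def __init__(self) -> None:
        self.page: Dict[str, int] = {}
        self.links: Dict[str, Set[str]] = {}

    def add_node(self, name: str, page: int) -> None:
        self.page[name] = page
        self.links.setdefault(name, set())

    def link(self, a: str, b: str) -> None:
        # Undirected: the link can be followed from either end.
        self.links[a].add(b)
        self.links[b].add(a)

    def hops(self, start: str, goal: str) -> Optional[int]:
        """Breadth-first search for the number of links between two nodes."""
        seen = {start}
        queue = deque([(start, 0)])
        while queue:
            node, distance = queue.popleft()
            if node == goal:
                return distance
            for neighbour in self.links[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append((neighbour, distance + 1))
        return None  # not connected


net = Network()
net.add_node("definition of shell bank", page=3)    # hypothetical page number
net.add_node("offence mentioning shell bank", page=97)  # hypothetical page number
net.link("definition of shell bank", "offence mentioning shell bank")

pages_apart = abs(net.page["offence mentioning shell bank"] - net.page["definition of shell bank"])
links_apart = net.hops("offence mentioning shell bank", "definition of shell bank")
print(pages_apart, "pages apart,", links_apart, "link apart")
```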
Comparison With Generative AI
Unlike Generative AI, AGI is intended to show things exactly as they are – Right and Wrong. The initial legislation or specification is likely to have mistakes, inconsistencies, incoherence and gaps. When these have been removed, the annotated structure continues as a guide to help people understand what the specification is saying. It can be activated to show, in an abstract sense, what the specification describes. If masses are added for dynamic objects, it can work out how much energy needs to be dumped to stop an object travelling at 100 km/hour.
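As a worked example of the energy calculation, the energy to be dumped in stopping is the kinetic energy, one half of the mass times the square of the speed. The sketch below assumes a mass of 1500 kg, which is an invented figure for illustration; the function name is also hypothetical.

```python
def energy_to_stop(mass_kg: float, speed_km_per_hour: float) -> float:
    """Kinetic energy in joules that must be dissipated to bring the object to rest."""
    speed_m_per_s = speed_km_per_hour / 3.6  # 1 m/s = 3.6 km/hour
    return 0.5 * mass_kg * speed_m_per_s ** 2


# Assumed mass of 1500 kg, stopping from 100 km/hour.
energy_joules = energy_to_stop(1500.0, 100.0)
print(f"{energy_joules / 1000:.0f} kJ")  # about 579 kJ
```

Because the energy depends on the square of the speed, halving the speed to 50 km/hour cuts the energy to be dumped to a quarter.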
Active Structure as an AGI tool makes no attempt to cobble together pieces of text without knowing their meaning.
Complex objects and operations need complex text to describe them, with the unfortunate result that many people who are expert in a particular area will understand little of the text in the specification that surrounds their specialty. This can lead to monumental problems.
We are suggesting that the words in the text also be turned into pieces of machinery that represent what the words do or stand for. Then people can collaborate more effectively, either by seeing an underlying description that they can understand, or seeing how parts of what is described would work.
Interactive Engineering Pty Ltd