In workshops on semantics, I’ve used as an example the conversation between Lewis Carroll’s White Knight and Alice in Through the Looking Glass, the subject of which is “a song.” It illustrates the challenges of information semantics that lie ahead: in that conversation, we find (1) the name of the song, (2) what the name is called, (3) what the song is called, and (4) what the song actually is to be all quite naturally, but somewhat surprisingly, different:
Alice was walking beside the White Knight in Looking-Glass Land.
“You are sad,” the Knight said in an anxious tone: “let me sing you a song to comfort you.”
“Is it very long?” Alice asked, for she had heard a good deal of poetry that day.
“It’s long,” said the Knight, “but it’s very, very beautiful. Everybody that hears me sing it — either it brings tears to their eyes, or else –”
“Or else what?” said Alice, for the Knight had made a sudden pause.
“Or else it doesn’t, you know. The name of the song is called ‘Haddocks’ Eyes.’”
“Oh, that’s the name of the song, is it?” Alice said, trying to feel interested.
“No, you don’t understand,” the Knight said, looking a little vexed. “That’s what the name is called. The name really is ‘The Aged, Aged Man.’”
“Then I ought to have said, ‘That’s what the song is called’?” Alice corrected herself.
“No, you oughtn’t: that’s another thing. The song is called ‘Ways and Means,’ but that’s only what it’s called, you know!”
“Well, what is the song then?” said Alice, who was by this time completely bewildered.
“I was coming to that,” the Knight said. “The song really is ‘A-sitting On a Gate,’ and the tune’s my own invention.”
Through a set of Resource Description Framework (RDF) subject-predicate-object triples and the standard ontological constructs available through the Web Ontology Language (OWL) and RDF Schema (RDFS),1 we might shed some light on the White Knight’s pragmatic intent. Had Alice an ontological reference available to her in conversation with the White Knight, both parties would have found the semantic discourse to be much more precise (see Figure 1).
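As a minimal sketch of how such triples might look, the Python below models the song and its name as two separate resources, each of which can both "be" something and "be called" something. This is one possible reading of the passage, not a reproduction of Figure 1. The `ex:` vocabulary (`ex:hasName`, `ex:is`, `ex:isCalled`) and the `ask` helper are assumptions invented for illustration, and plain tuples stand in for a real RDF store.

```python
# All "ex:" identifiers are a hypothetical vocabulary invented for this sketch.
# The song and its name are separate resources, so each one can both "be"
# something and "be called" something.
TRIPLES = {
    ("ex:song", "rdf:type", "ex:Song"),
    ("ex:songName", "rdf:type", "ex:Name"),
    ("ex:song", "ex:hasName", "ex:songName"),
    ("ex:songName", "ex:isCalled", "Haddocks' Eyes"),
    ("ex:songName", "ex:is", "The Aged, Aged Man"),
    ("ex:song", "ex:isCalled", "Ways and Means"),
    ("ex:song", "ex:is", "A-sitting On a Gate"),
}


def ask(subject, predicate):
    """Return the single object matching (subject, predicate, ?)."""
    matches = [o for s, p, o in TRIPLES if s == subject and p == predicate]
    if len(matches) != 1:
        raise LookupError(
            f"expected one match for {subject} {predicate}, got {len(matches)}"
        )
    return matches[0]


# Follow the link from the song to the resource that stands for its name.
name = ask("ex:song", "ex:hasName")

answers = {
    "what the name is called": ask(name, "ex:isCalled"),
    "what the name is": ask(name, "ex:is"),
    "what the song is called": ask("ex:song", "ex:isCalled"),
    "what the song is": ask("ex:song", "ex:is"),
}

for question, answer in answers.items():
    print(f"{question}: {answer}")
```

Because the name is a resource in its own right rather than a string attached to the song, Alice's two mistaken guesses correspond to asking the right predicate of the wrong subject.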
To further illustrate the complexity of the semantic challenges ahead of us, let’s take an example from a diverse IT domain with which I’m most familiar, government services. Looking across information realms for governmental healthcare, welfare, justice, law enforcement, social services, and myriad regulatory functions, we find a wide variety of entities that we regularly struggle to manage effectively. Let’s consider the example of a person as an entity in an information system. A person who is a subject of any number of government services can also be known as an individual, a citizen, a constituent, a licensee, an operator, a provider, an actor, an accomplice, a witness, a contact for a regulated entity, a head of household, a family member, a claimant, a client, a party, an employee, a suspect, an alleged perpetrator, a victim, an offender, an inmate, or a parolee, to name just a few. From cradle to grave, personas exist for any and all states and conditions of a human being. Multiple personas per individual, typically dozens, are more often the rule than the exception, and these personas are sprinkled across dozens of information system domains with very little, if any, coupling at the level of the individual.
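One way to picture coupling at the level of the individual is to treat each persona as a class in a shared model and let a single individual carry several of them at once. The sketch below assumes a small, hypothetical class hierarchy and applies the RDFS rule that an instance of a subclass is also an instance of its superclasses. The class names, the identifier `ex:individual42`, and the single-parent dictionary are simplifying assumptions (RDFS itself allows a class to have several superclasses), and modeling personas as subclasses of a person is only one of several possible design choices.

```python
# Hypothetical persona hierarchy: child class -> parent class.
# Simplification: each class has at most one parent in this sketch.
SUBCLASS_OF = {
    "ex:Citizen": "ex:Person",
    "ex:Licensee": "ex:Person",
    "ex:Claimant": "ex:Person",
    "ex:Offender": "ex:Person",
    "ex:Inmate": "ex:Offender",
    "ex:Parolee": "ex:Offender",
}

# Personas asserted for one (hypothetical) individual, as rdf:type facts.
ASSERTED_TYPES = {
    "ex:individual42": {"ex:Citizen", "ex:Licensee", "ex:Parolee"},
}


def superclasses(cls):
    """Yield cls followed by each ancestor reached through SUBCLASS_OF."""
    while cls is not None:
        yield cls
        cls = SUBCLASS_OF.get(cls)


def inferred_types(individual):
    """Return asserted personas plus everything implied by the hierarchy."""
    types = set()
    for cls in ASSERTED_TYPES.get(individual, ()):
        types.update(superclasses(cls))
    return types


personas = inferred_types("ex:individual42")
print(sorted(personas))
```

Here the parolee persona implies the offender persona, and every persona resolves to the same person, which is the coupling that per-domain records lack.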
Whether in government or the private sector, we in IT, through necessity, have skirted the overwhelming challenges of enterprise semantics by implementing systems that are trapped within application and data silos. Case in point: large enterprises keep employee data in their HR/payroll systems, marketing leads in their sales systems, customer data in their CRM/order/billing systems, and supplier contacts in their supply chain ERP systems. Never mind that, in this example, what we seek to understand (and manage or service) are people, and that marketing leads evolve to become customers, customers can also be suppliers, employees may be our most valued customers, and a subset of our most loyal customers may eventually become employees.
So we indeed find data about people spewed across myriad disparate systems. To bring some order to the chaos (providing for services and process orchestration across multiple agencies, improved citizen services, and a promise of e-government efficiency), we apply thick layers of overly complex, expensive, and difficult-to-manage semantic duct tape. These come in the guise of interfaces; extraction, transformation, and loading (ETL); Enterprise Service Bus (ESB); and service-oriented architecture (SOA).
Using the same ontological principles applied to the White Knight’s song, can we more effectively manage information associated with people? The answer is yes. Shared semantic models, the capacity to flexibly (and independently) evolve those models, and the ability to link and extend across federated data stores irrespective of technology provide the needed semantic enterprise architecture (SEA).
Consider the fact that a government agency is the legal steward of a citizen’s official birth record. Another agency manages school enrollment and maintains academic records. Yet another agency independently registers that person’s vehicle, issues a driver’s license, and, if warranted, issues a legal notice of infractions to that driver, which may result in fines or suspension. We often find that a person’s business may be a regulated entity and will most likely be required to file government tax returns. If a person commits a serious crime, he or she enters the judicial system, is given a day in court, and may become the subject of sentencing and/or imprisonment. As medical or mental treatments or social welfare needs are identified, that person and/or that person’s family may receive specialized government services.
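The scenario above can be sketched as a handful of independent agency stores joined only by identity links. In this minimal illustration, each agency keeps its own triples under its own local identifier, and a separate link set uses `owl:sameAs`-style statements to tie those identifiers to one shared individual. Every identifier, predicate, and store name here is hypothetical, and in-memory sets stand in for real federated data stores.

```python
# Each agency keeps its own store and its own local identifier for the person.
# All identifiers and predicates are hypothetical.
STORES = {
    "vital_records": {
        ("vr:person-1001", "ex:hasBirthRecord", "vr:cert-1001"),
    },
    "education": {
        ("edu:student-77", "ex:enrolledAt", "edu:school-5"),
    },
    "motor_vehicles": {
        ("dmv:driver-9", "ex:holdsLicense", "dmv:license-123"),
        ("dmv:driver-9", "ex:registeredVehicle", "dmv:vehicle-456"),
    },
}

# Link set maintained outside the agency stores: shared individual -> local IDs.
LINKS = {
    ("ex:individual42", "owl:sameAs", "vr:person-1001"),
    ("ex:individual42", "owl:sameAs", "edu:student-77"),
    ("ex:individual42", "owl:sameAs", "dmv:driver-9"),
}


def local_ids(individual):
    """Return the agency-local identifiers linked to a shared individual."""
    return {o for s, p, o in LINKS if s == individual and p == "owl:sameAs"}


def facts_about(individual):
    """Collect (agency, predicate, object) facts from every store."""
    ids = local_ids(individual)
    found = []
    for agency, triples in STORES.items():
        for s, p, o in triples:
            if s in ids:
                found.append((agency, p, o))
    return sorted(found)


facts = facts_about("ex:individual42")
for agency, predicate, obj in facts:
    print(agency, predicate, obj)
```

The design choice worth noting is that no agency store changes: each keeps its own schema and keys, and only the link set is shared, so a model can evolve independently while still resolving to the same individual.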
Through the various scenarios described above, the individual remains a citizen and has available all applicable rights to justice, privacy, and freedom as expressed in the US Constitution and in law. To further evolve e-government services, we must fully extend these relationships (while also protecting them) across a federated data store. Through semantics and the application of SEA, we will be able to break down artificial application and data silos, improve e-government efficiency, and achieve a higher level of citizen/constituent services in the future.
1 For more information, see Hay, David C. “Semantics, Ontologies, and Data Modeling.” Cutter Consortium Business Intelligence Executive Report, Vol. 6, No. 7, 2006.