WikiISI/Concepts and Ontologies

From AI Wiki

Intelligence and Security Informatics Concepts/Ontologies

Abstract

This article gives an overview of the concepts and ontologies currently used in the field of Intelligence and Security Informatics (ISI). It examines domains such as deception detection and electronic surveillance and presents some of their foundational concepts. It also presents information exchange systems such as GJXDM and the WNV-BOT Portal, which mark the beginning of widely used ontologies in the ISI field.

Introduction

Concepts are an essential building block in Intelligence and Security Informatics (ISI). Machines cannot understand the world the way a human does, which gives rise to the need for concepts that a machine can understand and process. Such concepts are subsequently linked to one another and structured so that they form an ontology that governs the concepts in a particular domain.

Ontologies are content theories about the categories of real world objects in a given domain, their properties, and the relations between objects that are possible in the respective domain of knowledge. They provide potential terms for describing our knowledge about the domain [1]. An ontology can be thought of as a vocabulary but, more importantly, also as a conceptualization of that vocabulary. As Chandrasekaran et al. point out, the vocabulary for a given domain is only as good as the underlying conceptualizations. If we want to gain an understanding of the ISI domain, we first need to look at the general concepts inherent in that domain. After careful analysis, we can then build a vocabulary, such as the ones mentioned further below [1]. Ontologies are developed to provide semantic interoperability between different entities, such as police, hospitals, first responders, and intelligence agencies.

To that end, we will first look at subsections of ISI and then introduce some existing ontologies.

Concepts

What is a concept

According to the Merriam-Webster dictionary, a concept is 'an abstract or generic idea generalized from particular instances'. Concepts help to simplify vast amounts of information, such as the open-source information available on the Web in the ISI domain. Concepts are closely related to category formation and classification. In fact, concept mining lies at the intersection of AI and linguistics. Traditionally, subject-specific thesauri or general language models such as WordNet [2] have been used to map words to concepts. Concepts are interrelated, and previous work in Educational Sciences has focused on mapping concepts [3]. The interrelatedness of meaning between different terms constitutes a concept, and has been formalized in Semantic Networks [4]. Figure 1 shows such a semantic network at the abstract level.

Figure 1. Concepts and Relations forming a Semantic Network
Figure 2. A Concept Map

Figure 2 on its left hand side shows a list of concepts for making a concept map related to plants. This is just one possible concept map, as the nature of concept maps makes them dependent on the map creator's understanding of the subject. As the understanding grows, the concept map will change. Thus, concept map design is highly iterative, which is precisely the reason that it is such a good tool for use in educational settings. [5]

From Concepts to Ontologies

Ontologies are concepts embedded in a particular domain and linked by relationships. They provide for interoperability between different entities in a specific domain. They are content theories providing potential terms for describing knowledge of a domain [1].

The Semantic Web is conceptual in nature. The Resource Description Framework (RDF), and languages built on it such as OWL, have made concept definition and interchange much easier, thereby facilitating meaningful data interchange. An example of RDF is given below:

<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
xmlns:cd="http://www.recshop.fake/cd#"> 
<rdf:Description
rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
 <cd:artist>Bob Dylan</cd:artist>
 <cd:country>USA</cd:country>
 <cd:company>Columbia</cd:company>
 <cd:price>10.90</cd:price>
 <cd:year>1985</cd:year>
</rdf:Description>
<rdf:Description
rdf:about="http://www.recshop.fake/cd/Hide your heart">
 <cd:artist>Bonnie Tyler</cd:artist>
 <cd:country>UK</cd:country>
 <cd:company>CBS Records</cd:company>
 <cd:price>9.90</cd:price>
 <cd:year>1988</cd:year>
</rdf:Description>
.
.
.
</rdf:RDF>   

The root element rdf:RDF declares that the XML document is an RDF document, while each rdf:Description element contains the elements that describe the resource named in its rdf:about attribute (W3schools.com).
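
To make the structure concrete, the following minimal sketch reads one resource from the listing above and turns each property into a (subject, predicate, object) triple. It uses only Python's standard xml.etree.ElementTree module; the function name rdf_to_triples is a hypothetical helper, and the space in the title is percent-encoded here as an assumption so that rdf:about holds a valid URI.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# One resource from the listing above. The space in the title is
# percent-encoded here (an assumption) so that rdf:about is a valid URI.
RDF_XML = """<?xml version="1.0"?>
<rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns:cd="http://www.recshop.fake/cd#">
 <rdf:Description rdf:about="http://www.recshop.fake/cd/Empire%20Burlesque">
  <cd:artist>Bob Dylan</cd:artist>
  <cd:country>USA</cd:country>
  <cd:year>1985</cd:year>
 </rdf:Description>
</rdf:RDF>"""


def rdf_to_triples(xml_text):
    """Return (subject, predicate, object) triples for literal-valued properties."""
    root = ET.fromstring(xml_text)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree renders a qualified tag as "{namespace}local".
            namespace, local = prop.tag[1:].split("}")
            triples.append((subject, namespace + local, (prop.text or "").strip()))
    return triples


for triple in rdf_to_triples(RDF_XML):
    print(triple)
```

Each printed triple pairs the resource URI with a property URI (the cd namespace plus the element name) and its literal value, which is the statement form that RDF tools operate on.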

Applications

Non-Obvious Relationship Awareness/Identity Resolution

By linking concepts together, one can establish relationships that were previously not obvious. This technique is used by NORA (Non-Obvious Relationship Awareness), which was developed by Jeffrey Jonas for detecting cheaters in casinos. Casinos are required by law to refuse to do business with known cheaters or other flagged individuals. The vast number of people passing through the Las Vegas casinos makes it impossible for security staff to spot such individuals unaided, and a casino that fails to do so might lose its gaming license. A computerized system for recognizing these individuals was needed, and Jonas developed NORA to satisfy this need. Several problems had to be addressed in order to create a successful system. One was that casinos do not store all their data centrally: information systems such as hotel reservations, human resource management and vendors are disparate (Jonas, 2006[6]). Another was that data entries might be accidentally or intentionally misspelled, so traditional matching algorithms would fail to find a proper match. NORA addressed these issues by collecting data from all the different sources and using a complex set of rules to establish links between people who might seem unrelated. For example, Billy the Kid with SSN 555-55-4124 might seem unrelated to The Kid with SSN 555-55-2144 in traditional systems; NORA, however, is able to detect that they are the same person (Kushner 2006 [7]). Another feature of NORA is that it can flag people who are not known offenders but fit the general profile of an offender.
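
The kind of tolerant matching described above can be sketched as follows. The two rules used here (name tokens of one record contained in the other, and SSNs that use the same digits but differ in at most two positions) are assumptions chosen to fit the Billy the Kid example; they are not NORA's actual rules, which the sources cited here do not spell out.

```python
def name_tokens(name):
    """Lower-case a name and split it into a set of tokens."""
    return set(name.lower().split())


def names_compatible(a, b):
    """True if the tokens of one name are all contained in the other name."""
    ta, tb = name_tokens(a), name_tokens(b)
    return bool(ta) and bool(tb) and (ta <= tb or tb <= ta)


def digits(value):
    """Keep only the digit characters of an identifier such as an SSN."""
    return [ch for ch in value if ch.isdigit()]


def ssn_probably_same(a, b, max_mismatches=2):
    """True if two SSNs use the same digits and differ in few positions.

    This catches swapped digits, a common accidental or deliberate error.
    """
    da, db = digits(a), digits(b)
    if len(da) != len(db) or sorted(da) != sorted(db):
        return False
    mismatches = sum(1 for x, y in zip(da, db) if x != y)
    return mismatches <= max_mismatches


def likely_same_person(rec_a, rec_b):
    """Hypothetical rule: both the name and the SSN must be compatible."""
    return (names_compatible(rec_a["name"], rec_b["name"])
            and ssn_probably_same(rec_a["ssn"], rec_b["ssn"]))


billy = {"name": "Billy the Kid", "ssn": "555-55-4124"}
kid = {"name": "The Kid", "ssn": "555-55-2144"}
print(likely_same_person(billy, kid))
```

An exact string comparison would treat these two records as unrelated, whereas the tolerant rules link them. A production system would need many more rules and a way to weigh conflicting evidence.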

Figure 3 – IBM Identity resolution example

NORA's underlying code was later improved to create IBM Identity Resolution and Relationship Resolution (Jonas 2006). One of the key characteristics of IBM's Identity Resolution is that it can resolve identities in real time as new data come in, without the need to process all the data again. Figures 3a and 3b show one example of what IBM's Identity Resolution can do. Assume that two entries that seem unrelated were found in different sources (Figure 3a). In this case the resolution engine identifies the two records as different people. In Figure 3b, however, a third record has been entered into the system. This record has similarities to both of the previous records: record C has the same driver's license as record A and the same phone number as record B. Furthermore, the resolution engine is not fooled by the slight name difference (Randy and Randal have the same root). It is therefore able to link the three records together, identifying them as the same person. Relationship awareness works concurrently with the identity resolution engine and is able to update relationship degrees in real time as well. One advantage of the relationship resolution engine is that it can assign relationship scores, so that some relationships count for more than others. For example, records indicating that a person has lived at one address for ten years will give that address a greater relationship score than a different address where the person lived for only three months (Jonas 2006).
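
The incremental linking of records A, B and C can be illustrated with a small union-find structure. This is a sketch of the general idea only, not IBM's implementation: the class name IdentityResolver, the attribute keys and all sample values are hypothetical, and name matching (Randy versus Randal) is left out for brevity.

```python
from collections import defaultdict


class IdentityResolver:
    """Incrementally groups records that share an identifying attribute."""

    def __init__(self, keys=("license", "phone")):
        self.keys = keys
        self.parent = {}   # union-find forest over record ids
        self.index = {}    # (key, value) -> first record id seen with it

    def _find(self, rid):
        while self.parent[rid] != rid:
            self.parent[rid] = self.parent[self.parent[rid]]  # path halving
            rid = self.parent[rid]
        return rid

    def add(self, rid, record):
        """Add one record; earlier records are not reprocessed."""
        self.parent[rid] = rid
        for key in self.keys:
            value = record.get(key)
            if value is None:
                continue
            other = self.index.setdefault((key, value), rid)
            if other != rid:
                # Shared attribute value: merge the two identities.
                self.parent[self._find(rid)] = self._find(other)

    def entities(self):
        """Return the current groups of record ids, one group per identity."""
        groups = defaultdict(set)
        for rid in self.parent:
            groups[self._find(rid)].add(rid)
        return sorted(sorted(group) for group in groups.values())


resolver = IdentityResolver()
resolver.add("A", {"name": "Randy", "license": "D100", "phone": "555-0100"})
resolver.add("B", {"name": "Randal", "license": "D200", "phone": "555-0199"})
print(resolver.entities())   # A and B are still separate identities
resolver.add("C", {"name": "Randy", "license": "D100", "phone": "555-0199"})
print(resolver.entities())   # C links A and B into one identity
```

Because each new record is only looked up in the attribute index, adding record C merges the two existing identities without revisiting A or B, which mirrors the real-time behaviour described above.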

Ontologies

Ontologies in general

An ontology is an explicit definition of a domain which formalizes previously unstructured relationships and context. Concepts are defined within the ontology along with their properties and attributes. To structure the knowledge within a domain, constraints are applied to the properties and concepts. Within the ontology, a common vocabulary and a shared understanding exist.

Why do we need ontologies

We need ontologies to share a common understanding of the structure of information among people and systems. Such a shared understanding also enables users to discover new information. For example, consider the following two statements:

A "is a" B

B "is a" C

By utilizing the relationships and concepts defined within an ontology, we can conclude that A "is a" C. By understanding the inter-relationships among concepts and data, automated procedures can thus reason and infer new information. Ontologies also allow domain knowledge to be reused, so we do not have to re-invent the wheel every time. Furthermore, the domain knowledge in an ontology is simple to update and extend.
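
The inference from the two statements above can be written as a small transitive-closure routine. This is a minimal sketch of the reasoning step only; the function name infer_is_a is hypothetical, and real reasoners use far more efficient algorithms.

```python
def infer_is_a(facts):
    """Return the transitive closure of a set of (child, parent) "is a" facts."""
    closure = set(facts)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # a "is a" b and b "is a" d together imply a "is a" d.
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


facts = {("A", "B"), ("B", "C")}
print(sorted(infer_is_a(facts)))
```

The result contains the two stated facts plus the inferred fact that A "is a" C.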

Defining an ontology

To define an ontology, we first need to define the terms (concepts) in the domain and the relations among them. These concepts are treated as objects (classes), which allows for feature inheritance. The next step is to arrange the concepts in a hierarchy by assigning them to sub-classes and super-classes. We then define the attributes of the classes (properties), along with any constraints on their values and relations, which provides better knowledge organization. Finally, we define individuals, i.e. import the knowledge of the domain by creating instances (http://it.ojp.gov).
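
The steps above (classes, hierarchy, properties with constraints, individuals) can be sketched in a few lines. The OntologyClass type, the Person and Officer classes and their properties are all hypothetical illustrations, and the only constraint modelled is the Python type of a property value.

```python
from dataclasses import dataclass, field


@dataclass
class OntologyClass:
    name: str
    parent: "OntologyClass | None" = None
    # Property name -> required Python type (a very simple value constraint).
    properties: dict = field(default_factory=dict)

    def all_properties(self):
        """Own properties plus those inherited from super-classes."""
        inherited = self.parent.all_properties() if self.parent else {}
        return {**inherited, **self.properties}

    def create_individual(self, **values):
        """Create an instance, enforcing the constraint on each property."""
        allowed = self.all_properties()
        for prop, value in values.items():
            if prop not in allowed:
                raise ValueError(f"{self.name} has no property {prop!r}")
            if not isinstance(value, allowed[prop]):
                raise TypeError(f"{prop!r} must be {allowed[prop].__name__}")
        return {"class": self.name, **values}


# Steps 1 and 2: define concepts as classes and arrange them in a hierarchy.
person = OntologyClass("Person", properties={"name": str})
# Step 3: a sub-class adds its own property and inherits "name".
officer = OntologyClass("Officer", parent=person, properties={"badge_number": int})
# Step 4: create an individual (instance) of the class.
print(officer.create_individual(name="Pat", badge_number=7))
```

Dedicated ontology languages such as OWL express the same ideas declaratively and support much richer constraints than this sketch.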

Applications

In this section several ontologies are reviewed and analyzed by examining several of the concepts presented earlier.

Web Ontology Language (OWL)

OWL's predecessor was the DARPA Agent Markup Language (DAML). It integrated concepts such as resource description, to describe sets of facts. The markup language DAML+OIL (Ontology Inference Layer) that was developed at DARPA is the basis of OWL. [8]

OWL is a set of XML elements and attributes, with standardized meaning, that are used to define terms and their relationships [9]. OWL was created so that applications can process the content of a document more easily, rather than to improve how the document is presented to humans [10]. It allows the definition of classification systems with inheritance of properties, and it builds on top of the Resource Description Framework (RDF). OWL comes in three versions, which differ in the complexity of the relationships that each can describe [10]. These are:

  • OWL Lite allows a classification hierarchy and simple constraints. Cardinality constraints are supported, but only with the values 0 or 1.
  • OWL DL (Description Logic) allows maximum expressiveness while retaining computational completeness, meaning that all conclusions are guaranteed to be computable.
  • OWL Full allows complete expressiveness and syntactic freedom but gives no computational guarantees, so a reasoner might never finish processing a statement and produce no result.

An example of an OWL statement is given below. Suppose we want to describe a ‘student’. When humans refer to a student, they know that a student is a person. A computer needs to be told this explicitly, so we first define a person. The definition of a person in OWL is:

<owl:Class rdf:ID="Person"/>

And then we need to define a student

<owl:Class rdf:ID="Student">

<rdfs:subClassOf rdf:resource="#Person"/>

</owl:Class>

A computer interprets this as stating that the Student class is a subclass of the Person class.
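
As an illustration of how a program might read these two definitions, the sketch below wraps them in an rdf:RDF root element with the standard RDF, RDFS and OWL namespace declarations (the wrapper is added here so that the fragment parses on its own) and extracts the subclass relationship with Python's standard library. The function name subclass_pairs is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# The two class definitions from the text, wrapped in a root element.
OWL_XML = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:ID="Person"/>
  <owl:Class rdf:ID="Student">
    <rdfs:subClassOf rdf:resource="#Person"/>
  </owl:Class>
</rdf:RDF>"""


def subclass_pairs(xml_text):
    """Return (subclass, superclass) pairs declared with rdfs:subClassOf."""
    root = ET.fromstring(xml_text)
    pairs = []
    for cls in root.iter(f"{{{OWL}}}Class"):
        name = cls.get(f"{{{RDF}}}ID")
        for sup in cls.findall(f"{{{RDFS}}}subClassOf"):
            # rdf:resource="#Person" refers to the class with rdf:ID="Person".
            pairs.append((name, sup.get(f"{{{RDF}}}resource").lstrip("#")))
    return pairs


print(subclass_pairs(OWL_XML))
```

This only reads the explicit declarations; an OWL reasoner would additionally derive implied relationships.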


Ontology building tools - Protégé

Protégé has been developed at Stanford and is an ontology editor. It is useful for putting together XML-based ontologies based on the OWL language.

An OWL ontology consists of classes, class properties linking two instances of classes together, and class instances (individuals). Protégé-OWL is a plugin for the ontology building tool Protégé. It allows for building OWL-compliant ontologies, implementing a rich set of operators, and allowing not only for concept definition, but also concept description. The logical model allows for a building block approach to ontology building, in which the user can describe complex concepts with the help of simple ones [11].

The steps in building an OWL ontology are as follows:

Class Creation

As a first step in building an ontology, we need to define a class hierarchy, which is often also called a "taxonomy". A class with the identifier owl:Thing is the class of all individuals. In contrast, the class with the identifier owl:Nothing is the empty class. Therefore, implicitly all classes created are subclasses of the class owl:Thing. This is comparable to the java.lang.Object class hierarchy root node in the Java programming language. There are no standardized naming conventions for classes, but applying a consistent naming scheme to all classes and subclasses ensures better human comprehension of the ontology. In OWL, classes are assumed to overlap by default, so that a member of one class could also be a member of another class. To prevent this, Protégé offers the option of making classes disjoint from each other.

Via the OWL Wizards plug-in, creation of classes and subclasses is very easy. In OWL, subclasses of classes are assumed to have necessary implication. This means that all instances of that subclass are then also instances of the superclass, without exception.

Lastly, Protégé allows for adding metadata annotations to classes. Those annotations can be made to conform to commonly accepted metadata standards, such as Dublin Core's metadata elements; e.g. Title, Date, Creator, Description, Subject. [12]

Creating Ontologies in Protégé

To illustrate how an ontology is built, we demonstrate the use of Protégé to build a simple terrorism ontology. We start by creating classes, then add subclasses, and then add properties together with their domains and ranges. We also show the distinction between functional properties (features) and superproperties. Protégé also allows restrictions on properties to be defined using set theory.

Global Justice XML Data Model (GJXDM)

Figure 19 - Part of GJXDM structure

GJXDM was developed by the Department of Justice and is based on the Global Justice XML Data Dictionary (GJXDD). Adding structure and relationships to GJXDD created the GJXDM. The idea behind GJXDM is to extract the commonalities between concepts and give them appropriate structure [1]. For example, the arresting officer and the arrested subject both share common features such as a name. We could therefore have a more general category called ‘Person’, and both ArrestingOfficer and Subject would inherit features from the Person category. Each type of person could then have other features more specific to it: an ArrestingOfficer has a badge number, whereas a Subject does not.

There are two main kinds of relationships in GJXDM: ‘is_a’ and ‘has_a’. The ‘is_a’ relationship is used when one type inherits characteristics from another type. In the example given above, a Subject ‘is_a’ Person. The ‘has_a’ relationship, on the other hand, shows what kind of information an object has. In the example given above, an ArrestingOfficer ‘has_a’ BadgeNumber.

Using these types of relationships, a good structure is created, and ultimately all the types that exist in GJXDM inherit their characteristics from a generic type called SuperType. Figure 19 shows a part of the GJXDM structure and the relationships between types and objects. Person, Location and Drivers License are all connected to SuperType by ‘is_a’ relationships. A Person, however, is associated with a residence and a drivers license, which is shown using ‘has_a’ relationships. We can therefore conclude from this illustration that a Subject is a Person and therefore has a drivers license and a residence associated with it.
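
The two relationship kinds map naturally onto inheritance (‘is_a’) and composition (‘has_a’) in an object-oriented language. The sketch below mirrors the types named in the text; the field names and sample values are hypothetical, and GJXDM itself defines these types in XML Schema rather than in code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SuperType:
    """Stand-in for the generic root type from which all types inherit."""


@dataclass
class Location(SuperType):          # Location is_a SuperType
    address: str = ""


@dataclass
class DriversLicense(SuperType):    # DriversLicense is_a SuperType
    number: str = ""


@dataclass
class Person(SuperType):            # Person is_a SuperType
    name: str = ""
    residence: Optional[Location] = None               # Person has_a residence
    drivers_license: Optional[DriversLicense] = None   # Person has_a license


@dataclass
class Subject(Person):              # Subject is_a Person
    pass


@dataclass
class ArrestingOfficer(Person):     # ArrestingOfficer is_a Person
    badge_number: str = ""          # ArrestingOfficer has_a BadgeNumber


subject = Subject(name="J. Doe", residence=Location("1 Main St"),
                  drivers_license=DriversLicense("D100"))
print(isinstance(subject, Person), subject.residence.address)
```

A Subject inherits the residence and drivers license fields from Person, while badge_number exists only on ArrestingOfficer, matching the conclusion drawn from the figure.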

EDXL

The development of the Emergency Data Exchange Language (EDXL) has been chaired by the Department of Homeland Security. It brings together people from law enforcement, public safety, emergency response, fire and rescue, and emergency medicine. It provides a common message header, with agency-specific XML embedded [2].

EDXL consists of several elements. Those elements are specified as the Distribution Element, the Alert Message Set, the Resource Message Set, and the GIS Message Set.

Distribution Element [3]

Even though the Distribution Element, as a container, provides routing information for the enclosed message, that information does not guarantee actual physical routing of the message. The Distribution Element specification was ratified as a standard by the Organization for the Advancement of Structured Information Standards (OASIS) in June of 2006. [4] The EDXL Distribution Element consists of three sub-elements:

  • <EDXLDistribution>
  • <targetArea>
  • <contentObject>

The <EDXLDistribution> element contains date information, information about the type of distribution (report, alert ...), sender and receiver information, distribution channel information, and addressing information.

An application of the Distribution Element could be to encapsulate and distribute one or more emergency messages. For example, if a radiological sensor at an airport triggered a dirty bomb alarm, authorities could decide to send multiple messages to several jurisdictions, for instance at the federal and state levels. In this case, authorities would encapsulate two Common Alerting Protocol (CAP) messages: one to the area of the alert, and one informing the emergency response agencies that a state of emergency has been declared and that they should stay on alert.

Here is an example of the distribution element section of such a message:

<?xml version="1.0" encoding="UTF-8" standalone="no"?> 
<EDXLDistribution xmlns="urn:oasis:names:tc:emergency:EDXL:DE:1.0"> 
<distributionID>Sandia001</distributionID> 
<senderID>dellis@sandia.gov</senderID>  
<dateTimeSent>2005-08-07T18:05:00-07:00</dateTimeSent> 
<distributionStatus>Actual</distributionStatus>  
<distributionType>Report</distributionType> 
<senderRole> 
 <valueListUrn>urn:sandia:gov:sensors:senderRole</valueListUrn>
 <value>SENTRY sensor managment system</value> 
</senderRole> 
<recipientRole>  
 <valueListUrn>urn:sandia:gov:sensors:reciepentRole</valueListUrn> 
 <value>Warning and Reporting Devices</value>  
 <value>Hazard Prediction applications</value> 
 <value>Situational Awarness applications</value>  
</recipientRole> 

  
<keyword> 
 <valueListUrn>urn:sandia:gov:sensors:keywords</valueListUrn> 
 <value>SNM Detection</value> 
</keyword> 

 <explicitAddress> 
   <scheme>DMIS COGs</scheme> 
   <value>1734</value> 
   <value>3520</value> 
 </explicitAddress> 

 <explicitAddress> 
   <scheme>e-mail</scheme> 
   <value>dellis@sandia.gov</value> 
 </explicitAddress> 
  
<combinedConfidentiality>Unclassified</combinedConfidentiality> 


The optional <targetArea> element describing the geographic or political target area for delivering the enclosed message is shown below. In general, the <targetArea> element specifies where the sender of the message wants the message to be sent. Its intent is to target dissemination of a message to the right places. Given our sample message container, below we can see an example of such a <targetArea> element and its attributes.

<targetArea> 
 <country>US</country> 
 <subdivision>California</subdivision> 
 <locCodeUN>USA-SF</locCodeUN>  
</targetArea>  

Finally, the <contentObject> sub-element allows for inclusion of the actual message content into the distribution element container. Such content can be either XML encoded or non-XML encoded. Furthermore, the content can be encrypted and signed. In our current example, XML-encoded content, specifically a Common Alerting Protocol (CAP) message, is included in the <contentObject>. Note that the <embeddedXMLContent> tag specifies the type of content as being a CAP message.

 <contentObject>  
  <contentKeyword>  
   <valueListUrn>"urn:sandia:gov:sensor:detection.event.id"</valueListUrn>  
   <value>10.2.2.1:2005-08-07T18:00:00Z</value>  
  </contentKeyword>  
  <confidentiality>Unclassified</confidentiality> 
  <xmlContent>  
  <embeddedXMLContent xmlns:cap1.0="http://www.incident.com/cap/1.0">  
   <EncryptedData xmlns="http://www.w3.org/2001/04/xmlenc#"  
Type="http://www.w3.org/2001/04/xmlenc#Element"> 
<EncryptionMethod Algorithm="http://www.w3.org/2001/04/xmlenc#tripledes-cbc"/> 
<KeyInfo xmlns="http://www.w3.org/2000/09/xmldsig#"> 
<KeyName>deskey.bin</KeyName>  
</KeyInfo> 
<CipherData> 
<CipherValue>GSCinwYBtwJxp6kcZPGqE6rybCfsnvI6Lz+IZVPqnRfnI1hWq7cI2WT4BsjBBQCu  
TE68pCQ/keOGtvYJ5yNVZEuAnIhOf37OEiqk1rcBARXb03LCYvlXYKA1zmEC5yFT  
CUcyCMV146G4eNU1H7F+wbMjbSgHjOYgYe+rpjOVYAK9Gs4Uj+CWhijjxpr5Y/vX  
1NEtHFhLsXC9cSfhXWVmi3veXwbDycC+QtcvQL/Rfr45bDwsJnCCutTzfmoqF1CS  
BgYUi6osW+XhoRkAttzKbRADVZ6bG5SMkZN0SKiwSaCyKyMKjdpiQwYQhjUXUoAn  
veBylXREqfmtOIm/pT7Y45pabWNG9l3aljil8P7qZ5Y26Q0X+i0U+eEGuafHrMVb  
S/QBpAkNbP5/f9UR3B4t5t7hLOsvDXdR6CWFBNsrczLjZ7YC2O+g1HBl8YsQdREA  
And3PKgoy8QlKv6ZLA+aJzQpSvzbSu3btgN6vyF3GGPqKprVIYRFouaJHYgL81zn  
zZovnH4lubwa+YPgD0H48a/FM2LaA8euPzMFDWIki0fm5DoZZzYCmPKmfLJS10RG  
lUKzW0svDw8I1AwX6LBssPm+hoBa7HzTnuM40FD+vsmET+p0bqBtaUSnDHrHXLzp  
P6TrcNr5R5cxQ4C+shwezFQDNKbioyC6m5PaTH/6qhlTmE32vP8ySnMKvL74QCfP  
w8hTZxwq9UVLPq2WKJcI0Phc1e3HoYkBTpVk9OUf/CVaxMXGOiXReeLXGPC1IQnn  
a6xw7ImkgeCFcY+rcttq2fE3UqWtc5R6J16/Jv666K9fgCbXRVhaBdMDYpz0GKFa  
gMJulUK6zTtah+bidtUrF31UWAX+wqIqmDFJ1ivJaRbLEiEVCrt0jKwOjuR41dDD  
VS5j2BuvmZ5TILnOnHFU6H4GudnwjpL01eLrwELSfKmbQmUx2A0L6NBj9SRkXXHW 
3nC51yn37laIO5KSvL5Gkb6jxFVrcUvrcp4pX5czw7/VWbxWoRPY7Bus3akhPu+I  
/jchv7SiVRP5mX4Ewh+yeduYX+UZLo07m5zhAMtmFdiLJV9tgHVTJf7ZJ3bGWP+h  
Et0Nl98hGV362cSRhkoLJwNmOgIpGXSMO6T5nA1MZhJ6CkCP8QV1zpKrLVJSRZRn  
BDDIiH+f6+6aUno6fotGUA==</CipherValue>  
</CipherData>  
</EncryptedData>  
  </embeddedXMLContent>  
  </xmlContent>  
 <Signature xmlns="http://www.w3.org/2000/09/xmldsig#"> 
<SignedInfo>  
<CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n#"/> 
<SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/>  
<Reference> 
<Transforms>  
<Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/>  
</Transforms> 
<DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/> 
<DigestValue>przigAg811cHIqSXpIrFg1BGx20=</DigestValue> 
</Reference> 
</SignedInfo>  
<SignatureValue>BH8MGSS9QAlgq7a7I7PF5XjKsqaDumTt3cSWxBmwErByvQuaarOgH6MMflVLkG0Y 
tv6zaOqR6Kis4giTqtZBo8QCkGukpre2gurdi3Ws0yO3Wt8nWrcH3QAUllhocXpV  
gXahZ8MzHc9zuJq9+bl+S72czTjS0UdCvk/MMRV/xhwZ/1QSn+ffh0s3RU6Cn1Q0 
hycJGVdUpY0a+1r0quu2pmdLZnIQVY1stWNFS3wI9RzdslwzoGP9/nRARGS0kLf1  
De+WB4Xdar48A9WJwng0iA==</SignatureValue>  
<KeyInfo> 
<KeyName>rsakey.pem</KeyName>  
</KeyInfo> 
</Signature></contentObject> 
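
The three sub-elements can also be assembled programmatically. The following minimal sketch builds a skeleton container with Python's standard library, using the element names and the EDXL-DE namespace from the listings above. It omits many elements (roles, keywords, addresses, confidentiality, encryption and signature), so its output should not be assumed to be schema-valid; the helper names de and build_distribution and the empty alert placeholder payload are hypothetical.

```python
import xml.etree.ElementTree as ET

DE_NS = "urn:oasis:names:tc:emergency:EDXL:DE:1.0"
CAP_NS = "http://www.incident.com/cap/1.0"

# Serialize the EDXL-DE namespace as the default namespace, as in the listing.
ET.register_namespace("", DE_NS)


def de(tag):
    """Qualify a tag name with the EDXL-DE namespace."""
    return f"{{{DE_NS}}}{tag}"


def build_distribution(header, country, payload):
    """Wrap an XML payload element in a skeleton EDXLDistribution container."""
    root = ET.Element(de("EDXLDistribution"))
    for tag in ("distributionID", "senderID", "dateTimeSent",
                "distributionStatus", "distributionType"):
        ET.SubElement(root, de(tag)).text = header[tag]
    area = ET.SubElement(root, de("targetArea"))
    ET.SubElement(area, de("country")).text = country
    content = ET.SubElement(root, de("contentObject"))
    xml_content = ET.SubElement(content, de("xmlContent"))
    ET.SubElement(xml_content, de("embeddedXMLContent")).append(payload)
    return root


header = {
    "distributionID": "Sandia001",
    "senderID": "dellis@sandia.gov",
    "dateTimeSent": "2005-08-07T18:05:00-07:00",
    "distributionStatus": "Actual",
    "distributionType": "Report",
}
payload = ET.Element(f"{{{CAP_NS}}}alert")  # empty placeholder for a CAP message
envelope = build_distribution(header, "US", payload)
print(ET.tostring(envelope, encoding="unicode"))
```

The design point this illustrates is the separation of concerns: the container carries routing metadata, while the payload travels unchanged inside the content object.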

Resource Message Set [5]

As of 2007, the Resource Message Set is still a draft specification. It currently defines 16 message types for communication that helps distribute resources among agencies before, during, or after emergency events. Communication takes place between suppliers and consumers of a needed resource. Examples of message types include:

  • Request Resource, sent by resource consumer
  • Response to Request Resource, sent by resource supplier
  • Commit Resource, sent by resource supplier
  • Request Resource Deployment Status, can be sent by both consumer and supplier
  • Report Resource Deployment Status, can be sent by both consumer and supplier
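
The sender rules in the list above can be captured in a small lookup table. This sketch covers only the five message types named here, not the full draft specification, and the names Party, ALLOWED_SENDERS and may_send are hypothetical.

```python
from enum import Enum


class Party(Enum):
    CONSUMER = "resource consumer"
    SUPPLIER = "resource supplier"


# Message type -> parties allowed to send it, as listed in the text.
ALLOWED_SENDERS = {
    "Request Resource": {Party.CONSUMER},
    "Response to Request Resource": {Party.SUPPLIER},
    "Commit Resource": {Party.SUPPLIER},
    "Request Resource Deployment Status": {Party.CONSUMER, Party.SUPPLIER},
    "Report Resource Deployment Status": {Party.CONSUMER, Party.SUPPLIER},
}


def may_send(message_type, sender):
    """True if the given party is allowed to send the given message type."""
    if message_type not in ALLOWED_SENDERS:
        raise KeyError(f"unknown message type: {message_type}")
    return sender in ALLOWED_SENDERS[message_type]


print(may_send("Request Resource", Party.CONSUMER))
print(may_send("Commit Resource", Party.CONSUMER))
```

A receiving system could use such a table to reject messages that arrive from a party in the wrong role.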


Figure 20 - Abstract Reference Model of Resource Message Set

As shown in Figure 20, Resource Message is the name of the EDXL payload. Message stands for the specific message, whose overall type can be either request or response. Party describes all individuals and organizations associated with the message. Resource describes the actual resource being requested or acknowledged (e.g. equipment, water, medicine). Schedule describes when the message is needed, when it arrived, and other temporal information. Assignment stands for geospatial or jurisdictional information describing to whom the message is assigned.

Alert Message Set [6]

The Common Alerting Protocol (CAP) became an official OASIS standard in 2004. [7] A CAP message consists of an alert segment with, for example, purpose and status information, an info segment with urgency and severity information, a resource segment containing documents related to the alert, and an area segment containing location information. CAP has since been adopted as a payload within the Distribution Element of EDXL container messages, specifically within the content object:

<contentObject>
<Payload  xmlns:cap="http://www.incident.com/CAP/1.0">
<cap:identifier>SENTRY_01</cap:identifier>
<cap:sender>SENTRY</cap:sender>
<cap:sent>2005-05-01T18:08:00-06:00</cap:sent>
<cap:status>Test</cap:status>
<cap:msgType>Alert</cap:msgType>
<cap:scope>Public</cap:scope>
<cap:note>This comment is Extraordinarily tremendous!!</cap:note>
<cap:incidents>MWM-13407</cap:incidents>
 <cap:info>
   <cap:category>Other</cap:category>
   <cap:event>chemical</cap:event>
   <cap:urgency>Immediate</cap:urgency>
   <cap:severity>Unknown</cap:severity>
   <cap:certainty>Unknown</cap:certainty>
   <cap:senderName>SENTRY</cap:senderName>
   <cap:headline>LV 2 CWA Low Blister</cap:headline>
    <cap:description>SENTRY has detected a(n) chemical event: MWM-13407 from Site: Tiedown. Its Primary Source is: LV 2
CWA Low Blister, it has a priority of 4, and is located at -77.099953,38.800035  Comment:  This comment is Extraordinarily
tremendous!!
</cap:description>
<cap:resource>
<cap:resourceDesc>Video clip from Sensor: CWDET-01, Site: Tiedown, having a Detection class of CWDET.Video specs: Length: 23 Type:
mpeg Channel Band: Channel A
</cap:resourceDesc>
<cap:uri>http://sentry/videoClip1</cap:uri>
</cap:resource>
<cap:resource>
<cap:resourceDesc>Photo image from Sensor: CWDET-01, Site: Tiedown, having a Detection class of CWDET. Photo specs:  Type:  jpeg
Channel Band: Channel B
</cap:resourceDesc>
<cap:uri>http://sentry/photoCapture</cap:uri>
</cap:resource>
</cap:info>
</Payload>
</contentObject>

As we can see, the content describes urgency and other attributes, and also provides CAP resources, in this case a video clip and a photo image from the sensor.
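
A receiving application might pull the urgency and the resource links out of such a payload as sketched below. The sample is an abbreviated copy of the listing above (descriptions shortened), and the function name summarize is hypothetical.

```python
import xml.etree.ElementTree as ET

# Prefix map for ElementTree path lookups; the URI is the one in the listing.
CAP = {"cap": "http://www.incident.com/CAP/1.0"}

PAYLOAD = """<Payload xmlns:cap="http://www.incident.com/CAP/1.0">
 <cap:identifier>SENTRY_01</cap:identifier>
 <cap:info>
  <cap:urgency>Immediate</cap:urgency>
  <cap:severity>Unknown</cap:severity>
  <cap:resource>
   <cap:resourceDesc>Video clip</cap:resourceDesc>
   <cap:uri>http://sentry/videoClip1</cap:uri>
  </cap:resource>
  <cap:resource>
   <cap:resourceDesc>Photo image</cap:resourceDesc>
   <cap:uri>http://sentry/photoCapture</cap:uri>
  </cap:resource>
 </cap:info>
</Payload>"""


def summarize(payload_xml):
    """Extract the identifier, urgency, severity and resource URIs."""
    root = ET.fromstring(payload_xml)
    info = root.find("cap:info", CAP)
    return {
        "identifier": root.findtext("cap:identifier", namespaces=CAP),
        "urgency": info.findtext("cap:urgency", namespaces=CAP),
        "severity": info.findtext("cap:severity", namespaces=CAP),
        "resource_uris": [res.findtext("cap:uri", namespaces=CAP)
                          for res in info.findall("cap:resource", CAP)],
    }


print(summarize(PAYLOAD))
```

Because every agency reads the same element names, the urgency of an alert can be acted upon without knowledge of the sending system.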

GIS Message Set [6]

This message set is not yet implemented as part of EDXL. It provides geospatial context and helps identify, track, and forecast emergency response resources.


To summarize, the Distribution Element carries all message sets in an open access container. It specifies the message type, the sender ID, and sections for the message sets. [8]

Case Studies

National Information Exchange Model (NIEM)

A number of situations, such as terrorist attacks, natural disasters and disease outbreaks, have shown the weaknesses in the infrastructure that exists today in the US. The number of agencies involved in these kinds of cases is large, and a standard way of exchanging information among those agencies was needed. NIEM [9] was developed in a joint effort by the Department of Justice and the Department of Homeland Security to improve the information exchange process. NIEM is based on the GJXDM 3.1 standard, most of whose concepts are used as the core for NIEM 1.0. GJXDM 3.1 currently handles the justice domain within NIEM 1.0. The purpose of NIEM is to enable information sharing among what are currently seven domains:

  • Emergency Management
  • Immigration
  • Infrastructure Protection
  • Intelligence
  • International Trade
  • Justice
  • Person Screening

Instead of trying to integrate all the domains by restructuring them, NIEM will provide a common framework that normalizes information exchange between domains.

Figure 21 - NIEM Core and its domains

Figure 21 depicts the NIEM Core and its intended domains. The ability to exchange information is not limited to these NIEM domains; it can be extended to other conforming and future domains as well. NIEM tries to normalize the concepts that are common between domains and the way they are exchanged, but at the same time allows for domain-specific definitions of concepts. This is achieved by the NIEM core, which provides the common framework for all the domains.

NIEM Core

There are three main components that make up the NIEM core. These are the universal namespace, common namespace and structures. The universal namespace includes the concepts that are shared among all of the domains. The baseline for the universal namespace was migrated from GJXDM. The common namespace includes the concepts that are common between some of the domains in NIEM. The other domains do not need these definitions. Finally, the structure component in the NIEM core provides data structures for associations, roles, metadata, type augmentation, content references, etc. One thing to note is that the common namespace includes all the definitions of the universal namespace. In the same way the namespace for any given domain will include the common namespace and by extension the universal namespace as well.

Naming convention

A naming convention was adopted so that each concept definition can be read in the context of the domain it originates from. Each namespace (universal, common and domain specific) is assigned a prefix, which is added to each of its concepts. Table 1 shows the prefixes for each namespace. For example, a person defined in the universal namespace is referred to as u:PersonType. The person's eye color, which is not defined in the universal namespace, is represented as c:PersonEyeColor.

Table 1 – Prefixes for each namespace

Namespace                  Prefix
Universal                  u
Common                     c
Justice                    j
Emergency                  e
Immigration                im
Infrastructure Protection  ip
Intelligence               intel
International Trade        it
Person Screening           scr
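
The prefix convention lends itself to a simple lookup. The sketch below encodes the prefixes from Table 1 and builds or splits prefixed names; the helper names qualify and split_qualified are hypothetical.

```python
# Namespace -> prefix, as listed in Table 1.
NIEM_PREFIXES = {
    "Universal": "u",
    "Common": "c",
    "Justice": "j",
    "Emergency": "e",
    "Immigration": "im",
    "Infrastructure Protection": "ip",
    "Intelligence": "intel",
    "International Trade": "it",
    "Person Screening": "scr",
}


def qualify(namespace, concept):
    """Return the prefixed name of a concept, e.g. 'u:PersonType'."""
    return f"{NIEM_PREFIXES[namespace]}:{concept}"


def split_qualified(name):
    """Return (namespace, concept) for a prefixed name."""
    prefix, concept = name.split(":", 1)
    by_prefix = {p: ns for ns, p in NIEM_PREFIXES.items()}
    return by_prefix[prefix], concept


print(qualify("Universal", "PersonType"))
print(split_qualified("c:PersonEyeColor"))
```

Given a prefixed name, a consumer can thus tell immediately which namespace, and therefore which domain, defines the concept.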

For example, consider the concept of a person, which is universal among the domains. Attributes like name, sex and date of birth are also universal attributes. However, the Justice domain also needs the biometrics and criminal history of a person, and the Immigration domain needs the passport and visa. These are domain-specific attributes and are not used in the information exchange, since the other domains do not need them. All domains, however, share the same definition of a person as defined by the NIEM core. This is illustrated in Figure 22 below.


Figure 22 - Universal and domain specific attributes

Each of the domains will use its own ontology, which must conform to the NIEM standard. Data exchange between domains will be carried out by Information Exchange Packages (IEPs), which are completely defined by Information Exchange Package Documentations (IEPDs). An IEPD is the specification for one particular data exchange.

Usage Scenario

To understand the power that NIEM will provide consider the following scenario which was adapted from NIEM documentation. U.S. Border Patrol agent goes to investigate a sensor activation on the borders. When he reaches the area he sees that several illegal immigrants are trying to cross the border. He is able to arrest most of them however two are able to flee. The border agent then calls in with the description and firection of the illegal immigrants to coordinate a search. Meanwhile a citizen calls in and reports that he saw two people that looked suspicious boarding a truck. The information is relayed to police patrols which stop the vehicle and perform a check of its passengers. During the check it turns out that the two men onboard match the description given by the patrol agent and they are taken into custody. After the two suspects are being processed for fingerprints, their true identity is revealed and it matched to that from the terrorist watchlist.

A scenario like this requires interoperability among three different agencies, which is currently not possible without NIEM. The IEPDs defined in NIEM allow Border Security and Law Enforcement to exchange data in real time, increasing the efficiency of operations. At the same time, each agency (domain) can process incoming information against the data it already holds and, if it finds something of interest, transmit it to the other agencies for further processing.

Terrorist Related Assessment using Knowledge Similarity (TRAKS)

This case study examines a prototype system from the University of Georgia. It seeks to detect suspicious activities in the domains of money laundering, identity theft, and terrorist attacks, based on a concept called "Semantic Similarity". [10]

Figure 23. TRAKS Architecture


As Figure 23 shows, TRAKS relies on templates, ontologies, datasets, semantic similarity matching, and ranking.

Template and Core Template: The template is a set of classes and relationships that form a connected graph, in which the classes are nodes and the relationships connect the classes. The set of interconnected classes and relationships describes a given scenario in the domain, such as money laundering or terrorism activities. The scenario template can be used as an abstraction that helps users find data matches and test their hypotheses regarding the data.

The core template is a subset of the classes and relationships in the main template. This allows for filtering out irrelevant information and for more precise matching of scenarios to real world data. TRAKS uses OWL for describing the templates.
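
A template and its core subset can be pictured as a small graph. The sketch below is a plain-Python illustration, not TRAKS code (TRAKS describes its templates in OWL); the class names, relationship names, and helper functions are all hypothetical.

```python
from collections import deque

# A template as (subject class, relationship, object class) triples.
# All names here are assumed, for illustration only.
TEMPLATE = {
    ("Person", "owns", "Account"),
    ("Account", "transfers_to", "Account"),
    ("Account", "held_at", "Bank"),
    ("Person", "member_of", "Organization"),
}


def classes_of(triples):
    """Collect every class that appears as a node in the template."""
    return {cls for subj, _, obj in triples for cls in (subj, obj)}


def is_connected(triples):
    """True when the classes form one connected graph, ignoring edge direction."""
    nodes = classes_of(triples)
    if not nodes:
        return True
    neighbours = {node: set() for node in nodes}
    for subj, _, obj in triples:
        neighbours[subj].add(obj)
        neighbours[obj].add(subj)
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        # The set difference builds a new set, so updating `seen` here is safe.
        for nxt in neighbours[queue.popleft()] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return seen == nodes


def core_template(triples, core_classes):
    """Keep only the relationships whose two ends are both core classes."""
    return {(s, r, o) for s, r, o in triples
            if s in core_classes and o in core_classes}


CORE = core_template(TEMPLATE, {"Person", "Account", "Bank"})
```

In this sketch the core template drops the `member_of` relationship, so a match against `CORE` ignores organization membership while still requiring the person, account, and bank pattern.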

Semantic Similarity: Two entities are similar if they belong to the same class, if their classes have a common parent (super) class, or if the class of one entity is a parent class of the other entity's class. The same similarity criteria can be applied to relationships between classes. For example, John, an undergraduate student, and Anja, a graduate student, are similar because they both belong to the same parent class "student".
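
The three similarity rules can be written as a short check over a class hierarchy. This is a minimal sketch that reads "parent class" as the direct parent only, which is an assumption; the hierarchy, the entity names beyond John and Anja, and the function names are hypothetical.

```python
# Hypothetical class hierarchy: child class -> direct parent class.
PARENT = {
    "UndergraduateStudent": "Student",
    "GraduateStudent": "Student",
    "Student": "Person",
    "Professor": "Person",
}

# Hypothetical instance data: entity -> class it belongs to.
INSTANCE_OF = {
    "John": "UndergraduateStudent",
    "Anja": "GraduateStudent",
    "Kim": "Student",
    "Lee": "Professor",
}


def classes_similar(class_a, class_b):
    """Apply the three rules, treating 'parent' as the direct parent (assumption)."""
    if class_a == class_b:
        return True  # rule 1: same class
    parent_a = PARENT.get(class_a)
    parent_b = PARENT.get(class_b)
    if parent_a is not None and parent_a == parent_b:
        return True  # rule 2: common parent class
    # rule 3: one class is the parent of the other
    return parent_a == class_b or parent_b == class_a


def entities_similar(entity_a, entity_b):
    """Compare two entities through the classes they are instances of."""
    return classes_similar(INSTANCE_OF[entity_a], INSTANCE_OF[entity_b])
```

With this data, `entities_similar("John", "Anja")` holds because both classes share the parent `Student`, while John and Lee are not similar under the direct-parent reading.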

Semantic Ranking: Semantic ranking orders the data by the matching distance of each instance to a class and relationship in the template. This helps determine the quality of the match between real-world data and a given template. As previous research has shown, entities located deeper in the hierarchy of the ontology are more specialized than those higher in the ontology. Intuitively, we can see this in the student example above, where "undergraduate student" is more specific than "student". [11]
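
One way to picture a matching distance is to count the subclass steps between an instance's class and the class the template asks for, then sort by that count. The sketch below is an assumed simplification and not the ranking formula used by TRAKS; the hierarchy, instance data, and function names are hypothetical.

```python
# Hypothetical class hierarchy: child class -> direct parent class.
PARENT = {
    "UndergraduateStudent": "Student",
    "GraduateStudent": "Student",
    "Student": "Person",
    "Professor": "Person",
}


def depth(cls):
    """Number of steps from a class up to the root; deeper means more specialized."""
    steps = 0
    while cls in PARENT:
        cls = PARENT[cls]
        steps += 1
    return steps


def match_distance(instance_class, template_class):
    """Steps up from instance_class to template_class, or None if it is no subclass."""
    steps = 0
    cls = instance_class
    while cls != template_class:
        if cls not in PARENT:
            return None
        cls = PARENT[cls]
        steps += 1
    return steps


def rank(instances, template_class):
    """Order matching instances by ascending distance, ties broken by name."""
    scored = []
    for name, cls in instances.items():
        distance = match_distance(cls, template_class)
        if distance is not None:
            scored.append((distance, name))
    return [name for _, name in sorted(scored)]


# Assumed sample data, for illustration only.
INSTANCES = {"Anja": "GraduateStudent", "Kim": "Student", "Lee": "Professor"}
ranked_students = rank(INSTANCES, "Student")
```

In this sketch an exact class match has distance 0 and ranks first, and instances whose class is not a subclass of the template class are left out of the ranking altogether.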

TRAKS uses an ontology from the NSF-funded project “Semantic Association Identification and Knowledge Discovery for National Security Applications”. [12] In conclusion, this prototype provides semantic matching of data instances to template patterns, augmented by visualization. It has the potential to help agents in the field because its ontology basis is flexible and because it uses open APIs and ontology languages.

Conclusion

By manipulating concepts and their relationships, some very interesting results can be obtained in the field of Intelligence and Security Informatics. NORA is a good example: it uses simple rules to link concepts and is able to produce results in real time. Ontologies organize concepts within a given domain, allowing us to perform knowledge discovery and retrieval to enhance security at the local or even national level. The foundations for security informatics ontologies are being laid, and tools like Protege make it easy to create ontologies in a few steps. NIEM is a good example of how ontologies help interoperability by sharing a common understanding across all the domains.

References

  1. http://www.ncsconline.org/D_Tech/gjxdm/ctc9/articles/carlson.pdf
  2. Fullerton, G. & Hines, C. (2004). “Emergency Data Exchange Language”. XML 2004 Proceedings. Washington D.C., U.S.A.
  3. Emergency Data Exchange Language (EDXL) Distribution Element, v. 1.0. http://docs.oasis-open.org/emergency/edxl-de/v1.0/EDXL-DE_Spec_v1.0.pdf
  4. OASIS News - Emergency Data Exchange Language (EDXL) Distribution Element Ratified as OASIS Standard. http://www.oasis-open.org/news/oasis-news-2006-06-20.php
  5. Emergency Data Exchange Language Resource Messaging (EDXL-RM) 1.0 - Committee Draft. http://docs.oasis-open.org/emergency/edxl-rm/v1.0/EDXL-RM-SPEC-V1.0.pdf
  6. EDXL Overview and Phased Approach. http://xml.coverpages.org/EDXL-OverviewApproachSummary200503.doc
  7. Common Alerting Protocol v1.1 (2005). Common Alerting Protocol V1.1
  8. Cover Pages: Emergency Data Exchange Language http://xml.coverpages.org/edxl.html
  9. http://www.niem.gov
  10. Aleman-Meza, B., Halaschek, C., Sahoo, S.S. (2005). "TRAKS: Terrorist Related Assessment using Knowledge Similarity". http://lsdis.cs.uga.edu/proj/traks/about/final_report.pdf
  11. Rodriguez, M., Egenhofer, M. (2003) “Determining Semantic Similarity among Entity Classes from Different Ontologies”, IEEE Transactions on Knowledge and Data Engineering, Vol. 15, No. 2
  12. Sheth, A., Aleman-Meza, B., et al. "Semantic Association Identification and Knowledge Discovery for National Security Applications." Technical Memorandum 03-009, August 2003

Online Resources

  • Details on how the federal government spends its money on IT security (agency budgets, priorities, gov security trends, etc.) - Security Section of the GovITwiki.