July 12, 2012 — Poor Richard

[The following is a new draft addition to the PeerPoint Open Requirements Definition and Design Specification Proposal (currently a shared Google Doc). The PeerPoint project is an open and collaborative effort to develop requirements, standards, and specifications for peer-to-peer internet technologies that will promote fair and sustainable societies. On-going updates to this topic will be made at the above link. Your collaboration is invited! – PR]

PeerPoint Identity Management

The first step in defining the problem space of identity management is to define identity. What is it? From The Free Dictionary (tfd.com):

identity: 1. The collective aspect of the set of characteristics by which a thing is definitively recognizable or known

PeerPoint Terms and Definitions

entity : anything that has a definite, recognizable identity, whether a person, group, organization, place, object, computer, mobile device, concept, etc.

attribute : any characteristic, property, quality, trait, etc. that is inherent in or attributed to an entity. An entity has one or more attributes and an attribute has one or more values. For example “the sky (entity) has color (attribute) of blue (value).” This entity-attribute-value (EAV) model is sometimes called a “triple” as in the Resource Description Framework (RDF). An attribute (which is also a kind of entity) may have attributes of its own. These are often logically nested in a hierarchical fashion. For example, an address may be an attribute of a company but also an entity with attributes of street, city, state, etc. An entity may have multiple instances of the same attributes, such as multiple aliases or addresses. (Different programming languages, protocols, frameworks, and applications may organize the entity-attribute-value model differently; or use different terms such as object for entity or property for attribute; but this is probably the most generic approach.)

identity : a definitive and recognizable set of attribute-value pairs (or entity-attribute-value triples) for a particular entity. The set of attribute-value pairs may be partial or exhaustive, depending on the intended purpose of the identity construct.

identification (ID): a dataset (value, record, file, etc.) which represents the most concise amount of information required to specify a particular entity and distinguish it from others. An ID may be local to a particular context, such as a company employee ID or inventory number, or it may be universal. Examples of universal IDs include Global Trade Item Numbers (GTIN) and uniform resource identifiers (URI). The ID typically consists of a smaller quantity of data than the full identity dataset and only represents or refers to the full identity.
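The entity-attribute-value model described in these definitions can be sketched in a few lines of Python. This is only an illustration under assumed names: `Triple`, `identity_of`, and all of the sample data are hypothetical and are not part of any PeerPoint specification.

```python
from collections import namedtuple

# One entity-attribute-value statement, e.g. (sky, color, blue).
Triple = namedtuple("Triple", ["entity", "attribute", "value"])

# Hypothetical sample data. "acme:address" is both a value of the
# company's "address" attribute and an entity with attributes of its own.
triples = [
    Triple("sky", "color", "blue"),
    Triple("acme", "alias", "Acme Corp"),
    Triple("acme", "alias", "Acme Inc"),
    Triple("acme", "address", "acme:address"),
    Triple("acme:address", "city", "Springfield"),
    Triple("acme:address", "state", "IL"),
]

def identity_of(entity, store):
    """Collect the attribute-value pairs that describe one entity.

    Each attribute maps to a list, because an entity may have several
    instances of the same attribute (for example, multiple aliases).
    """
    description = {}
    for t in store:
        if t.entity == entity:
            description.setdefault(t.attribute, []).append(t.value)
    return description

acme = identity_of("acme", triples)
```

Here the string "acme" plays the role of the ID, while the dictionary returned by `identity_of` is the (partial) identity it refers to.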

Identity management problem space

The PeerPoint requirements will explore various parts of the Identity Management problem space, all of which overlap or interpenetrate each other:

1. Description

Description is meant here in its most general sense as the entire set of attributes and values that describe an entity, and not simply a “description” box or field in a record. This is the aspect of identity management which establishes the set of attributes and values (or profile) by which an entity is typically recognizable or known in a particular context. A description can attempt to be exhaustive, but in most cases it is only as complete as required for its intended purpose in a given application.

PeerPoint requirements

Identity management functions should be consistent across all PeerPoint applications, so the requirements should be implemented as part of a PeerPoint system library from which all applications, middleware, APIs, etc. can call the necessary functions. Interfaces or connectors must be provided for non-PeerPoint-compatible systems.

There are many methods in existing software applications, protocols, and frameworks to describe the identity of entities. The PeerPoint identity management solutions must inter-operate with as many of these as possible. For that reason the PeerPoint descriptions of entities must be as generic, modular, composable, and extensible (open-ended) as possible.

PeerPoint user interfaces (UI) must allow users to extend and customize entity descriptions in as intuitive a manner as possible without reducing or destroying the interoperability of the descriptions with those of other platforms. One approach is to provide user input forms with the most common or universal attributes for various types of entities, combined with fields for additional user-defined attribute-value pairs as well as simple tags.

In both standardized and customizable parts of entity descriptions, the UI should provide as much guidance as possible about the most typical names and/or value ranges for attributes without locking the user in to these “preferred” or popular choices.
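One way to picture a description that combines suggested attributes, user-defined attribute-value pairs, and tags is the sketch below. It assumes a very small set of suggested attributes; `SUGGESTED_PERSON_ATTRIBUTES`, `build_description`, and the sample values are hypothetical names chosen for illustration.

```python
# Hypothetical "common" attributes that a form might offer for a person.
SUGGESTED_PERSON_ATTRIBUTES = ("name", "email", "location")

def build_description(standard, custom=None, tags=None):
    """Merge form input into one description without discarding anything.

    standard: values entered against the form's attribute names
    custom:   additional user-defined attribute-value pairs
    tags:     free-form labels
    """
    description = {"attributes": {}, "custom": {}, "tags": []}
    for name, value in standard.items():
        # Suggested names are guidance only: an unrecognized name is
        # kept as a custom attribute instead of being rejected.
        if name in SUGGESTED_PERSON_ATTRIBUTES:
            description["attributes"][name] = value
        else:
            description["custom"][name] = value
    description["custom"].update(custom or {})
    # Duplicate tags are collapsed and the result is sorted.
    description["tags"] = sorted(set(tags or []))
    return description

profile = build_description(
    {"name": "Example User", "favorite_tool": "wiki"},
    custom={"motto": "Share early"},
    tags=["p2p", "identity", "p2p"],
)
```

Keeping the suggested attributes separate from the custom ones is one possible way to preserve interoperability: other platforms can read the common part and ignore what they do not understand.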

One of the most basic entities in social networking systems is the person or member (or in more abstract terms, an account). The identity description for such an entity is commonly called a “user profile.” User profiles are also found in most applications that involve online collaboration. The most primitive form of user account consists of a user ID (or UID) and a password, where both the ID and password are simple alphanumeric strings. But increasingly, user accounts for social and collaborative applications include elaborate user profiles. Facebook is a good example, having one of the most extensive user profiles of any internet application.

This is a partial screenshot of Poor Richard’s Facebook Profile:

The information in a Facebook User Profile is organized into numerous logical categories. Some not shown above include the user’s friends, Facebook groups to which the user belongs, and a personal library of documents and images. Other profile sections include unlimited free-form text.

Many of the profile data categories such as “Arts and Entertainment” may include unlimited numbers of “likes” or tags. These are added via an intuitive interface in which the user begins typing something such as a-r-e-t-h-a- -f-r-a-n-k… and as the user types, a list of matching tags is displayed and continuously updated with each keystroke, showing possible matches from the Facebook database. If no match is found by the end of typing, the entered tag label is displayed as-is with a generic icon. Facebook’s database of entities in the various categories is created and maintained primarily by Facebook users who create Facebook “pages” for people, groups, companies, products, movies, authors, artists, etc.
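The type-ahead behavior described above can be approximated with simple prefix matching. This is a sketch under that assumption, not a description of how Facebook actually implements it; `suggest`, `resolve`, and the `KNOWN` list are hypothetical.

```python
def suggest(prefix, known_tags, limit=5):
    """Return known tags that start with what the user has typed so far."""
    needle = prefix.strip().lower()
    if not needle:
        return []
    matches = [t for t in known_tags if t.lower().startswith(needle)]
    return sorted(matches)[:limit]

def resolve(entered, known_tags):
    """Pick the final tag once typing stops.

    A case-insensitive exact match reuses the known tag; otherwise the
    entered label is kept as-is and flagged as generic.
    """
    wanted = entered.strip().lower()
    for tag in known_tags:
        if tag.lower() == wanted:
            return {"label": tag, "generic": False}
    return {"label": entered.strip(), "generic": True}

# Hypothetical stand-in for the shared database of tags.
KNOWN = ["Aretha Franklin", "Arcade Fire", "Art Blakey"]

narrowing = [suggest(p, KNOWN) for p in ("ar", "are", "aretha fr")]
```

Calling `suggest` after each keystroke narrows the list, and `resolve` covers the case where no match exists by the end of typing.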

Other social network sites have profile features not found in the Facebook User Profile. Google+ adds a feature to the “friends” data category called “circles” and a homepage feature called “hangouts”. Google+ users can organize friends into user-defined categories called circles that inter-operate with other Google apps, and can create live audio-video chat groups with user-defined membership. LinkedIn has additional profile data categories for resumes, CVs, and employment references, recommendations, or testimonials.

In addition to users, on various social networks accounts may be created for special-interest groups, fan clubs, companies, organizations, and topic pages of all kinds. The structures of the profiles for different types of accounts on different networks vary widely.

Very limited, generic profiles are also hosted by services such as Gravatar and About.me.

Sample Gravatar profile:

OpenID Simple Registration is an extension to the OpenID Authentication protocol that allows for very light-weight profile exchange. It is designed to pass eight commonly requested pieces of information when an End User goes to register a new account with a web service.

Gravatar and OpenID SR are simple examples of what PeerPoint will call a meta-profile (a profile that can be used across multiple applications or systems).

PeerPoint requirements:

Digital identity, representation of a set of claims made by one digital subject about itself or another digital subject

Online identity, social identity that an internet user establishes in online communities and websites

Federated identity, assembled identity of a person’s user information, stored across multiple distinct identity management systems

the capability to create and maintain identity meta-profiles for users and other types of entity

the ability to create multiple alternate profiles for the same entity

intuitive user interface for creating, customizing, and maintaining meta-profiles

allow the creator of any identity profile to determine where any portion of it is stored and with whom any portion of it is shared

capability to synchronize PeerPoint profiles with profiles in non-PeerPoint applications and systems
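The requirements for alternate profiles and for owner-controlled storage and sharing might be modeled roughly as follows. This is a minimal sketch, assuming sharing is decided per attribute and per named audience; the classes `Attribute`, `Profile`, and `MetaProfile`, and every value shown, are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Attribute:
    value: str
    stored_at: str                       # where the owner keeps this value
    shared_with: set = field(default_factory=set)

@dataclass
class Profile:
    name: str
    attributes: dict = field(default_factory=dict)

    def view_for(self, audience):
        """Return only the attribute values shared with this audience."""
        return {
            key: attr.value
            for key, attr in self.attributes.items()
            if audience in attr.shared_with
        }

@dataclass
class MetaProfile:
    entity_id: str
    profiles: dict = field(default_factory=dict)  # alternate profiles by name

    def add_profile(self, profile):
        self.profiles[profile.name] = profile

work = Profile("work", {
    "email": Attribute("pr@example.org", "home-server", {"colleagues"}),
    "phone": Attribute("555-0100", "home-server"),
})
social = Profile("social", {
    "nickname": Attribute("PR", "phone", {"friends", "public"}),
})
me = MetaProfile("peer:example-1")
me.add_profile(work)
me.add_profile(social)
```

An attribute with an empty `shared_with` set, such as the phone number here, is stored but never disclosed to any audience.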

2. Classification (“people, places, and things”)

Different kinds of entities have different kinds of descriptions, so an important part of the identity management problem is the problem of sorting things into various categories. Sorting things into categories or classes is often called categorization or classification. Classification systems are often called taxonomies. Examples might include the index of an encyclopedia, a library card catalog, or a glossary of internet terms.

In the case of information systems, the term ontology means “a rigorous and exhaustive organization of some knowledge domain that is usually hierarchical and contains all the relevant entities and their relations.” (tfd.com) Wikipedia says: “An ontology renders shared vocabulary and taxonomy which models a domain with the definition of objects and/or concepts and their properties and relations. Ontologies are the structural frameworks for organizing information and are used in artificial intelligence, the Semantic Web, systems engineering, software engineering, biomedical informatics, library science, enterprise bookmarking, and information architecture as a form of knowledge representation about the world or some part of it. The creation of domain ontologies is also fundamental to the definition and use of an enterprise architecture framework.”

Another related term in information systems is namespace, often used in relation to wiki structures and directory services.

In identity management, two of the main systems of categories, or taxonomies, would be categories of entities and categories of attributes. Attributes are themselves categories of values (the attribute “color” is a category of colors: red, blue, green, etc.).

Examples of high-level categories of entities might include:

people

groups

organizations

places

internet technologies

devices

Examples of very high-level categories of attributes could include:

These taxonomies become semantic web ontologies when they are defined in machine-readable protocols such as:

Linked Data

One great advantage of machine-readable ontologies is the ability to semantically link data across the web.
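As a rough illustration of semantic linking, the sketch below joins two small datasets through an `owl:sameAs` statement. The `owl:sameAs` URI is the real OWL property; the datasets, the `a.example` and `b.example` URIs, and the `describe` function are hypothetical, and plain tuples stand in for a real RDF library.

```python
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Two hypothetical datasets published by different sources.
dataset_a = [
    ("http://a.example/people/1", "http://a.example/vocab/name", "Aretha Franklin"),
    ("http://a.example/people/1", SAME_AS, "http://b.example/artist/af"),
]
dataset_b = [
    ("http://b.example/artist/af", "http://b.example/vocab/genre", "Soul"),
]

def describe(subject, *datasets):
    """Gather statements about a subject, following sameAs links.

    Links are followed only in the subject-to-object direction, which
    is enough for this illustration.
    """
    triples = [t for ds in datasets for t in ds]
    pending, seen, facts = [subject], set(), []
    while pending:
        current = pending.pop()
        if current in seen:
            continue
        seen.add(current)
        for s, p, o in triples:
            if s != current:
                continue
            if p == SAME_AS:
                pending.append(o)   # the same entity under another URI
            else:
                facts.append((p, o))
    return facts

facts = describe("http://a.example/people/1", dataset_a, dataset_b)
```

Because both sources use URIs as identifiers, a single link is enough to combine what each of them says about the same entity.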

Linking open-data community project

The goal of the W3C Semantic Web Education and Outreach group’s Linking Open Data community project is to extend the Web with a data commons by publishing various open datasets as RDF on the Web and by setting RDF links between data items from different data sources. In October 2007, datasets consisted of over two billion RDF triples, which were interlinked by over two million RDF links. By September 2011 this had grown to 31 billion RDF triples, interlinked by around 504 million RDF links. There is also an interactive visualization of the linked data sets to browse through the cloud.

Dataset instance and class relationships

Clickable diagrams that show the individual datasets and their relationships within the DBpedia-spawned LOD cloud, as shown by the figures to the right, are:

3. Identity provisioning and discovery (directory services, including identity & directory linking, mapping, and federation)

(requirements to be determined)

4. Authentication (validation, verification, security token service)

(requirements to be determined)

5. Authorization (access control, role-based access control, single sign on)

(requirements to be determined)

6. Security (anonymity, vulnerabilities, risk management)

(requirements to be determined)

1. User Control and Consent: Digital identity systems must only reveal information identifying a user with the user’s consent.

2. Limited Disclosure for Limited Use: The solution which discloses the least identifying information and best limits its use is the most stable, long-term solution.

3. The Law of Fewest Parties: Digital identity systems must limit disclosure of identifying information to parties having a necessary and justifiable place in a given identity relationship.

4. Directed Identity: A universal identity metasystem must support both “omnidirectional” identifiers for use by public entities and “unidirectional” identifiers for private entities, thus facilitating discovery while preventing unnecessary release of correlation handles.

5. Pluralism of Operators and Technologies: A universal identity metasystem must channel and enable the interworking of multiple identity technologies run by multiple identity providers.

6. Human Integration: A unifying identity metasystem must define the human user as a component integrated through protected and unambiguous human-machine communications.

7. Consistent Experience Across Contexts: A unifying identity metasystem must provide a simple consistent experience while enabling separation of contexts through multiple operators and technologies.
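Two of these principles, consent-limited disclosure and unidirectional identifiers, can be illustrated with the Python standard library. This is a sketch of one possible technique (a keyed hash of the relying party's name), not a method prescribed by the text above; the function names, the secret, and the sample profile are all hypothetical.

```python
import hashlib
import hmac

def unidirectional_id(user_secret: bytes, relying_party: str) -> str:
    """Derive a different, stable identifier for each relying party.

    Two parties cannot correlate the same user simply by comparing the
    identifiers they were given, so no shared correlation handle leaks.
    """
    digest = hmac.new(user_secret, relying_party.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()

def disclose(profile: dict, requested: list, consented: set) -> dict:
    """Release only attributes that were both requested and consented to."""
    return {k: profile[k] for k in requested if k in consented and k in profile}

secret = b"hypothetical-user-secret"
id_for_shop = unidirectional_id(secret, "shop.example")
id_for_forum = unidirectional_id(secret, "forum.example")

released = disclose(
    {"email": "pr@example.org", "dob": "1970-01-01", "country": "US"},
    requested=["email", "dob"],
    consented={"email", "country"},
)
```

An omnidirectional identifier, by contrast, would simply be the same public value handed to every party, which suits public entities that want to be discovered.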

Additional Resources:

Decentralized Identifiers (DIDs) v0.11