Rationale



The Rationale for this DocWiki in relation to its Semantic Government offering is itself part of a broader rationale arising from general trends in information development. Information technology is entering an era of unparalleled change and opportunity.

This document lays out seven rationales for the SemGov offering, along with the justification for the 'total open solution' and the DocWiki that supports it. The document is presented in the context of the Citizen Dan CIS use case, though other software frameworks could be used in its stead. The complement to this document is the Guiding Principles for the Open Semantic Government.



Reason #1: Information is Critical to Success

Information Development in the 21st Century

The IT capabilities that will be needed in the future will be different and will be driven by the need for information.

In spite of the claims of easy storage and retrieval of information surrounding the introduction of database concepts beginning in the 1980s, implemented systems, for the most part, have not lived up to the promise. Function and data are still tied up in vertical silos. Change is costly and reuse of function is minimal.

Removing cost from government while providing flexible access to integrated information about citizens and services is still the goal. To achieve this, a more formal and yet flexible approach to ‘information development’ is needed than what is being done today. This is needed as the definition of an “application” will continue to change as architectures become more federated, more integrated and software continues to progress to open source.

At the same time, delays in information are becoming less and less acceptable to government. As private sector entities embrace more open and responsive delivery of information, so too must governments respond.

Overarching these factors is the trend toward more open data and transparency. This means IT systems must evolve to embrace different delivery and publishing methods, as well as to provide the means for citizen participation and crowdsourcing.

Reason #2: IT Commitments May Last Decades

Lifecycle IT implications and costs may be paradoxically stymieing innovation.

Two decades ago most large software vendors made on average 75% to 80% of their revenues from software licenses and maintenance fees; quite the opposite is true today[1]. The successful vendors have moved into consulting and services. One only needs look to three of the largest providers of enterprise software of the past two decades — IBM, Oracle and HP — to see evidence of this trend.

These suppliers are experienced hands in the enterprise and know what any seasoned IT manager knows: the total lifecycle costs of software and IT reside in maintenance, training, uptime and adaptation. Once installed and deployed, these systems assume a life of their own, with actual use lifetimes that can approach two to three decades.

As IT has matured within enterprises and government, IT managers and users have naturally come to realize that initial software purchases carry with them legacy requirements and long-term commitments. It is no longer enough or adequate to say that Software Application A has better functionality than Software Application B. What are the differences in the support infrastructure? What are the total lifecycle costs?

In some respects, this understanding of IT implications has led to more conservatism and less innovation at times. Fear of longer-term implications can stymie new approaches and experimentation today.

Reason #3: The Cost-Benefit Ratio is Not Sustainable

Governments spend too much money on IT to consider a proper architecture design just a “technology problem.”

Governments spend an enormous amount of money on information technology, with integration usually being the largest component of this cost. The changing enterprise environment and constantly evolving technology landscape can render even recent IT investments obsolete.

These expenditures have all too often not been justified in terms of realized benefits. This comes from the legacy effects noted above, but also from proprietary solutions that limit competition and substitution. At the broadest level, architectures and data models and design have also not lent themselves to integration and interoperability.

Although change and rework will always be necessary, recent developments suggest that Web-oriented architectures built around services and open standards can be adaptable and scalable. This design provides the best opportunity for building an integration infrastructure that will be reusable into the future.

The best architecture is one that allows multiple solutions and multiple vendors to be plugged in, which provides competition and prevents vendor "lock-in".

Reason #4: Integration and Federation Still a Pipe Dream

The Stealth Layer for Information Development

Diversity of approach is a reality and imposed standards or proprietary systems have failed to promote integration.

Every government knows and understands that individuals and departments embrace their own ‘little’ reporting systems and applications, which leads to a diversity of data formats and semantics about what the data means. Attempts to impose top-down solutions to this diversity have uniformly failed.

Across the information management environment there is a "stealth" layer of data marts and flat files used for reporting. These shared repositories typically support reporting and analysis across the enterprise. Typically they are populated by hundreds of downstream feeds from production environments. Most of these downstream feeds have just evolved over time with little documentation to support maintenance.

These information silos are often a major barrier in transitioning from legacy systems to new production environments. This is because decommissioning the systems involves more than just the data in the source system: there are now a number of dependent systems that use this core data in different fashions. These dependencies make planning and execution of decommissioning very complex and often risky from a business continuity perspective.

Failure to accept these realities and to actually embrace the value in diverse, legacy systems has been a wrong-headed mindset. Only changing the assumptions and respecting the information value that already exists will lead to new and adaptive approaches to information interoperability.

Reason #5: Vendor Lock-in Heading South

Open standards, open data, open source and Web-oriented architectures all point the way to a new way to do IT.

More than three decades of consistent IT failures and disappointments clearly point to the need for new ways. Vendor lock-in has proven expensive and has offered no real solution to interoperability.

The emerging view is that open standards, open data, open source and Web-oriented architectures all point the way to a new way to do IT. Parts of this picture have proven hugely successful for some Internet-based firms and services. But the general transition is also quite young, without many good exemplars or case studies to point to in the government sector.

So, while new ways of doing the IT business look compelling and justified, the maturity and support infrastructure for doing so is still rather weak. Besides new technology, new ways and methods must also be adopted to overcome these weaknesses.

Reason #6: Open Source Promising, But Still Incomplete

Open source alone does not address the full spectrum of software use and adoption.

Open source has been looked to as one means to avoid vendor “lock-in”. And, certainly, the first generation of open source has been a substitute for upfront proprietary licenses. Some open source software, especially in databases (MySQL) and content management systems (Drupal, Joomla, WordPress, etc.), has also seen some success in government IT.

So, while comfort and broader use have increased the acceptability of open source somewhat in the IT department, widespread adoption is still lacking. Why might that be?

The answer to this question resides in the concerns and anxieties noted above. While government IT does not like “lock-in”, it likes even less seeing stranded investments. For open source to be successful, it needs to adopt a strategy that actively extends its traditional basis in open code. It needs to embrace complete documentation, provision of the methods and systems necessary for independent maintenance, and total lifecycle commitments. In short, open source needs to transition from code to systems.

Reason #7: Collaboration: All Hands on Deck

The new trend to openness means that all players -- from colleagues to citizens -- are part of the solution.

The premise of open source is based on many eyes and many contributors adding to the code base, and finding and fixing problems. It increasingly has also come to mean many providing documentation and support.

As governments extend their purview and services into open data and more openness and transparency, these same principles will need to extend to the general public and citizenry. It is why we are seeing cooperation amongst governments at many levels and the growing use of "crowdsourcing" and direct public engagement. The experience gained from social networks and a general comfort with such systems are paving the way.

Collaboration needs to increase at all levels from software and IT development and support to data and its interpretation. Collaboration also means that software and systems must be designed and embraced to support such interaction, as well as changing methods and ways of doing business and providing services.

When done right, we see the immense power of network effects and the cost and productivity benefits that result. By changing mindsets and traditional ways of doing things, all stakeholders can benefit.

The Time is Ripe for 'Total Open Solutions'

A 'Total Open Solution'
In the context of SemGov and this DocWiki, these reasons lead to the rationale for a design approach we call the total open solution. It involves — in addition to the software, of course — recipes, methods, and complete documentation useful for full-life deployments.

A total open solution requires the support and interaction of four components. These components, or legs if you will, provide in their totality a stable foundation for a complete open source solution. These four legs are software, structure, methods and documentation.

In the concluding sections below, we will fit these pieces together and describe what each of them means.

Fitting the Pieces Together

First, let's begin from the point that a useful application or software system has been provided. In the case of SemGov and its community indicator systems, the example use case is the Citizen Dan software. The question now arises: what else is needed for an entity to adopt and maintain this software with minimal or no outside assistance?[2]

To achieve these aims in relation to a total open solution, a number of desires and objectives intersected to guide the design of the DocWiki system:

The Full DocWiki System
  • A consolidated knowledge base with complete, turnkey implementation content
  • A collaborative document authoring system with authoring tools comfortable to most knowledge workers
  • A version control system to enable rollbacks and restoration of prior official versions
  • A system that would enable and facilitate the collection and import of relevant content; in our own case, that included widely distributed internal content in many forms and locations plus relevant external content (such as defined items in Wikipedia)
  • A document management framework that would allow existing content to be mixed, combined and re-purposed for different uses, from training to marketing collateral
  • A single source publishing system that would allow content to be published as paper documents, PDFs, Web pages and the like
  • A system that could be easily themed, skinned and branded, tailored for any given deployment or circumstance, and
  • A system built entirely from open source components and with content that had no restrictions on use or re-use.

In addition, we also wanted directions for how to use and maintain the structure that "drives" the Citizen Dan application.

Ultimately the trump card that decided the design for DocWiki was familiarity and ease-of-use. The net result is a design for the documentation and methods portions that involves a single download and install of Mediawiki with a few extensions (most based around Semantic Mediawiki).

To better understand this full architecture, we will describe the individual pieces of the knowledge base (documentation and methods content), the structure, and the collaboration (wiki) and publishing portions.

The Knowledge Base

The pre-loaded content for the DocWiki system comes from its knowledge base (what we are working with now). This is provided as a text-exported MySQL database that can be modified en masse before loading (such as substituting ‘YourName’ for ‘DocWiki’). The exemplar upon which this knowledge base is modeled is the MIKE2.0 framework.
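The en masse substitution mentioned above can be sketched in a few lines of Python. This is only an illustration, not the DocWiki tooling: the function name `rebrand_dump` and the sample INSERT statements are hypothetical, and the sketch assumes the export is small enough to hold in memory as text.

```python
def rebrand_dump(sql_text, old="DocWiki", new="YourName"):
    """Replace every occurrence of `old` with `new` in a text SQL export.

    Returns the modified text and the number of replacements made.
    """
    count = sql_text.count(old)
    return sql_text.replace(old, new), count


# Hypothetical fragment of a text-exported database, for illustration only.
sample_dump = (
    "INSERT INTO page (page_title) VALUES ('DocWiki_Rationale');\n"
    "INSERT INTO text (old_text) VALUES ('Welcome to DocWiki');\n"
)

rebranded, replaced = rebrand_dump(sample_dump)
print(replaced)
print(rebranded)
```

A blind text replacement touches every match, including any that should have been left alone, so it is worth reviewing the changed lines before loading the result into MySQL.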

The DocWiki Knowledge Base

MIKE2.0 (Method for an Integrated Knowledge Environment) provides a comprehensive methodology that can be applied across a number of different projects within the information management space. MIKE2.0 provides an organized way to describe the why, when, how and who of information management projects. Via standard templates and structures, MIKE2.0 provides a consistent basis to describe and manage these projects, and in a way that helps promote their interoperability and consistency across the enterprise.

MIKE2.0 has a generalized methodology and set of templates applicable to initiatives, the phases, activities and tasks to undertake them, and supporting assets. Supporting assets can range from glossaries and definition of terms and concepts to very specific technical documents or background material. The entire system is logical and applies a consistent design and organizational structure and categories.

For DocWiki, we wanted a complete, turnkey content knowledge base. This meant that we needed to accommodate all forms of project management and guidance, ranging from specific “how-to” and technical discussions to the entire suite of background and supporting material. The scope of this knowledge content is defined as what a new person assigned a lead or implementation responsibility would need to read or master.

As a destination site MIKE2.0 is quite broad: it embraces the ability to model virtually any information management initiative. This makes MIKE2.0 an invaluable source of structure and methodology guidance, but also results in it being quite limited in the specific how-tos associated with any given initiative.[3] The strength of MIKE2.0, however, is that its structure can be grabbed and quickly applied to form an organizational and structural basis for filling out the knowledge base for any specific information development initiative. And, that is exactly what was done with this CIS DocWiki.

MIKE2.0 hosts and maintains its project-related structure in Mediawiki (with some extensions). Combined with its templates, this provides a rapid-start baseline for beginning to tailor and flesh out the specific details for a given information management initiative. Thus, after copying broad aspects of the MIKE2.0 system into the incipient DocWiki, it was relatively straightforward to let the existing structure and templates of MIKE2.0 guide next steps.

This CIS DocWiki contains 354 substantive articles, a complete activity and tasking structure, and various re-usable templates based on Semantic Mediawiki for structured and consistent access and retrieval. New tasks and structure can be readily added to the system. Existing structure or content can be deleted or marked as archive for non-display.

For new CIS DocWiki (or Citizen Dan-based) deployments, this means the knowledge base can be completely modified and extended for local circumstances. The set-up of the Mediawiki instance is separate from the loading or modification of the knowledge base, which means the look-and-feel of the entire system, not to mention user rights and permissions, can also be readily tailored for local requirements.

The core content of the CIS DocWiki and its basis in a set structure and methodology (derived from MIKE2.0) means that the knowledge base is also adaptable for other broader information development areas, especially in the semantic enterprise or semantic government arenas.

The Guiding Structures

The Supporting MUNI Structure

Strictly speaking, the vocabularies and structures (including, of course, ontologies) that drive this SemGov offering are also part of the knowledge base. And, in fact, many of these aspects, especially those related to the actual operation of the instances, are included as part of the standard knowledge base.

However, the applicable domain ontology itself is separately maintained. This arm’s length-separation is done to acknowledge that the ontology has independent use and value apart from the knowledge base or the software that is the focus of it.

In the Citizen Dan instance, this structure is the MUNI ontology. MUNI is a general local government domain ontology that can find use in a broad array of circumstances, with or without Citizen Dan. Thus, the ontology itself and its documentation, discussion forums and use cases are maintained separately.

The Collaboration and Publishing Environment

The Wiki/Publishing Portion

The software framework that hosts and manages all of this content is the Mediawiki software, originally developed for Wikipedia. This framework is supported by a number of standard extensions packaged with the DocWiki distribution. One of the more notable extensions is Semantic Mediawiki. Mediawiki also is the wiki framework underlying MIKE2.0, so content sharing between the systems is straightforward.

The first use of the DocWiki is to add new content to the knowledge base and to modify or extend what is provided in the baseline. For straight authoring, DocWiki offers the standard wikitext basis for content entry and editing, as well as the wikEd enhanced editor and the FCKeditor WYSIWYG rich-text editor. Each of these may be turned on or off at will.

All of the baseline content is fully organized and categorized via a standard structure. Pre-existing templates aid in entering new content in specific areas consistently or in providing standard administrative ways of tagging content for completeness or need for editorial attention. Tasks and concepts, in particular, follow set ways of entry and description. These set templates, some forms-based and some derived from Semantic Mediawiki, are also tied into automatic internal scripts for listing and organizing various items. So long as new material is entered properly, it will be reflected in various stats and listings. Unlike sole reliance on Semantic Mediawiki, the DocWiki approach is a mix of standard wiki categories and semantic types. Both are used for effective organization of the knowledge base.
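To make the mix of wiki categories and semantic types concrete, the sketch below pulls both kinds of markers out of a page's wikitext. It relies on the standard `[[Category:Name]]` link and the Semantic Mediawiki `[[Property::value]]` inline annotation. The property names `Has phase` and `Has owner`, the sample page, and the function `page_metadata` are hypothetical and are not taken from the DocWiki templates.

```python
import re

# [[Category:Name]] or [[Category:Name|sort key]]
CATEGORY_RE = re.compile(r"\[\[Category:([^\]|]+)(?:\|[^\]]*)?\]\]")
# [[Property::value]] or [[Property::value|display text]]
PROPERTY_RE = re.compile(r"\[\[([^\]:|]+)::([^\]|]+)(?:\|[^\]]*)?\]\]")


def page_metadata(wikitext):
    """Return (categories, properties) found in a page's wikitext."""
    categories = [name.strip() for name in CATEGORY_RE.findall(wikitext)]
    properties = {}
    for name, value in PROPERTY_RE.findall(wikitext):
        # A property may be set more than once on a page, so keep a list.
        properties.setdefault(name.strip(), []).append(value.strip())
    return categories, properties


# Hypothetical task page, for illustration only.
sample_page = (
    "Install the base system before loading content.\n"
    "[[Has phase::Deployment]] [[Has owner::Site administrator]]\n"
    "[[Category:Task]] [[Category:How-to]]\n"
)

categories, properties = page_metadata(sample_page)
print(categories)
print(properties)
```

In a running wiki these markers are normally emitted by the templates and queried by Mediawiki and Semantic Mediawiki themselves; the regular expressions here only show what the two kinds of annotation look like side by side.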

Besides the knowledge base of domain content and “how-to”, the system also comes pre-packaged with wiki “how-to” and best-practices guidance for using the system effectively and consistently. As with standard wikis, there is a history of prior page revisions that gives the system rollback and version control. Mediawiki also has a capable user access and permissions framework that covers access, reading, editing and uploads.

Besides the standard and required extensions, DocWiki also comes packaged with the necessary settings and configuration files to operate “out-of-the-box” in its designed baseline mode. Of course, these settings, too, can be changed and modified by site administrators, and DocWiki also includes guidance on how to do that.

The Publication Portion

A little known but highly useful part of the Mediawiki API allows direct export of XHTML content. Then, with minor XSLT conversion templates, it is possible to strip out wiki-specific conventions (such as the editing of individual sections) or to create straight XML versions. When this is combined with the use of internal DocWiki CSS style sheets that impose some clean and semantic style identifiers, a common canonical output basis for content is possible.

From that point, a given deployment may use its own CSS styles to theme output content. Output Web pages (XHTML) or XML files then can be processed using existing and accurate utilities to produce PDF or *.doc documents. Then, with systems such as OpenOffice, an even wider variety of document formats can be produced. These facilities mean that the DocWiki can also act as a single-source publishing environment.
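The step of stripping wiki-specific conventions from exported XHTML can be sketched as follows. Note that this uses Python's standard-library ElementTree in place of the XSLT templates described above, so it illustrates the idea rather than the DocWiki conversion itself. The class names in `EDIT_CLASSES` are an assumption about how section-edit links are marked and may differ by Mediawiki version, and the input is assumed to be well-formed XML without HTML-only named entities.

```python
import xml.etree.ElementTree as ET

# Assumed class names used to mark section-edit links in the exported markup.
EDIT_CLASSES = {"editsection", "mw-editsection"}


def strip_edit_links(xhtml):
    """Remove section-edit link elements from a well-formed XHTML fragment."""
    root = ET.fromstring(xhtml)
    for parent in list(root.iter()):
        previous = None
        for child in list(parent):
            if set(child.get("class", "").split()) & EDIT_CLASSES:
                # Keep any text that followed the removed element.
                tail = child.tail or ""
                if previous is None:
                    parent.text = (parent.text or "") + tail
                else:
                    previous.tail = (previous.tail or "") + tail
                parent.remove(child)
            else:
                previous = child
    return ET.tostring(root, encoding="unicode")


# Hypothetical exported fragment, for illustration only.
sample_xhtml = (
    "<div>"
    '<h2><span class="editsection">[edit]</span> '
    '<span class="mw-headline">Notes</span></h2>'
    "<p>Body text.</p>"
    "</div>"
)

cleaned = strip_edit_links(sample_xhtml)
print(cleaned)
```

The cleaned markup can then be styled with a deployment's own CSS or passed to a separate conversion utility, which is outside the scope of this sketch.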

Notes

  1. See M.K. Bergman, Redux: Enterprise Software Licensing on Life Support, June 2, 2006.
  2. Actually, some governments may not have many internal resources or capabilities. So, the more proper formulation of this question is whether that government has the necessary supporting infrastructure, information and systems such that it can hire from multiple entities and avoid any specific vendor lock-in.
  3. See further, M.K. Bergman, MIKE2.0: Open Source Information Development in the Enterprise, February 23, 2010; and, M.K. Bergman, Open SEAS: A Framework to Transition to a Semantic Enterprise, March 1, 2010.