Document Management System Interoperability
Document repositories - their design and operation - will become a CIO Critical Success Factor over the next several years. Their importance in information technology (IT) architectures will be enormous. While the looming Year 2000 demands may overshadow immediate ventures into this relatively new IT area, agency planners who wait until after the Year 2000 to give careful attention to them do so at great risk. This paper was written to help agency officials and planners prepare, regardless of when they start writing system specifications.
The immediate impetus for this paper was the recent adoption by the industry-wide Document Management Alliance of a "DMA 1.0 Specification," the industry's first standard enabling document management systems from different vendors to interoperate. The paper covers document management systems generally and their role in architecture, and why the DMA accomplishment is so important.
Certainly, there are many aspects to the operational use of this pull approach, especially dealing with communications and access. It may sound a bit unreal to businesspeople not accustomed to heavy Internet use, but it's old hat to scientists, engineers, and others who have already been using the "ftp" (file transfer protocol) Internet capability for at least a decade. As those users know, a file is placed on an FTP server and its specific address is given to the intended users, who can then download it whenever they want. Each version of the file can have its own unique address.
The DMS-based pull approach will probably find its initial use in groupware environments, intranets, and extranets. Its broader use will be accelerated by the DMA interoperability standard which envisions being able to directly access a specific document with its unique address, in a very similar way to the World Wide Web's manner of accessing individual documents on Web servers.
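As a rough illustration of the pull idea described above, the following Python sketch uses a toy in-memory repository in which every stored version gets its own unique address, and recipients retrieve a version by address whenever they choose. The class name `PullRepository`, the `dms://` address scheme, and the host name are all hypothetical; they are not taken from the DMA specification or from any product.

```python
from dataclasses import dataclass, field


@dataclass
class PullRepository:
    """Toy in-memory stand-in for a document server (hypothetical)."""
    host: str
    _store: dict = field(default_factory=dict)

    def publish(self, doc_id: str, version: int, content: bytes) -> str:
        """Store one version and return the unique address that names it."""
        # The address format is an assumption made for this sketch only.
        address = f"dms://{self.host}/{doc_id}/v{version}"
        self._store[address] = content
        return address

    def fetch(self, address: str) -> bytes:
        """Return exactly the version that the address names."""
        return self._store[address]


# The author publishes two versions and hands out the addresses.
repo = PullRepository(host="hq.example.gov")
addr_v1 = repo.publish("case-summary", 1, b"first draft")
addr_v2 = repo.publish("case-summary", 2, b"second draft")

# A recipient pulls whichever version it was told about, at any later time.
latest = repo.fetch(addr_v2)
```

The point of the sketch is only that the sender distributes a small address rather than the document itself, and that each version remains separately retrievable.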
Microsoft's product strategy includes building DMS features into coming NT releases. Will that do the job?
Microsoft targets the mass market, and we're seeing initial client-side appearances in the "Outlook" product, and also some document indexing and searching services in Web site products. What's anticipated in NT are modest server-side ("BackOffice") DMS functions at a modest price increment, suitable for a global server market that includes small businesses. Don't be surprised to see these functions tied to Microsoft's messaging, groupware and workflow capabilities. Large organizations will want much richer DMS functionality, flexibility, scalability, etc., for organization-wide use, and will be willing to pay the incremental cost. That's where the other COTS DMS products will find their niche for many years to come.
What's an example of where I'd want the richer DMS capabilities?
Branch versioning of compound documents is one, and it is easily understood through an example. An agency's investigator at Headquarters in Washington is leading a case effort with an investigator in the Chicago Regional office, in cooperation with an Illinois state government investigator. As part of the wrap-up, the Headquarters lead investigator drafts a final summary and report. It contains text, of course, plus photographs, drawings, images of bank checks, data tables, and embedded URLs to interviews and intercept transcripts that are stored on a secure intranet host in the agency. It's a "compound" document because the embedded objects (images, spreadsheets, etc.) are themselves separate documents (objects) in the DMS. The draft report and all of its contents are confidential, highly sensitive, and will be subjected to challenge and cross-examination at trial.
The lead investigator sends the draft final report simultaneously to the regional and state investigators and invites their corrections. (That's the branching.) They review it independently and concurrently, with the regional investigator adding another embedded object while the state investigator notes a correction to a different embedded object. They send their recommended revisions back to the Headquarters lead investigator, who first "checks them in" (records them in the DMS) on their separate branches, and then melds them into a final consolidated version. The Headquarters DMS not only keeps track of all this, including the security aspects, but also provides the record-keeping environment needed in the legal arena by keeping the various versions, by recording the case audit trail, and by guaranteeing archival integrity.
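The check-in, branch, and merge sequence in this example can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not how any particular DMS implements versioning: embedded objects are plain strings, deletions are not handled, and the names `CompoundDocument`, `check_in_branch`, and `merge` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Version:
    label: str
    author: str
    objects: dict                       # embedded object name -> content
    parents: list = field(default_factory=list)


class CompoundDocument:
    """Keeps every version plus an audit trail of who checked in what."""

    def __init__(self, author, objects):
        self.versions = {}
        self.audit_trail = []
        self._check_in("1.0", author, dict(objects), [])

    def _check_in(self, label, author, objects, parents):
        self.versions[label] = Version(label, author, objects, parents)
        self.audit_trail.append((label, author, tuple(parents)))
        return self.versions[label]

    def check_in_branch(self, base, label, author, changes):
        """Record a reviewer's revision on its own branch off `base`."""
        objects = dict(self.versions[base].objects)
        objects.update(changes)
        return self._check_in(label, author, objects, [base])

    def merge(self, base, branch_labels, label, author):
        """Meld branches into one version; refuse conflicting edits."""
        base_objects = self.versions[base].objects
        merged = dict(base_objects)
        for branch in branch_labels:
            for name, content in self.versions[branch].objects.items():
                if base_objects.get(name) == content:
                    continue            # this branch did not touch the object
                already_changed = (
                    name in merged and merged[name] != base_objects.get(name)
                )
                if already_changed and merged[name] != content:
                    raise ValueError(f"conflicting edits to {name!r}")
                merged[name] = content
        return self._check_in(label, author, merged, list(branch_labels))


# Headquarters drafts the report; the two reviewers work concurrently.
doc = CompoundDocument("hq", {"summary": "draft text", "check_image": "scan A"})
doc.check_in_branch("1.0", "1.0.chicago", "chicago", {"photo": "site photo"})
doc.check_in_branch("1.0", "1.0.illinois", "illinois",
                    {"check_image": "scan A (corrected)"})
final = doc.merge("1.0", ["1.0.chicago", "1.0.illinois"], "2.0", "hq")
```

Because the two reviewers changed different embedded objects, the merge succeeds without conflict, and the audit trail retains all four check-ins with their parent versions.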
What are the interoperability considerations here?
Headquarters, in this example, is where the final case files are brought together and maintained. It needs lots of sophistication and scalability. The Chicago regional office has only a LAN with a couple of modest servers, and doesn't need all the bells, whistles and horsepower of the headquarters system. The Illinois environment is the responsibility of that state's IRM executives, independent of Washington. It's integrated with the overall IT architecture of the state, which is affected by such applications as taxes, driving licenses, roads, and health care reimbursements. Thus all three repositories in this example could be on different platforms, and all three could be using different DMSs, from different vendors. Yet, the cooperative mission activities require their interoperability. The interoperability needed is at least two-way between headquarters and region, and probably three-way to include the state case file. Each is a client to the others and each one's repository is a server to the others.
So what's the answer?
The DMS vendor community knows that interoperability is likely to affect the future success of many products, including such related ones as imaging, e-mail, voice-mail, groupware, workflow, and even printing. Industry knows that repositories - where and how things are stored and managed - will be the nexus of all of these. All know that interoperability comes only with standards.
A few years ago, document management standards efforts were started at two levels. One was focused on a simple application programming interface (API) to let any kind of client interact with a DMS that also implemented the API, for the purpose of storing and retrieving files. Desktop applications like word processing and spreadsheet packages are on those clients and must interact with the DMS to store and retrieve the files created with those packages. In that sense, the DMS replaces the Windows file system/directory. With this standard, the client must know the specific design, construction, capabilities, etc., of the DMS in order to use it, including its proprietary document structuring, indexing, and query facility. Because all this knowledge is inside the client, the API itself is simple and inexpensive, yet so valuable because it makes the power of a proprietary DMS available to a wide range of desktop applications.
This API standard, called ODMA (for Open Document Management API) has been built into many different kinds of clients, and is used widely today. It can be viewed as a many-to-one standard, for many different clients to interact with each proprietary DMS in each DMS's own proprietary way. Because each client must be intimately knowledgeable in advance of each DMS with which it will interact, it does a portion of the interoperability job needed by our example, but falls far short of the whole job.
In parallel, a second, more ambitious standards effort was launched to create interoperability across different proprietary DMSs regardless of the platforms on which they reside and regardless of the networks in which they exist, and without requiring clients to have advance intimate DMS knowledge. The goal is to have uniform access to any document stored in any format, anywhere, at any time. This standard can be combined with the ODMA standard for inexpensive universal client access, and adds what's needed for completely vendor-independent cross-repository interoperability. It's called DMA, for the Document Management Alliance that's creating it, and is a middleware specification for what is truly many-to-many interoperability. That's many clients to many DMSs, regardless of platforms and networks. Because it accommodates international multi-language conventions, it's even language-independent.
Needless to say, the DMA effort is ambitious and sophisticated, because it means that any conforming client, including Web clients, can interact with any conforming DMS without having to know in advance the specific commands and characteristics of each DMS. It enables a client to use its own user interface and command set (look and feel) to store and retrieve objects from different-vendor DMSs, and to discover DMS characteristics when a request is first sent. There are specifications for objects, querying, versioning, containment, check-in/check-out, compound document support, content-based searching, and other aspects of repository management. Most of these are in the DMA 1.0 specification which was formally approved in December 1997. (None of these are included in the ODMA specification.) The priorities for the DMA 2.0 and later levels of the specification will be determined by ongoing user feedback.
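The many-to-many idea, including discovery of a repository's characteristics at request time, can be illustrated with a small Python sketch. It is not the DMA interface: the `Repository` class, its `describe` and `search` methods, and the two vendor classes are hypothetical names invented for this illustration. Each vendor keeps its own internal layout, and the client never needs to know it.

```python
from abc import ABC, abstractmethod


class Repository(ABC):
    """Hypothetical uniform interface that every conforming DMS exposes."""

    @abstractmethod
    def describe(self) -> dict:
        """Report this repository's searchable fields and features."""

    @abstractmethod
    def search(self, field: str, value: str) -> list:
        """Return titles of documents whose `field` contains `value`."""


class VendorARepository(Repository):
    def __init__(self):
        # Vendor A stores rows keyed by its own proprietary column names.
        self._rows = [{"DOC_TITLE": "Case 17 summary", "DOC_OWNER": "hq"}]

    def describe(self):
        return {"vendor": "A", "fields": ["title", "owner"], "versioning": True}

    def search(self, field, value):
        column = {"title": "DOC_TITLE", "owner": "DOC_OWNER"}[field]
        return [row["DOC_TITLE"] for row in self._rows
                if value.lower() in row[column].lower()]


class VendorBRepository(Repository):
    def __init__(self):
        # Vendor B stores (title, properties) tuples instead.
        self._docs = [("Case 17 interview notes", {"custodian": "illinois"})]

    def describe(self):
        return {"vendor": "B", "fields": ["title"], "versioning": False}

    def search(self, field, value):
        if field != "title":
            return []
        return [title for title, _ in self._docs
                if value.lower() in title.lower()]


def search_everywhere(repositories, field, value):
    """One client call spans all repositories, whatever the vendor."""
    hits = []
    for repo in repositories:
        # Discover capabilities first instead of hard-coding vendor knowledge.
        if field in repo.describe()["fields"]:
            hits.extend(repo.search(field, value))
    return hits


repositories = [VendorARepository(), VendorBRepository()]
by_title = search_everywhere(repositories, "title", "case 17")
by_owner = search_everywhere(repositories, "owner", "hq")
```

Under ODMA-style access, by contrast, the client itself would have to carry the knowledge of vendor A's column names and vendor B's tuple layout.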
Can you give me an analogy to clarify this?
Think of people, and today's file rooms or records centers that store lots of cabinets containing lots of different files holding tons of paper. The people coming into the rooms, either to store or retrieve, are the clients. The ODMA specification creates a big, well-lit door and entranceway through which anyone can enter, whether on foot, crutches or in a wheelchair, and regardless of gender, race, or nationality. However, to use the file room, ODMA expects that each person will know beforehand the file scheme and the rules of the file room, and be able to read and understand the cabinet and file labeling. Each file room is unique, and ODMA expects each user to understand its uniqueness. Because it's an entranceway specification, the ODMA spec is simple and basic.
The DMA specification lets all those unique file rooms be used without requiring advance knowledge of each room, or even the ability to understand the language in which the labels are written. If three different state agencies were to give access to a Federal investigator, the DMA specification lets them say, "We three State agents are going to let a particular Federal agent (to whom we've given permission) use information in our three different States' file rooms without the agent having to know in advance how the contents are organized or labeled in the different file rooms, or even the procedures set up in the rooms." The Federal agent can use a single client computer program - either the Federal agency's proprietary client or an Internet browser - to use all the file rooms simultaneously, despite the differences among the rooms in their organization, labeling, and procedures. In effect, DMA lets the different file rooms look and operate the same way to the Federal agent despite their underlying differences.
Now that's interoperability! One can imagine its power in regulatory activities wherein a government regulatory agency would be able to access regulatees' documents while letting each have the freedom to architect its own document management environment. (Speaking of architecture, all the Federal Government agency IT architecture documents could be made directly accessible to all Federal agency CIOs despite the agencies operating on different platforms, with different office application packages, and using different DMS COTS products.) That's the promise of DMA.
How can I be sure that conforming products will be available to me when I want them?
Because vendors will only build what the market wants, if government CIOs want it in 2001 they must start now to send the message to the vendor community. The message has to be that their agencies will be requiring the specification in procurement actions, for product deliveries beginning in 1999, and that they anticipate communicating with the DMA to prioritize expanded functions for the next levels of the DMA specification.
Agencies that have established relationships with affected vendors can do as several private sector users are doing, namely informing their vendors that implementation of the specification will be required in future versions, releases, upgrades, etc., of their products. Agencies that are developing architectures probably will be identifying several interoperability specifications they expect to be seeking in their future product acquisitions, and the DMA specification can appear on their lists. Agencies that are conducting research can ask about availability of the DMA specification in any Requests for Information that they issue.
Regardless of where an agency stands on procurement of conforming products, it can express its needs and desires within the DMA as a user member. Participation in the DMA can be a way to influence both the delivery of conforming products and the enhancements made in future releases of the specification. The DMA vendors have already identified several candidate additions and improvements but cannot do them all at once. User and marketplace feedback will set the priorities.
For more information about the DMA, including membership and technical materials on the 1.0 specification, see http://www.aiim.org/dma.
This paper was authored by Dan Schneider, U.S. Department of Justice, 202-514-4318, firstname.lastname@example.org. The Department of Justice is a DMA user organization member. Mr Schneider welcomes reader questions or comments.