As enterprises attempt to create value from their information assets, information modeling provides an important tool for communicating the design, standards and key aspects of a domain's entities to everyone involved throughout the information lifecycle.
Because information resides in, and is accessed through, many disparate formats and structures (files, relational databases, non-relational databases, services, blockchains), the ability to query, relate, organize, format, report on, store and update entities, and to alter their schemas, requires coordination and planning. Information modeling is a tool for information management: it supports creating trustworthy, high-quality data while maintaining data integrity, which in turn allows architectures to be robust and scalable.
Modeling is a particular challenge when data resides in unstructured sources such as spreadsheets, text documents, wikis and document collaboration sites. At best, the medium is semi-structured and provides known metadata. Alternatively, vendors provide tools that can assist with searching document content or mining it for data and metadata. However, the costs, the quality and performance issues, and the risks are high when it comes to changing business processes and extracting information value from these sources.
Models are usually thought of as being created during requirements gathering, refined during design, referenced during implementation and kept as an artifact for later reference. However, significant effort also goes into developing them for other Information Management capabilities, such as Data Governance, Master Data Management, Information Integration, Analytics and Information Security.
Information modeling in IT Architecture takes many forms, all of which have representations of entities and their relationships to one another:
- Class diagrams: model the information required to build or maintain a system or component
- ER diagrams: data modeling specific to database design
- Information flow diagrams: indicate the origin of data and its uses (processes or systems) throughout the enterprise, a line of business (LOB) or a system, depending on the context
- Data models: depending on the level of detail, describe entities and their relationships, detailing the attributes, types and multiplicity of each entity relative to other entities
- Data lifecycles: how data is created, used and retained (archived or destroyed)
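The kind of detail a data model records (attributes, types and multiplicity) can be illustrated with a minimal sketch. The `Customer` and `Order` entities and all of their fields are hypothetical, chosen only for illustration; a real model would be expressed in whatever modeling notation the team has agreed on.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Order:
    """Hypothetical entity: one purchase placed by a customer."""
    order_id: int   # identifier
    total: float    # attribute with an explicit type


@dataclass
class Customer:
    """Hypothetical entity with typed attributes and relationships."""
    customer_id: int                                    # identifier
    name: str                                           # mandatory attribute (multiplicity 1)
    email: Optional[str] = None                         # optional attribute (multiplicity 0..1)
    orders: List[Order] = field(default_factory=list)   # relationship to Order (multiplicity 0..*)


# One customer related to one order; the email is intentionally left unset.
alice = Customer(customer_id=1, name="Alice")
alice.orders.append(Order(order_id=100, total=25.0))
```

The type annotations play the role of attribute types, while `Optional` and `List` stand in for the multiplicities that a class or ER diagram would show on each relationship.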
Models vary depending on a number of factors: usage (analytical or transactional), focus (global, enterprise, LOB or system) and level of detail (conceptual, logical or physical). For example, customer data from an online application will be modeled differently from the data used to gather statistics on customer location: the former would typically have a normalized structure and the latter a star or snowflake schema. Models can also indicate boundaries that limit information flow, as needed for regulatory compliance or for business access restrictions that require audit trails and tight security around authentication and authorization. A model may be relevant only to a small group of individuals with a specific business process need, or it may be recognized as a global industry standard.
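The contrast between a normalized transactional model and a star schema for customer-location statistics can be sketched with Python's built-in `sqlite3` module. All table and column names here (`city`, `customer`, `dim_location`, `fact_customer_count`) and the sample rows are assumptions made for illustration, and the star schema is reduced to a single dimension to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Transactional side: normalized, each fact stored once.
CREATE TABLE city (
    city_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    country TEXT NOT NULL
);
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    city_id     INTEGER NOT NULL REFERENCES city(city_id)
);

-- Analytical side: one fact table keyed to a denormalized dimension.
CREATE TABLE dim_location (
    location_key INTEGER PRIMARY KEY,
    city         TEXT,
    country      TEXT
);
CREATE TABLE fact_customer_count (
    location_key   INTEGER REFERENCES dim_location(location_key),
    customer_count INTEGER
);
""")

# Hypothetical sample data for the transactional tables.
conn.executemany("INSERT INTO city VALUES (?, ?, ?)",
                 [(1, "Lyon", "France"), (2, "Paris", "France"), (3, "Berlin", "Germany")])
conn.executemany("INSERT INTO customer VALUES (?, ?, ?)",
                 [(1, "Ana", 1), (2, "Ben", 2), (3, "Cy", 2), (4, "Di", 3)])

# Derive the analytical tables from the transactional ones.
conn.execute("INSERT INTO dim_location SELECT city_id, name, country FROM city")
conn.execute("INSERT INTO fact_customer_count "
             "SELECT city_id, COUNT(*) FROM customer GROUP BY city_id")

# Customers per country, answered from the star schema alone.
per_country = conn.execute("""
    SELECT d.country, SUM(f.customer_count)
    FROM fact_customer_count AS f
    JOIN dim_location AS d ON d.location_key = f.location_key
    GROUP BY d.country
    ORDER BY d.country
""").fetchall()
print(per_country)  # [('France', 3), ('Germany', 1)]
```

The normalized tables suit the online application, where individual customers are inserted and updated; the fact and dimension tables trade that flexibility for simpler aggregate queries.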
A model provides a formal representation that aids conversations between architects and users, stakeholders, other architects and analysts. Modeling languages standardize the notations that convey the meaning of entities, attributes, transitions, relationships, identifiers and other elements. A modeling session with subject matter experts helps translate real-world concepts (a business scenario or process) into the correct entities and information for the business need. The notation helps ensure that meaning is not lost during the design and implementation phases, so it is important that session participants understand the notation being used. Improper analysis and understanding of how the data is used will lead to performance, reliability, availability and scalability issues.