#LyX 1.6.7 created this file. For more info see http://www.lyx.org/
\lyxformat 345
\begin_document
\begin_header
\textclass scrbook
\begin_preamble
\input{preamble.tex}
\end_preamble
\use_default_options false
\language english
\inputencoding utf8
\font_roman lmodern
\font_sans helvet
\font_typewriter courier
\font_default_family default
\font_sc false
\font_osf false
\font_sf_scale 100
\font_tt_scale 100
\graphics default
\float_placement H
\paperfontsize default
\spacing single
\use_hyperref true
\pdf_bookmarks true
\pdf_bookmarksnumbered false
\pdf_bookmarksopen false
\pdf_bookmarksopenlevel 1
\pdf_breaklinks false
\pdf_pdfborder false
\pdf_colorlinks false
\pdf_backref false
\pdf_pdfusetitle true
\papersize letterpaper
\use_geometry false
\use_amsmath 2
\use_esint 1
\cite_engine basic
\use_bibtopic false
\paperorientation portrait
\secnumdepth 3
\tocdepth 3
\paragraph_separation indent
\defskip medskip
\quotes_language english
\papercolumns 1
\papersides 1
\paperpagestyle default
\bullet 0 5 11 -1
\bullet 1 5 24 -1
\bullet 2 0 0 -1
\tracking_changes false
\output_changes false
\author ""
\author ""
\end_header
\begin_body
\begin_layout Standard
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
input{chapter.tex}
\end_layout
\end_inset
\end_layout
\begin_layout Chapter
\begin_inset CommandInset label
LatexCommand label
name "cha:Data-Dictionary"
\end_inset
Data Dictionary
\end_layout
\begin_layout Standard
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
epigraph{Words are but symbols for the relations of things to one another
and to us; nowhere do they touch upon absolute truth.}{Friedrich Nietzsche}
\end_layout
\end_inset
\end_layout
\begin_layout Standard
Throughout the remainder of this manual,
\emph on
data dictionary
\emph default
is the name given to a map of all database columns represented in a human-reada
ble format (including internationalization).
You can think of it as a
\emph on
column dictionary
\emph default
.
\begin_inset CommandInset ref
LatexCommand ref
reference "fig:Data-Dictionary-Overview"
\end_inset
represents practical uses of a data dictionary.
\end_layout
\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status collapsed
\begin_layout Plain Layout
\begin_inset Graphics
filename drawings/dictionary-architecture.png
display false
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
\begin_inset CommandInset label
LatexCommand label
name "fig:Data-Dictionary-Overview"
\end_inset
Data Dictionary Overview
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
The primary purpose of the data dictionary is to define a single location
where programmer jargon is translated into end user text producing a completely
consistent experience.
Creating a data dictionary requires the following:
\end_layout
\begin_layout Itemize
Document an Entity Relationship Diagram.
\end_layout
\begin_layout Itemize
Create database tables and views.
\end_layout
\begin_layout Itemize
Convert column names to text.
\end_layout
\begin_layout Itemize
Populate the tables with data.
\end_layout
\begin_layout Itemize
List business areas and column usage.
\end_layout
\begin_layout Itemize
Generate the resource bundles.
\end_layout
\begin_layout Standard
A data dictionary has the following advantages:
\end_layout
\begin_layout Itemize
Single, consistent text throughout all visible parts of the system.
\end_layout
\begin_layout Itemize
Easy-to-find database schema, table, and column identifiers.
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
Useful when report requirements state desired data, not fully qualified
database columns.
\end_layout
\end_inset
\end_layout
\begin_layout Itemize
Readily supported internationalization.
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
The brunt of the work can be machine translated.
\end_layout
\end_inset
\end_layout
\begin_layout Itemize
Ability to correlate dependencies between reports and database columns.
\end_layout
\begin_layout Itemize
Ability to develop a tool for generating and managing
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
appjasperserver{}
\end_layout
\end_inset
Domains.
\end_layout
\begin_layout Standard
A data dictionary has the following disadvantages:
\end_layout
\begin_layout Itemize
Creating the data dictionary takes development effort.
\end_layout
\begin_layout Itemize
A simple implementation cannot adequately capture 100% of business needs.
\end_layout
\begin_layout Standard
In this chapter:
\end_layout
\begin_layout Itemize
\series bold
Business Clusters.
\series default
Describes the pairing of Business Areas with Column Clusters.
\end_layout
\begin_layout Itemize
\series bold
Entity Relationship Diagram.
\series default
Describes tables and columns used by the data dictionary.
\end_layout
\begin_layout Itemize
\series bold
Create Tables and Views.
\series default
References the script to create the data dictionary.
\end_layout
\begin_layout Itemize
\series bold
Populate Database.
\series default
Describes the recurring task to keep the data dictionary up-to-date.
\end_layout
\begin_layout Itemize
\series bold
Map Column Names.
\series default
In-depth review of different ways to obtain column translations.
\end_layout
\begin_layout Itemize
\series bold
Resource Bundle.
\series default
Briefly describes how to create resource bundles from the data dictionary.
\end_layout
\begin_layout Itemize
\series bold
Business Areas and Column Clusters.
\series default
Describes how to export and maintain the relationships between Business
Areas and Column Clusters.
\end_layout
\begin_layout Section
Business Clusters
\end_layout
\begin_layout Standard
Before diving headlong into the data dictionary's entity relationship diagram
(ERD), it is worthwhile to see how the data dictionary can be organized
to meet business needs.
Every database column must be categorized before it can be used to generate
content.
In
\begin_inset CommandInset ref
LatexCommand ref
reference "fig:Data-Dictionary-Business-Clusters"
\end_inset
, the organizational structure for a physician's office serves as an example
to explain how the data dictionary is structured.
\end_layout
\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status collapsed
\begin_layout Plain Layout
\begin_inset Graphics
filename drawings/dictionary-business-clusters.png
display false
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
\begin_inset CommandInset label
LatexCommand label
name "fig:Data-Dictionary-Business-Clusters"
\end_inset
Data Dictionary Business Clusters
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
The crucial components to
\begin_inset CommandInset ref
LatexCommand ref
reference "fig:Data-Dictionary-Business-Clusters"
\end_inset
are as follows:
\end_layout
\begin_layout Itemize
\series bold
Column Translation.
\series default
Maps database column names to human-readable text.
\end_layout
\begin_layout Itemize
\series bold
Entity Column Map.
\series default
Maps column names to their corresponding tables or views.
\end_layout
\begin_layout Itemize
\series bold
Business Areas.
\series default
Describes a high-level collection of the major operational aspects to the
business.
\end_layout
\begin_layout Itemize
\series bold
Column Clusters.
\series default
Describes a low-level collection of database columns that logically group
within a Business Area.
\end_layout
\begin_layout Standard
Together, the Business Areas and Column Clusters are called
\emph on
Business Clusters
\emph default
as they are tightly associated.
For a fictional physician, the business might be concerned with patients,
payment processing, inventory, clinic locations, and so forth.
Each business area, then, can have any number of associated column clusters.
\begin_inset CommandInset ref
LatexCommand ref
reference "fig:Data-Dictionary-Business-Clusters"
\end_inset
shows the column clusters for a patient.
Every patient has a name, an address, a phone number, information about
their health, and so on.
The patient's name might include given, surname, middle, and title; whereas,
a patient's session might include items like date, time, duration, and
next appointment.
\end_layout
\begin_layout Standard
Notice that a patient and a physician both can have a name cluster; so too,
the patient, physician, staff, and clinic all have an associated address
cluster.
\end_layout
\begin_layout Standard
The Business Cluster associates Business Areas with Column Clusters.
Conceptually, each Column Cluster contains a set of references to database
columns.
As the ERD in
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Entity-Relationship-Diagram"
\end_inset
describes, how the database design actually works is a little bit more
complex.
\end_layout
\begin_layout Section
\begin_inset CommandInset label
LatexCommand label
name "sec:Entity-Relationship-Diagram"
\end_inset
Entity Relationship Diagram
\end_layout
\begin_layout Standard
\begin_inset CommandInset ref
LatexCommand ref
reference "fig:Data-Dictionary-ERD"
\end_inset
shows the entity relationship diagram used by the data dictionary.
\end_layout
\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status collapsed
\begin_layout Plain Layout
\begin_inset Graphics
filename drawings/data-dictionary-erd.png
display false
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
\begin_inset CommandInset label
LatexCommand label
name "fig:Data-Dictionary-ERD"
\end_inset
Data Dictionary ERD
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
The diagram includes the following conceptual areas:
\end_layout
\begin_layout Itemize
Translation and Internationalization
\end_layout
\begin_layout Itemize
Entity Relations
\end_layout
\begin_layout Itemize
Business Clusters
\end_layout
\begin_layout Subsection
Translation and Internationalization
\end_layout
\begin_layout Standard
The purpose of the
\family typewriter
column_translation
\family default
and
\family typewriter
object_column
\family default
tables is to associate a word or phrase with a database column, effectively
mapping column names named by software developers to human readable text
for end users.
Joining data between the two (on the
\family typewriter
object_column_id
\family default
column) can return data shown in
\begin_inset CommandInset ref
LatexCommand ref
reference "tab:Example-Column-Translations"
\end_inset
.
\end_layout
\begin_layout Standard
\begin_inset Float table
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
\begin_inset CommandInset label
LatexCommand label
name "tab:Example-Column-Translations"
\end_inset
Example Column Translations
\end_layout
\end_inset
\begin_inset Tabular
<lyxtabular version="3" rows="5" columns="5">
<features>
<column alignment="left" valignment="top" width="0">
<column alignment="left" valignment="top" width="0">
<column alignment="left" valignment="top" width="0">
<column alignment="left" valignment="top" width="0">
<column alignment="left" valignment="top" width="0">
<row>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
rowcolor{tableheader}
\backslash
toprule{}
\end_layout
\end_inset
\series bold
translated_text
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
language_code
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
country_code
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
column_name
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
business_name
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
First Name
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
en
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
CA
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
first_name
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
first
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Hours
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
en
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
CA
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
pay_hours
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
hours
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Street
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
en
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
CA
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
addr_street
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
street
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Rue
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
fr
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
CA
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
addr_street
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
street
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
tabularnewline
\backslash
bottomrule%
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
This rudimentary design has some deficiencies:
\end_layout
\begin_layout Itemize
The
\family typewriter
language_code
\family default
and
\family typewriter
country_code
\family default
combinations can be normalized.
\end_layout
\begin_layout Itemize
The
\family typewriter
report_width_min
\family default
(and
\family typewriter
max
\family default
) columns could be defined elsewhere.
\end_layout
\begin_layout Standard
The report width columns provide hints to developers when they are writing
reports.
As printed reports typically have a finite width
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
There are not many situations where reports are printed on plotters.
\end_layout
\end_inset
(for example, letter or legal sized), developers need to know how much
space they can allocate to fields across a page.
The same column name used among different tables and views has a single
occurrence of its name within the
\family typewriter
object_column
\family default
table.
\end_layout
\begin_layout Standard
Differences exist between Australlian, British, and Canadian English.
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
For example, the word ``root'' has vulgar connotations in Australia, but
``root directory'' in Canada is not so uncommon.
\end_layout
\end_inset
To avoid potentially embarrassing translations, each column can have a
translation based on language
\emph on
and
\emph default
country, as defined by the
\family typewriter
language_code
\family default
and
\family typewriter
country_code
\family default
columns in
\begin_inset CommandInset ref
LatexCommand ref
reference "tab:Example-Column-Translations"
\end_inset
.
\end_layout
\begin_layout Subsection
Entity Relations
\end_layout
\begin_layout Standard
These tables serve to list all the database objects (tables and views) in
the system having at least one column with columns that are presented to
an end user.
This is accomplished using the following tables:
\end_layout
\begin_layout Itemize
\family typewriter
entity_relation
\family default
-- Lists unique database objects.
\end_layout
\begin_layout Itemize
\family typewriter
object_column_entity_relation
\family default
-- Maps the relationship between columns and database objects.
\end_layout
\begin_layout Subsection
Business Clusters
\end_layout
\begin_layout Standard
A large private chain of medical clinics might have business areas such
as, physicians, patients, locations, inventory, and medical history records.
Within each business area is a collection of categories, or
\begin_inset Quotes eld
\end_inset
column clusters
\begin_inset Quotes erd
\end_inset
, which are general attributes associated with each business area.
For example, a physician likely has an address, phone numbers, and a name.
Each of these attributes refers to a cluster of data stored in columns.
An address consists of a street, city, region, country, and code.
Similarly, a name
\emph on
\emph default
might include first, last, middle, prefix, and suffix.
In this example, a name represents a column cluster and a physician represents
a business area.
\end_layout
\begin_layout Standard
The point of the business area-column clusters, then, is to organize the
business data, as shown in
\begin_inset CommandInset ref
LatexCommand ref
reference "tab:Business-Clusters"
\end_inset
.
\end_layout
\begin_layout Standard
\begin_inset Float table
wide false
sideways false
status collapsed
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
\begin_inset CommandInset label
LatexCommand label
name "tab:Business-Clusters"
\end_inset
Business Clusters
\end_layout
\end_inset
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
begin{small}
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Tabular
<lyxtabular version="3" rows="5" columns="7">
<features>
<column alignment="left" valignment="top" width="0">
<column alignment="left" valignment="top" width="0">
<column alignment="left" valignment="top" width="0">
<column alignment="left" valignment="top" width="0">
<column alignment="left" valignment="top" width="0">
<column alignment="left" valignment="top" width="0">
<column alignment="left" valignment="top" width="0">
<row>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
rowcolor{tableheader}
\backslash
toprule{}
\end_layout
\end_inset
\series bold
Area
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
Cluster
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
Relation
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
Business Name
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
Column
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
Business Name
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
Text
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
midrule{}
\end_layout
\end_inset
physician
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
name
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
physician
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Physician
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
fname
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
first
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
First Name
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
physician
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
name
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
physician
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Physician
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
lname
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
last
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Last Name
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
patient
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
name
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
patient
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Patient
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
fname
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
first
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
First Name
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
patient
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
name
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
patient
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Patient
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
lname
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
last
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
Last Name
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
tabularnewline
\backslash
bottomrule%
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
end{small}
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
Using the data from
\begin_inset CommandInset ref
LatexCommand ref
reference "tab:Business-Clusters"
\end_inset
, it is now relatively straightforward to write a SQL statement that generates
the resource bundle in
\begin_inset CommandInset ref
LatexCommand ref
reference "alg:Autogenerated-Resource-Bundle"
\end_inset
.
\end_layout
\begin_layout Standard
\begin_inset Float algorithm
wide false
sideways false
status collapsed
\begin_layout Plain Layout
\begin_inset Graphics
filename source/text/business_cluster.properties.png
display false
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
\begin_inset CommandInset label
LatexCommand label
name "alg:Autogenerated-Resource-Bundle"
\end_inset
Autogenerated Resource Bundle
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
Note here that the data only appear to be duplicated.
In the data dictionary's table that contains a list of all the columns,
the column name of
\begin_inset Quotes eld
\end_inset
fname
\begin_inset Quotes erd
\end_inset
appears a single time.
If the column
\begin_inset Quotes eld
\end_inset
fname
\begin_inset Quotes erd
\end_inset
does not store information about a person's
\begin_inset Quotes eld
\end_inset
First Name
\begin_inset Quotes erd
\end_inset
throughout the database, you will either have to add another layer of complexit
y into this entity relationship design or force a consistent naming scheme
onto the data model.
Although it can be painful and time-consuming, favour the latter.
\end_layout
\begin_layout Section
Create Tables and Views
\end_layout
\begin_layout Standard
Modern ERD tools
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
DbSchema, Toad, eRWin, Druid III, and others.
\end_layout
\end_inset
can transform diagrams, such as
\begin_inset CommandInset ref
LatexCommand ref
reference "fig:Data-Dictionary-ERD"
\end_inset
, into executable SQL code.
To install a dictionary schema and create associated database objects,
see the auto-generated SQL source code in
\begin_inset CommandInset ref
LatexCommand vref
reference "sub:filesqlcreatedatadictionary"
\end_inset
.
\end_layout
\begin_layout Section
Populate Database
\end_layout
\begin_layout Standard
Populating a database is not a one-time task.
Every time new tables, views, or columns are added to a system, its data
dictionary must be updated to include these items.
Fortunately, that's a matter of running a few queries.
To (re-)populate the data dictionary in a database:
\end_layout
\begin_layout Enumerate
Scan for new (or removed) tables, views, or columns.
\end_layout
\begin_layout Enumerate
Link new columns with text.
\end_layout
\begin_layout Enumerate
Assign new columns with all relevant tables.
\end_layout
\begin_layout Standard
Example scripts to (re-)populate the data dictionary are in the data dictionary'
s maintenance directory (see
\begin_inset CommandInset ref
LatexCommand vref
reference "sec:Data-Dictionary-Scripts"
\end_inset
), with the following names:
\end_layout
\begin_layout Itemize
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
filename{
\backslash
filesqlentitycolumn}
\end_layout
\end_inset
\end_layout
\begin_layout Itemize
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
filename{
\backslash
filesqlcolumntext}
\end_layout
\end_inset
\end_layout
\begin_layout Itemize
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
filename{
\backslash
filesqlrelateentitycolumn}
\end_layout
\end_inset
\end_layout
\begin_layout Standard
The first script generates a sequence of
\family typewriter
INSERT
\family default
and
\family typewriter
DELETE
\family default
statements that either inserts new entities and columns or removes expired
entities and columns.
Review (and update, if necessary) the auto-generated statements before
executing them.
The second script generates a list of
\family typewriter
INSERT
\family default
statements that associate database columns with text.
These insertions can later be internationalized.
The third script associates the new-found columns with their corresponding
new-found tables and views.
The last script does not generate code and can take several minutes to
complete.
\end_layout
\begin_layout Section
\begin_inset CommandInset label
LatexCommand label
name "sec:Map-Column-Names"
\end_inset
Map Column Names
\end_layout
\begin_layout Standard
The least time consuming way to map column names to their human-readable
equivalent is to enforce database development standards that mandate the
use of underscores in column names.
This makes conversion easy and mostly avoids the problem (abbreviated names
remain an issue).
If your database was developed without such standards, use this section.
The following sections describe how to translate programming jargon into
human-readable text:
\end_layout
\begin_layout Itemize
Existing mappings
\end_layout
\begin_layout Itemize
Web service
\end_layout
\begin_layout Itemize
Word splitting
\end_layout
\begin_layout Standard
Using all three ways should reduce the translation task quite significantly.
\end_layout
\begin_layout Subsection
Existing Mappings
\end_layout
\begin_layout Standard
If your software applications have a front-end of any sort, there might
be a mapping of column names to names a business owner could relate to.
For example, a web user interface might map database columns to input field
prompts.
If the map is created in the Extensible Markup Language (XML), consider
using an XSL template to transform the XML into SQL statements.
\end_layout
\begin_layout Standard
XML is related to other technologies, including the following:
\end_layout
\begin_layout Itemize
Extensible Stylesheet Language (XSL)
\end_layout
\begin_layout Itemize
Extract, Transform, and Load (ETL)
\end_layout
\begin_layout Standard
When you want to deposit the contents of XML documents directly into a database,
ETL tools are an excellent choice.
But when the data cannot be inserted directly (if the SQL statements to
perform the inserts are based on a complex relationship within the XML),
generating the SQL insert statements can be a relatively quick solution,
which falls into the realm of XSL.
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
There are web-based XSLT engines, such as
\begin_inset CommandInset href
LatexCommand href
target "http://www.futurelab.ch/xmlkurs/xslt.en.html"
\end_inset
.
\end_layout
\end_inset
Let's look at a short example of how XML and XSL can be used to generate
SQL that produces insert statements for the data dictionary.
\end_layout
\begin_layout Standard
A map of database columns to text might already exist, like the example
in
\begin_inset CommandInset ref
LatexCommand ref
reference "alg:Data-Dictionary-XML-Source"
\end_inset
.
\end_layout
\begin_layout Standard
\begin_inset Float algorithm
wide false
sideways false
status collapsed
\begin_layout Plain Layout
\begin_inset Graphics
filename source/xml/field-list.xml.png
display false
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
\begin_inset CommandInset label
LatexCommand label
name "alg:Data-Dictionary-XML-Source"
\end_inset
Data Dictionary XML Source
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
The XML is quite straightforward and should need no explanation, although
the XSL, in
\begin_inset CommandInset ref
LatexCommand ref
reference "alg:Data-Dictionary-XSL-Template"
\end_inset
, could use a summary.
\end_layout
\begin_layout Standard
\begin_inset Float algorithm
wide false
sideways false
status collapsed
\begin_layout Plain Layout
\begin_inset Graphics
filename source/xsl/field-list.xsl.png
display false
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
\begin_inset CommandInset label
LatexCommand label
name "alg:Data-Dictionary-XSL-Template"
\end_inset
Data Dictionary XSL Template
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
The lines perform the following actions:
\end_layout
\begin_layout Itemize
\series bold
Lines 1 to 3.
\series default
Typical declaration of a stylesheet.
\end_layout
\begin_layout Itemize
\series bold
Lines 5 and 6.
\series default
Create a text file, such as a SQL file, and remove any extraneous spaces.
\end_layout
\begin_layout Itemize
\series bold
Lines 8 to 10.
\series default
Traverse through the XML document starting at the root element.
\end_layout
\begin_layout Itemize
\series bold
Lines 12 to 14.
\series default
Traverse through all the
\family typewriter
field-list
\family default
elements.
\end_layout
\begin_layout Itemize
\series bold
Lines 16 to 28.
\series default
Create a SQL statement based on the values of the
\family typewriter
column
\family default
and
\family typewriter
prompt
\family default
elements (from
\begin_inset CommandInset ref
LatexCommand ref
reference "alg:Data-Dictionary-XML-Source"
\end_inset
).
\end_layout
\begin_layout Standard
When the XML from
\begin_inset CommandInset ref
LatexCommand ref
reference "alg:Data-Dictionary-XML-Source"
\end_inset
is transformed using the stylesheet in
\begin_inset CommandInset ref
LatexCommand ref
reference "alg:Data-Dictionary-XSL-Template"
\end_inset
, the result will be
\family typewriter
INSERT
\family default
statements similar to the statement in
\begin_inset CommandInset ref
LatexCommand ref
reference "alg:Data-Dictionary-XSL-Transform"
\end_inset
.
\end_layout
\begin_layout Standard
\begin_inset Float algorithm
wide false
sideways false
status collapsed
\begin_layout Plain Layout
\begin_inset Graphics
filename source/sql/data-dictionary-inserts.pgsql.png
display false
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
\begin_inset CommandInset label
LatexCommand label
name "alg:Data-Dictionary-XSL-Transform"
\end_inset
Data Dictionary XSL Transform
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
This example shows a specific instance of how to popualate the data dictionary
automatically.
A similar technique could be used, for example, to leverage the translations
from XHTML (or related markup) with
\family typewriter
<label>
\family default
elements (or similar elements in whatever web framework is currently being
used).
If it looks like it will be too much work to implement an automatic solution,
consider hiring someone to perform the data entry service.
\end_layout
\begin_layout Standard
It is likely that the same column name is used on different web pages.
In such a situation, generating duplicate
\family typewriter
INSERT
\family default
statements will cause a failure.
To resolve this problem:
\end_layout
\begin_layout Enumerate
Use XSL to write out columns and values in a simple file format:
\family typewriter
column,value
\family default
.
\end_layout
\begin_layout Enumerate
Optionally, convert the column names to match the case of their database
column namesakes.
\end_layout
\begin_layout Enumerate
Sort the file (use the
\family typewriter
sort
\family default
command in Linux).
\end_layout
\begin_layout Enumerate
Extract only the unique lines (use the
\family typewriter
uniq
\family default
command in Linux).
\end_layout
\begin_layout Enumerate
Convert the simple file format to
\family typewriter
INSERT
\family default
statements.
\end_layout
\begin_layout Standard
The details for implementing this solution are dependent on your set up
and needs.
\end_layout
\begin_layout Subsection
Web Service
\end_layout
\begin_layout Standard
Chances are, your database columns do not have underscores between each
and every word.
After you first convert the column names using existing source files and
run the column names through the dictionary split method, you might consider
an additional method: machine translation using a web service.
In this case, using Google.
Get permission before running the script in
\begin_inset CommandInset ref
LatexCommand ref
reference "alg:Google-based-Word-Extraction"
\end_inset
(without express permission, running the shell script violates Google's
Terms of Service).
\end_layout
\begin_layout Standard
\begin_inset Float algorithm
wide false
sideways false
status collapsed
\begin_layout Plain Layout
\begin_inset Graphics
filename source/scripts/column-split.sh.png
display false
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
\begin_inset CommandInset label
LatexCommand label
name "alg:Google-based-Word-Extraction"
\end_inset
Google-based Word Extraction
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Itemize
\series bold
Line 3.
\series default
Loops over every line in the file.
The file
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
filename{
\backslash
filetxtcolumns}
\end_layout
\end_inset
should contain one database column name per line.
\end_layout
\begin_layout Itemize
\series bold
Line 4.
\series default
Writes the name of the column followed by a full-colon to standard output,
but does not write the line terminator (yet).
\end_layout
\begin_layout Itemize
\series bold
Line 5.
\series default
Launches a text-oriented web browser to query Google with the database
column to translate (into English).
\end_layout
\begin_layout Itemize
\series bold
Line 6.
\series default
Throws away the entire web page result except for the one line that contains
that one special phrase.
\end_layout
\begin_layout Itemize
\series bold
Line 7.
\series default
Extracts the first word after Google's six-word phrase until there are
no more words, and writes them to standard output.
Note that this phrase can change at any time.
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
For example, the text might become ``Showing results for''.
\end_layout
\end_inset
\end_layout
\begin_layout Itemize
\series bold
Line 8.
\series default
The line terminator.
\end_layout
\begin_layout Itemize
\series bold
Line 9.
\series default
Prevent sending requests in rapid succession to Google.
\end_layout
\begin_layout Subsection
Word Splitting
\end_layout
\begin_layout Standard
Word splitting, also known as text segmentation, solves the problem of extractin
g words from a string of characters lacking word boundaries (such as, spaces
or punctuation).
For example,
\family typewriter
clientaccountedit
\family default
likely means
\begin_inset Quotes eld
\end_inset
Client Account Edit
\begin_inset Quotes erd
\end_inset
or
\begin_inset Quotes eld
\end_inset
Client Accounted It
\begin_inset Quotes erd
\end_inset
.
Developers are notorious for lumping words together to name database columns.
This natural language processing is a difficult problem, and will not have
a perfect solution for some time.
Until then, you can either spend your time researching and looking for
tools that will help, or do the following:
\end_layout
\begin_layout Enumerate
Obtain a dictionary of words,
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
In Unix systems,
\family typewriter
/usr/share/dict/words
\family default
is a good start.
\end_layout
\end_inset
and remove any words with the following:
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
The regular expression
\family typewriter
^[a-z]*$
\family default
will be helpful here.
\end_layout
\end_inset
\end_layout
\begin_deeper
\begin_layout Itemize
Upper-case characters (proper nouns)
\end_layout
\begin_layout Itemize
Numerals
\end_layout
\begin_layout Itemize
Punctuation
\end_layout
\end_deeper
\begin_layout Enumerate
Create a corpus of material using business documentation, especially technical
manuals:
\end_layout
\begin_deeper
\begin_layout Itemize
Database design documentation.
\end_layout
\begin_layout Itemize
Technical architectures.
\end_layout
\begin_layout Itemize
Requirements documents.
\end_layout
\begin_layout Itemize
Blogs, wikis, websites, and even emails.
\end_layout
\end_deeper
\begin_layout Enumerate
Extract, normalize (punctuation, case), and combine the individual words
from the corpus text.
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
The program Lynx is a text-based browser that, quite handily, can extract
plain text from HTML.
\end_layout
\end_inset
\end_layout
\begin_layout Enumerate
Tally the occurrences of each word in the corpus.
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
The program
\family typewriter
sed
\family default
allows stream editing using regular expressions.
See
\begin_inset CommandInset ref
LatexCommand vref
reference "alg:Script-Tally-Corpus"
\end_inset
.
\end_layout
\end_inset
\end_layout
\begin_layout Enumerate
Eliminate non-words (while preserving lemmas)
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
The words
\emph on
run
\emph default
,
\emph on
runs
\emph default
,
\emph on
ran
\emph default
, and
\emph on
running
\emph default
are from the same lexeme, with
\emph on
run
\emph default
as the lemma.
\end_layout
\end_inset
from the corpus tallies by cross-referencing the dictionary.
\end_layout
\begin_layout Enumerate
Convert the number of occurrences to probabilities using a rudimentary file
format (such as CSV).
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
See
\begin_inset CommandInset ref
LatexCommand vref
reference "alg:Script-Tallies-Probabilities"
\end_inset
.
\end_layout
\end_inset
\end_layout
\begin_layout Standard
The above steps are not a lot of work, nor are they extremely complicated;
the steps are crucial as they set the stage for splitting the concatenated
phrases (column names) based on relative usage probability.
The complicated part is performing the analysis to split the phrases.
To do this, find a software tool that performs text segmentation (also
known as word splitting), such as the one described in
\begin_inset CommandInset ref
LatexCommand ref
reference "cha:Word-Split"
\end_inset
.
\end_layout
\begin_layout Section
\begin_inset CommandInset label
LatexCommand label
name "sec:Resource-Bundle"
\end_inset
Resource Bundle
\end_layout
\begin_layout Standard
As described, the data dictionary has many uses, one of which is generating
a resource bundle.
The query to generate the resource bundle uses the following views:
\end_layout
\begin_layout Itemize
\family typewriter
dictionary.resource_bundle
\end_layout
\begin_layout Itemize
\family typewriter
dictionary.generate_resource_bundle
\end_layout
\begin_layout Standard
The view
\family typewriter
dictionary.resource_bundle
\family default
is the work horse; the other view is for convenience, based on the former.
The
\family typewriter
dictionary.resource_bundle
\family default
view is the mastermind view of the data dictionary ERD, it is one of the
reasons the data dictionary exists.
\end_layout
\begin_layout Subsection
Custom Resource Bundles
\end_layout
\begin_layout Standard
Inevitably, there will be various texts that have no corresponding database
column.
It is a mistake to create a data dictionary and generate a resource bundle
only to edit that resource bundle by hand.
Any manual changes to the resource bundle are lost when you regenerate
the resource bundle.
The simplest solution is to create another (likely smaller) resource bundle,
one that is maintained by hand.
In theory, this should not pose a problem.
In practice,
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
appireport{}
\end_layout
\end_inset
does not support multiple resource bundles.
\end_layout
\begin_layout Standard
Until
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
appireport{}
\end_layout
\end_inset
supports multiple resource bundles, a quick work-around is to concatenate
the computer-generated bundle and the manually-maintained bundle to create
a new, combined resource bundle.
Configure
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
appireport{}
\end_layout
\end_inset
to use the combined resource bundle.
\end_layout
\begin_layout Subsection
Alternate Text Lengths
\end_layout
\begin_layout Standard
As printed pages have fixed widths, additional column name values are sometimes
necessary.
Shorter names can be implemented by addding an alternate column to the
data dictionary's translation.
Generate the resource bundle keys as before with the alternate values alongside
the original.
For example:
\end_layout
\begin_layout LyX-Code
person.name.first=First Name
\end_layout
\begin_layout LyX-Code
person.name.first.alternate=First
\end_layout
\begin_layout Standard
The word ``alternate'' is arbitrary.
Use a suitable term for your application.
\end_layout
\begin_layout Section
Business Areas and Column Clusters
\end_layout
\begin_layout Standard
Assigning column clusters to business areas is a completely manual process.
There is no practical way to program the database organization in a way
that makes sense for a corporation.
The term
\emph on
business cluster
\emph default
is short-hand for a business area-column cluster relationship.
This has a couple of implications:
\end_layout
\begin_layout Itemize
\series bold
Business Cluster Map.
\series default
How do you associate business areas with column clusters?
\end_layout
\begin_layout Itemize
\series bold
Database Migration.
\series default
How do you retain the map when migrating through the development lifecycle?
\end_layout
\begin_layout Standard
These topics are described in the following subsections.
\end_layout
\begin_layout Subsection
\begin_inset CommandInset label
LatexCommand label
name "sub:Business-Cluster-Map"
\end_inset
Business Cluster Map
\end_layout
\begin_layout Standard
It would be inconvenient, error-prone, and laborious to manually perform
the same set of steps to insert a relationship between a business area
and a column cluster for every possible business cluster relationship.
Maintaining the association between business areas and column clusters
can be accomplished as follows:
\end_layout
\begin_layout Enumerate
Create a function that, when given a fully qualified column reference (for
example,
\family typewriter
schema.entity.column
\family default
), returns the primary key from the object's column relationship (that is,
the primary key of the row that maintains the relationship between a table
(or view) and a column).
\end_layout
\begin_layout Enumerate
Create a function that performs the necessary steps to insert the new business
area-column cluster relation that meets the data dictionary's relational
constraints.
The function requires a business area, column cluster, business area type,
and fully qualified column reference.
\end_layout
\begin_layout Enumerate
Create a spreadsheet that tracks the mapping of business areas to column
clusters and produces the corresponding SQL statements to insert the relationsh
ips into the database, using the function defined in the previous step.
\end_layout
\begin_layout Standard
Example functions for the first and second steps are in
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:filesqlgetentitycolumnpk"
\end_inset
and
\begin_inset CommandInset ref
LatexCommand vref
reference "sub:filesqlbusinessclusterrelation"
\end_inset
, respectively.
It is the latter function that is called to perform useful work.
A spreadsheet is not the only possible way to create the initial assignment
of column clusters to business areas, but it is expedient.
The idea is to create a series of sheets---one per business area---that
consists of the name of a column cluster and a number of rows representing
the fully qualified column that must be part of that column cluster.
This is represented visually in
\begin_inset CommandInset ref
LatexCommand ref
reference "tab:Example-Business-Cluster-Map"
\end_inset
.
\end_layout
\begin_layout Standard
\begin_inset Float table
wide false
sideways false
status collapsed
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
\begin_inset CommandInset label
LatexCommand label
name "tab:Example-Business-Cluster-Map"
\end_inset
Example Business Cluster Map
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Tabular
<lyxtabular version="3" rows="5" columns="3">
<features>
<column alignment="left" valignment="top" width="0">
<column alignment="left" valignment="top" width="0">
<column alignment="left" valignment="top" width="0">
<row>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
rowcolor{tableheader}
\backslash
toprule{}
\end_layout
\end_inset
\series bold
name
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
address
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
appointment
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
midrule{}
\end_layout
\end_inset
clinic.person.first_name
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
clinic.address.street
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
clinic.appointment.start_date
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
clinic.person.last_name
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
clinic.address.city
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
clinic.appointment.duration
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
clinic.person.middle_name
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
clinic.address.region
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
clinic.person.title
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
clinic.address.country
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
tabularnewline
\backslash
bottomrule%
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
The header row in
\begin_inset CommandInset ref
LatexCommand ref
reference "tab:Example-Business-Cluster-Map"
\end_inset
represents the name of the column cluster, and the values in each column
point to the location in the database where the information is stored.
Group the sheets by business area, such as the business areas shown in
\begin_inset CommandInset ref
LatexCommand vref
reference "fig:Data-Dictionary-Business-Clusters"
\end_inset
.
After the spreadsheet is complete, it is then possible to generate the
SQL statements required to create the business cluster map by calling the
function described in
\begin_inset CommandInset ref
LatexCommand vref
reference "sub:filesqlbusinessclusterrelation"
\end_inset
.
\end_layout
\begin_layout Subsection
Database Migration
\end_layout
\begin_layout Standard
After creating the relationships in a development database, it is prudent
to make a backup of these relationships.
Depending on how the database migration works,
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
There are schema comparison tools, but they have technical quirks that prevent
a completely automated migration.
\end_layout
\end_inset
it might not be feasible to simply copy the data from the development database
into the testing database.
For this, create a view that generates the SQL statements from
\begin_inset CommandInset ref
LatexCommand vref
reference "sub:filesqlregeneratebusinessclusters"
\end_inset
.
The resulting SQL script can be used to reseed the business area-column
cluster relationships.
\end_layout
\begin_layout Section
Search
\end_layout
\begin_layout Standard
A problem with systems that are based on a vast number of tables and views
is in trying to find specific columns.
There are a number of ways to hunt through a database that contains hundreds,
or even thousands, of entities.
Graphical user interfaces (GUIs), such as pgAdmin,
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
\begin_inset CommandInset href
LatexCommand href
target "http://www.pgadmin.org/"
\end_inset
\end_layout
\end_inset
or web-based products such as phpMyAdmin
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
\begin_inset CommandInset href
LatexCommand href
target "http://www.phpmyadmin.net/"
\end_inset
\end_layout
\end_inset
display a long, long list.
And although some tools allow developers to search database meta-information,
the following problems have not been adequately solved in practice:
\end_layout
\begin_layout Itemize
\series bold
Search.
\series default
They all lack the ability to search, in natural language terms, for database
columns.
\end_layout
\begin_layout Itemize
\series bold
Scroll.
\series default
Most GUIs have awkward interfaces for finding objects: an unwieldy list
forcing the user to scroll.
Even when a filter is available, or the objects can be found by typing
its name, some developers insist on scrolling through the list, which is
woefully inefficient.
\end_layout
\begin_layout Standard
After the data dictionary is created, though, a query can be constructed
to scan for specific items without knowing the exact column names.
One such possibility is shown in
\begin_inset CommandInset ref
LatexCommand ref
reference "alg:Searching-the-Data-Dictionary"
\end_inset
.
\end_layout
\begin_layout Standard
\begin_inset Float algorithm
wide false
sideways false
status collapsed
\begin_layout Plain Layout
\begin_inset Graphics
filename source/sql/data-dictionary/usage/search.pgsql.png
display false
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Caption
\begin_layout Plain Layout
\begin_inset CommandInset label
LatexCommand label
name "alg:Searching-the-Data-Dictionary"
\end_inset
Searching the Data Dictionary
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
Notice that the SQL statement contains the following code:
\end_layout
\begin_layout LyX-Code
SELECT
\end_layout
\begin_layout LyX-Code
t.*
\end_layout
\begin_layout Standard
The only time implicitly selecting all columns is adviseable is when executing
one-off statements (like when searching the data dictionary)---production
code must explicitly name columns.
\end_layout
\begin_layout Section
Summary
\end_layout
\begin_layout Standard
A system that is completely internally consistent (that minimizes misleading
names) is a difficult but noble goal.
The data dictionary has the ability to highlight inconsistent database
areas.
Any database system of sufficient complexity will inevitably introduce
unintentional naming conflicts, or will have two different column names
refer to the same data.
The views described in this chapter (for example,
\family typewriter
dictionary.resource_bundle
\family default
) can be used to discover parts of the database that are falling into disrepair.
\end_layout
\begin_layout Standard
Lastly, with a little bit of ingenuity, it is possible to build a partial
\emph on
ad hoc
\emph default
reporting system based on the data dictionary.
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
appjasperreports{}
\end_layout
\end_inset
is more than a system to execute reports developed in
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
appireport{}
\end_layout
\end_inset
.
\begin_inset ERT
status collapsed
\begin_layout Plain Layout
\backslash
appjasperreports{}
\end_layout
\end_inset
also allows software developers to write programs to develop reports.
\end_layout
\end_body
\end_document