Dave Jarvis' Repositories

git clone https://repo.autonoma.ca/repo/indispensable.git
#LyX 1.6.7 created this file. For more info see http://www.lyx.org/
\lyxformat 345
\begin_document
\begin_header
\textclass scrbook
\begin_preamble
\input{preamble.tex}
\end_preamble
\use_default_options false
\language english
\inputencoding utf8
\font_roman lmodern
\font_sans helvet
\font_typewriter courier
\font_default_family default
\font_sc false
\font_osf false
\font_sf_scale 100
\font_tt_scale 100

\graphics default
\float_placement H
\paperfontsize default
\spacing single
\use_hyperref true
\pdf_bookmarks true
\pdf_bookmarksnumbered false
\pdf_bookmarksopen false
\pdf_bookmarksopenlevel 1
\pdf_breaklinks false
\pdf_pdfborder false
\pdf_colorlinks false
\pdf_backref false
\pdf_pdfusetitle true
\papersize letterpaper
\use_geometry false
\use_amsmath 2
\use_esint 1
\cite_engine basic
\use_bibtopic false
\paperorientation portrait
\secnumdepth 3
\tocdepth 3
\paragraph_separation indent
\defskip medskip
\quotes_language english
\papercolumns 1
\papersides 1
\paperpagestyle default
\bullet 0 5 11 -1
\bullet 1 5 24 -1
\bullet 2 0 0 -1
\tracking_changes false
\output_changes false
\author "" 
\author "" 
\end_header

\begin_body

\begin_layout Standard
\begin_inset ERT
status collapsed

\begin_layout Plain Layout


\backslash
input{chapter.tex}
\end_layout

\end_inset


\end_layout

\begin_layout Chapter
\begin_inset CommandInset label
LatexCommand label
name "cha:Data-Dictionary"

\end_inset

Data Dictionary
\end_layout

\begin_layout Standard
\begin_inset ERT
status collapsed

\begin_layout Plain Layout


\backslash
epigraph{Words are but symbols for the relations of things to one another
 and to us; nowhere do they touch upon absolute truth.}{Friedrich Nietzsche}
\end_layout

\end_inset


\end_layout

\begin_layout Standard
Throughout the remainder of this manual, 
\emph on
data dictionary
\emph default
 is the name given to a map of all database columns represented in a human-reada
ble format (including internationalization).
 You can think of it as a 
\emph on
column dictionary
\emph default
.
 
\begin_inset CommandInset ref
LatexCommand ref
reference "fig:Data-Dictionary-Overview"

\end_inset

 represents practical uses of a data dictionary.
\end_layout

\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status collapsed

\begin_layout Plain Layout
\begin_inset Graphics
	filename drawings/dictionary-architecture.png
	display false

\end_inset


\end_layout

\begin_layout Plain Layout
\begin_inset Caption

\begin_layout Plain Layout
\begin_inset CommandInset label
LatexCommand label
name "fig:Data-Dictionary-Overview"

\end_inset

Data Dictionary Overview
\end_layout

\end_inset


\end_layout

\end_inset


\end_layout

\begin_layout Standard
The primary purpose of the data dictionary is to define a single location
 where programmer jargon is translated into end user text producing a completely
 consistent experience.
 Creating a data dictionary requires the following:
\end_layout

\begin_layout Itemize
Document an Entity Relationship Diagram.
\end_layout

\begin_layout Itemize
Create database tables and views.
\end_layout

\begin_layout Itemize
Convert column names to text.
\end_layout

\begin_layout Itemize
Populate the tables with data.
\end_layout

\begin_layout Itemize
List business areas and column usage.
\end_layout

\begin_layout Itemize
Generate the resource bundles.
\end_layout

\begin_layout Standard
A data dictionary has the following advantages:
\end_layout

\begin_layout Itemize
Single, consistent text throughout all visible parts of the system.
\end_layout

\begin_layout Itemize
Easy-to-find database schema, table, and column identifiers.
\begin_inset Foot
status collapsed

\begin_layout Plain Layout
Useful when report requirements state desired data, not fully qualified
 database columns.
\end_layout

\end_inset


\end_layout

\begin_layout Itemize
Readily supported internationalization.
\begin_inset Foot
status collapsed

\begin_layout Plain Layout
The brunt of the work can be machine translated.
\end_layout

\end_inset


\end_layout

\begin_layout Itemize
Ability to correlate dependencies between reports and database columns.
\end_layout

\begin_layout Itemize
Ability to develop a tool for generating and managing 
\begin_inset ERT
status collapsed

\begin_layout Plain Layout


\backslash
appjasperserver{}
\end_layout

\end_inset

 Domains.
\end_layout

\begin_layout Standard
A data dictionary has the following disadvantages:
\end_layout

\begin_layout Itemize
Creating the data dictionary takes development effort.
\end_layout

\begin_layout Itemize
A simple implementation cannot adequately capture 100% of business needs.
\end_layout

\begin_layout Standard
In this chapter:
\end_layout

\begin_layout Itemize

\series bold
Business Clusters.

\series default
 Describes the pairing of Business Areas with Column Clusters.
\end_layout

\begin_layout Itemize

\series bold
Entity Relationship Diagram.

\series default
 Describes tables and columns used by the data dictionary.
\end_layout

\begin_layout Itemize

\series bold
Create Tables and Views.

\series default
 References the script to create the data dictionary.
\end_layout

\begin_layout Itemize

\series bold
Populate Database.

\series default
 Describes the recurring task to keep the data dictionary up-to-date.
\end_layout

\begin_layout Itemize

\series bold
Map Column Names.

\series default
 In-depth review of different ways to obtain column translations.
\end_layout

\begin_layout Itemize

\series bold
Resource Bundle.

\series default
 Briefly describes how to create resource bundles from the data dictionary.
\end_layout

\begin_layout Itemize

\series bold
Business Areas and Column Clusters.

\series default
 Describes how to export and maintain the relationships between Business
 Areas and Column Clusters.
\end_layout

\begin_layout Section
Business Clusters
\end_layout

\begin_layout Standard
Before diving headlong into the data dictionary's entity relationship diagram
 (ERD), it is worthwhile to see how the data dictionary can be organized
 to meet business needs.
 Every database column must be categorized before it can be used to generate
 content.
 In 
\begin_inset CommandInset ref
LatexCommand ref
reference "fig:Data-Dictionary-Business-Clusters"

\end_inset

, the organizational structure for a physician's office serves as an example
 to explain how the data dictionary is structured.
\end_layout

\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status collapsed

\begin_layout Plain Layout
\begin_inset Graphics
	filename drawings/dictionary-business-clusters.png
	display false

\end_inset


\end_layout

\begin_layout Plain Layout
\begin_inset Caption

\begin_layout Plain Layout
\begin_inset CommandInset label
LatexCommand label
name "fig:Data-Dictionary-Business-Clusters"

\end_inset

Data Dictionary Business Clusters
\end_layout

\end_inset


\end_layout

\end_inset


\end_layout

\begin_layout Standard
The crucial components to 
\begin_inset CommandInset ref
LatexCommand ref
reference "fig:Data-Dictionary-Business-Clusters"

\end_inset

 are as follows:
\end_layout

\begin_layout Itemize

\series bold
Column Translation.

\series default
 Maps database column names to human-readable text.
\end_layout

\begin_layout Itemize

\series bold
Entity Column Map.

\series default
 Maps column names to their corresponding tables or views.
\end_layout

\begin_layout Itemize

\series bold
Business Areas.

\series default
 Describes a high-level collection of the major operational aspects to the
 business.
\end_layout

\begin_layout Itemize

\series bold
Column Clusters.

\series default
 Describes a low-level collection of database columns that logically group
 within a Business Area.
\end_layout

\begin_layout Standard
Together, the Business Areas and Column Clusters are called 
\emph on
Business Clusters
\emph default
 as they are tightly associated.
 For a fictional physician, the business might be concerned with patients,
 payment processing, inventory, clinic locations, and so forth.
 Each business area, then, can have any number of associated column clusters.
 
\begin_inset CommandInset ref
LatexCommand ref
reference "fig:Data-Dictionary-Business-Clusters"

\end_inset

 shows the column clusters for a patient.
 Every patient has a name, an address, a phone number, information about
 their health, and so on.
 The patient's name might include given, surname, middle, and title; whereas,
 a patient's session might include items like date, time, duration, and
 next appointment.
\end_layout

\begin_layout Standard
Notice that a patient and a physician both can have a name cluster; so too,
 the patient, physician, staff, and clinic all have an associated address
 cluster.
\end_layout

\begin_layout Standard
The Business Cluster associates Business Areas with Column Clusters.
 Conceptually, each Column Cluster contains a set of references to database
 columns.
 As the ERD in 
\begin_inset CommandInset ref
LatexCommand ref
reference "sec:Entity-Relationship-Diagram"

\end_inset

 describes, how the database design actually works is a little bit more
 complex.
\end_layout

\begin_layout Section
\begin_inset CommandInset label
LatexCommand label
name "sec:Entity-Relationship-Diagram"

\end_inset

Entity Relationship Diagram
\end_layout

\begin_layout Standard
\begin_inset CommandInset ref
LatexCommand ref
reference "fig:Data-Dictionary-ERD"

\end_inset

 shows the entity relationship diagram used by the data dictionary.
\end_layout

\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status collapsed

\begin_layout Plain Layout
\begin_inset Graphics
	filename drawings/data-dictionary-erd.png
	display false

\end_inset


\end_layout

\begin_layout Plain Layout
\begin_inset Caption

\begin_layout Plain Layout
\begin_inset CommandInset label
LatexCommand label
name "fig:Data-Dictionary-ERD"

\end_inset

Data Dictionary ERD
\end_layout

\end_inset


\end_layout

\end_inset


\end_layout

\begin_layout Standard
The diagram includes the following conceptual areas:
\end_layout

\begin_layout Itemize
Translation and Internationalization
\end_layout

\begin_layout Itemize
Entity Relations
\end_layout

\begin_layout Itemize
Business Clusters
\end_layout

\begin_layout Subsection
Translation and Internationalization
\end_layout

\begin_layout Standard
The purpose of the 
\family typewriter
column_translation
\family default
 and 
\family typewriter
object_column
\family default
 tables is to associate a word or phrase with a database column, effectively
 mapping column names named by software developers to human readable text
 for end users.
 Joining data between the two (on the 
\family typewriter
object_column_id
\family default
 column) can return data shown in 
\begin_inset CommandInset ref
LatexCommand ref
reference "tab:Example-Column-Translations"

\end_inset

.
\end_layout

\begin_layout Standard
\begin_inset Float table
wide false
sideways false
status open

\begin_layout Plain Layout
\begin_inset Caption

\begin_layout Plain Layout
\begin_inset CommandInset label
LatexCommand label
name "tab:Example-Column-Translations"

\end_inset

Example Column Translations
\end_layout

\end_inset


\begin_inset Tabular
<lyxtabular version="3" rows="5" columns="5">
<features>
<column alignment="left" valignment="top" width="0">
<column alignment="left" valignment="top" width="0">
<column alignment="left" valignment="top" width="0">
<column alignment="left" valignment="top" width="0">
<column alignment="left" valignment="top" width="0">
<row>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
\begin_inset ERT
status collapsed

\begin_layout Plain Layout


\backslash
rowcolor{tableheader}
\backslash
toprule{}
\end_layout

\end_inset


\series bold
translated_text
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\series bold
language_code
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\series bold
country_code
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\series bold
column_name
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\series bold
business_name
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
First Name
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
en
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
CA
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
first_name
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
first
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
Hours
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
en
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
CA
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
pay_hours
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
hours
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
Street
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
en
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
CA
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
addr_street
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
street
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
Rue
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
fr
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
CA
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
addr_street
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
street
\begin_inset ERT
status collapsed

\begin_layout Plain Layout


\backslash
tabularnewline
\backslash
bottomrule%
\end_layout

\end_inset


\end_layout

\end_inset
</cell>
</row>
</lyxtabular>

\end_inset


\end_layout

\end_inset


\end_layout

\begin_layout Standard
This rudimentary design has some deficiencies:
\end_layout

\begin_layout Itemize
The 
\family typewriter
language_code
\family default
 and 
\family typewriter
country_code
\family default
 combinations can be normalized.
\end_layout

\begin_layout Itemize
The 
\family typewriter
report_width_min
\family default
 (and 
\family typewriter
max
\family default
) columns could be defined elsewhere.
\end_layout

\begin_layout Standard
The report width columns provide hints to developers when they are writing
 reports.
 As printed reports typically have a finite width
\begin_inset Foot
status collapsed

\begin_layout Plain Layout
There are not many situations where reports are printed on plotters.
\end_layout

\end_inset

 (for example, letter or legal sized), developers need to know how much
 space they can allocate to fields across a page.
 The same column name used among different tables and views has a single
 occurrence of its name within the 
\family typewriter
object_column
\family default
 table.
\end_layout

\begin_layout Standard
Differences exist between Australlian, British, and Canadian English.
\begin_inset Foot
status collapsed

\begin_layout Plain Layout
For example, the word ``root'' has vulgar connotations in Australia, but
 ``root directory'' in Canada is not so uncommon.
\end_layout

\end_inset

 To avoid potentially embarrassing translations, each column can have a
 translation based on language 
\emph on
and
\emph default
 country, as defined by the 
\family typewriter
language_code
\family default
 and 
\family typewriter
country_code
\family default
 columns in 
\begin_inset CommandInset ref
LatexCommand ref
reference "tab:Example-Column-Translations"

\end_inset

.
\end_layout

\begin_layout Subsection
Entity Relations
\end_layout

\begin_layout Standard
These tables serve to list all the database objects (tables and views) in
 the system having at least one column with columns that are presented to
 an end user.
 This is accomplished using the following tables:
\end_layout

\begin_layout Itemize

\family typewriter
entity_relation
\family default
 -- Lists unique database objects.
\end_layout

\begin_layout Itemize

\family typewriter
object_column_entity_relation
\family default
 -- Maps the relationship between columns and database objects.
\end_layout

\begin_layout Subsection
Business Clusters
\end_layout

\begin_layout Standard
A large private chain of medical clinics might have business areas such
 as, physicians, patients, locations, inventory, and medical history records.
 Within each business area is a collection of categories, or 
\begin_inset Quotes eld
\end_inset

column clusters
\begin_inset Quotes erd
\end_inset

, which are general attributes associated with each business area.
 For example, a physician likely has an address, phone numbers, and a name.
 Each of these attributes refers to a cluster of data stored in columns.
 An address consists of a street, city, region, country, and code.
 Similarly, a name
\emph on
 
\emph default
might include first, last, middle, prefix, and suffix.
 In this example, a name represents a column cluster and a physician represents
 a business area.
\end_layout

\begin_layout Standard
The point of the business area-column clusters, then, is to organize the
 business data, as shown in 
\begin_inset CommandInset ref
LatexCommand ref
reference "tab:Business-Clusters"

\end_inset

.
\end_layout

\begin_layout Standard
\begin_inset Float table
wide false
sideways false
status collapsed

\begin_layout Plain Layout
\begin_inset Caption

\begin_layout Plain Layout
\begin_inset CommandInset label
LatexCommand label
name "tab:Business-Clusters"

\end_inset

Business Clusters
\end_layout

\end_inset


\begin_inset ERT
status collapsed

\begin_layout Plain Layout


\backslash
begin{small}
\end_layout

\end_inset


\end_layout

\begin_layout Plain Layout
\begin_inset Tabular
<lyxtabular version="3" rows="5" columns="7">
<features>
<column alignment="left" valignment="top" width="0">
<column alignment="left" valignment="top" width="0">
<column alignment="left" valignment="top" width="0">
<column alignment="left" valignment="top" width="0">
<column alignment="left" valignment="top" width="0">
<column alignment="left" valignment="top" width="0">
<column alignment="left" valignment="top" width="0">
<row>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
\begin_inset ERT
status collapsed

\begin_layout Plain Layout


\backslash
rowcolor{tableheader}
\backslash
toprule{}
\end_layout

\end_inset


\series bold
Area
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\series bold
Cluster
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\series bold
Relation
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\series bold
Business Name
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\series bold
Column
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\series bold
Business Name
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\series bold
Text
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
\begin_inset ERT
status collapsed

\begin_layout Plain Layout


\backslash
midrule{}
\end_layout

\end_inset

physician
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
name
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
physician
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
Physician
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
fname
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
first
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
First Name
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
physician
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
name
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
physician
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
Physician
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
lname
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
last
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
Last Name
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
patient
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
name
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
patient
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
Patient
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
fname
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
first
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
First Name
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
patient
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
name
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
patient
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
Patient
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
lname
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
last
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
Last Name
\begin_inset ERT
status collapsed

\begin_layout Plain Layout


\backslash
tabularnewline
\backslash
bottomrule%
\end_layout

\end_inset


\end_layout

\end_inset
</cell>
</row>
</lyxtabular>

\end_inset


\begin_inset ERT
status collapsed

\begin_layout Plain Layout


\backslash
end{small}
\end_layout

\end_inset


\end_layout

\end_inset


\end_layout

\begin_layout Standard
Using the data from 
\begin_inset CommandInset ref
LatexCommand ref
reference "tab:Business-Clusters"

\end_inset

, it is now relatively straightforward to write a SQL statement that generates
 the resource bundle in 
\begin_inset CommandInset ref
LatexCommand ref
reference "alg:Autogenerated-Resource-Bundle"

\end_inset

.
\end_layout

\begin_layout Standard
\begin_inset Float algorithm
wide false
sideways false
status collapsed

\begin_layout Plain Layout
\begin_inset Graphics
	filename source/text/business_cluster.properties.png
	display false

\end_inset


\end_layout

\begin_layout Plain Layout
\begin_inset Caption

\begin_layout Plain Layout
\begin_inset CommandInset label
LatexCommand label
name "alg:Autogenerated-Resource-Bundle"

\end_inset

Autogenerated Resource Bundle
\end_layout

\end_inset


\end_layout

\end_inset


\end_layout

\begin_layout Standard
Note here that the data only appear to be duplicated.
 In the data dictionary's table that contains a list of all the columns,
 the column name of 
\begin_inset Quotes eld
\end_inset

fname
\begin_inset Quotes erd
\end_inset

 appears a single time.
 If the column 
\begin_inset Quotes eld
\end_inset

fname
\begin_inset Quotes erd
\end_inset

 does not store information about a person's 
\begin_inset Quotes eld
\end_inset

First Name
\begin_inset Quotes erd
\end_inset

 throughout the database, you will either have to add another layer of complexit
y into this entity relationship design or force a consistent naming scheme
 onto the data model.
 Although it can be painful and time-consuming, favour the latter.
\end_layout

\begin_layout Section
Create Tables and Views
\end_layout

\begin_layout Standard
Modern ERD tools
\begin_inset Foot
status collapsed

\begin_layout Plain Layout
DbSchema, Toad, eRWin, Druid III, and others.
\end_layout

\end_inset

 can transform diagrams, such as 
\begin_inset CommandInset ref
LatexCommand ref
reference "fig:Data-Dictionary-ERD"

\end_inset

, into executable SQL code.
 To install a dictionary schema and create associated database objects,
 see the auto-generated SQL source code in 
\begin_inset CommandInset ref
LatexCommand vref
reference "sub:filesqlcreatedatadictionary"

\end_inset

.
\end_layout

\begin_layout Section
Populate Database
\end_layout

\begin_layout Standard
Populating a database is not a one-time task.
 Every time new tables, views, or columns are added to a system, its data
 dictionary must be updated to include these items.
 Fortunately, that's a matter of running a few queries.
 To (re-)populate the data dictionary in a database:
\end_layout

\begin_layout Enumerate
Scan for new (or removed) tables, views, or columns.
\end_layout

\begin_layout Enumerate
Link new columns with text.
\end_layout

\begin_layout Enumerate
Assign new columns with all relevant tables.
\end_layout

\begin_layout Standard
Example scripts to (re-)populate the data dictionary are in the data dictionary'
s maintenance directory (see 
\begin_inset CommandInset ref
LatexCommand vref
reference "sec:Data-Dictionary-Scripts"

\end_inset

), with the following names:
\end_layout

\begin_layout Itemize
\begin_inset ERT
status collapsed

\begin_layout Plain Layout


\backslash
filename{
\backslash
filesqlentitycolumn}
\end_layout

\end_inset


\end_layout

\begin_layout Itemize
\begin_inset ERT
status collapsed

\begin_layout Plain Layout


\backslash
filename{
\backslash
filesqlcolumntext}
\end_layout

\end_inset


\end_layout

\begin_layout Itemize
\begin_inset ERT
status collapsed

\begin_layout Plain Layout


\backslash
filename{
\backslash
filesqlrelateentitycolumn}
\end_layout

\end_inset


\end_layout

\begin_layout Standard
The first script generates a sequence of 
\family typewriter
INSERT
\family default
 and 
\family typewriter
DELETE
\family default
 statements that either inserts new entities and columns or removes expired
 entities and columns.
 Review (and update, if necessary) the auto-generated statements before
 executing them.
 The second script generates a list of 
\family typewriter
INSERT
\family default
 statements that associate database columns with text.
 These insertions can later be internationalized.
 The third script associates the new-found columns with their corresponding
 new-found tables and views.
 The last script does not generate code and can take several minutes to
 complete.
\end_layout

\begin_layout Section
\begin_inset CommandInset label
LatexCommand label
name "sec:Map-Column-Names"

\end_inset

Map Column Names
\end_layout

\begin_layout Standard
The least time consuming way to map column names to their human-readable
 equivalent is to enforce database development standards that mandate the
 use of underscores in column names.
 This makes conversion easy and mostly avoids the problem (abbreviated names
 remain an issue).
 If your database was developed without such standards, use this section.
 The following sections describe how to translate programming jargon into
 human-readable text:
\end_layout

\begin_layout Itemize
Existing mappings
\end_layout

\begin_layout Itemize
Web service
\end_layout

\begin_layout Itemize
Word splitting
\end_layout

\begin_layout Standard
Using all three ways should reduce the translation task quite significantly.
\end_layout

\begin_layout Subsection
Existing Mappings
\end_layout

\begin_layout Standard
If your software applications have a front-end of any sort, there might
 be a mapping of column names to names a business owner could relate to.
 For example, a web user interface might map database columns to input field
 prompts.
 If the map is created in the Extensible Markup Language (XML), consider
 using an XSL template to transform the XML into SQL statements.
\end_layout

\begin_layout Standard
XML is related to other technologies, including the following:
\end_layout

\begin_layout Itemize
Extensible Stylesheet Language (XSL)
\end_layout

\begin_layout Itemize
Extract, Transform, and Load (ETL)
\end_layout

\begin_layout Standard
When you want to deposit the contents of XML documents directly into a database,
 ETL tools are an excellent choice.
 But when the data cannot be inserted directly (if the SQL statements to
 perform the inserts are based on a complex relationship within the XML),
 generating the SQL insert statements can be a relatively quick solution,
 which falls into the realm of XSL.
\begin_inset Foot
status collapsed

\begin_layout Plain Layout
There are web-based XSLT engines, such as 
\begin_inset CommandInset href
LatexCommand href
target "http://www.futurelab.ch/xmlkurs/xslt.en.html"

\end_inset

.
\end_layout

\end_inset

 Let's look at a short example of how XML and XSL can be used to generate
 SQL that produces insert statements for the data dictionary.
\end_layout

\begin_layout Standard
A map of database columns to text might already exist, like the example
 in 
\begin_inset CommandInset ref
LatexCommand ref
reference "alg:Data-Dictionary-XML-Source"

\end_inset

.
\end_layout

\begin_layout Standard
\begin_inset Float algorithm
wide false
sideways false
status collapsed

\begin_layout Plain Layout
\begin_inset Graphics
	filename source/xml/field-list.xml.png
	display false

\end_inset


\end_layout

\begin_layout Plain Layout
\begin_inset Caption

\begin_layout Plain Layout
\begin_inset CommandInset label
LatexCommand label
name "alg:Data-Dictionary-XML-Source"

\end_inset

Data Dictionary XML Source
\end_layout

\end_inset


\end_layout

\end_inset


\end_layout

\begin_layout Standard
The XML is quite straightforward and should need no explanation, although
 the XSL, in 
\begin_inset CommandInset ref
LatexCommand ref
reference "alg:Data-Dictionary-XSL-Template"

\end_inset

, could use a summary.
\end_layout

\begin_layout Standard
\begin_inset Float algorithm
wide false
sideways false
status collapsed

\begin_layout Plain Layout
\begin_inset Graphics
	filename source/xsl/field-list.xsl.png
	display false

\end_inset


\end_layout

\begin_layout Plain Layout
\begin_inset Caption

\begin_layout Plain Layout
\begin_inset CommandInset label
LatexCommand label
name "alg:Data-Dictionary-XSL-Template"

\end_inset

Data Dictionary XSL Template
\end_layout

\end_inset


\end_layout

\end_inset


\end_layout

\begin_layout Standard
The lines perform the following actions:
\end_layout

\begin_layout Itemize

\series bold
Lines 1 to 3.

\series default
 Typical declaration of a stylesheet.
\end_layout

\begin_layout Itemize

\series bold
Lines 5 and 6.

\series default
 Create a text file, such as a SQL file, and remove any extraneous spaces.
\end_layout

\begin_layout Itemize

\series bold
Lines 8 to 10.

\series default
 Traverse through the XML document starting at the root element.
\end_layout

\begin_layout Itemize

\series bold
Lines 12 to 14.

\series default
 Traverse through all the 
\family typewriter
field-list
\family default
 elements.
\end_layout

\begin_layout Itemize

\series bold
Lines 16 to 28.

\series default
 Create a SQL statement based on the values of the 
\family typewriter
column
\family default
 and 
\family typewriter
prompt
\family default
 elements (from 
\begin_inset CommandInset ref
LatexCommand ref
reference "alg:Data-Dictionary-XML-Source"

\end_inset

).
\end_layout

\begin_layout Standard
When the XML from 
\begin_inset CommandInset ref
LatexCommand ref
reference "alg:Data-Dictionary-XML-Source"

\end_inset

 is transformed using the stylesheet in 
\begin_inset CommandInset ref
LatexCommand ref
reference "alg:Data-Dictionary-XSL-Template"

\end_inset

, the result will be 
\family typewriter
INSERT
\family default
 statements similar to the statement in 
\begin_inset CommandInset ref
LatexCommand ref
reference "alg:Data-Dictionary-XSL-Transform"

\end_inset

.
\end_layout

\begin_layout Standard
\begin_inset Float algorithm
wide false
sideways false
status collapsed

\begin_layout Plain Layout
\begin_inset Graphics
	filename source/sql/data-dictionary-inserts.pgsql.png
	display false

\end_inset


\end_layout

\begin_layout Plain Layout
\begin_inset Caption

\begin_layout Plain Layout
\begin_inset CommandInset label
LatexCommand label
name "alg:Data-Dictionary-XSL-Transform"

\end_inset

Data Dictionary XSL Transform
\end_layout

\end_inset


\end_layout

\end_inset


\end_layout

\begin_layout Standard
This example shows a specific instance of how to popualate the data dictionary
 automatically.
 A similar technique could be used, for example, to leverage the translations
 from XHTML (or related markup) with 
\family typewriter
<label>
\family default
 elements (or similar elements in whatever web framework is currently being
 used).
 If it looks like it will be too much work to implement an automatic solution,
 consider hiring someone to perform the data entry service.
\end_layout

\begin_layout Standard
It is likely that the same column name is used on different web pages.
 In such a situation, generating duplicate 
\family typewriter
INSERT
\family default
 statements will cause a failure.
 To resolve this problem:
\end_layout

\begin_layout Enumerate
Use XSL to write out columns and values in a simple file format: 
\family typewriter
column,value
\family default
.
\end_layout

\begin_layout Enumerate
Optionally, convert the column names to match the case of their database
 column namesakes.
\end_layout

\begin_layout Enumerate
Sort the file (use the 
\family typewriter
sort
\family default
 command in Linux).
\end_layout

\begin_layout Enumerate
Extract only the unique lines (use the 
\family typewriter
uniq
\family default
 command in Linux).
\end_layout

\begin_layout Enumerate
Convert the simple file format to 
\family typewriter
INSERT
\family default
 statements.
\end_layout

\begin_layout Standard
The details for implementing this solution are dependent on your set up
 and needs.
\end_layout

\begin_layout Subsection
Web Service
\end_layout

\begin_layout Standard
Chances are, your database columns do not have underscores between each
 and every word.
 After you first convert the column names using existing source files and
 run the column names through the dictionary split method, you might consider
 an additional method: machine translation using a web service.
 In this case, using Google.
 Get permission before running the script in 
\begin_inset CommandInset ref
LatexCommand ref
reference "alg:Google-based-Word-Extraction"

\end_inset

 (without express permission, running the shell script violates Google's
 Terms of Service).
\end_layout

\begin_layout Standard
\begin_inset Float algorithm
wide false
sideways false
status collapsed

\begin_layout Plain Layout
\begin_inset Graphics
	filename source/scripts/column-split.sh.png
	display false

\end_inset


\end_layout

\begin_layout Plain Layout
\begin_inset Caption

\begin_layout Plain Layout
\begin_inset CommandInset label
LatexCommand label
name "alg:Google-based-Word-Extraction"

\end_inset

Google-based Word Extraction
\end_layout

\end_inset


\end_layout

\end_inset


\end_layout

\begin_layout Itemize

\series bold
Line 3.

\series default
 Loops over every line in the file.
 The file 
\begin_inset ERT
status collapsed

\begin_layout Plain Layout


\backslash
filename{
\backslash
filetxtcolumns}
\end_layout

\end_inset

 should contain one database column name per line.
\end_layout

\begin_layout Itemize

\series bold
Line 4.

\series default
 Writes the name of the column followed by a full-colon to standard output,
 but does not write the line terminator (yet).
\end_layout

\begin_layout Itemize

\series bold
Line 5.

\series default
 Launches a text-oriented web browser to query Google with the database
 column to translate (into English).
\end_layout

\begin_layout Itemize

\series bold
Line 6.

\series default
 Throws away the entire web page result except for the one line that contains
 that one special phrase.
\end_layout

\begin_layout Itemize

\series bold
Line 7.

\series default
 Extracts the first word after Google's six-word phrase until there are
 no more words, and writes them to standard output.
 Note that this phrase can change at any time.
\begin_inset Foot
status collapsed

\begin_layout Plain Layout
For example, the text might become ``Showing results for''.
\end_layout

\end_inset


\end_layout

\begin_layout Itemize

\series bold
Line 8.

\series default
 The line terminator.
\end_layout

\begin_layout Itemize

\series bold
Line 9.

\series default
 Prevent sending requests in rapid succession to Google.
\end_layout

\begin_layout Subsection
Word Splitting
\end_layout

\begin_layout Standard
Word splitting, also known as text segmentation, solves the problem of extractin
g words from a string of characters lacking word boundaries (such as, spaces
 or punctuation).
 For example, 
\family typewriter
clientaccountedit
\family default
 likely means 
\begin_inset Quotes eld
\end_inset

Client Account Edit
\begin_inset Quotes erd
\end_inset

 or 
\begin_inset Quotes eld
\end_inset

Client Accounted It
\begin_inset Quotes erd
\end_inset

.
 Developers are notorious for lumping words together to name database columns.
 This natural language processing is a difficult problem, and will not have
 a perfect solution for some time.
 Until then, you can either spend your time researching and looking for
 tools that will help, or do the following:
\end_layout

\begin_layout Enumerate
Obtain a dictionary of words,
\begin_inset Foot
status collapsed

\begin_layout Plain Layout
In Unix systems, 
\family typewriter
/usr/share/dict/words
\family default
 is a good start.
\end_layout

\end_inset

 and remove any words with the following:
\begin_inset Foot
status collapsed

\begin_layout Plain Layout
The regular expression 
\family typewriter
^[a-z]*$
\family default
 will be helpful here.
\end_layout

\end_inset


\end_layout

\begin_deeper
\begin_layout Itemize
Upper-case characters (proper nouns)
\end_layout

\begin_layout Itemize
Numerals
\end_layout

\begin_layout Itemize
Punctuation
\end_layout

\end_deeper
\begin_layout Enumerate
Create a corpus of material using business documentation, especially technical
 manuals:
\end_layout

\begin_deeper
\begin_layout Itemize
Database design documentation.
\end_layout

\begin_layout Itemize
Technical architectures.
\end_layout

\begin_layout Itemize
Requirements documents.
\end_layout

\begin_layout Itemize
Blogs, wikis, websites, and even emails.
\end_layout

\end_deeper
\begin_layout Enumerate
Extract, normalize (punctuation, case), and combine the individual words
 from the corpus text.
\begin_inset Foot
status collapsed

\begin_layout Plain Layout
The program Lynx is a text-based browser that, quite handily, can extract
 plain text from HTML.
\end_layout

\end_inset


\end_layout

\begin_layout Enumerate
Tally the occurrences of each word in the corpus.
\begin_inset Foot
status collapsed

\begin_layout Plain Layout
The program 
\family typewriter
sed
\family default
 allows stream editing using regular expressions.
 See 
\begin_inset CommandInset ref
LatexCommand vref
reference "alg:Script-Tally-Corpus"

\end_inset

.
\end_layout

\end_inset


\end_layout

\begin_layout Enumerate
Eliminate non-words (while preserving lemmas)
\begin_inset Foot
status collapsed

\begin_layout Plain Layout
The words 
\emph on
run
\emph default
, 
\emph on
runs
\emph default
, 
\emph on
ran
\emph default
, and 
\emph on
running
\emph default
 are from the same lexeme, with 
\emph on
run
\emph default
 as the lemma.
\end_layout

\end_inset

 from the corpus tallies by cross-referencing the dictionary.
\end_layout

\begin_layout Enumerate
Convert the number of occurrences to probabilities using a rudimentary file
 format (such as CSV).
\begin_inset Foot
status collapsed

\begin_layout Plain Layout
See 
\begin_inset CommandInset ref
LatexCommand vref
reference "alg:Script-Tallies-Probabilities"

\end_inset

.
\end_layout

\end_inset


\end_layout

\begin_layout Standard
The above steps are not a lot of work, nor are they extremely complicated;
 the steps are crucial as they set the stage for splitting the concatenated
 phrases (column names) based on relative usage probability.
 The complicated part is performing the analysis to split the phrases.
 To do this, find a software tool that performs text segmentation (also
 known as word splitting), such as the one described in 
\begin_inset CommandInset ref
LatexCommand ref
reference "cha:Word-Split"

\end_inset

.
\end_layout

\begin_layout Section
\begin_inset CommandInset label
LatexCommand label
name "sec:Resource-Bundle"

\end_inset

Resource Bundle
\end_layout

\begin_layout Standard
As described, the data dictionary has many uses, one of which is generating
 a resource bundle.
 The query to generate the resource bundle uses the following views:
\end_layout

\begin_layout Itemize

\family typewriter
dictionary.resource_bundle
\end_layout

\begin_layout Itemize

\family typewriter
dictionary.generate_resource_bundle
\end_layout

\begin_layout Standard
The view 
\family typewriter
dictionary.resource_bundle
\family default
 is the work horse; the other view is for convenience, based on the former.
 The 
\family typewriter
dictionary.resource_bundle
\family default
 view is the mastermind view of the data dictionary ERD, it is one of the
 reasons the data dictionary exists.
\end_layout

\begin_layout Subsection
Custom Resource Bundles
\end_layout

\begin_layout Standard
Inevitably, there will be various texts that have no corresponding database
 column.
 It is a mistake to create a data dictionary and generate a resource bundle
 only to edit that resource bundle by hand.
 Any manual changes to the resource bundle are lost when you regenerate
 the resource bundle.
 The simplest solution is to create another (likely smaller) resource bundle,
 one that is maintained by hand.
 In theory, this should not pose a problem.
 In practice, 
\begin_inset ERT
status collapsed

\begin_layout Plain Layout


\backslash
appireport{}
\end_layout

\end_inset

 does not support multiple resource bundles.
\end_layout

\begin_layout Standard
Until 
\begin_inset ERT
status collapsed

\begin_layout Plain Layout


\backslash
appireport{}
\end_layout

\end_inset

 supports multiple resource bundles, a quick work-around is to concatenate
 the computer-generated bundle and the manually-maintained bundle to create
 a new, combined resource bundle.
 Configure 
\begin_inset ERT
status collapsed

\begin_layout Plain Layout


\backslash
appireport{}
\end_layout

\end_inset

 to use the combined resource bundle.
\end_layout

\begin_layout Subsection
Alternate Text Lengths
\end_layout

\begin_layout Standard
As printed pages have fixed widths, additional column name values are sometimes
 necessary.
 Shorter names can be implemented by addding an alternate column to the
 data dictionary's translation.
 Generate the resource bundle keys as before with the alternate values alongside
 the original.
 For example:
\end_layout

\begin_layout LyX-Code
person.name.first=First Name
\end_layout

\begin_layout LyX-Code
person.name.first.alternate=First
\end_layout

\begin_layout Standard
The word ``alternate'' is arbitrary.
 Use a suitable term for your application.
\end_layout

\begin_layout Section
Business Areas and Column Clusters
\end_layout

\begin_layout Standard
Assigning column clusters to business areas is a completely manual process.
 There is no practical way to program the database organization in a way
 that makes sense for a corporation.
 The term 
\emph on
business cluster
\emph default
 is short-hand for a business area-column cluster relationship.
 This has a couple of implications:
\end_layout

\begin_layout Itemize

\series bold
Business Cluster Map.

\series default
 How do you associate business areas with column clusters?
\end_layout

\begin_layout Itemize

\series bold
Database Migration.

\series default
 How do you retain the map when migrating through the development lifecycle?
\end_layout

\begin_layout Standard
These topics are described in the following subsections.
\end_layout

\begin_layout Subsection
\begin_inset CommandInset label
LatexCommand label
name "sub:Business-Cluster-Map"

\end_inset

Business Cluster Map
\end_layout

\begin_layout Standard
It would be inconvenient, error-prone, and laborious to manually perform
 the same set of steps to insert a relationship between a business area
 and a column cluster for every possible business cluster relationship.
 Maintaining the association between business areas and column clusters
 can be accomplished as follows:
\end_layout

\begin_layout Enumerate
Create a function that, when given a fully qualified column reference (for
 example, 
\family typewriter
schema.entity.column
\family default
), returns the primary key from the object's column relationship (that is,
 the primary key of the row that maintains the relationship between a table
 (or view) and a column).
\end_layout

\begin_layout Enumerate
Create a function that performs the necessary steps to insert the new business
 area-column cluster relation that meets the data dictionary's relational
 constraints.
 The function requires a business area, column cluster, business area type,
 and fully qualified column reference.
\end_layout

\begin_layout Enumerate
Create a spreadsheet that tracks the mapping of business areas to column
 clusters and produces the corresponding SQL statements to insert the relationsh
ips into the database, using the function defined in the previous step.
\end_layout

\begin_layout Standard
Example functions for the first and second steps are in 
\begin_inset CommandInset ref
LatexCommand ref
reference "sub:filesqlgetentitycolumnpk"

\end_inset

 and 
\begin_inset CommandInset ref
LatexCommand vref
reference "sub:filesqlbusinessclusterrelation"

\end_inset

, respectively.
 It is the latter function that is called to perform useful work.
 A spreadsheet is not the only possible way to create the initial assignment
 of column clusters to business areas, but it is expedient.
 The idea is to create a series of sheets---one per business area---that
 consists of the name of a column cluster and a number of rows representing
 the fully qualified column that must be part of that column cluster.
 This is represented visually in 
\begin_inset CommandInset ref
LatexCommand ref
reference "tab:Example-Business-Cluster-Map"

\end_inset

.
\end_layout

\begin_layout Standard
\begin_inset Float table
wide false
sideways false
status collapsed

\begin_layout Plain Layout
\begin_inset Caption

\begin_layout Plain Layout
\begin_inset CommandInset label
LatexCommand label
name "tab:Example-Business-Cluster-Map"

\end_inset

Example Business Cluster Map
\end_layout

\end_inset


\end_layout

\begin_layout Plain Layout
\begin_inset Tabular
<lyxtabular version="3" rows="5" columns="3">
<features>
<column alignment="left" valignment="top" width="0">
<column alignment="left" valignment="top" width="0">
<column alignment="left" valignment="top" width="0">
<row>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
\begin_inset ERT
status collapsed

\begin_layout Plain Layout


\backslash
rowcolor{tableheader}
\backslash
toprule{}
\end_layout

\end_inset


\series bold
name
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\series bold
address
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\series bold
appointment
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
\begin_inset ERT
status collapsed

\begin_layout Plain Layout


\backslash
midrule{}
\end_layout

\end_inset

clinic.person.first_name
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
clinic.address.street
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
clinic.appointment.start_date
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
clinic.person.last_name
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
clinic.address.city
\end_layout

\end_inset
</cell>
<cell alignment="left" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
clinic.appointment.duration
\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
clinic.person.middle_name
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
clinic.address.region
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout

\end_layout

\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
clinic.person.title
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
clinic.address.country
\end_layout

\end_inset
</cell>
<cell alignment="center" valignment="top" usebox="none">
\begin_inset Text

\begin_layout Plain Layout
\begin_inset ERT
status collapsed

\begin_layout Plain Layout


\backslash
tabularnewline
\backslash
bottomrule%
\end_layout

\end_inset


\end_layout

\end_inset
</cell>
</row>
</lyxtabular>

\end_inset


\end_layout

\end_inset


\end_layout

\begin_layout Standard
The header row in 
\begin_inset CommandInset ref
LatexCommand ref
reference "tab:Example-Business-Cluster-Map"

\end_inset

 represents the name of the column cluster, and the values in each column
 point to the location in the database where the information is stored.
 Group the sheets by business area, such as the business areas shown in
 
\begin_inset CommandInset ref
LatexCommand vref
reference "fig:Data-Dictionary-Business-Clusters"

\end_inset

.
 After the spreadsheet is complete, it is then possible to generate the
 SQL statements required to create the business cluster map by calling the
 function described in 
\begin_inset CommandInset ref
LatexCommand vref
reference "sub:filesqlbusinessclusterrelation"

\end_inset

.
\end_layout

\begin_layout Subsection
Database Migration
\end_layout

\begin_layout Standard
After creating the relationships in a development database, it is prudent
 to make a backup of these relationships.
 Depending on how the database migration works,
\begin_inset Foot
status collapsed

\begin_layout Plain Layout
There are schema comparison tools, but they have technical quirks that prevent
 a completely automated migration.
\end_layout

\end_inset

 it might not be feasible to simply copy the data from the development database
 into the testing database.
 For this, create a view that generates the SQL statements from 
\begin_inset CommandInset ref
LatexCommand vref
reference "sub:filesqlregeneratebusinessclusters"

\end_inset

.
 The resulting SQL script can be used to reseed the business area-column
 cluster relationships.
\end_layout

\begin_layout Section
Search
\end_layout

\begin_layout Standard
A problem with systems that are based on a vast number of tables and views
 is in trying to find specific columns.
 There are a number of ways to hunt through a database that contains hundreds,
 or even thousands, of entities.
 Graphical user interfaces (GUIs), such as pgAdmin,
\begin_inset Foot
status collapsed

\begin_layout Plain Layout
\begin_inset CommandInset href
LatexCommand href
target "http://www.pgadmin.org/"

\end_inset


\end_layout

\end_inset

 or web-based products such as phpMyAdmin
\begin_inset Foot
status collapsed

\begin_layout Plain Layout
\begin_inset CommandInset href
LatexCommand href
target "http://www.phpmyadmin.net/"

\end_inset


\end_layout

\end_inset

 display a long, long list.
 And although some tools allow developers to search database meta-information,
 the following problems have not been adequately solved in practice:
\end_layout

\begin_layout Itemize

\series bold
Search.

\series default
 They all lack the ability to search, in natural language terms, for database
 columns.
\end_layout

\begin_layout Itemize

\series bold
Scroll.

\series default
 Most GUIs have awkward interfaces for finding objects: an unwieldy list
 forcing the user to scroll.
 Even when a filter is available, or the objects can be found by typing
 its name, some developers insist on scrolling through the list, which is
 woefully inefficient.
\end_layout

\begin_layout Standard
After the data dictionary is created, though, a query can be constructed
 to scan for specific items without knowing the exact column names.
 One such possibility is shown in 
\begin_inset CommandInset ref
LatexCommand ref
reference "alg:Searching-the-Data-Dictionary"

\end_inset

.
\end_layout

\begin_layout Standard
\begin_inset Float algorithm
wide false
sideways false
status collapsed

\begin_layout Plain Layout
\begin_inset Graphics
	filename source/sql/data-dictionary/usage/search.pgsql.png
	display false

\end_inset


\end_layout

\begin_layout Plain Layout
\begin_inset Caption

\begin_layout Plain Layout
\begin_inset CommandInset label
LatexCommand label
name "alg:Searching-the-Data-Dictionary"

\end_inset

Searching the Data Dictionary
\end_layout

\end_inset


\end_layout

\end_inset


\end_layout

\begin_layout Standard
Notice that the SQL statement contains the following code:
\end_layout

\begin_layout LyX-Code
SELECT
\end_layout

\begin_layout LyX-Code
  t.*
\end_layout

\begin_layout Standard
The only time implicitly selecting all columns is adviseable is when executing
 one-off statements (like when searching the data dictionary)---production
 code must explicitly name columns.
\end_layout

\begin_layout Section
Summary
\end_layout

\begin_layout Standard
A system that is completely internally consistent (that minimizes misleading
 names) is a difficult but noble goal.
 The data dictionary has the ability to highlight inconsistent database
 areas.
 Any database system of sufficient complexity will inevitably introduce
 unintentional naming conflicts, or will have two different column names
 refer to the same data.
 The views described in this chapter (for example, 
\family typewriter
dictionary.resource_bundle
\family default
) can be used to discover parts of the database that are falling into disrepair.
\end_layout

\begin_layout Standard
Lastly, with a little bit of ingenuity, it is possible to build a partial
 
\emph on
ad hoc
\emph default
 reporting system based on the data dictionary.
 
\begin_inset ERT
status collapsed

\begin_layout Plain Layout


\backslash
appjasperreports{}
\end_layout

\end_inset

 is more than a system to execute reports developed in 
\begin_inset ERT
status collapsed

\begin_layout Plain Layout


\backslash
appireport{}
\end_layout

\end_inset

.
 
\begin_inset ERT
status collapsed

\begin_layout Plain Layout


\backslash
appjasperreports{}
\end_layout

\end_inset

 also allows software developers to write programs to develop reports.
\end_layout

\end_body
\end_document