UCSD Home Page
UCSD | University of California, San Diego Help Blink Home Search Blink
DataLink   Blink Home
 
 
 DataLink > Data Warehouse Glossary Monday, November 25, 2024


ACT Data Warehouse
Queries
DB2Wiki
SQL Executer (db2)
Link Family

Did You Get What You Wanted?
Click here for 'YES' Click here for 'NO'
findlay

This is a glossary of commonly used data warehousing, database and UCSD specific terms. This list is not designed to be conclusive. If there are any glaring omissions or obvious falsehoods please report them to the address listed at the end of the page.


Quickjump :

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
- A -
Additive Fact
Additive facts are measurements in a fact table that are able to be added across all of the dimensions.
alias
A short substitute or "nickname" for a table name.
ASCII
American Standard Code for Information Interchange. An eight-bit code for character representation; includes seven bits plus parity.
attribute
A field represented by a column within an object (entity). An object may be a table, view or report.
- B -
Back End Tool
A back end tool is a software application, typically resident on both the client and the server; that assists in the production data extract process. Compare with front end tool.
browser
A software application that is used to allow users to read an HTML document in a user friendly readable format. Two well known browsers are Mosaic and NetScape.
- C -
client
A software application on your computer, which is used to extract or download some application, data or service from a host system.
Client/Server architecture
A 'networked' environment where a smaller system such as a PC interacts with a larger, faster system. This allows the processing to be performed on the larger system which frees the user's PC. The larger system is able to connect and disconnect from the clients in order to more efficiently process the data.
Conforming Dimensions
Conforming dimensions are dimensions that have exactly the same set of primary keys and same number of records. Two conformed dimensions can be combined into a single dimension by creating the union of the attributes.
- D -
Data Directory
The Data Directory is a component of ACTs DataLink product. The Data Directory provides the descriptions of all databases, tables and fields used within DARWIN. A physical data model (also called a schema or entity-relationship diagram) is provided for each subject area within DARWIN and a subset of the SQL-DSE. A search option also exists that allows the ability to find all structures containing the data field of interest.
DataLink
DataLink is a web based product developed by UCSD's Data Warehouse team to provide detailed information about the data contained within its environment. Currently, information is provided for overall data, available SQL queries and data refresh history.
DARWIN
DARWIN (Data Repository and Windows to Information) is the primary data environment running on a Sun/Sybase server, with full data directory support, integrity checks and training classes.
Data Mart
A data mart is a subject oriented subset of data classically designed for ease of use.
Degenerate Dimension
A dimension key, such as an invoice number or a ticket number that has no attributes and hence has no actual dimension table.
Denormalize
To allow redundancy in a table so that the table can remain flat.
dimensional modeling
Dimensional modeling is a generally accepted practice in the data warehouse industry to structure data intended for user access, analysis and reporting in dimensional data models. They are specifically designed to meet the twin goals of ease-of-use and performance. Over the past five years, the use of dimensional models for data warehousing has become an industry standard. It is a well-proven technique that allows for fast retrieval of data and also helps the business users to understand the data easily. This is achieved because the models are developed based on how the business sees the data.
- E -
entity
A database object such as a table, view, report or screen.
Entity Relation Model
A model of an organization's data in which the objective has been to remove all repeated values by creating more tables.
- F -
Fact Table
The central table in a star join schema characterized by a composite key, each of whose elements is a foreign key drawn from a dimension table.
Frequently Asked Questions (FAQs)
An acronym that refers to a text file on the internet that contains answers to frequently asked questions. FAQ's are a valuable resource for gaining knowledge on particular subjects of interest. FAQ's are compiled by subject matter.
Foreign Key
A field in a relational database table whose values are drawn from the values of a primary key in another table.
Front End Tool
A client tool that fetches or manipulates data stored on a relational database. Compare with back end tool.
ftp
File Transfer Protocol is the most common way to transfer files from one computer to another.
- G -
Gigabyte
One billion bytes.
GQL
Graphical Query Language is a graphical query tool used to access information stored in a SQL database. As a user, you may query the data, download data into your own environment and generate reports. GQL is a software product designed by Andyne.
GUI
GUI (pronounced gooey) stands for Graphical User Interface.
- H -
HTML
HTML, or HyperText Markup Language, a subset of SGML, provides a tag set used to create an HTML document. The tags or elements tell the browser how to display the information. The tags are used to "mark," in a hierarchical format, the different components of the document.
HTTP
The HyperText Transfer Protocol is a fixed set of commands used during a hypertext link between a client and server.
hyperlink
A hyperlink is a logical link between two related pieces of information in the same document or between different documents. It will allow you to quickly jump from one document or location to another transparently. This is the "cool" part of HTML.
Hypertext
A hypertext document is a combination of text, links to other documents, sound, graphic images, and film clips.
- I -
IFIS
Integrated Financial Information Systems is the central campus financial system that processes accounts payable, budget and staffing, chart of accounts, travel, purchasing, general accounting, express order systems.
index
An index is a link between one table and another, it allows for rapid access to the rows of a table based on the values of one or more columns in another table.
Internet
The internet is a network of networks linked together through low level protocols.
ip address
Every device on the network is assigned an address which is used for communicating from device to device.
ISIS
Integrated Student Information Systems is the central campus student system that processes academic history, accounts receivable, admissions, course catalog/schedue, facilities, grades, housing, registration, and student aid management systems.
- J -
- K -
- L -
Local Area Network (LAN)
A grouping of devices, such as PC's, fax machines and printers, physically connected together by ethernet wiring, within a fairly limited location.
- M -
Many-To-Many Relationship
A logical data relationship in which the value of one data element can exist in combination with many values of another data element and vice versa.
Meta Data
Meta data literally means "data about data". For a breakdown of the kinds of meta data provided see the glossary definitions for Data Directory as well as DataLink.
- N -
newsgroup
A listing of bulletin boards that you may access. Known for exchanging of ideas through "mailnotes." Newsgroups are designed by subject or interest.
Nonadditive Fact
A fact that cannot logically be added between records.
Normalize
The process of removing redundancy in data by separating the data into multiple tables.
null
A null value is a trigger to let you know the value for that row is either missing, unknown, not yet know or inapplicable. Placing a zero in the row would not reflect the accurate state of the row, because zero is a value. This way you can search for "missing" data and SQL supports the notion of null values.
- O -
ODBC
Open Data Base Connectivity is an interface protocol used to access data in a relational database environment. ODBC allows users to access data from several servers using just one applicaiton, because ODBC is vendor neutral. ODBC includes the DB-LIB protocol portion of the interface between the client and the server. ODBC is Microsoft's interface layer to various databases. When comparing ODBC with Open Client, ODBC would effectively replace the DB-Lib portion of Sybases' Open Client.
OLAP
OLAP (On Line Analytic Processing) is a loosely defined set of principles that provide a dimensional framework for decision support. The term OLAP also is used to define a confederation of vendors who offer norelational, proprietary products aimed at decision support.
open client
Open Client is the connectivity software from Sybase for establishing a connection between various client applications and Sybases' SQL Server 'server software.' On Intel platforms, Sybase has split Open Client into the combination of Net-Lib and DB-Lib. The Net-Lib portion establishes the connection and the DB-Lib poriton is specific to the interface (e.g. character verses graphics).
owner
The object owner is a user or users who have authority over that object, where object is a table, view or attribute.
outer join
The merger of two SQL answer sets of data by making the row headers be the union of all values in the two sets of data.
- P -
physical design
The phase of a database design following the logical design that identifies the actual database tables and index structures used to implement the logical design.
PPS
Payroll Personel S ystem is maintained, in the form of a "base", or standard version of the system, by the University of California, Office of the President (UCOP). In this role, UCOP could be viewed as the "Vendor" of the system. UCSD maintains and operates their own local, modified versions of the standard system on their own platform.
primary key
A column or combination of columns whose values uniquely identify a row or record in the table. The primary key(s) will have a unique value for each record or row in the table.
protocol
A formal description of message formats and the rules two or more devices must follow to exchange data.
- Q -
- R -
relationship
Relationships in a relational database are represented by common data values stored in the two tables. This type of relationship allows you to retrieve related data from the database by manipulating the relationships in a simple, easy way.
Relational Database
A database system that supports the full range of standard SQL. It is entirely composed of tables. Each column is a particular kind of data and each row is a unique instance of that data. Each row is uniquely identified by a primary key. Sybase is the relational data base being used by UCSD's Data Warehouse.
- S -
server
A computer where the server software resides. Such services as filing, mail, communications and authentication are a few of the services that are available.
slice and dice
The standard description of the ability to access data through any of its dimension equally.
slowly changing dimensions
The tendency of dimension records to change gradually or occasionally over time.
Structured Query Language (SQL)
Pronounced sequel or S.Q.L. Used with relational databases, it allows users to define the structure and organization of stored data; verify and maintain data integrity; control access to the data; and define relationships among the stored data items. Data retrieval by a user or an application program from the database is also a major function of SQL. The database can be updated (adding, deleting or modifying data) by a user or an application program. SQL is a fourth generation language which enables the user to tell the computer what data they want without telling the computer how to get it.
SQL-DSE
SQL-DSE (Structured Query Language for Departmental Support Environment) is a data environment that will eventually (for the most part) be integrated into the DARWIN environment. Currently the SQL-DSE environment does not have the same level of integrity checks, training classes, or meta data.
- T -
TCP/IP
Transmission Control Protocol/Internet Protocol, these protocols are used for transmitting data across the internet.
Telnet
Telnet is a protocol that allows you to interact with another device remotely, it allows you to "login" to a remote device and simulate a direct connect. When you login to the mainframe and access ISIS, IFIS, PPS or DSE you are connecting through Telnet.
terabyte
One trillion bytes
TIFF
Tagged Image File Format is a graphic file format used to transfer graphic images across the internet.
twinkling database
The tendency to have a database constantly changing data the user is attempting to query.
- U -
URL
Uniform Resource Locator. The URL is the path information in an HTML coded source file used to locate another document or image. The format for the URL is:
scheme://host-domain[:port]/path/filename.
- V -
View
A database component that behaves just like a table but has no independent existence of its own.
- W -
World-Wide Web
A hypermedia application used for access of data over the internet. The WWW is based on the HTML standard of marking up 'documents.'
- X -
- Y -
- Z -
Copyright © 2010 Regents of the University of California. All rights reserved.
Official Web Page of the University of California, San Diego