Penn Computing

Penn Computing

Computing Menu Computing A-Z
Computing Home Information Systems & Computing Penn

Current Load Status
Regular Availability
FAQs & Tips
Password Changer
Support services
About the Data Warehouse
Data Administration
Express Mail
General Ledger
Learning Management
Position Inventory
Research-PennERA Proposals
Salary Management
Travel Expense Management
Tuition Distribution

About Queries

Types of Queries
· Writing Your Own Queries
· "Canned" Queries and Reports
· Guidelines for Sharing Queries

General Guidelines on Executing Queries
· Information Restrictions
· Guidelines for Writing Queries
· Verifying Query Results
· Tips for Querying with Business Objects

Writing Your Own Queries

When writing your own query, you define the set of choices and conditions that you want to use to retrieve data stored in the Warehouse. You determine such attributes as data elements to be returned (e.g., last name, city, age), conditions for selecting records (e.g., equal to or less than), and sort criteria (the order and priority in which results are to be sorted).

You may want to keep in mind some general questions to guide you in composing a query:

  • What information are you looking for? In what collection does this data reside, and which Business Object universe would be best to use?
  • Bear in mind the level of detail data you need, the time periods concerned, and which source system you would use to verify the data retrieved.
  • Once you have a basic idea of the results you need, consider how the query should be contrained -- by time period? account segment(s)? employee or organization names/codes?
  • What will you do with your results? If you are presenting them to others, you may want to include segment descriptions for those unfamiliar with codes. Also, if you plan to export the data, you may want to include objects which you have used to constrain your query, to better identify the data in the exported file. (For example, although you may have used Accounting_Period as a constraint, it might help to have the period appear as a column in your exported file, so you know what period that data represents.)
Refer to the Guidelines for Writing Queries below for more detailed examples. Also, for more information about considerations particular to the General Ledger data collection, refer to the collection online tutorial. (A similar page also exists for the Salary Management data collection.)

Information Restrictions
Be aware that University data is sensitive and must be handled responsibly. Note that some data may be restricted by school or by department or cannot be released to persons outside the University. Persons who work with student data should also be aware of student rights as stated in the Buckley Amendment.

More information about restrictions on University data is available in Warehouse Security.

General Guidelines for Executing Queries
Queries often require complex conditions to return appropriate results. WHERE clauses (also known as query filters or conditions) reduce the amount of data to be processed in a SELECT statement (comprised of desired result objects or columns) by specifying that only those rows meeting the criteria in the WHERE clause are displayed. Depending upon which tool you use to query the Data Warehouse, your degree of control over the query language and operators will vary. For example, BusinessObjects and Microsoft Access employ graphical interfaces which construct appropriate SQL behind the scenes based on the tables and joins the user indicates. Oracle SQL*Plus, on the other hand, requires the user to write his or her own SQL statments. The Oracle database system, which forms the foundation of the Data Warehouse, permits the use of powerful SQL operators, some of which may also be available as post-query operators in your desktop tool.

The following guidelines are for all users of the Data Warehouse, regardless of query tool, and provide good practices for efficient querying:

· Refer to the help documentation. The main difficulty most people have with writing queries is knowing which table to use. If you are unsure about which table to use, refer to the table help that is available for the data collection. Pay particular attention to the sections "Common Uses" and "Cautions." These sections may help you decide if you have selected the appropriate table for your query. If you are unsure about a data element, look up its definition. The Office of Data Administration has included data element definitions and table help as part of the data collection documentation on the web.

· Take advantage of indexes. If possible, include an indexed data element in your condition statement. A query with a record selection condition using an indexed data element tells the system to go directly to the rows in the table that contain the value indicated and to stop retrieving data when the value is no longer found. If a query does not select records based on an indexed data element in its record selection condition, the system starts searching at the first row in the table and works through every row until it reaches the last row in the table. Indexed columns are noted in each collection's documentation.

Certain operators or query segments are processed by the system without the use of indexes, even if the column in the condition is indexed. It may, of course, be necessary for you to construct your query in this manner to retrieve correct results, but in considering alternatives in query construction you may wish to keep in mind the following situations where indexes may not be used:

Negative comparisons such as Not Equal (represented by =! or <> in SQL), Different From, or Not In. Avoid negative phrasing of condition statements as much as possible. In general, it is easier (both for the system and for you) to interpret a positive phrase than a negative phrase. For example, instead of the condition statement "If term is not greater than 1998A," rephrase the statement to "If term is less than or equal to 1998A." Or, if practical, eliminate the condition from the query and filter your results on the desktop.

Nulls such as Is Null or Is Not Null.

Like or Matches Pattern comparison with a date or number column. For example, to retrieve employee payments from March (of any year) use "FISCAL_MONTH_SEQ = '09'" rather than "CHECK_DATE Like 03/%".

Wildcards at the beginning of a string. Avoid matching patterns beginning with a wildcard (Like %...). A wildcard at the end of a pattern is definitely appropriate and can be very efficient (e.g., Where Fund Like 5% will retrieve all Funds 500000 - 599999).

Indexed columns modified by an expression or function (e.g., rather than concatenating all Chart of Account segments as COA_CNAC||COA_ORG||COA_BC||COA_FUND||COA_OBJECT||COA_PROGRAM||COA_CREF, select the column COA_ACCOUNT, which is indexed). Also, comparing an indexed column to another indexed column using Greater Than (>) or Less Than (<).

· Check the "and/or" qualifiers in the records selection criteria of the condition statement. For example, a query coded to get students with the following conditions statement will actually return every student in COL for 1998A and every student for 1998C regardless of the division:

If division is equal to 'COL' and term is equal to '1998A' or term is equal to '1998C'
The query coded to get students with the following conditions statement will return every student in COL for 1998A and 1998C:
If division is equal to 'COL' and (term is equal to '1998A' or term is equal to '1998C')

· If your access to data is restricted, do not force the security system to select records for you. For example, if you are authorized to access data only from a particular department, one of your record selection conditions should state "If Organization='My Organization'," where organization is the code for your department.

· Review your query before executing it. Check to make sure that your query is as precise as possible. This includes selecting the tables that will give the best results, reviewing selection conditions and sort criteria, and if it makes sense to do so, including at least one indexed data element in the conditions statement. For example, if you want to find all undergraduate freshmen and their names, choose the Person table rather than the ADDRESS table. This is because a student can have multiple addresses, and choosing the ADDRESS table would return a name for each address the student has listed.

· Be aware of data that is subject to change and its effect on your results. For example, a grade change can affect a student's grade point average (GPA). A query executed before and after the grade change may or may not result in a changed GPA. In addition, keep in mind that there is a "data delay" between Warehouse collections and their respective source systems. Refresh schedules are noted in each collection's documentation.

· Give the query time to execute. Queries can take many minutes to execute; complex queries can take longer. It is not uncommon for a query to take 5 to 10 minutes to complete. In general, let the query run until it finishes. If the query takes longer than 1 hour to complete, contact Data Administration.

· Note the date and time of the query when creating reports or communicating results to others.

Verifying Query Results
Although query tools make accessing data easier, remember that University data is complex. You can easily compose a "bad query." That is, you may have selected incompatible data elements, the wrong data elements, or structured the conditions in your query such that it returns improper results. Be aware that it is possible to get an incorrect answer or misinterpret the data. For example, when mailing information to students, it is more preferable to select "current address" rather than "temporary address" as the address type. This is because the system looks at the expiration dates for the temporary and local addresses and computes which address is the most current.

To minimize the risk of misinterpreting data or getting the wrong answer:

· Compare your results with known reports that are up-to-date and accurate.

· Perform "sanity checking." Ask yourself if the information you have is a reasonable answer. For example, few classes at Penn enroll more than 100 students per class so a query returning a class enrollment of 250 students is questionable; grants are usually less than Penn's overall budget thus a grant amount that is listed as twice its Penn budget is unlikely; and Penn's student population is approximately 50% men and 50% women so a query returning a class enrollment of 98% males is probably incorrect.

· Consider estimating likely parameters for a reasonable response before executing the query.

· Try sequential queries that break large sorts down to components, then add them to be sure the whole is accounted for. For example, if searching for Wharton undergraduate Asian female students, look at: all Asian students, male Asian students, etc. to check the reliability of your query.

· Use one or two conditions and zero in on what you want. When developing your query to answer a question, start by retrieving detail information, with just one or two record selection conditions. Zero in on what you want, looking at how the results change as you add each condition. If your goal is a summarized report, see how the detail report compares to the summarized one. Make sure the query includes what you want, all of what you want, and nothing but what you want.

· Ask someone who knows the data to check your results. Ask someone in your department or school, or contact Data Administration. Also, ask the person requesting the report what results they expect to get. Or ask if the person can suggest another existing report that you can use to check to see if your results are at least in the ballpark. If your query gets results that are nowhere near what the requester expected, your query may be in error.

· Continue to become familiar with and knowledgeable about University data by participating in the data collection listservs.

· Just look at the report. Sometimes, just looking at the query results can help you spot glaring errors. (Why am I getting so many rows returned? Whoops--I forgot to screen on accounting period!)

· Run a similiar query. Try running another query that approaches the question differently, and see if both queries get the same results. For example, if you want figures on telephone charges, in one query, you could set a condition on a list of object codes, and in another query, you could set a condition on the parent object code. If the queries do not get the same results, find out why not.

· Check other reliable sources. Check other sources of information, such as BEN Financials or other source system screens or reports, to see how they compare with your query results.

"Canned" Queries and Reports

· financial queries

Canned queries, sometimes known as corporate documents, are predefined queries. In most instances, canned queries contain prompts that allow you to customize the query for your specific needs. For example, a prompt may ask you for a School, department, term, or section ID. In this instance you would enter the name of the School, department or term, and the query will retrieve the specified data from the Warehouse.

The main reason for using a canned query or report rather than creating your own is that your chances of misinterpreting data or getting the wrong answer are reduced. Someone who is knowledgeable about University data has correctly identified the conditions and criteria needed to execute a successful query. You are assured of getting the right data and the right answer.

Whenever possible, canned queries and reports are provided to give you standardized perspectives on Warehouse data. These types of queries vary with the query tool and the data collection. Thus, before creating your own query, check to see if a canned query or report already exists for the information you want.

For a list of canned Business Objects reports using financial data, click here.

To access Business Objects corporate documents:

Using InfoView:

  1. Point your browser to InfoView login page (
  2. Enter your Business Objects User Name and Password. (You can make sure your Data Warehouse and Business Objects account passwords are synchronized by using the password change application.
  3. Once you've successfully logged in to InfoView, navigate to the Document List. You can choose to browse documents by viewing them by Folder or Category. The Folder view appears by default - all the shared documents are available under Public Folders. Although you'll see all the categories available, you'll only see documents whose data you're authorized to refresh. You can also use the search option to search for a report by name.
  4. Click on a category or folder name to view the documents available to you. The list of documents appears in the right-hand pane of the InfoView window, and displays the documents by name, with description (if available)..
    • Simply double-clicking clicking the document name will open it.
    • Keep in mind that documents in the repository are saved without any data, so that you can click the Refresh Data link at the upper right corner of the screen to refresh the report according to your security and any parameters.
    • If the report includes prompts, you will be presented with a screen asking you to fill in those prompts, and then click the Run Query button to execute the report.

Guidelines for Sharing Queries

If you would like to contribute queries to the shared repository, or have questions about sharing queries, please contact Data Administration. You may also share queries with other users. In preparing a query for the repository, please follow these guidelines:

  • Use prompts with your conditions to keep the query as generic as possible, and to take advantage of the Data Warehouse built-in security. (For example, COA Org = [prompt], rather than a hard-coded value.)
  • Purge the data from your query before sending, so that others will not be able to see your confidential data.
  • Prepare some brief accompanying documentation for your query so that Data Administration can make it generally available. Some helpful information might be suggested use, any special timing issues or cautions, a list of all result objects and conditions, and the logic for any report variables. Please suggest a source for verification of results, if possible.

Information Systems and Computing
University of Pennsylvania
Comments & Questions

Penn Computing University of Pennsylvania
Information Systems and Computing, University of Pennsylvania