How to Read the Tuxedo User Log (ULOG)

codeling 1599 - 6654
@2016-03-07 21:25:45

The user log (ULOG) is a file to which all messages generated by the BEA Tuxedo system—error messages, warning messages, information messages, and debugging messages—are written. Application clients and servers can also write to the user log. A new log is created every day and there can be a different log on each machine. However, a ULOG can be shared by multiple machines when a remote file system is being used.

The ULOG provides an administrator with a record of system events from which the causes of most BEA Tuxedo system and application failures can be determined. You can view the ULOG, a text file, with any text editor. The ULOG also contains messages generated by the tlisten process. The tlisten process provides remote service connections for other machines in an application. Each machine, including the master machine, should have a tlisten process running on it.

The ULOG provides a central repository in which the BEA Tuxedo system and applications can store error information and simplifies the job of finding errors returned by the BEA Tuxedo ATMI.

A ULOG message consists of a tag and text. The tag consists of the following:

  • A 6-digit string (hhmmss) representing the time of day (in terms of hour, minute, and second).
  • The name of the machine.
  • The name and process identifier of the process that is logging the message. (This process ID can optionally include a transaction ID.) Also included are a thread ID and a context ID (1 and 0, respectively, in the examples below).

The text consists of the following:

  • The message catalog name and number if the log was generated by the BEA Tuxedo system (rather than by the application), such as LIBTUX_CAT:262
  • The literal string gtrid (global transaction identifier) followed by three long hexadecimal integers which uniquely identify the global transaction if the message is sent to the ULOG while the process is in transaction mode, such as gtrid x2 x24e1b803 x239 
  • The message  

The following are some examples of ULOG messages:

162214.mach1!security.23451.1.0: Unknown User 'abc'
162214.mach1!security.23451.1.0: LIBSEC_CAT: 999: Unknown User 'abc'
162214.mach1!security.23451.1.0: gtrid x2 x24e1b803 x239: Unknown User 'abc' 
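
The tag and text layout described above can be split apart mechanically. The following is a minimal Python sketch, not a tool shipped with Tuxedo: the names parse_ulog_line and UlogEntry are hypothetical, and the regular expressions assume only the layout shown in the three examples (a transaction ID embedded in the process ID is not handled, and a negative context ID is accepted as a precaution).

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

# Tag layout assumed from the examples: hhmmss.machine!process.pid.thread.context:
TAG_RE = re.compile(
    r"^(?P<time>\d{6})\.(?P<machine>[^!]+)!(?P<process>.+?)"
    r"\.(?P<pid>\d+)\.(?P<thread>\d+)\.(?P<context>-?\d+):\s*(?P<text>.*)$"
)
# Optional text prefixes: "LIBTUX_CAT:262:" and "gtrid x2 x24e1b803 x239:"
CATALOG_RE = re.compile(r"^(?P<catalog>[A-Z][A-Z0-9_]*_CAT):\s*(?P<number>\d+):\s*(?P<rest>.*)$")
GTRID_RE = re.compile(r"^gtrid\s+(x[0-9a-fA-F]+)\s+(x[0-9a-fA-F]+)\s+(x[0-9a-fA-F]+):\s*(.*)$")


@dataclass
class UlogEntry:
    time: str
    machine: str
    process: str
    pid: int
    thread: int
    context: int
    catalog: Optional[str]
    number: Optional[int]
    gtrid: Optional[Tuple[str, str, str]]
    message: str


def parse_ulog_line(line: str) -> Optional[UlogEntry]:
    """Return the parsed entry, or None for lines without a tag (for example, continuation lines)."""
    tag = TAG_RE.match(line.strip())
    if tag is None:
        return None
    text = tag.group("text")
    catalog = number = gtrid = None
    # Peel off the catalog and gtrid prefixes, whichever order they appear in.
    for _ in range(2):
        cat = CATALOG_RE.match(text)
        gt = GTRID_RE.match(text)
        if cat and catalog is None:
            catalog, number, text = cat.group("catalog"), int(cat.group("number")), cat.group("rest")
        elif gt and gtrid is None:
            gtrid, text = gt.group(1, 2, 3), gt.group(4)
        else:
            break
    return UlogEntry(
        time=tag.group("time"), machine=tag.group("machine"), process=tag.group("process"),
        pid=int(tag.group("pid")), thread=int(tag.group("thread")), context=int(tag.group("context")),
        catalog=catalog, number=number, gtrid=gtrid, message=text,
    )
```

Calling parse_ulog_line() on the second example line yields an entry whose catalog is LIBSEC_CAT and whose number is 999; lines that do not start with a tag return None.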

You can use the information in the ULOG to identify the cause of system or application failures. Multiple messages about a given problem can be placed in the user log. Generally, earlier messages provide more useful diagnostic information than later messages.

In the following example, message 358 from the LIBTUX_CAT catalog identifies the cause of the trouble reported in subsequent messages, namely, that there are not enough UNIX system semaphores to boot the application.

151550.gumby!BBL.28041.1.0: LIBTUX_CAT:262: std main starting 
151550.gumby!BBL.28041.1.0: LIBTUX_CAT:358: reached UNIX limit on semaphore ids
151550.gumby!BBL.28041.1.0: LIBTUX_CAT:248: fatal: system init function ...
151550.gumby!BBL.28040.1.0: CMDTUX_CAT:825: Process BBL at SITE1 failed ...
151550.gumby!BBL.28040.1.0: WARNING: No BBL available on site SITE1.
               Will not attempt to boot server processes on that site.
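
Because the earlier messages tend to be the more informative ones, it can help to list the catalog messages of each process in the order they were logged. The sketch below does this in Python for the excerpt above; the function name catalog_messages_by_pid and the SAMPLE constant are hypothetical, and the regular expression assumes the tag layout shown in this post.

```python
import re

# Matches a tagged line whose text starts with a catalog name and number.
LINE_RE = re.compile(
    r"^\d{6}\.[^!]+!(?P<process>.+?)\.(?P<pid>\d+)\.\d+\.-?\d+:\s*"
    r"(?P<catalog>[A-Z][A-Z0-9_]*_CAT):\s*(?P<number>\d+):\s*(?P<message>.*)$"
)

# The excerpt quoted above, used here as sample input.
SAMPLE = """\
151550.gumby!BBL.28041.1.0: LIBTUX_CAT:262: std main starting
151550.gumby!BBL.28041.1.0: LIBTUX_CAT:358: reached UNIX limit on semaphore ids
151550.gumby!BBL.28041.1.0: LIBTUX_CAT:248: fatal: system init function ...
151550.gumby!BBL.28040.1.0: CMDTUX_CAT:825: Process BBL at SITE1 failed ...
151550.gumby!BBL.28040.1.0: WARNING: No BBL available on site SITE1.
               Will not attempt to boot server processes on that site.
""".splitlines()


def catalog_messages_by_pid(lines):
    """Group (catalog, number, message) tuples by process ID, preserving log order."""
    by_pid = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # application messages and continuation lines are skipped
        entry = (match.group("catalog"), int(match.group("number")), match.group("message"))
        by_pid.setdefault(int(match.group("pid")), []).append(entry)
    return by_pid


for pid, messages in catalog_messages_by_pid(SAMPLE).items():
    print(pid, [f"{catalog}:{number}" for catalog, number, _ in messages])
```

Reading each process's list from the top puts message 358 ahead of the fatal message 248 that follows from it.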

The ULOG is created by the BEA Tuxedo system whenever one of the following events occurs:

  • A new configuration file is loaded
  • An application is booted

Application clients and servers are notified through the tperrno global variable when a message has been written to the ULOG, as follows:

  • If tperrno is set to TPESYSTEM after returning from an ATMI call, you can conclude that:
    • A BEA Tuxedo system error has occurred.
    • An error message has been placed in the user log.
  • If tperrno is set to TPEOS after returning from an ATMI call, you can conclude that:
    • An operating system error has occurred.
    • An error message has been placed in the user log. 

BEA Tuxedo system errors indicate problems at the system level, rather than at the application level. When BEA Tuxedo system errors occur, the system writes messages explaining the exact nature of the errors to the central event log (ULOG), and returns TPESYSTEM (value 12) in tperrno. Because these errors occur in the system, rather than in the application, you may need to consult the system administrator to correct them.
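
When a process sees TPESYSTEM or TPEOS, the matching ULOG entries are the ones tagged with that process's ID. The following Python sketch shows one way to pull them out; should_check_ulog and ulog_lines_for_pid are hypothetical helper names, the numeric values 7 and 12 are taken from the atmi.h listing quoted later in this post, and the tag pattern assumes the layout shown in the examples above.

```python
import re

# Values as they appear in the atmi.h listing quoted in this post.
TPEOS = 7
TPESYSTEM = 12

# Captures the process ID from a tag such as "151550.gumby!BBL.28041.1.0:".
PID_RE = re.compile(r"^\d{6}\.[^!]+!.+?\.(\d+)\.\d+\.-?\d+:")


def should_check_ulog(tperrno_value):
    """True for the two tperrno values that imply a message was written to the ULOG."""
    return tperrno_value in (TPEOS, TPESYSTEM)


def ulog_lines_for_pid(lines, pid):
    """Return the tagged lines logged by the given process ID, in log order."""
    selected = []
    for line in lines:
        match = PID_RE.match(line)
        if match and int(match.group(1)) == pid:
            selected.append(line.rstrip())
    return selected
```

A caller would typically check should_check_ulog() on the failing tperrno value and, if it is true, pass the lines of the current day's ULOG and its own process ID to ulog_lines_for_pid().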

@2016-03-07 21:31:45

Where the ULOG Resides

By default, the user log is called ULOG.mmddyy (where mmddyy represents the date in terms of month, day, and year) and it is created in the $APPDIR directory. You can place this file in any location, however, by setting the ULOGPFX parameter in the MACHINES section of the UBBCONFIG file.
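
The naming rule can be expressed in a few lines of Python. This is a sketch under the assumption that the ULOGPFX value is a path prefix to which the .mmddyy suffix is appended, with $APPDIR/ULOG as the default prefix; ulog_path and default_ulog_path are hypothetical names.

```python
import os
from datetime import date


def ulog_path(prefix, day):
    """Append the .mmddyy suffix to a ULOG prefix (assumed to be the ULOGPFX value)."""
    return f"{prefix}.{day.strftime('%m%d%y')}"


def default_ulog_path(appdir, day):
    """Default location: a file named ULOG.mmddyy inside the APPDIR directory."""
    return ulog_path(os.path.join(appdir, "ULOG"), day)


# The log for 7 March 2016 with a prefix of /home/app/ULOG
print(ulog_path("/home/app/ULOG", date(2016, 3, 7)))
```

For today's log, pass date.today() as the second argument.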

@2016-03-07 21:34:14

The BEA Tuxedo system uses the tperrno variable to supply information to a process when a function fails. All ATMI functions that normally return an integer or pointer return -1 or NULL, respectively, on error and set tperrno to a value that describes the nature of the error. When a function does not return to its caller, as in the case of tpreturn() or tpforward(), which are used to terminate a service routine, the only way the system can communicate success or failure is through the variable tperrno in the requester.

The tperrordetail() and tpstrerrordetail() functions can be used to obtain additional detail about an error in the most recent BEA Tuxedo system call on the current thread. tperrordetail() returns an integer (with an associated symbolic name) which is then used as an argument to tpstrerrordetail() to retrieve a pointer to a string that contains the error message. The pointer can then be used as an argument to userlog() or fprintf().

The codes returned in tperrno represent categories of errors, which are listed in the following table.

Error Category           tperrno Values

Abort                    TPEABORT [2]
BEA Tuxedo system [1]    TPESYSTEM
Call descriptor          TPELIMIT and TPEBADDESC
Conversational           TPEEVENT
Duplicate operation      TPEMATCH
General communication    TPESVCFAIL, TPESVCERR, TPEBLOCK, and TPGOTSIG
Heuristic decision       TPEHAZARD [2] and TPEHEURISTIC [2]
Invalid argument [1]     TPEINVAL
MIB                      TPEMIB
No entry                 TPENOENT
Operating system [1]     TPEOS
Permission               TPEPERM
Protocol [1]             TPEPROTO
Queueing                 TPEDIAGNOSTIC
Release compatibility    TPERELEASE
Resource manager         TPERMERR
Timeout                  TPETIME
Transaction              TPETRAN [2]
Typed buffer mismatch    TPEITYPE and TPEOTYPE

[1] Applicable to all ATMI functions for which failure is reported by the value returned in tperrno.

[2] Refer to Fatal Transaction Errors for more information on this error category.

As footnote 1 shows, four categories of errors are reported by tperrno(5) and are applicable to all ATMI functions. The remaining categories are used only for specific ATMI functions. The following sections describe some error categories in detail.
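
For scripts that post-process error reports, the table can be carried around as a simple lookup. The Python sketch below transcribes it, with constant names spelled as in the atmi.h listing that follows; the dictionary and function names are hypothetical.

```python
# Category for each tperrno name, transcribed from the table above.
TPERRNO_CATEGORY = {
    "TPEABORT": "Abort",
    "TPESYSTEM": "BEA Tuxedo system",
    "TPELIMIT": "Call descriptor",
    "TPEBADDESC": "Call descriptor",
    "TPEEVENT": "Conversational",
    "TPEMATCH": "Duplicate operation",
    "TPESVCFAIL": "General communication",
    "TPESVCERR": "General communication",
    "TPEBLOCK": "General communication",
    "TPGOTSIG": "General communication",
    "TPEHAZARD": "Heuristic decision",
    "TPEHEURISTIC": "Heuristic decision",
    "TPEINVAL": "Invalid argument",
    "TPEMIB": "MIB",
    "TPENOENT": "No entry",
    "TPEOS": "Operating system",
    "TPEPERM": "Permission",
    "TPEPROTO": "Protocol",
    "TPEDIAGNOSTIC": "Queueing",
    "TPERELEASE": "Release compatibility",
    "TPERMERR": "Resource manager",
    "TPETIME": "Timeout",
    "TPETRAN": "Transaction",
    "TPEITYPE": "Typed buffer mismatch",
    "TPEOTYPE": "Typed buffer mismatch",
}

# Footnote 1: the four categories that apply to all ATMI functions.
APPLIES_TO_ALL_ATMI = {"TPESYSTEM", "TPEINVAL", "TPEOS", "TPEPROTO"}


def category_of(name):
    """Return the error category for a tperrno name, or None if it is not in the table."""
    return TPERRNO_CATEGORY.get(name)


def applies_to_all_atmi_functions(name):
    """True when the error can be reported by any ATMI function (footnote 1)."""
    return name in APPLIES_TO_ALL_ATMI
```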

In atmi.h, you can find the definitions of the tperrno values:

/*
 * tperrno values - error codes
 * The man pages explain the context in which the following error codes
 * can return.
 */

#define TPMINVAL 0 /* minimum error message */
#define TPEABORT 1
#define TPEBADDESC 2
#define TPEBLOCK 3
#define TPEINVAL 4
#define TPELIMIT 5
#define TPENOENT 6
#define TPEOS  7
#define TPEPERM  8
#define TPEPROTO 9
#define TPESVCERR 10
#define TPESVCFAIL 11
#define TPESYSTEM 12
#define TPETIME  13
#define TPETRAN  14
#define TPGOTSIG 15
#define TPERMERR 16
#define TPEITYPE 17
#define TPEOTYPE 18
#define TPERELEASE 19
#define TPEHAZARD 20
#define TPEHEURISTIC 21
#define TPEEVENT 22
#define TPEMATCH 23
#define TPEDIAGNOSTIC 24
#define TPEMIB  25
#define TPMAXVAL 26 /* maximum error message */
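
When a log or trace shows only the numeric tperrno value, the listing above can be turned into a reverse lookup. The following Python sketch parses the #define lines exactly as quoted here; it is an illustration rather than part of Tuxedo, the names load_tperrno_names and tperrno_name are hypothetical, and on a real system the header text would be read from the installed atmi.h instead of an embedded string.

```python
import re

# The #define lines quoted above, embedded so the example is self-contained.
ATMI_EXCERPT = """\
#define TPMINVAL 0 /* minimum error message */
#define TPEABORT 1
#define TPEBADDESC 2
#define TPEBLOCK 3
#define TPEINVAL 4
#define TPELIMIT 5
#define TPENOENT 6
#define TPEOS  7
#define TPEPERM  8
#define TPEPROTO 9
#define TPESVCERR 10
#define TPESVCFAIL 11
#define TPESYSTEM 12
#define TPETIME  13
#define TPETRAN  14
#define TPGOTSIG 15
#define TPERMERR 16
#define TPEITYPE 17
#define TPEOTYPE 18
#define TPERELEASE 19
#define TPEHAZARD 20
#define TPEHEURISTIC 21
#define TPEEVENT 22
#define TPEMATCH 23
#define TPEDIAGNOSTIC 24
#define TPEMIB  25
#define TPMAXVAL 26 /* maximum error message */
"""

DEFINE_RE = re.compile(r"^#define\s+(TP\w+)\s+(\d+)", re.MULTILINE)


def load_tperrno_names(header_text):
    """Build a value -> name map, skipping the TPMINVAL/TPMAXVAL range markers."""
    names = {}
    for name, value in DEFINE_RE.findall(header_text):
        if name in ("TPMINVAL", "TPMAXVAL"):
            continue
        names[int(value)] = name
    return names


TPERRNO_NAMES = load_tperrno_names(ATMI_EXCERPT)


def tperrno_name(value):
    """Return the symbolic name for a tperrno value, or None if it is out of range."""
    return TPERRNO_NAMES.get(value)


print(tperrno_name(12))
```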

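The descriptions below are written in terms of tpcall(), so svc, idata, odata, olen, and flags refer to its arguments.

[TPEINVAL]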
Invalid arguments were given (for example, svc is NULL or flags are invalid).

[TPENOENT]
Cannot send to svc because it does not exist, or it is a conversational service, or the name provided begins with a dot (.).

[TPEITYPE]
The type and subtype of idata is not one of the allowed types and subtypes that svc accepts.

[TPEOTYPE]
Either the type and subtype of the reply are not known to the caller; or, TPNOCHANGE was set in flags and the type and subtype of *odata do not match the type and subtype of the reply sent by the service. Neither *odata, its contents, nor *olen is changed. If the service request was made on behalf of the caller's current transaction, then the transaction is marked abort-only since the reply is discarded.

[TPETRAN]
svc belongs to a server that does not support transactions and TPNOTRAN was not set.

[TPETIME]
This error code indicates that either a timeout has occurred or tpcall() has been attempted, in spite of the fact that the current transaction is already marked rollback only. If the caller is in transaction mode, then either the transaction is already rollback only or a transaction timeout has occurred. The transaction is marked abort-only. If the caller is not in transaction mode, a blocking timeout has occurred. (A blocking timeout cannot occur if TPNOBLOCK and/or TPNOTIME is specified.) In either case, no changes are made to *odata, its contents, or *olen. If a transaction timeout has occurred, then, with one exception, any attempts to send new requests or receive outstanding replies will fail with TPETIME until the transaction has been aborted. The exception is a request that does not block, expects no reply, and is not sent on behalf of the caller's transaction (that is, tpacall() with TPNOTRAN, TPNOBLOCK, and TPNOREPLY set).

When a service fails inside a transaction, the transaction is put into the TX_ROLLBACK_ONLY state. This state is treated, for most purposes, as though it were equivalent to a timeout. All further ATMI calls for this transaction (with the exception of those issued in the circumstances described in the previous paragraph) will fail with TPETIME.
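
The single exception to the TPETIME rule can be stated as a predicate. The Python sketch below models only the rule as worded above; the AtmiFlag class and the function name are hypothetical, and the flag values are illustrative placeholders generated by auto(), not the values defined in atmi.h.

```python
from enum import IntFlag, auto


class AtmiFlag(IntFlag):
    # Placeholder bit values for illustration only; the real values live in atmi.h.
    TPNOBLOCK = auto()
    TPNOREPLY = auto()
    TPNOTRAN = auto()


# All three flags must be set for the exception to apply.
REQUIRED_FLAGS = AtmiFlag.TPNOTRAN | AtmiFlag.TPNOBLOCK | AtmiFlag.TPNOREPLY


def allowed_after_transaction_timeout(function_name, flags):
    """True only for tpacall() issued with TPNOTRAN, TPNOBLOCK, and TPNOREPLY all set."""
    return function_name == "tpacall" and (flags & REQUIRED_FLAGS) == REQUIRED_FLAGS
```

Every other request made before the transaction is aborted is expected to fail with TPETIME, so application code can use a check like this to decide whether a call is worth attempting.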

[TPESVCFAIL]
The service routine sending the caller's reply called tpreturn() with TPFAIL. This is an application-level failure. The contents of the service's reply, if one was sent, are available in the buffer pointed to by *odata. If the service request was made on behalf of the caller's current transaction, then the transaction is marked abort-only. Note that regardless of whether the transaction has timed out, the only valid communications before the transaction is aborted are calls to tpacall() with TPNOREPLY, TPNOTRAN, and TPNOBLOCK set.

[TPESVCERR]
A service routine encountered an error either in tpreturn(3c) or tpforward(3c) (for example, bad arguments were passed). No reply data is returned when this error occurs (that is, neither *odata, its contents, nor *olen is changed). If the service request was made on behalf of the caller's transaction (that is, TPNOTRAN was not set), then the transaction is marked abort-only. Note that regardless of whether the transaction has timed out, the only valid communications before the transaction is aborted are calls to tpacall() with TPNOREPLY, TPNOTRAN, and TPNOBLOCK set. If either SVCTIMEOUT in the UBBCONFIG file or TA_SVCTIMEOUT in the TM_MIB is non-zero, TPESVCERR is returned when a service timeout occurs.

[TPEBLOCK]
A blocking condition was found on the send call and TPNOBLOCK was specified.

[TPGOTSIG]
A signal was received and TPSIGRSTRT was not specified.

[TPEPROTO]
A library routine was called in an improper context. For example, tpcall() was called improperly.

Protocol errors occur when an ATMI function is invoked either in the wrong order or by the wrong process. For example, a client may try to begin communicating with a server before joining the application. Or tpcommit() may be called by a transaction participant instead of the initiator.

You can correct a protocol error at the application level by enforcing the rules of order and proper usage of ATMI calls.

To determine the cause of a protocol error, answer the following questions:

  • Is the call being made in the correct order?
  • Is the call being made by the correct process?

[TPESYSTEM]
A BEA Tuxedo system error has occurred. The exact nature of the error is written to a log file.

[TPEOS]
An operating system error has occurred. If a message queue on a remote location is filled, TPEOS may be returned even if tpcall() returned successfully.

@2016-03-07 21:35:51

Determining Types of Failures

The first step in troubleshooting is determining problem areas. In most applications you must consider six possible sources of trouble:

  • Application
  • BEA Tuxedo system
  • Database management software
  • Network
  • Operating system
  • Hardware

Once you have determined the problem area, you must then work with the appropriate administrator to resolve the problem. If, for example, you determine that the trouble is caused by a networking problem, you must work with the network administrator.

How to Determine the Cause of an Application Failure

The following steps will help you detect the source of an application failure.

  1. Check any BEA Tuxedo system warnings and error messages in the user log (ULOG).
  2. Select the messages you think most likely reflect the current problem. Note the catalog name and number of each message, so you can look up the message in System Messages. The manual entry provides:
    • Details about the error condition indicated by the message
    • Recommendations for recovery actions
  3. Check any application warnings and error messages in the ULOG.
  4. Check any warnings and errors generated by application servers and clients. Such messages are usually sent to the standard output and standard error files (named, by default, stdout and stderr, respectively).
    • The stdout and stderr files are located in the directory defined by the APPDIR variable.
    • The stdout and stderr files for your clients and servers may have been renamed. (You can rename the stdout and stderr files by specifying -o and -e, respectively, in the appropriate client and server definitions in your configuration file. For details, see servopts(5) in the File Formats, Data Descriptions, MIBs, and System Processes Reference.)
  5. Look for any core dumps in the directory defined by the APPDIR variable. Use a debugger such as dbx to get a stack trace. If you find core dumps, notify your application developer.
  6. Check your system activity reports (for example, by running the sar(1) command) to determine why your system is not functioning properly. Consider the following reasons:
    • The system may be running out of memory.
    • The kernel might not be tuned correctly.
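
Steps 1 and 2 amount to collecting the catalog name and number pairs that appear in the log. A rough Python sketch of that tally follows; catalog_message_counts is a hypothetical helper name, and the pattern assumes catalog references look like the LIBTUX_CAT:262 and LIBSEC_CAT: 999 forms shown earlier.

```python
import re
from collections import Counter

# A catalog reference follows the tag's colon, e.g. ": LIBTUX_CAT:358:".
CATALOG_RE = re.compile(r":\s*([A-Z][A-Z0-9_]*_CAT):\s*(\d+):")


def catalog_message_counts(lines):
    """Count how often each CATALOG:number pair appears in the given ULOG lines."""
    counts = Counter()
    for line in lines:
        match = CATALOG_RE.search(line)
        if match:
            counts[f"{match.group(1)}:{match.group(2)}"] += 1
    return counts


sample = [
    "151550.gumby!BBL.28041.1.0: LIBTUX_CAT:262: std main starting",
    "151550.gumby!BBL.28041.1.0: LIBTUX_CAT:358: reached UNIX limit on semaphore ids",
    "151550.gumby!BBL.28040.1.0: WARNING: No BBL available on site SITE1.",
]
for key, count in catalog_message_counts(sample).items():
    print(key, count)
```

Each key in the result is a catalog name and number that can be looked up in System Messages.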

How to Determine the Cause of a BEA Tuxedo System Failure

The following steps will help you detect the source of a system failure.

  1. Check any BEA Tuxedo system warnings and error messages in the user log (ULOG):
    • TPEOS messages indicate errors in the operating system.
    • TPESYSTEM messages indicate errors in the BEA Tuxedo system.
  2. Select the messages you think most likely reflect the current problem. Note the catalog name and number of each message, so you can look up the message in System Messages. The manual entry provides:
    • Details about the error condition flagged by the message.
    • Recommendations for recovery actions.
  3. Prepare for debugging in the following ways:
    • Shut down the suspect server.
    • Use tmboot -n -s(server) -d1. (This will not boot the server, but prints the command line used to boot the server by the BEA Tuxedo system.) Use that command line with a debugger such as dbx.
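
If the debugging preparation is scripted, the tmboot invocation from step 3 can be assembled as an argument list. The Python sketch below only builds and prints the command, using the -n, -s, and -d1 options exactly as given above; tmboot_preview_command is a hypothetical name and simpserv is just a sample server name.

```python
import shlex


def tmboot_preview_command(server):
    """Build the tmboot argument list that prints, without booting, the server's command line."""
    return ["tmboot", "-n", "-s", server, "-d1"]


# Show the command as it would be typed in a shell; nothing is executed here.
print(shlex.join(tmboot_preview_command("simpserv")))
```

Keeping the arguments in a list avoids shell quoting problems if the command is later passed to subprocess.run().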


© 2024 Digcode.com