Server monitoring can be done using a number of utilities. Those utilities are top, sar (system activitiy reporter), sadc, vmstat, the Unix (w)atch command, iostat, etc. Note that the results of these processes can parsed and placed into database tables on a scheduled basis for later statistical analysis.
last pid: 15713; load averages: 0.01, 0.01, 0.01 04:15:15 58 processes: 57 sleeping, 1 on cpu CPU states: 99.7% idle, 0.0% user, 0.3% kernel, 0.0% iowait, 0.0% swap Memory: 1024M real, 34M free, 598M swap in use, 2263M swap free PID USERNAME THR PRI NICE SIZE RES STATE TIME CPU COMMAND 15713 oracle 1 58 0 2136K 1664K cpu0 0:00 0.08% top 1732 build 26 58 0 195M 60M sleep 5:45 0.05% java 14478 ted 25 58 0 182M 40M sleep 1:08 0.01% java 15711 root 1 58 0 2152K 1728K sleep 0:00 0.00% sendmail 127 root 1 58 0 3208K 2672K sleep 0:38 0.00% in.named 235 root 1 58 0 1752K 1056K sleep 0:20 0.00% sshd1 9259 oracle 1 58 0 9144K 4544K sleep 0:18 0.00% tnslsnr 14829 wwong 22 58 0 50M 25M sleep 0:03 0.00% java 14950 wwong 21 0 0 46M 19M sleep 0:01 0.00% java 1609 build 20 0 0 167M 18M sleep 0:01 0.00% java 179 root 8 52 0 2376K 1944K sleep 0:01 0.00% nscd 15676 root 1 38 0 1568K 1264K sleep 0:00 0.00% in.rlogind 143 daemon 4 44 0 2184K 1568K sleep 0:00 0.00% statd 138 root 1 48 0 2008K 1584K sleep 0:00 0.00% inetd 231 root 1 48 0 2096K 1552K sleep 0:00 0.00% sendmail
The sar utility is the System Activity Reporter. sar -u shows CPU activity, sar -w shows swapping activity, sar -gp shows paging activity, sar -r shows unused memory and sar -b show buffer activity. All the examples below execute five times every three seconds, ie. sar [-u|-w|-gp|-r|-b] t n where t determines how often (in seconds) a report is made and n determines the number of seconds between each report.
The sadc utility is used to collect server statistics. As opposed to the sar utility which displays system activity, the sadc utility is used to sample, save and process system activity statistics. These statistics are in general CPU utilization, buffers usage, disk I/O activity, TTY device activity, switching and system-call activity, file-access, queue activity, inter-process communications, and paging. The sar, sa1 and sa2 utilities are packaged in sar. sadc packages can be used to schedule cron jobs in the crontab file for collection and reporting of system activity statistics.
# sar -u 3 5 15:25:05 %usr %sys %wio %idle 15:25:08 1 0 1 98 15:25:11 1 2 9 88 15:25:14 0 0 1 99 15:25:17 0 0 1 99 15:25:20 1 0 1 98 Average 1 1 2 96
# sar -w 3 5 15:27:13 swpin/s bswin/s swpot/s bswot/s pswch/s 15:27:16 0.00 0.0 0.00 0.0 146 15:27:19 0.00 0.0 0.00 0.0 170 15:27:22 0.00 0.0 0.00 0.0 167 15:27:25 0.00 0.0 0.00 0.0 187 15:27:28 0.00 0.0 0.00 0.0 163 Average 0.00 0.0 0.00 0.0 167
# sar -gp 3 5 15:28:14 pgout/s ppgout/s pgfree/s pgscan/s %ufs_ipf atch/s pgin/s ppgin/s pflt/s vflt/s slock/s 15:28:17 1.33 1.33 1.33 0.00 0.00 1.33 0.00 0.00 1.67 1.67 0.00 15:28:20 1.33 1.33 1.33 0.00 0.00 1.33 0.00 0.00 0.00 0.00 0.00 15:28:23 1.33 1.33 1.33 0.00 0.00 1.33 0.00 0.00 0.00 0.00 0.00 15:28:26 1.67 1.67 1.67 0.00 0.00 1.33 0.00 0.00 0.00 1.00 0.00 15:28:29 1.33 1.33 1.33 0.00 0.00 1.33 0.00 0.00 0.00 0.00 0.00 Average 1.40 1.40 1.40 0.00 0.00 1.33 0.00 0.00 0.33 0.53 0.00
# sar -r 3 5 15:28:56 freemem freeswap 15:28:59 2097 3590815 15:29:02 2110 3614833 15:29:05 2089 3578925 15:29:08 2105 3602784 15:29:11 2113 3614866 Average 2103 3600390
# sar -b 3 5 15:29:31 bread/s lread/s %rcache bwrit/s lwrit/s %wcache pread/s pwrit/s 15:29:34 0 0 100 0 0 100 0 0 15:29:37 0 0 100 0 0 100 0 0 15:29:40 0 1 100 0 0 100 0 0 15:29:43 0 1 100 0 0 100 0 0 15:29:46 0 0 100 0 0 100 0 0 Average 0 0 100 0 0 100 0 0
The vmstat utility is used to report virtual memory statistics.
# vmstat procs memory page disk faults cpu r b w swap free re mf pi po fr de sr m0 m1 m1 s0 in sy cs us sy id 0 0 0 54656 93832 0 7 2 1 3 0 0 1 1 1 1 423 74 92 0 0 99
# vmstat -cisS interrupt total rate -------------------------------- clock 19428784 100 hmec0 830039 4 hmec1 0 0 -------------------------------- Total 20258823 104
# w 4:28am up 2 day(s), 6 hr(s), 1 user, load average: 0.00, 0.00, 0.01 User tty login@ idle JCPU PCPU what oracle pts/2 4:08am w
iostat is used to report I/O statistics.
# iostat
tty md0 md1 md10 sd0 cpu
tin tout kps tps serv kps tps serv kps tps serv kps tps serv us sy wt id
0 5 17 1 17 17 1 21 23 1 26 17 1 17 0 0 1 99
# iostat -xtc
extended device statistics tty cpu
device r/s w/s kr/s kw/s wait actv svc_t %w %b tin tout us sy wt id
md0 0.1 1.0 3.4 13.5 0.0 0.0 17.1 0 1 0 5 0 0 1 99
md1 0.1 1.0 3.3 13.5 0.0 0.0 20.5 0 1
md10 0.3 1.0 6.7 16.3 0.0 0.0 26.4 1 1
sd0 0.1 1.0 3.4 13.6 0.0 0.0 17.1 0 1
sd1 0.1 1.6 3.3 13.9 0.0 0.0 21.2 0 1
sd6 0.0 0.0 1.4 0.0 0.0 0.0 26.0 0 0
nfs1 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0 0
# iostat -xnp
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.1 1.0 3.4 13.5 0.0 0.0 0.0 17.1 0 1 md0
0.1 1.0 3.3 13.5 0.0 0.0 0.0 20.5 0 1 md1
0.3 1.0 6.7 16.3 0.0 0.0 5.6 20.8 1 1 md10
0.1 1.0 3.4 13.6 0.0 0.0 0.0 17.0 0 1 c0t0d0
0.1 1.0 3.4 13.5 0.0 0.0 0.0 17.1 0 1 c0t0d0s0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 13.3 0 0 c0t0d0s1
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c0t0d0s2
0.1 1.6 3.3 13.9 0.0 0.0 0.0 21.2 0 1 c0t1d0
0.1 1.0 3.3 13.5 0.0 0.0 0.0 20.5 0 1 c0t1d0s0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c0t1d0s1
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c0t1d0s2
0.0 0.7 0.0 0.3 0.0 0.0 0.0 22.5 0 1 c0t1d0s7
0.0 0.0 1.4 0.0 0.0 0.0 0.0 25.9 0 0 c0t6d0
Display the top CPU users with the command ps -ef | sort +6 | tail -10.
# ps -ef | sort +6 | tail -10
root 28066 1 0 Jun 18 ? 2:25 /usr/sbin/nscd
root 23790 1 0 Jun 13 ? 8:31 /usr/local/sbin/sshd
root 276 253 0 May 18 ? 27:50 mibiisa -r -p 32788
root 3 0 1 May 18 ? 733:26 fsflush
mjlewis 9190 9187 0 Aug 09 pts/2 0:00 -ksh
network 9206 9196 0 Aug 09 pts/2 0:00 -sh
mjlewis 9196 9190 0 Aug 09 pts/2 0:00 tcsh
network 9237 9206 0 Aug 09 pts/2 0:00 tcsh
Display RAM size with the command prtconf | grep "Memory Size".
prtconf|grep -i "Memory Size" Memory size: 1024 Megabytes
# ipcs -mqs IPC status from <running system> as of Sat Jul 21 03:53:20 2001 T ID KEY MODE OWNER GROUP Message Queues: Shared Memory: m 300 0x5000a417 --rw-r--r-- root other m 101 0x7f8207c8 --rw-rw---- oracle dba Semaphores: s 458752 0x061761b5 --ra-ra---- oracle dba s 65537 0x061761b6 --ra-ra---- oracle dba s 65538 0x061761b7 --ra-ra---- oracle dba s 65539 0x061761b8 --ra-ra---- oracle dba s 65540 0x061761b9 --ra-ra---- oracle dba
# ipcs -pmb IPC status from <running system> as of Sat Jul 21 03:53:46 2001 T ID KEY MODE OWNER GROUP SEGSZ CPID LPID Shared Memory: m 300 0x5000a417 --rw-r--r-- root other 68 23372 23372 m 101 0x7f8207c8 --rw-rw---- oracle dba 249061376 7379 7367
Display the number of CPUs using the psrinfo -v command.
# psrinfo -v
Status of processor 0 as of: 07/21/01 03:58:02
Processor has been on-line since 07/18/01 22:28:48.
The sparc processor operates at 450 MHz,
and has a sparc floating point processor.
Status of processor 2 as of: 07/21/01 03:58:02
Processor has been on-line since 07/18/01 22:28:49.
The sparc processor operates at 450 MHz,
and has a sparc floating point processor.
Display system details using the uname -X command.
# uname -X System = SunOS Node = <node> Release = 5.6 KernelID = Generic_105181-26 Machine = sun4u BusType =Serial = Users = OEM# = 0 Origin# = 1 NumCPU = 2
Find the number CPUs and the speed of those CPUs or memory using dmesg | grep -i [cpu | mem].
# dmesg | grep -i cpu cpu0: SUNW,UltraSPARC-II (upaid 0 impl 0x11 ver 0xa0 clock 450 MHz) cpu1: SUNW,UltraSPARC-II (upaid 2 impl 0x11 ver 0xa0 clock 450 MHz) cpu2: SUNW,UltraSPARC-II (upaid 2 impl 0x11 ver 0xa0 clock 450 MHz) cpu3: SUNW,UltraSPARC-II (upaid 2 impl 0x11 ver 0xa0 clock 450 MHz) # dmesg | grep -i mem mem = 1048576K (0x40000000) avail mem = 1039376384
Unix system log files are in /var/log, /var/adm and /var/cron. /var/log contains numbered sequences of /var/log/syslog[.n]. Examine the system logs using the tail command because log files can be large. /var/adm/messages[.n] files contain mostly login record histories. See Understanding system log files on a Solaris 2.x operating system for further details.
Display active dedicated connections as shown below. Execute the command below excluding |wc -l to view connected dedicated server processes.
ps -ef |grep $ORACLE_SID|grep -v grep|grep -v ora_|wc -l
# ps -ef |grep $ORACLE_SID|grep -v grep|grep -v ora_ oracle 24440 24439 0 16:35:42 ? 0:00 oracle<SID> (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
Find all Oracle processes. grep -v grep removes the grep command from the list of processes. Also note that the results of one unix shell command can be passed to the next using the | (pipe) command.
ps -ef | grep ora_ | grep -v grep
Check the listener and the agent.
ps -ef | grep [lsnr|dbsnmp] | grep -v grep
Below is a command to kill all Oracle processes. Only do this in absolute desperation. Solaris may require the manual removal of semaphores after executing a command like this. Always try a shutdown immediate | abort within svrmgrl and lsnrctl stop | dbsnmp_stop first. The ps -ef command in combination with grep will produce a list of processes named as "ora_". Application of the $ORACLE_SID variable to the grep function will filter and include all oracle processes containing the currently set Oracle database service identifier. The awk processor will pull the second column of the output, ie. the process ID. The xargs function will allow construction and evaluation of the command kill -9 <PID>. The {} curly braces are generally used to supply xargs with the argument passed from the previous pipe command.
ps -ef | grep ora_ | grep -v grep | grep $ORACLE_SID | awk '{print $2}' | xargs kill -9 {}
I find myself looking for files on a Unix box constantly. There are a number ways to do this using three commands I can think of off the top of my head, which, whereis and find. Both which and whereis require that the file be in the path ($PATH variable). The find command can be used to search for file names recursively through sub-directories as shown below. The . option shown below can be an absolute or relative path.
find . -name "listener.ora" -print
The -print option will display only file names found. However, if this command is executed from the root directory as the oracle user, and obviously the oracle unix user does not have full access to the entire file system, any non-readable directories will be displayed as non-readable in the output. To remove listings of non-readable directories either redirect the results of the file command to another file. Otherwise use a grep command and/or an xargs command to filter and/or re-execute on the find command STDOUT.
See the .cshrc file in Installing Oracle8i on Solaris in the Oracle User Startup Configuration Files section.
Aliases in Unix can allow quick access to otherwise difficult to find directories and can even execute stored commands when applied. Be careful with aliases, some really wild things can be done with them, causing some equally wild results. From an Oracle DBA perspective, it is better to use aliases with respect and perhaps only for quick changes to different directories within the oracle user directory structure. And even then, why would you want to go into directories containing Oracle database datafiles. There is no reason for this. In fact, doing so could cause an error such as deleting a datafile OUCH !!!. Note the inclusion of the alias rm 'rm -i'. This alias will always prompt prior to deletion.
See the .cshrc file in Installing Oracle8i on Solaris in the Oracle User Startup Configuration Files section.
The short example below will execute a shutdown immediate on the database accessed by the currently set $ORACLE_SID value. Note that if you wish to execute a script such as below as a crontab job you must declare all variables required within the script or execute the .cshrc (.log or .profile) file at the begginning of the execution of each crontab job. Note, when passing scripts as cron-jobs through crontab, crontab cannot see the PATH variable setting. Thus all variables used in scripts must be explicitly set in those scripts.
#!/bin/sh setenv ORACLE_BASE /<mount-point-1>/oracle setenv ORACLE_HOME $ORACLE_BASE/product/8.1.7 setenv LD_LIBRARY_PATH $ORACLE_HOME/lib setenv JAVA_HOME $ORACLE_BASE/jre/1.1.8 setenv ORACLE_SID <SID> $ORACLE_HOME/bin/svrmgrl<<! connect internal; shutdown immediate; disconnect; exit !
The scheduled execution of jobs can be handled using what Unix calls a crontab or cron-job. A crontab file can be stored per user, ie. root would have a different crontab file to that of the oracle user for instance. The [-elr] options will allow editing, listing and removal of the user specified crontab file. Note that by default the crontab editor uses a really nasty editor. Set the EDITOR environment variable to the value vi to allow editing of the crontab file with the vi editor.
/bin/sh; setenv EDITOR vi; crontab [-elr] <username>
The crontab entry format is basically <execution-time> <process>. The execution time is entered as minute, hour, day, month of year and day of week as shown below. The * (asterisk) is used as a wildcard to indicate repetition.
minute (0-59) hour (0-23) day of the month (1-31) month of the year (1-12) day of the week (0-6 with 0=Sunday).
In the example crontab script below two backups, one export and a trace file removal process are executed. The two backups are executed on Monday to Saturday and Sunday at 03h30. The Sunday backup is a more full-backup than the other backup. The export exports the contents of the Oracle USER schema every day. In a large database exporting anything but small things (not data) would be very impractical. In a production environment typically individual tables can be exproted on a daily basis. However, exporting of specific tables and/or Oracle schemas are probably more applicable to development and testing environments. Developers can quite easily drop tables and other database objects in error. Export is a very quick and easy way of restoration for small database objects, particulary not ridiculously large tables. Also note the inclusion of explicit path names. Also note that the inclusion of passwords into crontab executed shell scripts is a possible security risk. However, if a hacker is in that far then the hacker is in already in anyway. Also Oracle password files tend to give complete access to the database if one knows of the existence of the internal user. As far as I know Oracle is trying to phase out the use of the internal, system and sys users. This security issue may be resolved in the future.
30 3 * * 1-6 /<mount-point-1>/oracle/product/8.1.7/sbin/backup.sh <password> <SID> <SID> 30 3 * * 0 /<mount-point-1>/oracle/product/8.1.7/sbin/backup.sh <password> <SID> <SID> All 30 4 * * 0 /<mount-point-1>/oracle/product/8.1.7/sbin/export.sh <password> <SID> <SID> user 30 4 * * 0-6 /<mount-point-1>/oracle/product/8.1.7/sbin/utils/delTraceAudit.sh <SID>
Two files in the etc/cron.d directory are the cron.allow and the cron.deny files. If either or both of the files do not exist then their are no restrictions. Obviously inclusion of a username into either allows or denies cronr-jobs for a user specifically declared in /etc/cron.d/cron.allow and /etc/cron.d/cron.deny respectively.
The nohup command can be used to run processes or scripts in background. What this means is that a process can be executed and the calling shell can be aborted. A crontab job will execute in background. Note that 2>&1 will redirect standard error messages to the standard output device (1 = standard output device and 2 = standard error device). The final & character in the command causes the process to run in background, freeing up the shell for further use.
nohup ./$ORACLE_HOME/sbin/backup.sh temp <SID> <SID> > $ORACLE_HOME/logs/backup.log 2>&1 &
The command nohup ./<shell script> & will send the output to a file called nohup.out.
You may want to watch the execution of the nohup executed process. Do this by viewing the log file as shown below. Note that the tail -f command will display each line as it is appended to a file such as a log file.
more $ORACLE_HOME/logs/backup.log
cat $ORACLE_HOME/logs/backup.log
tail [-20 | -f] $ORACLE_HOME/logs/backup.log
You could even email the results to yourself at an email address. This email address could even be a pager or a cellphone. Pagers and cellphones can be used very effectively to pass problematic or potentially problematic results of a script's execution requiring action by the DBA. Automated database monitoring and O/S level monitoring can be very effective by utilising email as show below.
nohup $ORACLE_HOME/sbin/backup.sh temp <SID> <SID> > $ORACLE_HOME/logs/backup.log & | mailx -s "backup.log" errors@hostname.com
df -k will show mount points and space used.
# df -k Filesystem kbytes used avail capacity Mounted on /proc 0 0 0 0% /proc /dev/dsk/c0t0d0s0 7670973 6310535 1283729 84% / fd 0 0 0 0% /dev/fd swap 1300760 400 1300360 1% /tmp
The command below could constitute a basis for a disk space running-low warning script.
# df -k | awk '{print $1 " " $5}'
Filesystem capacity
/proc 0%
/dev/dsk/c0t0d0s0 84%
fd 0%
swap 1%
The script below can be used to automate checking of disk space.
#!/bin/ksh
#
# ---------------------------------------------------------------------------------------------------------------------------------
#
export PATH=/bin:/usr/bin:/etc:/usr/ccs/bin:/usr/openwin/bin:/usr/ucb:/usr/local/bin:/usr/sbin:/usr/X/bin:$JAVA_HOME
export host=`hostname`
export email=$1
export USAGE="$0: Incorrect arguments, Usage: $0 <\"email1 email2 ... emailn\">"
if [ -z "$email" ]; then
echo "$USAGE"
mailx -s "$USAGE" emergencyEMail@whatever.com < /dev/null
exit 1
elif [ `whoami` != 'root' ]; then
echo "$0 aborted - user `whoami` is incorrect, must be user root"
mailx -s "$0 aborted - user `whoami` is incorrect, must be user root" $email < /dev/null
exit 1
fi
if [ -z "$PATH" ] || [ -z "$host" ]; then
echo "$0 aborted - variable not defined"
mailx -s "$0 aborted - variable not defined" $email < /dev/null
exit 1
fi
echo "Executing $0 on $host"
#
# ---------------------------------------------------------------------------------------------------------------------------------
#
panic=95
scream=99
df -k | awk '{\
if (($1 != "Filesystem") && ($1 != "fd") && ($1 != "/proc"))\
{\
if ($5 > scream) { print "SCREAM !!! - Disk space on",host,$1,"@",$5 }\
else if ($5 > panic) { print "Panic - Disk space on",host,$1,"@",$5 }\
}\
}' scream=$scream panic=$panic host=$host > /usr/local/sbin/logs/diskspace.log
if [ -s /usr/local/sbin/logs/diskspace.log ]; then
sub="Script $0 on $host detected disk space limits exceeded !!!"
echo $sub
mailx -s "$sub" $email < /usr/local/sbin/logs/diskspace.log
exit 1
fi
exit 0
# du -s $ORACLE_HOME | sort -n 1801978 /<mount-point-1>/oracle/product/8.1.7
# du -sk $ORACLE_HOME 900989 /<mount-point-1>/oracle/product/8.1.7
Compress <file> with tar cvf <file> | gzip <file> and decompress with gzip -d <file>.tar.gz | xargs tar xvf {}.
#!/bin/ksh
#
# ---------------------------------------------------------------------------------------------------------------------------------
#
export ORACLE_BASE=/<mount-point-1>/oracle
export ORACLE_HOME=$ORACLE_BASE/product/8.1.7
export LD_LIBRARY_PATH=$ORACLE_HOME/lib:/usr/openwin/lib
export PATH=/bin:/usr/bin:/etc:/usr/ccs/bin:/usr/openwin/bin:/usr/ucb:/usr/local/bin:/usr/sbin:/usr/X/bin:$JAVA_HOME
export host=`hostname`
export email=$1
export USAGE="$0: Incorrect arguments, Usage: $0 <\"email1 email2 ... emailn\">"
if [ -z "$email" ]; then
echo "$USAGE"
mailx -s "$USAGE" emergencyEMail@whatever.com < /dev/null
exit 1
elif [ `whoami` != 'oracle' ]; then
echo "$0 aborted - user `whoami` is incorrect, must be user root"
mailx -s "$0 aborted - user `whoami` is incorrect, must be user root" $email < /dev/null
exit 1
fi
if [ -z "$PATH" ] || [ -z "$host" ] || [ -z "$ORACLE_BASE" ] || [ -z "$ORACLE_HOME" ]; then
echo "$0 aborted - variable not defined"
mailx -s "$0 aborted - variable not defined" $email < /dev/null
exit 1
fi
echo "Executing $0 on $host"
#
# ---------------------------------------------------------------------------------------------------------------------------------
#
echo Cleaning trace and transaction audit files for $1 upto 14 days ago
unalias rm
find $ORACLE_BASE/admin/$1/bdump/*.trc -mtime +14 | xargs rm -f
find $ORACLE_BASE/admin/$1/udump/*.trc -mtime +14 | xargs rm -f
find $ORACLE_BASE/admin/$1/cdump/*.* -mtime +14 | xargs rm -f
find $ORACLE_HOME/rdbms/audit/*.aud -mtime +14 | xargs rm -f
alias rm 'rm -i'
Change file permissions with the chmod command and change file ownership with the chown command.
#!/bin/ksh
#
# ---------------------------------------------------------------------------------------------------------------------------------
#
export ORACLE_BASE=/<mount-point-1>/oracle
export ORACLE_HOME=$ORACLE_BASE/product/8.1.7
export LD_LIBRARY_PATH=$ORACLE_HOME/lib:/usr/openwin/lib
export TNS_ADMIN=$ORACLE_HOME/network/admin
export ORACLE_SBIN=$ORACLE_HOME/sbin
export PATH=/bin:/usr/bin:/etc:/usr/ccs/bin:/usr/openwin/bin:/usr/ucb:/usr/local/bin:/usr/sbin:/usr/X/bin:$JAVA_HOME
export host=`hostname`
export email=$1
export password=$2
export ORACLE_SID=$3
export tnsname=$4
USAGE="$0: Incorrect arguments, Usage: $0 <email> <password> <sid> <tnsname>"
if [ -z "$email" ]; then
echo "$USAGE"
mailx -s "$USAGE" emergencyEMail@whatever.com < /dev/null
exit 1
elif [ `whoami` != 'oracle' ]; then
echo "$0 aborted - user `whoami` is incorrect, must be user root"
mailx -s "$0 aborted - user `whoami` is incorrect, must be user root" $email < /dev/null
exit 1
fi
if [ -z "$PATH" ] || [ -z "$host" ] || [ -z "$ORACLE_BASE" ] || [ -z "$ORACLE_HOME" ]\
|| [ -z "$password" ] || [ -z "$ORACLE_SID" ] || [ -z "$tnsname" ]; then
echo "$0 aborted - variable not defined"
mailx -s "$0 aborted - variable not defined" $email < /dev/null
exit 1
fi
echo "Executing $0 on $host"
#
# ---------------------------------------------------------------------------------------------------------------------------------
#
#check processes
pmon="`ps -eaf | grep -i pmon | grep -i ora_ | grep -v grep | wc -l`"
smon="`ps -eaf | grep -i smon | grep -i ora_ | grep -v grep | wc -l`"
dbwr="`ps -eaf | grep -i dbw | grep -i ora_ | grep -v grep | wc -l`"
lgwr="`ps -eaf | grep -i lgwr | grep -i ora_ | grep -v grep | wc -l`"
ckpt="`ps -eaf | grep -i ckpt | grep -i ora_ | grep -v grep | wc -l`"
reco="`ps -eaf | grep -i reco | grep -i ora_ | grep -v grep | wc -l`"
arch="`ps -eaf | grep -i arc | grep -i ora_ | grep -v grep | wc -l`"
processes=`echo "$pmon+$smon+$dbwr+$lgwr+$ckpt+$reco+$arch" | bc`
if [ $processes -eq 0 ]; then
echo Aborting - Database process ERROR
echo pmon=$pmon,smon=$smon,dbwr=$dbwr,lgwr=$lgwr,ckpt=$ckpt,reco=$reco,arch=$arch
mailx -s "Aborting - Database process ERROR" $email > /dev/null
exit 1
fi
#check listener
listener="`ps -eaf |grep lsnr |grep -v grep | wc -l`"
if [ $listener -eq 1 ]; then
lsnr_SIDs="`lsnrctl status | grep $2 | grep -v grep | wc -l`"
if [ $lsnr_SIDs -eq 0 ]; then
echo Aborting - Listener configuration does not match SID parameter
mailx -s "Aborting - Listener configuration does not match SID parameter" $email > /dev/null
exit 1
fi
else
i=0
while [ $listener -eq 0 ]; do
if [ $i -gt 1 ]; then
echo Aborting - Listener is DOWN - failed to restart
echo Check $TNS_ADMIN/listener.ora configuration
mailx -s "Aborting - Listener is DOWN - failed to restart" $email > /dev/null
exit 1
fi
lsnrctl start
listener="`ps -eaf |grep lsnr |grep -v grep | wc -l`"
i=`echo "$i+1" | bc`
done
fi
#check database state
$ORACLE_HOME/bin/sqlplus $3<<!
connect internal;
set termout off echo off feed off trimspool on head off pages 0;
spool $ORACLE_SBIN/logs/databaseAlive.log;
select name from v$database;
select open_mode from v$database;
spool off;
disconnect;
exit;
!
dbState="`cat $ORACLE_SBIN/logs/databaseAlive.log |grep READ |grep -v grep | wc -l`"
if [ $dbState -eq 0 ]; then
echo Aborting - database for SID $2 not open
mailx -s "Aborting - database for SID $2 not open" $email > /dev/null
exit 1
fi
The script below will detect and email ORA-00600 errors in the Oracle database alter log file.
#!/bin/ksh
#
# ---------------------------------------------------------------------------------------------------------------------------------
#
export ORACLE_BASE=/<mount-point-1>/oracle
export ORACLE_HOME=$ORACLE_BASE/product/8.1.7
export TNS_ADMIN=$ORACLE_HOME/network/admin
export ORACLE_SBIN=$ORACLE_HOME/sbin
export PATH=/bin:/usr/bin:/etc:/usr/ccs/bin:/usr/openwin/bin:/usr/ucb:/usr/local/bin:/usr/sbin:/usr/X/bin:$JAVA_HOME
export host=`hostname`
export email=$1
USAGE="$0: Incorrect arguments, Usage: $0 <email>"
if [ -z "$email" ]; then
echo "$USAGE"
mailx -s "$USAGE" emergencyEMail@whatever.com < /dev/null
exit 1
elif [ `whoami` != 'oracle' ]; then
echo "$0 aborted - user `whoami` is incorrect, must be user root"
mailx -s "$0 aborted - user `whoami` is incorrect, must be user root" $email < /dev/null
exit 1
fi
if [ -z "$PATH" ] || [ -z "$host" ] || [ -z "$ORACLE_BASE" ] || [ -z "$ORACLE_HOME" ]\
|| [ -z "$password" ] || [ -z "$ORACLE_SID" ] || [ -z "$tnsname" ]; then
echo "$0 aborted - variable not defined"
mailx -s "$0 aborted - variable not defined" $email < /dev/null
exit 1
fi
echo "Executing $0 on $host"
#
# ---------------------------------------------------------------------------------------------------------------------------------
#
err="`tail -50 $ORACLE_ALERT/alert_$ORACLE_SID.log | grep ORA-00600 | grep -v grep | wc -l`"
if [ $err -gt 0 ]; then
tail -50 $ORACLE_ALERT/alert_$ORACLE_SID.log | grep ORA-00600 | grep -v grep > $ORACLE_SBIN/logs/ora600.log
sub="Script $0 on $host detected ORA-00600 for SID $ORACLE_SID"
echo $sub
mailx -s "$sub" $email < $ORACLE_SBIN/logs/ora600.log
exit 1
fi
exit 0
This script is intended for a database with non-AutoExtensible tablespace databases. The highlighted sections apply to a standby database only.
#!/bin/ksh
#
# ---------------------------------------------------------------------------------------------------------------------------------
#
ORACLE_BASE=/<mount-point-1>/oracle
ORACLE_HOME=$ORACLE_BASE/product/8.1.7
LD_LIBRARY_PATH=$ORACLE_HOME/lib:/usr/openwin/lib
JAVA_HOME=$ORACLE_BASE/jre/1.1.8
TNS_ADMIN=/<mount-point-1>/oracle/product/8.1.7/network/admin
ORACLE_SID=<SID>
ORACLE_DBF1=/<moint-point-1>/oracle/oradata/$ORACLE_SID
ORACLE_DBF2=/<moint-point-2>/oracle/oradata/$ORACLE_SID
ORACLE_BACKUPS=/<moint-point-2>/backups
ORACLE_SBIN=$ORACLE_HOME/sbin
ORACLE_ALERT=$ORACLE_BASE/admin/$ORACLE_SID/bdump
PATH=/bin:/usr/bin:/etc:/usr/ccs/bin:/usr/openwin/bin:/usr/ucb:/usr/local/bin:/usr/sbin:/usr/X/bin:$JAVA_HOME
PATH=${PATH}:$ORACLE_HOME/bin:$ORACLE_HOME/sbin
host=`hostname`
if [ -z "$ORACLE_BASE" ] || [ -z "$ORACLE_HOME" ] || [ -z "$LD_LIBRARY_PATH" ] || [ -z "$JAVA_HOME" ]\
|| [ -z "$TNS_ADMIN" ] || [ -z "$ORACLE_SID" ] || [ -z "$ORACLE_DBF1" ] || [ -z "$ORACLE_DBF2" ] || [ -z "$ORACLE_BACKUPS" ]\
|| [ -z "$ORACLE_SBIN" ] || [ -z "$ORACLE_ALERT" ] || [ -z "$PATH" ] || [ -z "$host" ]; then
echo "$0 aborted - variable not defined"
mailx -s "$0 aborted - variable not defined" $email < /dev/null
exit 1
fi
USAGE="$0: Incorrect arguments, Usage: $0 <\"email1 email2 ... emailn\">"
if [ `whoami` != 'oracle' ]; then
echo "$0 aborted - user `whoami` is incorrect, must be user oracle"
mailx -s "$0 aborted - user `whoami` is incorrect, must be user oracle" $email < /dev/null
exit 1
elif [ -z "$1" ]; then
echo "$USAGE"
mailx -s "$USAGE" name@xyz.com < /dev/null
exit 1
fi
email=$1
echo "Executing $0 on $host"
#
# ---------------------------------------------------------------------------------------------------------------------------------
#
panic=95
scream=99
$ORACLE_HOME/bin/svrmgrl<<!
connect internal;
alter database open read only;
exit
!
$ORACLE_HOME/bin/sqlplus system/<password><<!
set term off echo off feedback off show off trim off trims off verify off linesize 132;
spool $ORACLE_SBIN/logs/tablespace.log;
SELECT 'Tablespace '||df.tablespace_name "TBS"
,round((sum(nvl(fs.bytes,0))/ (df.bytes)) * 100) "%Free"
,round(((df.bytes - sum(nvl(fs.bytes,0))) / (df.bytes) ) * 100) "%Used"
,round(sum(nvl(fs.bytes/1024/1024,0))) "Mb Free"
,round(df.bytes/1024/1024 - sum(nvl(fs.bytes/1024/1024,0))) "Mb Used"
,df.autoextensible "AutoExtensible"
FROM dba_free_space fs,
dba_data_files df
WHERE fs.file_id(+) = df.file_id
GROUP BY df.tablespace_name, df.file_id, df.bytes, df.autoextensible
ORDER BY df.file_id;
SELECT round(sum(round(sum(nvl(fs.bytes/1024/1024,0)))) / sum(round(sum(nvl(fs.bytes/1024/1024,0))) + round(df.bytes/1024/1024 - sum(nvl(fs.bytes/1024/1024,0)))) * 100, 0) "%Free"
,round(sum(round(df.bytes/1024/1024 - sum(nvl(fs.bytes/1024/1024,0)))) / sum(round(sum(nvl(fs.bytes/1024/1024,0))) + round(df.bytes/1024/1024 - sum(nvl(fs.bytes/1024/1024,0)))) * 100, 0) "%Used"
,sum(round(sum(nvl(fs.bytes/1024/1024,0)))) "Mb Free"
,sum(round(df.bytes/1024/1024 - sum(nvl(fs.bytes/1024/1024,0)))) "Mb Used"
,sum(round(sum(nvl(fs.bytes/1024/1024,0))) + round(df.bytes/1024/1024 - sum(nvl(fs.bytes/1024/1024,0)))) "Size"
FROM dba_free_space fs,
dba_data_files df
WHERE fs.file_id(+) = df.file_id
GROUP BY df.tablespace_name, df.file_id, df.bytes, df.autoextensible
ORDER BY df.file_id;
spool off;
exit
!
$ORACLE_HOME/bin/svrmgrl<<!
connect internal;
shutdown immediate;
startup nomount;
alter database mount standby database;
exit
!
cat $ORACLE_SBIN/logs/tablespace.log | grep Tablespace | grep -v grep | grep -v SQL\> | awk '{\
if (($7 == "NO") && (int($4) > int(scream)))\
{\
print "SCREAM !!! - Non-AutoExtensible Tablespace",$2,"space in database",sid,"on",host,"@",$4"%"\
}\
else if (($7 == "NO") && (int($4) > int(panic)))\
{\
print "Panic - Non-AutoExtensible Tablespace",$2,"space in database",sid,"on",host,"@",$4"%"\
}\
}' scream=$scream panic=$panic host=$host sid=$ORACLE_SID > $ORACLE_SBIN/logs/dbfspace.log
#else if (int($4) > int(scream)) { print "SCREAM !!! - Tablespace",$2,"space in database",sid,"on",host,"@",$4"%" }\
#else if (int($4) > int(panic)) { print "Panic - Tablespace",$2,"space in database",sid,"on",host,"@",$4"%" }\
if [ -s $ORACLE_SBIN/logs/dbfspace.log ]; then
sub="Script $0 on $host in database $ORACLE_SID detected tablespace limits exceeded !!!"
echo $sub
mailx -s "$sub" $email < $ORACLE_SBIN/logs/dbfspace.log
exit 1
fi
exit 0
#!/bin/ksh
#
# ---------------------------------------------------------------------------------------------------------------------------------
#
ORACLE_HOME=/<mount-point>/oracle/product/8.1.7
LD_LIBRARY_PATH=/<mount-point>/oracle/product/8.1.7/lib:/usr/ucblib:/usr/openwin/lib
JAVA_HOME=/<mount-point>/oracle/product/jre/1.1.8
ORACLE_SID=<SID>
ORACLE_SBIN=/<mount-point>/oracle/product/8.1.7/sbin
ORACLE_ALERT=/<mount-point>/oracle/admin/sudbp/bdump
PATH=/bin:/usr/bin:/etc:/usr/ccs/bin:/usr/openwin/bin:/usr/ucb:/usr/local/bin:/usr/sbin:/usr/X/bin:$JAVA_HOME
PATH=${PATH}:$ORACLE_HOME/bin:$ORACLE_HOME/sbin
host=`hostname`
if [ -z "$ORACLE_HOME" ] || [ -z "$LD_LIBRARY_PATH" ] || [ -z "$JAVA_HOME" ]\
|| [ -z "$ORACLE_SID" ] || [ -z "$ORACLE_ALERT" ] || [ -z "$ORACLE_SBIN" ]\
|| [ -z "$PATH" ] || [ -z "$host" ]; then
echo "$0 aborted - variable not defined"
mailx -s "$0 aborted - variable not defined" $email < /dev/null
exit 1
fi
USAGE="$0: Incorrect arguments, Usage: $0 <\"email1 email2 ... emailn\">"
if [ `whoami` != 'oracle' ]; then
echo "$0 aborted - user `whoami` is incorrect, must be user oracle"
mailx -s "$0 aborted - user `whoami` is incorrect, must be user oracle" $email < /dev/null
exit 1
elif [ -z "$1" ]; then
echo "$USAGE"
mailx -s "$USAGE" "emergency@xyz.com" < /dev/null
exit 1
fi
email=$1
echo "Executing $0 on $host"
#
# ---------------------------------------------------------------------------------------------------------------------------------
#
$ORACLE_HOME/bin/sqlplus internal<<!
set head off term off echo off feedback off show off trim off trims off verify off linesize 132 escape on;
spool /u01/app/oracle/product/8.1.5/sbin/logs/archives_e4501.log;
select 'SEQUENCE '||to_char(sequence#)||' DATE '||to_char(first_time,'yyyy/mm/dd hh:mi:ss')
from v\$log_history where sequence# = (select max(sequence#)-1 from
v\$log_history);
spool off;
exit
!
cat $ORACLE_SBIN/logs/archives_production.log | grep "SEQUENCE" | grep -v grep | grep -v SQL\> | awk '{ print $1,$2,$3,$4,$5 }' > $ORACLE_SBIN/logs/archives_production.log
if [ -s $ORACLE_SBIN/logs/archives_production.log ]; then
/usr/bin/rcp -p $ORACLE_SBIN/logs/archives_production.log <Standby Database Hostname>:/<mount-point>/oracle/product/8.1.7/sbin/logs/archives_production.log
else
sub="Script $0 on $host in database $ORACLE_SID detected missing archive logs"
echo $sub
mailx -s "$sub" $email < "Error"
exit 1
fi
exit 0
#!/bin/ksh
#
# ---------------------------------------------------------------------------------------------------------------------------------
#
ORACLE_HOME=/<mount-point>/oracle/product/8.1.7
LD_LIBRARY_PATH=/<mount-point>/oracle/product/8.1.7/lib:/usr/ucblib:/usr/openwin/lib
JAVA_HOME=/<mount-point>/oracle/product/jre/1.1.8
ORACLE_SID=<SID>
ORACLE_SBIN=/<mount-point>/oracle/product/8.1.7/sbin
ORACLE_ALERT=/<mount-point>/oracle/admin/sudbp/bdump
PATH=/bin:/usr/bin:/etc:/usr/ccs/bin:/usr/openwin/bin:/usr/ucb:/usr/local/bin:/usr/sbin:/usr/X/bin:$JAVA_HOME
PATH=${PATH}:$ORACLE_HOME/bin:$ORACLE_HOME/sbin
host=`hostname`
if [ -z "$ORACLE_HOME" ] || [ -z "$LD_LIBRARY_PATH" ] || [ -z "$JAVA_HOME" ]\
|| [ -z "$ORACLE_SID" ] || [ -z "$ORACLE_ALERT" ] || [ -z "$ORACLE_SBIN" ]\
|| [ -z "$PATH" ] || [ -z "$host" ]; then
echo "$0 aborted - variable not defined"
mailx -s "$0 aborted - variable not defined" $email < /dev/null
exit 1
fi
USAGE="$0: Incorrect arguments, Usage: $0 <\"email1 email2 ... emailn\">"
if [ `whoami` != 'oracle' ]; then
echo "$0 aborted - user `whoami` is incorrect, must be user oracle"
mailx -s "$0 aborted - user `whoami` is incorrect, must be user oracle" $email < /dev/null
exit 1
elif [ -z "$1" ]; then
echo "$USAGE"
mailx -s "$USAGE" "emergency@xyz.com" < /dev/null
exit 1
fi
email=$1
echo "Executing $0 on $host"
#
# ---------------------------------------------------------------------------------------------------------------------------------
#
$ORACLE_HOME/bin/sqlplus internal<<!
set head off term off echo off feedback off show off trim off trims off verify off linesize 132 escape on;
spool $ORACLE_SBIN/logs/archives_standby.log;
select 'SEQUENCE '||to_char(sequence#)||' DATE '||to_char(first_time,'yyyy/mm/dd hh:mi:ss')
from v\$log_history where sequence# = (select max(sequence#) from v\$log_history);
spool off;
exit
!
cat $ORACLE_SBIN/logs/archives_standby.log | grep "SEQUENCE" | grep -v grep | grep -v SQL\> | awk '{ print $2 }' > $ORACLE_SBIN/logs/archives_standby.log
if [ -s $ORACLE_SBIN/logs/archives_production.log ] -a [ -s $ORACLE_SBIN/logs/archives_standby.log ]; then
export prod=`/bin/cat $ORACLE_SBIN/logs/archives_production.log`
export stby=`/bin/cat $ORACLE_SBIN/logs/archives_standby.log`
export diffs=$(($prod-$stby))
if [ $diffs -gt 2 ]; then
sub="Script $0 on $host in database $ORACLE_SID detected missing archive logs ($diffs)"
echo $sub
mailx -s "$sub" $email < /dev/null
exit 1
fi
else
sub="Script $0 on $host in database $ORACLE_SID detected missing archive logs"
echo $sub
mailx -s "$sub" $email < /dev/null
exit 1
fi
exit 0
#!/bin/ksh export ORACLE_SID=<SID> export ORACLE_HOME=/<mount-point-1>/oracle/product/8.1.7 PATH=$PATH:$ORACLE_HOME/bin export LD_LIBRARY_PATH=lib:/usr/ucblib svrmgrl << EOF connect internal recover standby database auto exit
Be extremely careful using a script such as shown below where global changes can be made.
#!/bin/ksh for file in $* do cp $file $file.old sed -e 's/find/replace/g' < $file > $file done