Monday, February 17, 2020

Errors in 19c RAC cluster *root.sh script execution during Grid installation.

1. Before the Grid install, run runcluvfy.sh stage -pre crsinst -n node1,node2 and make sure every check in the output ends in the PASSED state.

Error seen in log file:


  2020-02-13 20:12:33: Invoking "/u10/app/19.3.0/grid/bin/cluutil -ckpt -global -oraclebase /u10/app/oracle -chkckpt -name ROOTCRS_FIRSTNODE -status"
2020-02-13 20:12:33: trace file=/u10/app/oracle/crsdata/nodemachine_1/crsconfig/cluutil9.log
2020-02-13 20:12:33: Running as user oracle: /u10/app/19.3.0/grid/bin/cluutil -ckpt -global -oraclebase /u10/app/oracle -chkckpt -name ROOTCRS_FIRSTNODE -status
2020-02-13 20:12:33: Removing file /tmp/C3gYWRsAf7
2020-02-13 20:12:33: Successfully removed file: /tmp/C3gYWRsAf7
2020-02-13 20:12:33: pipe exit code: 0
2020-02-13 20:12:33: /bin/su successfully executed


2020-02-13 20:12:33: FAIL


2020-02-13 20:12:33: The 'ROOTCRS_FIRSTNODE' status is FAILED
2020-02-13 20:12:33: Global ckpt 'ROOTCRS_FIRSTNODE' state: FAIL
2020-02-13 20:12:33: First node operations have not been done, and local node is installer node.
2020-02-13 20:12:33: Local node: nodemachine_1 is the first node.
2020-02-13 20:12:33: ORACLE_BASE is shared: 0
2020-02-13 20:12:33: Invoking "/u10/app/19.3.0/grid/bin/cluutil -ckpt -global -oraclebase /u10/app/oracle -writeckpt -name ROOTCRS_FIRSTNODE -state FAIL -nodelist nodemachine_1,nodemachine_2 -transferfile"
2020-02-13 20:12:33: trace file=/u10/app/oracle/crsdata/nodemachine_1/crsconfig/cluutil10.log
2020-02-13 20:12:33: Running as user oracle: /u10/app/19.3.0/grid/bin/cluutil -ckpt -global -oraclebase /u10/app/oracle -writeckpt -name ROOTCRS_FIRSTNODE -state FAIL -nodelist nodemachine_1,nodemachine_2 -transferfile
2020-02-13 20:12:36: Removing file /tmp/PIG92egVCe
2020-02-13 20:12:36: Successfully removed file: /tmp/PIG92egVCe
2020-02-13 20:12:36: pipe exit code: 0
2020-02-13 20:12:36: /bin/su successfully executed

2020-02-13 20:12:36:
2020-02-13 20:12:36:  "-ckpt -global -oraclebase /u10/app/oracle -writeckpt -name ROOTCRS_FIRSTNODE -state FAIL -nodelist nodemachine_1,nodemachine_2 -transferfile" succeeded with status 0.
2020-02-13 20:12:36: succeeded to write global ckpt 'ROOTCRS_FIRSTNODE' with status 'FAIL'
[root@nodemachine_1 crsconfig]#

Manually running the command that root.sh was attempting gave the same error.
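To spot the failing checkpoint quickly in a long trace, a small script can pull out the "Global ckpt ... state" lines. This is only a sketch based on the line format in the excerpt above; the function name failed_checkpoints is made up for illustration, and it takes the log lines as input because the log file name is not shown in this post.

```python
import re

# Matches lines such as:
#   2020-02-13 20:12:33: Global ckpt 'ROOTCRS_FIRSTNODE' state: FAIL
# (format assumed from the log excerpt in this post)
CKPT_STATE = re.compile(
    r"^\s*(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): "
    r"Global ckpt '(?P<name>[^']+)' state: (?P<state>\w+)"
)


def failed_checkpoints(lines):
    """Return (timestamp, checkpoint name) for each global checkpoint in state FAIL."""
    hits = []
    for line in lines:
        m = CKPT_STATE.match(line)
        if m and m.group("state") == "FAIL":
            hits.append((m.group("ts"), m.group("name")))
    return hits


sample = [
    "2020-02-13 20:12:33: The 'ROOTCRS_FIRSTNODE' status is FAILED",
    "2020-02-13 20:12:33: Global ckpt 'ROOTCRS_FIRSTNODE' state: FAIL",
    "2020-02-13 20:12:33: Local node: nodemachine_1 is the first node.",
]
print(failed_checkpoints(sample))
```

For a real log, pass the open file object instead of the sample list, since iterating a file yields its lines.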


Solution:
The issue was with the IPMI management network configuration.
1. Aborted the Grid install.
2. Deinstalled Grid (whatever portion had been installed).
3. Restarted the Grid install without the IPMI option.

After that, the Grid install completed correctly.

There will be another post on IPMI configuration.

What is IPMI?

For now: IPMI (Intelligent Platform Management Interface) provides a set of common interfaces to computer hardware and firmware that system administrators can use to monitor system health and manage the system. More details in coming posts.

How to connect database using JDBC connect string


These days databases are often in the cloud, and at times the only way to connect seems to be a JDBC-URL-style connect string that spells out the full connect descriptor.

It is very simple.

Using the SCAN name:
 sqlplus 'user1/user1@(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=myhost.mydomain.myurl)(PORT=1521)))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=PDB19C1)))'


These days SCAN is the default option.

Some people want to list each node name in the connect string, or have load balancing enabled, and still want to connect the same way as shown above. The answer is simple:


sqlplus 'user1/user1@(DESCRIPTION = (LOAD_BALANCE = on) (ADDRESS = (PROTOCOL = TCP)(HOST = hostname1)(PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = hostname2)(PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = hostname3)(PORT = 1522)) (ADDRESS = (PROTOCOL = TCP)(HOST = hostname4)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = mydatabase)))'
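Typing these long descriptors by hand makes it easy to lose a parenthesis, so here is a small sketch that assembles one from a list of hosts. The helper name build_descriptor is hypothetical, and the output omits the optional ADDRESS_LIST wrapper, just like the load-balanced example above. The same descriptor can follow the jdbc:oracle:thin:@ prefix when a JDBC URL is needed.

```python
def build_descriptor(addresses, service_name, load_balance=False):
    """Assemble an Oracle Net connect descriptor from (host, port) pairs."""
    parts = ["(DESCRIPTION="]
    if load_balance:
        parts.append("(LOAD_BALANCE=on)")
    for host, port in addresses:
        parts.append(f"(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port}))")
    # Close CONNECT_DATA and then DESCRIPTION.
    parts.append(f"(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME={service_name})))")
    return "".join(parts)


# Host names and service name are the placeholders used in this post.
desc = build_descriptor(
    [("hostname1", 1521), ("hostname2", 1521)],
    "mydatabase",
    load_balance=True,
)
jdbc_url = "jdbc:oracle:thin:@" + desc
print(desc)
print(jdbc_url)
```

The printed descriptor can be pasted after user/password@ in the sqlplus command, inside the single quotes.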




Friday, February 14, 2020

How to Calculate OS CPU Utilization From Oracle Database AWR report.

Please note: the formulas given here come from various sources I read while learning to analyze AWR reports myself. The credit goes to the many authors and blogs on AWR analysis who took the time to share their knowledge.


TOTAL CPU = NUM_CPUS * ELAPSED_TIME * 60
TOTAL CPU = 8 * 60 * 60 = 28800 seconds (8 is the number of CPU cores on the box, 60 is the elapsed time in minutes)

DB CPU : 405.14 seconds
DB time: 885.6  seconds

DB CPU is about 45.7% of DB time (405.14/885.6). The total CPU available in the system is 28800 seconds, so the database's actual CPU usage is 405.14/28800, about 1.4%.
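The same arithmetic as a quick sketch, using only the numbers quoted above (the variable names are mine, not AWR column names):

```python
num_cpus = 8            # CPU cores on the box
elapsed_minutes = 60    # AWR snapshot interval in minutes
db_cpu_s = 405.14       # "DB CPU" from the report, in seconds
db_time_s = 885.6       # "DB time" from the report, in seconds

# CPU seconds the host could have delivered during the interval.
total_cpu_s = num_cpus * elapsed_minutes * 60

db_cpu_pct_of_db_time = db_cpu_s / db_time_s * 100
db_cpu_pct_of_capacity = db_cpu_s / total_cpu_s * 100

print(f"total CPU seconds available: {total_cpu_s}")
print(f"DB CPU as % of DB time:      {db_cpu_pct_of_db_time:.1f}")
print(f"DB CPU as % of CPU capacity: {db_cpu_pct_of_capacity:.1f}")
```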
----------------

BUSY_TIME 65,579
IDLE_TIME 2,804,991

% BUSY TIME = {BUSY_TIME / (BUSY_TIME + IDLE_TIME)} * 100 = {65579 / 2870570} * 100
select (65579/(65579+2804991))*100 from dual; -- 2.28452885663822864448524160706758588016

This means that, overall, the system's CPUs were busy about 2.3% of the time.
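If it is handier to check this outside the database, the busy percentage can be recomputed from the two OS statistics like this (a sketch using the BUSY_TIME and IDLE_TIME values quoted above):

```python
busy_time = 65_579       # BUSY_TIME from the AWR operating system statistics
idle_time = 2_804_991    # IDLE_TIME from the same section

# Share of all accounted CPU time that was spent busy.
busy_pct = busy_time / (busy_time + idle_time) * 100
print(f"{busy_pct:.2f}% busy")
```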

BUSY_TIME=SYS_TIME+USER_TIME

SYS_TIME :12,835
USER_TIME : 52,272
select 12835+52272 from dual; -- 65107 (slightly below the reported BUSY_TIME of 65,579)
Calculate the % of SYS_TIME and the % of USER_TIME, using that sum as BUSY_TIME:
% SYS_TIME = (SYS_TIME/BUSY_TIME)*100 - select (12835/65107)*100 from dual; -- 19.7% system time
% USER_TIME = (USER_TIME/BUSY_TIME)*100 - select (52272/65107)*100 from dual; -- 80.3% user time
The 80.3% user time is 80.3% of the %BUSY_TIME, i.e. of the 2.3%.
The 19.7% system time is 19.7% of the %BUSY_TIME, i.e. of the 2.3%.
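A short sketch of the system/user split, again with only the SYS_TIME and USER_TIME values from the report (the variable names are mine). Note that the two percentages are shares of the busy time, not of the whole interval.

```python
sys_time = 12_835     # SYS_TIME from the AWR operating system statistics
user_time = 52_272    # USER_TIME from the same section

# Busy time rebuilt from its two parts.
busy_from_parts = sys_time + user_time

sys_pct = sys_time / busy_from_parts * 100
user_pct = user_time / busy_from_parts * 100

print(f"busy (sys + user): {busy_from_parts}")
print(f"system share of busy time: {sys_pct:.1f}%")
print(f"user share of busy time:   {user_pct:.1f}%")
```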

SYS$USERS service is the default service name used when a user session is established without explicitly identifying its service name.

Service Statistics:
---------------------------
Service Name    DB Time (s)  DB CPU (s)  Physical Reads (K)  Logical Reads (K)
SYS$USERS               581         396                  86             22,706
db_name                 304           9                  57                369
SYS$BACKGROUND            0           0                   4              4,298

Most of the DB time is spent in the SYS$USERS service; however, the CPU used by the database is very low. Further checks depend on the issue we want to investigate.

Assumption: a single database is running on the host.




Thursday, February 13, 2020

[INS-40724] No locally defined network interface matches the SCAN subnet.

Error while installing the Oracle 19c cluster (Grid software installation):

[INS-40724] No locally defined network interface matches the SCAN subnet.

Issue: None of the locally defined network interfaces has a subnet matching the SCAN subnet.

Fixes:
1. Correct any erroneous entries in the /etc/hosts file on the RAC nodes.
2. Make sure a locally defined network interface has a subnet matching the SCAN subnet, i.e. the public IP and the SCAN VIPs should be in the same subnet, and the public IP should be the primary IP on a NIC.

Recommendations:
1. The public IP and the SCAN VIPs should be in the same subnet, and the public IP should be the primary IP on a NIC.
2. The private and public network subnets have to be different. Also configure passwordless SSH for the install user if that is not done already.
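Before starting the installer, the subnet rule can be checked with a few lines of Python. This is a rough sketch: the IP addresses and netmask below are made-up examples, and same_subnet is a helper name invented for this post, so substitute the values from your own /etc/hosts and DNS.

```python
import ipaddress


def same_subnet(public_ip, netmask, scan_vips):
    """Map each SCAN VIP to True/False: is it inside the public interface's subnet?"""
    # strict=False lets us pass a host address plus its mask instead of the network address.
    network = ipaddress.ip_network(f"{public_ip}/{netmask}", strict=False)
    return {vip: ipaddress.ip_address(vip) in network for vip in scan_vips}


# Hypothetical addresses, for illustration only.
result = same_subnet(
    "192.168.10.11",
    "255.255.255.0",
    ["192.168.10.21", "192.168.10.22", "192.168.20.23"],
)
for vip, ok in result.items():
    print(vip, "OK" if ok else "NOT in the public subnet")
```

Any SCAN VIP reported as outside the public subnet is a likely candidate for the INS-40724 error.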


For more information on NIC configuration for 19c Oracle RAC, see https://docs.oracle.com/en/database/oracle/oracle-database/19/cwlin/network-checklist-for-oracle-grid-infrastructure.html

All suggestions and inputs are welcome in this regard. Thanks.