In my previous post, I discussed a workaround for a Shareplex startup failure caused by a leftover shared memory segment or stale processes. In today’s post, I present a shell script that can be used as an alternative way to fix the same issue.
To start with, our Shareplex instance startup is failing with the following error.
##---
##--- starting Shareplex instance ---##
##---
[oracle@mylab-01 ~]$ $SP_BIN/sp_cop -u$SP_COP_TPORT &
[1] 5317
[oracle@mylab-01 ~]$
*******************************************************
* SharePlex for Oracle Startup
* Copyright 2014 Dell, Inc.
* ALL RIGHTS RESERVED.
* Protected by U.S. Patents: 7,461,103 and 7,065,538
* Version: 8.6.2.43-m64-oracle120
* VarDir : /shareplex/pdb131_2103
* Port : 2103
*******************************************************
can't setup shared memory statistics capability - exiting
The following errors are logged in the Shareplex event_log file.
##---
##--- errors logged in event_log file ---##
##---
Error 2015-12-27 18:23:18.757312 5317 3345016656 Cop: Error cleaning up previous shared memory segment 23068680 [module shs]
Error 2015-12-27 18:23:18.757476 5317 3345016656 Cop: Cannot delete because there are users attached [module shs]
Notice 2015-12-27 18:23:18.757595 5317 3345016656 Cop: Check if SharePlex processes are running and kill them if necessary [module shs]
Notice 2015-12-27 18:23:18.757618 5317 3345016656 Cop: Remove shared-memory segments, owned by the SharePlex user, with a key value ending in '8cbf' [module shs]
Notice 2015-12-27 18:23:18.757642 5317 3345016656 Cop: Remove semaphores, owned by the SharePlex user, with key values of 0x00008cbf or 0x00108cbf [module shs]
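The key suffix named in the notice does not have to be copied by hand. The following is a minimal sketch, assuming the notice keeps the exact wording shown in the event_log excerpt above; `extract_seg_key` is a hypothetical helper name and not part of SharePlex.

```shell
#!/bin/sh
# Hypothetical helper: read event_log text on stdin and print the key suffix
# quoted in the "key value ending in '....'" notice (last occurrence wins).
# Assumption: the notice wording matches the excerpt shown in this post.
extract_seg_key() {
    sed -n "s/.*key value ending in '\([0-9a-fA-F]*\)'.*/\1/p" | tail -n 1
}

# Demo using the notice line from the event_log excerpt
sample="Notice 2015-12-27 18:23:18.757618 5317 3345016656 Cop: Remove shared-memory segments, owned by the SharePlex user, with a key value ending in '8cbf' [module shs]"
printf '%s\n' "$sample" | extract_seg_key
```

Against a real log, the file would be fed on standard input, for example `extract_seg_key < /path/to/event_log`, where the path is whatever your instance uses.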
Further, we can see stale Shareplex processes still running against the port on which we want to start the Shareplex instance.
##---
##--- Shareplex stale processes running on the system ---##
##---
[oracle@mylab-01 ~]$ ps -ef | grep sp_
oracle 5236 1 0 18:21 pts/1 00:00:00 /shareplex/splex_8.6.2_12/.app-modules/sp_ocap -u2103
oracle 5238 1 0 18:21 pts/1 00:00:00 /shareplex/splex_8.6.2_12/.app-modules/sp_xport -u2103
oracle 5239 1 0 18:21 pts/1 00:00:00 /shareplex/splex_8.6.2_12/.app-modules/sp_opst_mt -u2103
oracle 5240 1 0 18:21 pts/1 00:00:00 /shareplex/splex_8.6.2_12/.app-modules/sp_ordr -u2103
oracle 5259 1 0 18:21 pts/1 00:00:00 /shareplex/splex_8.6.2_12/.app-modules/sp_mport 0xc0a8e60c+PI+mylab-02+sp_mport+0xc0a8e60d R -u2103
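When several Shareplex instances share a host, it helps to list only the processes that belong to one port. Here is a rough sketch, assuming the `ps -ef` layout shown above (PID in field 2, command starting at field 8); `stale_sp_pids` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -ef` style text on stdin and print the PIDs of
# sp_* binaries whose arguments include exactly -u<port>.
# Assumption: the command path is field 8, as in the listing in this post.
stale_sp_pids() {
    port=$1
    awk -v flag="-u$port" '
        $8 ~ /(^|\/)sp_[a-z_]+$/ {
            for (i = 9; i <= NF; i++)
                if ($i == flag) { print $2; break }
        }'
}

# Typical use (not executed here):
#   ps -ef | stale_sp_pids 2103
```

Matching the whole `-u2103` argument, rather than grepping for a substring, avoids picking up the `grep` process itself or an instance running on a port such as 21030.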
In my previous post, we saw how to fix this issue by killing all the stale processes and then removing any shared memory segment or semaphore whose key matches the one reported in the event_log file. In today’s post, we are going to fix the same issue with a script rather than performing each step manually.
Here is the script, which I have written based on the approach discussed in the previous post. It takes two parameters. The first is the segment key suffix logged in the event_log (example: Remove shared-memory segments, owned by the SharePlex user, with a key value ending in '8cbf' [module shs]) and the second is the port on which the Shareplex instance startup is failing.
#!/bin/sh
###########################################################################
#
# Description: Script to remove orphan Shared Memory segs for Shareplex
# Author: Abu Fazal Abbas
# Version: 1.0
# Usage: sp_rem_sm_seg.sh {seg_ending_key} {sp_cop_tport}
#
# Standard Assumption:
# 1. Shareplex instance startup uses following syntax
#    $SP_HOME/bin/sp_cop -u$SP_COP_TPORT &
# 2. Script must be executed by the user running Shareplex
###########################################################################

##// Checking for script argument: exactly two are required //
if [ $# -ne 2 ]; then
    echo "Invalid Parameter"
    echo "Usage: sp_rem_sm_seg.sh {seg_ending_key} {sp_cop_tport}"
else
    echo "Segment Key provided is: $1"
    echo "Shareplex PORT is: $2"
    echo "Proceed? [Y/N]"
    read resp
    echo
    case $resp in
    y|Y)
        ##// Proceed with script execution //
        ##// reading seg key and port from script argument //
        seg_key=$1
        sp_tport=$2
        ##// current user name, parsed from `id` output of the form uid=N(name) //
        sp_owner=`id | awk '{print $1}' | awk -F "(" '{print $2}' | awk -F ")" '{print $1}'`

        ##// fetching segment IDs using ipcs //
        shm_id=`ipcs -m | grep "${seg_key}" | grep -w "${sp_owner}" | awk '{print $2}'`
        ##// Validating Segment IDs //
        if [ "${shm_id}" = "" ]; then
            echo
            echo "Could not locate any Shared Memory segment with the given key"
        else
            echo
            ##// Looping through all fetched Segment IDs //
            for x in ${shm_id}
            do
                echo "Removing Shared Memory Segment with ID: $x"
                ##// Removing all orphan segments with the matching keys //
                ipcrm -m "$x"
            done
        fi

        ##// fetching semaphore IDs using ipcs //
        sem_id=`ipcs -s | grep "${seg_key}" | grep -w "${sp_owner}" | awk '{print $2}'`
        ##// Validating Semaphore IDs //
        if [ "${sem_id}" = "" ]; then
            echo
            echo "Could not locate any Semaphore with the given key"
        else
            echo
            ##// Looping through all fetched semaphore IDs //
            for y in ${sem_id}
            do
                echo "Removing Semaphore with ID: $y"
                ##// Removing all orphan semaphores with matching keys //
                ipcrm -s "$y"
            done
        fi

        ##// fetching orphan Shareplex processes matching with SP_COP_TPORT //
        echo
        echo "Killing Shareplex Orphan Processes..."
        proc_id=`ps -ef | grep sp_ | grep -w "u${sp_tport}" | awk '{print $2}'`
        if [ "${proc_id}" = "" ]; then
            echo "No orphan processes found for PORT: ${sp_tport}"
        else
            echo
            for z in ${proc_id}
            do
                ##// Killing orphan Shareplex processes //
                echo "Killing Shareplex Orphan Process with ID: $z"
                kill -9 "$z"
            done
            echo
        fi
        echo "Clean up Completed"
        echo "Shareplex instance can be now started against port: ${sp_tport}"
        echo
        ;;
    n|N|*)
        ##// Abort script execution //
        echo "Terminating Program.."
        exit
        ;;
    esac
fi
################################## End of Script ###########################################
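The script above uses its two arguments as given. Since the key is passed straight to `grep` and the port selects which processes get `kill -9`, a typo can match more than intended. The following is a minimal sketch of optional pre-checks; `is_hex_key` and `is_port` are hypothetical helper names and are not part of the original script.

```shell
#!/bin/sh
# Hypothetical pre-checks; the original script does not validate its arguments.

# Succeeds only if $1 is a non-empty string of hexadecimal digits.
is_hex_key() {
    case $1 in
        ''|*[!0-9a-fA-F]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Succeeds only if $1 is a decimal number in the TCP port range 1-65535.
is_port() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

# Example guard that could sit before the "Proceed? [Y/N]" prompt:
#   is_hex_key "$1" && is_port "$2" || { echo "Invalid Parameter"; exit 1; }
```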
Let’s use the script to fix the problem at hand. In our case, the segment key is '8cbf', taken from the event_log, and the port on which the Shareplex instance startup is failing is 2103.
##---
##--- script will fail if parameters are not passed ---##
##---
[oracle@mylab-01 ~]$ sh sp_rem_sm_seg.sh
Invalid Parameter
Usage: sp_rem_sm_seg.sh {seg_ending_key} {sp_cop_tport}

##---
##--- executing script with seg key '8cbf' and port 2103 ---##
##---
[oracle@mylab-01 ~]$ sh sp_rem_sm_seg.sh 8cbf 2103
Segment Key provided is: 8cbf
Shareplex PORT is: 2103
Proceed? [Y/N]
Y
Removing Shared Memory Segment with ID: 23068680
Removing Shared Memory Segment with ID: 23101449
Removing Shared Memory Segment with ID: 23134218
Removing Shared Memory Segment with ID: 23166987
Removing Shared Memory Segment with ID: 23199756
Removing Semaphore with ID: 294918
Removing Semaphore with ID: 327687
Killing Shareplex Orphan Processes...
Killing Shareplex Orphan Process with ID: 5236
Killing Shareplex Orphan Process with ID: 5238
Killing Shareplex Orphan Process with ID: 5239
Killing Shareplex Orphan Process with ID: 5240
Killing Shareplex Orphan Process with ID: 5259
Clean up Completed
Shareplex instance can be now started against port: 2103
As we can see from the script output, it finds and removes all the matching shared memory segments and semaphores associated with the stale Shareplex processes. It then kills any stale Shareplex process found running against the given port number.
After running this script, we should be able to start up the Shareplex instance without any issues. Let’s start the Shareplex instance.
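The script matches the key with a plain `grep`, so it matches the suffix anywhere on the `ipcs` line, including inside an ID or size column. A stricter variant, sketched here under the assumption of the Linux `ipcs` column layout (key, ID, owner as the first three fields), compares only the end of the key column and the owner column. `ipc_ids_for_key` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `ipcs -m` or `ipcs -s` text on stdin and print the
# IDs whose key ends with the given hex suffix and whose owner matches.
# Assumption: Linux layout where $1 = key (0x...), $2 = ID, $3 = owner.
ipc_ids_for_key() {
    suffix=$1
    owner=$2
    awk -v suffix="$suffix" -v owner="$owner" '
        BEGIN { suffix = tolower(suffix); n = length(suffix) }
        $1 ~ /^0x[0-9a-fA-F]+$/ && $3 == owner {
            key = tolower($1)
            # compare only the trailing n characters of the key
            if (substr(key, length(key) - n + 1) == suffix) print $2
        }'
}

# Typical use (not executed here):
#   ipcs -m | ipc_ids_for_key 8cbf oracle
#   ipcs -s | ipc_ids_for_key 8cbf oracle
```

Running it before the cleanup also works as a dry run, since it only prints the IDs that would be passed to `ipcrm`.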
##---
##--- starting up Shareplex instance ---##
##---
[oracle@mylab-01 ~]$ $SP_BIN/sp_cop -u$SP_COP_TPORT &
[1] 5385
[oracle@mylab-01 ~]$
*******************************************************
* SharePlex for Oracle Startup
* Copyright 2014 Dell, Inc.
* ALL RIGHTS RESERVED.
* Protected by U.S. Patents: 7,461,103 and 7,065,538
* Version: 8.6.2.43-m64-oracle120
* VarDir : /shareplex/pdb131_2103
* Port : 2103
*******************************************************

##---
##--- validate Shareplex is up and running fine ---##
##---
[oracle@mylab-01 ~]$ $SP_BIN/sp_ctrl
*******************************************************
* SharePlex for Oracle Command Utility
* Copyright 2014 Dell, Inc.
* ALL RIGHTS RESERVED.
* Protected by U.S. Patents: 7,461,103 and 7,065,538
*******************************************************
sp_ctrl (mylab-01:2103)> show

Process    Source                               Target                 State                PID
---------- ------------------------------------ ---------------------- -------------------- ------
Capture    o.orppdb13_1                                                Running              5386
Read       o.orppdb13_1                                                Running              5390
Import     mylab-02                             mylab-01               Running              5407
Post       o.mypdb_01-mylab-02                  o.orppdb13_1           Running              5389
Export     mylab-01                             mylab-02               Running              5387
As expected, we are now able to start up the Shareplex instance without any issues. All the Shareplex processes are up and running.
Hope you will find the script useful!