Another Production Issue
Just as I was returning from lunch, I received a bunch of alerts on my phone from our Microsoft SCOM log monitor saying that one of our FSCM production application server processes had crashed. The message in the log was:
PSAPPSRV.8944 (5354) [2016-11-08T13:01:45.030 user@smartpeoplesoftadmin.com (FIREFOX 49.0; WIN7) ICPanel](0) PSAFFIRM(GlobalLock : invalid memory address from E:\pt85405c-retail\peopletools\src\pssys\qdmutil.cpp 4385) failed at E:\pt85405c-retail\peopletools\src\pscmnutils\globalmem.cpp, line 233. Processing will abort.
PSAPPSRV.8944 (5354) [2016-11-08T13:01:45.031 user@smartpeoplesoftadmin (FIREFOX 49.0; WIN7) ICPanel](0) PSAFFIRM(GlobalLock : invalid memory address from E:\pt85405c-retail\peopletools\src\pssys\qdmutil.cpp 4385) failed at E:\pt85405c-retail\peopletools\src\pscmnutils\globalmem.cpp, line 233. Processing will abort.
PSPAL: Abort: PSAFFIRM(GlobalLock : invalid memory address from E:\pt85405c-retail\peopletools\src\pssys\qdmutil.cpp 4385) failed at E:\pt85405c-retail\peopletools\src\pscmnutils\globalmem.cpp, line 233. Processing will abort.
PSPAL: Abort: Location: E:\pt85405c-retail\peopletools\src\pssys\dump.cpp:970: CPSAbortObserver::OnEvent
PSPAL: Abort: Generating process state report to e:\p92fscm\appserv\P92FSCM\LOGS\PSAPPSRV.8944\process_state.txt
PSAPPSRV.8944 (5354) [2016-11-08T13:01:56.604 user@smartpeoplesoftadmin (FIREFOX 49.0; WIN7) ICPanel](0) Process aborted.
PSAPPSRV.6588 (0) [2016-11-08T13:01:58.675](0) PeopleTools Release 8.54.05 (Windows) starting. Tuxedo server is APPSRV(99)/1
PSAPPSRV.6588 (0) [2016-11-08T13:01:58.676](3) Detected time zone is Mountain Standard Time
PSAPPSRV.6588 (0) [2016-11-08T13:01:58.787](-1) CMgrCntrlImp::CreateCacheDir error locking cache directory: system errno = 32-Broken pipe
PSAPPSRV.6588 (0) [2016-11-08T13:01:58.788](0) Server failed to start
Server Failed to Start
I didn’t like the last line in the error message: “Server failed to start.” This indicates that the APPSRV server process that crashed did not automatically restart. Luckily, we boot several APPSRV server processes, so our environment was still up, but with reduced performance. So how do I get that process running again? I could restart the environment, but that would terminate all user connections, and that was not the solution I was looking for.
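Since that last line is the one that matters, it can be worth checking for it directly. Below is a minimal Python sketch that scans application server log lines for “Server failed to start” entries. It is not what our SCOM monitor does. The function name find_failed_starts and the regular expression are my own, written against the line format in the excerpt above, and I am assuming that the number after the process name (for example 6588) is the process ID and that the log file holds one entry per line.

```python
import re

# Prefix of an app server log line as it appears in the excerpt, e.g.
#   PSAPPSRV.6588 (0) [2016-11-08T13:01:58.788](0) Server failed to start
# The bracket may also contain user and browser details after the timestamp.
LINE_RE = re.compile(
    r"^(?P<process>\w+)\.(?P<pid>\d+) \(\d+\) "
    r"\[(?P<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})[^\]]*\]"
    r"\(-?\d+\) (?P<message>.*)$"
)


def find_failed_starts(lines):
    """Return (timestamp, process, pid) for each 'Server failed to start' entry."""
    failures = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("message").startswith("Server failed to start"):
            failures.append(
                (
                    match.group("timestamp"),
                    match.group("process"),
                    int(match.group("pid")),  # assumed to be the OS process ID
                )
            )
    return failures
```

To use it, pass any iterable of lines, such as an open log file. Lines that do not match the expected prefix (the PSPAL: Abort lines, for instance) are simply skipped.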
Restarting the APPSRV Process
Instead of restarting the environment I brought up PSADMIN and navigated to:
1) Application Server > 1) Administer a domain > 1) FSCM > 5) TUXEDO command line (tmadmin)
In tmadmin I issued the command “boot -g APPSRV” to boot the APPSRV processes:
As shown in the screenshot, the APPSRV process was started. tmadmin also complains about a duplicate server for the APPSRV processes in the group that were already running, but that error can be ignored.
After issuing this command, all the expected APPSRV processes were running again.
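The same tmadmin step could probably be scripted rather than typed in through PSADMIN. The following is a rough Python sketch, not something I ran against our production domain. It assumes that tmadmin is on the PATH and that the Tuxedo environment (TUXCONFIG and related variables) is already set for the domain, as it is when tmadmin is launched from PSADMIN. The function names build_tmadmin_input and boot_group are hypothetical.

```python
import subprocess


def build_tmadmin_input(group="APPSRV"):
    """Build the text fed to tmadmin on stdin: boot one server group, then quit."""
    return f"boot -g {group}\nquit\n"


def boot_group(group="APPSRV", tmadmin="tmadmin"):
    """Run tmadmin non-interactively and return (exit code, output).

    Assumption: the Tuxedo environment for the domain is already configured
    in the calling shell; this sketch does not set it up.
    """
    result = subprocess.run(
        [tmadmin],
        input=build_tmadmin_input(group),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout
```

The output should still be read by a person, because the duplicate server messages for the processes that are already running will appear here as well.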