DW.sap does not start after migration

Question: Hey

I did a migration, just on a test system, and it went pretty much like a charm. The migration is from Windows (2003) to Unix (Solaris). DB and SAP stay the same (Oracle9/SAP 4.7 Enterprise SR1).

The target system was installed from the install CDs by someone else, so it had a standard (empty) SAP database. He also put a new kernel on (2006 Q3 stack). I threw away all the DB files and just installed a DB instance (I chose "system copy with R3load").

My migration tool passed the DB load step, the recalculation of statistics, etc. At the end (steps 35 and 36, I think) there are two post-processing tasks done via RFC. The last one fails because it is unable to connect.

Upon inspection, I saw that it was SAP that failed to start up correctly. This went unnoticed by the migration tool. Indeed, starting SAP does not give an error at startup. It just says "SAP started"!

But the startsap.log says something else:
- all processes are started (ms.sap, se.sap, co.sap, dw.sap...)
- then it says "waiting for child to terminate" and "PID exited with return code 1"
- looking at the process list, I see all SAP processes but no dw.sap's.

I cannot find any other log files with more info on the startup failure.

Any ideas what it could be and how to resolve?

I've been thinking it could be the NLS_LANG variable (the export was done on a US7ASCII system and the import on a WE8DEC one). But normally the conversion (done automatically at import time) should not be a problem, right?

I'd install a new kernel, but it already has the latest one!

The only thing I didn't try is a reboot (dunno if it could be a memory thing).

I'm really out of clues here.


Answer:
Please provide logfiles like dev_disp, dev_ms and dev_w0

Answer:
That won't help, I'm afraid.

dev_ms starts up fine, but there are no entries in the dev_wXX files (it doesn't even get to that point), and dev_disp says it can't connect because of an RFC error (basically because there are no dw processes, I suppose).


Answer:

Quote: "dev_disp says it can't connect because of an rfc error (basically because there's no dw processes, I suppose)."

Show me the dev_disp. The dispatcher doesn't talk to the work processes via RFC.

Answer:
1. check /etc/hosts
2. check /etc/services
3. find the dev_w0 trace file:

find / -name dev_w0 -print

4. check the SAP profiles for obvious errors (SID, hostnames etc...)
5. try to run startsap manually as user [sid]adm
6. check if the database is up and running
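
Check 2 is easy to script. As a rough sketch (Python, not something posted in the thread), the snippet below looks for the service names an ABAP instance conventionally expects in /etc/services. The names sapdpNN, sapgwNN and sapms<SID> and the sample port numbers are the usual SAP defaults and are assumptions here; the helper names are made up for illustration.

```python
def expected_services(sid, instance_no):
    """Conventional SAP service names for one instance (assumed defaults)."""
    nn = f"{instance_no:02d}"
    return [f"sapdp{nn}", f"sapgw{nn}", f"sapms{sid}"]


def defined_services(services_text):
    """Collect service names from /etc/services-style text, skipping comments."""
    names = set()
    for raw in services_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) >= 2:
            names.add(fields[0])
            # Anything after "port/protocol" is an alias for the same service.
            names.update(fields[2:])
    return names


def missing_services(services_text, sid, instance_no):
    """Return the expected service names that the text does not define."""
    have = defined_services(services_text)
    return [n for n in expected_services(sid, instance_no) if n not in have]


# Sample content only; on the real host read /etc/services instead.
sample = """
sapdp00   3200/tcp   # dispatcher
sapgw00   3300/tcp
"""
print(missing_services(sample, "M03", 0))  # -> ['sapmsM03']
```

On the server itself the text would come from open("/etc/services").read(), with the SID and instance number taken from the instance profile.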

Answer:
As snmsee stated, provide the contents of the dev_disp file.

Also before re-starting SAP, clear the shared memory segments.

I would bet on a reboot.
_________________
Regards,
Buddha.

Answer:

Thanks a lot for the tips so far, guys!!!

I first rebooted the server to see if that would solve anything. And I must say it did something, as dev_disp has a lot more entries now. Here are the contents:

m03adm> more dev_disp

---------------------------------------------------
trc file: "dev_disp", trc level: 1, release: "640"
---------------------------------------------------

Mon Sep 18 09:07:27 2006
kernel runs with dp version 130(ext=102) (@(#) DPLIB-INT-VERSION-130)
length of sys_adm_ext is 312 bytes
sysno      00
sid        M03
systemid   370 (SUN on SPARC CPU with Solaris 2.2)
relno      6400
patchlevel 0
patchno    129
intno      20020600
make:      single threaded, ASCII, 64 bit
pid        20931

***LOG Q00=> DpSapEnvInit, DPStart (00 20931) [dpxxdisp.c   1098]
        shared lib "dw_xml.so" version 129 successfully loaded
        shared lib "dw_xtc.so" version 129 successfully loaded
        shared lib "dw_stl.so" version 129 successfully loaded
        shared lib "dw_gui.so" version 129 successfully loaded
        shared lib "dw_mdm.so" version 129 successfully loaded
MtxInit: -2 0 0
DpSysAdmExtInit: ABAP is active
DpIPCInit2: start server >sapux1_M03_00                           <
DpShMCreate: sizeof(wp_adm)             6848    (856)
DpShMCreate: sizeof(tm_adm)             3113088 (15488)
DpShMCreate: sizeof(wp_ca_adm)          26400   (88)
DpShMCreate: sizeof(appc_ca_adm)        8800    (88)
DpShMCreate: sizeof(comm_adm)           216000  (432)
DpShMCreate: sizeof(vmc_adm)            0       (504)
DpShMCreate: sizeof(wall_adm)           (22440/36712/80/104)
DpShMCreate: SHM_DP_ADM_KEY             (addr: ffffffff72300000, size: 3436840)
DpShMCreate: allocated sys_adm at ffffffff72300000
DpShMCreate: allocated wp_adm at ffffffff723018d8
DpShMCreate: allocated tm_adm_list at ffffffff72303398
DpShMCreate: allocated tm_adm at ffffffff723033c0
DpShMCreate: allocated wp_ca_adm at ffffffff725fb440
DpShMCreate: allocated appc_ca_adm at ffffffff72601b60
DpShMCreate: allocated comm_adm_list at ffffffff72603dc0
DpShMCreate: allocated comm_adm at ffffffff72603dd8
DpShMCreate: allocated vmc_adm_list at ffffffff72638998
DpShMCreate: system runs without vmc_adm
DpShMCreate: allocated ca_info at ffffffff726389c0
DpShMCreate: allocated wall_adm at ffffffff726389c8
MBUF state OFF
EmInit: MmSetImplementation( 2 ).
*** ERROR => shmget(10051,16871044,2016) (22: Invalid argument) [shmux.c      1511]
*** ERROR => ICreateAdmin shared initialization failed: 16871044 rc=6 [emxxi.c      198]
*** ERROR => DpEmInit: EmInit (6) [dpxxdisp.c   8023]
*** ERROR => DpMemInit: DpEmInit (-1) [dpxxdisp.c   7962]
*** DP_FATAL_ERROR => DpSapEnvInit: DpMemInit
*** DISPATCHER EMERGENCY SHUTDOWN ***
increase tracelevel of WPs
*** ERROR => DpWpKill: illegal pid (-1,5) [dpxxtool.c   2409]
*** ERROR => DpWpKill: illegal pid (-1,5) [dpxxtool.c   2409]
*** ERROR => DpWpKill: illegal pid (-1,5) [dpxxtool.c   2409]
*** ERROR => DpWpKill: illegal pid (-1,5) [dpxxtool.c   2409]
*** ERROR => DpWpKill: illegal pid (-1,5) [dpxxtool.c   2409]
*** ERROR => DpWpKill: illegal pid (-1,5) [dpxxtool.c   2409]
*** ERROR => DpWpKill: illegal pid (-1,5) [dpxxtool.c   2409]
*** ERROR => DpWpKill: illegal pid (-1,5) [dpxxtool.c   2409]
NiWait: sleep (10000 msecs) ...
NiISelect: timeout 10000 ms
NiISelect: maximum fd=1
NiISelect: read-mask is NULL
NiISelect: write-mask is NULL
Mon Sep 18 09:07:37 2006
NiISelect: TIMEOUT occured (10000 ms)
dump system status
Workprocess Table (long)                        Mon Sep 18 07:07:37 2006
========================

No Ty. Pid      Status  Cause Start Err Sem CPU    Time  Program  Cl  User         Action                    Table
-----------------------------------------------------------------------------------------------------------------------
*** ERROR => DpRqTxt: bad rqtype -1 [dpxxrq.c     742]
 0 ?         -1 Free          no      0   0             0

*** ERROR => DpRqTxt: bad rqtype -1 [dpxxrq.c     742]
 1 ?         -1 Free          no      0   0             0

*** ERROR => DpRqTxt: bad rqtype -1 [dpxxrq.c     742]
 2 ?         -1 Free          no      0   0             0

*** ERROR => DpRqTxt: bad rqtype -1 [dpxxrq.c     742]
 3 ?         -1 Free          no      0   0             0

*** ERROR => DpRqTxt: bad rqtype -1 [dpxxrq.c     742]
 4 ?         -1 Free          no      0   0             0

*** ERROR => DpRqTxt: bad rqtype -1 [dpxxrq.c     742]
 5 ?         -1 Free          no      0   0             0

*** ERROR => DpRqTxt: bad rqtype -1 [dpxxrq.c     742]
 6 ?         -1 Free          no      0   0             0

*** ERROR => DpRqTxt: bad rqtype -1 [dpxxrq.c     742]
 7 ?         -1 Free          no      0   0             0

Dispatcher Queue Statistics                     Mon Sep 18 07:07:37 2006
===========================

+------+--------+--------+--------+--------+--------+
|  Typ |    now |   high |    max | writes |  reads |
+------+--------+--------+--------+--------+--------+
| NOWP |      0 |      0 |   2000 |      0 |      0 |
+------+--------+--------+--------+--------+--------+
|  DIA |      0 |      0 |   2000 |      0 |      0 |
+------+--------+--------+--------+--------+--------+
|  UPD |      0 |      0 |   2000 |      0 |      0 |
+------+--------+--------+--------+--------+--------+
|  ENQ |      0 |      0 |   2000 |      0 |      0 |
+------+--------+--------+--------+--------+--------+
|  BTC |      0 |      0 |   2000 |      0 |      0 |
+------+--------+--------+--------+--------+--------+
|  SPO |      0 |      0 |   2000 |      0 |      0 |
+------+--------+--------+--------+--------+--------+
|  UP2 |      0 |      0 |   2000 |      0 |      0 |
+------+--------+--------+--------+--------+--------+


max_rq_id               0
wake_evt_udp_now        0

wake events             total     0,  udp     0 (  0%),  shm     0 (  0%)
since last update       total     0,  udp     0 (  0%),  shm     0 (  0%)


Dump of tm_adm structure:                       Mon Sep 18 07:07:37 2006
=========================

Term    uid  man user    term   lastop  mod wp  ta   a/i (modes)

Workprocess Comm. Area Blocks                   Mon Sep 18 07:07:37 2006
=============================

Slots: 300, Used: 0, Max: 0
+------+--------------+----------+-------------+
|   id | owner        |   pid    | eyecatcher  |
+------+--------------+----------+-------------+

NiWait: sleep (5000 msecs) ...
NiISelect: timeout 5000 ms
NiISelect: maximum fd=1
NiISelect: read-mask is NULL
NiISelect: write-mask is NULL
Mon Sep 18 09:07:42 2006
NiISelect: TIMEOUT occured (5000 ms)
DpHalt: shutdown server >sapux1_M03_00                           < (normal)
DpJ2eeDisableRestart
Switch off Shared memory profiling
ShmProtect( 57, 3 )
ShmKeySharedMMU( 57 ) = 0 (octal)
ShmProtect(SHM_PROFILE, SHM_PROT_RW
ShmProtect( 57, 1 )
ShmKeySharedMMU( 57 ) = 0 (octal)
ShmProtect(SHM_PROFILE, SHM_PROT_RD
DpWakeUpWps: wake up all wp's
Stop work processes...
Terminate gui connections
not attached to the message server
SigIGenAction (pid=20931)
SigIRegisterRoutine: handler for signal 14 installed (DpSigAlrm)
signal 11 was in UNBLOCKED mode
------------------ C-STACK ----------------------
[0] EsCleanup ( 0x104705660, 0x1, 0x0, 0x10448ea30, 0x0, 0x0 ), at 0x101b6ee28
[1] DpHalt ( 0x0, 0x100000000, 0x26e8c00, 0x102a3a700, 0x0, 0x2 ), at 0x10019a188
[2] DpFatalErr ( 0x0, 0x1, 0x1026d6000, 0x1, 0x3, 0x1 ), at 0x1001b98ec
[3] DpSapEnvInit ( 0x102245c00, 0x26d6000, 0x1400, 0x1026d6130, 0x14, 0x1ef400 ), at 0x10017dde4
[4] DpMain ( 0x1, 0x26d5c00, 0x0, 0xffffffff7ffffac8, 0x1026d5c00, 0x100000000 ), at 0x10017c754
-------------------------------------------------
Mon Sep 18 09:07:45 2006
***LOG Q0E=> SigIGenAction, signal ( 11) [sigux.c      898]
SigIGenAction: call exithandler DpHalt(FALSE)
DpHalt: shutdown server >sapux1_M03_00                           < (normal)
DpJ2eeDisableRestart
Switch off Shared memory profiling
ShmProtect( 57, 3 )
ShmKeySharedMMU( 57 ) = 0 (octal)
ShmProtect(SHM_PROFILE, SHM_PROT_RW
DpShmPrfSwitch : State already set (OFF)
ShmProtect( 57, 1 )
ShmKeySharedMMU( 57 ) = 0 (octal)
ShmProtect(SHM_PROFILE, SHM_PROT_RD
DpWakeUpWps: wake up all wp's
Stop work processes...
Terminate gui connections
not attached to the message server


A problem with shared memory? That would be very weird. As I said, this SAP was running fine before I re-installed its DB server. And with the same kernel too, for that matter.

Ah yes, some other info:
- there's also a big core file in the work directory now.
- the dev_wX files are still empty and untouched.
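
The line that matters in the trace is the failed shmget() call. Here is a small sketch (Python, purely illustrative, with a made-up helper name) that pulls the numbers out of that line. It assumes the third argument is printed in decimal and that errno 22 has the same name on the machine running the script as on Solaris.

```python
import errno
import re

# Matches the dispatcher's shmget error up to the errno text; the trailing
# "[shmux.c ...]" source reference is deliberately left out of the pattern.
SHMGET_ERROR = re.compile(
    r"\*\*\* ERROR => shmget\((\d+),(\d+),(\d+)\) \((\d+): ([^)]*)\)"
)

# Flag values from <sys/ipc.h> (same on Solaris and Linux).
IPC_CREAT = 0o1000
IPC_EXCL = 0o2000


def parse_shmget_error(line):
    """Return the fields of a failed shmget() trace line, or None."""
    match = SHMGET_ERROR.search(line)
    if match is None:
        return None
    key, size, flags, err, text = match.groups()
    flags = int(flags)  # assumption: the trace prints the flags in decimal
    return {
        "key": int(key),
        "size_bytes": int(size),
        "create": bool(flags & IPC_CREAT),
        "exclusive": bool(flags & IPC_EXCL),
        "mode": oct(flags & 0o777),
        "errno": int(err),
        # Looked up in the local errno table, not the Solaris one.
        "errno_name": errno.errorcode.get(int(err), "?"),
        "text": text,
    }


line = "*** ERROR => shmget(10051,16871044,2016) (22: Invalid argument)"
info = parse_shmget_error(line)
print(info)
print(f"requested {info['size_bytes'] / 1024 ** 2:.1f} MiB")
```

Read this way, the dispatcher asked the OS to create a new segment of roughly 16 MB and got EINVAL back. One documented cause of EINVAL from shmget() is a requested size outside the system's shared memory limits, so the kernel's shared memory settings and whatever else is already holding shared memory on the box are worth a look.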



I sometimes end up with the weirdest errors. Maybe I should consider a career in gardening...


Answer:
All right! Found it! Apparently the DB load also puts a new Oracle SPFILE in place, which roughly doubles the SGA size! And that's something our test server could not handle!
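
To see a change like that in numbers, the SGA-related parameters of two parameter files can be compared. This is a minimal sketch (Python), assuming each SPFILE has first been dumped to text (for example with Oracle's CREATE PFILE FROM SPFILE). The parameter values below are invented for illustration and are not the ones from this system, and adding up the pools only approximates the real SGA size.

```python
import re

UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

# Oracle 9i memory parameters that make up most of the SGA.
SGA_COMPONENTS = ("db_cache_size", "shared_pool_size", "large_pool_size",
                  "java_pool_size", "log_buffer")


def parse_size(value):
    """Convert '800M', '16K' or a plain byte count to bytes."""
    match = re.fullmatch(r"(\d+)\s*([KMG]?)", value.strip().upper())
    if match is None:
        raise ValueError(f"not a size: {value!r}")
    return int(match.group(1)) * UNITS[match.group(2)]


def read_parameters(pfile_text):
    """Parse 'name=value' lines, dropping prefixes such as '*.' or 'M03.'."""
    params = {}
    for raw in pfile_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        params[name.strip().lower().rsplit(".", 1)[-1]] = value.strip()
    return params


def sga_estimate(pfile_text):
    """Rough SGA size: the larger of sga_max_size and the sum of the pools."""
    params = read_parameters(pfile_text)
    pools = sum(parse_size(params[n]) for n in SGA_COMPONENTS if n in params)
    limit = parse_size(params["sga_max_size"]) if "sga_max_size" in params else 0
    return max(pools, limit)


# Hypothetical values, for illustration only.
old_pfile = "*.db_cache_size=400M\n*.shared_pool_size=200M\n"
new_pfile = ("*.sga_max_size=1200M\n"
             "*.db_cache_size=800M\n*.shared_pool_size=400M\n")

ratio = sga_estimate(new_pfile) / sga_estimate(old_pfile)
print(f"old {sga_estimate(old_pfile)} bytes, new {sga_estimate(new_pfile)} bytes, x{ratio:.1f}")
```

If the estimate for the new file does not fit into the memory and shared memory limits of the target host, the parameters need to be scaled down before the instance and SAP are started.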

Thanks a lot guys! I would never have figured it was a memory problem! (I was thinking of an issue with the DDIC password in client 000...)

(puts thumb up)