Question:
Hi all,
I am working on a 4.5B system and I need to validate a field for double-byte characters.
Can anybody please tell me how to find out whether a character is a double-byte character or not?
Thanks in Advance.
Regards,
Mahesh
Answer:
Mahesh, I had the same problem a couple of weeks ago and tried to find a solution together with some colleagues. Unfortunately, we came to the conclusion that there is no foolproof way to determine whether you are looking at double-byte or single-byte content in a field. You'd basically need to know beforehand what you are looking at, which is a typical catch-22...
We tried several things: we looked at a string in two-byte intervals, converted that to hex (and later binary), tried adding up the hex/binary values, and so on. Unfortunately, a double-byte value for, e.g., a Traditional Chinese character could also be two perfectly valid characters in another language. Only the code page in use (the one for the GUI/user) will properly translate the field content into whatever is needed. The "best" I could come up with was a kind of "probability index" of how likely it was that the field contains double-byte content (the number of "high" hex values relative to the number of characters used in the field).
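To make the "probability index" idea concrete, here is a minimal sketch in Python rather than ABAP, purely as an illustration outside the SAP system. The function name high_byte_ratio is made up for this example, and "high" is assumed to mean any byte from 0x80 upwards. It is only a heuristic: a Latin-1 name full of umlauts scores above zero as well.

```python
def high_byte_ratio(data: bytes) -> float:
    """Share of bytes >= 0x80 in the used part of a field (trailing blanks ignored)."""
    used = data.rstrip(b" ")
    if not used:
        return 0.0
    # Assumption: a "high hex value" is any byte with the top bit set.
    high = sum(1 for b in used if b >= 0x80)
    return high / len(used)


plain = b"Main Street 5      "            # ASCII only, padded with blanks
umlaut = "Müller".encode("latin-1")       # single-byte text with one high byte
chinese = "中文".encode("big5")            # two double-byte characters

print(high_byte_ratio(plain))
print(high_byte_ratio(umlaut))
print(high_byte_ratio(chinese))
```

A ratio close to 1 makes double-byte content likely, a ratio of 0 rules it out, and everything in between stays a guess.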
We couldn't find anything in OSS either....
If anybody out there has a solution for this problem, I'd also very much like to know (our business case still exists, we've just shelved it for the time being).
Cheers
Baerbel
Answer:
First of all, if you are using a locale that corresponds to a single-byte code page, speaking of double-byte characters doesn't make sense at all. Then, if the locale is double-byte:
If I have a string like ABCD and I want to know whether BC is two (Latin-1) characters or one character (the first byte should then normally be 8-bit), I would try one of the following:
1. Try to split with FM STRING_SPLIT_AT_POSITION at position 2 and see how many bytes end up in the first output string. If 2 bytes, it's Latin-1; if 3 bytes, it's double-byte.
2. If you look at the source code of STRING_SPLIT_AT_POSITION, you'll see that you could also play with the SHIFT statement. I suspect that shifting 2 positions to the left will decrease the length (before the trailing spaces) by 2 bytes for Latin-1, and by 3 bytes for double-byte.
Note: it's just a theory, I never tried that. If you try it, please post the result here. Thanks!
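The byte-counting idea in point 1 can be sketched outside SAP like this. It is Python, not ABAP, and it does not call or reproduce STRING_SPLIT_AT_POSITION. The name bytes_for_chars and the rule "any byte from 0x81 to 0xFE starts a two-byte character" are simplifying assumptions for illustration; real double-byte code pages have narrower lead-byte ranges.

```python
def bytes_for_chars(data: bytes, n_chars: int, double_byte: bool) -> int:
    """Number of bytes occupied by the first n_chars characters of data."""
    used = 0
    for _ in range(n_chars):
        if used >= len(data):
            break
        # Assumption: in a double-byte locale, a byte in 0x81-0xFE is a lead byte
        # and forms one character together with the byte that follows it.
        wide = double_byte and 0x81 <= data[used] <= 0xFE and used + 1 < len(data)
        used += 2 if wide else 1
    return used


field = b"A\xdc\xc4D"   # 'A', two high bytes, 'D'

print(bytes_for_chars(field, 2, double_byte=False))   # first two characters, single-byte view
print(bytes_for_chars(field, 2, double_byte=True))    # first two characters, double-byte view
```

Under the single-byte view the first two characters take 2 bytes, under the double-byte view they take 3, which is the difference the split test is meant to expose.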
_________________
ilya
More input data: http://abaplog.wordpress.com
Sapfans ABAP FAQ: /forums/viewtopic.php?t=94198
R's ABAP Knowledge Corner: http://www.Rard-harper.net/kb/kb.html
Function modules documentation: http://www.se37.com
Answer:
The company I work with has a global installation (in one system!). So, I'm for example working from Germany using a "normal" single-byte character set and am writing a program to process data. That data may have been entered by somebody working from China using a double-byte character set, so all I'm seeing is "gibberish" with lots of special or even unprintable characters (as "my" code page can't translate it properly). If I log on at another PC which has been configured for Traditional Chinese and the corresponding code page, the data is displayed with the proper characters.
The base-data in the field - down on the hex- or binary-level is identical if I display it either during debugging or on a report. Just the "converted" signs are different for single- or double-byte display because the respective code-page used on the two PCs "knows" what to make of them.
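The effect described here is easy to reproduce outside SAP. The following Python sketch (an illustration only, using Python's built-in big5 and latin-1 codecs as stand-ins for the two front-end code pages) stores two Traditional Chinese characters and then reads the very same bytes back in both ways:

```python
raw = "中文".encode("big5")                # what gets stored: four bytes

as_single_byte = raw.decode("latin-1")    # what a Latin-1 front end would display
as_double_byte = raw.decode("big5")       # what a Big5 front end would display

print(raw.hex(), len(raw))
print(repr(as_single_byte), len(as_single_byte))
print(repr(as_double_byte), len(as_double_byte))
```

The stored bytes never change; only the number and shape of the displayed characters does (four "special" characters versus two Chinese ones).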
One area where we do have an issue is this:
Address-data can be found in either single- or double-byte languages on the same table (depending on who entered it where and with which code-page). That data is processed by globally run programs and in some cases it sure would come in handy to tell from the field content if it is single- or double-byte (e.g. when extracting data for an older system which doesn't take well to double-bytes and we'd like to exclude those records).
This is rather difficult to describe, so I'm sorry if it doesn't make a lot of sense...
Baerbel
Answer:
Baerbel wrote:
The base-data in the field - down on the hex- or binary-level is identical if I display it either during debugging or on a report. Just the "converted" signs are different for single- or double-byte display because the respective code-page used on the two PCs "knows" what to make of them.
Yes, but if you use SET LOCALE to set a Chinese locale during the runtime of your program, even if you are on your own PC (and logged on in German), the umlauts should be treated as Chinese double-byte characters. Look at what I've done in SE37:
Test for function group CSTR
Function module STRING_SPLIT_AT_POSITION
Upper/lower case
Runtime: 32 microseconds
Import parameters Value
STRING ÜÄüäßöÖ
POS 5
LANGU DE
Export parameters Value
STRING1 ÜÄüäß
STRING2 öÖ
POS_NEW 5
But:
Test for function group CSTR
Function module STRING_SPLIT_AT_POSITION
Upper/lower case
Runtime: 673 microseconds
Import parameters Value
STRING ÜÄüäßöÖ
POS 5
LANGU ZH
Export parameters Value
STRING1 ÜÄüä
STRING2 ßöÖ
POS_NEW 4
That means you can use the technique I described above to determine what you are dealing with, provided that the current locale is set accordingly.
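The two SE37 results can be mimicked with a small Python sketch. This is not the source of STRING_SPLIT_AT_POSITION, just a guess at the behaviour it shows: in a double-byte locale the split position is moved back so that no two-byte character is cut in half. The function name safe_split_pos and the lead-byte rule (any byte from 0x81 to 0xFE) are assumptions made for this illustration.

```python
def is_lead_byte(b: int) -> bool:
    # Assumption: simplified model, any byte in 0x81-0xFE starts a two-byte character.
    return 0x81 <= b <= 0xFE


def safe_split_pos(data: bytes, pos: int, double_byte: bool) -> int:
    """Largest split position <= pos that does not cut a two-byte character."""
    if not double_byte:
        return min(pos, len(data))
    i = 0
    while i < len(data):
        step = 2 if is_lead_byte(data[i]) and i + 1 < len(data) else 1
        if i + step > pos:
            break          # the next character would cross the requested position
        i += step
    return i


umlauts = "ÜÄüäßöÖ".encode("latin-1")     # seven bytes, all of them >= 0x80

print(safe_split_pos(umlauts, 5, double_byte=False))   # like LANGU DE
print(safe_split_pos(umlauts, 5, double_byte=True))    # like LANGU ZH
```

With the single-byte view the position stays at 5; with the double-byte view it drops to 4, the same values as POS_NEW in the two test runs above.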
_________________
ilya
Answer:
Ilya, thanks for your feedback and patience!
Wouldn't I still need to know beforehand which locale/language I'd have to use? The issue we have is that a program runs for "everything" and extracts changed customer addresses. Not all of the addresses will have double-byte content, but some may, and we'd like to identify those so that we don't send them to another system.
I still think that this is a catch-22: without knowing how to interpret the data, there's really no way to know what you are looking at, which would explain why SAP doesn't seem to have a "simple" function module available that we could use to determine the type of a field's content...
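A short Python sketch of why the content alone doesn't settle it (illustration only, with Python's latin-1 and big5 codecs standing in for the code pages; plausible_encodings is a made-up helper name): it simply tries each candidate code page and reports every one under which the bytes are valid.

```python
from typing import List


def plausible_encodings(data: bytes, candidates: List[str]) -> List[str]:
    """Return every candidate code page under which data decodes without error."""
    ok = []
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue       # not valid under this code page
        ok.append(name)
    return ok


raw = "中文".encode("big5")
print(plausible_encodings(raw, ["latin-1", "big5"]))
```

Both candidates come back as plausible for the same four bytes, so a check like this can at best exclude an interpretation, never confirm one.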
Baerbel
Answer:
Baerbel, you are right. However, I think what the OP wants, assuming he's logged on in Chinese, is to tell whether four bytes are four single-byte characters (as in '1234', which stays four characters even under Chinese) or actually two Chinese characters (which can be the case if the 4 bytes are what we would see as umlauts, i.e. upper ASCII, like 'ÜÖÄß').
_________________
ilya
Answer:
I'm not sure whether this will help, but cl_gui_frontend_services->Get_SapLogon_Encoding returns the code page in use at logon.
You could probably also use Registry_Get_Value from the same class to get at some sort of language-descriptive value.
_________________
Regards
R
Abap KC
SFMDR