[jacorb-developer] jacorb-developer Digest, Vol 131, Issue 7

Gotthard Witsch Gotthard.Witsch at ith-icoserve.com
Fri Feb 14 12:43:32 CET 2014


Hi Nick,

thanks for your reply.

I'm referring to the JacORB 3.3 documentation (http://www.jacorb.org/releases/3.3/ProgrammingGuide.pdf). It only says that enabling jacorb.codeset will do translation on marshalling. We expected that if client and server both expect UTF-8 and the JVM's file.encoding is UTF-8, there would be no need to translate. However, I saw that the implementation only expects ASCII if jacorb.codeset is set to off. So you're right, we have to enable this option.
It would be very helpful if that could be mentioned in the documentation, in the description of this config option.

Thanks for the initialization hint: org.omg.PortableInterceptor.ORBInitializerClass.standard_init=org.jacorb.orb.standardInterceptors.IORInterceptorInitializer. I will have a look at it.

So the only remaining problem is the wrong conversion of 4-byte UTF-8 characters in CDROutputStream.write_string, lines 917 to 920. In detail, every char of the string is converted separately with the CodeSet.write_char method.
This works correctly for characters of up to 3 bytes. But the implementation of org.jacorb.orb.giop.CodeSet.Utf8CodeSet.write_char cannot handle 4-byte characters. It treats every char above 0x07FF as a 3-byte character. However, 3-byte characters end at 0xFFFF. Everything above that is a 4-byte character, which Java represents as a pair of two chars (a surrogate pair), and the pair has to be converted as one unit. Converting each char on its own therefore results in wrongly encoded characters.
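
To illustrate the problem, here is a minimal sketch in Java. The method encodePerChar is a hypothetical stand-in that applies the rule described above (every char above 0x07FF is written as three bytes); it is not the actual JacORB code, and the class name is made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class PerCharUtf8Demo {

    // Hypothetical per-char encoder: each Java char is encoded on its own,
    // and anything above 0x07FF is treated as a 3-byte character.
    static byte[] encodePerChar(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c <= 0x007F) {
                out.write(c);
            } else if (c <= 0x07FF) {
                out.write(0xC0 | (c >> 6));
                out.write(0x80 | (c & 0x3F));
            } else {
                // Surrogate halves end up here too, which is the bug.
                out.write(0xE0 | (c >> 12));
                out.write(0x80 | ((c >> 6) & 0x3F));
                out.write(0x80 | (c & 0x3F));
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // U+1F600 lies above 0xFFFF, so Java stores it as two chars.
        String s = new String(Character.toChars(0x1F600));
        System.out.println("chars in String: " + s.length());
        System.out.println("per-char bytes:  " + encodePerChar(s).length);
        System.out.println("UTF-8 bytes:     " + s.getBytes(StandardCharsets.UTF_8).length);
    }
}
```

The per-char path emits two 3-byte sequences (one per surrogate half) where valid UTF-8 needs a single 4-byte sequence, so a receiver decoding strict UTF-8 cannot reconstruct the character.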

There's an article about handling 4-byte characters in Java that mentions some utility methods for conversion:
http://www.oracle.com/technetwork/articles/javase/supplementary-142654.html
I am currently working on a patch that can handle this. I should be able to provide some code with examples next week.
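
As a rough idea of what a code-point-aware conversion could look like, here is a sketch using String.codePointAt and Character.charCount from the standard library. This is only an illustration under my own assumptions, not the patch itself; the class and method names are hypothetical, and unpaired surrogates are deliberately not handled.

```java
import java.io.ByteArrayOutputStream;

final class CodePointUtf8Sketch {

    // Encodes by code point instead of by char, so a surrogate pair
    // is read as one value and written as one 4-byte sequence.
    // Note: an unpaired surrogate would still be written as 3 bytes here.
    static byte[] encode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp); // advances by 1 or 2 chars
            if (cp <= 0x7F) {
                out.write(cp);
            } else if (cp <= 0x7FF) {
                out.write(0xC0 | (cp >> 6));
                out.write(0x80 | (cp & 0x3F));
            } else if (cp <= 0xFFFF) {
                out.write(0xE0 | (cp >> 12));
                out.write(0x80 | ((cp >> 6) & 0x3F));
                out.write(0x80 | (cp & 0x3F));
            } else {
                out.write(0xF0 | (cp >> 18));
                out.write(0x80 | ((cp >> 12) & 0x3F));
                out.write(0x80 | ((cp >> 6) & 0x3F));
                out.write(0x80 | (cp & 0x3F));
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // 1-, 2-, 3- and 4-byte characters: a, U+00E4, U+20AC, U+1F600
        String s = "a\u00E4\u20AC\uD83D\uDE00";
        System.out.println("encoded length: " + encode(s).length);
    }
}
```

For well-formed input the result should match String.getBytes(StandardCharsets.UTF_8), which makes that call a convenient reference in a test case.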

Sincerely, Gotthard

-----Original Message-----
From: jacorb-developer-bounces~gotthard.witsch=ith-icoserve.com at lists.spline.inf.fu-berlin.de [mailto:jacorb-developer-bounces~gotthard.witsch=ith-icoserve.com at lists.spline.inf.fu-berlin.de] On behalf of jacorb-developer-request at lists.spline.inf.fu-berlin.de
Sent: Friday, 14 February 2014 12:03
To: jacorb-developer at lists.spline.inf.fu-berlin.de
Subject: jacorb-developer Digest, Vol 131, Issue 7

Today's Topics:

   1. Re: Error in String to byte conversion in CDROutputSream and
      byte to String conversion in CDRInputStream class (Nick Cross)


----------------------------------------------------------------------

Message: 1
Date: Thu, 13 Feb 2014 13:21:10 +0000
From: Nick Cross <jacorb at goots.org>
To: Discussions concerning CORBA development with JacORB
	<jacorb-developer at lists.spline.inf.fu-berlin.de>
Subject: Re: [jacorb-developer] Error in String to byte conversion in
	CDROutputSream and byte to String conversion in CDRInputStream class
Message-ID: <52FCC6C6.6080203 at goots.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed


Hi,

My reply is inline.


On 13/02/14 09:57, Gotthard Witsch wrote:
> Dear developers,
>
> As we are having problems with sending and receiving Strings with
> JacORB 3.3 in our Java application, I started to analyse your code. I
> became aware of two classes that cause errors, in my opinion:
> org.jacorb.orb.CDROutputStream and org.jacorb.orb.CDRInputStream.
> Let's start with CDROutputStream: The conversion from String to byte
> array is done in the write_string method in org.jacorb.orb.CDROutputStream.
> However, I think there are several bugs in this method: If we set
> codesetEnabled to true with the property jacorb.codeset at
> initialization, we can transmit Strings with ASCII, 2-byte and 3-byte
> characters. The transmission of 4-byte characters is not possible
> because the implementation of
> org.jacorb.orb.giop.CodeSet.Utf8CodeSet.write_char does not
> handle 4-byte characters. We've already opened a bug in your
> bug tracker but did not receive any feedback -
> http://www.jacorb.org/bugzilla/show_bug.cgi?id=969 Referring to the

I have not yet had time to look at this in detail. I would be happy to look at any pull requests you could provide.

> documentation we are not supposed to use jacorb.codeset as Java

Which part are you referring to, please?

> receives UTF-8 Strings and the CORBA backend expects UTF-8 encoded 
> Strings. So in my opinion there's no need for conversion. However if

I think if you're expecting to send characters outside of the ASCII codeset you should enable codesets.

> we turn jacorb.codeset off we can only transmit ASCII Strings; every
> other character that is not part of the ASCII standard is encoded wrongly. I
> think the problem is caused at line number 925 in
> org.jacorb.orb.CDROutputStream: value.getBytes(0, valueLength, buffer,
> pos); The getBytes method used here is deprecated because it
> does not properly convert characters into bytes. The comment on the
> preceding line says that this method is used explicitly for
> better speed. Can you tell me why this method is better than
> String.getBytes(String charsetName)?

From memory, I think it is because, with codeset translation disabled, there was no requirement to use String charset conversion, so a direct copy into the result byte array was all that was required.
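
To make the difference concrete, here is a small sketch comparing the deprecated String.getBytes(int, int, byte[], int) with a charset-aware conversion. The class and helper names are hypothetical and only serve the illustration; the deprecated method copies just the low 8 bits of each char.

```java
import java.nio.charset.StandardCharsets;

final class DeprecatedGetBytesDemo {

    // Direct copy as in the quoted line: no charset conversion,
    // only the low-order byte of each char is kept.
    @SuppressWarnings("deprecation")
    static byte[] lowBytes(String s) {
        byte[] dst = new byte[s.length()];
        s.getBytes(0, s.length(), dst, 0);
        return dst;
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String ascii = "abc";
        String euro = "\u20AC"; // EURO SIGN, outside ASCII and Latin-1
        System.out.println("abc    direct: " + hex(lowBytes(ascii)));
        System.out.println("abc    UTF-8:  " + hex(ascii.getBytes(StandardCharsets.UTF_8)));
        System.out.println("U+20AC direct: " + hex(lowBytes(euro)));
        System.out.println("U+20AC UTF-8:  " + hex(euro.getBytes(StandardCharsets.UTF_8)));
    }
}
```

For pure ASCII both paths give the same bytes, which is why the direct copy is safe (and cheap) only as long as no other characters are sent.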

> On the other side there could be a logical bug in the class
> org.jacorb.orb.CDRInputStream: In the method read_string the received
> bytes are converted to a String. In line number 1247 codesetEnabled is
> checked. If it is set to true, a String is created with the following
> statement: result = new String (buffer, start, size,
> codeSet.getName()); -> JVM decides on how to create the String! If
> codesetEnabled is set to false, then a conversion is done with a loop
> in line number 1267:  buf[i] = (char)(0xff & buffer[start + i]); After
> the conversion the String is created from the resulting char array.

I believe the idea there is that if codesets are disabled we do not want to do any translation. Hence the loop. Alternatively, if they are enabled, the internal String marshalling is utilised.
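
The two paths can be sketched side by side as follows. This is a simplified illustration, not the CDRInputStream code: the class and method names are hypothetical, and a java.nio.charset.Charset is passed in place of codeSet.getName().

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ReadStringPathsDemo {

    // Codesets disabled: each byte is widened to one char, no translation.
    static String widenBytes(byte[] buffer, int start, int size) {
        char[] buf = new char[size];
        for (int i = 0; i < size; i++) {
            buf[i] = (char) (0xff & buffer[start + i]);
        }
        return new String(buf);
    }

    // Codesets enabled: the bytes are decoded with the given charset.
    static String decode(byte[] buffer, int start, int size, Charset charset) {
        return new String(buffer, start, size, charset);
    }

    public static void main(String[] args) {
        // U+00E4 takes two bytes in UTF-8 (C3 A4).
        byte[] wire = "\u00E4".getBytes(StandardCharsets.UTF_8);
        String widened = widenBytes(wire, 0, wire.length);
        String decoded = decode(wire, 0, wire.length, StandardCharsets.UTF_8);
        System.out.println("widened length: " + widened.length());
        System.out.println("decoded length: " + decoded.length());
    }
}
```

With UTF-8 bytes on the wire, the widening loop yields one char per byte, so only the charset-aware path gives back the original character; for ASCII-only data both paths agree.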

> So if I understood your code correctly, the if block should be something
> like this:
>
> if (codesetEnabled) {
>     // convert bytes with loop from line 1267
> } else {
>     // create String and let JVM decide how to do this, with code from
>     // line 1252
> }
>
> Can you give me some feedback on whether I'm wrong or whether that's really a bug?


If you could submit a test (or, even better, adapt some existing test
cases) to demonstrate any problems, that would be great. I'd be happy to look at any pull requests / patches as well.


> When initializing JacORB we set the following properties:

Should you not also set
org.omg.PortableInterceptor.ORBInitializerClass.standard_init=org.jacorb.orb.standardInterceptors.IORInterceptorInitializer
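
As a sketch of how that could look in code, assuming the properties are collected in a java.util.Properties object: the class and method names below are hypothetical, and only a subset of the properties from this thread is shown. With JacORB on the classpath the object would typically be handed to org.omg.CORBA.ORB.init(String[], Properties); that call is left out so the example compiles without the CORBA classes.

```java
import java.util.Properties;

final class OrbPropertiesSketch {

    // Builds the ORB properties discussed in this thread (subset only).
    static Properties orbProperties() {
        Properties props = new Properties();
        props.setProperty("org.omg.CORBA.ORBClass", "org.jacorb.orb.ORB");
        props.setProperty("org.omg.CORBA.ORBSingletonClass", "org.jacorb.orb.ORBSingleton");
        // Enable codeset translation for non-ASCII strings.
        props.setProperty("jacorb.codeset", "on");
        // The additional initializer suggested above.
        props.setProperty(
            "org.omg.PortableInterceptor.ORBInitializerClass.standard_init",
            "org.jacorb.orb.standardInterceptors.IORInterceptorInitializer");
        return props;
    }

    public static void main(String[] args) {
        // Print the properties; ORB.init(args, props) would go here instead.
        orbProperties().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```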


Regards

Nick



> "ORBInitRef.NameService" "corbaloc:" + iiop:url_to_nameservice
> "org.omg.CORBA.ORBClass", "org.jacorb.orb.ORB"
> "org.omg.CORBA.ORBSingletonClass", "org.jacorb.orb.ORBSingleton"
> "jacorb.codeset", "on"
> "jacorb.security.support_ssl", "on"
> "jacorb.security.jsse.trustees_from_ks", "on"
> "jacorb.security.keystore", keyStorePath
> "jacorb.security.keystore_password", keystorepass
> "jacorb.security.ssl.client.supported_options", "60"
> "jacorb.security.ssl.client.required_options", "01"
> "jacorb.ssl.socket_factory", "org.jacorb.security.ssl.sun_jsse.SSLSocketFactory"
> "jacorb.ssl.server_socket_factory", "org.jacorb.security.ssl.sun_jsse.SSLServerSocketFactory"
> "jacorb.maxManagedBufSize", "23"
> "jacorb.reference_caching", "true"
> "jacorb.retries", "30"
> "jacorb.retry_interval", "2000"
>
> Sincerely,
>
> Gotthard Witsch Syngo Share Software Development
>
> E-Mail: gotthard.witsch at ith-icoserve.com
>
>
> _____________________________________
> ITH icoserve technology for healthcare GmbH, a Siemens company - H CX
> HS INT CES ITH, Innrain 98, 6020 Innsbruck, Austria -
> www.ith-icoserve.com - Legal form:
> limited liability company (GmbH) - Registered office: 6020 Innsbruck,
> Innrain 98 - Company register number: FN 174117f - Register court:
> Innsbruck - DVR: 0983039
>


------------------------------

_______________________________________________
jacorb-developer maillist  -  jacorb-developer at lists.spline.inf.fu-berlin.de
https://lists.spline.inf.fu-berlin.de/mailman/listinfo/jacorb-developer


End of jacorb-developer Digest, Vol 131, Issue 7
************************************************

