String parameter of a callback function gets messed when passed from DLL (Rust) to Java #335

Open
Revxrsal opened this issue Sep 17, 2023 · 3 comments

Revxrsal commented Sep 17, 2023

o/ I'm migrating from JNA to JNR, and almost everything works fine. However, I've run into a really odd bug when using callback functions. I've built a minimal project that reproduces it. The JNA equivalent works fine.

Note: My native library is written in Rust

How to reproduce

  1. Install the native library (see platform artifacts)
  2. Load the library
  3. Try to call it from Java.

Java:

import jnr.ffi.LibraryLoader;
import jnr.ffi.annotations.Delegate;

public class Main {

    public interface Natives {

        void simple_callback(SimpleCallback callback);

        interface SimpleCallback {
            @Delegate
            void invoke(String value);
        }

        static Natives load() {
            return LibraryLoader
                    .create(Natives.class)
                    .load("<path to the library>");
        }
    }

    public static void main(String[] args) {
        Natives natives = Natives.load();
        for (int i = 0; i < 10; i++) {
            natives.simple_callback(System.out::println);
        }
    }
}

Rust:

use std::ffi::{c_char, CString};

/// Converts a Rust string to a NUL-terminated C string that Java can read.
pub fn to_java_string(string: &str) -> *const c_char {
    let cs = CString::new(string).unwrap();
    // Give up ownership so Rust does not free the buffer while the pointer
    // is still in use. Otherwise, we'll get a segfault.
    // (The string is intentionally leaked, as in the original as_ptr + mem::forget version.)
    cs.into_raw() as *const c_char
}

#[no_mangle]
extern "C" fn simple_callback(callback: extern "C" fn(*const c_char)) {
    let value = "Any string value";
    callback(to_java_string(value));
}

The output:

Any string value
Any string value
Any string value lG�|�  ���� $�?	 � dRTypeCache �   |�  ���� %�� �|�
Any string value
Any string value
��hNG�|�  �,0�|�
Any string value
Any string value |��|�  A���|���
Any string value
Any string value

(The corruption is different every time)
Any idea what could be causing this?

Hyperkopite commented Jul 3, 2024

Same issue here. When invoking the same C function, JNA returns the correct result, but JNR returns a string with a small amount of corrupted data.

JNR code:

import jnr.ffi.LibraryLoader;

public interface JNRUtils {
    JNRUtils INSTANCE = LibraryLoader.create(JNRUtils.class).load("QGram");

    public double calc_similarity(String str1, String str2, int q);

    public String purge_duplicated_spaces(String s);
}

C code:

char *purge_duplicated_spaces(char *str)
{
	re_length_t re_match_start;
	struct re_context *re_ctx = (struct re_context *)calloc(1, sizeof(struct re_context));

	while (true)
	{
		re_ctx->match_length = 0;
		re_match(re_ctx, "\\s+", text_args(str), &re_match_start);
		if (re_ctx->match_length == 0)
		{
			free(re_ctx);
			break;
		}

		delete_sub_str(str, re_match_start, re_ctx->match_length);  // Another function to delete some substring from a string
	}

	int p = 0;
	while(str[p] != '\0')
	{
		if (str[p] == '\1') {
			str[p] = ' ';
		}
		p++;
	}

	return str;
}

rorueda commented Aug 9, 2024

I assume the conversion is done by StringResultConverter. Looking at it, it seems the Java default charset is used to determine the width of the string terminator.

It is just a guess, but maybe your default charset results in a width > 1.
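
One way to check this guess on a given machine is to print the JVM's default charset and see how many bytes a NUL character takes in it. The snippet below is only a sketch: CharsetCheck and nulWidth are hypothetical names, not part of JNR, and encoding "\0" is not necessarily how JNR derives the width internally.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of JNR.
class CharsetCheck {

    // Number of bytes a NUL character occupies when encoded in the given charset.
    static int nulWidth(Charset charset) {
        return "\0".getBytes(charset).length;
    }

    public static void main(String[] args) {
        Charset def = Charset.defaultCharset();
        System.out.println("Default charset: " + def.name());
        System.out.println("NUL width in default charset: " + nulWidth(def));
        // Reference points from charsets every JVM is required to support.
        System.out.println("NUL width in UTF-8: " + nulWidth(StandardCharsets.UTF_8));
        System.out.println("NUL width in UTF-16LE: " + nulWidth(StandardCharsets.UTF_16LE));
    }
}
```

If the default charset is a single-byte one but the library still treats the terminator as wider than one byte, the string read from native memory can run past its real end.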

Trivaxy commented Oct 8, 2024

I got bit by this bug as well, but fortunately there's a workaround.

> I assume the conversion is done by StringResultConverter. Looking at it, it seems the Java default charset is used to determine the width of the string terminator.
>
> It is just a guess, but maybe your default charset results in a width > 1.

Yeah, this seems to be the issue. On my machine, the default charset is windows-1252, which has a terminator width of 1, but StringUtil#terminatorWidth reports 4 because that is its fallback when it doesn't recognize the charset. JNR then presumably reads past the real single-byte terminator, which leads to nasty bugs like this. JNR should probably add windows-1252 to the cases it checks (or, better yet, throw an exception when it doesn't know the terminator width of the charset).
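
To illustrate why a wrong terminator width produces trailing garbage, here is a small self-contained simulation. It does not use JNR and is not JNR's actual scanning code. TerminatorScanDemo, sampleMemory, findTerminator and decode are hypothetical names, and the "junk" bytes stand in for whatever happens to follow the string in native memory.

```java
import java.nio.charset.StandardCharsets;

// Simplified model of reading a NUL-terminated string from native memory.
class TerminatorScanDemo {

    // Builds a fake memory region: the text, one NUL byte, some junk, then zeros.
    static byte[] sampleMemory() {
        byte[] text = "Any string value".getBytes(StandardCharsets.ISO_8859_1);
        byte[] junk = {0x6C, 0x47, 0x7C, 0x21}; // arbitrary non-zero bytes
        byte[] memory = new byte[text.length + 1 + junk.length + 4];
        System.arraycopy(text, 0, memory, 0, text.length);
        // memory[text.length] stays 0: the single NUL written by the native side.
        System.arraycopy(junk, 0, memory, text.length + 1, junk.length);
        return memory; // the last 4 bytes stay 0
    }

    // Index of the first run of `width` consecutive zero bytes, or buf.length if none.
    static int findTerminator(byte[] buf, int width) {
        for (int i = 0; i + width <= buf.length; i++) {
            boolean allZero = true;
            for (int j = 0; j < width; j++) {
                if (buf[i + j] != 0) {
                    allZero = false;
                    break;
                }
            }
            if (allZero) {
                return i;
            }
        }
        return buf.length;
    }

    // Decodes everything before the terminator as a single-byte charset.
    static String decode(byte[] buf, int width) {
        return new String(buf, 0, findTerminator(buf, width), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] memory = sampleMemory();
        System.out.println("width 1: " + decode(memory, 1));
        System.out.println("width 4: " + decode(memory, 4));
    }
}
```

With a width of 1 the scan stops at the real terminator. With a width of 4 it skips the lone NUL and keeps going until it finds four zero bytes in a row, so the decoded string picks up the junk bytes as well. In real native memory the position of the next run of zeros is unpredictable, which would be consistent with the corruption being different on every run.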

You can work around this problem by telling JNR to use UTF-8 encoding for the String parameter in the callback, e.g.

public interface WrenWriteFn {
    @Delegate
    void invoke(Pointer vm, @Encoding("utf8") String text);
}

This fixed the issue for me, at least.
