Publishing libraries in target languages

I write some caving software in JS/Java and a friend of mine writes his own in C++. We’re looking to combine forces and I’m hoping to use Haxe to do that.

For example, I want to write some file parsers in Haxe and then publish the JS output as npm packages, the Java output as Maven packages that I can use in my projects, and the C++ output for him to use in his.
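Roughly what I have in mind is one code base compiled to all three targets from a single .hxml (just a sketch; the module name and output paths below are placeholders):

```hxml
# Hypothetical build.hxml: compile the same library module to JS, Java and C++.
# cavelib.SurveyParser and the out/ paths are made-up names.
--class-path src
cavelib.SurveyParser
--js out/js/cavelib.js

--next
--class-path src
cavelib.SurveyParser
--java out/java

--next
--class-path src
cavelib.SurveyParser
--cpp out/cpp
```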

However, I get the vague impression that people mostly put the bulk of their application code in Haxe and, say, call Java code via Haxe externs, rather than keeping the bulk of their application code in Java and calling into Haxe-generated Java code. Is this the case for most users, or do some people use it to share routines between very different programs written in different languages?

For instance, the hxjava output bundles the haxe/lang support classes into the JAR, so practically speaking we can only use one Haxe-generated JAR unless we manually move haxe/lang into a separate JAR. I see the --java-lib option; I’m guessing it’s for classes our Haxe externs need to reference, rather than for haxe/lang.
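(If I understand it correctly, --java-lib pulls an existing JAR into the compilation so Haxe code and externs can reference its classes; it doesn’t split haxe/lang out of the generated output. Something like this, where the JAR name is just an example:)

```hxml
# Sketch: consume a pre-existing Java dependency while compiling to Java.
# lib/some-existing.jar is a placeholder.
--class-path src
cavelib.SurveyParser
--java-lib lib/some-existing.jar
--java out/java
```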

Has anyone used Haxe for publishing packages in target languages like I’m describing? Have any tips?


OpenFL publishes to npm; you could have a look at how it does that.

Actually, publishing to npm is probably the easiest part, and I certainly wouldn’t need/want to involve OpenFL for that.

I’m mainly wondering whether anyone has considered the use case I’m describing for the other targets, because, for example, if I compile library A to a JAR and library B to a JAR, both would contain haxe/lang classes, which would conflict. I assume this would also be a problem for C++, Python, etc., or anything else with a global package namespace.

I’m basically trying to raise awareness of the need for an easy way to share the Haxe language shim between multiple Haxe-built outputs: one shared haxe/lang JAR across multiple output JARs for Java, one shared set of runtime sources across multiple C++ outputs, and so on.


I’ve done very similar things at work and ran into some of the same limitations. I ended up building a Maven plugin, which is basically a front end that assembles an HXML file as needed throughout the Maven lifecycle, with the various options I need expressed as configuration in the pom.xml. This lets me fit Haxe builds into our existing CI/CD pipelines as part of my existing Java builds, and lets me build the necessary targets in a repeatable way. The primary artifact of this plugin is a Haxe source bundle, but depending on the target I also create secondary artifacts (e.g. JAR files for Java). Most importantly, though, it centralizes the solutions I’ve got for the problems you mentioned. For example:

if I compile library A to a JAR and library B to a JAR, both would contain haxe/lang classes, which would conflict

I get around this in a two-step process. First, I’ve got a mojo in my Maven plugin that can generate a JAR for the Haxe standard library (it generates a simple class that pulls in most of the standard library, then builds it), which is published to our internal Archiva installation. Next, when I build a project that’s intended to be a library, the Maven plugin builds it as normal, but in the packaging phase it strips out any classes that didn’t come from a source file in the current project. This is also uploaded to our Archiva instance. So a project that depends on multiple JAR artifacts can include them as regular Maven dependencies, along with the standard-library JAR. The only major drawback I’ve found with this process is that everything has to be built with the same version of Haxe, but since we run everything through CI/CD anyway this hasn’t been much of an issue for us.
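In rough outline, the standard-library JAR step looks something like this (a simplified sketch rather than the actual mojo; the class name and paths are illustrative):

```hxml
# Simplified sketch of the standard-library-only build.
# StdBundle is a generated class referencing the std types to keep;
# --macro include plus --dce no keeps the haxe package in the output,
# which is then packaged as its own JAR.
--class-path gen
--main StdBundle
--macro include('haxe', true)
--dce no
--java out/haxe-std
```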

Distributing to e.g. npm is as easy as configuring a JavaScript target and creating a package.json: in my CI/CD pipeline a Maven build generates the JS files and, assuming the tests pass, publishes them to our internal npm registry.
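The only Haxe-specific bit worth showing is that whatever you want visible to npm consumers needs to be exposed; a minimal sketch with made-up names (package.json then just points its "main" field at the generated file):

```haxe
// Minimal sketch: @:expose puts SurveyParser on the generated module's
// exports object (when one exists), so require('cavelib') can reach it.
// Package and class names are placeholders.
package cavelib;

@:expose
class SurveyParser {
    public function new() {}

    public function parse(text:String):Array<String> {
        // trivial stand-in for real parsing logic
        return text.split("\n");
    }
}
```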

So it’s definitely tricky, but ultimately very doable.

You don’t need to involve OpenFL.

We need this! Haxe was probably designed for generating applications, but some of us want to generate native libraries.

I don’t work with Java, but faced the same issue with JS and PHP.

In JS, we probably can’t avoid duplicating the Haxe standard library, because the compiler generates a single self-contained file per build. This doesn’t pose any particular problem apart from “instanceof” checks (you can end up with several definitions of the same class: the same problem you have to deal with when using several JS contexts in the same application). Just don’t use the shallow-expose compiler flag, to avoid name collisions.
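To illustrate the “instanceof” caveat (a contrived example with invented names): if LibA and LibB are two separately compiled Haxe JS bundles, each ships its own copy of the standard-library classes, so a runtime type check in one bundle won’t recognise values built by the other:

```haxe
// Contrived illustration, compiled into LibB's bundle.
// If `value` was created by LibA's copy of haxe.io.Bytes, this returns false,
// because the two bundles hold two distinct class objects at runtime.
package libb;

@:expose
class Api {
    public static function looksLikeBytes(value:Dynamic):Bool {
        return Std.isOfType(value, haxe.io.Bytes);
    }
}
```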

In PHP, we can’t avoid name collisions (apart from the php_prefix compiler flag, but if we use it we can’t avoid code duplication, so don’t use it). Creating PHP libraries with Haxe is pretty difficult. I ended up creating a dedicated Composer package providing only the Haxe standard library, plus a build step that removes the Haxe shim from the generated sources of the target library. This is essentially the same procedure as @kigero’s, but for PHP.

This only partially solves the problem: if your library has Haxe dependencies, these can also be duplicated between several PHP libraries written in Haxe and used in the same application. So you must also put your Haxe dependencies into the shared package containing the standard library.

EDIT: for those who wish to use the Composer package I created, it is available on Packagist:

+1

I’m also looking for a solution to this problem.

Basically: being able to remove classes and packages from the compiled output.

It does not seem to work when using --macro exclude('Std') or --macro exclude('haxe', true).
Maybe I’m using it wrong…

Actually, the reason I need this is that I want the output HL bytecode to contain only what’s inside Main.hx (and the other classes reachable from my main class).
I don’t want the iterators or whatever other classes I’m not using.
I also tried DCE, but it doesn’t change a thing in the output.
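For reference, this is how I’ve seen exclude written in an .hxml (a sketch; the excluded package and paths are just examples). As far as I understand it, exclude() only suppresses generation of the listed types, which are then expected to already exist on the target, while it’s --dce full that actually strips unused standard-library code:

```hxml
# Sketch: typical placement of the exclude macro in an .hxml file.
--class-path src
--main Main
--dce full
--macro exclude('haxe.ds')
--hl out/main.hl
```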

After looking into this more, my friend and I decided to write our libraries in C++ and use SWIG and Emscripten to connect them to other languages. Definitely not ideal, since it would be nice to avoid compiling the C++ for every platform we need to support when the library is used from a VM language like Java or Python, but it seems like the easier option for now.

The primary reason we decided against Haxe was that the generated C++ isn’t very idiomatic, as described in this comment, especially regarding arrays and the use of garbage collection instead of shared pointers.

I’m tempted to start writing tools to transpile a carefully limited subset of either Java or TypeScript to source code for other common languages. I think it would be the cleanest approach in the long run, but it would take a lot of work.

As for Haxe, I like a lot of things about it, but I get the impression it would require major surgery to make it convenient for publishing target-language libraries.

If you’re going to use a non-managed subset of either Java or TypeScript, you might as well use Haxe. Using macros to restrict the subset is easily accomplished (e.g. erroring on anonymous objects, arrays of primitives, Dynamic, or whatever); using them to tweak gencpp’s output (adding @:structAccess, @:stackOnly, etc. as appropriate) is fairly straightforward; and if all that fails, you can still use a custom generator, e.g. GitHub - ianharrigan/hxArduino: hxArduino - custom haxe generator to create arduino specific c++
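As a rough illustration of the “restrict the subset” idea (not production code; the class name is invented): an --macro entry point can walk every generated type and report an error for anything outside the allowed subset, for example Dynamic fields:

```haxe
import haxe.macro.Context;
import haxe.macro.Type;
import haxe.macro.TypeTools;

// Invoked from the build file with: --macro SubsetCheck.run()
class SubsetCheck {
    public static function run() {
        Context.onGenerate(function(types:Array<Type>) {
            for (t in types) switch t {
                case TInst(cls, _):
                    for (field in cls.get().fields.get()) {
                        switch TypeTools.follow(field.type) {
                            case TDynamic(_):
                                // Flag fields typed as Dynamic as outside the subset.
                                Context.error('Dynamic is outside the allowed subset: ' + field.name, field.pos);
                            case _:
                        }
                    }
                case _:
            }
        });
    }
}
```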

Haxe is extremely customizable. Given “a carefully limited subset”, it is quite manageable to generate idiomatic code in any relevant target language. In contrast, it’s doubtful that idiomatic C++ will be particularly pleasant to use from Java for example, or that calling into a C++ lib through SWIG from Java (thus allocating Java wrappers for C++ structs and such) will perform particularly well.

As for Haxe, I like a lot of things about it, but I get the impression it would require major surgery to make it convenient for publishing target-language libraries.

Hmm … considering what you’re going to use instead, I guess I’d pick that surgery. In any case, good luck :wink:


The subject comes up regularly. Haxe sounds great for this, but in practice some example projects, workflows, and documentation, and maybe tooling (compiler/libs?), would help streamline the process.

For reference, the Daff data-diff library is written in Haxe and built for multiple targets, though I haven’t tested how usable it is from a target’s point of view. HSLuv was also based on Haxe; I’m not sure how much that is still the case.

It’s worth noting that if I were to write hxArduino again from scratch now, I would do it very differently: for some reason I chose to turn the AST into another set of objects and then use that, rather than just using the AST that Haxe produces directly (which would likely make much, MUCH more sense). I think it was probably just because it was easier for me to understand at the time, but it makes things pretty brittle and hard(er) to update.

I think the biggest issue with creating libs for other languages has been C++, or hxcpp: it comes with its own GC, and that can mean a lot of things for how the resulting lib gets used. Not only will the interface likely be a little “strange” (i.e. you don’t just new things, since the GC needs to know about them), it also has implications for the lifecycle of objects.

It would be nice if you could tell hxcpp “don’t GC anything, I’ll handle memory”, but in reality I don’t know enough about the internals of hxcpp to say whether that’s even doable, and if it is, it’s likely a pretty big feat.

Ian

Java, C#, JS and others use the GC of the target language, so libs there aren’t an issue.

Huh, interesting. Are those documented anywhere beyond the descriptions in haxe/meta.json at 6ce45785cc818d0707f5fcdf447115498af93816 · HaxeFoundation/haxe · GitHub?

@:stackOnly is interesting, but we will eventually be working with graphs, and unless I misunderstand something I don’t see how circular Haxe data structures could be allocated on the stack, especially if nodes are created in a loop, for instance. So we would need reference types to come out as C++ shared pointers. Is there any way to make gencpp output shared pointers?

Also, you’re saying there’s a way to output idiomatic C++ arrays instead of Dynamic? Especially if, for instance, the source type is an array of references; a C++ array of shared pointers would need to be generated.

I was already able to call C++ from Java and JS via SWIG and it was pretty clean and easy. The only concern at the moment is how multiple SWIG-ified libraries will interact, but parts of their documentation seem to address that need.

You might want to check out Ć (cito) first; it was designed for this use case. It’s not as powerful as Haxe, but it might be sufficient for the “carefully limited subset” you already had in mind…

Incidentally, it seems we have similar desires to use Haxe (or something like it) to make language-agnostic libraries. I wrote this post recently (unaware of yours here). Absent special syntax to tell Haxe I want clean C entry points exported (plus a single function for initializing the ‘runtime’), I’ve decided I’m not going to mess with interfacing with hxcpp output, and will instead try the embedded HashLink approach. Based on what I’ve skimmed, I think that will be the cleanest and the least effort.

But really the holy grail is something like what SWIG does for C++: being able to export actual classes and instantiate them / call methods on them from other languages, essentially what COM tried to solve decades ago (and mostly failed at, IMO). Ć/cito is even better than that, since there are no language boundaries to cross in the final output. But the language itself is unfortunately a bit primitive (probably of necessity, given how many languages it targets idiomatically).

Yeah, that’s pretty cool, and surprisingly obscure; thanks for the link. It’s exactly the type of thing I was imagining building: a simple language that can output idiomatic code in other languages. I wish Ć treated all strings as Unicode and abstracted away the differences in representation across languages, though I know that’s a tall order (I’m sure dealing with charsets is difficult in SWIG as well). But it’s nice to know about!