Performance CPP vs HL with Threads

I have created a program that runs some isolated tasks in threads. Since there isn’t much documentation on multithreading, I’m using messages between threads to send tasks and receive task results (worker threads, I guess?).
When I test my program, it takes about 5 seconds with the CPP backend, about 90 seconds with the C# backend, and 800+ seconds with the HL backend.
Is this an expected gap? Is HL really that much slower than CPP and even C#? I get that HL is a VM and CPP is native code but damn.
Note that I’m not bashing HL, I just want to know if using messages for communication between threads slows HL down or something like that.
Thanks for any insight!

EDIT: I can’t paste the real code, but this is conceptually what I do. (I forgot to cache the initial length of the tasks variable in this code. I do check that in the real code, so it does actually work.)

import sys.thread.Thread;

typedef StringTask = {
    stringToAnalyze: String,
    references: Array<String>
}

typedef StringResult = {
    result: String,
    workerThread: Thread
}

class ForumSample {

    static function main() {
        // This is shared by all worker threads, but they never write to it,
        // so no mutex is ok I assume?
        var references = [for (i in 0...4) ""];

        var tasks: Array<StringTask> = [for (i in 0...20) {stringToAnalyze: "", references: references}];
        var mainThread = Thread.current();
        for (i in 0...8) {
            var t = Thread.create(() -> {
                // Ignore that this will not properly terminate the thread
                while (true) {
                    var task: StringTask = Thread.readMessage(true);
                    // Do some elaborate stuff

                    var res: StringResult = {result: "", workerThread: Thread.current()};
                    // Report the result back to the main thread
                    mainThread.sendMessage(res);
                }
            });
            // "Start" the thread with first task
            t.sendMessage(tasks.pop());
        }

        var resultsCalculated = 0;
        // The real code compares against the cached initial task count;
        // tasks.length shrinks here as tasks are handed out
        while (resultsCalculated < tasks.length) {
            var res: StringResult = Thread.readMessage(true);
            // Hand the next task to the worker that just finished
            res.workerThread.sendMessage(tasks.pop());
            resultsCalculated++;
        }

        // Done
    }
}

It might help if you could share a small code snippet / benchmark.

According to my benchmark tests, HL (JIT/HLC) is slower than CPP because its GC is not very good at the moment.

I hope HL can improve its GC in the next version. I love HL.

I posted some sample code, sorry it took so long. Also, the highlighting in the forums doesn’t seem to work too well on it :)