InflateImpl Performance

I’ve written a function that reads a zip file and uses haxe.zip.InflateImpl to decompress the entries. Performance of this class varies widely between targets: for a ~70 MB zip file on my machine, C++ unzips it in about 2-3 seconds and Java in about 10 seconds, but Python takes almost 8 minutes. Am I doing something wrong, or is there a way to speed this up on Python? I’m not using any particular build flags, just building with haxe --python test.py -m ZipTest.hx. The part of my function that causes the performance hit looks like this:

var _entry:haxe.zip.Entry = ...;
var data = null;
if(_entry.compressed)
{
	var inf = new haxe.zip.InflateImpl(new haxe.io.BytesInput(_entry.data), false, false);
	var output = new haxe.io.BytesBuffer();
	var bufsize = 64*1024;
	var buf = haxe.io.Bytes.alloc(bufsize);
	while( true ) 
	{
		var len = inf.readBytes(buf,0,bufsize);
		output.addBytes(buf,0,len);
		if( len < bufsize )
			break;
	}
	data = output.getBytes();
}
else 
{
	data = _entry.data;
}

I think you should report this on GitHub, preferably with a small reproducible example and the test zip file. It could be something related to Python code generation that we could improve!

If it takes multiple seconds in C++, I’m not surprised that it takes minutes in Python.

I did expect Python to be somewhat slower, but not by this much. I’ve submitted an issue, and in the meantime I’ve worked around the problem with the following in my unzip function:

#if python
python.Syntax.code("import zipfile");
python.Syntax.code("zip = zipfile.ZipFile(path)");
python.Syntax.code("zip.extractall(dest)");
#end

It’s working well enough for me.
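
For anyone landing here later, this is roughly what the injected Python amounts to when written as plain Python, plus a per-entry alternative for cases where you need the bytes in memory rather than files on disk. It is only a sketch: the function names extract_all and inflate_raw are made up for illustration, and the second function assumes that the entry data is a headerless deflate stream, which is what passing false for the header argument of InflateImpl suggests.

```python
import zipfile
import zlib


def extract_all(path, dest):
    """Extract every entry of the archive at `path` into directory `dest`."""
    # The context manager closes the archive even if extraction fails.
    with zipfile.ZipFile(path) as archive:
        archive.extractall(dest)


def inflate_raw(data):
    """Decompress one raw deflate payload (assumed: no zlib header or checksum)."""
    # A negative wbits value tells zlib to expect a headerless deflate stream.
    return zlib.decompress(data, -15)
```

Both functions hand the work to Python's native zlib code rather than an inflate loop running in Python, which is presumably where the time goes, though I have not measured it on the archive from this thread.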