Cython runtime and compile target #311
Comments
Yes, that would mean Kaitai/Python would require Cython, and indirectly gcc. But these are not outrageous dependencies. To run the Kaitai Python compiler, the user already needs what, Scala and whatnot? Cython is auto-installed by pip, and gcc is on pretty much every machine. I agree that benchmarks are important. |
To run ksc, one only needs a JRE, ksc itself, and probably a library of formats. |
So the end user already needs a Java compiler, the Kaitai compiler, and some libraries, not to mention Python and pip itself? But you complain about adding a C++ compiler to the list? |
A Java compiler is not needed, only the runtime, and only on the machines that are going to run ksc. But the Kaitai runtime library is needed, and Cython would be a dependency of it and/or of the ksc-compiled Python code. |
So those are dependencies for the developer's machine, not the end user's. Fair point. I just stumbled upon the ticket where XOR processing is benchmarked, and the guy essentially points out that to make it faster we need Cython/CFFI or NumPy. |
Depends on the way of distribution. If you want to compile Python from ksy all the time, then you do need those. Otherwise, if you want to use the generated Python code directly (which can be generated with a browser alone), you need basically zero dependencies. You still need the Kaitai Python runtime, but that is small enough to include in your source instead of using the pip version (if you cannot use pip for some reason, e.g. if in an enterprise you have to get approval for every dependency). I am not familiar with Cython, but if using it would change this zero-dependency setup, I would recommend creating a new target separate from Python (called e.g. Cython). |
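For context on the XOR point, here is a minimal sketch of the kind of comparison such a benchmark makes. It is not the code from that ticket and not the Kaitai runtime's implementation; the function names `xor_per_byte` and `xor_translate` are made up for illustration. It contrasts a per-byte Python loop with a variant that pushes the loop into the C-implemented `bytes.translate`, which is one way to speed things up without Cython or NumPy.

```python
import timeit


def xor_per_byte(data: bytes, key: int) -> bytes:
    """XOR every byte with a one-byte key, looping in Python."""
    return bytes(b ^ key for b in data)


def xor_translate(data: bytes, key: int) -> bytes:
    """Same result, but the per-byte work happens inside bytes.translate."""
    # Precompute the XOR of every possible byte value with the key.
    table = bytes(i ^ key for i in range(256))
    return data.translate(table)


if __name__ == "__main__":
    payload = bytes(range(256)) * 4096  # about 1 MiB of sample data
    for func in (xor_per_byte, xor_translate):
        seconds = timeit.timeit(lambda: func(payload, 0x5A), number=5)
        print(f"{func.__name__}: {seconds:.3f} s for 5 runs")
```

Actual numbers depend heavily on the interpreter (CPython vs. PyPy) and the machine, so none are quoted here.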
If you look at mitmproxy's setup.py, they just pip install kaitai. They would not even notice the difference, except in benchmarks, that is. |
Distribution is a major concern for @mitmproxy, we are shipping pre-compiled packages for Windows, Linux, and macOS. We also have wheels that should work out-of-the-box. Maybe @mhils could offer some more details on how we currently build & manage our binaries and dependencies. |
Thanks for the ping, @GreyCat! I agree that Cython is definitely interesting from a performance perspective, with the disadvantage that users need a compiler and Cython. From our @mitmproxy perspective, things look as follows:
Given 1 and 2, we'd definitely prefer if Kaitai stayed a pure Python dependency. That should not stop you guys from adopting Cython (other Kaitai users may have a need for performance), but frankly speaking I haven't seen anyone complain about Kaitai's Python performance yet. Not sure if you guys want to deal with the pain that is shipping wheels for a whole bunch of platforms (see e.g. https://pypi.python.org/pypi/Pillow/5.0.0) :-). For mitmproxy, we'd probably vendorize a pure-Python version if we get user complaints. |
Construct is going to fork versions, so to speak. 2.8 will remain pure Python, 2.9 will use Cython. |
@arekbulski, from the pyximport page you linked:
|
pyximport works by hacking Python; in that sense, they don't recommend it. Ehh. |
From what I'm understanding, there's no universal agreement and there's demand for both versions — both pure Python and Cython. What would it take to support two of them? Would we just need 2 different runtimes, or we'll need 2 targets (i.e. different code to be generated)? |
I think it's what you mean by runtime. It would require a different KaitaiStream.py. It could use (but would not need) different generated code; that would affect performance, not correctness. |
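To make the "two runtimes, same generated code" idea concrete, here is a rough sketch, assuming the compiled runtime ships as a separate optional module. The module name `hypothetical_kaitai_speedups`, the class `PureStream`, and the helper `load_stream_class` are all invented for illustration; none of them exist in the Kaitai runtime.

```python
import importlib
import struct
from io import BytesIO


class PureStream:
    """Tiny pure-Python stand-in for a stream reader (not the real KaitaiStream)."""

    def __init__(self, data: bytes):
        self._io = BytesIO(data)

    def read_u2le(self) -> int:
        # Read an unsigned 16-bit little-endian integer.
        return struct.unpack("<H", self._io.read(2))[0]


def load_stream_class(accelerated_module: str = "hypothetical_kaitai_speedups"):
    """Return the compiled stream class if its module imports, else the pure one."""
    try:
        module = importlib.import_module(accelerated_module)
    except ImportError:
        # No compiled extension installed: stay pure Python.
        return PureStream
    # Assumption: the compiled module would expose a class named "Stream".
    return getattr(module, "Stream", PureStream)


# Generated parsers would use whichever class got selected.
Stream = load_stream_class()
```

With this shape, a pure-Python install keeps working unchanged, and the compiled variant only changes speed as long as both classes expose the same methods.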
Construct just finished its compiler implementation, using Cython for speedup. There are 2 separate conclusions to be made:
|
New benchmarks show that PyPy gives a much larger speedup than Cython, therefore I withdraw the proposal. |
I have been looking at the Python runtime and examples for a while, and came to a conclusion: probably the best way to improve its performance by a significant factor is to cythonize it.
Drawbacks: the PyPI manifest would need to include cython as a dependency. This is a simple and quite reasonable tradeoff: little code added and much performance gained.
Construct is currently preparing for its 2.9 version, which will be cythonized. You might want to wait until then to see benchmark comparisons, and so I can learn/refresh how to code it. Then it will be easier for me to implement it in Kaitai as well. Kaitai benchmarks also need a complete overhaul before this.