
Project Aims For 5x Increase In Python Performance

ScuttleMonkey posted more than 5 years ago | from the do-stupid-things-faster-and-with-more-energy dept.

Programming

cocoanaut writes "A new project launched by Google's Python engineers could make the popular programming language five times faster. The project, which is called Unladen Swallow, seeks to replace the Python interpreter's virtual machine with a new just-in-time (JIT) compilation engine that is built on LLVM. The first milestone release, which was announced at PyCon, already offers a 15-25% performance increase over the standard CPython implementation. The source code is available from the Google Code web site."


Unladen Swallow (0, Troll)

sharp3 (1195261) | more than 5 years ago | (#27362995)

I like the clever name, but what is 5 pounds off of Oprah's ass? It's still huge!

Re:Unladen Swallow (3, Funny)

Rip Dick (1207150) | more than 5 years ago | (#27363013)

What if Oprah's ass got 5x smaller?

Re:Unladen Swallow (5, Funny)

davester666 (731373) | more than 5 years ago | (#27363073)

It would still be huge! :-)

insignificant (0)

Anonymous Coward | more than 5 years ago | (#27363413)

Multiplicative changes mean nothing when dealing with things like zero and infinity.

In this case, there's just no escape: you still have a black hole.

(tempting fate on that whole existence-of-Hell thing)

essentially killed the XO (0)

Anonymous Coward | more than 5 years ago | (#27363607)

One can write a GUI in Python. With enough effort, it may even be usable for light tasks, at least with an Intel Core i7 or similar. In other words, it's kind of like JavaScript, but not as fast.

Speed ups for EVE online, perhaps? (5, Insightful)

KnightElite (532586) | more than 5 years ago | (#27363011)

I hope this translates into further speed ups for EVE online down the road.

My Crow is just fine thank you (0)

Anonymous Coward | more than 5 years ago | (#27363863)

My head is already spinning when I hit the MWD so no further speedups please! Also, get out of Jita etc.

Re:My Crow is just fine thank you (1)

Rakshasa Taisab (244699) | more than 5 years ago | (#27364991)

Not even the crow beats station-spinning the Orca.

Re:Speed ups for EVE online, perhaps? (1, Interesting)

Anonymous Coward | more than 5 years ago | (#27364037)

EVE uses stackless python, which needs a completely different runtime system (libraries, interpreter, etc) than vanilla python.

Kill the GIL! (5, Informative)

GlobalEcho (26240) | more than 5 years ago | (#27363035)

The summary misses one of the best bits -- the project will try to get rid of the Global Interpreter Lock that interferes so much with multithreading.

Also, it's based on v2.6, which they are hoping will make 3.x an easy change.
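For readers who haven't run into the GIL directly, here is a minimal sketch (standard library only, names illustrative): CPU-bound threads in stock CPython finish with correct results, but only one of them executes bytecode at any moment, so there is no parallel speedup.

```python
import threading

# CPU-bound work. Under CPython's Global Interpreter Lock only one
# thread executes Python bytecode at a time, so running two of these
# in threads gives correct results but no parallel speedup.
def count_up(n, out, idx):
    total = 0
    for i in range(n):
        total += i
    out[idx] = total

results = [0, 0]
threads = [threading.Thread(target=count_up, args=(100000, results, i))
           for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)  # both threads computed sum(range(100000)) = 4999950000
```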

Re:Kill the GIL! (3, Interesting)

eosp (885380) | more than 5 years ago | (#27363131)

The summary misses one of the best bits -- the project will try to get rid of the Global Interpreter Lock that interferes so much with multithreading.

Good luck with that. The last time someone tried that, they slowed Python down by half.

slowed it down by half? (5, Funny)

Anonymous Coward | more than 5 years ago | (#27363261)

0.5x slower is like 2x faster, right? Reciprocals?

Re:slowed it down by half? (1)

Your.Master (1088569) | more than 5 years ago | (#27363615)

I interpreted it as "now the Python interpreter takes 150% as much time as it used to". The half being added, rather than multiplied.

Re:slowed it down by half? (1)

jd (1658) | more than 5 years ago | (#27363991)

Or it could mean they used half-and-half in the developer's tea, causing them to slow down.

Re:Kill the GIL! (4, Insightful)

dgatwood (11270) | more than 5 years ago | (#27363339)

The key is to find the right balance of granularity in locking. A big giant mutex is always a bad idea, but having tens of thousands of little mutexes can also be bad due to footprint bloat and the extra time needed to lock all those locks. The right balance is usually somewhere in the middle. Each lock should have a moderate level of contention---not too little contention or else you're wasting too much time in locking and unlocking the mutex relative to the time spent doing the task---not too much contention or else you're likely wasting time waiting for somebody else that is doing something that wouldn't really have interfered with what you're doing at all. Oh, and reader-writer locks for shared resources can be a real win, too, in some cases.
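As a rough sketch of that middle-ground granularity (shard count and names are illustrative, not from any real codebase): instead of one giant mutex or one lock per key, hash keys onto a small fixed pool of locks.

```python
import threading

N_SHARDS = 8  # a handful of locks: not one, not thousands
locks = [threading.Lock() for _ in range(N_SHARDS)]
shards = [dict() for _ in range(N_SHARDS)]

def increment(key):
    # Threads touching keys on different shards never contend;
    # threads on the same shard serialize only against each other.
    i = hash(key) % N_SHARDS
    with locks[i]:
        shards[i][key] = shards[i].get(key, 0) + 1

def worker():
    for i in range(100):            # bump k0..k3, 25 times each
        increment("k%d" % (i % 4))

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = sum(v for shard in shards for v in shard.values())
print(total)  # 400: no lost updates across 4 threads
```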

Re:Kill the GIL! (2, Interesting)

Viking Coder (102287) | more than 5 years ago | (#27363557)

Hrmph.

Maybe I'm just drinking kool-aide, but Software Transactional Memory sounds much, much better to me.

The "D" programming language from Digital Mars sounds very interesting, for example.

Re:Kill the GIL! (4, Interesting)

Nevyn (5505) | more than 5 years ago | (#27364017)

Then you probably want to read: Patrick Logan on why STM isn't "awesomez" [blogspot.com].

Re:Kill the GIL! (1)

Viking Coder (102287) | more than 5 years ago | (#27364801)

Nice food for thought.

I think all I really need is multi-thread safe Persistence, in my use case, with as little memory duplication as possible, of course.

Hrm - the hamster is definitely running in the wheel right now...

Re:Kill the GIL! (4, Insightful)

jd (1658) | more than 5 years ago | (#27364145)

If developers were working from a clean slate and didn't have the problems of excessive legacy code to deal with, I suspect Digital Mars' D, Inmos' Occam and Ericsson's Erlang would be the three main languages in use today.

If hardware developers were working from a clean slate, you'd probably also see a lot more use of Content Addressable Memory, Processor-In-Memory and Transputer/iWarp-style "as easy as LEGO" CPUs.

Sadly, what isn't patented was invented 30 years too late and 20 years before the technology existed to make these ideas really work, so we're stuck with neolithic monoliths in both the software and hardware departments.

(Remember, Y2K was worth tens of billions, but wasn't worth enough to get people to stop using COBOL, and that was practically dead. To get people to kick their current habits would need a kick in the mind a thousand times bigger.)

language independence (1)

Gary W. Longsine (124661) | more than 5 years ago | (#27364693)

The fascinating thing about the LLVM architecture is that you can bolt any language on the front end, and still benefit from a mountain of hardware-specific optimizations on the back end, without the need to figure them out and implement them yourself. Erlang, D, and Occam front ends for LLVM are just some code away... just a shout away, just a kiss away... kiss away... kiss away, hey, hey-ya...

Re:Kill the GIL! (0)

Anonymous Coward | more than 5 years ago | (#27363563)

I think locks are the wrong paradigm, no matter the granularity. Software transactional memory is much more pythonic.

Re:Kill the GIL! (2, Interesting)

Secret Rabbit (914973) | more than 5 years ago | (#27365063)

Or one could keep *A* GIL and largely ignore it. Here's the model I would use.

Separate Python into per-thread instances, yet keep a larger overall memory space shared between threads. But one must explicitly state that one wants to go to the global space. That way, when one runs a single-threaded application, everything is as it should be, with nothing in its way to slow it down, so those locks won't even get invoked. However, when one is programming a multi-threaded application, one has the *choice* to either keep the threads separate or to make them aware of each other and start using the GIL.

With that, I believe that one can largely have his cake and eat it too.

Re:Kill the GIL! (4, Insightful)

Just Some Guy (3352) | more than 5 years ago | (#27363757)

Good luck with that. The last time someone tried that, they slowed Python down by half.

Yes, good luck with that! Because the current implementation slows it down by 7/8ths on my 8-core server.

Re:Kill the GIL! (3, Informative)

Nevyn (5505) | more than 5 years ago | (#27363993)

That's funny, because os.fork() etc. work fine on my version of python.

Re:Kill the GIL! (1)

Just Some Guy (3352) | more than 5 years ago | (#27364127)

They work great here, too, but each process model has its place. There are times when I really, really wish I could use effective threading.

Re:Kill the GIL! (4, Informative)

Red Alastor (742410) | more than 5 years ago | (#27363925)

Good luck with that. The last time someone tried that, they slowed Python down by half.

Only because Python uses a refcounting garbage collector. When you get many threads, you need to lock all your data structures because otherwise you might collect them when they are still reachable. This project plans to change the garbage collection strategy first. Once it's done, killing the GIL is easy.

Re:Kill the GIL! (1)

CarpetShark (865376) | more than 5 years ago | (#27365237)

The summary misses one of the best bits -- the project will try to get rid of the Global Interpreter Lock that interferes so much with multithreading.

Thanks for that. I was about to say that the main issue for me is the GIL, not interpreter performance. Improvement of both is good, of course, but the GIL can be a show-stopper much more easily.

How fast is five times faster really? (5, Funny)

LingNoi (1066278) | more than 5 years ago | (#27363041)

They say five times faster however it really depends on if they're talking about a European or African Python Interpreter.

Re:How fast is five times faster really? (4, Funny)

ArsonSmith (13997) | more than 5 years ago | (#27363079)

Java spokesperson: "5x faster? We already do that."

Java spokesperson to other Java people: "(whisper) Hehe, I told them we already do that. Hehe."

Re:How fast is five times faster really? (3, Informative)

rackserverdeals (1503561) | more than 5 years ago | (#27363345)

I know you're trying to be funny, but... If you're talking plain Java vs. Python [debian.org], Java looks to be quite a bit faster. You don't have to look hard to find benchmarks that show Java is faster [timestretch.com].

Jython [jython.org] seems to be about 2-3 times faster than CPython [warwick.ac.uk] according to those tests.

This could give CPython the performance edge over Jython, but it still has a way to go to catch up to Java.

Re:How fast is five times faster really? (1)

FishWithAHammer (957772) | more than 5 years ago | (#27363419)

IronPython too. Not quite as fast as Jython last I eval'd it, but at the time it had plenty of room to improve.

The only place I currently use Python is embedding the IronPython system in a Mono app, though, so I'll take what I can get.

Re:How fast is five times faster really? (1)

0xABADC0DA (867955) | more than 5 years ago | (#27363909)

This could give CPython the performance edge over Jython, but it still has a way to go to catch up to Java.

Except that jdk 1.7 is getting all sorts of improvements that will help with Jython speed... like a dynamic method call opcode and stack-allocated objects.

So it's doubtful that llvm python will be faster than Jython, or at least not for long.

Re:How fast is five times faster really? (1)

fredrik70 (161208) | more than 5 years ago | (#27364253)

Oh, I thought the JVM already put locally new'ed objects on the stack rather than the heap if it thought it was possible (i.e. for local objects in a method, etc.).

Re:How fast is five times faster really? (2, Interesting)

kpainter (901021) | more than 5 years ago | (#27364203)

If you're talking plain Java vs Python [debian.org], Java looks to be quite a bit faster

The first link above refers to Java used with "Hotspot" and it is really fast. If you select the Java Xint, they are a lot closer although Java is still faster. But that "Hotspot" option looks to me to provide about a 10x speed improvement over plain interpreted Java. http://shootout.alioth.debian.org/u32q/benchmark.php?test=all&lang=javaxint&lang2=java&box=1 [debian.org] If Python were to do something similar, I would expect a significant improvement in its performance too.

Re:How fast is five times faster really? (1)

rackserverdeals (1503561) | more than 5 years ago | (#27364889)

The Xint option is used in very rare cases if you encounter a bug with the compiler. I have never run into one case where I needed it.

I think some people are working on JIT compilers for Python and other interpreted languages but I'm not sure of the status.

Re:How fast is five times faster really? (5, Funny)

meringuoid (568297) | more than 5 years ago | (#27363749)

Joking aside, though, I find this target to be overambitious. Speeding up by a factor of three would be plausible; two would be OK, but I'd hope they'd keep working on it to get it up to three. Four strikes me as unlikely, and five is right out.

Re:How fast is five times faster really? (1)

ArsonSmith (13997) | more than 5 years ago | (#27364041)

I mean, if I went around claiming to be faster just because I was hard-coded in C, they'd put me away. We have to accept that a JIT engine can optimize code at run time much better than a precompiled binary. Now you see the slowdown inherent in the system.

Re:How fast is five times faster really? (0)

Anonymous Coward | more than 5 years ago | (#27364785)

Joking aside, though, I find this target to be overambitious. Speeding up by a factor of three would be plausible; two would be OK, but I'd hope they'd keep working on it to get it up to three. Four strikes me as unlikely, and five is right out.

Three times faster? That's easy: just change the colour of the text to red and add an antenna.

This is a very interesting project (5, Interesting)

Max Romantschuk (132276) | more than 5 years ago | (#27363043)

I read about what they intend to do, and they seem to have quite a few interesting ideas... But there are also major drawbacks:

- No Windows support (apparently a Linux-only VM in the plans)
- No Python 3.0 support

And thus no guarantees most of the work will merge back into CPython.

But competition is good, I can't really see a problem with having an alternative faster Python runtime, even if it's not as compatible as CPython. :)

Re:This is a very interesting project (2, Informative)

ianare (1132971) | more than 5 years ago | (#27363141)

- No Python 3.0 support

They are using v2.6, which has been designated as the official migration step towards 3.0, so it should be easy to port over to 3.0. Anyway, right now very few projects are using 3.0.

Re:This is a very interesting project (4, Informative)

maxume (22995) | more than 5 years ago | (#27363423)

It might be easy to port over to 3.0, but not because it is using 2.6. Basically, they are planning on ripping out a big chunk of the internals of 2.6 and replacing it with a LLVM based system. To the extent that those internals changed for 3.0 (there wasn't necessarily effort put into making them compatible across 2.6 and 3.0...), the code would need to be updated for 3.0. The python level portability between 2.6 and 3.0 isn't a huge factor for something like this.

They are targeting 2.6 because that is what made sense for Google (who is paying for the work). Or so they say:

http://code.google.com/p/unladen-swallow/wiki/FAQ [google.com]

Re:This is a very interesting project (2, Interesting)

FishWithAHammer (957772) | more than 5 years ago | (#27363147)

I'm not quite sure what benefits this gives that Psyco doesn't already.

Re:This is a very interesting project (4, Informative)

MightyYar (622222) | more than 5 years ago | (#27363273)

Psyco is x86 only and uses a lot of memory. It also requires additional coding... you have to actively use it, so you don't automatically get the speedup that a faster interpreter gets you. You also have to pick-and-choose what you want to get compiled with Psyco - the extra overhead isn't always worth it.

To be fair, I don't know what the memory requirements of this new project are.

Re:This is a very interesting project (1)

FishWithAHammer (957772) | more than 5 years ago | (#27363383)

Psyco may be x86-only, but this is Linux-only. That kills a lot of the appeal this might have in much the same way.

Re:This is a very interesting project (1)

schmiddy (599730) | more than 5 years ago | (#27364195)

Psyco is x86 only and uses a lot of memory

Even worse, Psyco is 32-bit only [sourceforge.net]: Psyco does not support the 64-bit x86 architecture, unless you have a Python compiled in 32-bit compatibility mode. There are no plans to port Psyco to 64-bit architectures.

However, as far as "requires additional coding" goes, I think you're a little off-base... unless you consider "import psyco" to be a lot of work.
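For what it's worth, the canonical Psyco recipe really was about that small: a guarded import, so the same script still runs (just unaccelerated) where Psyco isn't available. A sketch, with an illustrative function:

```python
# The usual Psyco idiom: try to enable the JIT, degrade gracefully on
# platforms where it was never ported (e.g. 64-bit).
try:
    import psyco
    psyco.full()   # specialize functions as they are executed
except ImportError:
    pass           # no Psyco: plain interpreter, same results

def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print(fib(20))  # 6765 either way
```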

Re:This is a very interesting project (0)

Anonymous Coward | more than 5 years ago | (#27364549)

Even worse, Psyco is 32-bit only [sourceforge.net]: Psyco does not support the 64-bit x86 architecture, unless you have a Python compiled in 32-bit compatibility mode. There are no plans to port Psyco to 64-bit architectures.

I hear you. I don't have any more 32-bit boxes to my name, and I'm hesitant to upgrade a few of our servers at work to x86_64 because, without psyco, it'll slow things down drastically.

JITting python code can be a memory hog as well (1)

boorack (1345877) | more than 5 years ago | (#27364357)

Python code has no explicit type declarations. That means the generated (byte)code has to be type-agnostic, or the VM has to be able to generate concrete specializations on the fly. Type-agnostic code kills performance (be it an interpreter or JIT-generated code); generating specializations consumes memory. This is why the CPython interpreter is slow and Psyco is a memory hog. Jython/IronPython variations probably have both drawbacks to some extent: faster than CPython but nowhere near fully optimized native code, and quite memory-hungry (as Java and .NET apps tend to be).

Porting Python to LLVM is quite an ambitious step with lots of work. I suppose they'll end up with a virtual machine having similar performance characteristics to Jython/IronPython without the overhead of Java/.NET/Java_programming_style. It will be suitable for server environments, and this is what Google is paying for ;)
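To make the point above concrete: one Python function body serves every operand type, so a JIT must either emit type-agnostic dispatch code or generate a separate specialization for each type combination it actually observes.

```python
# A single bytecode body handles any type supporting "+"; the runtime
# dispatches on the operand types at each call. Fully optimized native
# code would need a distinct specialization per type combination.
def combine(a, b):
    return a + b

print(combine(2, 3))          # int addition
print(combine("py", "thon"))  # string concatenation
print(combine([1], [2, 3]))   # list concatenation
```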

Re:This is a very interesting project (1)

bnenning (58349) | more than 5 years ago | (#27363421)

Psyco only works for 32-bit x86, and many Python features are unsupported [sourceforge.net] .

Re:This is a very interesting project (4, Funny)

Tumbleweed (3706) | more than 5 years ago | (#27363567)

I'm not quite sure what benefits this gives that Psyco doesn't already.

It doesn't get as stabby.

No windows (1)

nurb432 (527695) | more than 5 years ago | (#27363235)

Or BSD, or several other important platforms.

Re:No windows (4, Informative)

Anonymous Coward | more than 5 years ago | (#27363461)

Quite to the contrary, the FreeBSD guys have been building with clang [llvm.org] +llvm [llvm.org] for a while now, and they seem to like it [freebsd.org] . The kernel boots, init inits, filesystems mount, the shell runs.

What other platforms, Darwin? Apple employs the largest number of LLVM developers. Windows? Both MinGW and Visual Studio based builds are tested for each release.

It's still not as portable as the python interpreter, but that will come if and when developers who are interested in working on it start to contribute.

Re:This is a very interesting project (2, Informative)

orclevegam (940336) | more than 5 years ago | (#27363271)

- No Windows support (apparently a Linux-only VM in the plans)

The article says it's going to be based on LLVM which most definitely is cross-platform (and being touted as the logical successor to GCC). Unless they go out of their way to use some Linux only calls while implementing their Python VM on top of LLVM it should be trivially easy to get it running in Windows.

Re:This is a very interesting project (1)

negative3 (836451) | more than 5 years ago | (#27363731)

From what I've seen, Python 3.0 is not supported by a good number of Python packages whereas Python 2.6 is, which makes the "no Python 3.0 support" a minor issue for me. Python 3.0 is also not shipping as the default interpreter for Fedora, Ubuntu, or openSUSE yet, so it won't really affect basic users for a while. I have also seen benchmarks (but I don't have references, so I welcome contradictions and corrections) showing that 3.0 is considerably slower than 2.6, so if the speed of Python is an issue, people shouldn't be using 3.0 (I take issue with people who grumble about Python's execution speed anyway; if speed is that important, stick to C/C++). If you have a good amount of existing Python apps that work under 2.5, getting them to work in 2.6 isn't hard. Moving to 3.0 is a much bigger step, especially if you relied on built-in modules that are either different now or removed.

I see the "no windows support" as a much bigger negative - if one of the biggest strengths of Python is cross-platform support and you need your programs to work on both Windows and Linux (as I do) that's going to be a problem and I'm only half interested (because half of my apps never leave Linux).

Re:This is a very interesting project (1, Funny)

Anonymous Coward | more than 5 years ago | (#27363903)

- No Windows support (apparently a Linux-only VM in the plans)

They're trying to atone for their Chromium sins. You Windows lusers* will get a pre-alpha version ... eventually. The import statement won't work and every function call will print 'Stop! This VM isn't ready yet!' But you'll get something.

* And I say this without any animosity.

Re:This is a very interesting project (1)

samkass (174571) | more than 5 years ago | (#27365137)

Now that JDK7 is adding invokedynamic, it would be interesting to see this target the JVM instead of LLVM. The JVM is ported everywhere and is extremely fast. I smell some upcoming bake-offs...

They should make it 24 times faster (0)

MrEricSir (398214) | more than 5 years ago | (#27363045)

Because 24 is the biggest number there is. That's alls I'm sayin'. Forget about it.

They should make it 42 times faster (0)

Anonymous Coward | more than 5 years ago | (#27363381)

Because 42 is the only number there is.

Is that an African or European swallow? (1)

mamono (706685) | more than 5 years ago | (#27363081)

While you're at it, what is the capitol of Assyria?

It's part of Iraq now (1)

tepples (727027) | more than 5 years ago | (#27363995)

Nineveh then, Baghdad now.

Re:Is that an African or European swallow? (0)

Anonymous Coward | more than 5 years ago | (#27364043)

Um, some ruined building I would think.

Perhaps you meant capital.

What about Parrot? (0, Offtopic)

imacpro (471075) | more than 5 years ago | (#27363211)

The Parrot project's VM (mostly oriented towards Perl 6) would be a much better target, especially since work on that dynamic language register-based virtual machine has already been going on for some time, even for Python.

Re:What about Parrot? (0)

Anonymous Coward | more than 5 years ago | (#27363245)

Well, sure, but then Google wouldn't be able to say they did it all themselves...

Re:What about Parrot? (1)

FishWithAHammer (957772) | more than 5 years ago | (#27363433)

Parrot's a lot harder to use to interact with other languages. LLVM at least makes it possible for Python code to play nicely with C compiled via LLVM, for example.

Re:What about Parrot? (1)

Abcd1234 (188840) | more than 5 years ago | (#27363821)

Parrot's a lot harder to use to interact with other languages.

Uhh... wha? That's one of the entire reasons Parrot exists. Any language that's compiled to Parrot can interact with any other language compiled to Parrot.

LLVM at least makes it possible for Python code to play nicely with C compiled via LLVM, for example.

Huh? I *really* don't see how LLVM provides a mechanism for languages to interact with one another. Its IR is really just machine code; it's just that the machine doesn't actually exist. In that sense, compiling to LLVM IR is absolutely no different than compiling directly to, say, x86, and it's pretty clear that Perl, compiled to x86, can't interact with Python, compiled to x86, so why would that be any different for Perl compiled to LLVM IR and Python compiled to LLVM IR?

Remember, language interaction requires a whole host of things, including a common underlying framework for how objects are represented, how methods are called, etc. As far as I know, LLVM provides none of that (unlike the JVM, CLR and Parrot). Heck, it only offers a few types of primitives, including basic numbers, pointers, and lists. It has no concept of objects at all... so how is a Python object supposed to interact with a Perl object, for example?

That said, you could certainly build something like that *on top* of LLVM (eg, a CLR, JVM, or Parrot backend that compiled down to LLVM IR, which then provides the necessary infrastructure for languages to interact), but LLVM itself does not, as far as I can tell, directly facilitate such a thing.

It's probably pining for the fiords. (1)

smcdow (114828) | more than 5 years ago | (#27363241)

FTFA:

Adopting LLVM could also potentially open the door for more seamlessly integrating other languages with Python code, because the underlying LLVM intermediate representation is largely language-neutral.

So much for Parrot.

Re:It's probably pining for the fiords. (4, Informative)

Abcd1234 (188840) | more than 5 years ago | (#27363471)

Not really. Parrot is a much higher-level VM, providing things like closures, multiple dispatch, garbage collection, infrastructure to support multiple object models, and so forth, whereas LLVM really models a basic RISC instruction set with an infinite number of write-once (SSA) registers.

In fact, it would make a fair bit of sense to actually use LLVM as the JIT-compiling backend for Parrot...

Re:It's probably pining for the fiords. (1)

chromatic (9471) | more than 5 years ago | (#27364167)

In fact, it would make a fair bit of sense to actually use LLVM as the JIT-compiling backend for Parrot...

You'd almost wonder if Parrot developers were working on something like that....

Re:It's probably pining for the fiords. (1, Funny)

koiransuklaa (1502579) | more than 5 years ago | (#27363551)

So much for Parrot.

No no, he's not dead, he's... he's resting! Remarkable bird, the Norwegian Blue, ay?

Re:It's probably pining for the fiords. (2, Funny)

jd (1658) | more than 5 years ago | (#27364199)

The Parrot Sketch backfired not that long ago when fossils of a parrot (that probably was blue) were found in Norway. Not too far from the Fjords, as I recall. It is, however, quite dead.

Don't you mean (0)

Anonymous Coward | more than 5 years ago | (#27363259)

3 times faster?

Re:Don't you mean (1)

scorp1us (235526) | more than 5 years ago | (#27363817)

Mod up

IronPython speed (2, Informative)

icepick72 (834363) | more than 5 years ago | (#27363459)

Word has it [slashdot.org] that Microsoft created a speedy IronPython implementation on their Common Language Runtime and JIT technology for .NET. Here are benchmarks for it [codeplex.com]. I'm failing to find similar benchmarks for comparison; can anybody else contribute to this info?

Too many levels of translation? (2, Interesting)

Theovon (109752) | more than 5 years ago | (#27363475)

It sounds like they're going to take Python, which already gets translated to some kind of p-code (right?), and either translate the original Python or the p-code into LLVM code, which is then JIT-compiled to the native architecture.

The translation from Python to LLVM is going to lose some specificity and require that extra code be added to implement whatever needs to be done in Python that isn't trivially implemented by LLVM. Then the LLVM code needs to be compiled to native, introducing yet more "glue" code in the process.

Wouldn't a more direct compile yield a better result?

And don't give me any junk about compiling dynamic languages. LISP and Self are highly dynamic languages, yet they're compiled. If they can be compiled, then so can Python. I mean, the fact that it can be done through multiple levels of translation proves that it can be done, although possibly inefficiently. I just think that a more direct approach would reduce some of the superfluous glue code and a variety of other inefficiencies in translation that result from a loss of knowledge about what the original program was actually trying to implement.
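Right: CPython compiles source to bytecode (the "p-code" above), which the stock VM then interprets, and it's this level Unladen Swallow planned to lower to LLVM IR. The standard dis module makes the layer visible (exact opcode names vary by CPython version):

```python
import dis

def add_one(x):
    return x + 1

# Show the stack-machine bytecode CPython compiled the function into.
dis.dis(add_one)

opnames = [ins.opname for ins in dis.get_instructions(add_one)]
print(opnames)  # e.g. LOAD_FAST, an add opcode, RETURN_VALUE
```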

Re:Too many levels of translation? (4, Informative)

Abcd1234 (188840) | more than 5 years ago | (#27363689)

Wouldn't a more direct compile yield a better result?

No, it wouldn't.

The entire point of LLVM is that it provides an easy-to-target machine (it's basically a RISC instruction set) that you can use as your intermediate representation (the p-code you described). You then use the LLVM backends to compile the IR down to machine code. And because of the way the IR is structured (for example, it has write-once SSA registers, which makes certain classes of optimizations much easier), you can do a really good job of optimizing.

Basically, you "direct compile" to the LLVM IR, and then let LLVM take care of the details of generating the machine code. This gives you better abstraction (no more machine-specific code generation in Python itself), portability (to whatever LLVM targets), and you get all the sophisticated optimization that LLVM provides for free. That's a huge potential win.

Re:Too many levels of translation? (1)

Estanislao Martínez (203477) | more than 5 years ago | (#27364881)

The translation from Python to LLVM is going to lose some specificity and require that extra code be added to implement whatever needs to be done in Python that isn't trivially implemented by LLVM. Then the LLVM code needs to be compiled to native, introducing yet more "glue" code in the process.

What do you mean by "lose specificity" here?

It's not clear to me that either the Python bytecode or the LLVM code is "more specific" than the other. Simply, one of them is higher-level than the other; there will be many cases where the Python bytecode spells out "what to do" and the corresponding LLVM translation spells out "how to do it." This means that both of them will end up having information that the other one doesn't; the Python bytecode will imply that some sequences of LLVM instructions "belong together" in ways that the LLVM code doesn't represent.

This just means that some optimizations can be performed on one representation but not the other. The Python bytecode will be susceptible to optimizations that eliminate relatively large chunks of LLVM code; the LLVM code will be susceptible to peephole optimizations [wikipedia.org] that span across Python opcode boundaries. So to the extent that the extra "glue" code you mention does work that really needs to be done, inlining it into the translation allows the compiler to use the surrounding context to optimize the glue in ways that the interpreter cannot.

Wouldn't a more direct compile yield a better result?

What's a "direct" compile? Optimizing compilers use multiple levels of representation, because each level is suited to different kinds of optimizations. For example, common subexpression elimination is easier to do in the abstract syntax tree (which represents the structure of the source code at a very high level), while peephole optimization [wikipedia.org] is best done at a lower-level representation (because you're looking to eliminate things like redundant instruction sequences).

I just think that a more direct approach would reduce some of the superfluous glue code and a variety of other inefficiencies in translation that result from a loss of knowledge about what the original program was actually trying to implement.

If anything, excessively "direct" compilation produces suboptimal code. The optimizations that rely on knowledge of the details of a high-level representation should be done at a high-level representation.
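The two levels are easy to see side by side with the standard `dis` module: each Python opcode is one "what to do" step, and the interpreter (or a JIT) supplies the "how". A minimal sketch (exact opcode names vary between Python versions):

```python
import dis

def add(a, b):
    return a + b

# Prints the bytecode for add(). The single binary-add opcode here stands
# for the type checks, method dispatch, and arithmetic that a lower-level
# representation such as LLVM IR would spell out instruction by instruction.
dis.dis(add)
```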

Binspam (5, Funny)

Thelasko (1196535) | more than 5 years ago | (#27363553)

I get emails claiming to increase my python's performance all of the time, I just delete them.

Re:Binspam (1)

oldhack (1037484) | more than 5 years ago | (#27363737)

I get emails claiming to increase my python's performance all of the time, I just delete them.

Then why is your pants smoking?

Any Hope? (2, Funny)

Anonymous Coward | more than 5 years ago | (#27363561)

Is there any hope that we will move away from these boutique programming languages and back to "real languages" that seriously consider size and performance?

I for one am completely sick and tired of 3GHz multicore machines with gigabytes of RAM running like a 486. Languages like Python don't help in the bloat arena, and the scripting languages made out of frameworks on top of other scripting languages are just ludicrous!

Re:Any Hope? (0)

Anonymous Coward | more than 5 years ago | (#27364525)

Virtual machine languages are really cool for smaller projects.

But with bigger ones all the little inefficiencies add up and you end up with apps that take 30s to start up and eat hundreds of megs of RAM.

I'd like to see virtual machines like Java and CLR shifting more work to compile-time and link-time. I'm pretty sure they could get decent performance with static compilation + link-time escape analysis for eliminating heap allocations.

The overengineering of Java frameworks is equally a contributing factor. They might be nicely broken down into multiple layers to increase code sharing, but having to allocate a couple objects to do the work of one C standard library call is sure as hell gonna add up.

Didn't know it was that bad (1)

kramulous (977841) | more than 5 years ago | (#27363577)

I do my best here not to offend, but I can see clearly now why I don't use Python.

I keep getting pressured by others to adopt it rather than my C or C++ but if they are touting a possible 5x increase, that means it was really, really slow to begin with. And how much further is there to go? I suspect it is not even worth benchmarking it yet.

Since all I mostly do is big matrix and vector work why would I use python? And no, scipy doesn't count as I can get MPI going pretty quickly.

Yes, I realise the right tool for the job argument.

Re:Didn't know it was that bad (2, Insightful)

zindorsky (710179) | more than 5 years ago | (#27363671)

Yes, I realise the right tool for the job argument.

Exactly. Most applications are not CPU bound. If yours is, then I don't know why others are trying to get you to use Python.

Re:Didn't know it was that bad (0)

Anonymous Coward | more than 5 years ago | (#27363801)

Most applications are not CPU bound.

Ok, I'll give you that. Provided that you acknowledge that no application is the sole process on any machine, despite that being the thinking of almost all programmers.

The fact is that all applications are constrained in some form or fashion whether it is CPU, disk, memory, bandwidth, whatever. With each application's constraints combined with the others on any given machine, the aggregate IS CPU bound, memory bound, IO bound etc.

If instead everyone took that into consideration and made a concerted effort to make their little application be a good citizen imagine what could be accomplished.

every project starts that way (0)

Anonymous Coward | more than 5 years ago | (#27363827)

You're not CPU bound until you: add all the features, handle the special cases, add the error checking, scale up beyond trivial test data, etc.

Then what? Rewrite?

It's way better to simply avoid going down that path. If you start your project with Python, soon enough you'll hit trouble. If all your skills and your personal library of code are in Python, it's not reasonable to escape.

If you start in C, you know your performance will never really be limited by the language.

Re:every project starts that way (3, Insightful)

mkcmkc (197982) | more than 5 years ago | (#27363979)

You're not CPU bound until you: add all the features, handle the special cases, add the error checking, scale up beyond trivial test data, etc.

Then what? Rewrite?

Yes. If you didn't know all of that was going to happen, you're prototyping. If you're prototyping, you should be doing it in a prototyping language.

Rewriting from Python to C++ is not particularly difficult. Completely overhauling the design of a project written entirely in C++ is really unpleasant and takes a long time. So much so that many early design decisions on large C++ projects simply cannot be undone.

Model in clay first, then in stone later if you have to.

Re:every project starts that way (0)

Anonymous Coward | more than 5 years ago | (#27364505)

But, no one does!

They pump out a half baked app in Python or whatever language du jour and call it done.

Since it is not going to be rewritten because of time, budget, it's good enough, [insert-your-own-excuse-here], let's opt to write it correctly and in an appropriate language from the onset.

Re:every project starts that way (1)

bnenning (58349) | more than 5 years ago | (#27364805)

Since it is not going to be rewritten because of time, budget, it's good enough, [insert-your-own-excuse-here], let's opt to write it correctly and in an appropriate language from the onset.

If they're going to do a half-baked job in Python, then it would be tenth-baked in C. And if Python's performance is universally unacceptable today, I'm curious as to how you think we accomplished anything at all 10 years ago.

Re:Didn't know it was that bad (0)

Anonymous Coward | more than 5 years ago | (#27363747)

Python's C API is pretty easy to use. You can program most of your application in easy-to-read/write/maintain Python, and then jump over to C/C++ for the performance-bottleneck components.

Re:Didn't know it was that bad (1)

maxume (22995) | more than 5 years ago | (#27363767)

You should use it if you think it would make your life easier. Numpy/Scipy are both supposed to make doing matrix stuff faster (they are written in C) while providing a Python-like syntax and the ability to use Python for the more mundane parts of the program. If that doesn't look better to you than C and C++, you shouldn't change.

I guess the tool that you are used to can be better than the tool you don't know (this doesn't quite work for a hammer and a screwdriver, but dammit, programming languages and libraries aren't anywhere near that simple).
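As a concrete illustration of the point above, here's the shape such code usually takes (a rough sketch, assuming NumPy is installed; the array sizes are arbitrary):

```python
import numpy as np

# The matrix product below runs in NumPy's compiled C/BLAS routines;
# the Python interpreter never executes the inner loops.
a = np.random.rand(500, 300)
b = np.random.rand(300, 400)
c = np.dot(a, b)

# Python handles the mundane parts: setup, shapes, I/O, glue.
print(c.shape)  # (500, 400)
```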

It all depends (5, Insightful)

mkcmkc (197982) | more than 5 years ago | (#27363905)

I find Python is about 20x slower (and about 10x faster to implement) than C, with the number varying quite a bit depending on how CPU-bound the code is. Given the speed of modern processors, this is plenty fast for many tasks.

Beyond that, many Python programmers employ a strategy of writing just the CPU-intensive inner loops in C or C++. This gives you most of the speed of an all-compiled solution but with much of the easier programming (and shorter programs) of the all-Python approach.

My particular scientific application runs on 1500 cores, is about 75% Python/25% C++, is 4-5x smaller than similar all-C/C++ programs, and runs at about 95-99.99% of the speed of an all C++ solution.

(Somewhat ironically, some of the worst performance bottlenecks in this app had to do with the overhead of some of the STL containers, which I ended up having to replace with C-style arrays, etc. to get best performance.)

Not all apps will fall out this way, but you definitely can't assume that just because something's written in Python that it will be slow.

(Going beyond that, we all know that better algorithms usually trump all of this anyway. If writing in Python gives you the time and clarity to be able to use an O(n)-better algorithm, that may pay off in itself.)
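The algorithmic point is easy to demonstrate: a single data-structure change can outweigh any constant-factor speedup an interpreter or compiler provides. A toy sketch:

```python
# Membership testing: the same logic over two data structures.
# A list scan is O(n) per lookup; a set lookup is O(1) on average,
# so over many queries the algorithmic choice dwarfs language overhead.
items_list = list(range(100000))
items_set = set(items_list)

def count_hits(container, queries):
    # One membership test per query; its cost depends on the container.
    return sum(1 for q in queries if q in container)

queries = list(range(0, 200000, 2))
print(count_hits(items_set, queries))  # 50000, in O(len(queries)) time
```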

Re:It all depends (0)

Anonymous Coward | more than 5 years ago | (#27364629)

i like python a lot. i just do it like everyone else (as you point out):

- a lot of python stuff can run as fast as native code if you use the proper functions (ok, that's a bit of a black art), but picking the right function so the work is executed in C by the interpreter can give tremendous speed gains

- if something is a bottleneck and there's no way to push it through those built-in functions, only slow stuff, write a module. even in pyrex it's fast and easy.

bottom line: very easy to maintain and very fast applications (they just pull in a few more deps because of python, and a bit of memory to load the interpreter)

and i love it.
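The "proper functions" mentioned above are mostly builtins and methods whose loops run in C inside the interpreter. A small sketch of the idea:

```python
data = list(range(1000000))

def slow_sum(xs):
    # One bytecode dispatch per element: this loop runs in the interpreter.
    total = 0
    for x in xs:
        total += x
    return total

# Same result, but the loop executes inside sum()'s C implementation.
fast = sum(data)
assert fast == slow_sum(data)

# Same idea for strings: ''.join() beats building a string with repeated +=.
joined = "".join(["spam"] * 1000)
```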

Re:It all depends (1, Flamebait)

master_p (608214) | more than 5 years ago | (#27364775)

(Somewhat ironically, some of the worst performance bottlenecks in this app had to do with the overhead of some of the STL containers, which I ended up having to replace with C-style arrays, etc. to get best performance.)

I smell bullshit. There is no overhead from using STL containers.

If you used an std::list or an std::map for random access, then you certainly had a bottleneck, because those containers are not for random access.

If you used an std::vector, you couldn't have a bottleneck, for the simple reason that the std::vector is an array.

stackless (1)

Tumbleweed (3706) | more than 5 years ago | (#27363597)

So whatever happened to 'Stackless' Python? Is that ever going to be merged into CPython? And would it work with this?

Incorrect this is. (0)

Anonymous Coward | more than 5 years ago | (#27363603)

It is not a project by Google's engineers, it's an independent project hosted by Google.
Also, 5x speedup is insignificant. Psyco already provides speedups much larger than that, depending on the type of code (algorithmic code could be improved 60x or more).
By the way, Pypy is much more ambitious than this one.
And finally, their goals and timeframe seem a little bit unrealistic. I'd love to be proved wrong though...

Re:Incorrect this is not (1)

Martin Soto (21440) | more than 5 years ago | (#27364701)

It is not a project by Google's engineers, it's an independent project hosted by Google.

The project is indeed sponsored by Google. See the last question in their FAQ [google.com] .

Also, 5x speedup is insignificant. Psyco already provides speedups much larger than that, depending on the type of code (algorithmic code could be improved 60x or more).

You're saying it yourself: depending on the type of code. Psyco may achieve impressive speedups for certain algorithms, but the gains are not as high in general. These guys are aiming at speeding all Python code up by a factor of about five, which would be far from insignificant if they succeeded.

By the way, Pypy is much more ambitious than this one.

Pypy is an interesting project. Unfortunately, though, they are progressing very slowly.

And finally, their goals and timeframe seem a little bit unrealistic. I'd love to be proved wrong though...

You may be right here. Only time will tell.

Re:Incorrect this is not (0)

Anonymous Coward | more than 5 years ago | (#27365079)

You are right. I read it again and it seems the project is led by two Google engineers working full time, plus the collaboration of other Googlers in their 20% time. However, Google doesn't own the project; they simply sponsor it.

As for the intended speedups, I'm still confused. Psyco typically gets a 4x speedup for common code, and shines when the code is CPU-intensive (theoretically on par with C).

So making such a big effort to get, again, a similar speedup is nonsense...

what about pypy? (1)

gilleain (1310105) | more than 5 years ago | (#27363719)

They (http://morepypy.blogspot.com/) have noticed the project, it seems.

We were a bit confused about usage of the term JIT, because as far as we understood, it's going to be upfront compilation into LLVM. In the past we have looked into LLVM - at one point PyPy used it extensively, but it wasn't clear how we could make good use of it.

They seem a bit sceptical.

i submitted this story to slashdot before you (0, Offtopic)

karlzt (1410199) | more than 5 years ago | (#27364757)

i submitted this story to slashdot before anyone yesterday

Re:i submitted this story to slashdot before you (2, Funny)

maxume (22995) | more than 5 years ago | (#27365027)

Here's your cookie:

/^\
\_/

Re:i submitted this story to slashdot before you (1)

karlzt (1410199) | more than 5 years ago | (#27365209)

:confused:

i submitted this story yesterday but... (0, Offtopic)

karlzt (1410199) | more than 5 years ago | (#27364931)

i submitted this story yesterday but obviously i didn't do it right