Scaling Large Projects With Erlang
Posted by Soulskill on Sun Jul 06, 2008 09:28 AM
from the right-tool-for-the-right-job dept.
Delchanat points out a blog entry which notes,
"The two biggest computing-providers of today, Amazon and Google, are building their concurrent offerings on top of really concurrent programming languages and systems. Not only because they want to, but because they need to. If you want to build computing into a utility, you need large real-time systems running as efficiently as possible. You need your technology to be able to scale in a similar way as other, comparable utilities or large real-time systems are scaling — utilities like telephony and electricity. Erlang is a language that has all the right properties and mechanisms in place to do what utility computing requires. Amazon SimpleDB is built upon Erlang. IMDB (owned by Amazon) is switching from Perl to Erlang. Google Gears is using Erlang-style concurrency, and the list goes on."
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
Erlang: The Movie ! (Score:5, Funny)
Sufficiently? (Score:5, Interesting)
hard to read after (Score:2)
"you need large real-time systems running as sufficiently as possible."
Should that not be efficiently as possible?
Re:hard to read after (Score:5, Funny)
"you need large real-time systems running as sufficiently as possible."
Should that not be efficiently as possible?
You obviously haven't looked very closely at any of the "market leader" software lately.
Software from the Big Guys is more and more designed to sell (think forced upgrades) bigger, faster systems. You don't do this by making your software efficient.
The logic behind many software updates these days is "Will this release require sufficient resources that customers will be persuaded to upgrade to new hardware?"
Re: (Score:2, Funny)
You obviously don't read summaries, articles, or headlines. The logic behind your post is "rant about something for no reason at all."
Huh? (Score:5, Insightful)
"The two biggest computing providers of today"?
What the hell does that mean?
Also, is it just me or does the article intro sound like it was written by someone who has taken way too many marketing classes?
Re:Huh? (Score:5, Interesting)
Re: (Score:2, Insightful)
Also, is it just me or does the article intro sound like it was written by someone who has taken way too many marketing classes?
Too many marketing classes and not enough English classes.
Re:Huh? (Score:4, Insightful)
For instance, by dropping the IMDB name, it is now my impression that this Erlang thing is best at destroying otherwise useful sites by making them less reliable and more annoying to users. Who in their right mind would want to do that? Oh, marketing people, that's who!
Proprietary? (Score:2, Insightful)
Who wrote the summary? GWB? (Score:5, Funny)
"running as sufficiently as possible"?
Sometimes as a nation we must ask ourselves, is our children learning?
Scala (Score:5, Informative)
People may also want to check out Scala at:
http://www.scala-lang.org/ [scala-lang.org]
It also uses the Erlang-style concurrency approach and runs on the JVM with class compatibility with other JVM languages, i.e. Java, Groovy, etc.
Re:Scala (Score:4, Informative)
There is a significant difference between Scala and Erlang.
Erlang uses green threads. And green threads have advantages and disadvantages over native threads.
For instance Erlang is bad at IO but on the other hand it can spawn millions of threads, something that the JVM has a hard time doing because native threads are limited by the kernel.
Re:Scala (Score:4, Informative)
Scala has actors, which allow you to do something _like_ green threads: http://lamp.epfl.ch/~phaller/doc/ActorsTutorial.html [lamp.epfl.ch]
Re:Scala (Score:4, Informative)
Modern JVMs on the modern Linux kernel can spawn quite a hellacious number of threads these days, actually.
The problem with Java is the shared-state synchronization that is often necessary, and the extra work required to distribute state to threads across different VMs. A functional language and programming style could work quite well on top of the JVM, though, and could leverage RMI and some kind of message port facility for the distribution.
Re:Scala (Score:4, Interesting)
Re:Scala (Score:5, Informative)
"Last time you checked" was some time last century in that case. Linux kernels have been able to support at least 100,000 threads [wikipedia.org] for ages.
That doesn't mean that using shared memory concurrency is a good idea though. When your computer comes with 10s or 100s of cores you'll realise that maybe SMP wasn't the best model of concurrency to choose. That's where models such as map-reduce, Erlang's shared nothing concurrency, message passing, and MPI come into their own. Even today they are useful because you'll be able to scale your program across multiple machines.
Rich.
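A rough illustration of the shared-nothing, map-reduce style mentioned above, written in Python rather than Erlang. The function names (count_words, merge, word_count) are made up for this sketch, and the thread pool only stands in for separate workers or machines; the point is that each worker receives its own input and hands back a new value instead of updating shared state.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def count_words(chunk):
    """Map step: a worker sees only its own chunk and returns a new Counter."""
    return Counter(chunk.lower().split())


def merge(left, right):
    """Reduce step: combine two partial results without mutating either one."""
    return left + right


def word_count(chunks, workers=4):
    """Count words across chunks; workers communicate only via return values."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_words, chunks))
    return reduce(merge, partials, Counter())


if __name__ == "__main__":
    chunks = ["erlang scales well", "scala runs on the jvm", "erlang runs on beam"]
    print(word_count(chunks))
```

Because no worker touches another worker's data, the same structure could in principle be spread over processes or machines by swapping out the pool; that part is left as an assumption here.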
Re:Scala (Score:4, Informative)
I quite agree that shared memory concurrency is a bad idea, however. Unfortunately, until you have message passing instructions in the hardware, you're stuck emulating message passing on top of shared memory, which leads to cache coherency issues and a host of other problems.
Re:Scala (Score:5, Informative)
Linux threads stopped using the LDT on x86 in 2002. This change went mainstream over subsequent years, and is nowadays always used on x86.
There was once a limit on the number of processes, too, due to each process having an entry in the GDT. That has long been removed too.
Why Erlang Matters (Score:5, Insightful)
1. Multicore ready.
Erlang will use them. Write your application in Erlang and it's done for you.
2. Scales well.
As an example, http://yaws.hyber.org/ [hyber.org] scales very nicely when loads increase. Your basic LAMP/LYMP setup runs much better on vanilla hardware.
3. Designed for telecom
The architects designed the language to run in a telecom environment so things like upgrades can be done while the application is running.
Yaws in particular needs your help. Failover clustering inside the yaws server would be wonderful. Right now, it uses CGI to process other languages. It does it flawlessly, but a more direct solution might be a nice project.
Why Erlang doesn't matter (Score:5, Interesting)
1. Invariable variables.
This appears to have been done for no reason other than the designer's preference. In fact, it's not strictly true -- variables can be unbound, and later bound. They just can't be re-bound once bound.
2. Weird syntax.
Why, exactly, are there three different kinds of (required) line endings? It seems as though the syntax is designed to be as different from C as possible, while maintaining at least as many quirks. Moreso, even -- when constructing normal, trivial programs, you're going to hit most language features head-on and at their worst. Where's my 'print "hello\n"' that works most other places?
I don't believe the important features of Erlang are mutually-exclusive with the sane syntax of, say, Ruby or Python.
3. Not Unicode-ready.
Strings are defined as ASCII -- maybe latin1. But there's no direct unicode support in the language -- if you're lucky, there are functions you can pipe it through.
There are other things I haven't mentioned, mostly implementation-specific -- things like the fact that function-reloading cannot be done when you natively-compile (with hipe) for extra speed. My plan is to take the features I actually like from Erlang and implement them elsewhere, in a language I can actually stomach for its real tasks.
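For readers who haven't met the "unbound, then bound once, never re-bound" behaviour from point 1, here is a small Python sketch of the idea. It is only an analogy: the Var class and both exception names are invented for illustration, the check happens at runtime, and it does not model Erlang's pattern matching or say anything about how Erlang itself implements variables.

```python
class Unbound(Exception):
    """Raised when a variable is read before it has been bound."""


class AlreadyBound(Exception):
    """Raised when a bound variable is bound a second time."""


class Var:
    """A single-assignment cell: starts unbound, can be bound exactly once."""

    _UNSET = object()  # private marker meaning "no value yet"

    def __init__(self, name):
        self.name = name
        self._value = Var._UNSET

    def bind(self, value):
        if self._value is not Var._UNSET:
            raise AlreadyBound(f"{self.name} is already bound")
        self._value = value

    @property
    def value(self):
        if self._value is Var._UNSET:
            raise Unbound(f"{self.name} is unbound")
        return self._value


if __name__ == "__main__":
    x = Var("X")
    x.bind(42)
    print(x.value)
    try:
        x.bind(43)  # second binding is rejected
    except AlreadyBound as err:
        print(err)
```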
Re: (Score:3, Interesting)
Though I agree with you on 2 and 3, I'm not so sure about 1, but I might be wrong on that. As I understand it, you should look at variables in functional programming languages like Erlang more like those in a mathematical formula; such programs can be proven correct a lot easier, and since variables are effectively immutable, it facilitates forking the line of execution in a way that would not be possible without all kinds of semaphores and other concurrency stuff if variables were not immutable. You
Re:Why Erlang doesn't matter (Score:5, Interesting)
As I understand it, you should look at variables in functional programming languages like Erlang more like those in a mathematical formula; such programs can be proven correct a lot easier, and since variables are effectively immutable
All of this is based on the premise that Erlang is a functional language. It's not purely-functional, and I just don't see the point of doing it half-assedly. Erlang is effectively an imperative language dressed up like a functional language.
And they're not immutable -- they can be unbound. As I understand it, this unboundedness is detected at runtime, not compile time. If it was detected at compile time, you'd have a valid point.
it facilitates forking the line of execution in a way that would not be possible without all kinds of semaphores and other concurrency stuff
Except that's not how Erlang does concurrency. It does concurrency with explicit "processes" (green threads) and message-passing.
Now, it does make these very easy, and you can get it to distribute processes among a few real OS threads (one per core) -- so it's still very cool. But you're thinking of languages like Haskell, which can be automagically threaded. Erlang is manually threaded, it's just much easier to think in threads (or "processes") -- they're effectively a language feature.
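To make "explicit processes and message-passing" concrete, here is a minimal Python sketch of the pattern. It uses OS threads and queue.Queue as a mailbox, so it is not how Erlang's lightweight processes are implemented; spawn and counter_process are hypothetical names chosen for the example. The state lives inside one "process" and the only way to read or change it is to send a message.

```python
import queue
import threading


def counter_process(mailbox):
    """Owns its total privately; other code can only send it messages."""
    total = 0
    while True:
        message = mailbox.get()
        if message[0] == "add":
            total += message[1]
        elif message[0] == "get":
            message[1].put(total)  # reply on the mailbox the sender supplied
        elif message[0] == "stop":
            return


def spawn(func):
    """Start func in its own thread and return the mailbox used to reach it."""
    mailbox = queue.Queue()
    threading.Thread(target=func, args=(mailbox,), daemon=True).start()
    return mailbox


if __name__ == "__main__":
    pid = spawn(counter_process)
    for n in (1, 2, 3):
        pid.put(("add", n))
    reply = queue.Queue()
    pid.put(("get", reply))
    print(reply.get())  # messages are handled in order, so this prints 6
    pid.put(("stop",))
```

No lock appears anywhere in the sketch because nothing outside counter_process can see total, which is the shared-nothing property the thread keeps coming back to.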
Re:Why Erlang doesn't matter (Score:5, Informative)
1) Actually, there are quite a few good reasons for this, largely around the complete elimination of mutexing and locks. Just because you don't understand the purpose doesn't mean there wasn't one.
2) Oooooh, a language is faulty because it has a syntax with which you are not familiar. Immediately kill all non-Java clones!
3) They're just lists of numbers; they're neither ASCII nor Latin1. There is unicode parsing in the XMERL module.
Please wait until you know a language before criticizing it.
Re:Why Erlang doesn't matter (Score:4, Informative)
Actually, there are quite a few good reasons for this, largely around the complete elimination of mutexing and locks.
...What? No, the elimination of mutexing and locks is made possible by a shared-nothing architecture.
Oooooh, a language is faulty because it has a syntax with which you are not familiar.
Hey, I mentioned Ruby. I don't mind LISP, either.
The point is not that the language is unfamiliar, the point is that it's inconsistent (and unfamiliar) for no good reason. I use English, but I could make a lot of the same criticisms about it.
They're just lists of numbers;
In that case, the argument becomes, "Erlang has very poor text-processing, if any at all."
If Erlang has text-processing functions that are designed to operate on these "lists of numbers", then yeah, it's pretty much going to be ASCII. And how are Erlang source files read? Could be "neither ASCII nor Latin1" if you like, but they can't be Unicode unless the parser is actually Unicode-aware.
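The "lists of numbers" disagreement is easier to follow with a concrete case. The Python sketch below (helper names are invented; this is not Erlang code) shows the same word as a list of code points and as a list of UTF-8 bytes, and what happens when a case conversion only understands ASCII.

```python
def to_code_points(text):
    """One integer per character, regardless of how the text is encoded."""
    return [ord(ch) for ch in text]


def to_utf8_bytes(text):
    """One integer per byte of the UTF-8 encoding."""
    return list(text.encode("utf-8"))


def ascii_only_upper(text):
    """Uppercase via the byte string, which only changes ASCII letters."""
    return text.encode("utf-8").upper().decode("utf-8")


if __name__ == "__main__":
    word = "na\u00efve"  # "naive" with a diaeresis on the i
    print(to_code_points(word))    # 5 numbers
    print(to_utf8_bytes(word))     # 6 numbers: the accented letter takes two bytes
    print(word.upper())            # character-aware conversion
    print(ascii_only_upper(word))  # the accented letter is left in lower case
```

Whether a list of integers counts as "text support" depends on whether the library functions treat the integers as code points or as bytes, which is the distinction both posters are circling.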
Re:Why Erlang doesn't matter (Score:4, Informative)
Actually, there are quite a few good reasons for this, largely around the complete elimination of mutexing and locks.
...What? No, the elimination of mutexing and locks is made possible by a shared-nothing architecture.
Oooooh, a language is faulty because it has a syntax with which you are not familiar.
Hey, I mentioned Ruby. I don't mind LISP, either.
The point is not that the language is unfamiliar, the point is that it's inconsistent (and unfamiliar) for no good reason. I use English, but I could make a lot of the same criticisms about it.
It's not that its syntax is /inconsistent/. Erlang is actually incredibly consistent, it's just very different. Once you learn the 3 or 4 quirks that separate it from other languages, those 3 or 4 quirks are very consistently applied.
Take for instance the punctuation (not line ending characters as is suggested).
Commas separate arguments in function calls, data constructors, and patterns. Periods separate functions.
Semicolons separate clauses (this is the trickiest, but can be thought of as signifying the existence of multiple cases of pattern matching).
They're just lists of numbers;
In that case, the argument becomes, "Erlang has very poor text-processing, if any at all."
If Erlang has text-processing functions that are designed to operate on these "lists of numbers", then yeah, it's pretty much going to be ASCII. And how are Erlang source files read? Could be "neither ASCII nor Latin1" if you like, but they can't be Unicode unless the parser is actually Unicode-aware.
Re:Why Erlang doesn't matter (Score:5, Insightful)
Yes.
Where is Lisp today? Smalltalk?
On the other hand, languages that offered the same features with a familiar syntax have taken over the market.
Re: (Score:3, Interesting)
http://www.paulgraham.com/avg.html [paulgraham.com]
Re: (Score:3, Insightful)
Well, no: the point is to use the advantages of these esoteric languages today and not 25 years from now. The people using LISP in the 1950s had garbage collection, reentrant functions, complex data structures....
Re: (Score:3, Insightful)
But there's no direct unicode support in the language -- if you're lucky, there are functions you can pipe it through.
3) They're just lists of numbers; they're neither ASCII nor Latin1. There is unicode parsing in the XMERL module.
Which is exactly the problem that GP discussed. There is a huge difference between a true in-language support of Unicode (such as the one in Perl) and just "the Unicode parsing library". In Perl there is a difference between "string of bytes" and "string of characters", and this distinction is made when the string is created (i.e. in the I/O layer when it is read from the file, or in the source code pragma when it is a literal constant). And then all things work as expected (conversion between upper and low
Re: (Score:3, Informative)
1. Invariable variables.
This appears to have been done for no reason other than the designer's preference. In fact, it's not strictly true -- variables can be unbound, and later bound. They just can't be re-bound once bound.
On the contrary, there are very good reasons for having single-assignment variables. It makes the code more similar to plain mathematics, which makes it easier to reason about, and significantly reduces the number of programming errors. And you don't have to take that from me - there are some 20 years of experience at Ericsson and elsewhere with writing huge telecom applications in Erlang.
2. Weird syntax.
Why, exactly, are there three different kinds of (required) line endings? It seems as though the syntax is designed to be as different from C as possible, while maintaining at least as many quirks. Moreso, even -- when constructing normal, trivial programs, you're going to hit most language features head-on and at their worst. Where's my 'print "hello\n"' that works most other places?
I don't believe the important features of Erlang are mutually-exclusive with the sane syntax of, say, Ruby or Python.
The syntax is certainly different from C, Ruby, or Python, but this is because it is derived from the Prolog syntax. Fur
Comparison of functional languages? (Score:5, Interesting)
I think the summary (and article) are somewhat poorly written, but that doesn't overshadow the fact that functional languages are becoming more and more interesting these days with concurrency becoming so important.
I'd like to learn one, but there are several out there. What I'd like to see is a good in-depth comparison of different concurrent functional languages: why would I choose Haskell, or Erlang, or OCaml, for example? Are they all interpreted? (Does one exist that compiles?) Which ones support concurrency? What language features do they boast, and what are the advantages and disadvantages of these features? Do they have a complete set of libraries?
Anyone know of an article like this? I've been searching for a while. Every article on functional languages I've found seems to concentrate on a particular one, but I can't find something helping me decide which one is most worth learning.
Re: (Score:3, Insightful)
Brief answer:
All three languages have both interpreters and compilers (OCaml's are part of the base distribution, Haskell has a number of compilers, and Erlang apparently has a compiler).
They all support concurrency, all in slightly different ways. They all have a lot of libraries.
OCaml is sort of a functional language that includes object oriented features and also has very good performance numbers. It allows mutable updates, including arrays and references. For threading I believe it has the usual mutexes and
Re:Comparison of functional languages? (Score:5, Interesting)
OCaml compiles down to native code, which is about 10-20% slower than C. Faster than C in a few (narrow) cases.
Haskell is also compiled to native code, but difficulties with the execution model mean it's pretty slow for any practical use.
Erlang is interpreted - the execution model is similar to Perl or Python - which means it's slow on single cores, but of course the whole point of Erlang is to run in highly concurrent, distributed machines. There is a project [google.com] to use OCaml for the performance-critical, single threaded parts, and Erlang for coordinating the parallelism.
Of course, this is probably missing the point. Unless you're doing intensive numerical work, you probably don't need the performance. The real advantage of these languages is how your code will be much smaller, easier to understand, safer, and faster to write.
Rich.
Re: (Score:3, Informative)
Deceptive (Score:2)
>The two biggest computing-providers of today, Amazon as well as Google, are building their concurrent offerings on top of really concurrent programming languages and systems
Google is largely a C++ company, a language that doesn't include explicit support for concurrency (although the next version, C++0x, will).
They mention Erlang only being used in a relatively small project that most of Google's own software doesn't support yet.
Note that Google Gears is used in the excellent Google Reader software (al
Re:Deceptive (Score:5, Insightful)
Actually, Gears doesn't use Erlang either. What he means is that Gears threading doesn't allow for shared state (is it really threading then?). Instead threads communicate back to the browser by message passing.
It's remarkably deceptive indeed to even imply that Gears and Erlang are connected. Message passing based concurrency isn't exactly new or limited to Erlang, and can be implemented in any language.
I'm not sure what the point of this piece is. I've looked at Erlang and didn't see much of anything to get me excited. It's a functional language, which, like most of them, has unnecessarily weird syntax and forces immutable state. I don't really see what this buys you over a language like D 2 (or hell, even C++) in which you can write in a functional message passing style if you like, but then still use imperative shared state whenever useful, convenient or performant.
Gibberish (Score:5, Insightful)
If you want to build computing into a utility, you need large real-time systems running as sufficiently as possible.
But if you want to build sprockets into a weasel you need small batch-mode systems running as necessarily as possible.
If the poster had anything interesting to say (I'd guess not, but who knows!), it was totally obscured by his lack of grasp of the English language.
Too late (Score:4, Funny)
Given that this statement appears almost halfway through the blog post, I would say that it was already too late for that.
Stupid article (Score:5, Informative)
Wow, it's not often I strongly criticise articles around here, but that was total garbage.
For the smart ones that didn't RTFA, here's a quick summary:
For the record, I work for Google and we don't use Erlang anywhere in the codebase. Google Gears restricts you to message passing between threads because JavaScript interpreters are not thread-safe, so it's the only way that can work. Visual Basic threading works the same way for similar reasons. It's not because eliminating shared state is somehow noble and pure, regardless of what the article would have you believe, and in fact systems like BigTable use both shared-state concurrency and message passing based concurrency.
The article says this:
But in fact the Google search engine, which is one of the larger "industrial-grade, internet-grade" systems I know of, is written entirely in C++. A language which is much the same as it was 10-15 years ago. Thus the central point of his argument seems flawed to me.
Seeing as the article is merely an advert for Erlang, I'll engage in some advocacy myself. If you have an interest in programming languages, feel free to check out Erlang, but be aware that such languages are taking options away from you, not giving you more. A multi-paradigm language like version two of D [digitalmars.com] is a better way to go imho - it supports primitives needed to write in a functional style like transitive invariance, as well as a simple lambda syntax, easy closures and first class support for laziness.
However it also compiles down to self-contained native code in an intuitive way, or at least, a way that's intuitive to the 99.9% of programmers used to imperative languages, unlike Erlang or Haskell. It provides garbage collection but doesn't force you to use it, unlike Java. It doesn't rely on a VM or JIT, unlike C#. It provides some measure of C and C++ interoperability, unlike most other languages. And it has lots of time-saving and safety-enhancing features done in a clean way too.
Re:Stupid article (Score:4, Interesting)
I'm not going to disagree with most of your post, I think you're spot on. However, your suggestion of D is totally off. I like the D programming language quite a bit and version 2 is going to be really cool. However, even version 1 of D is not ready for prime-time [timburrell.net]. Version 2 of D is unstable and not recommended for production by even the author himself. All of the other languages you mentioned such as Erlang or Haskell are much more mature.
Also, "most other languages" have a foreign function interface for C, including Erlang, Haskell, Python, Java, Perl, Ruby, etc... In fact, I can't think of a well known programming language actually used by people other than the author that does not have an FFI... It is true that in most cases the FFI of other languages is more difficult to use than the one in D, but they are there.
Re:Stupid article (Score:5, Interesting)
Yes, D is very young and has problems. But then again, what language didn't? It's easy to forget but Python was first released in 1991. It took many years before it became mainstream (and some would say it's still not there yet).
The post-mortem is an interesting document, but I disagree with the author's conclusions. The compilers are buggy, well, C++ had exactly the same problem for a long time but still was a huge success. In particular, the trend seems to be basing new compilers on LLVM, which has a pretty robust optimization core. Frontend bugs are by comparison pretty trivial and easy to fix. Another few years and I think this problem will be licked - and besides, lots of C++ code has workarounds for compiler issues. Same thing for class libraries.
You're right about C-level FFIs. However D provides a simple C++ FFI which as far as I know is unique. Such a thing would be very useful for a company like Google which has a lot of C++ code, as it'd simplify binding considerably (I don't mean to imply anything about the future direction of the codebase, by the way).
The argument about parallelism is a more interesting one. But I disagree with that too :) D provides exactly what is needed for automatic sharding of work across cores (or machines). Specifically the combination of transitive invariance, reflection and purity enforcement is a very powerful one.
Essentially, if you can write your code to consist of non-trivial trees of pure functions, then it's perfectly safe to parallelise something like this:
foreach (item; list) {
    fooResults[item] = someTransform(item);
    barResults[item] = anotherTransform(item);
}
If someTransform and anotherTransform are both pure, by implication their parameters are transitively invariant, and thus they can both be invoked in parallel (because the compiler knows "item" can't be changed). What's more, both calls can be invoked simultaneously as well.
Once the compiler knows these things, making this code run in parallel is simply another compiler optimization. That's the whole theory behind how functional languages can be super easy to parallelize. But in fact the key concepts can be applied to imperative languages as well, with the advantage that you can still have temporary mutable state within the function scopes - you just can't modify the heap, or anything reachable through your arguments.
D has keywords that let the compiler know and enforce function purity.
Now as it happens I doubt that any D compiler today implements this optimisation - it's sophisticated and transitive invariance is newly introduced in D2. But all the pieces of the puzzle are there. This also lets the compiler do calculations on data structures available at compile time.
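The foreach argument above can be tried out by hand in Python, with the big caveat that Python has no purity enforcement: the two transforms below are pure only because they were written that way, and the parallel scheduling is done explicitly with a thread pool rather than by a compiler. The function bodies are placeholders invented for this sketch.

```python
from concurrent.futures import ThreadPoolExecutor


def some_transform(item):
    """Pure by convention: the result depends only on the argument."""
    return item * item


def another_transform(item):
    """Also pure by convention; no globals or arguments are modified."""
    return item + 1


def run_sequential(items):
    foo_results = {item: some_transform(item) for item in items}
    bar_results = {item: another_transform(item) for item in items}
    return foo_results, bar_results


def run_parallel(items, workers=4):
    """Same results as run_sequential, with the calls handed to a pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so zip pairs them correctly.
        foo_results = dict(zip(items, pool.map(some_transform, items)))
        bar_results = dict(zip(items, pool.map(another_transform, items)))
    return foo_results, bar_results


if __name__ == "__main__":
    items = (1, 2, 3, 4)
    print(run_parallel(items) == run_sequential(items))
```

The rewrite from one form to the other is safe only because neither function reads or writes anything outside its argument; if either one mutated shared state, the two versions could give different answers.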
Re: (Score:3, Insightful)
Sounds great in theory but in the real world you don't get m
no new language needed (Score:5, Insightful)
Erlang is a language that has all the right properties and mechanisms in place to do what utility computing requires.
Well, except that it's darned inconvenient to actually write the applications in it.
Google Gears is using Erlang-style concurrency, and the list goes on."
Yup, and it makes more sense to add "Erlang-style concurrency" to existing languages than to throw out everything and switch to Erlang.
Wings 3D is written in Erlang... (Score:3, Interesting)
Andy
"Cloud computing" is an Xmas artifact (Score:5, Interesting)
The enthusiasm for "cloud computing" may evaporate when Xmas rolls around.
I went to a talk at Stanford by the architect of Amazon's web services. It came out in questioning that the real motivation behind Amazon's low-priced web services is that their load in the Xmas shopping season is about 4x the load for the rest of the year. Their infrastructure is sized for the November-December peak, so for ten months of the year they have vast excess capacity. That's why Amazon's web services are so cheap.
Don't expect good response time during the shopping season. Although this Xmas might be OK, due to the recession.
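A back-of-the-envelope version of the capacity argument above, as a small Python function. The inputs are simplifying assumptions, not Amazon figures: load is taken to be exactly 1x for ten months and exactly 4x for two, and the hardware is sized to exactly the peak. The function name is made up for the sketch.

```python
def average_utilisation(peak_multiple, peak_months, months=12):
    """Fraction of provisioned capacity used over a year.

    Assumes baseline load of 1 unit in non-peak months, peak_multiple units
    in peak months, and capacity provisioned at the peak level all year.
    """
    baseline_months = months - peak_months
    used = baseline_months * 1.0 + peak_months * peak_multiple
    provisioned = months * peak_multiple
    return used / provisioned


if __name__ == "__main__":
    # 4x peak for two months: (10 * 1 + 2 * 4) / (12 * 4) = 18 / 48
    print(average_utilisation(peak_multiple=4, peak_months=2))  # 0.375
```

Under those assumptions roughly five eighths of the yearly capacity would sit idle, which is the excess the poster says gets resold.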
Re:"Cloud computing" is an Xmas artifact (Score:4, Insightful)
While the origin of EC2 in 2006 is certainly related to peak capacity requirements at Amazon, it is certainly way beyond that point now.
Two Christmas seasons have come and gone without major capacity problems on EC2.
The reality is that EC2 has grown far beyond its roots as a way for Amazon to amortize their peak capacity by reselling it and it has turned into a small but growing profit center and publicity success for Amazon.
Erlang textbook: read and adapt (Score:3, Interesting)
Going from Perl to Erlang, eh? (Score:3, Informative)
TFA more or less says that IMDB is switching from Perl to Erlang. So I looked at the link and here's what I got:
(From here [computerjobs.com])
We are looking for developers with experience building web scale distributed systems. We are currently working in Perl but have plans to use Java, Erlang and any other language that we think will suit our purposes. We aren't looking for expertise in any of those, particularly, but we expect that you will be an expert in the systems you know. We do require that you be passionate about testing (unit, integration, fault-injection) and code quality. Experience with relational databases (Oracle, MySQL, etc), embedded databases (BerkeleyDB, CDB, MonetDB, etc) and Linux are a big plus.
I'll leave anyone to draw his own conclusions.
Facebook chat (Score:3, Interesting)