Saturday, August 21, 2004

Friday, August 20, 2004

Why Java isn't cool : triggered by "Great Hackers"

Paul Graham's Great Hackers essay has really touched a lot of people's nerves. The wires are choked with people giving their point of view -

[stolen shamelessly from Erik's LinkBlog]
And of course, no one puts it quite like him, whether you agree with him or not.

Yet again, though, I have had to stop and think - what is it about Java that makes people brand it as the most un-cool language on earth? I have had friends look at me like I was a poor sod for "having to" develop in Java. So, let me list all the reasons I can think why people consider Java un-cool.

Java has considerably fewer surprises and prefers not to add complexity to the language for rarely used features. The result is a language in which you cannot really make your friends go ga-ga at amazingly brief programming constructs. You need to write something substantial [like Gosling's Huckster] for them to be impressed with your programming abilities and not your language knowledge. This is probably the biggest reason Java is un-cool. It's too easy (although programming or software development remains as tough as ever).
Java was always touted as the language that the "average" IT programmer can use. It's such a language-for-the-masses that yet again, it fails the "geek" test. And if you use Java, so do you.

Java has been considered slow for ages. The early allegations (1995) were true. However, with the recent advances in the JVMs from Sun and IBM, Java runs pretty close to C/C++. Check this benchmark. On the other hand, there are other benchmarks that show Java to be slower. All things considered, it would be fair to say that Java cannot be considered "slow" anymore, yet it's stuck with the label. How cool is it to be the jock with the second-fastest race car on the block?

Swing disasters continue to give Java a bad name. Swing is a brilliant, although hard to learn, API. But the vast majority of Swing applications are so bad that they give Swing and therefore Java a bad name.

Java is a strongly typed language, so you have to tell the compiler exactly what you intend to use. And if you make a mistake in the way you use it, the compiler has the guts to tell you that you were wrong. Too much chaperoning?

Java has a vast library that is available to all Java developers without any ambiguity. Thus, if you wrote yet another Map you would not be considered a data structures guru by Java programmers but a guy who hasn't heard of java.util.*.

Java did not have a good IDE that compared with MS Visual Studio. I think this one was true. I am not so sure it is any more with IntelliJ. The absence of good tools probably pushed away a lot of good programmers.

Java is popular. Anything that is popular has lost its elite status and therefore is not cool.

Java is an application programming platform. You cannot do cool things like device drivers, and until recently you could not do games either (though Java gaming is coming in a big way).

On a different note take a look at these two projects - I like the direction they are heading in.

Saturday, August 14, 2004

Teaching Object Oriented Programming

I have been training someone in Java over the last couple of weeks. This is my first serious experience teaching programming at a one-on-one level. Although I have been asked several times to give training or product demonstrations, those were for an audience that had already bought into the concept and was simply looking for a human user manual. Not so here.

The person I am teaching knows C and can "think in C". He thinks of logic as a set of functions that you pass data into (mostly primitive types) and that operate on that data. I am trying to teach him Java and I am more interested in getting him thinking about objects, design, patterns, etc. I have, however, been having trouble showing him the benefits of OOP (Object Oriented Programming).

OOP has the following advantages:

  • Abstraction
  • Encapsulation
  • Inheritance
  • Polymorphism
Concepts like encapsulation can be grasped quite quickly by anyone who has burnt his hands with global data. But consider abstraction - this is not a concept that can be explained to or understood easily by someone who hasn't had to maintain and change the same software over a period of several years. Why should a user be represented by a User object and not a String userName? Why should there be a factory pattern used? The answers to most of these problems go like ... "If your specifications change in the future to include this feature also then designing it this way...blah blah blah". It is quite hard for someone whose programs have so far had a shelf life of a few laboratory assignment days to appreciate change management and the power of well designed Object Oriented Software.
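The User-versus-String question can be made a little more concrete with a small sketch. Everything here is hypothetical (the `User` and `Greeter` classes, the `email` field that arrives "later"); it only illustrates the claim that callers written against an object survive a specification change that callers written against a bare String would not.

```java
// Hypothetical example: callers depend on User, not on how a user is stored.
final class User {
    private final String userName;
    private final String email; // imagine this was added in a later release

    User(String userName, String email) {
        this.userName = userName;
        this.email = email;
    }

    String displayName() {
        return userName;
    }

    String contact() {
        return email;
    }
}

final class Greeter {
    // Takes a User rather than a String userName, so adding fields to User
    // never changes this signature or any of its callers.
    static String greet(User user) {
        return "Hello, " + user.displayName();
    }
}
```

Had `greet` taken a `String userName`, every new piece of user data would have meant another parameter at every call site.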

The advantages of compile-time polymorphism (function overloading) are quite obvious. Perhaps runtime polymorphism can also be understood without too much trouble. It is, however, abstraction that is the core of well-designed software. Too much or too little breaks a design completely. So, explaining abstraction continues to remain a challenge.
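For comparison, here is a minimal sketch of the two kinds of polymorphism side by side. The `Shape`, `Circle`, `Square` and `Describe` names are made up for illustration and are not from any real project.

```java
abstract class Shape {
    // Runtime polymorphism: which area() runs depends on the object's actual class.
    abstract double area();
}

final class Circle extends Shape {
    private final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    @Override
    double area() {
        return Math.PI * radius * radius;
    }
}

final class Square extends Shape {
    private final double side;

    Square(double side) {
        this.side = side;
    }

    @Override
    double area() {
        return side * side;
    }
}

final class Describe {
    // Compile-time polymorphism: the compiler picks an overload
    // from the declared type of the argument.
    static String describe(int value) {
        return "int " + value;
    }

    static String describe(String value) {
        return "String " + value;
    }

    // Works unchanged for any Shape subclass, including ones written later.
    static double totalArea(Shape[] shapes) {
        double total = 0;
        for (Shape shape : shapes) {
            total += shape.area();
        }
        return total;
    }
}
```

A C programmer would probably write `totalArea` as a switch on a type tag; the point of the sketch is that the switch disappears into the objects.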

Proficient C programmers have their own way of making abstractions in C because they realize the value of abstractions to the human mind. I am not criticizing the C language. It is amongst the most expressive of computer languages ever devised. However, writing programs in the OOP paradigm using C is not something that comes naturally nor do most C books train in that manner. "Thinking" the OOP way is hard because one has to start thinking of <object>.operation(<params>) versus operation(<params>) that one is used to and this becomes a reasonably sized challenge for some.

I think there is a link somewhere between proficiency in C and the ability to appreciate an OOP language. Once one has written programs in C that stretch one's mental programming capabilities, where the complexities tend to cause that mist which makes one feel that things have gone horribly wrong, one starts to change one's way of programming (even in C). The result is an appreciation of the advantages that "C with classes" (C++, Java) brings.

Monday, August 02, 2004

Experiences In Java Performance Engineering

I have been spending copious amounts of time trying to speed up our Web server. The results have been heartening and the experience typical of performance engineering. It's one of the most amazing activities of software development: incredibly frustrating when you cannot speed things up, but an adrenaline rush when you can. The first person I bother when it works is poor Rajiv.

I am trying to log the various performance enhancements - some are just tips and some are things I discovered or used. Some you know, perhaps, some you don't.

    Sockets

  • Socket Writes : The fewer the better. Buffer your socket writes so you never inadvertently write multiple times to the socket. Copying to the buffer is much less expensive than a socket write, especially for small data.
  • Socket Reads : Similarly, read as much as you can from the socket in one shot. Smaller reads will result in poorer performance. These rules are true of file streams as well.
  • Socket Connections : Again, the fewer the better. Creating a connection is extremely expensive. You can serve several tens of times more requests per second on a kept-alive connection than if you have to create a connection for each request.
  • Socket properties : Setting socket properties is expensive, e.g. java.net.Socket.setSoTimeout. Avoid setting socket properties per request; see if you can set them once per connection instead [covering multiple requests in a kept-alive connection].
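The effect of buffering socket writes can be seen without a real socket. The sketch below is only an illustration: `CountingOutputStream` is a made-up stand-in for the stream returned by `Socket.getOutputStream()`, and the three-piece response is invented. It counts how many writes actually reach the underlying stream with and without a `BufferedOutputStream` in front of it.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a socket's output stream; it only counts calls.
final class CountingOutputStream extends OutputStream {
    int writeCalls = 0;
    long bytesWritten = 0;

    @Override
    public void write(int b) {
        writeCalls++;
        bytesWritten++;
    }

    @Override
    public void write(byte[] buf, int off, int len) {
        writeCalls++;
        bytesWritten += len;
    }
}

final class BufferedWriteDemo {
    // Writes a small response in three separate pieces, as handler code often does.
    static void writeResponse(OutputStream out) throws IOException {
        out.write("HTTP/1.1 200 OK\r\n".getBytes(StandardCharsets.US_ASCII));
        out.write("Content-Length: 2\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
        out.write("ok".getBytes(StandardCharsets.US_ASCII));
        out.flush(); // one flush per response pushes the buffer to the stream
    }

    // Returns how many writes reached the underlying stream.
    static int countWrites(boolean buffered) {
        CountingOutputStream raw = new CountingOutputStream();
        OutputStream out = buffered ? new BufferedOutputStream(raw, 8192) : raw;
        try {
            writeResponse(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return raw.writeCalls;
    }
}
```

With a real connection the same idea would be `new BufferedOutputStream(socket.getOutputStream())`, created once per connection and flushed once per response; the 8192-byte buffer size is an arbitrary choice here.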

    Data Copies

  • String Operations : If you are writing extremely performance-sensitive code, such as a highly benchmarked web server :-), then keep Strings to a minimum. I cannot stress this enough. If you have a large character stream that you need to tokenize, parse, etc., use references into character arrays to reduce String operations. Create a class that holds a reference to the character array, a start offset and a length, and you can create lots of instances of this class to point to pieces of the character stream instead of String objects, which make copies of the characters. String concatenations can be quite expensive too.
  • Avoid data copies : This is obvious, isn't it? However, there are so many methods that try to be safe and make data copies that it happens without one even knowing it. Examples are the String constructors and ByteArrayOutputStream.toByteArray(). Don't get me wrong - there's nothing wrong with these classes; it's just that sometimes one doesn't realize that these methods are causing data copies, which affects performance.
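A rough sketch of the character-array idea follows. `CharSlice` and `RequestLineParser` are hypothetical names, not classes from our server; the slice only records the array, an offset and a length, and copies characters only if `toString()` is called.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: a view onto part of a shared char[] that copies nothing.
final class CharSlice implements CharSequence {
    private final char[] data;
    private final int start;
    private final int length;

    CharSlice(char[] data, int start, int length) {
        if (start < 0 || length < 0 || start > data.length - length) {
            throw new IndexOutOfBoundsException("start=" + start + ", length=" + length);
        }
        this.data = data;
        this.start = start;
        this.length = length;
    }

    @Override
    public int length() {
        return length;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index=" + index);
        }
        return data[start + index];
    }

    @Override
    public CharSequence subSequence(int from, int to) {
        if (from < 0 || to > length || from > to) {
            throw new IndexOutOfBoundsException("from=" + from + ", to=" + to);
        }
        return new CharSlice(data, start + from, to - from); // still no copy
    }

    // Compares against another sequence without building a String.
    boolean contentEquals(CharSequence other) {
        if (other.length() != length) {
            return false;
        }
        for (int i = 0; i < length; i++) {
            if (data[start + i] != other.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    // The only place characters are copied.
    @Override
    public String toString() {
        return new String(data, start, length);
    }
}

final class RequestLineParser {
    // Splits on spaces; every returned slice shares the input array.
    static List<CharSlice> split(char[] line) {
        List<CharSlice> tokens = new ArrayList<>();
        int tokenStart = 0;
        for (int i = 0; i <= line.length; i++) {
            if (i == line.length || line[i] == ' ') {
                if (i > tokenStart) {
                    tokens.add(new CharSlice(line, tokenStart, i - tokenStart));
                }
                tokenStart = i + 1;
            }
        }
        return tokens;
    }
}
```

The trade-off is that the slices are only valid while the underlying array is left untouched, so this suits per-request parsing better than long-lived data.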

    Data Structures

  • Set instead of List : A common programming mistake is choosing the wrong data structure for a contains() check. List.contains() is an O(n) operation, whereas HashSet.contains() is O(1) on average. So, if order does not matter to you, use a Set.
  • Object Pooling : Some objects are expensive to create, e.g. large arrays. Use pools of these objects so they can be reused, thereby avoiding creation costs (primarily) and GC lags for them. Interestingly, PrintWriter, if pooled, shows considerable performance improvement. The creation of the object is expensive because of a call to get the line separator in its constructor.
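A pool for large arrays can be very small. The following is a minimal sketch, not the pool we actually use: `BufferPool`, its size limits and its method names are all assumptions made for the example.

```java
import java.util.ArrayDeque;

// Hypothetical fixed-size pool of byte[] buffers.
final class BufferPool {
    private final ArrayDeque<byte[]> free = new ArrayDeque<>();
    private final int bufferSize;
    private final int maxPooled;

    BufferPool(int bufferSize, int maxPooled) {
        if (bufferSize < 1 || maxPooled < 0) {
            throw new IllegalArgumentException("bad pool dimensions");
        }
        this.bufferSize = bufferSize;
        this.maxPooled = maxPooled;
    }

    // Reuses a pooled buffer when one is available, otherwise allocates a new one.
    // Reused buffers are NOT cleared; callers must track how much they wrote.
    synchronized byte[] acquire() {
        byte[] buffer = free.pollFirst();
        return buffer != null ? buffer : new byte[bufferSize];
    }

    // Returns a buffer to the pool; wrong-sized or surplus buffers are left to the GC.
    synchronized void release(byte[] buffer) {
        if (buffer != null && buffer.length == bufferSize && free.size() < maxPooled) {
            free.addFirst(buffer);
        }
    }

    synchronized int pooledCount() {
        return free.size();
    }
}
```

Typical use is `acquire()` at the start of a request and `release()` in a finally block. The sketch does not guard against releasing the same buffer twice, which is the caller's responsibility.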

    Miscellaneous

  • You can buffer at unexpected places. For example, you may have a logger thread that asynchronously logs certain frequent activities (e.g. access logs); notifying the writing thread for every entry will be expensive. It might help considerably to collect a few entries and then notify the thread.
  • Integer.toString() is much much faster than integer + "".
  • try-finallies result in interesting performance degradation in Sun JVMs. Read about it here and Rajiv's excellent follow-up on it here.
  • Lazy instantiation : Don't create something until you need it.
  • The usual optimizations always apply - loop optimizations, moving loop-invariant code out of the loop, unused variables, repetitive processing, etc.
  • Perceived performance : [I would love to collect my thoughts on that one sometime - we have had some very interesting experiences with perceived performance enhancements over the years] The sooner you respond so that the browser starts to refresh [if memory serves me right, a 50-millisecond response is perceived as instantaneous by a human being], the more responsive the web server looks, although the total time for the response might be the same as if all the data were returned in a single write.
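The first miscellaneous tip, collecting a few log entries before notifying the writer thread, might look roughly like the sketch below. `BatchingLog` is a hypothetical class, and the `handOff` callback stands in for whatever enqueues a batch and wakes the real logger thread; the batch size is something you would have to tune.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical batching front end for an asynchronous logger.
final class BatchingLog {
    private final int batchSize;
    private final Consumer<List<String>> handOff; // e.g. enqueue and notify the writer thread
    private List<String> pending = new ArrayList<>();

    BatchingLog(int batchSize, Consumer<List<String>> handOff) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.handOff = handOff;
    }

    // Called on the request path; hands off only once per batchSize entries.
    synchronized void log(String entry) {
        pending.add(entry);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    // Hands off whatever is pending, e.g. from a timer or at shutdown.
    synchronized void flush() {
        if (pending.isEmpty()) {
            return;
        }
        List<String> batch = pending;
        pending = new ArrayList<>();
        handOff.accept(batch);
    }
}
```

The cost is that entries can sit unwritten until the batch fills, so a periodic `flush()` is usually needed as well.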
Oh well, those are all I can think of right now. If I think of any I missed out, I'll post an addendum.