Sunday, September 19, 2010

Java is not a good first language

I’ve found that the only way to get back to blogging is by writing some random posts. After a hiatus, it somehow becomes very difficult to write cohesive posts on a single topic. The words don’t come out as easily, the sentences don’t seem right, and I spend inordinate amounts of time trying to come up with enough material for my post. Finally, unable to come up with something I like, I abandon the post and add it to an ever-growing list of drafts that I never re-visit.

Anyhoo, for today’s random post, I talk about the threat posed by Java schools to the software profession and the threats that human rights activists, perverse secularists, and the UPA pose to this country.

First, Java schools. While there is a long-running debate in linguistics on whether language determines thought (the "linguistic relativity" principle, or the Sapir-Whorf hypothesis), that question is settled in computer science. In the world of programming, language does determine thought. Which is why Stroustrup put in warnings for C programmers to think object-orientedly. Which is why most imperative programmers take refuge in the := and ! operators of SML when they first encounter functional programming. The language you program in makes you think a certain way.

The trouble with teaching Java as the first language is that there is no way teachers can teach the way to think in Java – the object-oriented way – without the students first undergoing a course in procedural languages. As a result, students end up thinking in a way totally unsuited to the language: a class becomes something you create because the language forces you to, and a place where you dump everything your program needs. Objects become mechanisms to get at the members, with no notion of what it means to instantiate something. Access control becomes an unnecessary distraction – if you want a class to access a member in a different class, just make the member public. The signature of "main" – "public static void main(String[] args)" – introduces problems of its own that have been detailed in ACM papers here and here. Beyond these problems, Java abstracts away some of the most critical concepts in computer science: pointers, memory management, the idea that resources are finite, and the dilemmas programmers face in using and implementing data structures.
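To make the "class as a dump" point concrete, here is a caricature – invented for illustration, not taken from any actual student's code – of the style I mean, next to what the object-oriented framing actually asks for. The class names (`MarksDump`, `GradeBook`) are mine:

```java
import java.util.ArrayList;
import java.util.List;

// The caricature: everything public, everything static. The class is not a
// type at all, just a bucket of globally writable state that any other class
// can reach into and mutate.
class MarksDump {
    public static int[] marks = new int[100];
    public static int count = 0;

    public static void add(int mark) {
        marks[count++] = mark;   // no bounds check, no ownership, no invariant
    }
}

// The alternative: state owned by an instance, reachable only through the
// object's own methods, so instantiation actually means something.
class GradeBook {
    private final List<Integer> marks = new ArrayList<>();

    public void add(int mark) {
        marks.add(mark);
    }

    public double average() {
        if (marks.isEmpty()) return 0.0;
        int sum = 0;
        for (int m : marks) sum += m;
        return (double) sum / marks.size();
    }
}
```

The second version is not cleverer code; it is just code written by someone who knows *why* the class exists, which is exactly what a Java-first curriculum struggles to convey.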

The trigger for this rant is a bunch of interviews that my teammates and I have been conducting to hire summer interns. Our target pool is students from the IITs, most with very high GPAs (think 8.8+/10), two years into their B.Tech courses. All of them have done 3-4 academic projects, and some of them internships in places like Adobe. Given their relative inexperience, what we look for are programming basics – choosing data structures, programming recursive algorithms, and some computer science fundamentals in areas like networking, OSes, or OOAD (depending on the student's coursework). To our surprise, we found that while most of the students knew the commonly used data structures, very few could choose the appropriate one for the problem at hand. Still fewer got the implementations right, and most wrote code as though they were in programmer utopia – infinite memory, infinite resources, and nothing bad ever happening in the environment. The bigger problem was that not one bloke I interviewed was able to find and debug problems in his own code.
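The "programmer utopia" failure mode is easy to sketch. The following is a made-up example in the spirit of what I saw (the names `sumRecursive` and `sumIterative` are mine, not from any interview): the recursive version is arithmetically correct, but it assumes an infinite call stack, so for a large enough n it dies with a StackOverflowError – the kind of resource limit Java-first students have never had to think about.

```java
class Utopia {
    // "Utopia" code: correct on paper, but each call adds a stack frame,
    // so sumRecursive(10_000_000) would overflow the default JVM stack.
    static long sumRecursive(long n) {
        return n == 0 ? 0 : n + sumRecursive(n - 1);
    }

    // The resource-aware version: same arithmetic, constant stack space.
    static long sumIterative(long n) {
        long total = 0;
        for (long i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }
}
```

The point of the interview question is never the arithmetic; it is whether the candidate notices that the first version has a resource cost at all.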

There also seems to be an idea gaining ground, particularly amongst the PhD-types who teach courses at the IITs, that programming is secondary to computer science, and that what matters is "researchy" topics like IR, machine learning, large-scale algorithms, and such. At the same time, there is tremendous pressure on the IITs to evolve from teaching institutions into research institutions. Because they cannot attract talent at the graduate level, professors seem to be taking the route of turning the undergraduate program into a research program. Good programming technique and sound software engineering become "implementation detail". And the ability to write good code and tests, and to debug programs, is lost on a whole generation of computer science undergraduates.

Or maybe I’m just being paranoid.

[Postscript: I realized rather late, that this post had become too huge for me to add my second rant. I’ll be posting it separately.]
