Random Thoughts

Tamil Nadu Board also "moderates" scores and inflates them to grab BCom seats in DU

posted Feb 12, 2017, 10:49 PM by Prashant Bhattacharji   [ updated Feb 12, 2017, 10:51 PM ]

Nice data presentation at ReportBee.com

Score distribution in Tamil Nadu Class 12 HSC Examination (Board exams of 2016).
This is the story of how Tamil Nadu out-gamed CBSE at their own game in bagging DU BCom seats at premier colleges like SRCC.
Check the shape of the Commerce/Accounts distributions vs Chemistry, CS.
Early spike I can understand - that's just before the pass mark. This is how we inflate away our bad news.
Then we will complain of "cultural academic bias" when we rank 70 out of 71 in a basic elementary school test like PISA.
And our NGO/Activists will write ramblings about how not to benchmark ourselves using "stress-causing" tests.
I am okay with liberal grading and assessment on a non-overloaded syllabus - just that the numbers shouldn't be edited and inflated arbitrarily, otherwise there's no good way to be sure what the thermometer is telling us.

[Chart: score distribution, Computer Science]

Padma Vibhushan for Murali Manohar Joshi: A very well deserved award

posted Feb 9, 2017, 12:18 AM by Prashant Bhattacharji   [ updated Feb 9, 2017, 12:31 AM ]

Murali Manohar Joshi might have had his share of some silly ideas for sure (Astrology) but few people contributed to UG capacity creation in the 2000s the way he did though he drew much flak for the hurry. And everyone has their own fair share of bad or silly ideas - we should look at their broader contributions.
He awarded deemed university status, liberalized the regime and allowed the private sector to open up capacity creation for tens of thousands of students who probably don't remember to thank him - SRM, VIT, DAIICT, Amity, Amrita, Sharda, Nirma, BITS Pilani Goa/Hyderabad campuses etc.
Many players were terribly substandard. But when even our top universities have extremely questionable quality most of the time, that's a very different issue. Even a poor quality degree can give placebo confidence versus sending a teenager into panic mode.
Tens of thousands of students from these universities went for their MS etc. to good US universities and benefited from jobs which they otherwise wouldn't have landed.
Even the IT services and startup expansion, and the office expansion of product biggies like MS, Google and Oracle post 2007 or so, wouldn't have been possible without the critical mass of workforce from universities created in his regime. This was one award very well deserved. But now there's a problem. Players set up at that time have a monopoly and no incentive to compete or improve because the liberalised regime went with him.
Sometimes recall matters more than precision.

This is a very well deserved Padma award.

In support of Op-India and Rahul Raj

posted Feb 4, 2017, 10:43 PM by Prashant Bhattacharji

Since 2013 or so, thanks to a traffic build-up on this site, I ended up drawing the attention of journalists from several media houses.
For the first month or two I didn't mind the attention. In fact, I quite liked it for the sudden traffic surge it brought to my site.
Low integrity news-traders is an apt description for most of the journalists I dealt with. 
I was wondering whether I should name the media houses but I probably should - otherwise I end up throwing everyone under the same bus. 
On several occasions reporters from TOI, Hindustan Times and CNN-News18 have either got in touch with me about data on my site, or covered it without any kind of verification and without speaking to me. Most of the time I make it quite clear - use whatever you like, but make sure to give the reference link. They don't.
The ones who were honest enough to credibly cite the source and links were DNA, FirstPost and News Minute (slightly lesser known than the first group).
None of this is a big deal to me but it is just a good reflection of the integrity levels of the journalists.

The UPSC Exam is not just an inefficient process: It is statistically flawed and biased

posted Jan 18, 2017, 10:21 PM by Prashant Bhattacharji   [ updated Jan 18, 2017, 11:20 PM ]

Poor quality of exam design in India 

Very little attention is paid to the design of examinations and entrance criteria in India despite the fact that these exams have a huge economic cost as they take up a significant portion of preparation and study time in a young person's life. At the individual level, it boils down to an issue of fairness. There is also the issue of optimizing the exam so that testing is done at a rigorous or challenging level without leading to an energy drain (as is often the case with the JEE).
This holds true in different ways for different exams. It could be brazen cheating in the board exams (which gave us our celebrity toppers from Bihar), or institutional level manipulation of scores as in the CBSE or ICSE examinations, or the IIT JEE examination, which is conducted with fairness but encourages an ecosystem of questionable academic pressure.
Certain kinds of bias are unavoidable. For example: a product of the CBSE/ISC system has a significantly better shot in an entrance exam after Class 12 and that is a function of his/her access to higher quality schooling which led to better academic preparedness than a state board student. The geographical distribution of those who qualify the JEE is an indicator that the exam currently requires a little too much of third party help. Analysis of CBSE scores shows blatant and selective inflation of Delhi scores, presumably to facilitate their admission to DU.
Poor attention to the statistical aspects of the scoring process leads to disastrous consequences in the long term.
For example, grade distributions across elective subjects should not be dramatically different. With inflated scores in a few subjects like Physical Education, you find a sudden rise in the number of Class 12 students choosing it over subjects like Economics where scores aren't awarded so generously. Consequences of unnoticed statistics explode in the long term. From less than 1 percent in 2004, the share of students scoring 90 and above rose to 8% in the 2016 CBSE Class 12 exam, also sending all other boards into a disastrous spiral of grade inflation and dilution. One in 8 students scored 10/10 in the CBSE Class 10 exam in 2016.

Where exactly does the UPSC go wrong

In 2016, when the whole JNU fiasco took place, one comment I repeatedly saw bandied about by those rushing to the defence of JNU was the number of IAS officers it contributed to the country.
This isn't to take a dig at JNU - I didn't think much of the government overreaction and crackdown on a band of ruffians, but that isn't the point of focus here.
The question which came to my mind is this: it was odd that JNU should be a front-runner in the UPSC process. Unlike IITs or IIMs or top DU colleges, JNU is not a standard destination for those with a good academic record. Why was the UPSC exam throwing up such a distribution?

A few of my batchmates took the UPSC and I remember some of them saying that they dare not opt for Physics or Engineering as their elective subject as those are extremely tough.
Clearly, success at the exam was a function of which elective subject folk opted for and that is quite evident in the list over here. So apart from some compulsory subject(s) X, one candidate opts for elective subject B and another opts for elective subject C.

How does UPSC award scores in the elective subject component? Physics, History, Literature etc. are not just different subjects with different script markers - they also have very different student populations appearing for them. This last part, about the differing student populations, is something which I am starting to suspect is wilfully ignored and hidden from scrutiny by the group which dominates the exam and also controls the narrative.

I couldn't find anything clear or transparent about what kind of statistical process the UPSC was using. This is not an easy problem to solve for multiple reasons. 
A fairly complex curve anchoring process is required to assign ranks in a linear list to students who have different subjects. 
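
To make the scaling problem concrete, here is a minimal sketch of the simplest possible approach: standardizing marks within each elective before merging everyone into one list. This is not the UPSC's method (which, as noted, I couldn't find documented); the function name, the subjects and the marks are all made up for illustration. Note its limit: it treats the populations taking each subject as equally strong, which is exactly the assumption questioned above, so a real process would need some anchor (for instance, the compulsory papers).

```python
from statistics import mean, pstdev

def standardize_by_subject(raw_scores):
    """raw_scores maps subject -> {candidate: raw mark}.
    Returns {candidate: z-score within that candidate's own subject}."""
    scaled = {}
    for subject, marks in raw_scores.items():
        mu = mean(marks.values())
        sigma = pstdev(marks.values())
        for candidate, mark in marks.items():
            # If everyone in a subject scored the same there is no spread to scale by.
            scaled[candidate] = 0.0 if sigma == 0 else (mark - mu) / sigma
    return scaled

# Hypothetical marks: History is marked 30 points more generously than Physics.
raw = {
    "Physics": {"A": 40, "B": 50, "C": 60},
    "History": {"D": 70, "E": 80, "F": 90},
}
scaled = standardize_by_subject(raw)
# One linear list across subjects, best standardized score first.
ranking = sorted(scaled, key=scaled.get, reverse=True)
print(ranking)
```

On raw marks every History candidate would outrank every Physics candidate; after standardizing, the top candidate of each subject ends up level.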

The Great RTI Law and Smriti Irani's marksheet

posted Jan 18, 2017, 12:05 AM by Prashant Bhattacharji   [ updated Jan 18, 2017, 12:26 AM ]

I don't think much of the RTI.
All sorts of trivia and frivolous information like this gets unearthed in a very inefficient and costly process on both sides via RTI.
MMS got Bose speakers as gift and what not. While real data and info like updates which the RBI should give out, are hidden from public scrutiny.
The RTI is actually a horrible law passed by the UPA and loved by the NDA and promotes information hiding more than anything else.
Most of the data requested via RTI falls under very predictable categories (school results, roads under construction, property ownership, government scholarship schemes, contracts) and should be out in the public domain in any case but I guess it has been deliberately hidden from pesky folk like me who will try to pry into it to dig up dirt :)
Can you imagine, people get murdered for filing RTIs and then they get hailed as activists who "gave their life" for information and data which was supposed to be public anyway. And then the government will release data after they were killed!
You won't find a single good critique of this law by Indian academics who sit and do some worthless hocus-pocus research in the name of public policy research whether it is DU or JNU or IIT or IIM. That does not surprise me one bit because they will all be subject to the kind of scrutiny which will make them extremely uncomfortable. Academics go to all sorts of extremes to hide their data.
Someone suspected an issue in IIT admissions and requested their JEE data, and IIT provided him a hard copy of several hundred printed pages of their data to reduce all chances of his processing it.
Similarly there was some case in which a student got grades after his class 10 CBSE exam, but his parents wanted to see the underlying marks and used RTI for it, and then the board sent a marksheet without any marks because the RTI only asked for the marksheet.
Repeal this law and come up with a new law around open public data and public discourse. That'll be far more efficient. Or, modify the current law by identifying the most common classes of information currently sought and keep that data, anonymized if required, open to the public. Statisticians and researchers might be able to deliver useful insights from this.

Girls Quota at IIT: Immediate fixes required to the exam itself

posted Jan 16, 2017, 8:52 PM by Prashant Bhattacharji   [ updated Jan 16, 2017, 9:19 PM ]

This is the first time any government has shown a genuine interest in solving a very real problem. We need to appreciate this part.
They're trying to push a 20% horizontal quota for girls. In a case like this, I prefer some hurried action to none at all.
Only thing is, how do you go about this? They should think through the consequences of superimposing a horizontal quota route on a maze of both horizontal and vertical quotas which is already very complicated.

It is high time that a critical analysis be done of the statistical distribution of the JEE scores. There is generally an extreme bunching just around the cut which is often quite low (40 percent). A bit of toning down of the exam will do no harm and will possibly shift the curve rightwards, result in a better spread, make the process less brutal and leave students with more energy at the end of the day. The coaching classes exist because of the nature of the JEE.

Operational issues with implementing a quota

This will give fractional seats under some headers and categories. This will also mess up all sorts of edge cases: What if 6 girls are getting BioTech at IIT-X and the quota has only 3 seats specified?
What if, for whatever reason, the exam naturally throws up 25% girls in the merit list?
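
The fractional seat issue is easy to see with a little arithmetic. A rough sketch, assuming a flat 20% applied to each programme/category cell; the cell names and seat counts below are invented for illustration and are not actual IIT numbers.

```python
from fractions import Fraction

def quota_seats(total_seats, quota=Fraction(20, 100)):
    """Exact (possibly fractional) number of seats a horizontal quota implies."""
    return total_seats * quota

# Hypothetical seat counts for one programme, split by vertical category.
cells = {"BioTech-General": 7, "BioTech-OBC": 4, "BioTech-SC": 2}
for cell, seats in cells.items():
    exact = quota_seats(seats)
    # Any cell whose seat count is not a multiple of 5 gives a fractional entitlement.
    print(cell, seats, exact, float(exact))
```

Whether such fractions get rounded up, rounded down or pooled across cells is a policy choice which someone has to spell out.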

The Short Term Fix and the current biases in the IIT JEE
The easier way in the short term is to do something simple like bump up all girls by X marks till the point that their selection rate balances out with the boys, or reaches a certain target %. But that is only a temporary hack masking some serious flaws in the process.
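
As a rough sketch of what "bump up all girls by X marks" would mean operationally: search for the smallest X at which girls fill the target share of seats. The marks, the seat count and the function names are all hypothetical, and real ranking and tie-break rules would be more involved than this.

```python
def girls_selected(boys, girls, bump, seats):
    """Count girls in the top `seats` ranks after adding `bump` marks to every girl."""
    pool = [(mark, "B") for mark in boys] + [(mark + bump, "G") for mark in girls]
    # Highest marks first. Python's sort is stable, so on a tie the boy (listed first) stays ahead.
    pool.sort(key=lambda entry: entry[0], reverse=True)
    return sum(1 for _, group in pool[:seats] if group == "G")

def smallest_bump(boys, girls, seats, target_percent, max_bump=100):
    """Smallest whole-number bump at which girls fill at least target_percent of the seats."""
    for bump in range(max_bump + 1):
        if girls_selected(boys, girls, bump, seats) * 100 >= target_percent * seats:
            return bump
    return None  # target not reachable within max_bump

# Hypothetical marks for a tiny pool of 9 candidates competing for 5 seats.
boys = [90, 85, 80, 75, 70, 65]
girls = [72, 68, 60]
bump = smallest_bump(boys, girls, seats=5, target_percent=40)
print(bump)
```

Even in this toy pool the required bump depends entirely on where the scores bunch up around the cut, which is why it is only a hack.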

The JEE process does have certain stark but avoidable biases. There are also some unavoidable biases - e.g. someone once wrote a piece about CBSE/ISC students having 10x the success rate of local boards. That you can't do much about, because it's a function of academic preparedness at the end of Class 12, which central boards, Andhra, WB etc. possibly do a far better job with, and of the socio-economic strata of central boards.
But when the current pool sees an over-representation of specific schools, a few big cities (Delhi, Hyderabad, Jaipur, Lucknow) or areas with access to expensive classes (Vizag, Kota) there is indeed a real problem. And that is the need for extra classes or guidance or concentration camps for preparation.

There are indeed some very real problems with using a quota route for an exam where there are some genuine underlying problems: you could inject unfair bias against an unintended group. For example, Kerala has very little representation geographically. So by having a girls quota you could end up making the process biased in favor of (say) Hyderabad or Delhi girls over Kerala or North-East boys.

Saying that the girls' seats are "extra" doesn't change anything - there is no point pretending that the seats are 100 when they are really 120. The noticeably biased trends of the JEE lie in the nature of the exam itself.
Apart from the burnout the current process causes, this extra class affair is problematic because it implicitly creates privately guarded gates to a taxpayer funded institution.
Now you have the absurdity of folk making a killing out of what is literally a child abuse process by advertising classes for Class 6-8 and what not.

How can the JEE be upgraded?

Simple fixes to make this process less inefficient and unfair without compromising on the strong point of the JEE (forcing people to think and apply first principles):
- remove this communist regime level of secrecy. Right now folk need to go to these coaches to get an idea of what needs to be studied. Why not just upload a 500-1000 page book with the required topics and problems and let folk download it and print it? Half the need for middlemen goes right there.
- length of the question paper. Is it really necessary to have so many questions that even those making the cut attempt barely half of it? This creates a requirement for all sorts of tips and tricks and strategy (and again a role for middlemen). And none of these are likely to be academically desirable traits. What serious engineering or research gets done by super-quick thinking?
- From what I remember Math/Physics part of JEE was quite reasonable in terms of coverage but I remember the Chemistry part requiring absurd levels of stuff to be memorized. Again, this is wasted academic energy for no good reason.
- Not sure if this has changed, but in this kind of an era Math could do with a few more topics covering basic stats, some form of algorithmic thinking (basic graph theory)
- Language! Though this probably has some political background, how can you not have something as fundamental as language skills and vocabulary tested? Far more important than some weird chemical compound's configuration. Doesn't necessarily have to be an English test.
- While there are huge issues involved in exams requiring narrative or descriptive questions, MCQs have certain limitations. Why not just mark 10 of those MCQs as problems for which candidates clearly need to show their working apart from ticking the correct option? The top 15k can then have those hand written sections manually graded. Along with an essay if possible.
- Proofs. There are ways in which proofs may be reduced to objectively gradable questions. Even if the questions are simple, proof techniques are an important aspect of scientific education and should be evaluated. Any paper setters reading this - do check the automata course on Coursera.
- Just because a topic is included in the JEE does not mean that you should test outlandish material based on it. By all means, have challenging questions and a few olympiad level questions. But the overall difficulty of the question paper should be a couple of notches more than Class 12 and it should be possible to at least attempt most of it within the time limit. As of now, a kid needs to behave like a real-time calculator to compute answers like a robot.
- Maybe it is time to have a cheat sheet with essential formulas provided along with the question paper? One shouldn't have to recall some one-off formula for a tangent or normal at a particular point on a parabola.

All for the better

Once some turning and tuning of the knobs is done, the necessary demystification is done, and the exam comes down to a point where some intense self-study for 12-18 months is all that is required, everyone will stand to benefit.
- those who get in won't be so exhausted
- those who haven't got in will have less invested in the process and can pick up and move forward without prolonged dejection
- everyone has more academic and non-academic energy left
- the process will become more accessible and naturally throw up better representations in terms of geography, gender, state boards etc.
- the process will also pull in more kids from higher end schools with better exposure who currently don't see a good reason to grind themselves.
- it might indeed be the case that the students from very similar backgrounds and profiles continue to enter the system, but that is fine as long as they come in with better attributes
- the strong point of the JEE is, that it manages to pick the Olympiad winners of IOI, IMO, InPhO, InChO etc. Nothing in my proposal changes its ability to do so. It will perhaps leave them less tired as well.

And while this may or may not detect a well-rounded student (and it need not), you are likely to have a well-rounded and energetic student body.
Of course, this might not be desirable to those involved with the IIT JEE if they have a nexus with the coaching shops themselves, which I am increasingly suspicious is the case.
Why else would IIT Directors keep complaining about students who can't write a sentence while conducting a purely MCQ driven exam?

The Brazen Unfairness of DU's (earlier) cut-off process

posted Jan 16, 2017, 5:31 AM by Prashant Bhattacharji   [ updated Jan 17, 2017, 2:17 AM ]

Cut-offs for colleges under DU have been sky-rocketing and they're now contemplating moving to an entrance exam.
The bizarre part is that all these years, DU has been comparing and equating raw scores across different boards. So 90% in CBSE is the same as 90% in ISC and 90% in UP Board and 90% in Tamil Nadu board.
The ridiculousness is not too difficult to spot. All of these are different exams. 90% will barely put you in the top 10-15% in CBSE/ISC, it might not even put you in the top quartile of TN board, but it could very well make you the topper in low scoring UP and Bihar boards.
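
A minimal sketch of the kind of re-scaling this calls for: compare a student's percentile within her own board rather than the raw percentage. The two score lists are invented purely to show the effect, and `percentile_rank` is a hypothetical helper, not something DU or any board is known to use.

```python
from bisect import bisect_left

def percentile_rank(score, board_scores):
    """Percentage of students in the board who scored strictly below `score`."""
    ordered = sorted(board_scores)
    return 100.0 * bisect_left(ordered, score) / len(ordered)

# Hypothetical boards: one awards marks generously, the other does not.
high_scoring_board = [70, 80, 85, 90, 92, 94, 95, 96, 98, 99]
low_scoring_board = [35, 40, 45, 50, 55, 60, 65, 70, 80, 90]

# The same raw 90 means very different things in the two boards.
print(percentile_rank(90, high_scoring_board))
print(percentile_rank(90, low_scoring_board))
```

Percentiles have their own problem (they assume the boards have equally strong student populations), but they at least stop one board from out-inflating another.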
Let me tell you how this process worked quite well for DU all these years. DU is important because there are very limited non-STEM options in India.
From one end: DU deliberately remains naive enough to compare _raw_ scores across boards without any kind of re-scaling.
And the other end: CBSE, despite the pretence of being a national board, is dominated by members from Delhi. It inflates everyone's scores (so that no one complains). But if you dig into the data you can see that it inflates Delhi scores and those of a few other influential schools significantly more than others. So the kid from a not-so-influential CBSE school in WB or Bihar has the deck stacked against her even before she starts to write the exam.
All of a sudden in 2015 and 2016, kids from TN and Hyderabad gatecrash this unbelievable party by using the same gaping hole which DU had left open for Delhi kids. South Indian boards are still a few steps ahead of the central boards in this spiral of score inflation.
So now the whole fuss is being kicked up about kids from TN bagging nearly all the premier Commerce seats in 2016.
In reality, the process was always unfair. Only 8 boards sent anything more than a single student to SRCC in 2015 and it's not like others didn't apply.
It is just that an un-intended set of beneficiaries walked in through this cozy little arrangement in the last 2-3 years.
Gaming at a totally different level.

SRCC Entrants in 2015. Only 8 boards sent a single student or more.

How the seats went, on a board wise basis. 

Board                              Seats
CBSE                               550
Tamil Nadu                         100
ISC                                90
Telangana                          15
Karnataka                          8
AP                                 5
Kerala                             5
All other boards (mostly foreign)  8

The RTI and its regime of concealing data: Amend this law

posted Jan 12, 2017, 8:10 PM by Prashant Bhattacharji

A rather dangerous law which gets a lot of love for no good reason is the RTI, a great objective but a horrible implementation passed by the Congress and now quite clearly loved by the BJP -- a party which doesn't have the intellectual capacity to think of any significant legislation, let alone its implementation, and so is like a dog who caught a car with 282+.
I have seen the stats somewhere. Something like 60-70% of the RTI requests fall into the category of (a) school marksheets (b) property ownership (c) roads which are being maintained and constructed (d) eligibility for government schemes (I suspect scholarships).
Then there's the long tail (the rest).
Here's the fun part: ALL of those four should be online and available anyway unless you're living in a communist "don't ask don't tell" regime.
Most absurd is this business of celebrating "RTI activists". Some of them even end up losing their lives in the process, for trying to access information meant to be public anyway. That happens while trying to get information about government tenders and contracts (again - that should be online by default). And then the government releases info AFTER they get killed.
One should not have to make a non-anonymous request for data which should be publicly available. Now is a good time to see this version of information flow in action.
Just look at how RBI is conveniently rationing its released statistics for an operation where there should have been public dashboards from the very first day.

A most brazen case of data withholding is the one cited here where IITs tried to offer "hard copy" printouts of JEE admission data, making it nearly impossible to process data tabulated for thousands of students.

IIT Bombay to start an Economics program - Keep it accessible to Commerce, Humanities students

posted Jan 8, 2017, 4:22 AM by Prashant Bhattacharji

There's a good as well as bad side to this.
Seems like Eco programs at IITs have become popular (plenty of people like me who don't want anything to do with heavy engineering stuff like workshops and motors - thankfully I had to tolerate a limited amount of that stuff)
And with better access to Math, Stats, Finance, ML, management, programming courses (and some tech background) they will soon be more "in demand" than the traditional BA Eco programs at DU/Mumbai which currently have a very theoretical and qualitative version of eco. Employers and grad schools will see this closer to the quantitative and computing heavy programs at Oxford, Stanford etc.
But that is where things start to become a bit unfair to (say) commerce or humanities students who could very much benefit from this, but their options will be restricted to what will eventually become the second-tier BA Eco courses.
This wouldn't matter if IITs were private (like BITS) but as long as that isn't the case and they're built on taxpayer money, they need to make sure access isn't shut off to huge sections of the student population for arbitrary reasons.
Maybe just select the kids on the Math part?
Most of the posh ISC schools have stopped running their Humanities section altogether other than those with >200 students a batch.

"IIT-B, however, will not be the first to run an undergraduate course in economics as IIT Kharagpur and IIT Kanpur already run a similar programme."

Facebook recommends a ton of Mathematics courses to future AI professionals

posted Dec 13, 2016, 5:52 AM by Prashant Bhattacharji   [ updated Dec 13, 2016, 5:53 AM ]

Facebook has advised future AI enthusiasts to study calculus, linear algebra, probability and statistics.
This is excellent advice. A lot of people think that programming and coding is what prepares them for a career in AI. It doesn't. 
Programming and coding is just the medium by which you put those highly mathematical ideas into action and application. 
Computer Vision, Image Analysis - these are full of vector calculus, statistics, linear algebra. 
Problems like image segmentation rely heavily on concepts from graph theory.
Natural Language Processing - again a very statistical field.
Machine Learning - probability and statistics, linear algebra.
For time series forecasting it might help to know about Fourier analysis, FFT, regression based prediction, etc. The backpropagation algorithm in neural networks requires an understanding of calculus.
There is no escaping mathematics in the field of AI. 
AI is simply Mathematics brought into action via Computing - which requires programming, data structures, algorithms, systems and concurrent programming. 
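
To show what "requires an understanding of calculus" means in practice, here is a small sketch of the chain rule for a single sigmoid neuron with squared error, checked against a numerical derivative. The weights, the input and the loss are arbitrary choices for illustration; real backpropagation repeats this same step layer by layer.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, x, y):
    """Squared error of one sigmoid neuron: L = 0.5 * (a - y)^2, with a = sigmoid(w*x + b)."""
    return 0.5 * (sigmoid(w * x + b) - y) ** 2

def gradients(w, b, x, y):
    """Chain rule: dL/dw = (a - y) * a * (1 - a) * x and dL/db = (a - y) * a * (1 - a)."""
    a = sigmoid(w * x + b)
    delta = (a - y) * a * (1.0 - a)  # dL/da times da/dz
    return delta * x, delta

# Arbitrary example values.
w, b, x, y = 0.5, -0.2, 1.5, 1.0
dw, db = gradients(w, b, x, y)

# Central-difference check: the programming is trivial, the derivative is the real content.
eps = 1e-6
dw_numeric = (loss(w + eps, b, x, y) - loss(w - eps, b, x, y)) / (2 * eps)
db_numeric = (loss(w, b + eps, x, y) - loss(w, b - eps, x, y)) / (2 * eps)
print(dw, dw_numeric)
```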
