I am a computer scientist. I seek problems with impact and solve them using methods from a variety of fields, from software engineering/programming language technologies to machine learning/data mining (I believe in opportunities in data) and qualitative methods (I believe in the people and stories beyond the numbers).
I am currently finishing my PhD thesis at McGill University, while taking a leave of absence from
IBM T.J. Watson Research.
My PhD thesis is about summarization of code examples: the generation of a succinct version of the original code.
First, I used a machine-learning-based approach to highlight important lines of code, based on some easy-to-generate code features. Even with this approach, I can generate summaries comparable to those a human would write. [ESEC/FSE 2013 paper]
Second, using a mix of qualitative and quantitative methods, I studied how humans summarize code examples and elicited a list of summarization practices. [FSE 2014 paper - Winner of an ACM SIGSOFT Distinguished Paper Award]
Based on these practices, I formulate a novel summarization problem, the generation of 2-dimensional code summaries, and propose an optimization-based algorithm to solve it.