Behnam Heydarshahi
MS in Computer Science

Tufts University Department of Computer Science


My research sits at the intersection of program analysis, systems architecture, and human-computer interaction. I have two master's degrees: one in computer science from Tufts and another in computer engineering (systems architecture) from Sharif University of Technology. The goal of my research is to model, measure, and predict the performance of software and hardware systems. Next, using this infrastructure, I would like to analyze mobile apps and test whether they are accessible to people with low vision, color blindness, and similar conditions.

Research Experience

RespDroid: App Performance Testing

The goal of this research with Professor Guyer is to answer two questions. First, how can we measure the performance of an app running on a smartphone in terms of quality of experience (QoE)? Second, if we measure an app's QoE on a few smartphones, how can we predict its QoE on a larger set of smartphones? Performance prediction matters because there are over 18,000 different types of Android devices; it is not possible to test on that many devices, yet users will run the app on them.

To measure QoE, I have come up with two metrics: smoothness and responsiveness. Smoothness is the ability of an app to execute at a fixed, steady frame rate. On Android, a common design guideline is for an app to run at 60 frames per second. Instead of measuring the frame rate directly, I record every individual frame timestamp through a mobile graphics card interface. I take the difference between consecutive timestamps to compute each frame time, and then I sort the frame times. Finally, I report three metrics: average frame time, average low 1% frame time, and average low 0.1% frame time (the averages over the slowest 1% and 0.1% of frames, which take the most time to render). In the preliminary user surveys I am conducting, the latter two metrics correspond to what frustrates users most. So an app that runs at 60 fps on average is not necessarily smoother than one that runs at 55 fps if the former is significantly slower on its low 1% and low 0.1% frames.

For responsiveness, I first use static analysis with FlowDroid and Soot to instrument and mark potentially unresponsive parts of the app. In a second, dynamic analysis step, I use ADB to run the app and check whether the marked parts execute with delays under a user-perceived threshold, usually 200 milliseconds.

Finally, I feed the performance numbers I have gathered to a machine learning algorithm to predict the app's performance on new hardware, with an accuracy of over 80%. In the future, I plan to improve the accuracy to 95% and to find other metrics in addition to responsiveness and smoothness, such as accessibility.
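
As a rough illustration of the three smoothness metrics described above, here is a minimal Python sketch. It is not the RespDroid code: the function names and dictionary keys are hypothetical, and it assumes that "average low 1%" means the mean of the slowest 1% of frame times, with at least one frame always included.

```python
from statistics import mean


def frame_times_ms(timestamps_ms):
    """Frame time = difference between consecutive frame timestamps."""
    return [later - earlier
            for earlier, later in zip(timestamps_ms, timestamps_ms[1:])]


def average_of_slowest(frame_times, fraction):
    """Mean of the slowest `fraction` of frames (assumed definition of 'low X%')."""
    slowest_first = sorted(frame_times, reverse=True)
    count = max(1, int(len(slowest_first) * fraction))  # keep at least one frame
    return mean(slowest_first[:count])


def smoothness_report(timestamps_ms):
    """Return the three smoothness metrics for one recorded run."""
    times = frame_times_ms(timestamps_ms)
    if not times:
        raise ValueError("need at least two frame timestamps")
    return {
        "average": mean(times),
        "low_1_percent": average_of_slowest(times, 0.01),
        "low_0_1_percent": average_of_slowest(times, 0.001),
    }
```

Calling smoothness_report on the timestamps of two runs makes the comparison in the text concrete: a run can have the lower average frame time and still have the higher low 1% and low 0.1% frame times.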

Using ML and NLP to Pinpoint Code Bugs from App Store User Reviews

For my NLP class project at Tufts, I wrote a program that performs latent semantic analysis on app store user reviews for two apps, with the goal of automatically finding app bugs from those reviews. I implemented a topic modeling algorithm that takes a corpus of app store user reviews as input. I built a TF-IDF matrix in Scala and applied the Breeze library’s singular value decomposition to it. I extracted 10 topics from the reviews and showed that several of them referred to common bugs in those apps. For future work, I would like to combine this project with a static analysis tool that takes the detected bugs and the app source code as input and pinpoints the lines of code from which those bugs arise. This could be a big step towards fully automated app testing with minimal developer “babysitting.”
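
To give a feel for the pipeline, here is a minimal sketch in plain Python rather than Scala. It is an illustration under stated assumptions, not the project code: the function names are hypothetical, tokenization is a simple whitespace split, the TF-IDF variant (length-normalized term frequency times log inverse document frequency) is one common choice, and power iteration on the term-by-term matrix stands in for Breeze's full singular value decomposition, so it recovers only the single strongest topic.

```python
import math
from collections import Counter


def tfidf_matrix(documents):
    """Return (vocabulary, rows); rows[d][t] is the TF-IDF weight of term t in document d."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(documents)
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        rows.append([
            (counts[term] / len(tokens)) * math.log(n_docs / doc_freq[term])
            if tokens else 0.0
            for term in vocabulary
        ])
    return vocabulary, rows


def leading_topic(rows, iterations=100):
    """Approximate the first right singular vector of the matrix by power iteration."""
    if not rows or not rows[0]:
        return []
    n_terms = len(rows[0])
    v = [1.0 / math.sqrt(n_terms)] * n_terms
    for _ in range(iterations):
        # One step of v <- A^T (A v), followed by normalization.
        u = [sum(a * b for a, b in zip(row, v)) for row in rows]
        w = [sum(rows[d][t] * u[d] for d in range(len(rows)))
             for t in range(n_terms)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def top_terms(vocabulary, topic, k=3):
    """Terms with the largest absolute weight in a topic vector."""
    ranked = sorted(zip(vocabulary, topic),
                    key=lambda pair: abs(pair[1]), reverse=True)
    return [term for term, _ in ranked[:k]]
```

Given a list of review strings, tfidf_matrix builds the weighted matrix, leading_topic extracts the dominant direction, and top_terms lists the words that define it. A full SVD, as in the project, yields further topics from the remaining singular vectors.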

Size-Aware Hardware Trojan Detection

I completed this project at Sharif University of Technology before coming to Tufts. I developed a circuit simulation system, and combined and compared functional testing with side-channel testing. To model the Trojan, I started with a small logic gate, namely an XOR, and then moved on to more complex combinational and sequential circuits. The functional simulator, implemented in Java, parses the Verilog circuit. The side-channel metric is power consumption. For power estimation, I calculate the switching activity of the gates in SAIF format and feed it to Power Compiler. I compare the results for the circuit containing the Trojan with those for a golden circuit. I also account for the power threshold induced by process variation.
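
The two kinds of test can be contrasted with a toy Python sketch. Everything in it is an assumption made for illustration: the example circuits, the function names, and the decision rule that treats process variation as a fixed fraction of the golden circuit's power. The actual project used a Java simulator on Verilog circuits and Power Compiler for power estimation.

```python
from itertools import product


def golden_circuit(a, b, c):
    """Hypothetical Trojan-free reference: a single AND gate (input c is unused)."""
    return a & b


def trojan_circuit(a, b, c):
    """Same circuit with a hypothetical Trojan: an XOR payload flips the
    output when the rare trigger (a AND b AND c) fires."""
    trigger = a & b & c
    return (a & b) ^ trigger


def functional_mismatches(golden, suspect, n_inputs):
    """Functional test: apply every input vector and list those whose outputs differ."""
    return [bits for bits in product((0, 1), repeat=n_inputs)
            if golden(*bits) != suspect(*bits)]


def power_flags_trojan(golden_power, measured_power, variation_fraction):
    """Side-channel test: flag only deviations larger than the margin
    attributed to process variation (assumed proportional to golden power)."""
    margin = golden_power * variation_fraction
    return abs(measured_power - golden_power) > margin
```

Under these assumptions, a small Trojan whose extra power stays inside the process-variation margin passes the side-channel check, and the functional test exposes it only when the trigger vector happens to be applied.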

Volunteer and Leadership Experience at Tufts

While studying at Tufts, I have served in a few community leadership positions. I was elected president of the Computer Science League of Learning (CSLoL), one of the most popular graduate student organizations across the university. We have taught coding to graduate students from many departments across Tufts. I also serve as a board member of the Tufts Computer Science Student Council, where we focus on helping to make Tufts an environment where everyone feels that they belong.

Contacting Me

The best way to reach me is via e-mail, which I read regularly, including when I am away from home or the office. Thank you for visiting my page!