Introduction

It is now common for students to list their handles on online coding platforms in their résumés when applying for jobs or internships. This information can, however, be used against the candidate: an interviewer can extract details from it that the candidate may not want to disclose. This kind of privacy exploitation is called an inference attack, an attack in which a person's sensitive information is inferred from data that the person has disclosed. In this project, we took Codeforces, one of the most widely used platforms for online coding competitions and practice, as our example.

Implementation

Data Fetching

Codeforces provides a variety of APIs. One of them fetches all the submissions of a user. The result of this API call for a user is given below.
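The fetching step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes the public `user.status` method of the Codeforces API and its JSON envelope with `status`, `result` and `comment` fields, and the helper names (`build_status_url`, `parse_status_payload`, `fetch_submissions`) are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the public user.status method of the Codeforces API.
API_ROOT = "https://codeforces.com/api/user.status"


def build_status_url(handle, start=1, count=None):
    """Build the request URL; `start` is the 1-based index of the first submission."""
    params = {"handle": handle, "from": start}
    if count is not None:
        params["count"] = count
    return f"{API_ROOT}?{urlencode(params)}"


def parse_status_payload(payload):
    """Return the list of submissions, or raise if the API reported a failure."""
    if payload.get("status") != "OK":
        raise RuntimeError(payload.get("comment", "Codeforces API request failed"))
    return payload["result"]


def fetch_submissions(handle, timeout=30):
    """Download every submission of `handle` (needs network access)."""
    with urlopen(build_status_url(handle), timeout=timeout) as response:
        return parse_status_payload(json.load(response))
```

Calling `fetch_submissions("tourist")` would return a list of dictionaries, one per submission. Keeping the URL construction and the payload check in separate functions lets both be exercised without a network connection.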

Data Interpretation

Now that we have all of the user's submissions, we need to extract several kinds of information from them. First, the number of problems the user has solved in each language, for example 40 problems in C++ and 50 in Python. This helps an interviewer, or any adversary, identify the user's strongest and weakest languages. Second, the number of problems solved at each problem rating. On Codeforces, the difficulty of a problem is labeled with a rating tag, so these counts can be used to infer the user's problem-solving level: how many hard problems and how many easy-to-medium problems the user can solve. Third, and most important, the problems grouped by topic. For example, if the user has solved 70 graph problems, we record the rating of each of those 70 so that the interviewer gains insight into every topic the user has worked on. For each topic we also count the submissions by outcome: wrong answers, time limit exceeded (TLE), runtime errors, and so on. Last but not least, the list of problems in each topic that the user was not able to solve, that is, problems the user attempted, got wrong, and has not solved to date.
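The aggregation described above might look something like the sketch below. It assumes each submission is a dictionary with `programmingLanguage`, `verdict` and a nested `problem` holding `contestId`, `index`, `name`, `rating` and `tags`, as in the Codeforces API's Submission and Problem objects; the function names and the keys of the returned report are hypothetical, and this is not necessarily how the project implemented it.

```python
from collections import Counter, defaultdict


def problem_key(submission):
    """Identify a problem by its contest id and index, e.g. (1520, 'A')."""
    problem = submission["problem"]
    return (problem.get("contestId"), problem["index"])


def summarize(submissions):
    """Aggregate submissions into per-language, per-rating and per-topic views."""
    solved = {}      # problem key -> first accepted submission encountered
    attempted = {}   # problem key -> problem dictionary
    verdicts_by_tag = defaultdict(Counter)

    for sub in submissions:
        key = problem_key(sub)
        problem = sub["problem"]
        attempted[key] = problem
        verdict = sub.get("verdict", "UNKNOWN")
        # Count every submission outcome (OK, WRONG_ANSWER, ...) under each topic.
        for tag in problem.get("tags", []):
            verdicts_by_tag[tag][verdict] += 1
        if verdict == "OK":
            solved.setdefault(key, sub)

    # Each solved problem is counted once, under the language of the
    # accepted submission that was recorded for it.
    by_language = Counter(sub["programmingLanguage"] for sub in solved.values())
    # Problems without a rating are counted under the key None.
    by_rating = Counter(sub["problem"].get("rating") for sub in solved.values())

    ratings_by_tag = defaultdict(list)
    for sub in solved.values():
        for tag in sub["problem"].get("tags", []):
            ratings_by_tag[tag].append(sub["problem"].get("rating"))

    # Problems that were attempted but never accepted, grouped by topic.
    unsolved_by_tag = defaultdict(list)
    for key, problem in attempted.items():
        if key not in solved:
            for tag in problem.get("tags", []):
                unsolved_by_tag[tag].append(problem["name"])

    return {
        "solved_by_language": by_language,
        "solved_by_rating": by_rating,
        "solved_ratings_by_tag": dict(ratings_by_tag),
        "verdicts_by_tag": dict(verdicts_by_tag),
        "unsolved_by_tag": dict(unsolved_by_tag),
    }
```

One design choice worth noting: a problem accepted in two languages is attributed only to the first accepted submission encountered in the list, so the per-language counts always add up to the number of distinct solved problems.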

Results

After fetching all the results, we plotted several kinds of graphs so that an interviewer could infer the user's coding habits at a glance. To showcase the results, we use the example user tourist, a very well-known competitive programmer on Codeforces.

Conclusion and Mitigation

In conclusion, the inference attack was carried out successfully on the Codeforces platform. It could be extended to other platforms such as CodeChef and AtCoder. To mitigate it, the platform should let users choose the visibility of their submissions; that is, users should be free to make their submissions public or private.

Video presentation

The Team

Danial Kafeel
Janmejay Pratap Singh Baghel
Akshay M
Divya Donapati
Chirag Shilwant

Acknowledgement

This project was carried out as part of the course Online Privacy, under the guidance of Professor Ponnurangam Kumaraguru, at the International Institute of Information Technology, Hyderabad.

cs4.407 Online Privacy

CP Analyzer: Extract the coding habits of students
