Sumon Biswas
Conference
23 Shades of Self-Admitted Technical Debt: An Empirical Study on Machine Learning Software
We provide a comprehensive taxonomy of self-admitted technical debt (SATD) in machine learning software. Our study analyzes the organization of ML SATD types, their frequencies across the stages of ML software, the differences between ML SATDs in applications and tools, and the effort required to remove ML SATDs. The findings suggest implications for ML developers and researchers working to create maintainable ML systems.
David OBrien, Sumon Biswas, Sayem Imtiaz, Rabe Abdalkareem, Emad Shihab, Hridesh Rajan
Fair Preprocessing: Towards Understanding Compositional Fairness of Data Transformers in Machine Learning Pipeline
We introduced a causal method of fairness to reason about the fairness impact of data preprocessing stages in ML pipelines. We leveraged existing metrics to define fairness measures for the stages, then conducted a detailed fairness evaluation of the preprocessing stages in 37 pipelines collected from three different sources.
Sumon Biswas, Hridesh Rajan
Do the Machine Learning Models on a Crowd Sourced Platform Exhibit Bias? An Empirical Study on Model Fairness
We focused on the empirical evaluation of fairness and mitigations in real-world machine learning models. We created a benchmark of 40 top-rated models from Kaggle used for 5 different tasks and evaluated their fairness using a comprehensive set of fairness metrics. We then applied 7 mitigation techniques to these models and analyzed the fairness, the mitigation results, and the impacts on performance.
Sumon Biswas, Hridesh Rajan
Boa Meets Python: A Boa Dataset of Data Science Software in Python Language
The popularity of the Python programming language has surged in recent years due to its increasing use in data science. The availability of Python repositories on GitHub presents an opportunity for mining software repositories research, e.g., suggesting best practices for developing data science applications, identifying bug patterns, and recommending code enhancements. To enable this research, we have created a new dataset that includes 1,558 mature GitHub projects that develop Python software for data science tasks.
Sumon Biswas, Md Johirul Islam, Yijia Huang, Hridesh Rajan
A Secure Data Security Infrastructure for Small Organization in Cloud Computing
This paper addresses security concerns in cloud environments for small businesses, identifying their shortcomings and proposing solutions. Measured security features were implemented by developing a secure data encryption, exchange, and decryption infrastructure, resulting in a data security model.
Manan B T Noor, Sumon Biswas
Applying Ant Colony Optimization in Software Testing to Generate Prioritized Optimal Path and Test Data
An ant colony optimization (ACO) based algorithm is proposed that generates a set of optimal paths and prioritizes them. Additionally, the approach generates test data sequences within the domain to use as inputs to the generated paths. The proposed approach guarantees full software coverage with minimum redundancy.
Sumon Biswas, M Shamim Kaiser, Shamim Al Mamun
Cloud Based Healthcare Application Architecture and Electronic Medical Record Mining: An Integrated Approach to Improve Healthcare System
In this article, a three-tier cloud-based application, “eHealth Cloud”, has been developed that involves different parties to improve the old-fashioned healthcare system. An RIA (Rich Internet Application) based client, a SimpleDB-based server, and a logic layer have been designed to build an easily accessible network.
Sumon Biswas, Anisuzzaman, Tanjina Akhter, M Shamim Kaiser, Shamim Al Mamun