Foutse Khomh

2023-07-01

Information and Software Technology (published)

Studying the challenges of developing hardware description language programs

Fatemeh Yousefifeshki

Heng Li

2023-07-01

Information and Software Technology (published)

Responsible Design Patterns for Machine Learning Pipelines

Saud Hakem Al Harbi

Lionel Nganyewou Tidjon

Integrating ethical practices into the AI development process for artificial intelligence (AI) is essential to ensure safe, fair, and responsible operation. AI ethics involves applying ethical principles to the entire life cycle of AI systems. This is essential to mitigate potential risks and harms associated with AI, such as algorithm biases. To achieve this goal, responsible design patterns (RDPs) are critical for Machine Learning (ML) pipelines to guarantee ethical and fair outcomes. In this paper, we propose a comprehensive framework incorporating RDPs into ML pipelines to mitigate risks and ensure the ethical development of AI systems. Our framework comprises new responsible AI design patterns for ML pipelines identified through a survey of AI ethics and data management experts and validated through real-world scenarios with expert feedback. The framework guides AI developers, data scientists, and policy-makers to implement ethical practices in AI development and deploy responsible AI systems in production.

2023-05-31

arXiv (preprint)

Testing Feedforward Neural Networks Training Programs

Houssem Ben Braiek

2023-05-26

ACM Transactions on Software Engineering and Methodology (published)

On Codex Prompt Engineering for OCL Generation: An Empirical Study

Seif Abukhalaf

Mohammad Hamdaqa

The Object Constraint Language (OCL) is a declarative language that adds constraints and object query expressions to Meta-Object Facility (MOF) models. OCL can provide precision and conciseness to UML models. Nevertheless, the unfamiliar syntax of OCL has hindered its adoption by software practitioners. LLMs, such as GPT-3, have made significant progress in many NLP tasks, such as text generation and semantic parsing. Similarly, researchers have improved on the downstream tasks by fine-tuning LLMs for the target task. Codex, a GPT-3 descendant by OpenAI, has been fine-tuned on publicly available code from GitHub and has proven the ability to generate code in many programming languages, powering the AI-pair programmer Copilot. One way to take advantage of Codex is to engineer prompts for the target downstream task. In this paper, we investigate the reliability of the OCL constraints generated by Codex from natural language specifications. To achieve this, we compiled a dataset of 15 UML models and 168 specifications from various educational resources. We manually crafted a prompt template with slots to populate with the UML information and the target task in the prefix format to complete the template with the generated OCL constraint. We used both zero- and few-shot learning methods in the experiments. The evaluation is reported by measuring the syntactic validity and the execution accuracy metrics of the generated OCL constraints. Moreover, to get insight into how close or natural the generated OCL constraints are compared to human-written ones, we measured the cosine similarity between the sentence embedding of the correctly generated and human-written OCL constraints. Our findings suggest that by enriching the prompts with the UML information of the models and enabling few-shot learning, the reliability of the generated OCL constraints increases. Furthermore, the results reveal a close similarity based on sentence embedding between the generated OCL constraints and the human-written ones in the ground truth, implying a level of clarity and understandability in the generated OCL constraints by Codex.

2023-05-15

2023 IEEE/ACM 20th International Conference on Mining Software Repositories (MSR) (published)

RIGAA at the SBFT 2023 Tool Competition - Cyber-Physical Systems Track

Dmytro Humeniuk

Giuliano Antoniol

Testing and verification of autonomous systems is critically important. In the context of SBFT 2023 CPS testing tool competition, we present our tool RIGAA for generating virtual roads to test an autonomous vehicle lane keeping assist system. RIGAA combines reinforcement learning as well as evolutionary search to generate test scenarios. It has achieved the second highest final score among 5 other submitted tools.

2023-05-01

2023 IEEE/ACM International Workshop on Search-Based and Fuzz Testing (SBFT) (published)

UnityLint: A Bad Smell Detector for Unity

Matteo Bosco

Pasquale Cavoto

Augusto Ungolo

Biruk Asmare Muse

Vittoria Nardone

Massimiliano Di Penta

The video game industry is particularly rewarding as it represents a large portion of the software development market. However, working in this domain may be challenging for developers, not only because of the need for heterogeneous skills (from software design to computer graphics), but also for the limited body of knowledge in terms of good and bad design and development principles, and the lack of tool support to assist them. This tool demo proposes UnityLint, a tool able to detect 18 types of bad smells in Unity video games. UnityLint builds upon a previously-defined and validated catalog of bad smells for video games. The tool, developed in C# and available both as open-source and binary releases, is composed of (i) analyzers that extract facts from video game source code and metadata, and (ii) smell detectors that leverage detection rules to identify smells on top of the extracted facts. Tool: https://github.com/mdipenta/UnityCodeSmellAnalyzer Teaser Video: https://youtu.be/HooegxZ8H6g

2023-05-01

IEEE International Conference on Program Comprehension (published)

Leveraging Data Mining Algorithms to Recommend Source Code Changes

AmirHossein Naghshzan

Saeed Khalilazar

Pierre Poilane

Olga Baysal

Latifa Guerrouj

2023-04-29

arXiv (preprint)

Ranking code clones to support maintenance activities

Osama Ehsan

Ying Zou

Dong Qiu

2023-04-25

Empirical Software Engineering (published)

Bugs in machine learning-based systems: a faultload benchmark

Mohammad Mehdi Morovati

Amin Nikanjam

Z. Jiang

2023-04-05

Empirical Software Engineering (published)

Can Ensembling Preprocessing Algorithms Lead to Better Machine Learning Fairness?

Khaled Badran

Pierre-Olivier Côté

Amanda Kolopanis

Rached Bouchoucha

Antonio Collante

Diego Elias Costa

Emad Shihab

In this work, we evaluate three popular fairness preprocessing algorithms and investigate the potential for combining all algorithms into a more robust preprocessing ensemble. We report on lessons learned that can help practitioners better select fairness algorithms for their models.

2023-04-01

Computer (published)

Machine learning application development: practitioners’ insights

Md Saidur Rahman

Alaleh Hamidi

Jinghui Cheng

Giuliano Antoniol

Hironori Washizaki

2023-03-30

Software Quality Journal (published)