“Research is to see what everybody else has seen, and to think what nobody else has thought.” — Albert Szent-Györgyi

PUBLICATIONS

Conference Papers

  • FeaRS: Recommending Complete Android Method Implementations
    Fengcai Wen, Valentina Ferrari, Emad Aghajani, Csaba Nagy, Michele Lanza and Gabriele Bavota
    In Proceedings of the 37th International Conference on Software Maintenance and Evolution (ICSME 2021), Tool Demo Track. 2021.
    Abstract:
    Several techniques have been proposed in the literature to support code completion, showing excellent results in predicting the next few tokens a developer is likely to type given the current context. Only recently, approaches pushing the boundaries of code completion (e.g., by presenting entire code statements) have been proposed. In this line of research, we present FeaRS, a recommender system that, given the current code a developer is writing in the IDE, recommends the next complete method to be implemented. FeaRS has been deployed to learn “implementation patterns” (i.e., groups of methods usually implemented within the same task) by continuously mining open-source Android projects. Such knowledge is leveraged to provide method recommendations when the code written by the developer in the IDE matches an “implementation pattern”. Preliminary results of FeaRS' accuracy show its potential as well as some open challenges to overcome.
    BibTex:
    @inproceedings{Wen2021a,
      author = {Wen, Fengcai and Ferrari, Valentina and Aghajani, Emad and Nagy, Csaba and Lanza, Michele and Bavota, Gabriele},
      title = {FeaRS: Recommending Complete Android Method Implementations},
      booktitle = {Proceedings of the 37th International Conference on Software Maintenance and Evolution (ICSME 2021), Tool Demo Track},
      year = {2021}
    }
  • Siri, Write the Next Method
    Fengcai Wen, Emad Aghajani, Csaba Nagy, Michele Lanza and Gabriele Bavota
    In Proceedings of the IEEE/ACM 43rd International Conference on Software Engineering (ICSE 2021). pp. 138-149, 2021.
    Abstract:
    Code completion is one of the killer features of Integrated Development Environments (IDEs), and researchers have proposed different methods to improve its accuracy. While these techniques are valuable to speed up code writing, they are limited to recommendations related to the next few tokens a developer is likely to type given the current context. In the best case, they can recommend a few APIs that a developer is likely to use next. We present FeaRS, a novel retrieval-based approach that, given the current code a developer is writing in the IDE, can recommend the next complete method (i.e., signature and method body) that the developer is likely to implement. To do this, FeaRS exploits “implementation patterns” (i.e., groups of methods usually implemented within the same task) learned by mining thousands of open source projects. We instantiated our approach to the specific context of Android apps. A large-scale empirical evaluation we performed across more than 20k apps shows encouraging preliminary results, but also highlights future challenges to overcome.
    BibTex:
    @inproceedings{Wen2021,
      author = {Wen, Fengcai and Aghajani, Emad and Nagy, Csaba and Lanza, Michele and Bavota, Gabriele},
      title = {Siri, Write the Next Method},
      booktitle = {Proceedings of the IEEE/ACM 43rd International Conference on Software Engineering (ICSE 2021)},
      year = {2021},
      pages = {138-149},
      doi = {10.1109/ICSE43902.2021.00025}
    }
  • Visualizing Discord Servers
    Marco Raglianti, Roberto Minelli, Csaba Nagy and Michele Lanza
    In Proceedings of the 9th IEEE Working Conference on Software Visualization (VISSOFT 2021), NIER/TD. 2021.
    Abstract:
    The last decade has seen the rise of global software community platforms, such as Slack, Gitter, and Discord. They allow developers to discuss implementation issues, report bugs, and, in general, interact with one another. Such real-time communication platforms are thus slowly complementing, if not replacing, more traditional communication channels, such as development mailing lists. Apart from simple text messaging and conference calls, they allow the sharing of any type of content, such as videos, images, and source code. This is turning such platforms into precious information sources when it comes to searching for documentation and understanding design and implementation choices. However, the velocity and volatility of the contents shared and discussed on such platforms, combined with their often informal structure, make it difficult to grasp and differentiate the relevant pieces of information. We present a visual analytics approach, supported by a tool named DiscOrDance, which provides numerous custom views to support the understanding of Discord servers in terms of their structure, contents, and community. We illustrate DiscOrDance, using as a running example the public Pharo development community Discord server, which counts to date ~180k messages shared among ~2,900 developers, spanning 5 years of history. Based on our analyses, we distill and discuss interesting insights and lessons learned.
    BibTex:
    @inproceedings{Raglianti2021,
      author = {Raglianti, Marco and Minelli, Roberto and Nagy, Csaba and Lanza, Michele},
      title = {Visualizing Discord Servers},
      booktitle = {Proceedings of the 9th IEEE Working Conference on Software Visualization (VISSOFT 2021), NIER/TD},
      year = {2021}
    }
  • Challenges and Perils of Testing Database Manipulation Code
    Maxime Gobert, Csaba Nagy, Henrique Rocha, Serge Demeyer and Anthony Cleve
    In Proceedings of the 33rd International Conference on Advanced Information Systems Engineering (CAiSE 2021). 2021.
    Abstract:
    Software testing enables development teams to maintain the quality of a software system while it evolves. The database manipulation code requires special attention in this context. However, it is often neglected and suffers from software maintenance problems. In this paper, we investigate the current state-of-the-practice in testing database manipulation code. We first analyse the code of 72 projects mined from Libraries.io to get an impression of the test coverage for database code. We confirm that the database is poorly tested: 46% of the projects did not cover half of their database access methods with tests, and 33% of the projects did not cover the database code at all. To understand the difficulties in testing database code, we analysed 532 questions on StackExchange sites and deduced a taxonomy. We found that developers mostly look for insights on general best practices to test database access code. They also have more technical questions related to DB handling, mocking, parallelisation or framework/tool usage. This investigation lays the basis for future research on improving database code testing.
    BibTex:
    @inproceedings{Gober2021,
      author = {Gobert, Maxime and Nagy, Csaba and Rocha, Henrique and Demeyer, Serge and Cleve, Anthony},
      title = {Challenges and Perils of Testing Database Manipulation Code},
      booktitle = {Proceedings of the 33rd International Conference on Advanced Information Systems Engineering (CAiSE 2021)},
      year = {2021}
    }
  • Visualizing GitHub Issues
    Aron Fiechter, Roberto Minelli, Csaba Nagy and Michele Lanza
    In Proceedings of the 9th IEEE Working Conference on Software Visualization (VISSOFT 2021), NIER/TD. 2021.
    Abstract:
    The rise of distributed version control systems, such as git, and platforms built on top of it, such as GitHub, has triggered a change in how software is developed. Most notably, state-of-the-art practice foresees the use of pull requests and issues, enriched by means to enable discussions among the involved people. Platforms like GitHub and GitLab have thus turned into comprehensive and cohesive modern software development environments, also offering additional mechanisms, such as code review tools and a transversal support for continuous integration and deployment. However, the plethora of concepts, mechanisms, and their interconnections are stored and presented in textual form, which makes the understanding of the underlying evolutionary processes difficult. We introduce the notion of an issue tale, a visual narrative of the events and actors revolving around any GitHub issue, and present an approach, implemented as an interactive visual analytics tool, to depict and analyze the relevant information pertaining to issue tales. We illustrate our approach and its implementation on several open-source software systems.
    BibTex:
    @inproceedings{Fiechter2021,
      author = {Fiechter, Aron and Minelli, Roberto and Nagy, Csaba and Lanza, Michele},
      title = {Visualizing GitHub Issues},
      booktitle = {Proceedings of the 9th IEEE Working Conference on Software Visualization (VISSOFT 2021), NIER/TD},
      year = {2021}
    }
  • An Empirical Study of (Multi-) Database Models in Open-Source Projects
    Pol Benats, Maxime Gobert, Loup Meurice, Csaba Nagy and Anthony Cleve
    In Proceedings of the 40th International Conference on Conceptual Modeling (ER 2021). 2021.
    Abstract:
    Managing data-intensive systems has long been recognized as an expensive and error-prone process. This is mainly due to the often implicit consistency relationships that hold between applications and their database. As new technologies emerged for specialized purposes (e.g., graph databases, document stores), the joint use of database models has also become popular. There are undeniable benefits of such multi-database models where developers combine various technologies. However, the side effects on design, querying, and maintenance are not well-known yet. In this paper, we study multi-database models in software systems by mining major open-source repositories. We consider four years of history, from 2017 to 2020, of a total number of 40,609 projects with databases. Our results confirm the emergence of hybrid data-intensive systems as we found (multi-) database models (e.g., relational and non-relational) used together in 16% of all database-dependent projects. One percent of the systems added, deleted, or changed a database during the four years. The majority (62%) of these systems had a single database before becoming hybrid, and another significant part (19%) became “mono-database” after initially using multiple databases. We examine the evolution of these systems to understand the rationale of the design choices of the developers. Our study aims to guide future research towards new challenges posed by those emerging data management architectures.
    BibTex:
    @inproceedings{Benats2021,
      author = {Benats, Pol and Gobert, Maxime and Meurice, Loup and Nagy, Csaba and Cleve, Anthony},
      title = {An Empirical Study of (Multi-) Database Models in Open-Source Projects},
      booktitle = {Proceedings of the 40th International Conference on Conceptual Modeling (ER 2021)},
      year = {2021}
    }
  • Visualizing Data in Software Cities
    Susanna Ardigò, Csaba Nagy, Roberto Minelli and Michele Lanza
    In Proceedings of the 9th IEEE Working Conference on Software Visualization (VISSOFT 2021), NIER/TD. 2021.
    Abstract:
    The city metaphor for visualizing software systems in 3D has been widely explored and it has led to many diverse implementations and approaches. However, when looking at software systems in general, and when using specifically a city approach, it is evident that something is missing: The data. Indeed, software systems are intrinsically driven by data, which is usually managed using databases or often also simply stored in files coming in a variety of formats, such as CSV, XML, and JSON. While such data files are part of a project's file system and can thus be easily retrieved, the situation is different for databases: A database is usually not contained in the file system, and its presence can only be inferred from the source code which contains the database accesses. We present an extension of the CodeCity implementation, M3tricity2, with two new contributions: First, we consider data files and use simple metrics to integrate them in the visualization seamlessly. Second, we present a novel way to add a database to the visualization by making use of the one remaining space left unused: the sky and the underground. We present our contributions and illustrate them on various software systems.
    BibTex:
    @inproceedings{Ardigo2021,
      author = {Ardigò, Susanna and Nagy, Csaba and Minelli, Roberto and Lanza, Michele},
      title = {Visualizing Data in Software Cities},
      booktitle = {Proceedings of the 9th IEEE Working Conference on Software Visualization (VISSOFT 2021), NIER/TD},
      year = {2021}
    }
  • An Empirical Study of Quick Remedy Commits
    Fengcai Wen, Csaba Nagy, Michele Lanza and Gabriele Bavota
    In Proceedings of the 28th IEEE/ACM International Conference on Program Comprehension (ICPC 2020). 2020.
    ACM SIGSOFT Distinguished Paper Award
    Abstract:
    Software systems are continuously modified to implement new features, to fix bugs, and to improve quality attributes. Most of these activities are not atomic changes, but rather the result of several related changes affecting different parts of the code. For this reason, it may happen that developers omit some of the needed changes and, as a consequence, leave a task partially unfinished, introduce technical debt or, in the worst case scenario, inject bugs. Knowing the changes that are mistakenly omitted by developers can help in designing recommender systems able to automatically identify risky situations in which, for example, the developer is likely to be pushing an incomplete change to the software repository. We present a qualitative study investigating "quick remedy commits" performed by developers with the goal of implementing changes omitted in previous commits. With quick remedy commits we refer to commits that (i) quickly follow a commit performed by the same developer in the same repository, and (ii) aim at remedying issues introduced as the result of code changes omitted in the previous commit (e.g., fix references to code components that have been broken as a consequence of a rename refactoring). Through a manual analysis of 500 quick remedy commits, we define a taxonomy categorizing the types of changes that developers tend to omit. The defined taxonomy can guide the development of tools aimed at detecting omitted changes, and possibly autocomplete them.
    BibTex:
    @inproceedings{Wen2020,
      author = {Wen, Fengcai and Nagy, Csaba and Lanza, Michele and Bavota, Gabriele},
      title = {An Empirical Study of Quick Remedy Commits},
      booktitle = {Proceedings of the 28th IEEE/ACM International Conference on Program Comprehension (ICPC 2020)},
      year = {2020}
    }
  • Visualizing Evolving Software Cities
    Federico Pfahler, Roberto Minelli, Csaba Nagy and Michele Lanza
    In Proceedings of the 2020 Working Conference on Software Visualization (VISSOFT), NIER/TD. pp. 22-26, 2020.
    Abstract:
    Visualization approaches that leverage a 3D city metaphor have become popular. Numerous variations, including virtual and augmented reality, have emerged. Despite its popularity, the city metaphor falls short when depicting the evolution of a system, which results in buildings and districts moving around in unpredictable ways. We present a novel approach to visualize software systems as evolving cities that treats evolution as a first-class concept. It renders with fidelity not only changes but also refactorings in a comprehensive way. To do so, we developed custom ways to traverse time. We implemented our approach in a publicly accessible web-based platform named m3triCity.
    BibTex:
    @inproceedings{Pfahler2020,
      author = {Pfahler, Federico and Minelli, Roberto and Nagy, Csaba and Lanza, Michele},
      title = {Visualizing Evolving Software Cities},
      booktitle = {Proceedings of the 2020 Working Conference on Software Visualization (VISSOFT), NIER/TD},
      year = {2020},
      pages = {22-26},
      doi = {10.1109/VISSOFT51673.2020.00007}
    }
  • On the Prevalence, Impact, and Evolution of SQL Code Smells in Data-Intensive Systems
    Biruk Asmare Muse, Masud Rahman, Csaba Nagy, Anthony Cleve, Foutse Khomh and Giuliano Antoniol
    In Proceedings of the 17th International Conference on Mining Software Repositories (MSR 2020). 2020.
    Abstract:
    Code smells indicate software design problems that harm software quality. Data-intensive systems that frequently access databases often suffer from SQL code smells besides the traditional smells. While there have been extensive studies on traditional code smells, recently, there has been a growing interest in SQL code smells. In this paper, we conduct an empirical study to investigate the prevalence and evolution of SQL code smells in open source, data-intensive systems. We collected 150 projects and examined both traditional and SQL code smells in these projects. Our investigation delivers several important findings. First, SQL code smells are indeed prevalent in data-intensive software systems. Second, SQL code smells have a weak co-occurrence with traditional code smells. Third, SQL code smells have a weaker association with bugs than that of traditional code smells. Fourth, SQL code smells are more likely to be introduced at the beginning of the project lifetime and likely to be left in the code without a fix, compared to traditional code smells. Overall, our results show that SQL code smells are indeed prevalent and persistent in the studied data-intensive software systems. Developers should be aware of these smells and consider detecting and refactoring SQL code smells and traditional code smells separately, using dedicated tools.
    BibTex:
    @inproceedings{Muse2020,
      author = {Muse, Biruk Asmare and Rahman, Masud and Nagy, Csaba and Cleve, Anthony and Khomh, Foutse and Antoniol, Giuliano},
      title = {On the Prevalence, Impact, and Evolution of SQL Code Smells in Data-Intensive Systems},
      booktitle = {Proceedings of the 17th International Conference on Mining Software Repositories (MSR 2020)},
      year = {2020}
    }
  • Automated Identification of On-hold Self-admitted Technical Debt
    Rungroj Maipradit, Bin Lin, Csaba Nagy, Gabriele Bavota, Michele Lanza, Hideaki Hata and Kenichi Matsumoto
    In Proceedings of the 20th International Working Conference on Source Code Analysis and Manipulation (SCAM 2020). pp. 54-64, 2020.
    Abstract:
    Modern software is developed under considerable time pressure, which implies that developers more often than not have to resort to compromises when it comes to code that is well written and code that just does the job. This has led over the past decades to the concept of “technical debt”, a short-term hack that potentially generates long-term maintenance problems. Self-admitted technical debt (SATD) is a particular form of technical debt: developers consciously perform the hack but also document it in the code by adding comments as a reminder (or as an admission of guilt). We focus on a specific type of SATD, namely “On-hold” SATD, in which developers document in their comments the need to halt an implementation task due to conditions outside of their scope of work (e.g., an open issue must be closed before a function can be implemented). We present an approach, based on regular expressions and machine learning, which is able to detect issues referenced in code comments, and to automatically classify the detected instances as either “On-hold” (the issue is referenced to indicate the need to wait for its resolution before completing a task), or as “cross-reference” (the issue is referenced to document the code, for example to explain the rationale behind an implementation choice). Our approach also mines the issue tracker of the projects to check if the On-hold SATD instances are “superfluous” and can be removed (i.e., the referenced issue has been closed, but the SATD is still in the code). Our evaluation confirms that our approach can indeed identify relevant instances of On-hold SATD. We illustrate its usefulness by identifying superfluous On-hold SATD instances in open source projects as confirmed by the original developers.
    BibTex:
    @inproceedings{Maipradit2020,
      author = {Maipradit, Rungroj and Lin, Bin and Nagy, Csaba and Bavota, Gabriele and Lanza, Michele and Hata, Hideaki and Matsumoto, Kenichi},
      title = {Automated Identification of On-hold Self-admitted Technical Debt},
      booktitle = {Proceedings of the 20th International Working Conference on Source Code Analysis and Manipulation (SCAM 2020)},
      year = {2020},
      pages = {54-64},
      doi = {10.1109/SCAM51674.2020.00011}
    }
  • Software Documentation: The Practitioners' Perspective
    Emad Aghajani, Csaba Nagy, Mario Linares-Vásquez, Laura Moreno, Gabriele Bavota, Michele Lanza and David C. Shepherd
    In Proceedings of the 42nd International Conference on Software Engineering (ICSE 2020). 2020.
    Abstract:
    In theory, (good) documentation is an invaluable asset to any software project, as it helps stakeholders to use, understand, maintain, and evolve a system. In practice, however, documentation is generally affected by numerous shortcomings and issues, such as insufficient and inadequate content and obsolete, ambiguous information. To counter this, researchers are investigating the development of advanced recommender systems that automatically suggest high-quality documentation, useful for a given task. A crucial first step is to understand what quality means for practitioners and what information is actually needed for specific tasks. We present two surveys performed with 146 practitioners to investigate (i) the documentation issues they perceive as more relevant together with solutions they apply when these issues arise; and (ii) the types of documentation considered as important given specific tasks. Our findings can help researchers in designing the next generation of documentation recommender systems.
    BibTex:
    @inproceedings{Aghajani2020,
      author = {Aghajani, Emad and Nagy, Csaba and Linares-Vásquez, Mario and Moreno, Laura and Bavota, Gabriele and Lanza, Michele and Shepherd, David C.},
      title = {Software Documentation: The Practitioners' Perspective},
      booktitle = {Proceedings of the 42nd International Conference on Software Engineering (ICSE 2020)},
      year = {2020}
    }
  • A Large-scale Empirical Study on Code-comment Inconsistencies
    Fengcai Wen, Csaba Nagy, Gabriele Bavota and Michele Lanza
    In Proceedings of the 27th International Conference on Program Comprehension (ICPC 2019). Montreal, Quebec, Canada, pp. 53-64, IEEE Press, May 2019.
    Abstract:
    Code comments are a primary means to document source code. Keeping comments up-to-date during code change activities requires substantial time and attention. For this reason, researchers have proposed methods to detect code-comment inconsistencies (i.e., comments that are not kept in sync with the code they document) and studies have been conducted to investigate this phenomenon. However, these studies were performed at a small scale, relying on quantitative analysis, thus limiting the empirical knowledge about code-comment inconsistencies. We present the largest study to date investigating how code and comments co-evolve. The study has been performed by mining 1.3 billion AST-level changes from the complete history of 1,500 systems. Moreover, we manually analyzed 500 commits to define a taxonomy of code-comment inconsistencies fixed by developers. Our analysis discloses the extent to which different types of code changes (e.g., change of selection statements) trigger updates to the related comments, identifying cases in which code-comment inconsistencies are more likely to be introduced. The defined taxonomy categorizes the types of inconsistencies fixed by developers. Our results can guide the development of tools aimed at detecting and fixing code-comment inconsistencies.
    BibTex:
    @inproceedings{Wen2019,
      author = {Wen, Fengcai and Nagy, Csaba and Bavota, Gabriele and Lanza, Michele},
      title = {A Large-scale Empirical Study on Code-comment Inconsistencies},
      booktitle = {Proceedings of the 27th International Conference on Program Comprehension (ICPC 2019)},
      publisher = {IEEE Press},
      year = {2019},
      pages = {53-64},
      doi = {10.1109/ICPC.2019.00019}
    }
  • On the Quality of Identifiers in Test Code
    Bin Lin, Csaba Nagy, Gabriele Bavota, Andrian Marcus and Michele Lanza
    In Proceedings of the 19th International Working Conference on Source Code Analysis and Manipulation (SCAM 2019). pp. 204-215, September 2019.
    Abstract:
    Meaningful, expressive identifiers in source code can enhance the readability and reduce comprehension efforts. Over the past years, researchers have devoted considerable effort to understanding and improving the naming quality of identifiers in source code. However, little attention has been given to test code, an important resource during program comprehension activities. To better grasp identifier quality in test code, we conducted a survey involving manually written and automatically generated test cases from ten open source software projects. The survey results indicate that test cases contain low quality identifiers, including the manually written ones, and that the quality of identifiers is lower in test code than in production code. We also investigated the use of three state-of-the-art rename refactoring recommenders for improving test code identifiers. The analysis highlights their limitations when applied to test code and supports mapping out a research agenda for future work in the area.
    BibTex:
    @inproceedings{Lin2019a,
      author = {Lin, Bin and Nagy, Csaba and Bavota, Gabriele and Marcus, Andrian and Lanza, Michele},
      title = {On the Quality of Identifiers in Test Code},
      booktitle = {Proceedings of the 19th International Working Conference on Source Code Analysis and Manipulation (SCAM 2019)},
      year = {2019},
      pages = {204-215},
      doi = {10.1109/SCAM.2019.00031}
    }
  • On the Impact of Refactoring Operations on Code Naturalness
    Bin Lin, Csaba Nagy, Gabriele Bavota and Michele Lanza
    In Proceedings of the 26th International Conference on Software Analysis, Evolution and Reengineering (SANER 2019). Hangzhou, China, pp. 594-598, February 2019.
    Abstract:
    Recent studies have demonstrated that software is natural, that is, its source code is highly repetitive and predictable like human languages. Also, previous studies suggested the existence of a relationship between code quality and its naturalness, presenting empirical evidence showing that buggy code is “less natural” than non-buggy code. We conjecture that this quality-naturalness relationship could be exploited to support refactoring activities (e.g., to locate source code areas in need of refactoring). We perform a first step in this direction by analyzing whether refactoring can improve the naturalness of code. We use state-of-the-art tools to mine a large dataset of refactoring operations performed in open source systems. Then, we investigate the impact of different types of refactoring operations on the naturalness of the impacted code. We found that (i) code refactoring does not necessarily increase the naturalness of the refactored code; and (ii) the impact on the code naturalness strongly depends on the type of refactoring operations.
    BibTex:
    @inproceedings{Lin2019,
      author = {{Lin}, Bin and {Nagy}, Csaba and {Bavota}, Gabriele and {Lanza}, Michele},
      title = {On the Impact of Refactoring Operations on Code Naturalness},
      booktitle = {Proceedings of the 26th International Conference on Software Analysis, Evolution and Reengineering (SANER 2019)},
      year = {2019},
      pages = {594-598},
      doi = {10.1109/SANER.2019.8667992}
    }
  • Software Documentation Issues Unveiled
    Emad Aghajani, Csaba Nagy, Olga Lucero Vega-Márquez, Mario Linares-Vásquez, Laura Moreno, Gabriele Bavota and Michele Lanza
    In Proceedings of the 41st International Conference on Software Engineering (ICSE 2019). Montréal, QC, Canada, pp. 1199-1210, IEEE Press, May 2019.
    Abstract:
    (Good) Software documentation provides developers and users with a description of what a software system does, how it operates, and how it should be used. For example, technical documentation (e.g., an API reference guide) aids developers during evolution/maintenance activities, while a user manual explains how users are to interact with a system. Despite its intrinsic value, the creation and the maintenance of documentation is often neglected, negatively impacting its quality and usefulness, ultimately leading to a generally unfavorable take on documentation. Previous studies investigating documentation issues have been based on surveying developers, which naturally leads to a somewhat biased view of problems affecting documentation. We present a large scale empirical study, where we mined, analyzed, and categorized 878 documentation-related artifacts stemming from four different sources, namely mailing lists, Stack Overflow discussions, issue repositories, and pull requests. The result is a detailed taxonomy of documentation issues from which we infer a series of actionable proposals both for researchers and practitioners.
    BibTex:
    @inproceedings{Aghajani2019,
      author = {Aghajani, Emad and Nagy, Csaba and Vega-Márquez, Olga Lucero and Linares-Vásquez, Mario and Moreno, Laura and Bavota, Gabriele and Lanza, Michele},
      title = {Software Documentation Issues Unveiled},
      booktitle = {Proceedings of the 41st International Conference on Software Engineering (ICSE 2019)},
      publisher = {IEEE Press},
      year = {2019},
      pages = {1199-1210},
      doi = {10.1109/ICSE.2019.00122}
    }
  • SQLInspect: A Static Analyzer to Inspect Database Usage in Java Applications
    Csaba Nagy and Anthony Cleve
    In Proceedings of the IEEE/ACM 40th International Conference on Software Engineering: Companion (ICSE 2018). Gothenburg, Sweden, pp. 93-96, May 2018.
    Abstract:
    We present SQLInspect, a tool intended to assist developers who deal with SQL code embedded in Java applications. It is integrated into Eclipse as a plug-in that is able to extract SQL queries from Java code through static string analysis. It parses the extracted queries and performs various analyses on them. As a result, one can readily explore the source code which accesses a given part of the database, or which is responsible for the construction of a given SQL query. SQL-related metrics and common coding mistakes are also used to spot inefficiently or defectively performing SQL statements and to identify poorly designed classes, like those that construct many queries via complex control-flow paths. SQLInspect is a novel tool that relies on recent query extraction approaches. It currently supports Java applications working with JDBC and SQL code written for MySQL or Apache Impala. Check out the live demo of SQLInspect at http://perso.unamur.be/~cnagy/sqlinspect.
    BibTex:
    @inproceedings{Nagy2018,
      author = {Nagy, Csaba and Cleve, Anthony},
      title = {SQLInspect: A Static Analyzer to Inspect Database Usage in Java Applications},
      booktitle = {Proceedings of the IEEE/ACM 40th International Conference on Software Engineering: Companion (ICSE 2018)},
      year = {2018},
      pages = {93-96}
    }
  • A Large-scale Empirical Study on Linguistic Antipatterns Affecting APIs
    Emad Aghajani, Csaba Nagy, Gabriele Bavota and Michele Lanza
    In Proceedings of the 34th International Conference on Software Maintenance and Evolution (ICSME 2018). Madrid, Spain, IEEE Computer Society, sep, 2018.
    PDF
    Abstract:
    The concept of monolithic stand-alone software systems developed completely from scratch has become obsolete, as modern systems nowadays leverage the abundant presence of Application Programming Interfaces (APIs) developed by third parties, which leads on the one hand to accelerated development, but on the other hand introduces potentially fragile dependencies on external resources. In this context, the design of any API strongly influences how developers write code utilizing it. A wrong design decision like a poorly chosen method name can lead to a steeper learning curve, due to misunderstandings, misuse and eventually bug-prone code in the client projects using the API. It is not unfrequent to find APIs with poorly expressive or misleading names, possibly lacking appropriate documentation. Such issues can manifest in what have been defined in the literature as Linguistic Antipatterns (LAs), i.e., inconsistencies among the naming, documentation, and implementation of a code entity. While previous studies showed the relevance of LAs for software developers, their impact on (developers of) client projects using APIs affected by LAs has not been investigated. This paper fills this gap by presenting a large-scale study conducted on 1.6k releases of popular Maven libraries, 14k open-source Java projects using these libraries, and 4.4k questions related to the investigated APIs asked on Stack Overflow. In particular, we investigate whether developers of client projects have higher chances of introducing bugs when using APIs affected by LAs and if these trigger more questions on Stack Overflow as compared to non-affected APIs.
    BibTex:
    @inproceedings{Aghajani2018,
      author = {Aghajani, Emad and Nagy, Csaba and Bavota, Gabriele and Lanza, Michele},
      title = {A Large-scale Empirical Study on Linguistic Antipatterns Affecting APIs},
      booktitle = {Proceedings of the 34th International Conference on Software Maintenance and Evolution (ICSME 2018)},
      publisher = {IEEE Computer Society},
      year = {2018}
    }
  • Static Code Smell Detection in SQL Queries Embedded in Java Code
    Csaba Nagy and Anthony Cleve
    In Proceedings of the 17th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2017). IEEE Computer Society, 2017.
    PDF
    Abstract:
    A database plays a central role in the architecture of an information system, and the way it stores the data delimits its main features. However, it is not just the data that matters. The way it is handled, i.e., how the application communicates with the database is of critical importance too. Therefore the implementation of such a communication layer has to be reliable and efficient. SQL is a popular language to query a database, and modern technologies rely on it (or its dialects) as query strings embedded in the application code. In many languages (e.g. in Java), an embedded query is typically constructed through several string operations that obstruct developers in understanding the statement finally sent to the database. It is a potential source of fault-prone and inefficient database usage, i.e., code smells. In our paper, we present a tool for the identification of code smells in SQL queries embedded in Java code. Our tool implements a combined static analysis of the SQL statements embedded in the source code, the database schema, and the data in the database. We use a lightweight query extraction algorithm to extract SQL code from the Java code and implement smell detectors on the ASG of our fault-tolerant SQL parser. Depending on the context of the smell, its severity is also determined. Developers can examine the identified issues with the help of an Eclipse plug-in or through command line interfaces.
    BibTex:
    @inproceedings{Nagy2017,
      author = {Nagy, Csaba and Cleve, Anthony},
      title = {Static Code Smell Detection in SQL Queries Embedded in Java Code},
      booktitle = {Proceedings of the 17th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2017)},
      publisher = {IEEE Computer Society},
      year = {2017}
    }
  • Designing and Developing Automated Refactoring Transformations: An Experience Report
    Gábor Szőke, Csaba Nagy, Rudolf Ferenc and Tibor Gyimóthy
    In Proceedings of the 23rd IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER 2016). Osaka, Japan, IEEE Computer Society, mar, 2016.
    PDF
    Abstract:
    There are several challenges which should be kept in mind during the design and development phases of a refactoring tool, and one is that developers have several expectations that are quite hard to satisfy. In this report, we present our experiences of a two-year project where we attempted to create an automatic refactoring tool. In this project, we worked with five software development companies that wanted to improve the maintainability of their products. The project was designed to take into account the expectations of the developers of these companies and consisted of three main stages: a manual refactoring phase, a tool building phase, and an automatic refactoring phase. Throughout these stages we collected the opinions of the developers and faced several challenges on how to automate refactoring transformations, which we present and summarize.
    BibTex:
    @inproceedings{Szoeke2016,
      author = {Szőke, Gábor and Nagy, Csaba and Ferenc, Rudolf and Gyimóthy, Tibor},
      title = {Designing and Developing Automated Refactoring Transformations: An Experience Report},
      booktitle = {Proceedings of the 23rd IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER 2016)},
      publisher = {IEEE Computer Society},
      year = {2016}
    }
  • Detecting and Preventing Program Inconsistencies Under Database Schema Evolution
    Loup Meurice, Csaba Nagy and Anthony Cleve
    In Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability & Security (QRS 2016). Vienna, Austria, IEEE Computer Society, aug, 2016.
    PDF Best Paper Award
    Abstract:
    Nowadays, data-intensive applications tend to access their underlying database in an increasingly dynamic way. The queries that they send to the database server are usually built at runtime, through String concatenation, or Object-Relational-Mapping (ORM) frameworks. This level of dynamicity significantly complicates the task of adapting application programs to database schema changes. Failing to correctly adapt programs to an evolving database schema results in program inconsistencies, which in turn may cause program failures. In this paper, we present a tool-supported approach, that allows developers to (1) analyze how the source code and database schema co-evolved in the past and (2) simulate a database schema change and automatically determine the set of source code locations that would be impacted by this change. The developers are then provided with recommendations about what they should modify at those source code locations in order to avoid inconsistencies. The approach has been designed to deal with Java systems that use dynamic data access frameworks such as JDBC, Hibernate and JPA. We motivate and evaluate the proposed approach, based on three real-life systems of different size and nature.
    BibTex:
    @inproceedings{Meurice2016a,
      author = {Meurice, Loup and Nagy, Csaba and Cleve, Anthony},
      title = {Detecting and Preventing Program Inconsistencies Under Database Schema Evolution},
      booktitle = {Proceedings of the 2016 IEEE International Conference on Software Quality, Reliability & Security (QRS 2016)},
      publisher = {IEEE Computer Society},
      year = {2016}
    }
  • Static Analysis of Dynamic Database Usage in Java Systems
    Loup Meurice, Csaba Nagy and Anthony Cleve
    In Proceedings of the 28th International Conference on Advanced Information Systems Engineering (CAiSE 2016). Ljubljana, Slovenia, Springer LNCS, jun, 2016.
    PDF
    Abstract:
    Understanding the links between application programs and their database is useful in various contexts such as migrating information systems towards a new database platform, evolving the database schema, or assessing the overall system quality. In the case of Java systems, identifying which portion of the source code accesses which portion of the database may prove challenging. Indeed, Java programs typically access their database in a dynamic way. The queries they send to the database server are built at runtime, through String concatenations, or Object-Relational Mapping frameworks like Hibernate and JPA. This paper presents a static analysis approach to program-database links recovery, specifically designed for Java systems. The approach allows developers to automatically identify the source code locations accessing given database tables and columns. It focuses on the combined analysis of JDBC, Hibernate and JPA invocations. We report on the use of our approach to analyse three real-life Java systems.
    BibTex:
    @inproceedings{Meurice2016,
      author = {Meurice, Loup and Nagy, Csaba and Cleve, Anthony},
      title = {Static Analysis of Dynamic Database Usage in Java Systems},
      booktitle = {Proceedings of the 28th International Conference on Advanced Information Systems Engineering (CAiSE 2016)},
      publisher = {Springer LNCS},
      year = {2016}
    }
  • Do Automatic Refactorings Improve Maintainability? An Industrial Case Study
    Gábor Szőke, Csaba Nagy, Péter Hegedűs, Rudolf Ferenc and Tibor Gyimóthy
    In Proceedings of the 31st International Conference on Software Maintenance and Evolution (ICSME 2015). Bremen, Germany, pp. 429-438, IEEE, sep, 2015.
    PDF
    Abstract:
    Refactoring is often treated as the main remedy against the unavoidable code erosion happening during software evolution. Studies show that refactoring is indeed an elemental part of the developers' arsenal. However, empirical studies about the impact of refactorings on software maintainability still did not reach a consensus. Moreover, most of these empirical investigations are carried out on open-source projects where distinguishing refactoring operations from other development activities is a challenge in itself. We had a chance to work together with several software development companies in a project where they got extra budget to improve their source code by performing refactoring operations. Taking advantage of this controlled environment, we collected a large amount of data during a refactoring phase where the developers used a (semi)automatic refactoring tool. By measuring the maintainability of the involved subject systems before and after the refactorings, we got valuable insights into the effect of these refactorings on large-scale industrial projects. All but one company, who applied a special refactoring strategy, achieved a maintainability improvement at the end of the refactoring phase, but even that one company suffered from the negative impact of only one type of refactoring.
    BibTex:
    @inproceedings{Szoeke2015a,
      author = {Szőke, Gábor and Nagy, Csaba and Hegedűs, Péter and Ferenc, Rudolf and Gyimóthy, Tibor},
      title = {Do Automatic Refactorings Improve Maintainability? An Industrial Case Study},
      booktitle = {Proceedings of the 31st International Conference on Software Maintenance and Evolution (ICSME 2015)},
      publisher = {IEEE},
      year = {2015},
      pages = {429-438}
    }
  • FaultBuster: An Automatic Code Smell Refactoring Toolset
    Gábor Szőke, Csaba Nagy, Lajos Jenő Fülöp, Rudolf Ferenc and Tibor Gyimóthy
    In Proceedings of the 15th International Working Conference on Source Code Analysis and Manipulation (SCAM 2015). Bremen, Germany, pp. 253-258, IEEE, sep, 2015.
    PDF
    Abstract:
    One solution to prevent the quality erosion of a software product is to maintain its quality by continuous refactoring. However, refactoring is not always easy. Developers need to identify the piece of code that should be improved and decide how to rewrite it. Furthermore, refactoring can also be risky; that is, the modified code needs to be re-tested, so developers can see if they broke something. Many IDEs offer a range of refactorings to support so-called automatic refactoring, but tools which are really able to automatically refactor code smells are still under research. In this paper we introduce FaultBuster, a refactoring toolset which is able to support automatic refactoring: identifying the problematic code parts via static code analysis, running automatic algorithms to fix selected code smells, and executing integrated testing tools. In the heart of the toolset lies a refactoring framework to control the analysis and the execution of automatic algorithms. FaultBuster provides IDE plugins to interact with developers via popular IDEs (Eclipse, Netbeans and IntelliJ IDEA). All the tools were developed and tested in a 2-year project with 6 software development companies where thousands of code smells were identified and fixed in 5 systems having altogether over 5 million lines of code.
    BibTex:
    @inproceedings{Szoeke2015,
      author = {Szőke, Gábor and Nagy, Csaba and Fülöp, Lajos Jenő and Ferenc, Rudolf and Gyimóthy, Tibor},
      title = {FaultBuster: An Automatic Code Smell Refactoring Toolset},
      booktitle = {Proceedings of the 15th International Working Conference on Source Code Analysis and Manipulation (SCAM 2015)},
      publisher = {IEEE},
      year = {2015},
      pages = {253-258}
    }
  • Mining Stack Overflow for Discovering Error Patterns in SQL Queries
    Csaba Nagy and Anthony Cleve
    In Proceedings of the 31st International Conference on Software Maintenance and Evolution (ICSME 2015). pp. 516-520, IEEE, 2015.
    DOI PDF
    Abstract:
    Constructing complex queries in SQL sometimes necessitates the use of language constructs and the invocation of internal functions which inexperienced developers find hard to comprehend or which are unknown to them. In the worst case, bad usage of these constructs might lead to errors, to ineffective queries, or hamper developers in their tasks. This paper presents a mining technique for Stack Overflow to identify error-prone patterns in SQL queries. Identifying such patterns can help developers to avoid the use of error-prone constructs, or if they have to use such constructs, the Stack Overflow posts can help them to properly utilize the language. Hence, our purpose is to provide the initial steps towards a recommendation system that supports developers in constructing SQL queries. Our current implementation supports the MySQL dialect, and Stack Overflow has over 300,000 questions tagged with the MySQL flag in its database. It provides a huge knowledge base where developers can ask questions about real problems. Our initial results indicate that our technique is indeed able to identify patterns among them.
    BibTex:
    @inproceedings{Nagy2015a,
      author = {Nagy, Csaba and Cleve, Anthony},
      title = {Mining Stack Overflow for Discovering Error Patterns in SQL Queries},
      booktitle = {Proceedings of the 31st International Conference on Software Maintenance and Evolution (ICSME 2015)},
      publisher = {IEEE},
      year = {2015},
      pages = {516-520},
      doi = {10.1109/ICSM.2015.7332505}
    }
  • Where Was This SQL Query Executed? A Static Concept Location Approach
    Csaba Nagy, Loup Meurice and Anthony Cleve
    In Proceedings of the 22nd IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER 2015). Montréal, Québec, Canada, pp. 580-584, IEEE Computer Society, mar 2-6, 2015.
    PDF
    Abstract:
    Concept location in software engineering is the process of identifying where a specific concept is implemented in the source code of a software system. It is a very common task performed by developers during development or maintenance, and many techniques have been studied by researchers to make it more efficient. However, most of the current techniques ignore the role of a database in the architecture of a system, which is also an important source of concepts or dependencies among them. In this paper, we present a concept location technique for data-intensive systems, as systems with at least one database server in their architecture which is intensively used by its clients. Specifically, we present a static technique for identifying the exact source code location from where a given SQL query was sent to the database. We evaluate our technique by collecting and locating SQL queries from testing scenarios of two open source Java systems under active development. With our technique, we are able to successfully identify the source of most of these queries.
    BibTex:
    @inproceedings{Nagy2015,
      author = {Nagy, Csaba and Meurice, Loup and Cleve, Anthony},
      title = {Where Was This SQL Query Executed? A Static Concept Location Approach},
      booktitle = {Proceedings of the 22nd IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER 2015)},
      publisher = {IEEE Computer Society},
      year = {2015},
      pages = {580-584}
    }
  • Bulk Fixing Coding Issues and Its Effects on Software Quality: Is It Worth Refactoring?
    Gábor Szőke, Gábor Antal, Csaba Nagy, Rudolf Ferenc and Tibor Gyimóthy
    In 14th IEEE International Working Conference on Source Code Analysis and Manipulation, SCAM 2014. Victoria, BC, Canada, pp. 95-104, IEEE Computer Society, sep 28-29, 2014.
    DOI PDF
    Abstract:
    The quality of a software system is mostly defined by its source code. Software evolves continuously, it gets modified, enhanced, and new requirements always arise. If we do not spend time periodically on improving our source code, it becomes messy and its quality will decrease inevitably. Literature tells us that we can improve the quality of our software product by regularly refactoring it. But does refactoring really increase software quality? Can it happen that a refactoring decreases the quality? Is it possible to recognize the change in quality caused by a single refactoring operation? In our paper, we seek answers to these questions in a case study of refactoring large-scale proprietary software systems. We analyzed the source code of 5 systems, and measured the quality of several revisions for a period of time. We analyzed 2 million lines of code and identified nearly 200 refactoring commits which fixed over 500 coding issues. We found that one single refactoring only makes a small change (sometimes even decreases quality), but when we do them in blocks, we can significantly increase quality, which can result not only in the local, but also in the global improvement of the code.
    BibTex:
    @inproceedings{Szoke2014,
      author = {Szőke, Gábor and Antal, Gábor and Nagy, Csaba and Ferenc, Rudolf and Gyimóthy, Tibor},
      title = {Bulk Fixing Coding Issues and Its Effects on Software Quality: Is It Worth Refactoring?},
      booktitle = {14th IEEE International Working Conference on Source Code Analysis and Manipulation, SCAM 2014},
      publisher = {IEEE Computer Society},
      year = {2014},
      pages = {95-104},
      doi = {10.1109/SCAM.2014.18}
    }
  • A Case Study of Refactoring Large-Scale Industrial Systems to Efficiently Improve Source Code Quality
    Gábor Szőke, Csaba Nagy, Rudolf Ferenc and Tibor Gyimóthy
    In Computational Science and Its Applications – ICCSA 2014. Guimarães, Portugal, Vol. 8583, pp. 524-540, Springer International Publishing, jun, 2014.
    DOI PDF
    Abstract:
    Refactoring source code has many benefits (e.g. improving maintainability, robustness and source code quality), but it takes time away from other implementation tasks, resulting in developers neglecting refactoring steps during the development process. But what happens when they know that the quality of their source code needs to be improved and they can get the extra time and money to refactor the code? What will they do? What will they consider the most important for improving source code quality? What sort of issues will they address first or last and how will they solve them? In our paper, we look for answers to these questions in a case study of refactoring large-scale industrial systems where developers participated in a project to improve the quality of their software systems. We collected empirical data of over a thousand refactoring patches for 5 systems with over 5 million lines of code in total, and we found that developers really optimized the refactoring process to significantly improve the quality of these systems.
    BibTex:
    @inproceedings{Szoeke2014,
      author = {Szőke, Gábor and Nagy, Csaba and Ferenc, Rudolf and Gyimóthy, Tibor},
      editor = {Murgante, Beniamino and Misra, Sanjay and Rocha, Ana Maria A. C. and Torre, Carmelo and Rocha, Jorge Gustavo and Falcão, Maria Irene and Taniar, David and Apduhan, Bernady O. and Gervasi, Osvaldo},
      title = {A Case Study of Refactoring Large-Scale Industrial Systems to Efficiently Improve Source Code Quality},
      booktitle = {Computational Science and Its Applications – ICCSA 2014},
      publisher = {Springer International Publishing},
      year = {2014},
      volume = {8583},
      pages = {524-540},
      doi = {10.1007/978-3-319-09156-3_37}
    }
  • A Regression Test Selection Technique for Magic Systems
    Gábor Novák, Csaba Nagy and Rudolf Ferenc
    In Proceedings of the 13th Symposium on Programming Languages and Software Tools (SPLST 2013). Szeged, Hungary, pp. 76-89, aug 26-27, 2013.
    PDF
    Abstract:
    Regression testing is an important step to make sure that after committing a change to our software we do not make unwanted changes to other, untouched features. For larger and faster evolving software, however, executing all the test cases of a regression test can easily become a tremendous process which takes too much time to thoroughly test each change separately. In our paper, we present a method to support regression testing with impact analysis based test selection. As a result, we can show a limited set of test cases that must be re-executed after a change, to test the changed part of the code and its related code elements. Our technique is implemented for a special 4th-generation language, the Magic xpa development environment. The technique was implemented in cooperation with our industrial partner, SZEGED Software Inc., who has been developing Magic applications for more than a decade.
    BibTex:
    @inproceedings{Novak2013,
      author = {Novák, Gábor and Nagy, Csaba and Ferenc, Rudolf},
      title = {A Regression Test Selection Technique for Magic Systems},
      booktitle = {Proceedings of the 13th Symposium on Programming Languages and Software Tools (SPLST 2013)},
      year = {2013},
      pages = {76-89}
    }
  • Static Analysis of Data-Intensive Applications
    Csaba Nagy
    In Proceedings of the 17th European Conference on Software Maintenance and Reengineering (CSMR 2013). Genova, Italy, IEEE Computer Society, mar 5-8, 2013.
    DOI PDF
    Abstract:
    Data-intensive systems are designed to handle data at massive scale, and during the years they might evolve to very large, complex systems. In order to support maintenance tasks of these systems several techniques have been developed to analyze the source code of applications or to analyze the underlying databases for the purpose of reverse engineering, e.g. quality assurance or program comprehension. However, only a few techniques take into account the specialties of data-intensive systems (e.g. dependencies arising via database accesses). In this thesis we conducted research to analyze and to improve data-intensive applications via different methods based on static analysis: methods for recovering architecture of data-intensive systems and a quality assurance methodology for applications developed in Magic 4GL. We targeted SQL as the most widespread databases are relational databases using certain dialect of SQL for their queries. With the proposed techniques we were able to analyze large scale industrial projects, such as banking systems with more than 3 million lines of code, and we successfully recovered architecture maps and quality issues of these systems.
    BibTex:
    @inproceedings{Nagy2013,
      author = {Nagy, Csaba},
      title = {Static Analysis of Data-Intensive Applications},
      booktitle = {Proceedings of the 17th European Conference on Software Maintenance and Reengineering (CSMR 2013)},
      publisher = {IEEE Computer Society},
      year = {2013},
      doi = {10.1109/CSMR.2013.66}
    }
  • A Methodology and Framework for Automatic Layout Independent GUI Testing of Applications Developed in Magic xpa
    Dániel Fritsi, Csaba Nagy, Rudolf Ferenc and Tibor Gyimóthy
    In Proceedings of the 13th International Conference on Computational Science and Its Applications - ICCSA 2013 - Part II. Ho Chi Minh City, Vietnam, pp. 513-528, Springer, jun 24-27, 2013.
    DOI PDF
    Abstract:
    Testing an application via its Graphical User Interface (GUI) requires lots of manual work, even if some steps of GUI testing can be automated. Test automation tools are great help for testers, particularly for regression testing. However these tools still lack some important features and still require manual work to maintain the test cases. For instance, if the layout of a window is changed without affecting the main functionality of the application, all test cases testing the window must be re-recorded again. This hard maintenance work is one of the greatest problems with the regression tests of GUI applications. In our paper we propose an approach to use the GUI information stored in the source code during automatic testing processes to create layout independent test scripts. The idea was motivated by testing an application developed in a fourth generation language, Magic. In this language the layout of the GUI elements (e.g. position and size of controls) are stored in the code and can be gathered via static code analysis. We implemented the presented approach for Magic xpa in a tool called Magic Test Automation, which is used by our industrial partner who has developed applications in Magic for more than a decade.
    BibTex:
    @inproceedings{Fritsi2013,
      author = {Fritsi, Dániel and Nagy, Csaba and Ferenc, Rudolf and Gyimóthy, Tibor},
      title = {A Methodology and Framework for Automatic Layout Independent GUI Testing of Applications Developed in Magic xpa},
      booktitle = {Proceedings of the 13th International Conference on Computational Science and Its Applications - ICCSA 2013 - Part II},
      publisher = {Springer},
      year = {2013},
      pages = {513-528},
      doi = {10.1007/978-3-642-39643-4_37}
    }
  • Designing and Implementing Control Flow Graph for Magic 4th Generation Language
    Richárd Dévai, Judit Jász, Csaba Nagy and Rudolf Ferenc
    In Proceedings of the 13th Symposium on Programming Languages and Software Tools (SPLST 2013). Szeged, Hungary, pp. 200-214, aug 26-27, 2013.
    PDF
    Abstract:
    A good compiler which implements many optimizations during its compilation phases must be able to perform several static analysis techniques such as control flow or data flow analysis. Besides compilers, these techniques are common for static analyzers to retrieve information from the code for example code auditing, quality assurance, or testing purposes. Implementing control flow analysis requires handling many special structures of the target language. In our paper we present our experiences in implementing control flow graph (CFG) construction for a special 4th generation language called Magic. During designing and implementing the CFG for this language we identified differences compared to 3rd generation languages because the special programming technique of this language (e.g. data access, parallel task execution, events). Our work was motivated by our industrial partner who needed precise static analysis tools (e.g. for quality assurance or testing purposes) for this language. We believe that our experiences for Magic, as a representative of 4GLs might be generalized for other languages too.
    BibTex:
    @inproceedings{Devai2013,
      author = {Dévai, Richárd and Jász, Judit and Nagy, Csaba and Ferenc, Rudolf},
      title = {Designing and Implementing Control Flow Graph for Magic 4th Generation Language},
      booktitle = {Proceedings of the 13th Symposium on Programming Languages and Software Tools (SPLST 2013)},
      year = {2013},
      pages = {200-214}
    }
  • Solutions for Reverse Engineering 4GL Applications, Recovering the Design of a Logistical Wholesale System
    Csaba Nagy, László Vidács, Rudolf Ferenc, Tibor Gyimóthy, Ferenc Kocsis and István Kovács
    In Proceedings of the 15th European Conference on Software Maintenance and Reengineering (CSMR 2011). pp. 343-346, IEEE Computer Society, mar, 2011.
    DOI PDF
    Abstract:
    Re-engineering a legacy software system to support new, modern technologies instead of old ones is not an easy task, especially for large systems with a complex architecture. The use of reverse engineering tools is crucial for different subtasks of the full process, such as re-documenting the old code or recovering its design. There are many tools available to assist developers, but most of these tools were designed to deal with third generation languages (e.g. Java, C, C++, C#). However, many large systems are developed in higher level languages (e.g. Magic, Informix, ABAP) and current tools are not able to support all the arising problems during re-engineering systems written in fourth generation languages. In this paper we present a project whose main goal is the development of a technologically and functionally renewed medicinal wholesale system. This system is developed in Magic 4GL, and its development is based on re-engineering an old Magic (version 5) system to uniPaaS, which is the current release version of Magic. In the early phases of this project we developed a reverse engineering toolset for Magic 4GL to support reverse engineering, recovering the design of the old system, and to support some forward engineering tasks too. Here we present a report on this project that was carried out in cooperation with SZEGED Software Zrt and the Department of Software Engineering at the University of Szeged. The project was partly funded by the Economic Development Operational Programme, New Hungary Development Plan.
    BibTex:
    @inproceedings{Nagy2011b,
      author = {Nagy, Csaba and Vidács, László and Ferenc, Rudolf and Gyimóthy, Tibor and Kocsis, Ferenc and Kovács, István},
      title = {Solutions for Reverse Engineering 4GL Applications, Recovering the Design of a Logistical Wholesale System},
      booktitle = {Proceedings of the 15th European Conference on Software Maintenance and Reengineering (CSMR 2011)},
      publisher = {IEEE Computer Society},
      year = {2011},
      pages = {343-346},
      doi = {10.1109/CSMR.2011.66}
    }
  • Complexity measures in 4GL environment
    Csaba Nagy, László Vidács, Rudolf Ferenc, Tibor Gyimóthy, Ferenc Kocsis and István Kovács
    In Proceedings of the 2011 International Conference on Computational Science and Its Applications - Volume Part V. Santander, Spain, pp. 293-309, Springer-Verlag, jun 20-23, 2011.
    DOI PDF
    Abstract:
    Nowadays, the most popular programming languages are so-called third generation languages, such as Java, C# and C++, but higher level languages are also widely used for application development. Our work was motivated by the need for a quality assurance solution for a fourth generation language (4GL) called Magic. We realized that these very high level languages lie outside the main scope of recent static analysis techniques and researches, even though there is an increasing need for solutions in 4GL environment. During the development of our quality assurance framework we faced many challenges in adapting metrics from popular 3GLs and defining new ones in 4GL context. Here we present our results and experiments focusing on the complexity of a 4GL system. We found that popular 3GL metrics can be easily adapted based on syntactic structure of a language, however it requires more complex solutions to define complexity metrics that are closer to developers' opinion. The research was conducted in co-operation with a company where developers have been programming in Magic for more than a decade. As an outcome, the resulting metrics are used in a novel quality assurance framework based on the Columbus methodology.
    BibTex:
    @inproceedings{Nagy2011a,
      author = {Nagy, Csaba and Vidács, László and Ferenc, Rudolf and Gyimóthy, Tibor and Kocsis, Ferenc and Kovács, István},
      title = {Complexity measures in 4GL environment},
      booktitle = {Proceedings of the 2011 International Conference on Computational Science and Its Applications - Volume Part V},
      publisher = {Springer-Verlag},
      year = {2011},
      pages = {293-309},
      doi = {10.1007/978-3-642-21934-4_25}
    }
  • A true story of refactoring a large Oracle PL/SQL banking system
    Csaba Nagy, Rudolf Ferenc and Tibor Bakota
    In Industrial Track of the 8th joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE 2011). Szeged, Hungary, sep 5-9, 2011.
    PDF
    Abstract:
    It is common that due to the pressure of business, banking systems evolve and grow fast and even the slightest wrong decision may result in losing control over the codebase in long term. Once it happens, the business will not drive developments any more, but will be constrained by maintenance preoccupations. As easy is to lose control, as hard is to regain it again. Software comprehension and refactoring are the proper means for reestablishing governance over the system, but they require sophisticated tools and methods that help analyzing, understanding and refactoring the codebase. This paper tells a true story about how control has been lost and regained again in case of a real banking system written in PL/SQL programming language.
    BibTex:
    @inproceedings{Nagy2011,
      author = {Nagy, Csaba and Ferenc, Rudolf and Bakota, Tibor},
      title = {A true story of refactoring a large Oracle PL/SQL banking system},
      booktitle = {Industrial Track of the 8th joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE 2011)},
      year = {2011}
    }
  • A Layout Independent GUI Test Automation Tool for Applications Developed in Magic/uniPaaS
    Dániel Fritsi, Csaba Nagy, Rudolf Ferenc and Tibor Gyimóthy
    In Proceedings of the 12th Symposium on Programming Languages and Software Tools (SPLST 2011). Tallinn, Estonia, pp. 248-259, oct 4-7, 2011.
    PDF
    Abstract:
    A good software development process involves thorough testing phases, that are usually expensive, but necessary to deliver a reliable and high quality product. Testing an application via its graphical user interface requires lots of manual work, even if some steps of GUI testing can be automated. Test automation tools are a great help for testers, particularly for regression tests. However, these tools still lack some important features and still require manual work to maintain the test cases. For instance, if the layout of a window is changed without affecting the main functionality of the application, all test cases testing the window must be re-recorded. This hard maintenance work is one of the greatest problems with the regression tests of GUI applications. In our paper we propose an approach to use the GUI information stored in the source code during automatic testing processes to create layout independent test scripts. With this technique, the already recorded test scripts will be unaffected by minor changes in the GUI. It reduces the maintenance effort of very expensive regression tests where thousands of test cases have to be maintained by testing teams. The idea was motivated by testing an application developed in a fourth generation language, Magic/uniPaaS. In this language the layout of the GUI elements (structure of the window, position and size of controls, etc.) are stored in the code and it can be gathered via static code analysis. We implemented the presented approach for Magic/uniPaaS, and our Magic Test Automation tool is used by our industrial partner who has developed applications in Magic/uniPaaS for more than a decade.
    BibTex:
    @inproceedings{Fritsi2011,
      author = {Fritsi, Dániel and Nagy, Csaba and Ferenc, Rudolf and Gyimóthy, Tibor},
      title = {A Layout Independent GUI Test Automation Tool for Applications Developed in Magic/uniPaaS},
      booktitle = {Proceedings of the 12th Symposium on Programming Languages and Software Tools (SPLST 2011)},
      year = {2011},
      pages = {248-259}
    }
  • CIASYS--Change Impact Analysis at System Level
    Gabriella Tóth, Csaba Nagy, Judit Jász, Árpád Beszédes and Lajos Fülöp
    In Proceedings of the 14th European Conference on Software Maintenance and Reengineering (CSMR 2010). Madrid, Spain, pp. 198-201, IEEE Computer Society, mar 15-18, 2010.
    DOI
    Abstract:
    The research field of change impact analysis plays an important role in software engineering theory and practice nowadays. Not only because it has many scientific challenges, but it has many industrial applications too (e.g., cost estimation, test optimization), and the current techniques are still not ready to fulfill the requirements of industry. Typically, the current solutions lack a whole-system view and give either precise results with high computation costs or less precise results with fast algorithms. For these reasons, they are not applicable to large industrial systems where both scalability and precision are very important. In this paper, we present a project whose main goal is to develop an innovative change impact analysis software suite based on recent scientific results and modern technologies. The suite will use hybrid analysis techniques to benefit from all the advantages of static and dynamic analyses. In addition, it will be able to determine the dependencies at system level of software systems with heterogeneous architecture. The software is being developed by FrontEndART Ltd. while the theoretical and technological background is provided by the Department of Software Engineering at the University of Szeged. The project is funded by the Economic Development Operational Programme, New Hungary Development Plan.
    BibTex:
    @inproceedings{Toth2010,
      author = {Tóth, Gabriella and Nagy, Csaba and Jász, Judit and Beszédes, Árpád and Fülöp, Lajos},
      title = {CIASYS--Change Impact Analysis at System Level},
      booktitle = {Proceedings of the 14th European Conference on Software Maintenance and Reengineering (CSMR 2010)},
      publisher = {IEEE Computer Society},
      year = {2010},
      pages = {198-201},
      doi = {10.1109/CSMR.2010.35}
    }
  • Towards a Safe Method for Computing Dependencies in Database-Intensive Systems
    Csaba Nagy, János Pántos, Tamás Gergely and Árpád Beszédes
    In Proceedings of the 14th European Conference on Software Maintenance and Reengineering (CSMR 2010). Madrid, Spain, pp. 166-175, IEEE Computer Society, mar 15-18, 2010.
    DOI PDF
    Abstract:
    Determining dependencies between different components of an application is useful in lots of applications (e.g., architecture reconstruction, reverse engineering, regression test case selection, change impact analysis). However, implementing automated methods to recover dependencies has many challenges, particularly in systems using databases, where dependencies may arise via database access. Furthermore, it is especially hard to find safe techniques (which do not omit any important dependency) that are applicable to large and complex systems at the same time. We propose two techniques that can cope with these problems in most situations. These methods compute dependencies between procedures or database tables, and they are based on the simultaneous static analysis of the source code, the database schema and the SQL instructions. In this paper, we quantitatively and qualitatively evaluate the methods on real-life data, and also evaluate them on some of their potential applications.
    BibTex:
    @inproceedings{Nagy2010a,
      author = {Nagy, Csaba and Pántos, János and Gergely, Tamás and Beszédes, Árpád},
      title = {Towards a Safe Method for Computing Dependencies in Database-Intensive Systems},
      booktitle = {Proceedings of the 14th European Conference on Software Maintenance and Reengineering (CSMR 2010)},
      publisher = {IEEE Computer Society},
      year = {2010},
      pages = {166-175},
      doi = {10.1109/CSMR.2010.29}
    }
  • MAGISTER: Quality assurance of Magic applications for software developers and end users
    Csaba Nagy, László Vidács, Rudolf Ferenc, Tibor Gyimóthy, Ferenc Kocsis and István Kovács
    In Proceedings of the 26th IEEE International Conference on Software Maintenance (ICSM 2010). Timisoara, Romania, pp. 1-6, IEEE Computer Society, sep 12-18, 2010.
    DOI PDF
    Abstract:
    Nowadays there are many tools and methods available for source code quality assurance based on static analysis, but most of these tools focus on traditional software development techniques with 3GL languages. Besides procedural languages, 4GL programming languages such as Magic 4GL and Progress are widely used for application development. All these languages lie outside the main scope of analysis techniques. In this paper we present MAGISTER, which is a quality assurance framework for applications being developed in Magic, a 4GL application development solution created by Magic Software Enterprises. MAGISTER extracts data using static analysis methods from applications being developed in different versions of Magic (v5-9 and uniPaaS). The extracted data (including metrics, rule violations and dependency relations) is presented to the user via a GUI so it can be queried and visualized for further analysis. It helps software developers, architects and managers through the full development cycle by performing continuous code scans and measurements.
    BibTex:
    @inproceedings{Nagy2010,
      author = {Nagy, Csaba and Vidács, László and Ferenc, Rudolf and Gyimóthy, Tibor and Kocsis, Ferenc and Kovács, István},
      title = {MAGISTER: Quality assurance of Magic applications for software developers and end users},
      booktitle = {Proceedings of the 26th IEEE International Conference on Software Maintenance (ICSM 2010)},
      publisher = {IEEE Computer Society},
      year = {2010},
      pages = {1-6},
      doi = {10.1109/ICSM.2010.5609550}
    }
  • Static Security Analysis Based on Input-Related Software Faults
    Csaba Nagy and Spiros Mancoridis
    In Proceedings of the 13th European Conference on Software Maintenance and Reengineering (CSMR '09). Fraunhofer IESE, Kaiserslautern, Germany, pp. 37-46, IEEE Computer Society, mar 24-27, 2009.
    DOI PDF
    Abstract:
    It is important to focus on security aspects during the development cycle to deliver reliable software. However, locating security faults in complex systems is difficult and there are only a few effective automatic tools available to help developers. In this paper we present an approach to help developers locate vulnerabilities by marking parts of the source code that involve user input. We focus on input-related code, since an attacker can usually take advantage of vulnerabilities by passing malformed input to the application. The main contributions of this work are two metrics to help locate faults during a code review, and algorithms to locate buffer overflow and format string vulnerabilities in C source code. We implemented our approach as a plug in to the Grammatech CodeSurfer tool. We tested and validated our technique on open source projects and we found faults in software that includes Pidgin and cyrus-imapd.
    BibTex:
    @inproceedings{Nagy2009,
      author = {Nagy, Csaba and Mancoridis, Spiros},
      title = {Static Security Analysis Based on Input-Related Software Faults},
      booktitle = {Proceedings of the 13th European Conference on Software Maintenance and Reengineering (CSMR '09)},
      publisher = {IEEE Computer Society},
      year = {2009},
      pages = {37-46},
      doi = {10.1109/CSMR.2009.51}
    }
  • Code factoring in GCC on different intermediate languages
    Csaba Nagy, Gábor Lóki, Árpád Beszédes and Tibor Gyimóthy
    In Proceedings of the 10th Symposium on Programming Languages and Software Tools (SPLST 2007). Budapest, Hungary, pp. 81-95, jun 14-16, 2007.
    BibTex:
    @inproceedings{Nagy2007b,
      author = {Nagy, Csaba and Lóki, Gábor and Beszédes, Árpád and Gyimóthy, Tibor},
      title = {Code factoring in GCC on different intermediate languages},
      booktitle = {Proceedings of the 10th Symposium on Programming Languages and Software Tools (SPLST 2007)},
      year = {2007},
      pages = {81-95}
    }
  • Extension of GCC with a fully manageable reverse engineering front end
    Csaba Nagy
    In Proceedings of the 7th International Conference on Applied Informatics (ICAI 2007). Eger, Hungary, jan 28-31, 2007.
    PDF
    Abstract:
    In the open source community one of the most popular compilers is GNU GCC. It is a very complex and robust compiler but because of its working mechanism it has no ability for special transformations like interprocedural optimizations. A typical compiler has a three sided construction. It has a front end for analysis and for building an abstract internal representation of the program, a middle for transformations (e.g., optimizations), and a back end for final code generation. However there are smaller but very useful projects for only front/middle/back ends, too. It seems possible to achieve a more effective compiler by extending GCC with a front end which is capable of running special algorithms. This paper shows one solution for this extension. The described method is based on using Columbus/CAN instead of GCC's front end and GCC's back end for code generation. As Columbus has a well-structured schema for the representation of C++ sources, by this transformation we will have the ability to execute those special transformations on the code before the compiling phases. Furthermore this technique opens the possibility to link other front ends (like EDG) with GCC to achieve a more powerful compiler, for example in code size optimizations. This approach has been tested on GCC's official Code-Size Benchmark Environment (CSiBE) as real-world system and for the testing different metrics have been measured on the compilation with this 'extended compiler' and with the official GCC.
    BibTex:
    @inproceedings{Nagy2007a,
      author = {Nagy, Csaba},
      title = {Extension of GCC with a fully manageable reverse engineering front end},
      booktitle = {Proceedings of the 7th International Conference on Applied Informatics (ICAI 2007)},
      year = {2007}
    }

Book Chapters

Journal Articles

  • Empirical study on refactoring large-scale industrial systems and its effects on maintainability
    Gábor Szőke, Gábor Antal, Csaba Nagy, Rudolf Ferenc and Tibor Gyimóthy
    Journal of Systems and Software. Vol. 129, pp. 107-126, jul, 2017.
    DOI PDF
    Abstract:
    Software evolves continuously, it gets modified, enhanced, and new requirements always arise. If we do not spend time occasionally on improving our source code, its maintainability will inevitably decrease. The literature tells us that we can improve the maintainability of a software system by regularly refactoring it. But does refactoring really increase software maintainability? Can it happen that refactoring decreases the maintainability? Empirical studies show contradicting answers to these questions and there have been only a few studies which were performed in a large-scale, industrial context. In our paper, we assess these questions in an in vivo context, where we analyzed the source code and measured the maintainability of 6 large-scale, proprietary software systems in their manual refactoring phase. We analyzed 2.5 million lines of code and studied the effects on maintainability of 315 refactoring commits which fixed 1273 coding issues. We found that single refactorings only make a very little difference (sometimes even decrease maintainability), but a whole refactoring period, in general, can significantly increase maintainability, which can result not only in the local, but also in the global improvement of the code.
    BibTex:
    @article{Szoeke2017,
      author = {Szőke, Gábor and Antal, Gábor and Nagy, Csaba and Ferenc, Rudolf and Gyimóthy, Tibor},
      title = {Empirical study on refactoring large-scale industrial systems and its effects on maintainability},
      journal = {Journal of Systems and Software},
      year = {2017},
      volume = {129},
      pages = {107-126},
      doi = {10.1016/j.jss.2016.08.071}
    }
  • Designing and Implementing Control Flow Graph for Magic 4th Generation Language
    Richárd Dévai, Judit Jász, Csaba Nagy and Rudolf Ferenc
    Acta Cybernetica. Vol. 21, 3, pp. 419-437, 2014.
    PDF
    Abstract:
    A good compiler which implements many optimizations during its compilation phases must be able to perform several static analysis techniques such as control flow or data flow analysis. Besides compilers, these techniques are common for static analyzers as well to retrieve information from source code, for example for code auditing, quality assurance or testing purposes. Implementing control flow analysis requires handling many special structures of the target language. In our paper we present our experiences in implementing control flow graph (CFG) construction for a special 4th generation language called Magic. While we were designing and implementing the CFG for this language, we identified differences compared to 3rd generation languages mostly because of the unique programming technique of Magic (e.g. data access, parallel task execution, events). Our work was motivated by our industrial partner who needed precise static analysis tools (e.g. for quality assurance or testing purposes) for this language. We believe that our experiences for Magic, as a representative of 4GLs, might be generalized for other languages too.
    BibTex:
    @article{Devai2014,
      author = {Dévai, Richárd and Jász, Judit and Nagy, Csaba and Ferenc, Rudolf},
      title = {Designing and Implementing Control Flow Graph for Magic 4th Generation Language},
      journal = {Acta Cybernetica},
      year = {2014},
      volume = {21},
      number = {3},
      pages = {419-437}
    }
  • Code factoring in GCC on different intermediate languages
    Csaba Nagy, Gábor Lóki, Árpád Beszédes and Tibor Gyimóthy
    ANNALES UNIVERSITATIS SCIENTIARUM BUDAPESTINENSIS DE ROLANDO EOTVOS NOMINATAE Sectio Computatorica - TOMUS XXX. pp. 79-96, 2009.
    PDF
    Abstract:
    Today as handheld devices (smart phones, PDAs, etc.) are becoming increasingly popular, storage capacity becomes more and more important. One way to increase capacity is to optimize static executables on the device. This resulted that code-size optimization gets bigger attention nowadays and new techniques are observed, like code factoring which is still under research. Although GNU GCC is the most common compiler in the open source community and has many implemented algorithms for code-size optimization, the compiler is still weak in these methods, which can be turned on using the `-Os' flag. In this article we would like to give an overview on implementation of different code factoring algorithms (local factoring, sequence abstraction, interprocedural abstraction) on the IPA, Tree, Tree SSA and RTL passes of GCC. The correctness of the implementation was checked, and the results were measured on different architectures with GCC's official Code-Size Benchmark Environment (CSiBE) as a real-world system. These results showed that on the ARM architecture we could achieve 61.53% maximum and 2.58% average extra code-size saving compared to the `-Os' flag of GCC.
    BibTex:
    @article{Nagy2009a,
      author = {Nagy, Csaba and Lóki, Gábor and Beszédes, Árpád and Gyimóthy, Tibor},
      title = {Code factoring in GCC on different intermediate languages},
      journal = {ANNALES UNIVERSITATIS SCIENTIARUM BUDAPESTINENSIS DE ROLANDO EOTVOS NOMINATAE Sectio Computatorica - TOMUS XXX},
      year = {2009},
      pages = {79-96}
    }

Theses

  • Evaluating optimization and reverse engineering techniques on data-intensive systems
    Csaba Nagy
    PhD Thesis, University of Szeged, Szeged, Hungary, dec, 2013.
    PDF
    Abstract:
    BibTex:
    @phdthesis{Nagy2013a,
      author = {Nagy, Csaba},
      title = {Evaluating optimization and reverse engineering techniques on data-intensive systems},
      school = {University of Szeged},
      year = {2013}
    }
  • Extension of GCC with a fully manageable reverse engineering front end
    Csaba Nagy
    Master's Thesis, University of Szeged, Szeged, Hungary, 2007.
    PDF
    Abstract:
    BibTex:
    @mastersthesis{Nagy2007,
      author = {Nagy, Csaba},
      title = {Extension of GCC with a fully manageable reverse engineering front end},
      school = {University of Szeged},
      year = {2007}
    }

Miscellaneous

  • Parsing and Analyzing SQL Queries in Stack Overflow Questions
    Csaba Nagy and Anthony Cleve
    In Preproceedings of the Eighth Seminar Series on Advanced Techniques & Tools for Software Evolution (SATToSE 2015). Mons, Belgium, jul 6-8, 2015.
    PDF
    Abstract:
    The rapid growth and increasing popularity of Stack Overflow made it a large knowledge base of several programming topics which also attracts researchers. To mention a few examples, they study actual trends that developers follow, design questions of Q&A systems, island parsing techniques to analyze posts, recommendation systems, and try to model the quality of the posts. In our paper, we introduce an approach to parse and analyze SQL queries in Stack Overflow questions with the main goal to identify common patterns among them. Such similar structures in SQL statements can point to problematic language constructs (e.g. antipatterns) in SQL statements which should be avoided by developers.
    BibTex:
    @inproceedings{Nagy2015_misc,
      author = {Nagy, Csaba and Cleve, Anthony},
      title = {Parsing and Analyzing SQL Queries in Stack Overflow Questions},
      booktitle = {Preproceedings of the Eighth Seminar Series on Advanced Techniques \& Tools for Software Evolution (SATToSE 2015)},
      year = {2015}
    }
  • Adat-intenzív szoftverrendszerek
    Csaba Nagy
    In SZTE Talent Press Az SZTE Tehetségpont disszeminációs és tudomány-népszerűsítő magazinja. VI, University of Szeged, 2014.
    PDF
    Abstract:
    BibTex:
    @incollection{Nagy2014a_misc,
      author = {Nagy, Csaba},
      title = {Adat-intenzív szoftverrendszerek},
      booktitle = {SZTE Talent Press Az SZTE Tehetségpont disszeminációs és tudomány-népszerűsítő magazinja},
      publisher = {University of Szeged},
      year = {2014},
      number = {VI}
    }
  • A Static Concept Location Technique for Data-Intensive Systems: "Where Was This SQL Query Executed?"
    Csaba Nagy and Anthony Cleve
    In Proceedings of the Software Evolution in Belgium and the Netherlands seminar (BENEVOL 2014). Amsterdam, Netherlands, Centrum Wiskunde & Informatica (CWI), nov 27-28, 2014.
    PDF
    Abstract:
    An evolving software system is incrementally modified, changed by its developers during the development and maintenance phases. Before the developers start working on a change they need to identify which parts of the source code implement the feature, and should be touched first during the change. In practice, what they do is a concept location task (also known as feature identification/location) which is 'the process that identifies where a software system implements a specific concept'. There are many existing approaches to support developers in concept location tasks starting from simple pattern matching (so-called `grep' techniques) to more sophisticated methods like IR-based techniques or dependency analyses. However, none of the existing approaches consider when there is a database in the architecture, which adds further source artifacts or dependencies. Here, we investigate a concept location approach for data-intensive systems, as applications with at least one database server in their architecture which is intensively used by its clients. Specifically, we introduce a static technique to identify the location(s) in the source code where a given SQL query was potentially sent to the database server.
    BibTex:
    @inproceedings{Nagy2014_misc,
      author = {Nagy, Csaba and Cleve, Anthony},
      title = {A Static Concept Location Technique for Data-Intensive Systems: "Where Was This SQL Query Executed?"},
      booktitle = {Proceedings of the Software Evolution in Belgium and the Netherlands seminar (BENEVOL 2014)},
      publisher = {Centrum Wiskunde \& Informatica (CWI)},
      year = {2014}
    }
  • Static Security Analysis Based on Input Related Software Faults
    Csaba Nagy
    In Proceedings of the Hungarian-American Scholarship Fund (HAESF) Five Year Anniversary Conference. Hungarian Academy of Sciences, Budapest, Hungary, pp. 12, HAESF, CIEE, sep 18, 2009.
    BibTex:
    @conference{Nagy2009b_misc,
      author = {Nagy, Csaba},
      title = {Static Security Analysis Based on Input Related Software Faults},
      booktitle = {Proceedings of the Hungarian-American Scholarship Fund (HAESF) Five Year Anniversary Conference},
      publisher = {HAESF, CIEE},
      year = {2009},
      pages = {12}
    }

ACTIVITIES

  • Program Committee Member of the 37th International Conference on Software Maintenance and Evolution (ICSME 2021), NIER Track, Luxembourg City, September 27 - October 1, 2021
  • Program Committee Member of the 9th IEEE Working Conference on Software Visualization (VISSOFT 2021), Luxembourg City, Luxembourg, September 27-28, 2021
  • Program Committee Member of the 2021 Mining Software Repositories Conference (MSR 2021) Registered Report (RR) Track, Virtual Event, May 17–19, 2021
  • Program Committee Member of the 35th IEEE/ACM International Conference on Automated Software Engineering (ASE 2020) Tool Demonstration Track, Melbourne, Australia, September 21-25, 2020
  • Program Committee Member of the 28th International Conference on Program Comprehension 2020 (ICPC 2020) Research Track, Seoul, South Korea, May 23-24, 2020
  • Program Committee Member of the 27th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER 2020) Tool Demo Track, London, Ontario, February 18-21, 2020
  • Program Committee Member of the International Conference on Technical Debt 2020 (TechDebt 2020) Engineering Track, Seoul, South Korea, May 25-26, 2020
  • Program Committee Member of the 35th International Conference on Software Maintenance and Evolution (ICSME 2019) Artifacts Track, Cleveland, OH, USA, September 30-October 4, 2019
  • Organizing Committee Member of the 2nd International Summer School on Software Engineering (SIESTA 2019), September 3-6, 2019, Termoli, Italy
  • Program Committee Member of the 19th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2019) Engineering Track, Cleveland, OH, USA, September 30-October 1, 2019
  • Program Committee Member of the IEEE 13th International Conference on Research Challenges in Information Science (RCIS 2019) Engineering Track, Brussels, Belgium, 29-31 May 2019
  • Program Committee Member of the 18th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2018) Engineering Track, Madrid, Spain, September 23-24, 2018
  • Organizing Committee Member of the 1st International Summer School on Software Engineering (SIESTA 2018), September 10-12, 2018, Lugano, Switzerland
  • Program Committee Member of the 16th BElgian-NEtherlands software eVOLution symposium (BENEVOL 2017), Antwerp, Belgium, December 4-5, 2017
  • Program Committee Member of the 33rd IEEE International Conference on Software Maintenance and Evolution (ICSME 2017) Industry Track, Shanghai, China, September 17-24, 2017
  • Program Committee Member of the 17th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2017) Engineering Track, Shanghai, China, September 17-18, 2017
  • Program Committee Member of the 25th IEEE International Conference on Program Comprehension (ICPC 2017) Tool Demo Track, Buenos Aires, Argentina, May 22-23, 2017
  • Program Committee Member of the 16th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2016) Engineering Track, Raleigh, North Carolina, USA, October 2-3, 2016
  • Program Committee Member of the 32nd IEEE International Conference on Software Maintenance and Evolution (ICSME 2016) Industry Track, Raleigh, North Carolina, USA, October 2-10, 2016
  • Program Committee Member of the 15th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2015) Tool Demo Track, Bremen, Germany, September 27-28, 2015
  • Program Committee Member of the 8th Seminar Series on Advanced Techniques & Tools for Software Evolution (SATToSE 2015), Mons, Belgium, July 6-8, 2015
  • Web & Publication co-Chair of the 16th European Conference on Software Maintenance and Reengineering (CSMR 2012), Szeged, Hungary, March 27-30, 2012
  • Social Chair of the 8th joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE 2011), Szeged, Hungary, September 5-9, 2011

© 2018-2021 Csaba Nagy, All rights reserved.