Date(s) - 22/11/2019
8:30 am - 5:00 pm
The topic of the CHOOSE forum 2019 is Software Testing.
To seed discussions, we sampled the space with five high-profile talks from both academia and industry. Confirmed speakers for this year are: Maurício F. Aniche (Delft University of Technology, The Netherlands), Paola Bisogno (Dove.it, Milan, Italy), Stéphane Ducasse (Inria Lille, France), Milos Gligoric (University of Texas, Austin, USA), Paolo Tonella (Università della Svizzera Italiana, Lugano, Switzerland).
|08:30 – 08:45||Registration|
|08:45 – 09:00||Welcome and introduction|
|09:00 – 09:45||“Testing in Production”, Maurício F. Aniche, Delft University of Technology, The Netherlands|
|09:45 – 10:00||Short Break|
|10:00 – 10:45||“Rotten Green Tests”, Stéphane Ducasse, Inria Lille, France|
|10:45 – 11:15||Coffee Break|
|11:15 – 12:00||“System-supported Test Acceleration”, Milos Gligoric, University of Texas, Austin, USA|
|12:00 – 12:20||CHOOSE General Assembly|
|12:20 – 13:30||Lunch|
|13:30 – 14:15||“How to test a system based on deep learning”, Paolo Tonella, Università della Svizzera Italiana, Lugano, Switzerland|
|14:15 – 14:30||Short Break|
|14:30 – 15:15||“From good to great: the importance of A/B testing in digital product environment”, Paola Bisogno, Dove.it, Milan, Italy|
|15:15 – 15:45||Coffee Break|
|15:45 – 16:30||Panel|
|16:30 – 17:15||Closing|
Maurício F. Aniche
Testing in production
“Testing in production” used to be a joke among developers. However, given the complexity of the large and distributed systems that take care of important parts of our lives, “testing in development”, or, in other words, prevention, might not be enough anymore. In this talk, I’ll discuss the importance of systems monitoring, logging, and log analysis to modern software systems. I’ll reflect on the current state-of-the-art in industry and research fields, as well as the current open challenges. A great part of this talk is based on the research we conducted at Adyen, a large-scale payment company that serves companies such as Facebook, Uber, and Spotify.
Maurício is an Assistant Professor in Software Engineering at Delft University of Technology, The Netherlands. Maurício’s line of work focuses on how to make developers more productive during maintenance and testing. His research has been published in top-tier conferences (ICSE, FSE, ASE) and journals (TSE, EMSE). Maurício has always had a foot in industry. During his MSc, Maurício co-founded Alura, the biggest e-learning platform for software engineers in Brazil. Because of Alura, Maurício gave training and consultancy on software development and testing to 27 different companies from 2010 to 2015. Moreover, he has published three books for practitioners (“OOP and SOLID for ninjas”, “Test-Driven Development in the real world”, and “A Practical Guide on Software Testing”), which, altogether, have sold 10k copies. All these activities have given him a very particular vision of how software engineering and testing should be done in practice. Now fully dedicated to academia, Maurício still (desires and) partners up with companies. In the last two years, Maurício has been working closely with Adyen, a Dutch payment unicorn. His work with the company has been published in prestigious venues such as FSE, as well as ICSE’s and ICSME’s industry tracks.
Paola Bisogno
From good to great: the importance of A/B testing in digital product environment
Digital products have changed a lot since the internet spread in the 1990s, and companies now aim to understand user expectations and needs more than ever.
An A/B test (sometimes called a “split test”) compares two or more variations of the same product to understand which one converts better, based on user behaviour. Such tests are used across different contexts and industries, from digital marketing to UX design, and their purpose can differ depending on the tested element.
Measuring the impact of changes helps companies understand user preferences, identify pain points and instances of “bad UX”, and learn how to maximize their revenues and conversions.
From DOM manipulation to heatmaps, let’s find out together the best techniques, practices and tools to A/B test a design and how to understand outcome data. We will also discuss ethics, performance and design rules, as well as learn how to launch an A/B test that will truly make an impact on user experience and on a company’s business.
Paola is an Italian digital product designer with more than 15 years of experience. She started as a webmaster in the early 2000s, both designing and coding websites in the Web 1.0 era.
After spending some years in digital agencies, she moved to the product industry, focusing mainly on user experience and interface design.
Currently, she works as UX/UI Designer at Dove.it, the most innovative real estate company in Italy, where she joined the tech team on day one to build the product interface and design system from scratch. Her work focuses mostly on usability, design patterns, the engineering design process, heuristic evaluation, information architecture, cognitive design and design workflow optimization. She is a jury member of the international design and development award CSS Design Awards and also a speaker at design events and conferences. She leads, with other designers, the biggest community about UX and UI Design in Italy, organizing events and workshops. She is also involved in mentorship programs with the aim of helping young women start a career in IT and design.
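As a rough illustration of what “which one converts better” can mean in practice, the comparison of two variations can be sketched as a two-proportion z-test on conversion counts. This is only one common way to analyse an A/B test, not necessarily the approach presented in the talk; the function name `compare_variants` and the visitor and conversion numbers are made up for the example.

```python
import math


def compare_variants(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test for an A/B test (illustrative sketch).

    conv_a, conv_b: number of converting visitors in variants A and B.
    n_a, n_b: number of visitors shown variants A and B.
    Returns (rate_a, rate_b, z, p_value) with a two-sided p-value.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("each variant needs at least one visitor")
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled conversion rate under the hypothesis "A and B convert equally".
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("no variation in the data (all or no visitors converted)")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value


# Hypothetical numbers: 1000 visitors per variant, 100 vs. 130 conversions.
rate_a, rate_b, z, p_value = compare_variants(100, 1000, 130, 1000)
print(f"A: {rate_a:.1%}  B: {rate_b:.1%}  z = {z:.2f}  p = {p_value:.3f}")
```

A small p-value suggests that the difference in conversion rate is unlikely to be due to chance alone; it says nothing about why users preferred one variation, which is where qualitative tools such as heatmaps come in.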
Stéphane Ducasse
Rotten Green Tests
Unit tests are a tenet of agile programming methodologies, and are widely used to improve code quality and prevent code regression. A passing (green) test is usually taken as a robust sign that the code under test is valid. However, some green tests contain assertions that are never executed. We call such tests Rotten Green Tests.
Rotten Green Tests represent a case worse than a broken test: they report that the code under test is valid, but in fact do not test that validity. We describe an approach to identify rotten green tests by combining simple static and dynamic call-site analyses. Our approach takes into account test helper methods, inherited helpers, and trait compositions, and has been implemented in a tool called DrTest. DrTest reports no false negatives, yet it still reports some false positives due to conditional use or multiple test contexts. Using DrTest we conducted an empirical evaluation of 19,905 real test cases in mature projects of the Pharo ecosystem. The results of the evaluation show that the tool is effective; it detected 294 tests as rotten: passing tests that contain assertions that are never executed. First experiments on Java show that rotten tests exist there as well.
I’m an Inria Research Director. I lead the RMoD team http://rmod.lille.inria.fr. I’m an expert in language design and reengineering. I worked on traits. Traits have been introduced in Pharo, Perl, PHP and, in a variant form, into Scala, Groovy and Fortress. I’m an expert on software quality, program understanding, program visualisations, reengineering and metamodeling. I’m one of the developers of Moose, an open-source software analysis platform http://www.moosetechnology.org/. I created Synectique, a company building dedicated tools for advanced software analyses. I’m one of the leaders of Pharo http://www.pharo.org/, a dynamic reflective object-oriented language supporting live programming. I built the industrial Pharo consortium http://consortium.pharo.org. I work regularly with companies (Thales, Worldline, Siemens, Berger-Levrault, Arolla,…) on software evolution problems.
I have written a couple hundred articles and several books. According to Google, my h-index is 54, with more than 12800 citations. I like to help people become what they want and build things.
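To make the notion concrete, here is a minimal sketch of a rotten green test written with Python’s `unittest`. It is not DrTest, which targets Pharo and combines static and dynamic call-site analysis; the sketch only counts executed assertions at run time and assumes the test is already known to contain an assertion. The names `CountingTestCase`, `SumTest` and `is_rotten_green` are hypothetical.

```python
import unittest


class CountingTestCase(unittest.TestCase):
    """Test case that counts how many assertEqual calls are actually executed."""

    def setUp(self):
        self.executed_assertions = 0

    def assertEqual(self, first, second, msg=None):
        self.executed_assertions += 1
        super().assertEqual(first, second, msg)


class SumTest(CountingTestCase):
    def test_rotten(self):
        values = []  # faulty fixture: there is nothing to iterate over
        for value in values:
            # This assertion exists in the source but is never executed.
            self.assertEqual(value, 42)

    def test_healthy(self):
        self.assertEqual(sum([40, 2]), 42)


def is_rotten_green(test_case):
    """Run one test; report True if it passes without executing any assertion.

    Only the dynamic half of the idea is shown here. A test that contains
    no assertion at all would also be flagged, which is why a real tool
    also needs a static check for assertion call sites.
    """
    result = unittest.TestResult()
    test_case.run(result)
    return result.wasSuccessful() and test_case.executed_assertions == 0


print(is_rotten_green(SumTest("test_rotten")))
print(is_rotten_green(SumTest("test_healthy")))
```

Both tests are green, but only `test_healthy` actually checks anything; `test_rotten` would keep passing even if the code under test were broken.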
Milos Gligoric
Systems-Supported Test Acceleration
Developers often maintain regression tests that are executed at each commit to check that recent code changes do not break previously working functionalities. Although regression tests are valuable, they are costly to execute even for large companies, such as Google and Microsoft. My interest in the topic was triggered by a painful experience with running regression tests and observing others waste their time and resources. I will talk about various ways to optimize regression testing and describe several techniques that we have developed, including Ekstazi, RTSLinux, and GVM; the first two optimize regression testing by skipping tests that are not impacted by recent code changes, and the last optimizes test execution by utilizing GPUs. Ekstazi is adopted by open-source projects, RTSLinux supports projects written in multiple programming languages, and GVM is the first system that enables executing Java bytecode interpreters entirely on GPUs.
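The general idea of skipping tests that are not impacted by a change can be sketched as follows. This is a simplified illustration of dependency-based regression test selection, not the implementation of Ekstazi or RTSLinux; the dependency map, file names and the `select_tests` helper are assumptions made up for the example.

```python
import hashlib


def checksum(content: bytes) -> str:
    """Content fingerprint used to decide whether a file has changed."""
    return hashlib.sha256(content).hexdigest()


def select_tests(dependencies, old_checksums, new_checksums):
    """Return the tests that depend on at least one changed file.

    dependencies: maps a test name to the set of files it used in a previous run.
    old_checksums / new_checksums: map a file name to its checksum before and
    after the change. A file missing from either map counts as changed.
    """
    selected = []
    for test, files in sorted(dependencies.items()):
        if any(old_checksums.get(f) != new_checksums.get(f) for f in files):
            selected.append(test)
    return selected


# Hypothetical project: two tests, three source files, one file edited.
dependencies = {
    "test_cart": {"cart.py", "price.py"},
    "test_login": {"auth.py"},
}
old = {name: checksum(b"version 1") for name in ("cart.py", "price.py", "auth.py")}
new = dict(old, **{"price.py": checksum(b"version 2")})

print(select_tests(dependencies, old, new))
```

In this sketch only `test_cart` is selected, since `test_login` does not depend on the edited file. The selection is only safe if the recorded dependencies are complete, which is the hard part that real tools have to address.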
Milos Gligoric is an Assistant Professor in Electrical and Computer Engineering at the University of Texas at Austin. His main research interest is in software engineering, especially in designing techniques and tools that improve software quality and developers’ productivity. His work has explored test-input generation, test-quality assessment, testing concurrent code, regression testing, proof engineering, system-supported software engineering, and fusion of software engineering and natural language processing. Three of his papers won ACM SIGSOFT Distinguished Paper awards (ICSE 2010, ISSTA 2015, and ESEC/FSE 2019), and three of his papers were invited for journal submissions. Milos’ research has been supported by Google, Huawei, NSF, Runtime Verification, and Samsung. Milos was awarded an NSF CAREER Award (2016). He holds a PhD (2015) from UIUC, and an MS (2009) and BS (2007) from the University of Belgrade, Serbia.
Paolo Tonella
How to test a system based on deep learning
Deep neural networks show promising performance in safety- and business-critical tasks, such as autonomous driving and financial trading. Hence, their dependability and reliability have become a major concern, which cannot be addressed by resorting to well-established software testing and verification practices. In fact, the root cause of a fault in a deep-learning-based system is quite peculiar and different from traditional software faults. In this talk I will present the research project Precrime that was recently funded by the European Research Council under the ERC Advanced grant program. I will discuss the nature of deep learning faults, presenting a fault taxonomy obtained from multiple sources, such as software repository and forum mining, as well as interviews with developers. Then, I will consider the assessment of the quality of deep learning systems, introducing the notion of frontier of behaviours. Finally, I will describe a technique for misbehaviour prediction that aims at anticipating and preventing failures of such systems.
Paolo Tonella is Full Professor at the Faculty of Informatics and at the Software Institute of Università della Svizzera Italiana (USI) in Lugano, Switzerland. He is Honorary Professor at University College London, UK, and Affiliated Fellow of Fondazione Bruno Kessler, Trento, Italy, where he was Head of Software Engineering until mid 2018. Paolo Tonella holds an ERC Advanced grant as Principal Investigator of the project PRECRIME. Paolo Tonella has written over 150 peer-reviewed conference papers and over 50 journal papers. His H-index (according to Google Scholar) is 55. He is/was on the editorial board of the ACM Transactions on Software Engineering and Methodology, of the IEEE Transactions on Software Engineering, of Empirical Software Engineering, Springer, and of the Journal of Software: Evolution and Process, Wiley. His current research interests are in software testing, in particular approaches to ensure the dependability of machine-learning-based systems, automated testing of web applications, and test oracle inference and improvement.