Towards a Benchmark for Scientific Understanding in Humans and Machines.
Kristian Gonzalez Barman, Sascha Caron, Tom Claassen & Henk de Regt - 2024 - Minds and Machines 34 (1):1-16.
Scientific understanding is a fundamental goal of science. However, there is currently no good way to measure the scientific understanding of agents, whether these be humans or Artificial Intelligence systems. Without a clear benchmark, it is challenging to evaluate and compare different levels of scientific understanding. In this paper, we propose a framework to create a benchmark for scientific understanding, utilizing tools from philosophy of science. We adopt a behavioral conception of understanding, according to which genuine understanding should be recognized as an ability to perform certain tasks. We extend this notion of scientific understanding by considering a set of questions that gauge different levels of scientific understanding, covering information retrieval, the capability to arrange information to produce an explanation, and the ability to infer how things would be different under different circumstances. We suggest building a Scientific Understanding Benchmark (SUB), formed by a set of these tests, allowing for the evaluation and comparison of scientific understanding. Benchmarking plays a crucial role in establishing trust, ensuring quality control, and providing a basis for performance evaluation. By aligning machine and human scientific understanding we can improve their utility, ultimately advancing scientific understanding and helping to discover new insights within machines.
Distinctively generic explanations of physical facts.
Erik Weber, Kristian González Barman & Thijs De Coninck - 2024 - Synthese 203 (4):1-30.
We argue that two well-known examples (strawberry distribution and Königsberg bridges) generally considered genuine cases of distinctively _mathematical_ explanation can also be understood as cases of distinctively _generic_ explanation. The latter answer resemblance questions (e.g., why did neither person A nor B manage to cross all bridges) by appealing to ‘generic task laws’ instead of mathematical necessity (as is done in distinctively mathematical explanations). We submit that distinctively generic explanations derive their explanatory force from their role in ontological unification. Additionally, we argue that distinctively generic explanations are better seen as standardly mathematical instead of distinctively mathematical. Finally, we compare and contrast our proposal with the work of Christopher Pincock on abstract explanations in science and the views of Michael Strevens on abstract causal event explanations.
Procedure for assessing the quality of explanations in failure analysis.
Kristian Gonzalez Barman - 2022 - Artificial Intelligence for Engineering Design, Analysis and Manufacturing 36.
This paper outlines a procedure for assessing the quality of failure explanations in engineering failure analysis. The procedure structures the information contained in explanations so as to reveal weak points, enable comparison of competing explanations, and support redesign recommendations. These features make the procedure a useful asset for critical reflection on some areas of the engineering practice of failure analysis and redesign. The procedure structures relevant information contained in an explanation by means of structural equations, making the relations between key elements more salient. Once structured, the information is examined on its potential to track counterfactual dependencies by offering answers to relevant what-if-things-had-been-different questions. This criterion for explanatory goodness derives from the philosophy of science literature on scientific explanation. The procedure is illustrated by applying it to two case studies, one on failure analysis in mechanical engineering (a broken vehicle shaft) and one on failure analysis in civil engineering (a collapse in a convention center). The procedure offers failure analysts a practical tool for critical reflection on some areas of their practice while offering a deeper understanding of the workings of failure analysis (framing it as an explanatory practice). It therefore allows practitioners to improve certain aspects of the explanatory practices of failure analysis and redesign, and it also offers a theoretical perspective that can clarify important features of these practices. Given the programmatic nature of the procedure and its object (assessing and refining explanations), it extends work in the domain of computational argumentation.
IBE in engineering science - the case of malfunction explanation.
Kristian González Barman & Dingmar van Eck - 2021 - European Journal for Philosophy of Science 11 (1):1-19.
In this paper we investigate how inference to the best explanation (IBE) works in engineering science, focussing on the context of malfunction explanation. While IBE has received a lot of attention in the philosophy of science literature, little, if any, philosophical work has focussed on IBE in engineering science practice. We first show that IBE in engineering science has a similar structure to IBE in other scientific domains, in the sense that in both settings IBE hinges on the weighing of explanatory virtues. We then proceed to show that, due to the intimate connection between explanation and redesign in engineering science, there is a further engineering domain-specific virtue in terms of which engineering malfunction explanations are evaluated, viz. the virtue of redesign utility. This virtue entails that the explanatory information offered by a malfunction explanation should be instrumental in predicting counterfactual dependencies in redesigned systems. We illustrate and elaborate these points in terms of a number of engineering examples, focussing in particular on the 2009 crash of Air France Flight 447. Our extension of analyses of IBE and explanation to engineering science practice offers new insights by identifying a new explanatory virtue in malfunction explanation: redesign utility.
Beyond transparency and explainability: on the need for adequate and contextualized user guidelines for LLM use.
Kristian González Barman, Nathan Wood & Pawel Pawlowski - 2024 - Ethics and Information Technology 26 (3):1-12.
Large language models (LLMs) such as ChatGPT present immense opportunities, but without proper training for users (and potentially oversight), they carry risks of misuse as well. We argue that current approaches focusing predominantly on transparency and explainability fall short in addressing the diverse needs and concerns of various user groups. We highlight the limitations of existing methodologies and propose a framework anchored on user-centric guidelines. In particular, we argue that LLM users should be given guidelines on what tasks LLMs can do well and which they cannot, which tasks require further guidance or refinement by the user, and context-specific heuristics. We further argue that (some) users should be taught to refine and elaborate adequate prompts, be provided with good procedures for prompt iteration, and be taught efficient ways to verify outputs. We suggest that for users, shifting away from looking at the technology itself, toward its usage within contextualized sociotechnical systems, can help solve many issues related to LLMs. We further emphasize the role of real-world case studies in shaping these guidelines, ensuring they are grounded in practical, applicable strategies. Like any technology, risks of misuse can be managed through education, regulation, and responsible development.
Fortifying Trust: Can Computational Reliabilism Overcome Adversarial Attacks?
Pawel Pawlowski & Kristian González Barman - 2025 - Philosophy and Technology 38 (1):1-19.
Computational Reliabilism (CR) has emerged as a promising framework for assessing the trustworthiness of AI systems, particularly in domains where complete transparency is infeasible. However, the rise of sophisticated adversarial attacks poses a significant challenge to CR’s key reliability indicators. This paper critically examines the robustness of CR in the face of evolving adversarial threats, revealing the limitations of verification and validation methods, robustness analysis, implementation history, and expert knowledge when confronted with malicious actors. Our analysis suggests that CR, in its current form, is inadequate to address the dynamic nature of adversarial attacks. We argue that while CR’s core principles remain valuable, the framework must be extended to incorporate adversarial resilience, adaptive reliability criteria, and context-specific reliability thresholds. By embracing these modifications, CR can evolve to provide a more comprehensive and resilient approach to assessing AI reliability in an increasingly adversarial landscape.
Exploring the epistemic and ontic conceptions of Models and Idealizations in Science.
Kristian Gonzalez Barman - 2023 - Zagadnienia Filozoficzne W Nauce 74:295-301.
Book review: Alejandro Cassini & Juan Redmond (eds.), _Models and Idealizations in Science: Artifactual and Fictional Approaches_, Springer International Publishing, Cham 2021, pp. xv+270.
Fictional mechanism explanations: clarifying explanatory holes in engineering science.
Kristian González Barman - 2022 - European Journal for Philosophy of Science 12 (2):1-19.
This paper discusses a class of mechanistic explanations employed in engineering science where the activities and organization of nonstandard entities are cited as core factors responsible for failures. Given the use of mechanistic language by engineers and the manifestly mechanistic structure of these explanations, I consider several interpretations of these explanations within the new mechanical framework. I argue that these interpretations fail to solve several philosophical problems and propose an account of fictional mechanism explanations instead. According to this account, fictional mechanism explanations provide descriptions of fictional mechanisms that enable the tracking of counterfactual dependencies of the physical system they model by capturing system constraints. Engineers use these models to learn about and understand properties of materials, to build computational simulations of their behaviour, and to design new materials.
Quantum mechanical atom models, legitimate explanations and mechanisms.
Erik Weber, Merel Lefevere & Kristian Gonzalez Barman - 2021 - Foundations of Chemistry 23 (3):407-429.
The periodic table is one of the best-known systems of classification in science. Because of the information it contains, it raises explanation-seeking questions. Quantum mechanical models of the behaviour of electrons may be seen as providing explanations in response to these questions. In this paper we first address the question ‘Do quantum mechanical models of atoms provide legitimate explanations?’ Because our answer is positive, our next question is ‘Are the explanations provided by quantum mechanical models of atoms mechanistic explanations?’ This question is motivated by the fact that in many scientific disciplines, mechanistic explanations are abundant. Because our answer to the second question is negative, our last question is ‘What kind of explanation do quantum mechanical models of atoms provide?’ By addressing these questions, we shed light on the nature of an important type of chemical explanation.