At their most basic, logic is the study of consequence, and information is a commodity. Given this, the interrelationship between logic and information will centre on the informational consequences of logical actions or operations conceived broadly. The explicit inclusion of the notion of information as an object of logical study is a recent development. It was by the beginning of the present century that a sizable body of existing technical and philosophical work (with precursors that can be traced back to the 1930s) coalesced into the new emerging field of logic and information (see Dunn 2001). This entry is organised thematically, rather than chronologically. We survey major logical approaches to the study of information, as well as informational understandings of logics themselves. We proceed via three interrelated and complementary stances: information-as-range, information-as-correlation, and information-as-code.
The core intuition motivating the information-as-range stance is that an informational state may be characterised by the range of possibilities or configurations that are compatible with the information available at that state. Acquiring new information corresponds to a reduction of that range, thus reducing uncertainty about the actual configuration of affairs. With this understanding, the setting of possible-world semantics for epistemic modal logics proves to be rewarding for the study of various semantic aspects of information. A prominent phenomenon here is information update, which may occur in both individual and social settings, due to the interaction between agents and their environment via different types of epistemic actions. We will see that an epistemic action is any action that facilitates the flow of information, hence we will return to epistemic actions themselves throughout.
The information-as-correlation stance focuses on information flow as it is licensed within structured systems formed by systematically correlated components. For example: the number of rings of a tree trunk can give you information about the time when the tree was born, in virtue of certain regularities of nature that ‘connect’ the past and present of trees. Central themes of this stance include the aboutness, situatedness, and accessibility of information in structured information environments.
The key concern of the third stance, information-as-code, is the syntax-like structure of information pieces (their encoding) and the inference and computation processes that are licensed by virtue (among other things) of that structure. A most natural logical setting to study these informational aspects is the algebraic proof theory underpinned by a range of substructural logics. Substructural logics have always been a natural home for informational analysis, and the recent developments in the area enrich the information-as-code stance.
The three stances are by no means incompatible, but neither are they necessarily reducible to each other. This will be expanded on later in the entry, and some further topics of research will be illustrated, but for a preview of how the three stances can live together, take the case of a structured information system composed of several parts. Firstly, the correlations between the parts naturally allow for ‘information flow’ in the sense of the information-as-correlation stance. Secondly, they also give rise to local ranges of possibilities, since the local information available at one part will be compatible with a certain range of global states of the system. Thirdly, the combinatorial, syntax-like, proof-theoretical aspects of information can be brought to this setting in various ways. One of them is treating the correlational flow of information as a sort of combinatorial system by which local information states are combined in syntax-like ways, fitting a particular interpretation of substructural logic. One could also add code-like structure to the modelling explicitly, for example by assigning local deductive calculi to either the components or local states of the system. We begin, however, with information as range.
The understanding of information as range has its origins in Bar-Hillel and Carnap’s theory of semantic information, Bar-Hillel and Carnap (1952).[1] It is here that the inverse range principle is given its first articulation with regard to the informational content of a proposition. The inverse range principle states that there is an inverse relationship between the information contained by a proposition on the one hand, and the likelihood of that proposition being true on the other. That is, the more information carried by a proposition, the less likely it is that the proposition is true. Similarly, the more likely the truth of a proposition, the less information it carries.
The likelihood of the truth of a proposition connects with information as range via a possible worlds semantics. For any contingent proposition, it will be supported by some possibilities (those where it is true) and not supported by others (those where it is false). Hence a proposition will be supported by a range of possibilities, an “information range”. Now suppose that there is a probability distribution across the space of possibilities, and for the sake of simplicity suppose that the distribution is uniform. In this case, the more worlds that support a proposition, the likelier the proposition’s truth, and, via the inverse relationship principle, the less information it carries. Although information as range has its origins in quantitative information theory, its role in contemporary qualitative logics of information cannot be overstated.
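The uniform-distribution reasoning above can be made concrete in a few lines of code. The worlds, the propositions, and the simple content measure below are illustrative assumptions invented for the sketch, not part of Bar-Hillel and Carnap's official apparatus; the point is only that a wider supporting range means a likelier proposition and hence a lower information measure.

```python
# Four equiprobable worlds; a proposition is identified with the set of
# worlds that support it. (Illustrative sketch under a uniform distribution.)
worlds = ["w1", "w2", "w3", "w4"]
supports = {
    "it_rains": {"w1", "w2"},                      # a contingent proposition
    "it_rains_or_not": {"w1", "w2", "w3", "w4"},   # a tautology
}

def likelihood(prop):
    """Truth-likelihood: fraction of (equiprobable) worlds supporting prop."""
    return len(supports[prop]) / len(worlds)

def cont(prop):
    """Inverse range principle: information measure = 1 - truth-likelihood."""
    return 1 - likelihood(prop)

print(cont("it_rains"))         # 0.5 -- narrower range, more information
print(cont("it_rains_or_not"))  # 0.0 -- the tautology carries no information
```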
Consider the following example due to Johan van Benthem (2011). Awaiter in a cafe receives an order for your table—an espressoand a soda. When the waiter arrives at your table, he asks “Forwhom is the soda?”. After your telling him that the soda is foryou and his giving you your soda, the waiter does not need to askabout the espresso, he can just give it to your cafe-partner. This isbecause the information gained by the waiter from your telling himthat you ordered the soda allows him to eliminate certain openpossibilities from the total range of possibilities such that only oneis left—your friend ordered the espresso.
The waiter case brings several facts about logic and information to the fore. For one, language is often used to refine informational options in the very way explained in the paragraph above. More subtly however, and perhaps even prior to this, language is used to exchange information, and we sometimes bring with us many scenarios—specified informationally. These scenarios might be neither known nor believed, but merely entertained—those about which we wonder. Recent work on inquisitive semantics (Ciardelli et al. 2018) provides a logic of such information exchange based on informational specifications of such wonderings.
Logics of information regularly distinguish between hard information and soft information. The terminology is a slight misnomer, as this distinction is not one between different types of information per se. Rather it is one between different types of information storage. Hard information is factive, and unrevisable. Hard information is often taken to correspond to knowledge. In contrast to hard information, soft information is not necessarily factive, hence revisable in the presence of new information. Soft information, in virtue of its revisability, corresponds very closely to belief. The terms knowledge and belief are conventional, but in the context of information flow, the hard/soft information reading is convenient on account of its bringing the informational phenomena to the foreground. At the very least the terminology is increasingly popular, so being clear on the distinction being one between types of information storage as opposed to types of information is important. Although both hard and soft information are important for our epistemic and doxastic success, in this section we will concentrate mainly on logics of hard information flow.[2]
In section 1.1 we will see how it is that classic epistemic logics exemplify the flow of hard information within the information-as-range framework. In section 1.2 we will extend our exposition from logics of hard information-gain to logics of the actions that facilitate the gain of such hard information, dynamic epistemic logics. At the end of section 1.2, we will expound the important phenomenon of private information, before examining how it is that information as range is captured in various quantitative frameworks.
In this section we will explore how it is that the elimination of possibilities corresponding to information-gain is the starting point for research on logics of knowledge and belief that fall under the heading of epistemic logics. We will begin with classic single-agent epistemic logic, before exploring multi-agent epistemic logics. In both cases, since we will be concentrating on logics of knowledge as opposed to logics of belief, the information gained will be hard information.
Consider the waiter example in more detail. Before receiving the hard information that the soda is for you (and for the sake of the example we are assuming that the waiter is dealing with hard information here), the waiter’s knowledge-base is modelled by a pair of worlds (hereafter information states) \(x\) and \(y\) such that in \(x\) you ordered the soda and your friend the espresso, and in \(y\) you ordered the espresso and your friend the soda. After receiving the hard information that the soda is for you, \(y\) is eliminated from the waiter’s knowledge-base, leaving only \(x\). As such, the reduction of the range of possibilities corresponds to an information-gain for the waiter. Consider the truth condition for agent \(\alpha\) knows that \(\phi\), written \(K_{\alpha}\phi\):
\[\tag{1} x \Vdash K_{\alpha}\phi \text{ iff for all } y \text{ s.t. (such that) } R_{\alpha}xy, y \Vdash \phi \] The accessibility relation \(R_{\alpha}\) is an equivalence relation connecting \(x\) to all information states \(y\) such that \(y\) is indistinguishable from \(x\), given \(\alpha\)’s hard information at that state \(x\). That is, given what the waiter knows when he is in that state. So, if \(x\) was the waiter’s information state before being informed that you ordered the soda, \(y\) would have included the information that you ordered the espresso, as each option was as good as the other until the waiter was informed otherwise. There is an implicit assumption at work here—that some state \(z\) say, where you ordered both the soda and the espresso, is not in the waiter’s information-range. That is, the waiter knows that \(z\) is not a possibility. Once informed however, the information states supporting your ordering the espresso are eliminated from the range of information corresponding to the waiter’s knowledge.
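Truth condition (1) can be sketched computationally. The following minimal model of the waiter's two information states is an illustrative assumption (the state names and atomic facts are invented for the example); it simply checks that \(K_{\alpha}\phi\) holds at a state iff \(\phi\) holds at every accessible state.

```python
# Waiter's epistemic model: two indistinguishable states x and y.
R_waiter = {  # equivalence relation over information states
    "x": {"x", "y"},
    "y": {"x", "y"},
}
val = {  # atomic facts true at each state (illustrative)
    "x": {"you_ordered_soda"},
    "y": {"friend_ordered_soda"},
}

def knows(R, state, atom):
    """Truth condition (1): K(atom) at `state` iff atom holds at all
    R-accessible states."""
    return all(atom in val[s] for s in R[state])

print(knows(R_waiter, "x", "you_ordered_soda"))  # False: y is still open

# After you announce that the soda is yours, y is eliminated:
R_after = {"x": {"x"}}
print(knows(R_after, "x", "you_ordered_soda"))   # True
```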
Basic modal logic extends propositional formulas with modal operatorssuch as \(K_{\alpha}\). If \(\mathbf{K}\) is the set of all Kripkemodels then we have the following:
\[\begin{align} \tag{A1} &\mathbf{K} \Vdash K_{\alpha}\phi \wedge K_{\alpha}(\phi \rightarrow \psi) \rightarrow K_{\alpha}\psi \\ \tag{A2} & \mathbf{K} \Vdash \phi \Rightarrow \mathbf{K} \Vdash K_{\alpha}\phi \end{align}\] In hard information terms, (A1) states that hard information is closed under (known) implications. Since the first conjunct states that all states accessible by \(\alpha\) are \(\phi\) states, \(\alpha\) possesses the hard information that \(\phi\), and since the second conjunct states that all such states are \(\phi \rightarrow \psi\) states, \(\alpha\) also possesses the hard information that \(\psi\). (A2) states that if \(\phi\) holds in the set of all models, then \(\alpha\) possesses the hard information that \(\phi\). In other words, (A2) states that all tautologies are known/hard-stored by the agent, and (A1) states that \(\alpha\) knows the logical consequences of all propositions that \(\alpha\) knows (be they tautologies or otherwise). That is, the axioms state that the agent is logically omniscient, or an ideal reasoner, a property of agents that we will return to in detail in the sections below.[3]
The framework explored so far concerns single-agent epistemic logic, but reasoning and information flow are very often multi-agent affairs. Consider again the waiter example. Importantly, the waiter is only able to execute the relevant reasoning procedure corresponding to a restriction of the range of information states on account of your announcement to him with regard to the soda. That is, it is the verbal interaction between several agents that facilitates the information flow that enabled the logical reasoning to be undertaken.
It is at this point that multi-agent epistemic logic raises new questions regarding the information in a group. “Everybody in \(G\) possesses the hard information that \(\phi\)” (where \(G\) is any group of agents from a finite set of agents \(G^*\)) is written as \(E_G\phi\). \(E_G\) is defined for each \(G \subseteq G^*\) in the following manner:
\[\tag{2} E_G\phi = \bigwedge_{\alpha \in G} K_{\alpha}\phi \] Group knowledge is importantly different from common knowledge (Lewis 1969; Fagin et al. 1995). Common knowledge is the condition of the group where everybody knows that everybody knows that everybody knows … that \(\phi\). In other words, common knowledge concerns the hard information that each agent in the group possesses about the hard information possessed by the other members of the group. That everybody in \(G\) possesses the hard information that \(\phi\) does not imply that \(\phi\) is common knowledge. With group knowledge each agent in the group may possess the same hard information (hence achieving group knowledge) without necessarily possessing hard information about the hard information possessed by the other agents in the group. As noted by van Ditmarsch, van der Hoek, and Kooi (2008: 30), “the number of iterations of the \(E\)-operator makes a real difference in practice”. \(C_G\phi\)—the common knowledge that \(\phi\) for members of \(G\)—is defined as follows:
\[\tag{3} C_G\phi = \bigwedge_{n=0}^{\infty} E^n_G \phi \] To appreciate the difference between \(E\) and \(C\), consider the following “spy example” (originally Barwise 1988, with the envelope details due to Johan van Benthem).
There is a group of competing spies at a formal dinner. All of them are tasked with the mission of acquiring some secret information from inside the restaurant. Furthermore, it is common knowledge amongst them that they want the information. Given this much, compare the following: (i) each spy is privately slipped an envelope containing the secret information, so that every spy comes to possess it; (ii) the secret information is announced aloud to the whole restaurant.
Very obviously, the two scenarios will elicit very different types ofbehaviour from the spies. The first would be relatively subtle, thelatter dramatically less so. See Vanderschraaf and Sillari (2009) forfurther details.
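The difference that the spy example dramatises can also be computed directly from definitions (2) and (3). The worlds, the agents' equivalence classes, and the atom \(p\) below are invented for the sketch; since the accessibility relations are reflexive (as in S5), repeatedly applying the \(E_G\) operator shrinks the extension of \(\phi\), and the fixpoint reached is the set of worlds where \(C_G\phi\) holds.

```python
# A small two-agent S5 model (illustrative names and valuation).
worlds = {"w1", "w2", "w3"}
R = {  # accessibility per agent: equivalence classes as successor sets
    "ann": {"w1": {"w1"}, "w2": {"w2", "w3"}, "w3": {"w2", "w3"}},
    "bob": {"w1": {"w1"}, "w2": {"w2"}, "w3": {"w3"}},
}
val = {"w1": {"p"}, "w2": {"p"}, "w3": set()}

def E(states):
    """E_G (definition (2)), extension-wise: worlds at which every agent's
    accessible states all lie inside `states`."""
    return {w for w in worlds if all(R[a][w] <= states for a in R)}

# Iterate E_G to a fixpoint: with reflexive relations this computes the
# extension of C_G p, i.e. the intersection in definition (3).
ext = {w for w in worlds if "p" in val[w]}  # worlds where p holds: {w1, w2}
while E(ext) != ext:
    ext = E(ext)
print(ext)  # {'w1'}: p is common knowledge only at w1
```

Note that \(p\) holds at both `w1` and `w2`, yet the fixpoint keeps only `w1`: at `w2`, `ann` cannot rule out `w3`, so already the first iteration of \(E_G\) fails there.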
A still more fine-grained use of S5-based epistemic logics is that of Zhou (2016). Zhou demonstrates that S5-based epistemic logic may be used to model the epistemic states of the agent from the perspective of the agent themselves. Hence Zhou refers to such an epistemic logic as internally epistemic. Zhou then uses a multi-valued logic to model the relationship between the agent’s internal knowledge base and their external informational environment. In his (2019), van Benthem argues for an understanding of modal logics in general (both epistemic and otherwise) as arising from an explicit approach to increasing a logic’s conceptual nuance, in the sense that they are explicit extensions of classical logic. They wear their new conceptual architecture on their sleeves, so to speak. This is in contrast to those logics to which van Benthem refers as resulting from an implicit approach. This implicit approach involves a reinterpretation of the meaning of logical vocabulary, as is the case with intuitionistic logic and relevant logic as conceived of traditionally. van Benthem’s method of translating between equivalent (in a sense) implicit and explicit approaches has as an instance the translation between Kit Fine’s (2017) hyperintensional truth-maker semantics and informationalised modal logic. This is a promising foray into such translations between a range of information logics such as those addressed in this entry.
See the full entry on Dynamic Epistemic Logic. As noted above, the waiter example from the beginning of this section is as much about information-gain via announcements, epistemic actions, as it is about information structures. In this section, we will outline how it is that the expressive power of multi-agent epistemic logic can be extended so as to capture epistemic actions.
Hard information flow, that is, the flow of information between the knowledge states of two or more agents, can be facilitated by more than one epistemic action. Two canonical examples are announcements and observations. When “announcement” is restricted to true and public announcement, its result on the receiving agent’s knowledge-base is similar to that of an observation (on the assumption that the agent believes the content of the announcement). The public announcement that \(\phi\) will restrict the model of the agent’s knowledge-base to the information states where \(\phi\) is true, hence “announce \(\phi\)” is an epistemic state transformer in the sense that it transforms the epistemic states of the agents in the group (see van Ditmarsch, van der Hoek, and Kooi 2008: 74).[4]
Dynamic epistemic logics extend the language of non-dynamic epistemic logics with dynamic operators. In particular, public announcement logic (PAL) extends the language of epistemic logics with the dynamic announcement operator \([\phi]\), where \([\phi]\psi\) is read “after announcement \(\phi\), it is the case that \(\psi\)”. The key reduction axioms of PAL are as follows:
\[\begin{alignat}{2} \tag{RA1} &[\phi]p &\text{ iff } &\phi \rightarrow p \text{ (where \(p\) is atomic)} \\ \tag{RA2} &[\phi]\neg \psi &\text{ iff } &\phi \rightarrow \neg[\phi]\psi \\ \tag{RA3} &[\phi](\psi \wedge \chi) &\text{ iff } &[\phi]\psi \wedge [\phi]\chi \\ \tag{RA4} &[\phi][\psi]\chi &\text{ iff } &[\phi \wedge [\phi]\psi]\chi \\ \tag{RA5} &[\phi]K_{\alpha}\psi &\text{ iff } &\phi \rightarrow K_{\alpha}(\phi \rightarrow [\phi]\psi) \end{alignat}\] RA1–RA5 capture the properties of the announcement operator by connecting what is true before the announcement with what is true after the announcement. The axioms are named ‘reduction’ axioms because the left-to-right direction reduces either the number of announcement operators or the complexity of the formulas within their scope. For an in-depth discussion see Pacuit (2011). RA1 states that announcements are truthful. RA5 specifies the epistemic-state-transforming properties of the announcement operator. It states that \(\alpha\) knows that \(\psi\) after the announcement that \(\phi\) iff \(\phi\) implies that \(\alpha\) knows that \(\psi\) will be true after \(\phi\) is announced in all \(\phi\)-states. The “after \(\phi\) is announced” condition is there to account for the fact that \(\psi\) might change its truth-value after the announcement. The interaction between the dynamic announcement operator and the knowledge operator is described completely by RA5 (see van Benthem, van Eijck, and Kooi 2006).
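The model-transforming effect that RA1–RA5 axiomatise can be sketched directly: a public announcement deletes the states falsifying the announced formula and cuts accessibility down accordingly. The sketch below reuses the waiter scenario with invented state names and atoms; it is an illustrative fragment, not a full PAL implementation (announcements here are restricted to atomic formulas).

```python
# Epistemic model before the announcement (illustrative names and valuation).
worlds = {"x", "y"}
R = {"waiter": {"x": {"x", "y"}, "y": {"x", "y"}}}
val = {"x": {"soda_for_you"}, "y": set()}

def announce(worlds, R, val, atom):
    """Public announcement of `atom`: restrict the model to the states
    where it is true, cutting accessibility to the surviving states."""
    kept = {w for w in worlds if atom in val[w]}
    new_R = {a: {w: acc & kept for w, acc in Ra.items() if w in kept}
             for a, Ra in R.items()}
    new_val = {w: val[w] for w in kept}
    return kept, new_R, new_val

def knows(R, val, agent, w, atom):
    """K_agent(atom) at w: atom holds at every accessible state."""
    return all(atom in val[v] for v in R[agent][w])

print(knows(R, val, "waiter", "x", "soda_for_you"))   # False: y still open
w2, R2, v2 = announce(worlds, R, val, "soda_for_you")
print(knows(R2, v2, "waiter", "x", "soda_for_you"))   # True after update
```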
Just as adding the common knowledge operator \(C\) to multi-agent epistemic logic extends the expressive capabilities of multi-agent epistemic logic, adding \(C\) to PAL results in the more expressive public announcement logic with common knowledge (PAC). The exact relationship between public announcements and common knowledge is captured by the announcement and common knowledge rule of the logic PAC, as follows:
\[\tag{4} \text{From } \chi \rightarrow [\phi]\psi \text{ and } (\chi \wedge \phi) \rightarrow E_G\chi, \text{ infer } \chi \rightarrow [\phi]C_G\psi. \] Again, PAC is the dynamic logic of hard information. The epistemic logics dealing with soft information fall within the scope of belief revision theory (van Benthem 2004; Segerberg 1998). Recall that hard and soft information are not distinct types of information per se, rather they are distinct types of information storage. Hard-stored information is unrevisable, whereas soft-stored information is revisable. Variants of PAL that model soft information augment their models with plausibility-orderings on information-states (Baltag and Smets 2008). These orderings are known as preferential models in non-monotonic logic and belief-revision theory. The logics can be made dynamic in virtue of the orderings changing in the face of new information (which is the mark of soft information as opposed to hard information). Such plausibility-orderings may be modelled qualitatively via partial orders etc., or modelled quantitatively via probability-measures. Such quantitative measures provide a connection to a broader family of quantitative approaches to semantic information that we will examine below. Recent work by Allo (2017) ties the soft information of dynamic epistemic logic to non-monotonic logics. This is an intuitive move. Soft information is information that has been stored in a revisable way, hence the revisable nature of conclusions in non-monotonic arguments makes non-monotonic logics a natural fit. On this very topic, see also Chapter 13.7 of van Benthem (2011).
Private information. Private information is an equally important aspect of our social interaction. Consider scenarios where the announcing agent is aware of the private communication whilst other members of the group are not, such as emails in Bcc. Consider also scenarios where the sending agent is not aware of the private communication, such as a surveillance operation. The system of dynamic epistemic logic (DEL) models events that turn on private (and public) information by modelling the agents’ information concerning the events taking place in a given communicative scenario (see Baltag et al. 2008; van Ditmarsch et al. 2008; and Pacuit 2011). For an excellent overview and integration of all of the issues above, see the recent work of van Benthem (2016), where the author discusses multiple interrelated levels of logical dynamics, one level of update, and another of representation. For an extensive collection of papers extending this and related approaches, see Baltag and Smets (2014). Although research into public and private information, most especially with regard to information crossing the threshold from one to the other, has been carried out within the framework of dynamic epistemic logics, recent work explores public and private information and announcements within the framework of multi-valued logics. See Yang et al. (2021).
The modal information theory approach to multi-agent information flow is the subject of a great amount of research. The semantics is not always carried out in relational terms (i.e., with Kripke frames) but is often done algebraically (see Blackburn et al. 2001 for details of the algebraic approach to modal logic). For more details on algebraic as well as type-theoretic approaches, see the subsection on algebraic and other approaches to modal information theory in the supplementary document Abstract Approaches to Information Structure.
Quantitative approaches to information as range also have their origins in the inverse relationship principle. To restate: the motivation is that the less likely the truth of a proposition as expressed in a logical language with respect to a particular domain, the greater the amount of information encoded by the relevant formula. This is in contrast to the information measures in the mathematical theory of communication (Shannon 1953 [1950]), where such measures are obtained via an inverse relationship on the expectation of the receiver \(R\) of the receipt of a signal from some source \(S\).
Another important aspect of the classical theory of information is that it is an entirely static theory—it is concerned with the informational content and measure of particular formulas, and not with information flow in any way at all.
The formal details of classical information theory turn on the probability calculus. These details may be left aside here, as the obvious conceptual point is that logical truths have a truth-likelihood of 1, and therefore an information measure of 0. Bar-Hillel and Carnap did not take this to mean that logical truths, or deductions, were without information yield, only that their theory of semantic information was not designed to capture such a property. They referred to such a property with the term psychological information. See Floridi (2013) for further details.
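For concreteness, the two classical measures can be written as functions of a proposition's truth-likelihood \(m\): a content measure \(cont = 1 - m\) and an additive measure \(inf = \log_2(1/m)\) in bits. The code below is an illustrative sketch with invented numeric likelihoods; the treatment of \(m = 0\) (a contradiction) anticipates the Bar-Hillel-Carnap paradox.

```python
import math

def cont(m):
    """Content measure: 1 minus the proposition's truth-likelihood m."""
    return 1 - m

def inf(m):
    """Additive information measure in bits; infinite when m = 0."""
    return math.log2(1 / m) if m > 0 else math.inf

print(cont(1.0), inf(1.0))  # 0.0 0.0  (logical truth: no information)
print(cont(0.5), inf(0.5))  # 0.5 1.0
print(cont(0.0), inf(0.0))  # 1.0 inf  (contradiction: maximal measure)
```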
A quantitative attempt at specifying the information yield of deductions was undertaken by Jaakko Hintikka with his theory of surface information and depth information (Hintikka 1970, 1973). The theory of surface and depth information extends Bar-Hillel and Carnap’s theory of semantic information from the monadic predicate calculus all the way up to the full polyadic predicate calculus. This itself is a considerable achievement, but although technically astounding, a serious restriction of this approach is that only a fragment of the deductions carried out within full first-order logic yield a non-zero information measure. The rest of the deductions in the full polyadic predicate calculus, as well as all of those in the monadic predicate calculus and propositional calculus, measure 0 (see Sequoiah-Grayson 2008). For recent elaborations upon Hintikka’s distinction between surface and depth information, both formal and philosophical, see Panahy (2023), Hernandez and Quiroz (2022 [Other Internet Resources]), Negro (2022), and Ramos Mendonça (2022).
The obvious inverse situation with the theory of classical semantic information is that logical contradictions, having a truth-likelihood of 0, will deliver a maximal information measure of 1. Referred to in the literature as the Bar-Hillel-Carnap Semantic Paradox, the most developed quantitative approach to addressing it is the theory of strongly semantic information (Floridi 2004). The conceptual motivation behind strongly semantic information is that for a statement to yield information, it must help us to narrow down the set of possible worlds. That is, it must assist us in the search for the actual world, so to speak (Sequoiah-Grayson 2007). Such a contingency requirement on informativeness is violated by both logical truths and logical contradictions, both of which measure 0 on the theory of strongly semantic information. See Floridi (2013) for further details. See also Brady (2016) for recent work on the relationship between quantitative accounts of information and analyticity. For a new approach to connecting quantitative and qualitative measures of information, see Harrison-Trainor et al. (2018).
The correlational take on information looks at how the existence of systematic connections between the parts of a structured information environment permits one part to carry information about another. For example: the pattern of pixels that appear on the screen of a computer gives information (not necessarily complete) about the sequence of keys that were pressed by the person who is typing a document, and even a partial snapshot of the clear starry sky your friend is looking at now will give you information about his possible locations on Earth at this moment. The focus on structured environments and the aboutness of information goes hand in hand with a third main topic of the information-as-correlation approach, namely the situatedness of information, that is, its dependence on the particular setting in which an informational signal occurs. Take the starry sky as an example again: the same pattern of stars, at different moments in time and locations in space, will in general convey different information about the location of your friend.
Historically, the first paradigmatic setting of correlated information was Shannon’s work on communication (1948), which we already mentioned in the last section. Shannon considered a communication system formed by two information sites, a source and a receiver, connected via a noisy channel. He gave conclusive and extremely useful answers to questions having to do with the construction of communication codes that help maximise the effectiveness of communication (in terms of bits of information that can be transmitted) while minimising the possibility of errors caused by channel noise. As we previously said, Shannon’s concern was purely quantitative. The logical approach to information as correlation builds on Shannon’s ideas, but is concerned with qualitative aspects of information flow, like the ones we highlighted before: what information about a ‘remote’ site (remote in terms of space, time, perspective, etc.) can be drawn out of information that is directly available at a ‘proximal’ site?
Situation theory (Barwise and Perry 1983; Devlin 1991) is themajor logical framework so far that has made these ideas its startingpoint for an analysis of information. Its origin and some of itscentral insights can be found in the project of naturalization of mindand the possibility of knowledge initiated by Fred Dretske (1981),which soon influenced the inception of situation semantics in thecontext of natural language (see Kratzer 2011).
Technically, there are two kinds of developments in situation theory: (a) the elaboration of rich ontologies of informational entities, and (b) the development of a mathematical theory of classifications and information channels.
The next three subsections survey some of the basic notions from thistradition: the basic sites of information in situation theory (calledsituations), the basic notion of information flow based oncorrelations between situations, and the mathematical theory ofclassifications and channels mentioned in (b).
The ontologies in (a) span a wide spectrum of entities. They are meant to reflect a particular way in which an agent may carve up a system. Here “a system” can be the world, or a part or aspect of it, while the agent (or kind of agent) can be an animal species, a device, a theorist, etc. The list of basic entities includes individuals, relations (which come with roles attached to them), temporal and spatial locations, and various other things. Distinctive among them are the situations and infons.
Roughly speaking, situations are highly structured parts of a system, such as a class session, a scene as seen from a certain perspective, a war, etc. Situations are the basic supporters of information. Infons, on the other hand, are the informational issues that situations may or may not support. The simplest kind of informational issue is whether some entities \(a_1 , \ldots ,a_n\) stand (or do not stand) in a relation \(R\) when playing the roles \(r_1 , \ldots ,r_n\), respectively. Such a basic infon is usually denoted as
\[ \llangle R, r_1 : a_1 , \ldots ,r_n : a_n, i\rrangle. \] where \(i\) is 1 or 0, according to whether the issue is positive or negative.
Infons are not intrinsic bearers of truth, and they are not claimseither. They are simply informational issues that may or may not besupported by particular situations. We’ll write \(s \models\sigma\) to mean that the situation \(s\) supports the infon\(\sigma\). As an example, a successful transaction whereby Marybought a piece of cheese in the local market is a situation thatsupports the infon
\[ \sigma = \llangle bought, what : cheese, who : Mary, 1\rrangle. \] This situation does not support the infon
\[ \llangle bought, what : cheese, who : Mary, 0\rrangle \] because Mary did buy cheese. Nor does the situation support the infon
\[ \llangle landed, who : Armstrong, where : Moon, 1\rrangle, \] because Armstrong is not part of the situation in question at all.
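The support relation just illustrated can be sketched in code. The encoding of infons as tuples and the finite listing of a situation's supported infons are illustrative assumptions, not situation theory's official formalism; the sketch merely records that the market situation supports the positive cheese-buying infon and neither its negative dual nor the unrelated Armstrong infon.

```python
# A basic infon <<R, r1:a1, ..., rn:an, i>> encoded as a hashable tuple of
# the relation, the role-to-argument assignment, and the polarity i.
def infon(relation, roles, polarity):
    return (relation, frozenset(roles.items()), polarity)

bought_cheese = infon("bought", {"what": "cheese", "who": "Mary"}, 1)
not_bought = infon("bought", {"what": "cheese", "who": "Mary"}, 0)
armstrong = infon("landed", {"who": "Armstrong", "where": "Moon"}, 1)

# A situation is sketched as the set of infons it supports.
market_situation = {bought_cheese}

def supports(situation, sigma):
    """s |= sigma: the situation settles the issue raised by sigma."""
    return sigma in situation

print(supports(market_situation, bought_cheese))  # True
print(supports(market_situation, not_bought))     # False: Mary did buy cheese
print(supports(market_situation, armstrong))      # False: not part of s
```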
The discrimination or individuation of a situation by an agent does not entail that the agent has full information about it: when we wonder whether the local market is open, we have individuated a situation about which we actually lack some information. See Textor (2012) for a detailed discussion on the nature of situation-like entities and their relation with other ontological categories such as the possible worlds used in modal logic.
Besides individuals, relations, locations, situations and basic infons, there are various kinds of parametric and abstract entities. For example, there is a mechanism of type abstraction. According to it, if \(y\) is a parameter for situations, then
\[ T_y = [y \mid y \models \llangle bought, what : cheese, who : x, 1\rrangle] \] is the type of situations where somebody buys cheese. There will be some basic types in an ontology, and many other types obtained via abstraction, as just described.
The collection of ontology entities also includes propositions and constraints. They are key in the formulation of the basic principles of information content in situation theory, to be introduced next.
The following are typical statements about “information flow” as studied in situation theory:
The general scheme has the form
\[\tag{IC} s : T \text{ indicates that } p, \]
where \(s : T\) is notation for “\(s\) is of type \(T\)”. The idea is that it is concrete parts of the world that act as carriers of information (the concrete dot in the radar or the footprints in Zhucheng), and that they do so by virtue of being of a certain type (the dot moving upward or the footprints showing a certain pattern). What each of these concrete instances indicates is a fact about another correlated part of the world. For the issues to be discussed below it will suffice to consider cases where the indicated fact—\(p\) in the formulation of [IC]—is of the form \(s' : T'\), as in the radar example.
The conditions needed to verify informational signalling in the sense of [\(\mathbf{IC}\)] rely on the existence of law-like constraints such as natural laws, necessary laws such as those of mathematics, or conventions, thanks to which (in part) one situation may serve as a carrier of information about another one. Constraints specify the correlations that exist between situations of various types, in the following sense: if two types \(T\) and \(T'\) are subject to the constraint \(T \Rightarrow T'\), then for every situation \(s\) of type \(T\) there is a relevantly connected situation \(s'\) of type \(T'\). In the radar example, the relevant correlation would be captured by the constraint GoingUpward \(\Rightarrow\) GoingNorth, which says that each situation where a radar point moves upward is connected with another situation where a plane is moving to the north. It is the existence of this constraint that allows a particular situation in which the dot moves to indicate something about the connected plane situation.
With this background, the verification principle for information signalling in situation theory can be formulated as follows:
[IS Verification] \(s : T\) indicates that \(s' : T'\) if \(T \Rightarrow T'\) and \(s\) is relevantly connected to \(s'\).
The relation \(\Rightarrow\) is transitive. This ensures that Dretske’s Xerox principle holds in this account of information transfer, that is, there can be no loss of semantic information through information transfer chains.
[Xerox Principle]: If \(s_1 : T_1\) indicates that \(s_2 : T_2\) and \(s_2 : T_2\) indicates that \(s_3 : T_3\), then \(s_1 : T_1\) indicates that \(s_3 : T_3\).
The [IS Verification] principle deals with information that in principle could be acquired by an agent. Access to some of this information will be blocked, for example, if the agent is oblivious to the correlation that exists between two kinds of situations. In addition, most correlations are not absolute; they admit exceptions. Thus, for the signalling described in [E1] to be really informational, the extra condition that the radar system is working properly must be met. Conditional versions of the [IS Verification] principle may be used to insist that the carrier situation must meet certain background conditions. The inability of an agent to keep track of changes in these background conditions may lead to errors. So, if the radar is broken, the dot on the screen may end up moving upward while the plane is moving south. Unless the air controller is able to recognise the problem, that is, unless she realises that the background conditions have changed, she may end up giving absurd instructions to the pilot. Now, instructions are tied to actions. For a treatment of actions from the situation-theoretical view, we refer the reader to Israel and Perry (1991).
The basic notion of information flow sketched in the previous section can be lifted to a more abstract setting in which the supporters of information are not necessarily situations as concrete parts of the world, but rather any entity which, as in the case of situations, can be classified as being of or not of certain types. The mathematical theory of distributed systems (Barwise and Seligman 1997) to be described next takes this abstract approach by focusing on information transfer within distributed systems in general.
A model of a distributed system in this framework will actually be a model of a kind of distributed system. Accordingly, the model of the radar-airplane system that we will use as a running example here will actually be a model of radar-airplane systems (in the plural). Setting up such a model requires describing the architecture of the system in terms of its parts and the way they are put together into a whole. Once that is done, one can proceed to see how that architecture enables the flow of information among its parts.
A part of a system (again, really its kind) is modelled by saying how particular instances of it are classified according to a given set of types. In other words, for each part of a system one has a classification
\[ \mathbf{A} = \langle Instances, Types, \models \rangle, \] where \(\models\) is a binary relation such that \(a \models T\) if the instance \(a\) is of type \(T\). In a simplistic analysis of the radar example, one could posit at least three classifications, one for the monitor screen, one for the flying plane, and one for the whole monitoring system:
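The notion of a classification \(\langle Instances, Types, \models \rangle\) can be given a concrete (if deliberately simplistic) computational sketch. The following Python snippet is our own illustration, not part of Barwise and Seligman’s formalism; the class name, the example instances and the example types are all invented for the purpose.

```python
# Illustrative sketch: a classification <Instances, Types, |=> as a
# small Python structure. All names and example data are hypothetical.
from dataclasses import dataclass

@dataclass(frozen=True)
class Classification:
    instances: frozenset
    types: frozenset
    classifies: frozenset  # set of (instance, type) pairs, i.e. |=

    def models(self, a, T):
        """a |= T: instance a is of type T."""
        return (a, T) in self.classifies

# A toy 'Screens' classification: two concrete screens, classified by
# the direction in which their radar dot is moving.
Screens = Classification(
    instances=frozenset({"screen1", "screen2"}),
    types=frozenset({"DotGoingUp", "DotGoingDown"}),
    classifies=frozenset({("screen1", "DotGoingUp"),
                          ("screen2", "DotGoingDown")}),
)

print(Screens.models("screen1", "DotGoingUp"))   # True
print(Screens.models("screen1", "DotGoingDown")) # False
```

The point of the representation is only that a classification is fully determined by its two carrier sets and the binary \(\models\) relation between them.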
A general version of a ‘part-of’ relation between classifications is needed in order to model the way parts of a system are assembled together. Consider the case of the monitoring systems. That each one of them has a screen as one of its parts means that there is a function that assigns to each instance of the classification MonitSit an instance of Screens. On the other hand, all the ways in which a screen can be classified (the types of Screens) intuitively correspond to ways in which the whole monitoring system could be classified: if a screen is part of a monitoring system and the screen is blinking, say, then the whole monitoring situation is intuitively one of the type ‘its screen is blinking’. Accordingly, a generalised ‘part-of’ relation between any two arbitrary classifications \(\mathbf{A}, \mathbf{C}\) can be modelled via two functions
\[\begin{align}f^{\wedge} &: \textit{Types}_A \rightarrow \textit{Types}_C \\ f^{\vee} &: \textit{Instances}_C \rightarrow \textit{Instances}_A, \end{align}\] the first of which takes every type in \(\mathbf{A}\) to its counterpart in \(\mathbf{C}\), and the second of which takes every instance \(c\) of \(\mathbf{C}\) to its \(\mathbf{A}\)-component.[5]
If \(f : \mathbf{A} \rightarrow \mathbf{C}\) is shorthand notation for the existence of the two functions above (the pair \(f\) of functions is called an infomorphism), then an arbitrary distributed system will consist of various classifications related by infomorphisms. For our purposes, it will suffice here to consider three classifications \(\mathbf{A}, \mathbf{B}, \mathbf{C}\) together with two infomorphisms
\[\begin{align}f &: \mathbf{A} \rightarrow \mathbf{C} \\ g &: \mathbf{B} \rightarrow \mathbf{C}. \end{align}\] Then, in our example, a simple way to model the radar monitoring system would consist of the pair
\[\begin{align}f &: \mathbf{Screens} \rightarrow \mathbf{MonitSit} \\ g &: \mathbf{Planes} \rightarrow \mathbf{MonitSit}. \end{align}\] The common codomain in these cases (\(\mathbf{C}\) in the general case and MonitSit in the example) works as the core of a channel that connects two parts of the system. The core determines the correlations that obtain between the two parts, thus enabling information flow of the kind discussed in section 2.2. This is achieved via two kinds of links. On the one hand, two instances \(a\) from \(\mathbf{A}\) and \(b\) from \(\mathbf{B}\) can be thought of as connected via the channel if they are components of the same instance in \(\mathbf{C}\), so the instances of \(\mathbf{C}\) act as connections between components. Thus, in the radar example, a particular screen will be connected to a particular plane if they belong to the same monitoring situation.
On the other hand, suppose that every instance in \(\mathbf{C}\) verifies some relation between types that happen to be counterparts of types from \(\mathbf{A}\) and \(\mathbf{B}\). Then such a relation captures a constraint on how the parts of the system are correlated. In the radar example, the theory of the core classification MonitSit would include constraints such as PlaneMovingNorth \(\Rightarrow\) DotGoingUp. This regularity of monitoring situations, which act as connections between radar screen-shots and planes, reveals a way in which radar screens and monitored planes correlate with each other. All this leads to the following version of information transfer.
Channel-enabled signalling: Suppose that
\[\begin{align}f &: \mathbf{A} \rightarrow \mathbf{C} \\ g &: \mathbf{B} \rightarrow \mathbf{C}. \end{align}\] Then instance \(a\) being of type \(T\) in \(\mathbf{A}\) indicates that instance \(b\) is of type \(T'\) in \(\mathbf{B}\) if \(a\) and \(b\) are connected by an instance from \(\mathbf{C}\) and the relation \(f^{\wedge}(T) \Rightarrow g^{\wedge}(T')\) between the counterpart types is satisfied by all instances of \(\mathbf{C}\).
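Channel-enabled signalling can be sketched computationally. In the following hedged Python illustration (every name and every piece of example data is our own invention), a classification is represented as a map from instances to the set of types they are of, and an infomorphism as a pair of maps on types and on core instances; the two conditions of the definition above are then checked directly.

```python
# Hedged sketch of channel-enabled signalling; all names are illustrative.
# A classification maps each instance to the set of types it is of.

Screens  = {"screen1": {"DotGoingUp"}}
Planes   = {"plane1":  {"GoingNorth"}}
# Core classification: monitoring situations connecting screens and planes.
MonitSit = {"msit1": {"ScreenDotGoingUp", "PlaneGoingNorth"}}

# Infomorphism f : Screens -> MonitSit (type part f_up, instance part f_down)
f_up   = {"DotGoingUp": "ScreenDotGoingUp"}
f_down = {"msit1": "screen1"}
# Infomorphism g : Planes -> MonitSit
g_up   = {"GoingNorth": "PlaneGoingNorth"}
g_down = {"msit1": "plane1"}

def connected(a, b):
    """a and b are connected iff some core instance has both as components."""
    return any(f_down[c] == a and g_down[c] == b for c in MonitSit)

def constraint_holds(T, Tp):
    """f_up(T) => g_up(Tp): every core instance of type f_up(T)
    is also of type g_up(Tp)."""
    return all(g_up[Tp] in types
               for types in MonitSit.values() if f_up[T] in types)

def indicates(a, T, b, Tp):
    """Channel-enabled signalling: a : T indicates b : Tp."""
    return connected(a, b) and constraint_holds(T, Tp)

print(indicates("screen1", "DotGoingUp", "plane1", "GoingNorth"))  # True
```

The example mirrors the radar case: the screen and the plane are connected by the single monitoring situation, and the core constraint links the two counterpart types, so the dot’s upward movement indicates the plane’s northward movement.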
Now, for each classification \(\mathbf{A}\), the collection
\[ L_A = \{T \Rightarrow T' \mid \text{ every instance of } \mathbf{A} \text{ of type } T \text{ is also of type } T'\} \] formed by all the global constraints of the classification can be thought of as a logic that is intrinsic to \(\mathbf{A}\). Then a distributed system consisting of various classifications and infomorphisms will have a logic of constraints attached to each part of it,[6] and more sophisticated questions about information flow within the system can be formulated.
For example, suppose an infomorphism \(f : \mathbf{A} \rightarrow \mathbf{C}\) is part of the distributed system under study. Then \(f\) naturally transforms each global constraint \(T \Rightarrow T'\) of \(L_{\mathbf{A}}\) into \(f^{\wedge}(T) \Rightarrow f^{\wedge}(T')\), which can always be shown to be an element of \(L_{\mathbf{C}}\). This means that one can reason within \(\mathbf{A}\) and then reliably draw conclusions about \(\mathbf{C}\). On the other hand, it can be shown that using preimages under \(f^{\wedge}\) in order to translate global constraints of \(\mathbf{C}\) does not always guarantee the result to be a global constraint of \(\mathbf{A}\). It is then desirable to identify extra conditions under which the reliability of the inverse translation can be guaranteed, or at least improved. In a sense, these questions are qualitatively close to the concerns Shannon originally had about noise and reliability.
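The intrinsic logic \(L_{\mathbf{A}}\) and the forward translation of its constraints can both be computed in small finite cases. The following Python sketch is our own illustration under invented data; in it, the classifications \(\mathbf{A}\) and \(\mathbf{C}\) are chosen so that the instance part of the infomorphism is a bijection, which is enough to make every translated constraint of \(L_{\mathbf{A}}\) land in \(L_{\mathbf{C}}\).

```python
# Illustrative sketch: global constraints of a classification, and their
# translation along (the type part of) an infomorphism. Data is invented.
from itertools import product

def global_constraints(classifies):
    """L_A: all pairs (T, T') such that every instance of T is also of T'."""
    types = set().union(*classifies.values())
    return {(T, Tp) for T, Tp in product(types, repeat=2)
            if all(Tp in ts for ts in classifies.values() if T in ts)}

# Classification A (e.g. screens) and core C (monitoring situations);
# the implicit instance map sends m1 -> s1 and m2 -> s2.
A = {"s1": {"DotGoingUp", "Blinking"}, "s2": {"DotGoingUp"}}
C = {"m1": {"c_DotGoingUp", "c_Blinking"}, "m2": {"c_DotGoingUp"}}
f_up = {"DotGoingUp": "c_DotGoingUp", "Blinking": "c_Blinking"}

L_A = global_constraints(A)
L_C = global_constraints(C)

# Every instance with Blinking also has DotGoingUp, but not conversely:
print(("Blinking", "DotGoingUp") in L_A)   # True
print(("DotGoingUp", "Blinking") in L_A)   # False

# Forward translation: each constraint of L_A lands in L_C here.
print(all((f_up[T], f_up[Tp]) in L_C for T, Tp in L_A))  # True
```

The converse direction (pulling constraints of \(\mathbf{C}\) back to \(\mathbf{A}\)) is exactly where, as the text notes, preservation can fail without extra conditions.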
Another issue one may want to model is reasoning about a system from the perspective of an agent that has only partial knowledge about the parts of a system. As an example, think of a plane controller who has only worked with ACME monitors and knows nothing about electronics. The logic such an agent might use to reason about part \(\mathbf{A}\) of a system (actually part Screens in the case of the controller) will in general consist of some constraints that may not even be global, but satisfied only by some subset of instances (the ACME monitors). The agent’s logic may be incomplete in the sense that it might miss some of the global constraints of the classification (like the ones involving inner components of the monitor). The agent’s logic may also be unsound, in the sense that there might be instances outside the awareness of the agent (say monitors of unfamiliar brands) that falsify some of the agent’s constraints (which do hold of all ACME monitors). A local logic \(L\) in \(\mathbf{A}\) can be “moved” along an infomorphism \(f : \mathbf{A} \rightarrow \mathbf{C}\) in the expected way, that is, its constraints are transformed via \(f^{\wedge}\), while its instances are transformed via \(f^{\vee}\). Natural questions studied in channel theory concerning these notions include the preservation (or not), under translation, of some desirable properties of local logics, such as soundness.
A recent development in channel theory (Seligman 2014) uses a more general definition of local logic, in which not all instances in the logic need to satisfy all its constraints. This version of channel theory is put to use in two important ways. Firstly, by using local logics to stand for situations, and with a natural interpretation of what an infon should then be, a reconstruction is produced of the core machinery of situation theory (briefly presented in sections 2.1 and 2.2). Secondly, it is shown that this version of channel theory can deal with probabilistic constraints. The rough idea is that any pair of a classification plus a probability measure over the set of instances induces an extended classification with the same set of types, and where a constraint holds if and only if the set of counterexample instances has measure 0. Notice that this set of counterexamples might not be empty. Having probabilistic constraints is a crucial step towards the effort of formally relating channel theory to Shannon’s theory of communication.
For an extensive development of the theory of channels sketched here, plus several explorations towards applications, see Barwise and Seligman (1997). See van Benthem (2000) for a study of conditions under which constraint satisfiability is preserved under infomorphisms, and Allo (2009) for an application of this framework to an analysis of the distinction between cognitive states and cognitive commodities. Finally, it must be mentioned that the notion of classification has been around for some years now in the literature, having been independently studied and introduced under names such as Chu spaces (Pratt 1995) or Formal Contexts (Ganter and Wille 1999).
For information to be computed, it must be handled by the computational mechanism in question, and for such handling to take place, the information must be encoded. Information as code is a stance that takes this encoding condition very seriously. The result is the development of fine-grained models of information flow that turn on the syntactic properties of the encoding itself.
To see how this is so, consider again cases involving information flow via observations. Such observations are informative because we are not omniscient in the normal, God-like sense of the term. We have to go and observe that the cat is on the mat, for example, precisely because we are not automatically aware of every fact in the universe. Inferences work in an analogous manner. Deductions are informative for us precisely because we are not logically omniscient. We have to reason about matters, sometimes at great length, because we are not automatically aware of the logical consequences of the body of information with which we are reasoning.
To come full circle—reasoning explicitly with information requires handling it, where in this case such handling is a cognitive act. Hence the information in question is encoded in some manner, and hence information as code underpins the development of fine-grained models of information flow that turn on the syntactic properties of the encoding itself, as well as the properties of the actions that underpin the various information-processing contexts involved.
Such information-processing contexts are not restricted to explicit acts of inferential reasoning by human agents, but include automated reasoning and theorem proving, as well as machine-based computational procedures in general. Approaches to modelling the properties of these latter information-processing scenarios fall under algorithmic information theory.
In section 3.1, we will explore a major approach to modelling the properties of information-processing within the information as code framework via categorial information theory. In section 3.2, we will examine the more general approach to modelling information as code of which categorial information theory is an instance, the modelling of information as code via substructural logics. In section 3.3 we will lay out the details of several other notable examples of logics of information flow motivated by the information as code approach.
Categorial information theory is a theory of fine-grained information flow whose models are based upon those specified by the categorial grammars underpinned by the Lambek Calculi, due originally to Lambek (1958, 1961). The motivation for categorial information theory is to provide a logical framework for modelling the properties of the very cognitive procedures that underpin deductive reasoning.
The conceptual origin of categorial information theory is found in van Benthem (1995: 186). Understanding van Benthem’s use of “procedural” to be synonymous with “dynamic”:
[I]t turns out that, in particular, the Lambek Calculus itself permits of procedural re-interpretation, and thus, categorial calculi may turn out to describe cognitive procedures just as much as the syntactic or semantic structures which provided their original motivation.
The motivation for categorial information theory is to model the cognitive procedures constituting deductive reasoning. Consider as an analogy the following example. You arrive home from IKEA with an unassembled table that is still flat-packed in its box. Now the question here is this: do you have your table? Well, there is a sense in which you do, and a sense in which you do not. You have your table in the sense that you have all of the pieces required to construct or generate the table, but this is not to say that you have the table in the sense that you are able to use it. That is, you do not have the table in any useful form; you have merely the pieces of a table. Indeed, getting these table-pieces into their useful form, namely a table, may be a long and arduous process…
The analogy between the table-example above and deductive reasoning is this. It is often said that the information encoded by (or “contained in” or “expressed by”) the conclusion of a deductive argument is encoded by the premises. So, when you possess the information encoded by the premises of some instance of deductive reasoning, do you possess the information encoded by the conclusion? Just as with the table-pieces, you do not possess the information encoded by the conclusion in any useful form, not until you have put the “information-pieces” constituting the premises together in the correct manner. To be sure, when you possess the information-pieces encoded by the premises, you possess some of the information required for the construction or generation of the information encoded by the conclusion. As with the table-pieces however, getting the information encoded by the conclusion from the information encoded by the premises may be a long and arduous process. You also need the instructional information that tells you how to combine the information encoded by the premises in the right way. This information-generation via deductive inference may also be thought of as the movement of information from implicit to explicit storage in the mind of the reasoning agent, and it is the cognitive procedures facilitating this storage transfer that motivate categorial information theory.
Categorial information theory is a theory of dynamic information processing based on the merge/fusion \((\otimes)\) and typed function \((\rightarrow , \leftarrow)\) operations from categorial grammar. The conceptual motivation is to understand the information in the mind of an agent as the agent reasons deductively to be a database in much the same way as a natural language lexicon is a database (see Sequoiah-Grayson 2013, 2016). In this case, a grammar will be understood as a set of processing constraints so imposed as to guarantee information flow, or well-formed strings as outputs. Recent research on proofs as events from a very similar conceptual starting point may be found in Stefaneas and Vandoulakis (2014).
Categorial information theory is strongly algebraic in flavour. Fusion ‘\(\otimes\)’ corresponds to the binary composition operator ‘.’, and ‘\(\vdash\)’ to the partial order ‘\(\le\)’ (see Dunn 1993). The merge and function operations are related to each other via the familiar residuation conditions:
\[\begin{align}\tag{5} A \otimes B \vdash C &\text{ iff } B \vdash A \rightarrow C \\ \tag{6} A \otimes B \vdash C &\text{ iff } A \vdash C \leftarrow B \end{align}\] In general, applications of directional function application will be restricted to algebraic analyses of grammatical structures, where commuted lexical items will result in non-well-formed strings.
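The residuation conditions (5) and (6) can be checked concretely in the language models of the Lambek calculus, where a type is a set of strings, fusion is elementwise concatenation, and \(\vdash\) is set inclusion. The Python sketch below is our own illustration (the alphabet, the universe bound, and the particular sets are invented); it verifies both biconditionals on one small example.

```python
# A sketch of the residuation conditions (5) and (6) in the language
# model of the Lambek calculus: types are sets of strings, fusion is
# concatenation, and |- is set inclusion. All example data is ours.
from itertools import product

ALPHABET = "ab"
# Universe of candidate strings (kept small for the check).
U = {"".join(p) for n in range(3) for p in product(ALPHABET, repeat=n)}

def fuse(A, B):                       # A (x) B
    return {x + y for x in A for y in B}

def rdiv(A, C):                       # A -> C  (right residual)
    return {y for y in U if all(x + y in C for x in A)}

def ldiv(C, B):                       # C <- B  (left residual)
    return {x for x in U if all(x + y in C for y in B)}

A, B = {"a"}, {"b", "ab"}
C = {"ab", "aab"}

# (5): A (x) B |- C  iff  B |- A -> C
print((fuse(A, B) <= C) == (B <= rdiv(A, C)))   # True
# (6): A (x) B |- C  iff  A |- C <- B
print((fuse(A, B) <= C) == (A <= ldiv(C, B)))   # True
```

Since both residuals only quantify over members of \(A\) and \(B\), the two biconditionals hold in this model for any choice of sets drawn from the universe, not just the example shown.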
Despite its algebraic nature, the operations can be given their evaluation conditions via “informationalised” Kripke frames (Kripke 1963, 1965). An information frame (Restall 1994) \(\mathbf{F}\) is a triple \(\langle S, \sqsubseteq, \bullet\rangle\). \(S\) is a set of information states \(x, y, z\ldots\), \(\sqsubseteq\) is a partial order of informational development/inclusion such that \(x \sqsubseteq y\) is taken to mean that the information carried by \(y\) is a development of the information carried by \(x\), and \(\bullet\) is an operation for combining information states. In other words, we have a domain with a combination operation. The operation of information combination and the partial order of information inclusion interrelate as follows:
\[\tag{7} x \sqsubseteq y \text{ iff } x \bullet y \sqsubseteq y \] Reading \(x \Vdash A\) as state \(x\) carries information of type \(A\), we have it that:
\[\begin{align}\tag{8} x \Vdash A \otimes B &\text{ iff for some } y, z \in \mathbf{F} \text{ s.t. } y \bullet z \sqsubseteq x, y \Vdash A \text{ and } z \Vdash B. \\ \tag{9} x \Vdash A \rightarrow B &\text{ iff for all } y, z \in \mathbf{F} \text{ s.t. } x \bullet y \sqsubseteq z, \text{ if } y \Vdash A \text{ then } z \Vdash B. \\ \tag{10} x \Vdash B \leftarrow A &\text{ iff for all } y, z \in \mathbf{F} \text{ s.t. } y \bullet x \sqsubseteq z, \text{ if } y \Vdash A \text{ then } z \Vdash B. \end{align}\] At the syntactic level, we read \(X \vdash A\) as processing on \(X\) generates information of type \(A\). In this case we are understanding \(\vdash\) as an information processing mechanism as suggested by Wansing (1993: 16), such that \(\vdash\) encodes not just the output of an information processing procedure, but the properties of the procedure itself. Just what this processing consists of will depend on the processing constraints that we set up on our database. These processing constraints will be imposed in order to guarantee an output from the processing itself, or to put this another way, in order to preserve information flow. Such processing constraints are fixed by the presence or absence of various structural rules, and structural rules are the business of substructural logics.
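The evaluation clauses (8)–(10) can be run on a finite information frame directly. The following Python sketch is our own toy example: the four states, the \(\sqsubseteq\) order, the \(\bullet\) table and the valuation of the atoms are all invented purely to illustrate how the clauses quantify over states.

```python
# A hedged sketch: evaluating clauses (8)-(10) on a tiny information
# frame <S, order, dot>. The frame and valuation below are invented.

S = {"x0", "xa", "xb", "xab"}

# The order as a set of pairs (reflexive; x0 is the least informative).
leq = {(s, s) for s in S} | {("x0", s) for s in S} | \
      {("xa", "xab"), ("xb", "xab")}

# Combination of states (join-like, invented for the example).
def dot(x, y):
    if x == "x0": return y
    if y == "x0": return x
    if x == y:    return x
    return "xab"

V = {"A": {"xa", "xab"}, "B": {"xb", "xab"}}  # atomic information types

def holds(x, phi):
    if not isinstance(phi, tuple):       # atom
        return x in V[phi]
    op = phi[0]
    if op == "fuse":                     # clause (8): A (x) B
        _, A, B = phi
        return any((dot(y, z), x) in leq and holds(y, A) and holds(z, B)
                   for y in S for z in S)
    if op == "rimp":                     # clause (9): A -> B
        _, A, B = phi
        return all(not ((dot(x, y), z) in leq and holds(y, A))
                   or holds(z, B) for y in S for z in S)
    if op == "limp":                     # clause (10): B <- A
        _, B, A = phi
        return all(not ((dot(y, x), z) in leq and holds(y, A))
                   or holds(z, B) for y in S for z in S)

print(holds("xab", ("fuse", "A", "B")))  # True: xa . xb is below xab
print(holds("x0", ("fuse", "A", "B")))   # False: x0 carries too little
```

The point of the sketch is only that each clause is a bounded quantification over states constrained by \(\bullet\) and \(\sqsubseteq\), exactly as (8)–(10) say.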
Categorial information theory is precipitated by giving the Lambek calculi an informational semantics. At a suitable level of abstraction, the Lambek calculi are seen to be highly expressive substructural logics. Unsurprisingly, by giving an informational semantics for substructural logics in general, we get a family of logics that exemplify the information as code approach. This logical family is organised by expressive power, with the expressive power of the logics in question being captured by the presence of various structural rules.
A structural rule is of the following general form:
\[\tag{11} X \Leftarrow Y \] We may read (11) as any information generated by processing on \(X\) is generated by processing on \(Y\) also. Hence the long form of (11) is as follows:
\[\tag{12} \frac{X \vdash A}{Y \vdash A} \] Hence \(X\) is a structured body of information, or “data structure” as Gabbay (1996: 423) puts it, where the actual arrangement of the information plays a crucial role. The structural rules will fix the structure of the information encoded by \(X\), and as such impact upon the granularity of the information being processed.
Consider Weakening, the most familiar of the structural rules (followed by its corresponding frame condition):
\[\begin{align}\tag{Weakening} &A \Leftarrow A \otimes B \\ &x\bullet y \sqsubseteq z \rightarrow x \sqsubseteq z \end{align}\] With Weakening present, we lose track of which pieces of information were actually used in an inference. This is precisely why the rejection of Weakening is the mark of relevant logics, where the preservation of bodies of information relevant to the derivation of the conclusion is the motivation. By rejecting Weakening, we highlight a certain type of informational taxonomy, in the sense that we know which bodies of information were used. To preserve more structural detail than simply which bodies of information were used, we need to consider rejecting further structural rules.
Suppose that we want to record not only which pieces of information were used in an inference, but also how often they were used. In this case we would reject Contraction:
\[\begin{align}\tag{Contraction} &A \otimes A \Leftarrow A \\ &x \bullet x \sqsubseteq x \end{align}\] Contraction allows the multiple use, without restriction, of a piece of information. So if keeping a record of the “informational cost” of the execution of some information processing is a concern, Contraction will be rejected. The rejection of Contraction is the mark of linear logics, which were designed for modelling just such processing costs (see Troelstra 1992).
If we wish to preserve the order of use of pieces of information, then we will reject the structural rule of Commutation:
\[\begin{align}\tag{Commutation} &A \otimes B \Leftarrow B \otimes A \\ &x \bullet y \sqsubseteq z \rightarrow y \bullet x \sqsubseteq z \end{align}\] Information-order will be of particular concern in temporal settings (consider action-composition) and natural language semantics (Lambek 1958), where non-commuting logics first appeared. Commutation also comes in a more familiar strong form:
\[\begin{align}\tag{Strong Commutation} &(A \otimes B) \otimes D \Leftarrow(A \otimes D) \otimes B \\ &\exists u(x \bullet z \sqsubseteq u \wedge u \bullet y \sqsubseteq w) \rightarrow\\ &\qquad \exists u(x \bullet y \sqsubseteq u \wedge u \bullet z \sqsubseteq w) \end{align}\] The strong form of Commutation results from its combination with the structural rule of Association:[7]
\[\begin{align}\tag{Association} &A \otimes(B \otimes C) \Leftarrow(A \otimes B) \otimes C \\ &\exists u(x \bullet y \sqsubseteq u \wedge u \bullet z \sqsubseteq w) \rightarrow \\ &\qquad \exists u(y \bullet z \sqsubseteq u \wedge x \bullet u \sqsubseteq w) \end{align}\] Rejecting Association will preserve the precise fine-grained properties of the combination of pieces of information. Non-associative logics were introduced originally to capture the combinatorial properties of language syntax (see Lambek 1961).
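The frame conditions attached to these structural rules can be tested mechanically on small models. The Python sketch below (both toy models are our own inventions) shows the conditions for Weakening, Contraction and Commutation all holding when states are sets combined by union, and Commutation failing once combination is order-sensitive string concatenation:

```python
# A sketch checking the frame conditions for Weakening, Contraction and
# Commutation in two toy models. Both models are our own illustrations.
from itertools import product

def weakening(S, dot, leq):
    """x . y below z implies x below z, for all states."""
    return all(not leq(dot(x, y), z) or leq(x, z)
               for x, y, z in product(S, repeat=3))

def contraction(S, dot, leq):
    """x . x below x, for all states."""
    return all(leq(dot(x, x), x) for x in S)

def commutation(S, dot, leq):
    """x . y below z implies y . x below z, for all states."""
    return all(not leq(dot(x, y), z) or leq(dot(y, x), z)
               for x, y, z in product(S, repeat=3))

# Model 1: states are sets of data items, combination is union, the
# order is inclusion. All three conditions hold.
sets = [frozenset(), frozenset("a"), frozenset("b"), frozenset("ab")]
union = lambda x, y: x | y
subset = lambda x, y: x <= y
print(weakening(sets, union, subset),
      contraction(sets, union, subset),
      commutation(sets, union, subset))   # True True True

# Model 2: states are strings, combination is concatenation, the order
# is identity. Order of combination now matters, so Commutation fails.
strs = ["", "a", "b", "ab", "ba"]
concat = lambda x, y: x + y
eq = lambda x, y: x == y
print(commutation(strs, concat, eq))      # False ("ab" is not "ba")
```

The contrast matches the discussion above: the amorphous set model forgets use, multiplicity and order, while the string model retains order and so falsifies the Commutation condition.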
In the presence of Commutation, the double implication pair \((\rightarrow , \leftarrow)\) collapses into the single implication \(\rightarrow\). In the presence of all of the structural rules, fusion, \(\otimes\), collapses into Boolean conjunction, \(\wedge\). In this case, the residuation conditions outlined in (5) and (6) collapse into a mono-directional function.
The choice of which structural rules to retain obviously depends on just what informational phenomenon is being modelled, so there is a strong pluralism at work. By rejecting Weakening, say, we are speaking of which data were relevant to the process, but are saying nothing about their multiplicity (in which case we would reject Contraction), their order (in which case we would reject Commutation), or the actual patterns of use (in which case we would reject Association). By allowing Association, Commutation, and Contraction, we have the taxonomy locked down. We might not know the order or multiplicity of the data that were used, but we do know what types, and exactly what types, were relevant to the successful processing. The canonical contemporary exposition of such an information-based interpretation of propositional relevant logic is Mares (2004). Such an interpretation allows for an elegant treatment of the contradictions encoded by relevant logics. By distinguishing between truth conditions and information conditions, we allow for an interpretation of \(x \Vdash A \wedge \neg A\) as \(x\) carries the information that \(A\) and not \(A\). For an exploration of the distinction between truth-conditions and information-conditions within quantified relevant logic, see Mares (2009).
At such a stage, things are still fairly static. By shifting our attention from static bodies of information to the manipulation of these bodies, we will reject structural rules beyond Weakening, arriving ultimately at categorial information theory, as it is encoded by the very weakest substructural logics. Hence the weaker we go, the more “procedural” the flavour of the logics involved. From a dynamic/procedural perspective, linear logics might be thought of as a “half-way point” between static classical logic and fully procedural categorial information theory. For a detailed exposition of the relationship between linear logic and other formal frameworks in the context of modelling information flow, see Abramsky (2008).
Recent important work by Dunn (2015) ties substructural logics and structural rules together with informational relevance in the following way. Dunn makes a distinction between programs and data, with the former being dynamic and the latter static. We may think of programs as conditional statements of the form \(A \rightarrow B\), and of data as atomic propositions \(A, B\), etc. Given these two types of information artefacts, we have three possible combinations: program to data combination, program to program combination, and data to data combination. For program to data combination, commutation will hold whilst weakening and association will fail, and contraction does not apply. For program to program combination, association will hold, whilst commutation and weakening fail. As demonstrated in Sequoiah-Grayson (2016), the case of contraction for program to program combination is more complicated. The exact properties of data to data combination remain an interesting open issue. The connection with informational relevance is made by interpreting the partial order relation \(\sqsubseteq\) as marking informational relevance itself. In this case, \(x \sqsubseteq y\) is read as the information \(x\) is relevant to the information \(y\). What exactly informational relevance amounts to will depend on the precise context of information processing in question. Sequoiah-Grayson (2016) extends the framework above to contexts of information processing by an agent as the agent reasons explicitly. Given that the combination of information states \(x \bullet y\) may sit on the left-hand side of the partial order relation, the extension is an account of the epistemic relevance of epistemic actions. For a collection of recent papers exploring the information as code approach in depth, see Bimbó (2016). See Bimbó (2022) for a wide collection of recent papers on informational relevance and reasoning.
The information as code approach is a very natural perspective on information flow, hence there are a number of related frameworks that exemplify it.
One such approach to analysing information as code is to carry out such an analysis in terms of the computational complexity of various propositional logics. Such an approach may propose a hierarchy of propositional logics that are all decidable in polynomial time, with this hierarchy being structured by the increasing computational resources required for the proofs in the various logics. D’Agostino and Floridi (2009) carry out just such an analysis, with their central claim being that this hierarchy may be used to represent the increasing levels of informativeness of propositional deductive reasoning.
Gabbay’s (1993, 1996) framework of labelled deductive systems exemplifies the information as code approach in a manner very similar to the informationalised substructural logics of section 3.1. An item of data (note that Gabbay refers to both atomic and conditional information as data, in contrast to Dunn and Sequoiah-Grayson in the section above) is given as a pair of the form \(x : A\), where \(A\) is a piece of declarative information, and \(x\) is a label for \(A\). \(x\) is a representation of information that is needed to operate on or alter the information encoded by \(A\). Suppose that we also have the data-pair \(y : A \rightarrow B\). We may apply \(x\) to \(y\), resulting in the data-pair \(x + y : B\). In this case, a database is a configuration of labelled formulas, or data-pairs (Gabbay 1993: 72). The labels and their corresponding application operation are organised by an algebra, and the properties of this algebra will impose constraints on the application operation. Different constraints, or “meta-conditions” as Gabbay calls them (Gabbay 1993: 77), will correspond to different logics. For example, if we were to ignore the labels, then we would have classical logic; if we were to accept only the derivations which used all of the labelled assumptions, then we would have relevance logic; and if we accepted only the derivations which used the labelled assumptions exactly once, then we would have linear logic. Labels are behaving very much like possible worlds here, and the short step from possible worlds to information states makes it obvious how the meta-conditions on labels may be captured by structural rules.
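The mechanics of labelled deduction can be sketched in a few lines of Python. The representation below is our own (labels as multisets of assumption names, implications as tagged tuples); it is not Gabbay’s notation, but it shows how a single modus ponens step combines labels, and how meta-conditions on the resulting label can then be checked.

```python
# Hedged sketch of Gabbay-style labelled deduction: labels track which
# assumptions a formula depends on, and "meta-conditions" on labels
# select different logics. Representation choices here are our own.
from collections import Counter

# Data-pairs label : formula, with labels as multisets of assumption
# names. Atoms are strings; ("imp", A, B) stands for A -> B.
assumptions = [
    (Counter({"x": 1}), "A"),
    (Counter({"y": 1}), ("imp", "A", "B")),
]

def apply_pairs(d1, d2):
    """Modus ponens on data-pairs: from x : A and y : A -> B, x + y : B."""
    (l1, A), (l2, f) = d1, d2
    if isinstance(f, tuple) and f[0] == "imp" and f[1] == A:
        return (l1 + l2, f[2])
    return None

label, formula = apply_pairs(*assumptions)
print(formula)                    # B
print(sorted(label.elements()))   # ['x', 'y']

# Meta-conditions on the final label:
used_all = set(label) == {"x", "y"}              # relevance-style condition
used_once = all(n == 1 for n in label.values())  # linear-style condition
print(used_all and used_once)     # True
```

Here the derivation of \(B\) uses each labelled assumption exactly once, so it would be accepted under both the relevance-style and the linear-style meta-condition.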
Artemov’s (2008) framework of justification logic shares many surface similarities with Gabbay’s system of labelled deduction. The logic is composed of justification assertions of the form \(x : A\), read as \(x\) is a justification for \(A\). Justifications themselves are evidential bases of various sorts, varying with context. They might be mathematical proofs, sets of causes or counterfactuals, or something else that fulfils the justificatory role. What it means for \(x\) to justify \(A\) is not analysed directly in justification logic. Rather, attempts are made to characterise the justification relation \(x : A\) itself, via various operations and their axioms. The application operation ‘.’ mimics the application operation ‘+’ from labelled deduction, or the fusion ‘\(\otimes\)’ operation from categorial information theory. In justification logic, the symbol ‘+’ is reserved for the representation of joint evidence. Hence ‘\(x + y\)’ is read as ‘the joint evidence of \(x\) and \(y\)’. Application and join are characterised in justification logic by the following axioms respectively:
\[\begin{align}\tag{13} &x : (A \rightarrow B) \rightarrow (y : A \rightarrow (x{.}y) : B) \\ \tag{14} &x : A \rightarrow (x + y) : A, \text{ and } x : A \rightarrow (y + x) : A \end{align}\] The latter axiom characterises the monotonicity of joint evidential bases. Apart from the commutativity of +, the structural properties of the justification operations are currently unexplored, although the potential for such an exploration is exciting. Justification logic is used to analyse notoriously difficult epistemic problems such as the Gettier cases. If we take our epistemology to be informationalised, then the constitution of evidential bases as information states places justification logics within the information as code approach in a straightforward manner. For further details, see Artemov and Fitting (2012).
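Axioms (13) and (14) can be mirrored by operations on evidence terms paired with the formulas they justify. The following is a minimal sketch under an illustrative encoding (formulas as strings, implications as `('->', A, B)` tuples); it is not Artemov’s official machinery.

```python
# Evidence terms paired with the formulas they justify; illustrative only.
def app(x_pair, y_pair):
    """Axiom (13): from x : A -> B and y : A, obtain x.y : B."""
    (x, impl), (y, a) = x_pair, y_pair
    op, antecedent, consequent = impl
    assert op == "->" and antecedent == a
    return (f"({x}.{y})", consequent)

def join(x_pair, y_pair):
    """Axiom (14): from x : A, obtain (x + y) : A for any evidence y."""
    (x, a), (y, _) = x_pair, y_pair
    return (f"({x} + {y})", a)

x = ("x", ("->", "A", "B"))   # x : A -> B
y = ("y", "A")                # y : A
print(app(x, y))              # ('(x.y)', 'B')
print(join(y, x))             # ('(y + x)', 'A') -- join is monotone
```

Note that `join` keeps the justified formula while enlarging the evidence term, which is exactly the monotonicity that axiom (14) expresses.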
Zalta’s work on object theory (Zalta 1983, 1993) provides a different way to analyse informational content—understood as propositional content—and its structure. Motivated by metaphysical considerations, object theory starts by proposing a theory of objects and relations (usually formulated in a second-order quantified modal language). This theory can then be used to define and characterise states of affairs, propositions, situations, possible worlds, and other related notions. The resulting picture is one where all these things have internal structure, their algebraic properties are axiomatised, and one can therefore reason about them in a classical proof-theoretical way.
A philosophical point touched on by this approach concerns the link between the propositional content (information) expressed by sentences and the idea of predication. Relevant to this entry is Zalta’s (1993) development of a version of situation theory that follows this approach, and where a key element is the use of two forms of predication. Briefly, the formula ‘\(Px\)’ corresponds to the usual form of predication by exemplification (as in “Obama is American”), while ‘\(xP\)’ corresponds to predication via encoding. Abstract objects are then defined to be (essentially) encodings of properties, in combinations which might not even be made factual. These provisions enable the existence of information about abstract, possible, or fictional entities. For details on the tradition to which object theory belongs see Textor (2012), McGrath (2012), and King (2012).
While the three approaches discussed above (range, correlations, code) differ in that they emphasise different informational themes, the underlying notion they aim to clarify is the same (information). It is then natural to find that the similarities and synergies between the approaches invite the exploration of ways to combine them. Each one of the next subsections illustrates how one could bring together two out of the three approaches. Section 4.1 exemplifies the interface between the info-as-range and info-as-correlation views. Sections 4.2 and 4.3 do the same with the other two pairs of combinations, namely code and correlations, and code and ranges.
A central intuition in the information-as-range view is the correspondence that exists between information at hand (where this can be qualified in various ways) and the range of possibilities which are compatible with such information. On the other hand, a key feature of the correlational approach to information is its reliance on a structured information system formed by components that are systematically connected. In general, many properties of a structured system will actually be local properties, in that they are determined by only some of the components (the fact that there is a dot moving upwards on a radar can be determined only by looking at the screen, even if this behaviour is correlated with the motion of a remote plane, which is another component of the system). If one has access to information pertaining to only a few of the many components of a system, a natural notion of range of possibilities arises, consisting of all the possible global configurations of the system that are compatible with such local information. This subsection expands on this particular way to link the two approaches, but, as will be noted at the end, this is not the only one, and the search for other ways lies ahead as an open area of inquiry.
Formally, the link between ranges and correlations described above may be approached by using a restricted product state space as a model of the architecture of the system (van Benthem 2006, van Benthem and Martinez 2008). The basic structures are constraint models, versions of which have been around in the literature for some years (for example Fagin et al. 1995 in the study of epistemic logic, and Ghidini and Giunchiglia 2001 in the study of context dependent reasoning). Constraint models have the form
\[ \mathscr{M} = \langle Comp, States, C, Pred\rangle. \] Here, the basic component spaces are indexed by Comp, the states of each component are taken from States (with different components perhaps using only a few of the elements of States), and the global states of the system are global valuations, that is, functions that assign a state to each basic component in Comp. Not all such functions are allowed, only those in \(C\). Finally, Pred is a labelled family of predicates (sets of global states).
To see how this fits with the information-as-correlation view, consider again the example of planes being monitored by radars. As before, each monitoring situation will be modelled as having only two parts, now indexed by the members of \(Comp = \{ screen, plane\}\). The actual instances of screening situations would correspond to global states, which in this case — where we have only two components — can be thought of as pairs \((s, b)\) where \(s\) is a particular screen and \(b\) a particular plane. Hence, global states connect instances of parts, so representing instances of a whole system. But then a crucial restriction comes into play, because not all screens are connected with all planes, only with those belonging to the same monitoring situation. The set \(C\) selects only such permissible pairs, thus playing a role similar to that of a channel in section 1. Finally, Pred classifies global states into types, similar to the classification relations of section 2.3.
As we said before, some properties of systems are local properties, with only some of the components of the systems being relevant in determining whether they hold or not. That a monitoring situation is one where the plane is moving North depends only on the plane, not on the screen. In general, if a property is completely determined by a subset of components \(\mathbf{x}\) then, in what concerns that property, any two global states that agree on \(\mathbf{x}\) should be indistinguishable. In fact, each such \(\mathbf{x}\) induces an equivalence relation of local property determination so that for every two global states \(\mathbf{s}, \mathbf{t}\):
\(\mathbf{s} \sim_{\mathbf{x}} \mathbf{t}\) if and only if the values of \(\mathbf{s}\) and \(\mathbf{t}\) at each one of the components in \(\mathbf{x}\) are the same.
In this way one gets not only a conceptual but also a formal link to the information-as-range approach, because constraint models can be used to interpret a basic modal language with atomic formulas of the form \(P\)—where \(P\) is one of the labels of predicates in Pred—and with complex formulas of the form \(\neg \phi, \phi \vee \psi, U\phi\), and \(\Box_{\mathbf{x}}\phi\), where \(\mathbf{x}\) is a partial tuple of components and \(U\) is the universal modality. More concretely, given a constraint model \(\mathscr{M}\) and a global state \(s\), the crucial satisfaction conditions are given by:
\[\begin{alignat}{3}\mathscr{M}, \mathbf{s} &\models P &\text{ iff } &\mathbf{s} \in P \\ \mathscr{M}, \mathbf{s} &\models U \phi &\text{ iff } &\mathscr{M}, \mathbf{t} \models \phi \text{ for all } \mathbf{t} \\ \mathscr{M}, \mathbf{s} &\models \Box_{\mathbf{x}} \phi &\text{ iff } &\mathscr{M}, \mathbf{t} \models \phi \text{ for all } \mathbf{t} \sim_{\mathbf{x}} \mathbf{s} \end{alignat}\] The resulting logic is axiomatised by the fusion of \(S_5\) modal logics for the universal modality \(U\) and each one of the \(\Box_{\mathbf{x}}\) modalities, plus the addition of axioms of the form \(U \phi \rightarrow \Box_{\mathbf{x}}\phi\), and \(\Box_{\mathbf{x}}\phi \rightarrow \Box_{\mathbf{y}}\phi\) whenever \(\sim_{\mathbf{y}} \subseteq \sim_{\mathbf{x}}\).
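To make the satisfaction clauses concrete, here is a toy constraint model for the radar example. All names and the tiny state space are illustrative assumptions, not part of any published formalism; admissible global states \(C\) are listed directly, and \(\Box_{\mathbf{x}}\) is evaluated by quantifying over the \(\sim_{\mathbf{x}}\)-related admissible states.

```python
# Toy constraint model: Comp = {screen, plane}, States as listed below.
components = ["screen", "plane"]
states = {"screen": ["dot_up", "dot_down"], "plane": ["north", "south"]}

# Admissible global states C: the dot moves up iff the plane heads north.
C = [{"screen": "dot_up", "plane": "north"},
     {"screen": "dot_down", "plane": "south"}]

# Pred: predicates as sets of global states (here, indices into C).
Pred = {"MovingNorth": [0]}

def agree_on(s, t, x):
    """s ~_x t: the global states agree on every component in x."""
    return all(s[c] == t[c] for c in x)

def box(x, pred_states, s):
    """M, s |= []_x P: P holds at every admissible t with t ~_x s."""
    return all(i in pred_states for i, t in enumerate(C) if agree_on(s, t, x))

# Looking only at the screen already settles that the plane moves north:
print(box(["screen"], Pred["MovingNorth"], C[0]))  # True
# With the empty index, []_x behaves like U, quantifying over all of C:
print(box([], Pred["MovingNorth"], C[0]))          # False
```

The two printed values illustrate locality: the property MovingNorth is determined by the screen component alone, but it is not a universal truth of the system.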
The information-as-range research agenda includes other topics, such as agency and the dynamics of information update, which can in principle be incorporated into the constraint models setting. For example, in the case of agency, to the architectural structure of a state system captured by a constraint model one could add epistemic accessibility relations for a group of agents \(\mathcal{A}\), so as to obtain epistemic constraint models of the form
\[ \mathscr{M} = \langle Comp, States, C, Pred, \{\approx_{a}\}_{a\in \mathcal{A}}\rangle, \] where \(\approx_a\) is the equivalence accessibility relation of agent \(a\). Here one could refine the planes and radar example above by adding some agents, say the controller and the pilot. By relying only on the controls each agent can see, the controller will not be able to distinguish states that agree on the direction of the plane but differ, say, on the meteorological conditions around the plane. Those states will be related by the controller’s relation in the model, but not by the pilot’s relation. In principle, this merger of modal epistemic models and constraint models allows one to study, in a single setting, aspects of both the information-as-range and information-as-correlation points of view. The corresponding logical language for epistemic constraint models is the same as for basic constraint models, expanded with the \(K_i\) modal operators, one per agent. The logic is the fusion of the constraint logic from above and an \(S_5\) logic for each agent \(a\).
There are some newer, different approaches to information modelling that sit at the intersection of the information as range and information as correlation perspectives. One is van Benthem’s work on information tracking (van Benthem 2016). Tracking is a perspective that addresses both the connections between different representations of information on the one hand, and the updates on these connections on the other.
Another development (Baltag 2016) comes from a line of work that studies how to capture, in the style of epistemic logics such as those described in section 1, the properties and dynamics of knowledge de re (Wang and Fan 2014). Identifying this kind of knowledge with knowledge of the value of a variable, Baltag’s insight is to add, to the language of basic epistemic logic, the usual first-order resources for constructing terms and basic formulas (that is, symbols for constants, functions, relations, and variables), plus, crucially, a generalised conditional knowledge operator \(K_{a}^{t_1 ,\ldots ,t_n}\). The extended language now has formulas \(K_{a}^{t_1 ,\ldots ,t_n} t\) and \(K_{a}^{t_1 ,\ldots ,t_n} \phi\), with the intended meaning that agent \(a\) knows the value of term \(t\) (or knows that \(\phi\), for the second formula), provided it knows the values of terms \(t_1 ,\ldots ,t_n\). To capture this idea on the semantic side, Kripke models are enriched so that, in addition to the usual set of information states, interpretations for propositional letters, and agents’ relations, we also have a domain of objects over which terms and basic relational formulas are locally interpreted at each state (that is, the interpretations can vary from state to state, but the underlying domain is the same across states). A sound and complete axiomatisation exists, and the resulting logical system is a sort of general, yet decidable, dependence logic where information about correlations can be captured via the conditional knowledge operators. Dynamic versions are also obtained where, in addition to the public announcement operator \([\phi]\), one has value announcement operators \([t_1,\ldots ,t_n]\), with the formula \([t_1,\ldots ,t_n] \phi\) being read as “after the simultaneous announcement of the values of terms \(t_1,\ldots ,t_n\), it is the case that \(\phi\)”.
There is recent work (Baltag and van Benthem 2021) that achieves a general logic of local dependence, recruiting semantic insights like the ones described in this subsection (constraint models and an enriched modal semantics) and showing that they can be seen as two faces of the same coin. Yet other links between the approaches have also been found, which are motivated by other kinds of questions and use formalisms that are closer to the situation-theoretic ones. For example, consider a setting in which agents have incomplete information about an intended subset of a set of epistemic states. How can a relation of accessibility arise from such a setting? (Notice that this is different to the setting of epistemic constraint models described above, where agents do have complete information about what holds true of all the epistemically accessible worlds.) One way to address this question (Barwise 1997) is to consider a fixed classification \(A\), the instances of which are the epistemic states, plus a local logic per agent attached to each state. For some states these local logics may be incomplete (see section 2.3), so agents may not have information about everything that holds true of the intended range of states. Then, roughly, the states accessible from a given state \(s\) and agent \(a\) will be those whose properties (types) do not contradict the local logic of \(a\) in \(s\). With these epistemic relations in place, classification \(A\) can be used to interpret a basic modal language.
Logical frameworks that cross over information as code and information as correlation get their most explicit representation in work that models the crossover between the two frameworks directly. Restall (1994) and Mares (1996) give independent proofs of the representability of Barwise’s information as correlation channel-theoretic framework within the information as code approach as exemplified by the substructural logics framework. In this section we will trace the motivations and the main details of the proof, before demonstrating the connection with category theory.
The basic steps are these—if we understand information channels to be information states of a special sort, namely the sort of information state that carries information of conditional types, then there is an obvious meeting point between information as correlation as exemplified by channel theory, and information as code as exemplified by informationalised substructural logics. The intermediate step is to reveal the connection between channel semantics for conditional types, and the frame semantics for conditionals given by relevance logics.
Starting with the channel theoretic analysis of conditionals: as noted already, the running motivation behind Barwise’s channel-theoretic framework is that information flow is underpinned by an information channel. Barwise understood conditionals as constraints, so that \(A \rightarrow B\) is a constraint from \(A\) to \(B\) in the sense of \(A \Rightarrow B\) from section 2.2 above. If the information that \(A\) is combined with the information encoded by the constraint, then the result or output is the information that \(B\).
The information that \(A\) and that \(B\) is carried by the situations \(s_1, s_2,\ldots\), and the information encoded by the constraint is carried by an information channel \(c\). Given this, Barwise’s evaluation condition for a constraint is as follows (the condition is given here in Barwise’s notation from his later work on conditionals, although in earlier writings such conditions appeared in the notation given in section 2.2 above):
\[\tag{15} c \models A \rightarrow B \text{ iff for all } s_1, s_2, \text{ if } s_1 \stackrel{c}{\mapsto} s_2 \text{ and } s_1 \models A, \text{ then } s_2 \models B, \] where \(s_1 \stackrel{c}{\mapsto} s_2\) is read as
the information carried by the channel \(c\), when combined with the information carried by the situation \(s_1\), results in the information carried by the situation \(s_2\).
Obviously enough, this is very close in spirit to (9) in the section on information as code above.
As noted above, the intermediate step concerns the ternary relation \(R\) from the early semantics for relevance logic. The semantic clause for the conditional from relevance logic is:
\[\tag{16} x \Vdash A \rightarrow B \text{ iff for all } y, z \in \mathbf{F} \text{ s.t. } Rxyz, \text{ if } y \Vdash A \text{ then } z \Vdash B. \] \(Rxyz\) is, by itself, simply an abstract mathematical entity. One way of reading it, the way that became popular in relevance logic circles, is
\(Rxyz\) iff the result of combining \(x\) with \(y\) is true at \(z\).
Given that the points of evaluation in relevance logics were understood originally as impossible situations (since they may be both inconsistent and incomplete), the main conceptual move was to understand channels to be special types of situations. The full proofs may be found in Restall (1994) and Mares (1996), and these demonstrate that the expressive power of Barwise’s system may be captured by the frame semantics of relevance logic. What such “combining” of \(x\) and \(y\) amounts to depends, of course, on which structural rules are operating on the frame in question. As explained in the previous section above, the choice of which rules to include will depend on the properties of the phenomena being modelled.
The final step required for locating the meeting point between information as code and information as correlation is as follows. Contemporary approaches to relevance and other substructural logics understand the points of evaluation (impossible situations) to be information states. There is certainly no constraint on information that it be complete or consistent, so the expressibility of impossible situations is not sacrificed. Such an informational reading (Paoli 2002; Restall 2000; Mares 2004) lends itself to multiple applications of various substructural frameworks, and also does away with the ontological baggage brought by questions like “what are impossible situations?” in the “What are possible worlds?” spirit. An information-state reading of \(Rxyz\) will be something like
the result of combining the information carried by \(x\) and \(y\) generates the information carried by \(z\).
Making this explicit results in \(Rxyz\) being written down as \(x \bullet y \sqsubseteq z\), in which case (15) is, via (16), equivalent to (9).
An important structural rule for the composition operation on information channels, that is, on information states that carry information of conditional types, is that it is associative. What this means is that:
\[\tag{17} z \stackrel{x \bullet (y \bullet v)}{\longmapsto} w = z \stackrel{(x \bullet y) \bullet v}{\longmapsto} w. \] Where \(z \Vdash A\) and \(w \Vdash D\), this will be the case for all \(x, y, v\) s.t. \(x \Vdash A \rightarrow B\), \(y \Vdash B \rightarrow C\), \(v \Vdash C \rightarrow D\). This is just the first step required to demonstrate that channel theory, and its underlying substructural logic, form a category.
Category theory is an extremely powerful tool in its own right. For a thorough introduction see Awodey (2006). For more work on the relationship between various substructural logics and channel theory, see Restall (1994a, 1997, 2006). Further category-theoretic work on information flow may be found in Goguen (2004—see Other Internet Resources). Recent important work on category-theoretic frameworks for information flow that extend to quantifiable/probabilistic frameworks is due to Seligman (2009). Perhaps the most in-depth treatment of information flow in category-theoretic terms is to be found in the work of Samson Abramsky, and an excellent overview may be found in his “Information, Processes, and Games” (2008). Recent work on the intersection between information as code and information as correlation uses substructural logics (relevance and linear logics in particular) to model logical proofs as information sources themselves. A proof is a source of information par excellence, and the contributions in the area by Mares (2016) are vital.
Excitingly, there has been a recent surge in the development of information logics that combine the flexibility of categorial information theory with the subject matter of dynamic epistemic logics in order to design substructural epistemic logics. Sedlar (2015) combines the modal epistemic logics of implicit knowledge and belief with substructural logics in order to capture the availability of evidence for the agent. Aucher (2015, 2014) redefines dynamic epistemic logic as a substructural logic corresponding to the Lambek Calculi of categorial information theory. Aucher also shows that the semantics for DEL can be understood as providing a conceptual foundation for the semantics of substructural logics in general. See Hjortland and Roy (2016) for an extension of Aucher’s approach to soft information.
In general, information logic approaches to dynamic epistemic phenomena that combine the DEL of section 1.2 and the substructural logics of section 3.2 above have grown considerably in popularity. See for example Aucher (2016, 2014), Tedder and Bilková (forthcoming), Tedder (2021, 2017), Sedlár, Punčochář, and Tedder (2023), Punčochář and Sedlár (2021), and Sedlár (2021, 2019).
Other logical frameworks that model information as code and range along with information about encoding have been developed by Velázquez-Quesada (2009), Liu (2009), Jago (2006), and others. The key element in all of these approaches is the introduction of some syntactic code to the conceptual architecture of the information as range approach.
Taking Velázquez-Quesada (2009) as a working example, start with a modal-access model \(M =\langle S, R, V, Y, Z\rangle\) where \(\langle S, R, V \rangle\) is a Kripke model, \(Y\) is the access set function, assigning to each state a subset of \(I\), and \(Z\) is the rule set function, assigning to each state a set of rules over \(I\) (where \(I\) is the set of formulas of the classical propositional language based on a set of atomic propositions).
A modal-access model is a member of the class of modal-access models \(\mathbf{MA}\) iff it satisfies truth for formulas and truth preservation for rules. \(\mathbf{MA}_k\) models are those \(\mathbf{MA}\) models such that \(R\) is an equivalence relation.
From here, inference is represented as a modal operation adding the rule’s conclusion to the access set at those information states of the agent where the agent can access both the rule and its premises. Where \(Y(x)\) is the access set at \(x\), and \(Z(x)\) is the rule set at \(x\):
Inference on knowledge: Where \(M = \langle S, R, V, Y, Z\rangle \in \mathbf{MA}_k\), and \(\sigma\) is a rule, \(M_k\sigma = \langle S, R, V, Y', Z\rangle\) differs from \(M\) in \(Y'\), given by \(Y'(x) := Y(x) \cup \{\text{conc}(\sigma)\}\) if \(\text{prem}(\sigma) \subseteq Y(x)\) and \(\sigma \in Z(x)\), and by \(Y'(x) := Y(x)\) otherwise.
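The update on the access-set function can be sketched directly; the following is an illustrative encoding (rules as `(premises, conclusion)` pairs, formulas as strings), not Velázquez-Quesada’s official syntax.

```python
# Sketch of the "inference on knowledge" update on a modal-access model.
def infer_on_knowledge(Y, Z, sigma):
    """Return the updated access-set function Y': at each state x where
    the agent can access rule sigma and all of its premises, the rule's
    conclusion is added to the access set; elsewhere Y is unchanged."""
    prem, conc = sigma
    Y2 = {}
    for x in Y:
        if sigma in Z[x] and set(prem) <= Y[x]:
            Y2[x] = Y[x] | {conc}
        else:
            Y2[x] = set(Y[x])
    return Y2

# Modus ponens as a rule, available at both states s and t; the premises
# are accessible only at s, so only s gains the conclusion q.
sigma = (("p", "p->q"), "q")
Y = {"s": {"p", "p->q"}, "t": {"p"}}
Z = {"s": {sigma}, "t": {sigma}}
print(infer_on_knowledge(Y, Z, sigma))
```

Note that the update is purely syntactic: it adds a formula to an access set rather than eliminating worlds, which is what distinguishes these models from standard epistemic update.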
The dynamic logic for inference on knowledge then incorporates the ability to represent “there is a knowledge inference with \(\sigma\) after which \(\phi\) holds” (Velázquez-Quesada 2009). It is in just this sense that such modal information theoretical approaches model the outputs of inferential processes, as opposed to the properties of the inferential processes that generate such outputs (see the section on categorial information theory for models of such dynamic properties).
Jago (2009) proposes a rather different approach based upon the elimination of worlds considered possible by the agent as the agent reasons deductively. Such epistemic (doxastic) possibilities structure an epistemic (doxastic) space under bounded rationality. The connection with information as code is that the modal space is individuated syntactically, with the worlds corresponding to possible results of step-wise rule-governed inferences. The connection with information as range is that the rules that the agent does or does not have access to will impact upon the range of discrimination for the agent. For example, if the agent’s epistemic base contains two worlds, a \(\neg \phi\) world and a \(\phi \vee \psi\) world say, then the agent can refine their epistemic base only if they have access to the disjunctive syllogism rule.
A subtle but important contribution of Jago’s is the following: the modal space in question will contain only those epistemic options which are not obviously impossible. However, what is or is not obviously impossible will vary both from agent to agent, and for a single agent over time as that agent refines its logical acumen. This being the case, the modal space in question has fuzzy boundaries.
There is a varied list of special topics pertaining to the logical approach to information. This section briefly illustrates just a couple of them, which are important regardless of the particular stance one takes (information as range, as correlation, as code). The first topic is the issue of informational equivalence: when are two structures in the logical approach one is using indistinguishable in terms of the information they are meant to encode, convey, or carry? And, when should two pieces of information be taken as equivalent or not? The answers to this last question touch on the issue of how information (or information carriers, or information supporters) can be combined or structured. This, in turn, has an impact on the properties logical connectives are expected to have. The second topic in this section focuses on one of the connectives. Namely, it concerns the various ways in which the idea of negative information can be understood conceptually, and properly dealt with formally.
Every logical approach to information comes with its own kind of information structures. Depending on the particular stance and the aspect of information to be stressed, these structures may stand for informational states, structured syntactic representations, pieces of information understood as commodities, or global structures made up from local interrelated informational states or stages. Under which conditions can two informational structures be considered to be informationally equivalent?
Addressing this question brings out the need to be clear about the level of granularity at which one is testing for equivalence. The classical extensional notion of logical equivalence is too coarse, in that informationally different claims such as 2 is even and 2 is prime cannot be distinguished, as their extensions will coincide. Equivalence given by identity at the level of representations (say syntactic equality) is, on the contrary, too fine-grained in some cases: to a bilingual speaker, the information that the shop is closed would be equally conveyed by a sign saying “Closed” as by a sign saying “Geschlossen”, even if the two words are different.
An intermediate notion of equivalence that has proved central to the range, correlational, and code views on information is the relation of bisimulation between structures. A bisimulation between two graphs \(G\) and \(H\) (where both the arrows and nodes of the graphs are labelled) is a binary relation \(R\) between the nodes of the graphs with the property that whenever a node \(g\) of \(G\) is related to a node \(h\) of \(H\), then \(g\) and \(h\) carry the same labels, every arrow leaving \(g\) is matched by an arrow with the same label leaving \(h\) whose target is again \(R\)-related to the target of the first, and, conversely, every arrow leaving \(h\) is matched in the same way by an arrow leaving \(g\).
A simple example would be the relation between the following two graphs (with an empty set of labels) that relates the point \(x\) with \(a\) and the point \(y\) with the points \(b, c, d\).
\[\genfrac{}{}{0}{1}{x \longrightarrow y}{\phantom{x \longrightarrow}\circlearrowright} \qquad \text{ and } \qquad \genfrac{}{}{0}{1}{a \longrightarrow b \longrightarrow c \longrightarrow d}{\phantom{a \longrightarrow b \longrightarrow c \longrightarrow} \circlearrowright} \] Bisimulation is naturally a central notion for the information-as-range perspective because the Kripke models of section 1 are precisely labelled graphs. It is a classical result of modal logic that if two states of two models are related by a bisimulation, then the states will satisfy exactly the same modal formulas, and in addition a first-order property of states is definable in the basic modal language if and only if the property is preserved under bisimulation.
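The greatest bisimulation between two finite graphs can be computed by starting from all pairs of nodes and iteratively pruning those that fail the back-and-forth matching conditions. The following sketch (illustrative; unlabelled graphs given as successor maps) verifies that \(x\) and \(a\) in the example above are indeed related.

```python
def greatest_bisimulation(edges_g, edges_h):
    """Naive greatest-fixpoint computation of the largest bisimulation
    between two unlabelled graphs, given as successor dictionaries."""
    rel = {(g, h) for g in edges_g for h in edges_h}
    changed = True
    while changed:
        changed = False
        for (g, h) in list(rel):
            # forth: every g-successor is matched by some h-successor
            forth = all(any((g2, h2) in rel for h2 in edges_h[h])
                        for g2 in edges_g[g])
            # back: every h-successor is matched by some g-successor
            back = all(any((g2, h2) in rel for g2 in edges_g[g])
                       for h2 in edges_h[h])
            if not (forth and back):
                rel.discard((g, h))
                changed = True
    return rel

# The two graphs from the text: x -> y with a loop on y,
# and a -> b -> c -> d with a loop on d.
G = {"x": ["y"], "y": ["y"]}
H = {"a": ["b"], "b": ["c"], "c": ["d"], "d": ["d"]}
bisim = greatest_bisimulation(G, H)
print(("x", "a") in bisim)  # True: the initial states are bisimilar
```

Since these graphs carry no labels, every node with a successor behaves alike, so the computed relation also contains the pairs \((y, b), (y, c), (y, d)\) mentioned in the text.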
As for the correlational stance, in situation theory bisimulation turns out to be the right notion for determining whether two infons that might look structurally different are actually the same as pieces of information. For example, one possible analysis of Liar-like claims leads to infons that are nested in themselves, such as
\[ \sigma = \llangle \text{True}, \text{what} : \sigma , 0\rrangle. \] One can naturally depict the structure of \(\sigma\) as a labelled graph, which will be bisimilar to the graph associated with the apparently different infon
\[ \psi = \llangle \text{True}, \text{what} : \llangle \text{True}, \text{what} : \psi , 0\rrangle , 0\rrangle. \] The notion of bisimulation appeared independently in computer science, so it is no surprise that it also features in matters related to the information-as-code approach, with its focus on representation and computation. In particular, several versions of bisimulation have been applied to classes of automata to determine when two of them are behaviourally equivalent, and data encodings such as
\[ L =\langle 0, L\rangle \text{ and } L = \langle 0, \langle 0, L\rangle \rangle, \] both of which represent the same object (an infinite list of zeroes), can be identified as such by noticing that the graphs depicting the structure of these two expressions are bisimilar. See Aczel (1988), Barwise and Moss (1996), and Moss (2009) for more about bisimulation and circularity, connections with modal logic, data structures, and coalgebras.
But there is much more to be said about informational equivalence and the right level of granularity. To reiterate, the themes highlighted by the various stances on information (partiality, aboutness, encoding, range, dynamics, agency) pose many challenges. For another example: ‘3 is prime’ and ‘the sum of the angles of a triangle is 180 degrees’ are logically equivalent in the standard sense, as they are both mathematical truths. But they should not always be taken to be informationally equivalent. First, they are about different topics. Second, an agent might know that 3 is prime, and yet not know that 180 is the sum of the angles of a triangle, due to having only partial knowledge about triangles. Third, even if the agent had enough current knowledge to eventually infer that the sum of the angles of a triangle is 180 degrees, the inference might be hard for this agent, so being told that the sum of the angles is 180 would be informative in a way that being told that 3 is prime would not.
Information, just like content, meaning, knowledge, belief, and many agent attitudes (seeing that, suspecting that…), exhibits hyperintensional properties. There is an active line of research that studies how formal systems can capture these phenomena (see the entry on hyperintensionality). Here, we just note that the formal approaches to hyperintensionality most closely related to this entry follow some of these strategies:
This entry has focused mostly on positive information. Formally speaking, negative information is simply the extension-via-negation of the positive fragment of any logic built around information-states. Different negation-types will constrain the behaviour of negative information in various ways. Informally, negative information may be thought of variously as what is canonically expressed with sentential negation, process exclusion (both propositional and sub-propositional), and more. Even when we restrict ourselves to a single conceptual notion, there may be vigorous philosophical debate as to which formal construction best captures the notion in question. In this section, we run through several formal analyses of negative information, we examine some of the philosophical debates surrounding the suitability of various formal constructions with respect to particular applications, and we examine the related topic of failure of information flow in the situation-theoretic sense, which may give rise to misinformation or lack of information in particular settings.
Non-constructive intuitionistic negation is aimed at accounting for negative information in the context of information flow via observation. For more details on this point, see the subsection on intuitionistic logics and Beth and Kripke models in the supplementary document: Abstract Approaches to Information Structure.
Working with the frames from section 3.1, non-constructive intuitionistic negation is defined in terms of the constructive implication, (21), combined with bottom, \(\mathbf{0}\), which holds nowhere, as specified by its frame condition:
\[\tag{18} x \Vdash \mathbf{0} \text{ for no } x \in \mathbf{F} \]

Hence intuitionistic negation is defined as follows:
\[\tag{19} -A := A \rightarrow \mathbf{0} \]

The frame condition for \(-A\) is then as follows:
\[\tag{20} x \Vdash -A\ [A \rightarrow \mathbf{0}] \text{ iff for all } y \in \mathbf{F} \text{ s.t. } x \sqsubseteq y, \text{ if } y \Vdash A \text{ then } y \Vdash \mathbf{0} \]

(20) states that if \(x\) carries the information that \(-A\), then there is no state \(y\) such that \(y\) is an informational development of \(x\) and \(y\) carries the information that \(A\).
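Frame condition (20) can be computed directly on a toy frame (a sketch of ours: the three states, the order, and the valuation below are all invented for illustration). Since \(\mathbf{0}\) holds nowhere, \(x \Vdash -A\) comes down to: no informational development of \(x\) verifies \(A\).

```python
# A tiny frame F = {x, y, z} with a reflexive order ⊑ in which
# y and z are both informational developments of x.
states = ['x', 'y', 'z']
order = {('x', 'x'), ('y', 'y'), ('z', 'z'), ('x', 'y'), ('x', 'z')}

# Persistent valuation: A is verified at y and nowhere else.
forces_A = {'y'}

def forces_neg_A(s):
    # s ⊩ -A iff no development t ⊒ s verifies A (frame condition (20)),
    # since bottom holds at no state.
    return all(t not in forces_A for (u, t) in order if u == s)

print({s: forces_neg_A(s) for s in states})
# {'x': False, 'y': False, 'z': True}
```

The sketch also displays the asymmetry discussed next: checking \(A\) at a state inspects that state alone, while checking \(-A\) at \(x\) quantifies over every development of \(x\).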
The definition of \(-A\) in terms of \(A \rightarrow \mathbf{0}\) throws up an asymmetry between positive and negative information. In an information model \(-A\) holds at \(x \in \mathbf{F}\) iff \(A\) does not hold at any \(y \in \mathbf{F}\) such that \(x \sqsubseteq y\). Whilst the verification of \(A\) at \(x \in \mathbf{F}\) only involves checking \(x\), verifying \(-A\) at \(x \in \mathbf{F}\) involves checking all \(y \in \mathbf{F}\) such that \(x \sqsubseteq y\). According to Gurevich (1977) and Wansing (1993), this asymmetry means that intuitionistic logic does not provide an adequate treatment of negative information, since, unlike the verification of \(A\), there is no way of verifying \(-A\) “on the spot”, so to speak. Gurevich and Wansing’s objection to this asymmetry is a critical response to Grzegorczyk (1964). For arguments in support of Grzegorczyk’s asymmetry between positive and negative information, see Sequoiah-Grayson (2009). A fully constructive negation that allows for falsification “on the spot” is also known as Nelson negation, on account of its being embedded within Nelson’s constructive systems (Nelson 1949, 1959). For a contemporary development of these constructive systems, see section 2.4.1 of Wansing (1993).
In a static logic setting, negation is, at the very least, used to rule out truth (if not to express explicit falsity). In a dynamic setting, negation will be used to rule out particular processes. For a development of negative information as process exclusion in the context of categorial information theory, see Sequoiah-Grayson (2013). This idea has its origins in the Dynamic Predicate Logic of Groenendijk and Stokhof (1991), in particular with their development of negative information via negation as test-failure. For an exploration of the relationship between the conceptions of negative information as process exclusion and as test-failure, see Sequoiah-Grayson (2010).
In any logic for negation as process-exclusion, the process-exclusion will be non-directional if the logic in question is commutative. Directional process-exclusion will result when we remove the structural rule of commutation. For a discussion of the relationship between the formalisation of directional process exclusion as commutation-failure and symmetry-failure on compatibility and incompatibility relations on information states, see Sequoiah-Grayson (2011). For an extended discussion of negative information in the context of categorial grammars, see Buszkowski (1995).
Wansing (2016) uses the informational interpretation of substructural logics to launch a thorough investigation of the issues surrounding negative information outlined above. Wansing’s conclusion is that the symmetry between positive and negative information survives all existent arguments to the contrary. At the time of writing, this debate is lively and ongoing.

There is a bi-directional relation between logic and information. On the one hand, information underlies the intuitive understanding of standard logical notions such as inference (which may be thought of as the process that turns implicit information into explicit information) and computation. On the other hand, logic provides a formal framework for the study of information itself.
The logical study of information focuses on some of the most fundamental qualitative aspects of information. Different stances on information naturally highlight some of these aspects more than others. Thus, the information-as-range stance most naturally highlights agency and the dynamics of information in settings with multiple agents that can interact with each other. The aboutness of information (information is always about something) is a central theme in the information-as-correlation stance. The topic of encoding information and its processing (as in the case of formal inference) is at the core of the information-as-code stance. None of these qualitative aspects of information is exclusive to just one of the stances, even if some stress certain topics more than others. Some themes, such as the structure of information and its relation with information content, are equally pertinent regardless of the stance. The ways in which information is studied in this entry differ from other important formal frameworks that study information quantitatively. For example, Shannon’s statistical theory of information is concerned with things such as optimizing the amount of data that can be transmitted via a noisy channel, and Kolmogorov complexity theory quantifies the informational complexity of a string as the length of the shortest program that outputs it when executed by a fixed universal Turing machine.
The logical analysis of information includes fruitful reinterpretations of known logical systems (such as epistemic logic or relevance logic), and new systems that result from attempts to capture further aspects of information. Still other logical approaches to the analysis of information result from combining aspects of two different stances, as with the constraint systems of section 4. New frameworks (situation theory in the 80s) have also resulted from exploring from scratch what sort of inferences — including those that are novel and non-classical — one should allow in order to model certain aspects of information.
Looking for interfaces between the three stances is still a nascent direction of inquiry, discussed here in section 4. A complementary issue is whether the stances can be unified. There are several formal frameworks that, beyond serving as potential settings for exploring the issue of unification, are abstract mathematical theories of information in their own right. Each of these goes well beyond the scope of this entry:
The logical study of information resembles in spirit other more traditional endeavours, such as the logical study of the concept of truth or computation: in all these cases the object of logical study plays a central role in the intuitive understanding of logic itself. The three perspectives on qualitative information presented in this entry (ranges, correlations, and code) portray the diverse state of the art in this field, where many directions of research are open, both as a way of searching for unifying or interfacing settings for the different stances, and of deepening the understanding of the main qualitative features of information (dynamics, aboutness, encoding, interaction, etc.) within each stance itself.
Interested readers may wish to pursue the topics in the supplementarydocument
Abstract Approaches to Information Structure
which covers the topics intuitionistic logic, Beth and Kripke models, and algebraic and other approaches to modal information theory and related areas.
common knowledge | information | information: semantic conceptions of | logic: dynamic epistemic | propositions: structured | set theory: non-wellfounded | situations: in natural language semantics | states of affairs
The authors would like to extend their thanks to the Editors of the Stanford Encyclopedia of Philosophy, as well as to Johan van Benthem, Olivier Roy, and Eric Pacuit. Their assistance and advice has been invaluable.
The Stanford Encyclopedia of Philosophy is copyright © 2023 by The Metaphysics Research Lab, Department of Philosophy, Stanford University
Library of Congress Catalog Data: ISSN 1095-5054