A somewhat different logic based approach to information theory was developed by Solomonoff, Kolmogorov, Chaitin and others. It has come to be called computational or algorithmic complexity. The algorithmic approach applies to individual instances (as well as ensembles, if required), but has the problem of being highly nonoperational, despite its basis in computational methods. On the other hand, it is operational enough that its definition of a compressed string involves a certain amount of computational overhead that becomes completely insignificant only in the infinite limit. Nonetheless, it gives a mathematically precise definition of information that has a place in computational and logical studies, and in the foundations of randomness and probability.
Let s be mapped isomorphically onto some binary string Ss (so that s and only s can be recovered from the inverse mapping), then the complexity of s is the length in bits of the shortest self-delimiting computer program that produces Ss, minus any computational overhead required to run the program, i.e. C = length(Ss) - O(1). The first (positive) part of this measure is the algorithmic complexity, or Kolmogorov/Chaitin complexity. The second part of the measure, O(1), is a constant (order of magnitude 1) representing the computational overhead required to produce the string Ss. This is the complexity of the program that computes Ss. It is machine dependent, but can be reduced to an arbitrarily small value, mitigating the machine dependence. It is deducted to define the information because we want a machine independent measure that is directly numerically comparable to Shannon information, permitting a formal identification of algorithmic complexity and combinatorial and probabilistic measures of information (note how the mathematics are guided by the intuitions about information rather than the other way around). The overhead is troublesome, nonetheless. On the original definition:
length(Ss) = min{|p|: p{0,1}* & M(p) = Ss} = min{|p|: p|p| being the length of p, which is a binary string (i.e. p{0,1}* & f(p) = s},
The complexity of a program is itself a matter for algorithmic complexity theory. Since a universal Turing machine can duplicate each program on any other Turing machine, there is a partial recursive function f0 for which the algorithmic complexity is less than or equal to the algorithmic complexity, plus a constant involving the computational overhead of duplicating the particular M, calculated using any other f. This is called the Invariance Theorem, a fundamental result of algorithmic complexity theory. Since there is a clear sense in which f0 is optimal, the Invariance Theorem justifies ignoring the language dependence of length(Ss). String maps of highly complex structures can be computed, in general, with the same computational overhead as those of simple structures (the computational overhead is nearly constant), so for complex structures (large I) the negative component of informational complexity is negligible. Furthermore, in comparisons of algorithmic complexity, the overhead drops out except for a very small part required to make comparisons of complexity (even this drops out in comparisons of comparisons of complexity), so the relative algorithmic complexity is almost a direct measure of the relative informational complexity. Nonetheless, the practical problems are not insignificant (Wallace and Dowe 1999).
Gregory Chaitin (1975, 1987, 1998) has proven a number of central results of metalogic and the philosophy of mathematics within information theory. The basis of these proofs is the algorithmic complexity of a formal system. The limits of what can be calculated in a formal system depend on the complexity of the system itself. Hidden in any formal algorithm are the axioms and rules of inference that determine the behaviour of the system and provide the algorithm for testing computations. The information content of these axioms and rules is the complexity of the formal system, n. In a formal system of complexity n it is impossible to prove that a particular series of binary digits is of complexity greater than n+c, where c is a constant that is independent of the particular system employed, but depends only on the implementation of the system. This is the central theorem of algorithmic logic. The intuition behind it is that you can't get more information out of a formal system than you build in, i.e., the information in the recursive closure of a formal system is no more than in the axioms themselves. Now consider a formal proof-checking procedure for a formal axiomatic system A defined within A. This is equivalent to a computer program that outputs 1 if an expression of A is a theorem, and 0 if not. If this program exists, it will have a finite number of bits n. Now take an enumeration of the finite sequences of expressions A, and assume that each place in the enumeration is 1 if the sequence is a proof, and 0 otherwise. This gives a real number that the proof-checker compresses, if it exists. The proof checker has complexity of n+c, if it exists, so the enumeration can have at most a complexity of n+c. Undecidability is shown by showing that the enumeration must have complexity greater than n+c, via the unsovability of the halting problem (formal systems for which there is no halting problem, e.g., those lacking recursion, are complete). Various proofs are possible, but they all show that decidability would require that there are no programs of complexity greater than n+c that do not halt, but the halting problem ensures that there are. The proof that there are theorems of arithmetic that are not provable involves the existence of programs of complexity greater than n+c for any n. For each formal system A strong enough for arithmetic, there will be programs of complexity greater than n+c that do not halt, but cannot be proven not to halt within A (they are too complex). For example, a program selected with more than n+c toss's of a fair coin will almost certainly not halt, but is too complex to be shown within A not to halt. Since each of these non-halting programs is a number, and the number determines whether the program halts, there is an arithmetical theorem that cannot be proven within A. The power of a formal axiomatic system can be defined in terms of the size in bits of the shortest program whose halting problem is undecidable within the formal axiomatic system. The power of an axiomatic system can be increased by adding new independent axioms. Which ones to add, as in the other sciences, is largely a pragmatic matter. Though some axioms will lead to more powerful theories than others, and might be favoured on these grounds alone, tractability and non-trivality are unavoidable issues.