The nature of information is specified by information theory, which began when Shannon and Weaver defined information in terms of binary choices between physical states [1] (Shannon & Weaver, 1949). One choice from two states is one bit of information, two choices are two bits, and so on. Eight bits, or a byte, is eight choices, so eight electronic on/off switches can store one byte. Eight switches have 256 possible combinations, so one byte can store one text character in an eight-bit encoding. All our kilobytes and megabytes are based on choices between physical states.
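The arithmetic above can be checked directly; a minimal sketch in Python:

```python
import math

# One choice between two states carries one bit of information.
assert math.log2(2) == 1

# Eight on/off switches give 2**8 possible combinations,
# so one byte (eight bits) can index 256 distinct values,
# enough for one character in an eight-bit encoding.
combinations = 2 ** 8
print(combinations)             # states one byte can take: 256
print(math.log2(combinations))  # bits needed to choose among them: 8.0
```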
If only physical states exist, then all information depends on them, so there is no software without hardware. Information stored as choices between matter states needs matter to exist, so to think that information can also create matter is like thinking a daughter can give birth to her mother.
A choice of one state, or no choice at all, is zero bits, so anything fixed one way contains no information. It follows that a physical book contains no information in itself, because it only exists one way. This seems wrong but it isn't, as hieroglyphics that no one can read do indeed contain no information. Symbols only contain information if we can read them, which requires a decoding context. For example, the text you are reading now requires the decoding context of the English language, so if you don't know English, you get no information. If you change the decoding context, say by reading only every 10th letter, you get different information from the same text.
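The "every 10th letter" point can be illustrated with a short, hypothetical sketch: the same physical symbols yield different information under different decoding contexts.

```python
# The same fixed text, read under two decoding contexts.
text = "the cat sat on the mat and then it had a very long nap"

# Context 1: ordinary English reading, character by character.
as_written = text

# Context 2: read only every 10th letter of the same text.
every_tenth = text[::10]

print(as_written)
print(every_tenth)  # a different "message" from the same symbols
```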
Information theory defines the decoding context of a physical signal as the number of physical states it was chosen from. This number defines the amount of information sent, so one electronic pulse sent down a wire can represent one bit, one byte (as the ASCII character "1"), or many bytes (as the first word of a dictionary). The amount of information in a signal depends not only on the signal itself, but also on its decoding context. If it weren't so, data compression couldn't store the same data in a smaller signal, but it can, by better decoding. In general, the information in a physical signal is undefined until its decoding context is known.
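The compression point can be demonstrated with Python's standard zlib module: a repetitive signal, whose choices are largely predictable, can be stored in far fewer bytes and recovered exactly, given the right decoding context.

```python
import zlib

# A repetitive signal: most of its "choices" are predictable.
original = b"state " * 1000            # 6000 bytes
compressed = zlib.compress(original)   # a much smaller signal

print(len(original), len(compressed))

# The decoding context (the DEFLATE algorithm) recovers exactly
# the same data from the smaller signal.
assert zlib.decompress(compressed) == original
```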
The transfer of information between a sender and receiver requires an agreed decoding context, so a receiver can only extract the information a sender put in a signal if they know how to read it.
Given the above definition, processing can be defined as the act of changing information by making new choices. Writing a book is then processing, as it can be written in many ways, and reading a book is also processing, as it can be read in many ways. Processing lets us save data in a physical state and reload it later, given a decoding context. Information is then static, while processing is a dynamic activity.
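The save-and-reload cycle can be sketched with Python's json module, used here as an illustrative decoding context (the data and names are hypothetical): information is fixed as one physical state, a byte string, then recovered by a receiver who shares the UTF-8 and JSON conventions.

```python
import json

# Processing: making choices that fix information in one state.
note = {"title": "draft", "words": 120}

# Save: the information becomes a static byte string.
stored = json.dumps(note).encode("utf-8")

# Reload: the agreed decoding context (UTF-8 + JSON)
# extracts the same choices from the stored state.
reloaded = json.loads(stored.decode("utf-8"))
assert reloaded == note
```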
[1] Mathematically, information I = log2(N), for N choice options.