It once seemed that light had energy but no mass, and matter had mass but no energy, until light was found to have relativistic mass and matter was found to contain the energy needed for a nuclear bomb. It then became apparent that mass and energy were somehow aspects of the same thing.
Mass was originally defined as weight, later refined to gravitational mass, but Newton’s law that a mass needs a force to accelerate it led to the alternative definition of inertial mass. The two definitions are distinct because a weightless object in space still needs a force to move it, so it has inertial mass even where it has no measurable weight. Momentum is mass times velocity, so a photon with no mass should have no momentum, yet solar sails move when the sun shines on them, and light is bent by the sun’s gravity. It followed that a photon has relativistic mass as it moves, so it has momentum (Note 1).
Light, in contrast, was originally seen as pure energy, which Planck’s equation related to its frequency. Einstein then did for matter what Planck had done for light, namely define its energy. In 1905, he deduced that the energy of matter is its mass times the speed of light squared, or E=mc². This let us build atom bombs, but it has never been clear why the energy of matter relates to light at all. If matter is its own inert substance, why does its energy depend on the speed of light?
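As a rough numerical illustration of the two relations just mentioned, the sketch below evaluates Planck’s equation E = h.f for a photon and Einstein’s E=mc² for an electron at rest. The constants are standard SI and CODATA values. The function names and the 500 nm example photon are assumptions chosen for illustration and do not come from the text.

```python
# Illustrative sketch only: the function names and the 500 nm photon
# are assumptions, not part of the original argument.
H = 6.62607015e-34             # Planck's constant, J*s (exact in SI)
C = 299_792_458.0              # speed of light, m/s (exact in SI)
M_ELECTRON = 9.1093837015e-31  # electron rest mass, kg (CODATA 2018)


def photon_energy(frequency_hz: float) -> float:
    """Planck's equation: energy of a photon, E = h * f."""
    return H * frequency_hz


def rest_energy(mass_kg: float) -> float:
    """Einstein's equation: energy of matter at rest, E = m * c**2."""
    return mass_kg * C ** 2


# Frequency of a 500 nm (green) photon, from f = c / wavelength.
green_light_hz = C / 500e-9

print(f"Photon energy:        {photon_energy(green_light_hz):.3e} J")
print(f"Electron rest energy: {rest_energy(M_ELECTRON):.3e} J")
```

Both results are in joules, so the energy of light and the energy of matter can be compared directly, even though one is computed from a frequency and the other from a mass.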
That matter is made of light, however, suggests why this is so. In an electron, extreme photons run repeatedly in many channels, so the sum of their energy is that of the electron. Each channel runs a photon at the highest frequency, whose energy is given by Planck’s equation. If these photons spread in two dimensions, Einstein’s equation can be derived from their energy (Note 2). Einstein established E=mc² from how our physical world behaves, but a processing model can deduce it: on this view, the energy of matter depends on the speed of light because matter is made of light.
Note 1. Relativistic mass is defined by special relativity. Rest mass is mass with no relativistic effects.
Note 2. Let the speed of light c = LP/TP, for Planck length LP and Planck time TP, and let a photon’s energy be E = h.f, by Planck’s equation. In this model, each electron channel essentially contains an extreme photon at a point with a frequency of 1/TP, so its energy is E = h/TP. Now if Planck’s constant (h) is the transfer of one Planck process P over a Planck length in Planck time, h = P.LP/TP. Substituting gives E = P.LP/(TP.TP) = P.c/TP for the energy of one photon. Planck’s relation then also applies to an electron made of photons, except that instead of h, the total processing is the electron’s mass me, which transfers over a unit sphere surface in time TP, so the electron’s energy is E = me.LP.LP/(TP.TP), which is E = me.c². If all mass arises in the same way, then E = m.c² in general.
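The arithmetic in Note 2 can be checked numerically with the short sketch below. It uses CODATA 2018 values for the Planck length and Planck time, and it only confirms that LP/TP reproduces c and that me.LP.LP/(TP.TP) agrees with me.c². It does not test the model’s assumption about the Planck process P, which has no standard measured value. The variable names are illustrative assumptions.

```python
import math

# Illustrative check of the arithmetic in Note 2; variable names are assumptions.
H = 6.62607015e-34             # Planck's constant h, J*s (exact in SI)
C = 299_792_458.0              # speed of light c, m/s (exact in SI)
PLANCK_LENGTH = 1.616255e-35   # LP, m (CODATA 2018)
PLANCK_TIME = 5.391247e-44     # TP, s (CODATA 2018)
M_ELECTRON = 9.1093837015e-31  # electron mass me, kg (CODATA 2018)

# Step 1: c = LP / TP
c_from_planck_units = PLANCK_LENGTH / PLANCK_TIME

# Step 2: an extreme photon with frequency 1/TP has energy E = h / TP
extreme_photon_energy = H / PLANCK_TIME

# Step 3: E = me.LP.LP/(TP.TP) should match E = me.c^2
electron_energy_planck = M_ELECTRON * PLANCK_LENGTH ** 2 / PLANCK_TIME ** 2
electron_energy_einstein = M_ELECTRON * C ** 2

print(f"LP/TP             = {c_from_planck_units:.6e} m/s")
print(f"h/TP              = {extreme_photon_energy:.4e} J")
print(f"me.LP.LP/(TP.TP)  = {electron_energy_planck:.6e} J")
print(f"me.c^2            = {electron_energy_einstein:.6e} J")
print("c matches LP/TP:", math.isclose(c_from_planck_units, C, rel_tol=1e-5))
```

The two electron energies agree only to the precision of the tabulated Planck units, which is why the comparison uses a relative tolerance rather than exact equality.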