OK so the relation math is easy enough.
If you know the size of a reference object, say a 12" ruler, you can put the ruler flat against another object like a door and take a picture of both together.
The ruler is 12 in and measures 120 px tall in the image. The door measures 1200 px, which, by simple proportion, works out to 120 in, or 10 ft.
However, that is not how it works: in reality the image has some form of distortion, so a direct proportion will not hold. It has something to do with gathering light and focal points/lengths or something. What do I need to account for to make this relationship work?
What we have so far:

(12 in / 120 px) = (unknownHeight / 1200 px)
(12 in / 120 px) * 1200 px = unknownHeight
0.1 in/px * 1200 px = 120 in, or 10 ft
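In code, the naive proportion is just this (variable names are mine, made up for illustration):

```python
# Naive proportion: assumes zero lens distortion and that the ruler and
# door lie in the same plane, parallel to the sensor.
ruler_in = 12.0      # known ruler height, inches
ruler_px = 120.0     # measured ruler height, pixels
door_px = 1200.0     # measured door height, pixels

inches_per_px = ruler_in / ruler_px   # 0.1 in/px
door_in = inches_per_px * door_px     # 120.0 in
door_ft = door_in / 12.0              # 10.0 ft
```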
But in reality, if the ruler is 12 in and 120 px, the 10 ft door will measure something like 1100 px, so do we need to account for the lens or something?
What am I missing?
Yes: the problem is that lenses distort the image away from the ideal parallel projection you would need for the math to stay that simple.
Fortunately, the math and data to correct lens distortions are readily available. Some post-processing software can identify the lens and its settings from the image metadata and, using lens reference data, apply corrections to remove as much distortion as possible.
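As a rough idea of what those corrections look like, here is a minimal sketch of a one-parameter radial (barrel) distortion model, the first-order term of the common Brown-Conrady model. The `k1` value in the example is made up for illustration; real values come from calibration or the lens reference data mentioned above.

```python
def undistort_radius(r_dist: float, k1: float) -> float:
    """Recover the undistorted radius r from a distorted radius r_dist
    (in normalized image coordinates), given the first-order model
    r_dist = r * (1 + k1 * r**2), by fixed-point iteration."""
    r = r_dist
    for _ in range(5):                 # a few iterations converge for small k1
        r = r_dist / (1.0 + k1 * r * r)
    return r

# With k1 = -0.1 (barrel distortion), a point imaged at radius 0.9
# corrects back to roughly radius 1.0.
corrected = undistort_radius(0.9, -0.1)
```

This is why the door "shrinks" toward the edges of the frame: with negative `k1`, points far from the image center are pulled inward, so tall objects span fewer pixels than the naive proportion predicts.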
If you had a finer grid, you could measure the object using the image of the ruler itself, even if it's seen in a fun-house mirror. If the cm marks are 16 px apart on one end and 8 px apart on the other, it's no sweat as long as you can count N actual marks in the image across the object.
If you don't have the ruler adjacent to the object, but need to measure various features at all angles all over the plane, put 2 rulers at right angles. It's apparent what to do if you use 4 rulers in a rectangle around the object: connect lines across opposite rulers to make a grid, even if the near and far rulers are different sizes in the image; the drawn grid can be used to measure your feature.
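Once the grid is drawn, locating a feature between two grid lines is just linear interpolation. A minimal sketch, with made-up pixel positions (the function and its arguments are my own illustration, not a standard API):

```python
def world_coord(feature_px: float,
                line_a_px: float, line_b_px: float,
                line_a_cm: float, line_b_cm: float) -> float:
    """Interpolate a feature's real-world coordinate from the two nearest
    grid lines. Pixel positions are measured where the lines pass the
    feature, so perspective scale differences are already baked in."""
    t = (feature_px - line_a_px) / (line_b_px - line_a_px)
    return line_a_cm + t * (line_b_cm - line_a_cm)

# A feature at pixel 25, halfway between the 4 cm line (pixel 20) and the
# 5 cm line (pixel 30), sits at 4.5 cm.
pos = world_coord(25.0, 20.0, 30.0, 4.0, 5.0)
```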