How does the human eye compare to modern cameras and lenses?

by mattdm   Last Updated November 29, 2017 04:18 AM

A goal in most photography is to present a scene which resembles what a person who had been there at that moment would have seen. Even when intentionally working outside of that, human vision is the de facto baseline.

So, it seems useful to know something about how the eye compares to our camera technology. Leaving aside issues of psychology, pattern recognition, and color perception as much as possible (because that's a separate question!), how does the human eye compare to a modern camera and lens?

What's the effective resolution? Field of view? Maximum (and minimum) aperture? ISO equivalence? Dynamic range? Do we have anything that is equivalent to shutter speed?

What structures are directly analogous to parts of a camera and lens (the pupil and iris, say), and what features are uniquely human (or found in cameras but not biology)?



4 Answers


(With much help from the Wikipedia article)

Our eyes are a two-lens system: the outer surface of the eye (the cornea) does most of the bending of light, and a second lens just inside the eye fine-tunes focus. The eye has a roughly fixed focal length of about 22-24 mm. We have significantly higher resolution near the center of the visual field than at the edges. The resolution varies significantly with where in the image you are looking, but it is around 1.2 arcminutes per line pair in the central region. We have about 6-7 million cone sensors, so roughly 6-7 "megapixels", but they are laid out quite differently from a camera sensor: the pattern of color detectors is far from uniform, with different color-detection capabilities in the center than in peripheral vision. The field of view is about 90 degrees from the center.
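As a rough back-of-the-envelope sketch, here is what it would take for a camera to match that central acuity across the whole field of view. The 120-degree square field and the per-pixel figure derived from 1.2 arcmin/line pair are illustrative assumptions, not measurements:

```python
# Rough estimate: how many "pixels" would a camera need to match
# foveal acuity everywhere? Assumptions (illustrative, not measured):
# - 1.2 arcmin per line pair, i.e. 0.6 arcmin per pixel
# - a square 120-degree field of view
ARCMIN_PER_PIXEL = 1.2 / 2  # one line pair spans two pixels
FIELD_DEG = 120             # assumed square field of view

pixels_per_side = FIELD_DEG * 60 / ARCMIN_PER_PIXEL
total_megapixels = pixels_per_side ** 2 / 1e6

print(f"{pixels_per_side:.0f} px per side, ~{total_megapixels:.0f} MP")
```

The point of the exercise is only that uniform foveal-quality resolution would require a sensor far denser than the eye's actual 6-7 million cones; the eye gets away with much less because only the center is sharp.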

One interesting point is that the human eye never forms a complete "snapshot"; it is more of a continuous system. This can be very hard to notice, because our brains are very good at correcting for it, but our visual system takes a "leaky bucket" approach to photography, somewhat (though not exactly) like a digital camcorder.

A "normal" lens is usually chosen to approximate this primary area of human focus, which is why its field of view feels natural compared to wider or longer lenses.

Cameras have different kinds of sensors, but their photosites are usually spread quite uniformly across the sensor. The sensor is always flat (the human "sensor", the retina, is curved), which can lead to edge distortions. Camera resolution is difficult to express in the same terms used for human vision, and depends somewhat on the lens, but it can safely be said that the human eye has more resolution at the center of its focus, and less in the peripheral areas.

PearsonArtPhoto
January 31, 2011 14:52 PM

Pixiq has a very interesting article on the subject, just released a few days ago: http://web.archive.org/web/20130102112517/http://www.pixiq.com/article/eyes-vs-cameras

They talk about ISO equivalence, focusing, aperture, shutter speed, and so on. It's open to discussion, but it's still interesting to read.

The eye itself is a good piece of tech, but the brain does much of the work in assembling the pieces. For example, we can perceive a very large dynamic range, but this is mainly because the brain assembles the different regions together without our realizing it. The same goes for resolution: the eye resolves well in the center but really under-performs everywhere else, and the brain fills in the details for us. The same again for color: we only perceive color near the center, but the brain fools us by caching color information as things move out of the central field.

decasteljau
January 31, 2011 15:50 PM

The human eye really sucks compared to modern camera lenses.

The human visual system, on the other hand, far surpasses any modern camera system (lens, sensor, firmware).

  • The human eye is only sharp in the centre. In fact, it's only sharp in one very, very tiny spot known as the fovea, whose diameter covers less than one percent of our total angle of view. So we have some serious corner softness going on.

    The human brain is able to correct for this, however. It instructs the eye to make very rapid movements all around a scene, so that the sharp spot in the middle darts around. The brain then performs some pretty awesome in-body image stabilisation: it takes all these rapid movements and stitches them together into one sharp scene - well, at least all the bits the eye landed on while darting around will be sharp.

  • The human eye is quite sensitive to light, but at low light levels no colour information is available. In addition to this, the sharp part in the centre (the fovea) is less sensitive to light.

    Technically it's because the eye has separate photosites called cones for the three colours (red, green, blue), and another different type of photosite called rods that only captures black and white, but is much more efficient.

    The brain stitches all these together to create an excellent full colour image during the day, but even when it's really, really dark it comes up with a soft, colourless image made by all the rods.

  • The eye only has one lens element and it produces terrible chromatic aberration in the form of purple fringing.

    Actually, this fringe is all in the very short wavelengths of light. The human visual system is least sensitive to these blues and violets. In addition to this, it's able to correct for that fringing that does exist in a few ways. First, because the human vision system is only sharp in the middle, and that's where there is the least chromatic aberration. And secondly, because our colour resolution is (outside the fovea) much lower than our brightness resolution, and the brain doesn't tend to use blue when figuring out brightness.

  • We can see in three dimensions. This is partly because we have two eyes, and the brain can do amazing calculations relating to convergence between them. But it's also more advanced than that; as well as the "3D effect" you get from stereo vision, the brain can also reconstruct scenes in three dimensions even when looking at a two-dimensional photo of the scene. It's because it understands cues such as occlusion, shadows, perspective and size clues and uses all these to put together the scene as a 3D space. When we look at a photo of a long hallway we can see that the hallway extends away from us even though we don't have stereo vision, because the brain understands perspective.

thomasrutter
February 01, 2011 11:48 AM

Let me throw a question back at you: What is the bitrate and bit depth of a vinyl record?

Cameras are devices designed to, as faithfully as possible, reproduce the image that is projected onto their CCD. A human eye is an evolved device whose purpose is simply to enhance survival. It is quite complex and often behaves counter-intuitively. The two share very few similarities:

  • An optical structure for focusing light
  • A receptive membrane to detect projected light

The photoreceptors of the retina

The eye itself is not remarkable. We have millions of photoreceptors, but they provide redundant (and ambiguous at the same time!) inputs to our brain. The rod photoreceptors are highly sensitive to light (especially on the blueish side of the spectrum), and can detect a single photon. In darkness, they operate quite well in a mode called scotopic vision. As it gets brighter, such as during twilight, the cone cells begin to wake up. Cone cells require around 100 photons at minimum to detect light. At this brightness, both rod cells and cone cells are active, in a mode called mesopic vision. Rod cells provide a small amount of color information at this time. As it gets brighter, rod cells saturate, and can no longer function as light detectors. This is called photopic vision, and only cone cells will function.

Biological materials are surprisingly reflective. If nothing was done, light that passes through our photoreceptors and hits the back of the eye would reflect at an angle, creating a distorted image. This is solved by the final layer of cells in the retina which absorb light using melanin. In animals that require great night vision, this layer is intentionally reflective, so photons which miss photoreceptors have a chance to hit them on their way back. This is why cats have reflective retinas!

Another difference between a camera and the eye is where the sensors are located. In a camera, they are located immediately in the path of light. In the eye, everything is backwards: the retinal circuitry sits between the light and the photoreceptors, so photons must pass through a layer of all sorts of cells, and blood vessels, before finally hitting a rod or cone. This can distort light slightly. Luckily, our eyes automatically calibrate themselves, so we're not stuck staring at a world with bright red blood vessels crisscrossing back and forth!

The center of the eye is where all the high-resolution reception takes place, with the periphery progressively getting less and less sensitive to detail and more and more colorblind (though more sensitive to small amounts of light and movement). Our brain deals with this by rapidly moving our eyes around in a very sophisticated pattern to allow us to get the maximum detail from the world. A camera is actually similar, but rather than using a muscle, it samples each CCD receptor in turn in a rapid scan pattern. This scan is far, far faster than our saccadic movement, but it is also limited to only one pixel at a time. The human eye is slower (and the scanning is not progressive and exhaustive), but it can take in a lot more at once.

Preprocessing done in the retina

The retina itself actually does quite a lot of preprocessing. The physical layout of the cells is designed to process and extract the most relevant information.

While each pixel in a camera has a 1:1 mapping to the digital pixel being stored (for a lossless image, at least), the rods and cones in our retina behave differently. A single "pixel" is actually a ring of photoreceptors called a receptive field. To understand this, a basic understanding of the circuitry of the retina is required:

[Image: retinal circuitry]

The main components are the photoreceptors, each of which connects to a single bipolar cell, which in turn connects to a ganglion cell whose axon reaches through the optic nerve to the brain. A ganglion cell receives input from multiple bipolar cells, arranged in a ring called a center-surround receptive field. The center of the ring and the surround of the ring behave as opposites: light activating the center excites the ganglion cell, whereas light activating the surround inhibits it (an on-center, off-surround field). There are also ganglion cells for which this is reversed (off-center, on-surround).

[Image: center-surround receptive fields]

This technique sharply improves edge detection and contrast, sacrificing acuity in the process. However, overlap between receptive fields (a single photoreceptor can act as input to multiple ganglion cells) allows the brain to extrapolate what it is seeing. This means that the information heading to the brain is already highly encoded, to the point where a brain-computer interface connecting directly to the optic nerve is unable to produce anything we can recognize. It is encoded this way because, as others have mentioned, our brain provides amazing post-processing capabilities. Since this isn't directly related to the eye, I won't elaborate much. The basics are that the brain detects individual lines (edges), then their lengths, then their direction of movement, each in successively deeper areas of the cortex, until it is all put together by the ventral stream and the dorsal stream, which process high-resolution color and motion, respectively.
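A center-surround receptive field is commonly modelled as a difference of Gaussians. This sketch (with illustrative kernel sizes and sigma values, not physiological ones) shows the behaviour described above: a strong response at an edge and almost none in uniform regions:

```python
import numpy as np

def dog_kernel(size=9, sigma_center=1.0, sigma_surround=2.0):
    """Difference-of-Gaussians model of an on-center, off-surround
    receptive field. Parameter values are illustrative."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx**2 + yy**2
    center = np.exp(-r2 / (2 * sigma_center**2))
    surround = np.exp(-r2 / (2 * sigma_surround**2))
    center /= center.sum()      # normalize so uniform light
    surround /= surround.sum()  # yields ~zero net response
    return center - surround    # excitatory center, inhibitory surround

# A hard vertical edge: dark on the left, bright on the right.
image = np.zeros((32, 32))
image[:, 16:] = 1.0

k = dog_kernel()
pad = k.shape[0] // 2
padded = np.pad(image, pad, mode="edge")
response = np.zeros_like(image)
for i in range(image.shape[0]):
    for j in range(image.shape[1]):
        response[i, j] = np.sum(padded[i:i + k.shape[0], j:j + k.shape[1]] * k)

# Uniform regions produce ~zero response; the edge produces a strong one.
print(response[16, 2], response[16, 16])
```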

[Image: edge contrast enhancement]

The fovea centralis sits at the center of the retina and, as others have pointed out, is where most of our acuity comes from. It contains only cone cells and, unlike the rest of the retina, does have a 1:1 mapping to what we see: a single cone photoreceptor connects to a single bipolar cell, which connects to a single ganglion cell.

The specs of the eye

The eye is not designed to be a camera, so there is no way to answer many of these questions in a way you may like.

What's the effective resolution?

In a camera, accuracy is fairly uniform: the periphery is just as good as the center, so it makes sense to measure a camera by its absolute resolution. The eye, on the other hand, is not only not rectangular, but different parts of it see with different accuracy. Instead of resolution, eyesight is most often measured as visual acuity (VA). 20/20 VA is average; 20/200 VA makes you legally blind. Another measurement is LogMAR, but it is less common.
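The relationship between Snellen fractions and LogMAR can be sketched in a few lines. MAR is the minimum angle of resolution in arcminutes, the reciprocal of the Snellen fraction:

```python
import math

def snellen_to_logmar(numerator: float, denominator: float) -> float:
    """Convert a Snellen fraction (e.g. 20/20) to LogMAR.
    MAR (minimum angle of resolution, in arcminutes) is the
    reciprocal of the Snellen fraction; LogMAR is log10(MAR)."""
    mar_arcmin = denominator / numerator
    return math.log10(mar_arcmin)

print(snellen_to_logmar(20, 20))   # average vision: LogMAR 0.0
print(snellen_to_logmar(20, 200))  # legal blindness: LogMAR 1.0
```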

Field of view?

When taking into account both eyes, we have a 210 degree horizontal field of view, and a 150 degree vertical field of view. 115 degrees in the horizontal plane are capable of binocular vision. However, only 6 degrees provides us with high-resolution vision.

Maximum (and minimum) aperture?

Typically, the pupil is 4 mm in diameter. Its maximum range is 2 mm (f/8.3) to 8 mm (f/2.1). Unlike a camera, we cannot manually control the aperture to adjust things like exposure. A small ganglion behind the eye, the ciliary ganglion, automatically adjusts the pupil based on ambient light.
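The f-numbers above follow from the usual definition, f-number = focal length / aperture diameter. The effective focal length used here (about 16.7 mm, the value implied by the quoted figures) is an assumption; it varies with accommodation:

```python
# f-number = focal length / aperture diameter.
# 16.7 mm is an assumed effective focal length for the eye,
# back-calculated from the f-numbers quoted above.
FOCAL_LENGTH_MM = 16.7

def eye_f_number(pupil_diameter_mm: float) -> float:
    return FOCAL_LENGTH_MM / pupil_diameter_mm

for d in (2.0, 4.0, 8.0):
    print(f"pupil {d} mm -> f/{eye_f_number(d):.1f}")
```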

ISO equivalence?

You can't directly measure this, as we have two photoreceptor types, each with different sensitivity. At a minimum, we are able to detect a single photon (though that does not guarantee that a photon hitting our retina will hit a rod cell). Additionally, we do not gain anything by staring at something for 10 seconds, so extra exposure means little to us. As a result, ISO is not a good measurement for this purpose.

An in-the-ballpark estimate from astrophotographers seems to be ISO 500-1000, with daylight sensitivity as low as ISO 1. But again, ISO is not a good measurement to apply to the eye.

Dynamic range?

The dynamic range of the eye itself is dynamic, as different factors come into play for scotopic, mesopic, and photopic vision. This seems to be explored well in How does the dynamic range of the human eye compare to that of digital cameras?.

Do we have anything that is equivalent to shutter speed?

The human eye is more like a video camera: it takes in everything continuously, processes it, and sends it to the brain. The closest equivalent it has to shutter speed (or FPS) is the CFF, or Critical Fusion Frequency, also called the flicker fusion rate. This is the transition point at which an intermittent light of increasing temporal frequency blends into a single, steady light. The CFF is higher in our periphery (which is why you can sometimes see the flicker of old fluorescent bulbs only when you look at them indirectly), and it is higher when it is bright. In bright light, our visual system has a CFF of around 60 Hz. In darkness, it can drop as low as 10 Hz.
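Treating the CFF as a rough "frame rate", one can sketch an equivalent per-frame exposure time. This is a loose analogy for comparison with camera shutter speeds, not a physiological measurement:

```python
def cff_equivalent_exposure(cff_hz: float) -> float:
    """Treat the Critical Fusion Frequency as a rough 'frame rate'
    and return the equivalent per-frame exposure time in seconds.
    A loose analogy, not a physiological measurement."""
    return 1.0 / cff_hz

print(cff_equivalent_exposure(60))  # bright light: ~1/60 s
print(cff_equivalent_exposure(10))  # darkness: 1/10 s
```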

This isn't the whole story though, because much of this is caused by visual persistence in the brain. The eye itself has a higher CFF (while I can't find a source right now, I seem to remember it being on the order of magnitude of 100), but our brain blurs things together to decrease processing load and to give us more time to analyze a transient stimulus.

Trying to compare a camera and the eye

Eyes and cameras have completely different purposes, even if they seem to superficially do the same thing. Cameras are intentionally built around assumptions that make certain kinds of measurement easy, whereas no such plan came into play for the evolution of the eye.

Baal-zebub
November 29, 2017 03:28 AM
