Commentaries on Stanley Klein's Research Articles
A. Applied Vision
82. Silverstein, D.A. & Klein, S.A. (1994). Restoration of compressed images. Image and Video Compression, Majid Rabbani and Robert J. Safranek, Editors, Proc. SPIE 2186, 56-64.
109. Silverstein, D.A. & Klein, S.A. (1996). Precomputing and encoding compressed image enhancement instructions. Submitted to IEEE Image Processing.
These two papers developed a scheme to improve the quality of compressed images without a significant increase in the number of transmitted bits. The core idea is that during
compression one still has access to the original scene or image sequence so several enhancement schemes for restoring the compressed image can be tested and compared. Our trick lies in how to hide
the enhancement information. Manuscript #109 calculates the very minor increase in bits for a very large improvement in fidelity. We have applied for a patent for these ideas with the support of the
UC Technology Transfer office.
86. Klein, S.A. & Silverstein, D.A. (1994). Image compression, fidelity metrics and models of human observer. Proc. of the International Picture Coding Symposium, PCS'94. 96 - 99.
This paper at the International Picture Coding annual meeting presented a brief summary of our group's activity in the realm of image compression. We describe several areas
where a good model of the human visual system could be used to improve image compression algorithms. As a simple example we present the Fourier amplitudes of the DCT basis functions (derived by us
previously) and discuss the connection to the contrast sensitivity function. We also summarize our image enhancement schemes (articles #82 and #109).
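The Fourier amplitudes of a DCT basis function can also be evaluated numerically. The following is a minimal sketch, not the derivation referred to above: it assumes an 8-sample DCT-II basis and samples the spectrum at 64 frequencies, and the function names are hypothetical.

```python
import cmath
import math


def dct_basis(k, n=8):
    """k-th DCT-II basis function sampled at n pixels (unnormalized)."""
    return [math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)]


def fourier_amplitudes(samples, n_freq=64):
    """Fourier amplitude of the samples at n_freq evenly spaced
    frequencies in [0, 1) cycles/pixel (equivalent to zero-padding)."""
    amps = []
    for m in range(n_freq):
        f = m / n_freq
        total = sum(x * cmath.exp(-2j * math.pi * f * i)
                    for i, x in enumerate(samples))
        amps.append(abs(total))
    return amps


def peak_frequency(k, n=8, n_freq=64):
    """Frequency (cycles/pixel, 0 to 0.5) where basis k has its largest amplitude."""
    amps = fourier_amplitudes(dct_basis(k, n), n_freq)
    half = amps[: n_freq // 2 + 1]
    return half.index(max(half)) / n_freq


if __name__ == "__main__":
    for k in range(8):
        # The nominal frequency of basis k is k / (2 * 8) cycles/pixel
        print(k, peak_frequency(k), "nominal", k / 16)
```

Because each basis function spans only a few cycles, its spectrum is broad rather than a single spike, which is why the amplitudes matter when relating DCT coefficients to the contrast sensitivity function.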
102. Carney, T., Klein, S.A. & Hu, Q.J. (1996). Visual masking near spatiotemporal edges. Human Vision and Electronic Imaging. BE Rogowitz & JP Allebach, Eds, Proc. SPIE 2657, 2657-45.
This is the first of our articles based on data from Qingmin Hu's PhD thesis. He measured the visibility of a thin line probe, flashed for one frame in the presence of a
masking bar of variable width that was flashed on for 500 msec. The time of the probe relative to the bar onset was varied over a wide range, with most data gathered for asynchronies between -50 and
+50 msec. This is a combination of Crawford masking and the Westheimer effect. Of special interest is that we used all possible polarities of probe and mask, including cases where only the periphery
contained the mask and the probe region was static. The resulting data is quite rich with some surprising polarity-specific effects whereby the peak masking threshold elevation depends on the
relative polarity and timing of test and mask. We are pursuing these effects both because we have found that they are difficult to predict and because they may be important for compression of image
sequences.
A2. Correcting CRT Nonlinearity
83. Hu, Q.J. & Klein, S.A.(1994). Correcting the adjacent pixel nonlinearity on video monitors. Human Vision, Visual Processing and Digital Display V BE Rogowitz & JP Allebach, Eds, Proc.
SPIE 2179, 37-46.
91. Klein, S.A., Carney, T. & Hu, Q.(1995) Improved lookup table to correct CRT pixel nonlinearity. Human Vision, Visual Processing and Digital Display VI BE Rogowitz & JP Allebach, Eds,
Proc. SPIE 2411, 170 - 179.
104. Klein, S.A., Hu, Q.J. & Carney, T.(1996). The adjacent pixel nonlinearity: Problems and solutions. Vision Res., 36, 3167-3181.
103. Klein, S.A. & Carney, T.(1996). Transducer function for energy (viewprint) models. Society for Informational Display 96 Digest 27, 425-428.
I was shocked to discover that we didn't know the precise shape of the ideal observer's transducer function for detecting a sinusoidal grating when the phase was randomized from trial to trial.
Although this situation is briefly discussed by Green and Swets we weren't aware of anyone actually calculating the transducer shape. We carried out Monte Carlo simulations for the ideal observer
with phase uncertainty and determined the shape of the transducer function. This calculation is quite relevant to the issue of masking in real images since it is often the case that the phase of the error
signal is not known. Thus the d' reduction produced by phase uncertainty should be incorporated into general fidelity metrics. In addition we discussed energy (complex cell) representations where
phase is discarded vs. more standard filter or Fourier (simple cell) representations where phase is preserved. We pointed out the usefulness of the energy approach to fidelity metrics.
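A minimal Monte Carlo sketch of this kind of calculation follows. It is not the simulation from article #103. It assumes a two-interval forced-choice task, white Gaussian noise, a cosine/sine template pair, and a signal of known amplitude whose phase is drawn uniformly on each trial. Under those assumptions, picking the interval with the larger envelope is the maximum-likelihood choice. The function name and trial count are hypothetical.

```python
import math
import random
from statistics import NormalDist


def dprime_phase_uncertain(d, n_trials=20000, seed=0):
    """Effective d' of an envelope (energy) observer in 2AFC when the
    signal phase is random on every trial.

    d is the signal strength in noise-SD units, i.e. the d' that an
    observer who knew the phase would achieve.
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_trials):
        phase = rng.uniform(0.0, 2.0 * math.pi)
        # Cosine- and sine-phase template responses in the signal interval
        c_sig = d * math.cos(phase) + rng.gauss(0.0, 1.0)
        s_sig = d * math.sin(phase) + rng.gauss(0.0, 1.0)
        # The same templates in the noise-only interval
        c_noise = rng.gauss(0.0, 1.0)
        s_noise = rng.gauss(0.0, 1.0)
        # Choose the interval with the larger envelope (phase is discarded)
        if math.hypot(c_sig, s_sig) > math.hypot(c_noise, s_noise):
            correct += 1
    # Keep the proportion strictly inside (0, 1) so inv_cdf is defined
    p = min(max(correct / n_trials, 0.5 / n_trials), 1.0 - 0.5 / n_trials)
    return math.sqrt(2.0) * NormalDist().inv_cdf(p)


if __name__ == "__main__":
    for d in (0.5, 1.0, 2.0, 3.0, 4.0):
        print(d, round(dprime_phase_uncertain(d), 2))
```

Plotting the returned value against d gives the transducer shape under these assumptions: the effective d' falls well below d for weak signals and approaches it for strong ones, so the function accelerates at low signal strength.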
B. Understanding vision in the periphery and in amblyopia
87. Levi, D.M., Klein, S.A. & Wang, H.(1994a). Amblyopic and peripheral Vernier acuity: A test-pedestal approach. Vision Research 34, 3265 - 3292.
88. Levi, D.M., Klein, S.A. & Wang, H.(1994b). Discrimination of position and contrast in amblyopic and peripheral vision. Vision Research 34, 3293 - 3314.
This pair of papers represents our application of the test-pedestal approach to clarifying the amblyopic deficit. #87 uses multipole stimuli (edge, line, dipole and
quadrupole) and #88 uses sinusoidal stimuli. These papers provide the cleanest evidence for the differences between strabismic and anisometropic amblyopia. We find that the anisometropic loss in
vernier acuity can be fully accounted for by the loss in visibility such as would be expected from loss of high spatial frequency mechanisms and/or problems with contrast processing. For strabismic
amblyopes we find an extra loss on the vernier position task beyond what can be explained by a degraded contrast sensitivity function. We have made this point in previous papers, but in this pair of
papers based on the test-pedestal approach the data make the case without requiring additional assumptions.
101. Levi, D.M. and Klein, S.A.(1996). Limitations on position coding imposed by undersampling and univariance. Vision Research, 36, 2111-2120.
This paper is part of a long term debate with Robert Hess on the nature of the amblyopic loss. The present topic is concerned with whether the loss found in strabismic
amblyopia might be due to undersampling. Hess believes the loss is due to scrambling of connections rather than undersampling. Dr. Levi and I believe that both factors contribute. In a 1993 Vision
Research article Hess and Field claimed that the principle of univariance eliminates the possibility that undersampling might be responsible. In article #101 we present several arguments, several
simulations and some data that point out a number of fallacies in the Hess & Field position. The new data discussed in the next article (#108) provides further evidence for our position.
108. Wang, H., Levi, D.M. & Klein, S.A.(1996). Spatial uncertainty and sampling efficiency in amblyopic position acuity. Submitted to Vision Research.
About 5 months ago we submitted a paper to Vision Research that measured the effective number of samples available to anisometropic and strabismic amblyopes. We used the
methodology described in article #100. We found that both anisometropic and strabismic amblyopes had increased levels of intrinsic uncertainty, si, but only strabismic amblyopes had a dramatic loss in the
effective number of samples. This technique provides a powerful way of discriminating between the different factors contributing to the amblyopic loss. We are sorry we didn't have this data available
when we wrote article #101.
C. Evoked Potentials
84. Baseler, H.A., Sutter, E.E., Klein, S.A. & Carney, T.(1994). The topography of visual evoked response properties across the visual field. Electroenceph. &
Clinical Neurophys. 90, 65-81.
96. Klein, S.A. & Carney, T.(1995). The usefulness of the Laplacian in principal component analysis and dipole source localization. Brain Topography, 8, 91 - 108.
These two articles lay the groundwork for my most ambitious research project: the localization of sources of the visual evoked potential (VEP). Over the past forty years there
have been numerous attempts to solve this problem, but once more than two sources are present the source locations become unreliable. Interest in this area has received a tremendous boost because
functional MRI and the growth of cognitive neuroscience have increased attention to this field. The VEP has a thousand-fold better temporal resolution than fMRI and we believe it also has much better
signal to noise properties. We have developed a number of new approaches, detailed in articles #84 and #96, that make us optimistic that we can succeed where previous investigators have failed. Our
most important innovation is to combine the standard multielectrode technique (my lab has a 64 channel setup) with Erich Sutter's multiple stimulus method, in which responses to 56 small visual field areas are
recorded simultaneously. The continuity of adjacent retinotopic areas is used to disambiguate the source solutions. This project is Scott Slotnick's PhD thesis area. In article #96, we demonstrate
how the source localization can be substantially improved by having the search algorithm work on the Laplacian of the electrode voltages rather than on the voltages themselves.
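To illustrate why a Laplacian can sharpen the picture, here is a minimal sketch using the five-point finite-difference Laplacian on a hypothetical rectangular grid of electrode voltages. This is an assumption made for illustration only: real electrode arrays sit on a curved scalp with irregular spacing, and the estimator used in article #96 may differ. The function name and the voltage values are invented.

```python
def grid_laplacian(v):
    """Five-point discrete Laplacian at the interior nodes of a 2-D grid
    of voltages. The result is two rows and two columns smaller than v."""
    rows, cols = len(v), len(v[0])
    out = []
    for r in range(1, rows - 1):
        row = []
        for c in range(1, cols - 1):
            neighbours = v[r - 1][c] + v[r + 1][c] + v[r][c - 1] + v[r][c + 1]
            row.append(neighbours - 4 * v[r][c])
        out.append(row)
    return out


if __name__ == "__main__":
    # A constant offset plus a linear gradient: a broad, slowly varying field
    smooth = [[5 + 2 * r + 3 * c for c in range(5)] for r in range(5)]
    # The same field with a focal bump at the centre electrode
    focal = [row[:] for row in smooth]
    focal[2][2] += 10
    print(grid_laplacian(smooth))  # all zeros
    print(grid_laplacian(focal))   # nonzero only at and next to the bump
```

The constant term, which is what a change of reference electrode adds, and the linear gradient both vanish, while the focal bump survives. A search algorithm working on such values is therefore less affected by broad background fields.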
134. Slotnick, S.D., Klein, S.A., Carney, T., Sutter, E.E. & Dastmalchi, S.D.(1999). Using multi-stimulus VEP source localization to obtain a retinotopic map of
human primary visual cortex. Clinical Neurophysiology. 110, 1793-1800.
Slotnick S.D., Klein S.A., Carney, T. & Sutter, E.(1999) A direct method for estimating the visual space scaling factor of the human visual cortex. Submitted to Nature.
These two articles are part of Scott Slotnick's PhD Dissertation and represent our first step in implementing our new method of isolating the VEP sources. The crowded, rapidly
flickering, high spatial frequency stimulus that we used preferentially excites primary visual cortex, V1. We fit the data with a single dipole source (one source for each of the 60 stimulus patches)
and were able to account for more than 50% of the variance (noise included). The first article goes into all the methodological details, the most important one being our use of a common time function
for nearby sources. The article demonstrates coherent motion of the dipoles as expected from the topography of V1. The second article uses the source localization data to estimate the cortical
magnification factor (mm of cortex per degree of visual field). Two methods were used to estimate magnification. Both estimates are in good agreement with anatomy and psychophysics. This is the
first accurate, direct estimate of the magnification factor in normal individuals.
D. Explaining hyperacuity and using hyperacuity as a probe
81. Klein, S.A.(1993). Fidelity metrics and the test-pedestal approach to spatial vision. Computational Vision Based on Neurobiology, Teri B. Lawton, Editor, Proc. SPIE 2054, 142 - 154.
This paper gives an overview of the approach guiding a substantial portion of my recent research. The idea is fairly simple. The test-pedestal approach claims that under
optimal conditions many discrimination thresholds can be predicted from the detection threshold of the difference between the two patterns to be discriminated. This approach is reminiscent of
Campbell and Robson's finding that a square wave grating can be discriminated from the fundamental sinusoidal component when the third harmonic is at its detection threshold (including dipper
function facilitation). One section of this paper shows how optimal bisection thresholds of one arc sec can be predicted using this approach. This finding calls into question recent theories of
contrast gain control since we find minimal gain control in the presence of a very strong mask.
89. Carney, T., Silverstein, D.A. & Klein, S.A.(1995) Vernier acuity during image rotation and translation: Visual performance limits. Vision Research 35, 1951 - 1964.
About twenty years ago Westheimer and McKee reported that vernier acuity was not severely degraded by image motion when the presentation duration was brief. Since that finding
seemed to be difficult to explain we embarked on a number of experiments. We measured three-dot vernier acuity for linear translation and for circular rotation. We found that with linear motion the
results agreed with Westheimer and McKee. At high velocities we found that the vernier threshold could be expressed as a fixed temporal uncertainty of 1 msec. With circular motion the temporal
uncertainty was about 5 msec. An added feature of our experiments is that we explored the effects of contrast reduction. We gave considerable thought to the question of whether standard mechanisms
could do the job. Our conclusion was that all the data was compatible with the characteristics and capabilities of standard cortical orientation-tuned neurons.
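Expressing a vernier threshold as a fixed temporal uncertainty means that the spatial threshold grows in proportion to velocity. The sketch below shows the arithmetic for linear motion, reading the threshold as the distance travelled during the temporal uncertainty. The velocities are hypothetical examples, not values from the paper.

```python
def spatial_threshold_arcsec(velocity_deg_per_s, temporal_uncertainty_ms):
    """Distance (arcsec) the target travels during the temporal uncertainty."""
    degrees = velocity_deg_per_s * temporal_uncertainty_ms / 1000.0
    return degrees * 3600.0


if __name__ == "__main__":
    # Hypothetical velocities; 1 msec is the linear-motion value quoted above
    for velocity in (5.0, 10.0, 20.0):
        print(velocity, "deg/s ->", spatial_threshold_arcsec(velocity, 1.0), "arcsec")
```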
98. Baldo, M.V.C & Klein, S.A.(1995). Extrapolation or attention shift? Nature, 378, 565-566.
Romi Nijhawan reported an interesting bias in the following vernier acuity experiment. Two dots are rotating slowly and continuously around their midpoint. A second pair of
dots is flashed just outside the first pair. The observer's task is to control the phase of the moving dots so that they appear aligned with the flashed dots. The result is that in order to achieve
alignment the moving dots must be retarded. Nijhawan argues that the perceived location of a moving stimulus is extrapolated into the future. An alternative explanation is that flashed dots are
processed slower than moving dots. We proposed a third explanation. We carried out a number of experiments in which the spatial locations of the moving and flashed dots were varied. We also varied the
strength of the dots. We concluded that a better explanation of all the data is that the flash is a signal for the observer to shift attention to the moving dots to determine their location. The time
delay observed in these experiments represents the time for shifting attention. This time increases with increasing distance to the moving dots, compatible with previous experiments on attention
shifts.
100. Wang, H., Levi, D.M. & Klein, S.A.(1996). Intrinsic uncertainty and integration efficiency in bisection acuity. Vision Research 36, 717-739.
Barlow introduced a method for estimating an observer's efficiency by measuring how much external noise can be added to the stimulus before thresholds double. We measured
bisection thresholds in normal observers. The middle line was solid. The two reference lines were sampled with the number of samples varying from 2 to 40 (continuous). The samples were taken at
random positions along the line. External jitter was added to the samples. The variance in location of one of the reference lines is given by (si^2 + se^2)/k, where si is the intrinsic uncertainty of
each dot, se is the external jitter added to each dot and k is the effective number of dots that are averaged by the observer to get the estimated line position. A large amount of data was collected
and we were able to estimate si and k. In this paper we reported results for separations of 2 and 16 min (we are presently collecting data on a much wider range of separations and the results are
similar to the 16 min separation but with some twists). The data reveal clear differences in processing between the filter regime (2 min separation) and the local sign regime (16 min separation). The
filter regime, but not the local sign regime, is highly sensitive to stimulus contrast.
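The variance expression above suggests one simple way to estimate si and k: if the squared threshold is taken to equal (si^2 + se^2)/k, then squared threshold plotted against se^2 is a straight line with slope 1/k and intercept si^2/k. The sketch below makes that assumption explicitly. It ignores the threshold criterion and the contribution of the second reference line, and it is not the fitting procedure of article #100. The function names and synthetic values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def estimate_si_and_k(external_jitter, thresholds):
    """Recover intrinsic uncertainty si and effective sample number k,
    assuming threshold^2 = (si^2 + se^2) / k."""
    slope, intercept = fit_line([se ** 2 for se in external_jitter],
                                [t ** 2 for t in thresholds])
    k = 1.0 / slope                    # slope of the line is 1/k
    si = (intercept / slope) ** 0.5    # intercept of the line is si^2/k
    return si, k


if __name__ == "__main__":
    # Synthetic, noise-free thresholds generated with si = 0.5 and k = 4
    jitter = [0.0, 0.25, 0.5, 1.0, 2.0]
    thresholds = [((0.5 ** 2 + se ** 2) / 4.0) ** 0.5 for se in jitter]
    print(estimate_si_and_k(jitter, thresholds))
```

In this picture, a rise in si shifts the whole line upward, whereas a drop in k steepens it, which is what allows the two factors to be separated.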
106. Carney, T. & Klein, S.A.(1996). Resolution acuity is better than vernier acuity. Vision Res. in press.
This is our most recent foray into the test-pedestal framework. We compared vernier acuity, resolution acuity and contrast discrimination. It is typically difficult to compare
different tasks such as two-line resolution to two-line vernier because the test stimuli are so different. In the test-pedestal framework the test can be identical across tasks. Consider the three
tasks of dipole contrast discrimination, line vernier acuity and edge resolution. We show that in each of these cases the test pattern is a dipole (a pair of adjacent opposite polarity lines). By
measuring thresholds in dipole strength units and varying the pedestal strength, these three tasks can be compared. We find that thresholds for resolution are lowest, then comes contrast
discrimination, with vernier thresholds being highest. This reverses the usual story where thresholds are measured in minutes or seconds of arc. We have a detailed discussion of this outcome. This
test-pedestal approach makes it clear that hyperacuity thresholds are not surprisingly good. As was discussed in connection with article #81, the challenge is to explain why the masking in certain
hyperacuity tasks (such as bisection) is small.
107. Beard, B.L., Levi, D.M. & Klein, S.A.(1995). Temporal vernier acuity: The cortical magnification factor measured by psychophysics. Submitted to Vision Research.
In 1984 Dennis Levi and I defined E2, the eccentricity at which thresholds doubled, as a means for specifying the cortical magnification factor. We discriminated between
cortical magnification which we felt was relevant to hyperacuity, and retinal magnification which we felt was relevant to resolution. This paper reviews a number of other psychophysical approaches
that people have used since 1984 to assess cortical magnification. There is tremendous confusion in the literature because researchers often use a stimulus that doesn't tap the local sign regime
which is appropriate for cortical magnification. We point out flaws in previous methods. This paper presents a large set of new data in which temporal asynchrony of the stimulus is used to get foveal
thresholds, while staying in the local sign regime. Our estimates of cortical magnification based on this methodology are compatible with estimates based on anatomy and physiology.
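As a sketch of how E2 can be read off a data set, suppose thresholds grow linearly with eccentricity, T(E) = T0 * (1 + E/E2). The linear form is an assumption made here for illustration, chosen because it makes E2 exactly the eccentricity at which the foveal threshold doubles. A straight-line fit then gives E2 as intercept divided by slope. The function name and data values are hypothetical.

```python
def fit_e2(eccentricities_deg, thresholds):
    """Fit T(E) = T0 * (1 + E / E2) by least squares and return (T0, E2)."""
    n = len(eccentricities_deg)
    mean_e = sum(eccentricities_deg) / n
    mean_t = sum(thresholds) / n
    sxx = sum((e - mean_e) ** 2 for e in eccentricities_deg)
    sxy = sum((e - mean_e) * (t - mean_t)
              for e, t in zip(eccentricities_deg, thresholds))
    slope = sxy / sxx
    t0 = mean_t - slope * mean_e  # fitted foveal threshold
    e2 = t0 / slope               # eccentricity where the fit reaches 2 * t0
    return t0, e2


if __name__ == "__main__":
    # Synthetic thresholds generated with T0 = 2.0 and E2 = 0.8 deg
    ecc = [0.0, 1.0, 2.5, 5.0, 10.0]
    thr = [2.0 * (1.0 + e / 0.8) for e in ecc]
    print(fit_e2(ecc, thr))
```

A small E2 means thresholds rise steeply with eccentricity, and a large E2 means they rise slowly, so tasks limited by different factors can be compared on the value of E2 alone.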