Author | Title | Year | Journal/Proceedings | Reftype | DOI/URL |
---|---|---|---|---|---|
Atick, J.J., Li, Z. & Redlich, A.N. | Understanding retinal color coding from first principles | 1992 | Neural Computation Vol. 4(4), pp. 559-572 |
article | URL |
BibTeX:
@article{atick_understanding_1992, author = {J. J Atick and Z. Li and A. N Redlich}, title = {Understanding retinal color coding from first principles}, journal = {Neural Computation}, year = {1992}, volume = {4}, number = {4}, pages = {559--572}, url = {http://redwood.berkeley.edu/w/images/e/e0/16-atick-nc-1992.pdf} } |
|||||
Atick, J.J. & Redlich, A.N. | What does the retina know about natural scenes? | 1992 | Neural Computation Vol. 4(2), pp. 196-210 |
article | URL |
BibTeX:
@article{atick_whatretina_1992, author = {J. J Atick and A. N Redlich}, title = {What does the retina know about natural scenes?}, journal = {Neural Computation}, year = {1992}, volume = {4}, number = {2}, pages = {196--210}, url = {http://redwood.berkeley.edu/w/images/6/69/08-atick-nc-1992.pdf} } |
|||||
Attneave, F. | Some informational aspects of visual perception. | 1954 | Psychol Rev Vol. 61(3), pp. 183-193 |
article | URL |
BibTeX:
@article{attneave_informational_1954, author = {F. Attneave}, title = {Some informational aspects of visual perception.}, journal = {Psychol Rev}, year = {1954}, volume = {61}, number = {3}, pages = {183--193}, url = {http://redwood.berkeley.edu/w/images/8/8a/01-attneave-pr-1954.pdf} } |
|||||
Baddeley, R. | Searching for filters with 'interesting' output distributions: an uninteresting direction to explore? | 1996 | Network Vol. 7(2), pp. 409-421 |
article | DOI URL |
Abstract: It has been independently proposed, by Barlow, Field, Intrator and co-workers, that the receptive fields of neurons in V1 are optimized to generate 'sparse', Kurtotic, or 'interesting' output probability distributions. We investigate the empirical evidence for this further and argue that filters can produce 'interesting' output distributions simply because natural images have variable local intensity variance. If the proposed filters have zero DC, then the probability distribution of filter outputs (and hence the output Kurtosis) is well predicted simply from these effects of variable local variance. This suggests that finding filters with high output Kurtosis does not necessarily signal interesting image structure. It is then argued that finding filters that maximize output Kurtosis generates filters that are incompatible with observed physiology. In particular the optimal difference-of-Gaussian (DOG) filter should have the smallest possible scale, an on-centre off-surround cell should have a negative DC, and that the ratio of centre width to surround width should approach unity. This is incompatible with the physiology. Further, it is also predicted that oriented filters should always be oriented in the vertical direction, and of all the filters tested, the filter with the highest output Kurtosis has the lowest signal-to-noise ratio (the filter is simply the difference of two neighbouring pixels). Whilst these observations are not incompatible with the brain using a sparse representation, it does argue that little significance should be placed on finding filters with highly Kurtotic output distributions. It is therefore argued that other constraints are required in order to understand the development of visual receptive fields. | |||||
BibTeX:
@article{baddeley_searching_1996, author = {R. Baddeley}, title = {Searching for filters with 'interesting' output distributions: an uninteresting direction to explore?}, journal = {Network}, year = {1996}, volume = {7}, number = {2}, pages = {409--421}, url = {http://dx.doi.org/10.1088/0954-898X/7/2/021}, doi = {{10.1088/0954-898X/7/2/021}} } |
|||||
Balboa, R.M., Tyler, C.W. & Grzywacz, N.M. | Occlusions contribute to scaling in natural images | 2001 | Vision Research Vol. 41(7), pp. 955-64 |
article | URL |
Abstract: Spatial power spectra from natural images fall approximately as the square of spatial frequency, a property also called scale invariance (scaling). Various theories for visual receptive fields consider scale invariance key. Two hypotheses have been advanced in the literature for why natural images obey scale invariance. The first is that these images have luminance edges, whose spectra fall as frequency squared. The second is that scale invariance follows from natural images being essentially a collage of independent, constant-intensity regions, whose sizes follow a power-law distribution. Recently, an argument by example was made against the first hypothesis. Here we refute that argument and show that the first hypothesis is consistent with the scaling under a wide variety of distributions of sizes. There are two reasons for this: first, for every frequency, the log-log slope of the rotationally averaged power spectrum of an image is the weighted mean of the log-log slopes from the independent regions of the image formed by objects occluding one another. Second, the log-log slopes of the spectrum envelope for a constant-intensity region are 0 and -3 for frequencies corresponding to periods much larger and much smaller than the region's size, respectively. Therefore, it is not surprising that natural images have log-log slopes between -1.5 and -3, with a mean near -2. | |||||
BibTeX:
@article{balboa_occlusions_2001, author = {R M Balboa and C W Tyler and N M Grzywacz}, title = {Occlusions contribute to scaling in natural images}, journal = {Vision Research}, year = {2001}, volume = {41}, number = {7}, pages = {955--64}, note = {PMID: 11248280}, url = {http://www.ncbi.nlm.nih.gov/pubmed/11248280} } |
|||||
Barlow, H.B. | Possible principles underlying the transformation of sensory messages | 1961 | Sensory Communication, pp. 217-234 |
article | URL |
BibTeX:
@article{barlow_possible_1961, author = {H. B Barlow}, title = {Possible principles underlying the transformation of sensory messages}, journal = {Sensory Communication}, year = {1961}, pages = {217--234}, url = {http://redwood.berkeley.edu/w/images/f/fd/02-barlow-pr-1954.pdf} } |
|||||
Bell, A.J. & Sejnowski, T.J. | An information-maximization approach to blind separation and blind deconvolution | 1995 | Neural Computation Vol. 7(6), pp. 1129-1159 |
article | URL |
BibTeX:
@article{bell_information-maximization_1995, author = {A. J Bell and T. J Sejnowski}, title = {An information-maximization approach to blind separation and blind deconvolution}, journal = {Neural Computation}, year = {1995}, volume = {7}, number = {6}, pages = {1129--1159}, url = {ftp://ftp.cnl.salk.edu/pub/tony/bell.blind.ps.Z} } |
|||||
Bell, A.J. & Sejnowski, T.J. | The "independent components" of natural scenes are edge filters. | 1997 | Vision Res Vol. 37(23), pp. 3327-3338 |
article | URL |
Abstract: It has previously been suggested that neurons with line and edge selectivities found in primary visual cortex of cats and monkeys form a sparse, distributed representation of natural scenes, and it has been reasoned that such responses should emerge from an unsupervised learning algorithm that attempts to find a factorial code of independent visual features. We show here that a new unsupervised learning algorithm based on information maximization, a nonlinear "infomax" network, when applied to an ensemble of natural scenes produces sets of visual filters that are localized and oriented. Some of these filters are Gabor-like and resemble those produced by the sparseness-maximization network. In addition, the outputs of these filters are as independent as possible, since this infomax network performs Independent Components Analysis or ICA, for sparse (super-gaussian) component distributions. We compare the resulting ICA filters and their associated basis functions, with other decorrelating filters produced by Principal Components Analysis (PCA) and zero-phase whitening filters (ZCA). The ICA filters have more sparsely distributed (kurtotic) outputs on natural scenes. They also resemble the receptive fields of simple cells in visual cortex, which suggests that these neurons form a natural, information-theoretic coordinate system for natural images. | |||||
BibTeX:
@article{bell_independent_1997, author = {AJ Bell and TJ Sejnowski}, title = {The "independent components" of natural scenes are edge filters.}, journal = {Vision Res}, year = {1997}, volume = {37}, number = {23}, pages = {3327--3338}, url = {http://view.ncbi.nlm.nih.gov/pubmed/9425547} } |
|||||
Bethge, M. | Factorial coding of natural images: how effective are linear models in removing higher-order dependencies? | 2006 | Journal of the Optical Society of America A Vol. 23(6), pp. 1253-1268 |
article | DOI URL |
Abstract: The performance of unsupervised learning models for natural images is evaluated quantitatively by means of information theory. We estimate the gain in statistical independence (the multi-information reduction) achieved with independent component analysis (ICA), principal component analysis (PCA), zero-phase whitening, and predictive coding. Predictive coding is translated into the transform coding framework, where it can be characterized by the constraint of a triangular filter matrix. A randomly sampled whitening basis and the Haar wavelet are included in the comparison as well. The comparison of all these methods is carried out for different patch sizes, ranging from 2×2 to 16×16 pixels. In spite of large differences in the shape of the basis functions, we find only small differences in the multi-information between all decorrelation transforms (5% or less) for all patch sizes. Among the second-order methods, PCA is optimal for small patch sizes and predictive coding performs best for large patch sizes. The extra gain achieved with ICA is always less than 2%. In conclusion, the edge filters found with ICA lead to only a surprisingly small improvement in terms of its actual objective. | |||||
BibTeX:
@article{bethge_factorial_2006, author = {Matthias Bethge}, title = {Factorial coding of natural images: how effective are linear models in removing higher-order dependencies?}, journal = {Journal of the Optical Society of America A}, year = {2006}, volume = {23}, number = {6}, pages = {1253--1268}, url = {http://josaa.osa.org/abstract.cfm?URI=josaa-23-6-1253}, doi = {{10.1364/JOSAA.23.001253}} } |
|||||
Brady, N. | Spatial scale interactions and image statistics | 1997 | Perception Vol. 26(9), pp. 1089-100 |
article | URL |
Abstract: In natural scenes and other broadband images, spatial variations in luminance occur at a range of scales or frequencies. It is generally agreed that the visual image is initially represented by the activity of separate frequency-tuned channels, and this notion is supported by physiological evidence for a stage of multi-resolution filtering in early visual processing. The question whether these channels can be accessed as independent sources of information in the normal course of events is a more contentious one. In the psychophysical study of both motion and spatial vision, there are examples of tasks in which fine-scale structure dominates perception or performance and obscures information at coarser scales. It is argued here that one important factor determining the relative salience of information from different spatial scales in broadband images is the distribution of response activity across spatial channels. The special case of natural scenes that have characteristic 'scale-invariant' power spectra in which image contrast is roughly constant in equal octave frequency bands is considered. A review is presented of evidence which suggests that the sensitivity of frequency-tuned filters in the visual system is matched to this image statistic, so that, on average, different channels respond with equal activity to natural scenes. Under these conditions, the visual system does appear to have independent access to information at different spatial scales and spatial scale interactions are not apparent. | |||||
BibTeX:
@article{brady_spatial_1997, author = {N Brady}, title = {Spatial scale interactions and image statistics}, journal = {Perception}, year = {1997}, volume = {26}, number = {9}, pages = {1089--100}, note = {PMID: 9509145}, url = {http://www.ncbi.nlm.nih.gov/pubmed/9509145} } |
|||||
Brady, N. & Field, D.J. | Local contrast in natural images: normalisation and coding efficiency | 2000 | Perception Vol. 29(9), pp. 1041-55 |
article | URL |
Abstract: The visual system employs a gain control mechanism in the cortical coding of contrast whereby the response of each cell is normalised by the integrated activity of neighbouring cells. While restricted in space, the normalisation pool is broadly tuned for spatial frequency and orientation, so that a cell's response is adapted by stimuli which fall outside its 'classical' receptive field. Various functions have been attributed to divisive gain control: in this paper we consider whether this output nonlinearity serves to increase the information carrying capacity of the neural code. 46 natural scenes were analysed with the use of oriented, frequency-tuned filters whose bandwidths were chosen to match those of mammalian striate cortical cells. The images were logarithmically transformed so that the filters responded to a luminance ratio or contrast. In the first study, the response of each filter was calibrated relative to its response to a grating stimulus, and local image contrast was expressed in terms of the familiar Michelson metric. We found that the distribution of contrasts in natural images is highly kurtotic, peaking at low values and having a long exponential tail. There is considerable variability in local contrast, both within and between images. In the second study we compared the distribution of response activity before and after implementing contrast normalisation, and noted two major changes. Response variability, both within and between scenes, is reduced by normalisation, and the entropy of the response distribution is increased after normalisation, indicating a more efficient transfer of information. | |||||
BibTeX:
@article{brady_local_2000, author = {N Brady and D J Field}, title = {Local contrast in natural images: normalisation and coding efficiency}, journal = {Perception}, year = {2000}, volume = {29}, number = {9}, pages = {1041--55}, note = {PMID: 11144818}, url = {http://www.ncbi.nlm.nih.gov/pubmed/11144818} } |
|||||
Burton, G.J. & Moorhead, I.R. | Color and spatial structure in natural scenes | 1987 | Applied Optics Vol. 26(1), pp. 157-170 |
article | URL |
Abstract: Digitized records of terrain scenes were produced using a technique of photographic colorimetry. Each record consisted of three tristimulus images (X, Y, and Z) which were analyzed for their color statistics, spatial frequency content, and image correlation. Interactions between color and space were examined using a cone receptor transformation. It is shown that the scene amplitude spectra follow an approximate reciprocal variation with frequency, and that the correlation function can be described by a one-step autoregressive model. The results are discussed in terms of methods for optimum image coding in human and machine vision. | |||||
BibTeX:
@article{burton_color_1987, author = {G. J. Burton and Ian R. Moorhead}, title = {Color and spatial structure in natural scenes}, journal = {Applied Optics}, year = {1987}, volume = {26}, number = {1}, pages = {157--170}, url = {http://ao.osa.org/abstract.cfm?URI=ao-26-1-157} } |
|||||
Carlsson, G., Ishkhanov, T., de Silva, V. & Zomorodian, A. | On the Local Behavior of Spaces of Natural Images | 2008 | International Journal of Computer Vision Vol. 76(1), pp. 1-12 |
article | DOI URL |
Abstract: In this study we concentrate on qualitative topological analysis of the local behavior of the space of natural images. To this end, we use a space of 3 by 3 high-contrast patches ℳ. We develop a theoretical model for the high-density 2-dimensional submanifold of ℳ showing that it has the topology of the Klein bottle. Using our topological software package PLEX we experimentally verify our theoretical conclusions. We use polynomial representation to give coordinatization to various subspaces of ℳ. We find the best-fitting embedding of the Klein bottle into the ambient space of ℳ. Our results are currently being used in developing a compression algorithm based on a Klein bottle dictionary. | |||||
BibTeX:
@article{carlsson_local_2008, author = {Gunnar Carlsson and Tigran Ishkhanov and Vin de Silva and Afra Zomorodian}, title = {On the Local Behavior of Spaces of Natural Images}, journal = {International Journal of Computer Vision}, year = {2008}, volume = {76}, number = {1}, pages = {1--12}, url = {http://dx.doi.org/10.1007/s11263-007-0056-x}, doi = {http://dx.doi.org/10.1007/s11263-007-0056-x} } |
|||||
Caywood, M.S., Willmore, B. & Tolhurst, D.J. | Independent components of color natural scenes resemble V1 neurons in their spatial and color tuning | 2004 | Journal of Neurophysiology Vol. 91(6), pp. 2859-73 |
article | DOI URL |
Abstract: It has been hypothesized that mammalian sensory systems are efficient because they reduce the redundancy of natural sensory input. If correct, this theory could unify our understanding of sensory coding; here, we test its predictions for color coding in the primate primary visual cortex (V1). We apply independent component analysis (ICA) to simulated cone responses to natural scenes, obtaining a set of colored independent component (IC) filters that form a redundancy-reducing visual code. We compare IC filters with physiologically measured V1 neurons, and find great spatial similarity between IC filters and V1 simple cells. On cursory inspection, there is little chromatic similarity; however, we find that many apparent differences result from biases in the physiological measurements and ICA analysis. After correcting these biases, we find that the chromatic tuning of IC filters does indeed resemble the population of V1 neurons, supporting the redundancy-reduction hypothesis. | |||||
BibTeX:
@article{caywood_independent_2004, author = {Matthew S Caywood and Benjamin Willmore and David J Tolhurst}, title = {Independent components of color natural scenes resemble V1 neurons in their spatial and color tuning}, journal = {Journal of Neurophysiology}, year = {2004}, volume = {91}, number = {6}, pages = {2859--73}, note = {PMID: 14749316}, url = {http://www.ncbi.nlm.nih.gov/pubmed/14749316}, doi = {http://dx.doi.org/10.1152/jn.00775.2003} } |
|||||
Chandler, D.M. & Field, D.J. | Estimates of the information content and dimensionality of natural scenes from proximity distributions. | 2007 | J Opt Soc Am A Opt Image Sci Vis Vol. 24(4), pp. 922-941 |
article | |
Abstract: Natural scenes, like most all natural data sets, show considerable redundancy. Although many forms of redundancy have been investigated (e.g., pixel distributions, power spectra, contour relationships, etc.), estimates of the true entropy of natural scenes have been largely considered intractable. We describe a technique for estimating the entropy and relative dimensionality of image patches based on a function we call the proximity distribution (a nearest-neighbor technique). The advantage of this function over simple statistics such as the power spectrum is that the proximity distribution is dependent on all forms of redundancy. We demonstrate that this function can be used to estimate the entropy (redundancy) of 3x3 patches of known entropy as well as 8x8 patches of Gaussian white noise, natural scenes, and noise with the same power spectrum as natural scenes. The techniques are based on assumptions regarding the intrinsic dimensionality of the data, and although the estimates depend on an extrapolation model for images larger than 3x3, we argue that this approach provides the best current estimates of the entropy and compressibility of natural-scene patches and that it provides insights into the efficiency of any coding strategy that aims to reduce redundancy. We show that the sample of 8x8 patches of natural scenes used in this study has less than half the entropy of 8x8 white noise and less than 60% of the entropy of noise with the same power spectrum. In addition, given a finite number of samples (<2^20) drawn randomly from the space of 8x8 patches, the subspace of 8x8 natural-scene patches shows a dimensionality that depends on the sampling density and that for low densities is significantly lower dimensional than the space of 8x8 patches of white noise and noise with the same power spectrum. | |||||
BibTeX:
@article{chandler_estimates_2007, author = {Damon M Chandler and David J Field}, title = {Estimates of the information content and dimensionality of natural scenes from proximity distributions.}, journal = {J Opt Soc Am A Opt Image Sci Vis}, year = {2007}, volume = {24}, number = {4}, pages = {922--941} } |
|||||
Chandler, D.M. & Hemami, S.S. | VSNR: a wavelet-based visual signal-to-noise ratio for natural images | 2007 | IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society Vol. 16(9), pp. 2284-98 |
article | URL |
Abstract: This paper presents an efficient metric for quantifying the visual fidelity of natural images based on near-threshold and suprathreshold properties of human vision. The proposed metric, the visual signal-to-noise ratio (VSNR), operates via a two-stage approach. In the first stage, contrast thresholds for detection of distortions in the presence of natural images are computed via wavelet-based models of visual masking and visual summation in order to determine whether the distortions in the distorted image are visible. If the distortions are below the threshold of detection, the distorted image is deemed to be of perfect visual fidelity (VSNR = infinity) and no further analysis is required. If the distortions are suprathreshold, a second stage is applied which operates based on the low-level visual property of perceived contrast, and the mid-level visual property of global precedence. These two properties are modeled as Euclidean distances in distortion-contrast space of a multiscale wavelet decomposition, and VSNR is computed based on a simple linear sum of these distances. The proposed VSNR metric is generally competitive with current metrics of visual fidelity; it is efficient both in terms of its low computational complexity and in terms of its low memory requirements; and it operates based on physical luminances and visual angle (rather than on digital pixel values and pixel-based dimensions) to accommodate different viewing conditions. | |||||
BibTeX:
@article{chandler_vsnr:wavelet-based_2007, author = {Damon M Chandler and Sheila S Hemami}, title = {VSNR: a wavelet-based visual signal-to-noise ratio for natural images}, journal = {IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society}, year = {2007}, volume = {16}, number = {9}, pages = {2284--98}, note = {PMID: 17784602}, url = {http://www.ncbi.nlm.nih.gov/pubmed/17784602} } |
|||||
Dan, Y., Atick, J.J. & Reid, R.C. | Efficient coding of natural scenes in the lateral geniculate nucleus: experimental test of a computational theory. | 1996 | J Neurosci Vol. 16(10), pp. 3351-3362 |
article | URL |
Abstract: A recent computational theory suggests that visual processing in the retina and the lateral geniculate nucleus (LGN) serves to recode information into an efficient form (Atick and Redlich, 1990). Information theoretic analysis showed that the representation of visual information at the level of the photoreceptors is inefficient, primarily attributable to a high degree of spatial and temporal correlation in natural scenes. It was predicted, therefore, that the retina and the LGN should recode this signal into a decorrelated form or, equivalently, into a signal with a "white" spatial and temporal power spectrum. In the present study, we tested directly the prediction that visual processing at the level of the LGN temporally whitens the natural visual input. We recorded the responses of individual neurons in the LGN of the cat to natural, time-varying images (movies) and, as a control, to white-noise stimuli. Although there is substantial temporal correlation in natural inputs (Dong and Atick, 1995b), we found that the power spectra of LGN responses were essentially white. Between 3 and 15 Hz, the power of the responses had an average variation of only +/-10.3%. Thus, the signals that the LGN relays to visual cortex are temporally decorrelated. Furthermore, the responses of X-cells to natural inputs can be well predicted from their responses to white-noise inputs. We therefore conclude that whitening of natural inputs can be explained largely by the linear filtering properties (Enroth-Cugell and Robson, 1966). Our results suggest that the early visual pathway is well adapted for efficient coding of information in the natural visual environment, in agreement with the prediction of the computational theory. | |||||
BibTeX:
@article{dan_efficient_1996, author = {Y. Dan and J. J. Atick and R. C. Reid}, title = {Efficient coding of natural scenes in the lateral geniculate nucleus: experimental test of a computational theory.}, journal = {J Neurosci}, year = {1996}, volume = {16}, number = {10}, pages = {3351--3362}, url = {http://redwood.berkeley.edu/w/images/7/70/10-dan-jons-1996.pdf} } |
|||||
Daugman, J.G. | Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters | 1985 | J. Opt. Soc. Am. A Vol. 2(7), pp. 1160-1169 |
article | URL |
BibTeX:
@article{daugman_uncertainty_1985, author = {John G. Daugman}, title = {Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters}, journal = {J. Opt. Soc. Am. A}, year = {1985}, volume = {2}, number = {7}, pages = {1160--1169}, url = {http://josaa.osa.org/abstract.cfm?URI=josaa-2-7-1160} } |
|||||
Doi, E., Inui, T., Lee, Te-Won., Wachtler, T. & Sejnowski, T.J. | Spatiochromatic receptive field properties derived from information-theoretic analyses of cone mosaic responses to natural scenes | 2003 | Neural Computation Vol. 15(2), pp. 397-417 |
article | DOI URL |
Abstract: Neurons in the early stages of processing in the primate visual system efficiently encode natural scenes. In previous studies of the chromatic properties of natural images, the inputs were sampled on a regular array, with complete color information at every location. However, in the retina cone photoreceptors with different spectral sensitivities are arranged in a mosaic. We used an unsupervised neural network model to analyze the statistical structure of retinal cone mosaic responses to calibrated color natural images. The second-order statistical dependencies derived from the covariance matrix of the sensory signals were removed in the first stage of processing. These decorrelating filters were similar to type I receptive fields in parvo- or konio-cellular LGN in both spatial and chromatic characteristics. In the subsequent stage, the decorrelated signals were linearly transformed to make the output as statistically independent as possible, using independent component analysis. The independent component filters showed luminance selectivity with simple-cell-like receptive fields, or had strong color selectivity with large, often double-opponent, receptive fields, both of which were found in the primary visual cortex (V1). These results show that the "form" and "color" channels of the early visual system can be derived from the statistics of sensory signals. | |||||
BibTeX:
@article{doi_spatiochromatic_2003, author = {Eizaburo Doi and Toshio Inui and Te-Won Lee and Thomas Wachtler and Terrence J Sejnowski}, title = {Spatiochromatic receptive field properties derived from information-theoretic analyses of cone mosaic responses to natural scenes}, journal = {Neural Computation}, year = {2003}, volume = {15}, number = {2}, pages = {397--417}, note = {PMID: 12590812}, url = {http://www.ncbi.nlm.nih.gov/pubmed/12590812}, doi = {http://dx.doi.org/10.1162/089976603762552960} } |
|||||
Dong, D.W. & Atick, J.J. | Statistics of natural time-varying images | 1995 | Network: Computation in Neural Systems Vol. 6(3), pp. 345-358 |
article | URL |
BibTeX:
@article{dong_statistics_1995, author = {D. W Dong and J. J Atick}, title = {Statistics of natural time-varying images}, journal = {Network: Computation in Neural Systems}, year = {1995}, volume = {6}, number = {3}, pages = {345--358}, url = {http://redwood.berkeley.edu/w/images/c/cc/09-dong-network-1995.pdf} } |
|||||
Einhäuser, W., Koch, C. & Makeig, S. | The duration of the attentional blink in natural scenes depends on stimulus category | 2007 | Vision Research Vol. 47(5), pp. 597-607 |
article | DOI URL |
Abstract: Humans comprehend the "gist" of even a complex natural scene within a small fraction of a second. If, however, observers are asked to detect targets in a sequence of rapidly presented items, recognition of a target succeeding another target by about a third of a second is severely impaired, the "attentional blink" (AB) [Raymond, J. E., Shapiro, K. L., & Arnell, K. M. (1992). Temporary suppression of visual processing in an RSVP task: an attentional blink? Journal of Experimental Psychology. Human Perception and Performance, 18, 849-860]. Since most experiments on the AB use well controlled but artificial stimuli, the question arises whether the same phenomenon occurs for complex, natural stimuli, and if so, whether its specifics depend on stimulus category. Here we presented rapid sequences of complex stimuli (photographs of objects, scenes and faces) and asked observers to detect and remember items of a specific category (either faces, watches, or both). We found a consistent AB for both target categories but the duration of the AB depended on the target category. | |||||
BibTeX:
@article{einhuser_duration_2007, author = {Wolfgang Einhäuser and Christof Koch and Scott Makeig}, title = {The duration of the attentional blink in natural scenes depends on stimulus category}, journal = {Vision Research}, year = {2007}, volume = {47}, number = {5}, pages = {597--607}, note = {PMID: 17275058}, url = {http://www.ncbi.nlm.nih.gov/pubmed/17275058}, doi = {http://dx.doi.org/10.1016/j.visres.2006.12.007} } |
|||||
Falconbridge, M.S., Stamps, R.L. & Badcock, D.R. | A Simple Hebbian/Anti-Hebbian Network Learns the Sparse, Independent Components of Natural Images | 2005 | Neural Comp. Vol. 18(2), pp. 415-429 |
article | URL |
Abstract: Slightly modified versions of an early Hebbian/anti-Hebbian neural network are shown to be capable of extracting the sparse, independent linear components of a prefiltered natural image set. An explanation for this capability in terms of a coupling between two hypothetical networks is presented. The simple networks presented here provide alternative, biologically plausible mechanisms for sparse, factorial coding in early primate vision. | |||||
BibTeX:
@article{falconbridge_simple_2005, author = {Michael S. Falconbridge and Robert L. Stamps and David R. Badcock}, title = {A Simple Hebbian/Anti-Hebbian Network Learns the Sparse, Independent Components of Natural Images}, journal = {Neural Comp.}, year = {2005}, volume = {18}, number = {2}, pages = {415--429}, url = {http://neco.mitpress.org/cgi/content/abstract/18/2/415} } |
|||||
Felsen, G. & Dan, Y. | A natural approach to studying vision | 2005 | Nature Neuroscience Vol. 8(12), pp. 1643-6 |
article | DOI URL |
Abstract: An ultimate goal of systems neuroscience is to understand how sensory stimuli encountered in the natural environment are processed by neural circuits. Achieving this goal requires knowledge of both the characteristics of natural stimuli and the response properties of sensory neurons under natural stimulation. Most of our current notions of sensory processing have come from experiments using simple, parametric stimulus sets. However, a growing number of researchers have begun to question whether this approach alone is sufficient for understanding the real-life sensory tasks performed by the organism. Here, focusing on the early visual pathway, we argue that the use of natural stimuli is vital for advancing our understanding of sensory processing. | |||||
BibTeX:
@article{felsen_natural_2005, author = {Gidon Felsen and Yang Dan}, title = {A natural approach to studying vision}, journal = {Nature Neuroscience}, year = {2005}, volume = {8}, number = {12}, pages = {1643--6}, note = {PMID: 16306891}, url = {http://www.ncbi.nlm.nih.gov/pubmed/16306891}, doi = {http://dx.doi.org/10.1038/nn1608} } |
|||||
Field, D.J. | What is the goal of sensory coding? [BibTeX] |
1994 | Neural Computation Vol. 6(4), pp. 559―601 |
article | URL |
BibTeX:
@article{field_what_1994, author = {D. J Field}, title = {What is the goal of sensory coding?}, journal = {Neural Computation}, year = {1994}, volume = {6}, number = {4}, pages = {559―601}, url = {http://redwood.berkeley.edu/w/images/0/0f/13-field-nc-1994.pdf} } |
|||||
Field, D.J. | Scale-invariance and Self-similar 'Wavelet' Transforms: an Analysis of Natural Scenes and Mammalian Visual Systems. [BibTeX] |
1993 | Wavelets, Fractals and Fourier Transforms: New Developments and New Applications., pp. 151-193 | inbook | URL |
BibTeX:
@inbook{field_scale-invariance_1993, author = {David J Field}, title = {Scale-invariance and Self-similar 'Wavelet' Transforms: an Analysis of Natural Scenes and Mammalian Visual Systems.}, booktitle = {Wavelets, Fractals and Fourier Transforms: New Developments and New Applications.}, publisher = {Oxford University Press.}, year = {1993}, pages = {151--193}, url = {http://redwood.psych.cornell.edu/papers/field-1993.pdf} } |
|||||
Field, D.J. | Relations between the statistics of natural images and the response properties of cortical cells. | 1987 | J Opt Soc Am A Vol. 4(12), pp. 2379―2394 |
article | URL |
Abstract: The relative efficiency of any particular image-coding scheme should be defined only in relation to the class of images that the code is likely to encounter. To understand the representation of images by the mammalian visual system, it might therefore be useful to consider the statistics of images from the natural environment (i.e., images with trees, rocks, bushes, etc). In this study, various coding schemes are compared in relation to how they represent the information in such natural images. The coefficients of such codes are represented by arrays of mechanisms that respond to local regions of space, spatial frequency, and orientation (Gabor-like transforms). For many classes of image, such codes will not be an efficient means of representing information. However, the results obtained with six natural images suggest that the orientation and the spatial-frequency tuning of mammalian simple cells are well suited for coding the information in such images if the goal of the code is to convert higher-order redundancy (e.g., correlation between the intensities of neighboring pixels) into first-order redundancy (i.e., the response distribution of the coefficients). Such coding produces a relatively high signal-to-noise ratio and permits information to be transmitted with only a subset of the total number of cells. These results support Barlow's theory that the goal of natural vision is to represent the information in the natural environment with minimal redundancy. | |||||
BibTeX:
@article{field_relations_1987, author = {D. J. Field}, title = {Relations between the statistics of natural images and the response properties of cortical cells.}, journal = {J Opt Soc Am A}, year = {1987}, volume = {4}, number = {12}, pages = {2379―2394}, url = {http://redwood.berkeley.edu/w/images/e/e3/06-field-josa-1987.pdf} } |
|||||
Field, D.J. & Brady, N. | Visual sensitivity, blur and the sources of variability in the amplitude spectra of natural scenes | 1997 | Vision Research Vol. 37(23), pp. 3367-83 |
article | URL |
Abstract: A number of researchers have suggested that in order to understand the response properties of cells in the visual pathway, we must consider the statistical structure of the natural environment. In this paper, we focus on one aspect of that structure, namely, the correlational structure which is described by the amplitude or power spectra of natural scenes. We propose that the principal insight one gains from considering the image spectra is in understanding the relative sensitivity of cells tuned to different spatial frequencies. This study employs a model in which the peak sensitivity is constant as a function of frequency with linear bandwidth increasing (i.e., approximately constant in octaves). In such a model, the "response magnitude" (i.e., vector length) of cells increases as a function of their optimal (or central) spatial frequency out to about 20 cyc/deg. The result is a code in which the response to natural scenes, whose amplitude spectra typically fall as 1/f, is roughly constant out to 20 cyc/deg. An important consideration in evaluating this model of sensitivity is the fact that natural scenes show considerable variability in their amplitude spectra, with individual scenes showing falloffs which are often steeper or shallower than 1/f. Using a new measure of image structure (the "rectified contrast spectrum" or "RCS") on a set of calibrated natural images, it is shown that a large part of the variability in the spectra is due to differences in the sparseness of local structure at different scales. That is, an image which is "in focus" will have structure (e.g., edges) which has roughly the same magnitude across scale. That is, the loss of high frequency energy in some images is due to the reduction of the number of regions that contain structure rather than the amplitude of that structure. An "in focus" image will have structure (e.g., edges) across scale that have roughly equal magnitude but may vary in the area covered by structure. The slope of the RCS was found to provide a reasonable prediction of physical blur across a variety of scenes in spite of the variability in their amplitude spectra. It was also found to produce a good prediction of perceived blur as judged by human subjects. | |||||
BibTeX:
@article{field_visual_1997, author = {D J Field and N Brady}, title = {Visual sensitivity, blur and the sources of variability in the amplitude spectra of natural scenes}, journal = {Vision Research}, year = {1997}, volume = {37}, number = {23}, pages = {3367--83}, note = {PMID: 9425550}, url = {http://www.ncbi.nlm.nih.gov/pubmed/9425550} } |
|||||
Friedman, J.H. | Exploratory Projection Pursuit [BibTeX] |
1987 | Journal of the American Statistical Association Vol. 82(397), pp. 249―266 |
article | URL |
BibTeX:
@article{friedman_exploratory_1987, author = {J. H Friedman}, title = {Exploratory Projection Pursuit}, journal = {Journal of the American Statistical Association}, year = {1987}, volume = {82}, number = {397}, pages = {249―266}, url = {http://redwood.berkeley.edu/w/images/9/9f/11-friedman-jasa-1987.pdf} } |
|||||
Geisler, W.S. | Visual perception and the statistical properties of natural scenes | 2008 | Annual Review of Psychology Vol. 59, pp. 167-92 |
article | DOI URL |
Abstract: The environments in which we live and the tasks we must perform to survive and reproduce have shaped the design of our perceptual systems through evolution and experience. Therefore, direct measurement of the statistical regularities in natural environments (scenes) has great potential value for advancing our understanding of visual perception. This review begins with a general discussion of the natural scene statistics approach, of the different kinds of statistics that can be measured, and of some existing measurement techniques. This is followed by a summary of the natural scene statistics measured over the past 20 years. Finally, there is a summary of the hypotheses, models, and experiments that have emerged from the analysis of natural scene statistics. | |||||
BibTeX:
@article{geisler_visual_2008, author = {Wilson S Geisler}, title = {Visual perception and the statistical properties of natural scenes}, journal = {Annual Review of Psychology}, year = {2008}, volume = {59}, pages = {167--92}, note = {PMID: 17705683}, url = {http://www.ncbi.nlm.nih.gov/pubmed/17705683}, doi = {http://dx.doi.org/10.1146/annurev.psych.58.110405.085632} } |
|||||
Graham, D.J. & Field, D.J. | Statistical regularities of art images and natural scenes: spectra, sparseness and nonlinearities | 2007 | Spatial Vision Vol. 21(1-2), pp. 149-64 |
article | DOI URL |
Abstract: Paintings are the product of a process that begins with ordinary vision in the natural world and ends with manipulation of pigments on canvas. Because artists must produce images that can be seen by a visual system that is thought to take advantage of statistical regularities in natural scenes, artists are likely to replicate many of these regularities in their painted art. We have tested this notion by computing basic statistical properties and modeled cell response properties for a large set of digitized paintings and natural scenes. We find that both representational and non-representational (abstract) paintings from our sample (124 images) show basic similarities to a sample of natural scenes in terms of their spatial frequency amplitude spectra, but the paintings and natural scenes show significantly different mean amplitude spectrum slopes. We also find that the intensity distributions of paintings show a lower skewness and sparseness than natural scenes. We account for this by considering the range of luminances found in the environment compared to the range available in the medium of paint. A painting's range is limited by the reflective properties of its materials. We argue that artists do not simply scale the intensity range down but use a compressive nonlinearity. In our studies, modeled retinal and cortical filter responses to the images were less sparse for the paintings than for the natural scenes. But when a compressive nonlinearity was applied to the images, both the paintings' sparseness and the modeled responses to the paintings showed the same or greater sparseness compared to the natural scenes. This suggests that artists achieve some degree of nonlinear compression in their paintings. Because paintings have captivated humans for millennia, finding basic statistical regularities in paintings' spatial structure could grant insights into the range of spatial patterns that humans find compelling. | |||||
BibTeX:
@article{graham_statistical_2007, author = {Daniel J Graham and David J Field}, title = {Statistical regularities of art images and natural scenes: spectra, sparseness and nonlinearities}, journal = {Spatial Vision}, year = {2007}, volume = {21}, number = {1-2}, pages = {149--64}, note = {PMID: 18073056}, url = {http://www.ncbi.nlm.nih.gov/pubmed/18073056}, doi = {http://dx.doi.org/10.1163/156856807782753877} } |
|||||
Greene, M.R. & Oliva, A. | Recognition of natural scenes from global properties: seeing the forest without representing the trees | 2009 | Cognitive Psychology Vol. 58(2), pp. 137-76 |
article | DOI URL |
Abstract: Human observers are able to rapidly and accurately categorize natural scenes, but the representation mediating this feat is still unknown. Here we propose a framework of rapid scene categorization that does not segment a scene into objects and instead uses a vocabulary of global, ecological properties that describe spatial and functional aspects of scene space (such as navigability or mean depth). In Experiment 1, we obtained ground truth rankings on global properties for use in Experiments 2-4. To what extent do human observers use global property information when rapidly categorizing natural scenes? In Experiment 2, we found that global property resemblance was a strong predictor of both false alarm rates and reaction times in a rapid scene categorization experiment. To what extent is global property information alone a sufficient predictor of rapid natural scene categorization? In Experiment 3, we found that the performance of a classifier representing only these properties is indistinguishable from human performance in a rapid scene categorization task in terms of both accuracy and false alarms. To what extent is this high predictability unique to a global property representation? In Experiment 4, we compared two models that represent scene object information to human categorization performance and found that these models had lower fidelity at representing the patterns of performance than the global property model. These results provide support for the hypothesis that rapid categorization of natural scenes may not be mediated primarily through objects and parts, but also through global properties of structure and affordance. | |||||
BibTeX:
@article{greene_recognition_2009, author = {Michelle R Greene and Aude Oliva}, title = {Recognition of natural scenes from global properties: seeing the forest without representing the trees}, journal = {Cognitive Psychology}, year = {2009}, volume = {58}, number = {2}, pages = {137--76}, note = {PMID: 18762289}, url = {http://www.ncbi.nlm.nih.gov/pubmed/18762289}, doi = {http://dx.doi.org/10.1016/j.cogpsych.2008.06.001} } |
|||||
Griffiths, T.L. & Tenenbaum, J.B. | From Algorithmic to Subjective Randomness [BibTeX] |
2004 | Advances in Neural Information Processing Systems Vol. 16 |
inproceedings | URL |
BibTeX:
@inproceedings{griffiths_algorithmic_2004, author = {Thomas L Griffiths and Joshua B Tenenbaum}, title = {From Algorithmic to Subjective Randomness}, booktitle = {Advances in Neural Information Processing Systems}, year = {2004}, volume = {16}, url = {http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.3.2509} } |
|||||
Griffiths, T.L. & Tenenbaum, J.B. | Probability, algorithmic complexity, and subjective randomness [BibTeX] |
2003 | Proceedings of the 25th Annual Conference of the Cognitive Science Society | inproceedings | |
BibTeX:
@inproceedings{griffiths_probability_2003, author = {T. L. Griffiths and J. B. Tenenbaum}, title = {Probability, algorithmic complexity, and subjective randomness}, booktitle = {Proceedings of the 25th Annual Conference of the Cognitive Science Society}, year = {2003} } |
|||||
Hagerhall, C.M., Purcell, T. & Taylor, R. | Fractal dimension of landscape silhouette outlines as a predictor of landscape preference [BibTeX] |
2004 | Journal of Environmental Psychology Vol. 24(2), pp. 247-255 |
article | |
BibTeX:
@article{hagerhall_fractal_2004, author = {C. M. Hagerhall and T. Purcell and R. Taylor}, title = {Fractal dimension of landscape silhouette outlines as a predictor of landscape preference}, journal = {Journal of Environmental Psychology}, year = {2004}, volume = {24}, number = {2}, pages = {247--255} } |
|||||
Hansen, B.C., Essock, E.A., Zheng, Y. & DeFord, J.K. | Perceptual anisotropies in visual processing and their relation to natural image statistics | 2003 | Network (Bristol, England) Vol. 14(3), pp. 501-26 |
article | URL |
Abstract: The amplitude spectra of natural scenes are typically biased in terms of the amount of content at the cardinal orientations relative to the oblique orientations. This anisotropic distribution has been related to the 'oblique effect' (the greater visual sensitivity for simple line/grating stimuli at cardinal compared to oblique orientations). However, we have recently shown that with complex visual stimuli possessing broadband spatial content (i.e. random phase noise patterns), sensitivity for detecting oriented manipulations of amplitude is best for oblique orientations, and worst for horizontal orientations (the 'horizontal effect'). Here we investigated this effect with respect to the phase spectra of natural scenes. Oriented manipulations of both amplitude and phase were made on a set of natural scene images that were dominated by naturally occurring structure at one of four orientations in order to determine whether the presence of predominant scene content, carried by the Fourier phase spectra, altered the ability to detect an oriented increment of amplitude. The horizontal effect was observed regardless of any scene's content bias. In addition, a content-dependent effect was observed which could be related to the presence of spatial structure conveyed by the phase spectra of this set of natural scenes. Results are evaluated in the context of a divisive normalization model. | |||||
BibTeX:
@article{hansen_perceptual_2003, author = {Bruce C Hansen and Edward A Essock and Yufeng Zheng and J Kevin DeFord}, title = {Perceptual anisotropies in visual processing and their relation to natural image statistics}, journal = {Network (Bristol, England)}, year = {2003}, volume = {14}, number = {3}, pages = {501--26}, note = {PMID: 12938769}, url = {http://www.ncbi.nlm.nih.gov/pubmed/12938769} } |
|||||
van Hateren, J.H. & van der Schaaf, A. | Independent component filters of natural images compared with simple cells in primary visual cortex. | 1998 | Proceedings of the Royal Society B: Biological Sciences Vol. 265(1394), pp. 359–366 |
article | URL |
Abstract: Properties of the receptive fields of simple cells in macaque cortex were compared with properties of independent component filters generated by independent component analysis (ICA) on a large set of natural images. Histograms of spatial frequency bandwidth, orientation tuning bandwidth, aspect ratio and length of the receptive fields match well. This indicates that simple cells are well tuned to the expected statistics of natural stimuli. There is no match, however, in calculated and measured distributions for the peak of the spatial frequency response: the filters produced by ICA do not vary their spatial scale as much as simple cells do, but are fixed to scales close to the finest ones allowed by the sampling lattice. Possible ways to resolve this discrepancy are discussed. | |||||
BibTeX:
@article{van_hateren_independent_1998, author = {J H van Hateren and A van der Schaaf}, title = {Independent component filters of natural images compared with simple cells in primary visual cortex.}, journal = {Proceedings of the Royal Society B: Biological Sciences}, year = {1998}, volume = {265}, number = {1394}, pages = {359–366}, note = {PMC1688904}, url = {http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1688904&rendertype=abstract} } |
|||||
Hays, J. & Efros, A.A. | Scene completion using millions of photographs | 2007 | ACM SIGGRAPH 2007 papers, pp. 4 | inproceedings | DOI URL |
Abstract: What can you do with a million images? In this paper we present a new image completion algorithm powered by a huge database of photographs gathered from the Web. The algorithm patches up holes in images by finding similar image regions in the database that are not only seamless but also semantically valid. Our chief insight is that while the space of images is effectively infinite, the space of semantically differentiable scenes is actually not that large. For many image completion tasks we are able to find similar scenes which contain image fragments that will convincingly complete the image. Our algorithm is entirely data-driven, requiring no annotations or labelling by the user. Unlike existing image completion methods, our algorithm can generate a diverse set of results for each input image and we allow users to select among them. We demonstrate the superiority of our algorithm over existing image completion approaches. | |||||
BibTeX:
@inproceedings{hays_scene_2007, author = {James Hays and Alexei A. Efros}, title = {Scene completion using millions of photographs}, booktitle = {ACM SIGGRAPH 2007 papers}, publisher = {ACM}, year = {2007}, pages = {4}, url = {http://portal.acm.org/citation.cfm?id=1275808.1276382}, doi = {http://dx.doi.org/10.1145/1275808.1276382} } |
|||||
Hoyer, P.O. & Hyvärinen, A. | A multi-layer sparse coding network learns contour coding from natural images | 2002 | Vision Research Vol. 42(12), pp. 1593-605 |
article | URL |
Abstract: An important approach in visual neuroscience considers how the function of the early visual system relates to the statistics of its natural input. Previous studies have shown how many basic properties of the primary visual cortex, such as the receptive fields of simple and complex cells and the spatial organization (topography) of the cells, can be understood as efficient coding of natural images. Here we extend the framework by considering how the responses of complex cells could be sparsely represented by a higher-order neural layer. This leads to contour coding and end-stopped receptive fields. In addition, contour integration could be interpreted as top-down inference in the presented model. | |||||
BibTeX:
@article{hoyer_multi-layer_2002, author = {Patrik O Hoyer and Aapo Hyvärinen}, title = {A multi-layer sparse coding network learns contour coding from natural images}, journal = {Vision Research}, year = {2002}, volume = {42}, number = {12}, pages = {1593--605}, note = {PMID: 12074953}, url = {http://www.ncbi.nlm.nih.gov/pubmed/12074953} } |
|||||
Hoyer, P.O. & Hyvärinen, A. | Independent component analysis applied to feature extraction from colour and stereo images | 2000 | Network (Bristol, England) Vol. 11(3), pp. 191-210 |
article | URL |
Abstract: Previous work has shown that independent component analysis (ICA) applied to feature extraction from natural image data yields features resembling Gabor functions and simple-cell receptive fields. This article considers the effects of including chromatic and stereo information. The inclusion of colour leads to features divided into separate red/green, blue/yellow, and bright/dark channels. Stereo image data, on the other hand, leads to binocular receptive fields which are tuned to various disparities. The similarities between these results and the observed properties of simple cells in the primary visual cortex are further evidence for the hypothesis that visual cortical neurons perform some type of redundancy reduction, which was one of the original motivations for ICA in the first place. In addition, ICA provides a principled method for feature extraction from colour and stereo images; such features could be used in image processing operations such as denoising and compression, as well as in pattern recognition. | |||||
BibTeX:
@article{hoyer_independent_2000, author = {P O Hoyer and A Hyvärinen}, title = {Independent component analysis applied to feature extraction from colour and stereo images}, journal = {Network (Bristol, England)}, year = {2000}, volume = {11}, number = {3}, pages = {191--210}, note = {PMID: 11014668}, url = {http://www.ncbi.nlm.nih.gov/pubmed/11014668} } |
|||||
Hsiao, W.H. & Millane, R.P. | Effects of occlusion, edges, and scaling on the power spectra of natural images | 2005 | Journal of the Optical Society of America. A, Optics, Image Science, and Vision Vol. 22(9), pp. 1789-97 |
article | URL |
Abstract: The circularly averaged power spectra of natural image ensembles tend to have a power-law dependence on spatial frequency with an exponent of approximately -2. This phenomenon has been attributed to object occlusion, the presence of edges, and scaling of object sizes (self-similarity) in natural scenes, although the relative importance of these properties is still unclear. A detailed examination of the effects of occlusion, edges, and self-similarity on the behavior of the power spectrum is conducted using a simple model of natural images. Numerical simulations show that edges and self-similarity are necessary for a power-law power spectrum over a wide range of spatial frequencies. Object occlusion is not an essential factor. A theoretical analysis for images containing nonoccluding objects supports these results. | |||||
BibTeX:
@article{hsiao_effects_2005, author = {W H Hsiao and R P Millane}, title = {Effects of occlusion, edges, and scaling on the power spectra of natural images}, journal = {Journal of the Optical Society of America. A, Optics, Image Science, and Vision}, year = {2005}, volume = {22}, number = {9}, pages = {1789--97}, note = {PMID: 16211805}, url = {http://www.ncbi.nlm.nih.gov/pubmed/16211805} } |
|||||
Huang, J. & Mumford, D. | Statistics of natural images and models [BibTeX] |
1999 | Computer Vision and Pattern Recognition, 1999. IEEE Computer Society Conference on. Vol. 1 |
inproceedings | DOI |
BibTeX:
@inproceedings{huang_statistics_1999, author = {Jinggang Huang and D. Mumford}, title = {Statistics of natural images and models}, booktitle = {Computer Vision and Pattern Recognition, 1999. IEEE Computer Society Conference on.}, year = {1999}, volume = {1}, doi = {10.1109/CVPR.1999.786990} } |
|||||
Hyvärinen, A., Gutmann, M. & Hoyer, P.O. | Statistical model of natural stimuli predicts edge-like pooling of spatial frequency channels in V2 | 2005 | BMC Neuroscience Vol. 6, pp. 12 |
article | DOI URL |
Abstract: BACKGROUND: It has been shown that the classical receptive fields of simple and complex cells in the primary visual cortex emerge from the statistical properties of natural images by forcing the cell responses to be maximally sparse or independent. We investigate how to learn features beyond the primary visual cortex from the statistical properties of modelled complex-cell outputs. In previous work, we showed that a new model, non-negative sparse coding, led to the emergence of features which code for contours of a given spatial frequency band. RESULTS: We applied ordinary independent component analysis to modelled outputs of complex cells that span different frequency bands. The analysis led to the emergence of features which pool spatially coherent across-frequency activity in the modelled primary visual cortex. Thus, the statistically optimal way of processing complex-cell outputs abandons separate frequency channels, while preserving and even enhancing orientation tuning and spatial localization. As a technical aside, we found that the non-negativity constraint is not necessary: ordinary independent component analysis produces essentially the same results as our previous work. CONCLUSION: We propose that the pooling that emerges allows the features to code for realistic low-level image features related to step edges. Further, the results prove the viability of statistical modelling of natural images as a framework that produces quantitative predictions of visual processing. | |||||
BibTeX:
@article{hyvrinen_statistical_2005, author = {Aapo Hyvärinen and Michael Gutmann and Patrik O Hoyer}, title = {Statistical model of natural stimuli predicts edge-like pooling of spatial frequency channels in V2}, journal = {BMC Neuroscience}, year = {2005}, volume = {6}, pages = {12}, note = {PMID: 15715907}, url = {http://www.ncbi.nlm.nih.gov/pubmed/15715907}, doi = {http://dx.doi.org/10.1186/1471-2202-6-12} } |
|||||
Hyvärinen, A. & Hoyer, P. | Emergence of phase- and shift-invariant features by decomposition of natural images into independent feature subspaces | 2000 | Neural Computation Vol. 12(7), pp. 1705-20 |
article | URL |
Abstract: Olshausen and Field (1996) applied the principle of independence maximization by sparse coding to extract features from natural images. This leads to the emergence of oriented linear filters that have simultaneous localization in space and in frequency, thus resembling Gabor functions and simple cell receptive fields. In this article, we show that the same principle of independence maximization can explain the emergence of phase- and shift-invariant features, similar to those found in complex cells. This new kind of emergence is obtained by maximizing the independence between norms of projections on linear subspaces (instead of the independence of simple linear filter outputs). The norms of the projections on such "independent feature subspaces" then indicate the values of invariant features. | |||||
BibTeX:
@article{hyvrinen_emergence_2000, author = {A Hyvärinen and P Hoyer}, title = {Emergence of phase- and shift-invariant features by decomposition of natural images into independent feature subspaces}, journal = {Neural Computation}, year = {2000}, volume = {12}, number = {7}, pages = {1705--20}, note = {PMID: 10935923}, url = {http://www.ncbi.nlm.nih.gov/pubmed/10935923} } |
|||||
Hyvärinen, A. & Hoyer, P.O. | A two-layer sparse coding model learns simple and complex cell receptive fields and topography from natural images | 2001 | Vision Research Vol. 41(18), pp. 2413-23 |
article | URL |
Abstract: The classical receptive fields of simple cells in the visual cortex have been shown to emerge from the statistical properties of natural images by forcing the cell responses to be maximally sparse, i.e. significantly activated only rarely. Here, we show that this single principle of sparseness can also lead to emergence of topography (columnar organization) and complex cell properties as well. These are obtained by maximizing the sparsenesses of locally pooled energies, which correspond to complex cell outputs. Thus, we obtain a highly parsimonious model of how these properties of the visual cortex are adapted to the characteristics of the natural input. | |||||
BibTeX:
@article{hyvrinen_two-layer_2001, author = {A Hyvärinen and P O Hoyer}, title = {A two-layer sparse coding model learns simple and complex cell receptive fields and topography from natural images}, journal = {Vision Research}, year = {2001}, volume = {41}, number = {18}, pages = {2413--23}, note = {PMID: 11459597}, url = {http://www.ncbi.nlm.nih.gov/pubmed/11459597} } |
|||||
Hyvärinen, A., Hoyer, P.O. & Inki, M. | Topographic independent component analysis | 2001 | Neural Computation Vol. 13(7), pp. 1527-58 |
article | DOI URL |
Abstract: In ordinary independent component analysis, the components are assumed to be completely independent, and they do not necessarily have any meaningful order relationships. In practice, however, the estimated "independent" components are often not at all independent. We propose that this residual dependence structure could be used to define a topographic order for the components. In particular, a distance between two components could be defined using their higher-order correlations, and this distance could be used to create a topographic representation. Thus, we obtain a linear decomposition into approximately independent components, where the dependence of two components is approximated by the proximity of the components in the topographic representation. | |||||
BibTeX:
@article{hyvrinen_topographic_2001, author = {A Hyvärinen and P O Hoyer and M Inki}, title = {Topographic independent component analysis}, journal = {Neural Computation}, year = {2001}, volume = {13}, number = {7}, pages = {1527--58}, note = {PMID: 11440596}, url = {http://www.ncbi.nlm.nih.gov/pubmed/11440596}, doi = {http://dx.doi.org/10.1162/089976601750264992} } |
|||||
Hyvärinen, A., Hurri, J. & Hoyer, P.O. | Natural Image Statistics — A probabilistic approach to early computational vision | 2008 | book | ||
Abstract: From the preface: This book is both an introductory textbook and a research monograph on modelling the statistical structure of natural images. In very simple terms, "natural images" are photographs of the typical environment where we live. In this book, their statistical structure is described using a number of statistical models whose parameters are estimated from image samples. Our main motivation for exploring natural image statistics is computational modelling of biological visual systems. A theoretical framework which is gaining more and more support considers the properties of the visual system to be reflections of the statistical structure of natural images, because of evolutionary adaptation processes. Another motivation for natural image statistics research is in computer science and engineering, where it helps in development of better image processing and computer vision methods. The book is targeted for advanced undergraduate students, graduate students and researchers in vision science, computational neuroscience, computer vision and image processing. It can also be read as an introduction to the area by people with a background in mathematical disciplines (mathematics, statistics, theoretical physics). Due to the multidisciplinary nature of the subject, the book has been written so as to be accessible to an audience coming from very different backgrounds such as psychology, computer science, electrical engineering, neurobiology, mathematics, statistics and physics. | |||||
BibTeX:
@book{hyvrinen_natural_2008, author = { Aapo Hyvärinen and Jarmo Hurri and Patrik O. Hoyer}, title = {Natural Image Statistics — A probabilistic approach to early computational vision}, publisher = {Springer-Verlag}, year = {2008}, edition = {11 Dec 2008 preprint} } |
|||||
Johnson, A.P. & Baker, C.L. | First- and second-order information in natural images: a filter-based approach to image statistics | 2004 | Journal of the Optical Society of America. A, Optics, Image Science, and Vision Vol. 21(6), pp. 913-25 |
article | URL |
Abstract: Previous analyses of natural image statistics have dealt mainly with their Fourier power spectra. Here we explore image statistics by examining responses to biologically motivated filters that are spatially localized and respond to first-order (luminance-defined) and second-order (contrast- or texture-defined) characteristics. We compare the distribution of natural image responses across filter parameters for first- and second-order information. We find that second-order information in natural scenes shows the same self-similarity previously described for first-order information but has substantially less orientational anisotropy. The magnitudes of the two kinds of information, as well as their mutual unsigned correlation, are much stronger for particular combinations of filter parameters in natural images but not in unstructured fractal images having the same power spectra. | |||||
BibTeX:
@article{johnson_first-_2004, author = {Aaron P Johnson and Curtis L Baker}, title = {First- and second-order information in natural images: a filter-based approach to image statistics}, journal = {Journal of the Optical Society of America. A, Optics, Image Science, and Vision}, year = {2004}, volume = {21}, number = {6}, pages = {913--25}, note = {PMID: 15191171}, url = {http://www.ncbi.nlm.nih.gov/pubmed/15191171} } |
|||||
Johnson, A.P., Kingdom, F.A.A. & Baker, C.L. | Spatiochromatic statistics of natural scenes: first- and second-order information and their correlational structure | 2005 | Journal of the Optical Society of America. A, Optics, Image Science, and Vision Vol. 22(10), pp. 2050-9 |
article | URL |
Abstract: Spatial filters that mimic receptive fields of visual cortex neurons provide an efficient representation of achromatic image structure, but the extension of this idea to chromatic information is at an early stage. Relatively few studies have looked at the statistical relationships between the modeled responses to natural scenes of the luminance (LUM), red-green (RG), and blue-yellow (BY) postreceptoral channels of the primate visual system. Here we consider the correlations among these channel responses in terms of pixel, first-order, and second-order information. First-order linear filtering was implemented by convolving the cosine-windowed images with oriented Gabor functions, whose gains were scaled to give equal amplitude response across spatial frequency to random fractal images. Second-order filtering was implemented via a filter-rectify-filter cascade, with Gabor functions for both first- and second-stage filters. Both signed and unsigned filter responses were obtained across a range of filter parameters (spatial frequency, 2-64 cycles/image; orientation, 0-135 degrees). The filter responses to the LUM channel images were larger than those for either RG or BY channel images. Cross correlations between the first-order channel responses and between the first- and second-order channel responses were measured. Results showed that the unsigned correlations between first-order channel responses were higher than expected on the basis of previous studies and that first-order channel responses were highly correlated with LUM, but not with RG or BY, second-order responses. These findings imply that coarse-scale color information correlates well with coarse-scale changes of fine-scale texture. | |||||
BibTeX:
@article{johnson_spatiochromatic_2005, author = {Aaron P Johnson and Frederick A A Kingdom and Curtis L Baker}, title = {Spatiochromatic statistics of natural scenes: first- and second-order information and their correlational structure}, journal = {Journal of the Optical Society of America. A, Optics, Image Science, and Vision}, year = {2005}, volume = {22}, number = {10}, pages = {2050--9}, note = {PMID: 16277276}, url = {http://www.ncbi.nlm.nih.gov/pubmed/16277276} } |
|||||
Jones-Smith, K. & Mathur, H. | Fractal Analysis: Revisiting Pollock's drip paintings [BibTeX] |
2006 | Nature Vol. 444(7119), pp. E9-E10 |
article | DOI URL |
BibTeX:
@article{jones-smith_fractal_2006, author = {Katherine Jones-Smith and Harsh Mathur}, title = {Fractal Analysis: Revisiting Pollock's drip paintings}, journal = {Nature}, year = {2006}, volume = {444}, number = {7119}, pages = {E9--E10}, url = {http://dx.doi.org/10.1038/nature05398}, doi = {http://dx.doi.org/10.1038/nature05398} } |
|||||
Karklin, Y. & Lewicki, M.S. | Emergence of complex cell properties by learning to generalize in natural scenes | 2009 | Nature Vol. 457(7225), pp. 83-6 |
article | DOI URL |
Abstract: A fundamental function of the visual system is to encode the building blocks of natural scenes (edges, textures and shapes) that subserve visual tasks such as object recognition and scene understanding. Essential to this process is the formation of abstract representations that generalize from specific instances of visual input. A common view holds that neurons in the early visual system signal conjunctions of image features, but how these produce invariant representations is poorly understood. Here we propose that to generalize over similar images, higher-level visual neurons encode statistical variations that characterize local image regions. We present a model in which neural activity encodes the probability distribution most consistent with a given image. Trained on natural images, the model generalizes by learning a compact set of dictionary elements for image distributions typically encountered in natural scenes. Model neurons show a diverse range of properties observed in cortical cells. These results provide a new functional explanation for nonlinear effects in complex cells and offer insight into coding strategies in primary visual cortex (V1) and higher visual areas. | |||||
BibTeX:
@article{karklin_emergence_2009, author = {Yan Karklin and Michael S Lewicki}, title = {Emergence of complex cell properties by learning to generalize in natural scenes}, journal = {Nature}, year = {2009}, volume = {457}, number = {7225}, pages = {83--6}, note = {PMID: 19020501}, url = {http://www.ncbi.nlm.nih.gov/pubmed/19020501}, doi = {http://dx.doi.org/10.1038/nature07481} } |
|||||
Karklin, Y. & Lewicki, M.S. | A hierarchical Bayesian model for learning nonlinear statistical regularities in nonstationary natural signals | 2005 | Neural Computation Vol. 17(2), pp. 397-423 |
article | DOI URL |
Abstract: Capturing statistical regularities in complex, high-dimensional data is an important problem in machine learning and signal processing. Models such as principal component analysis (PCA) and independent component analysis (ICA) make few assumptions about the structure in the data and have good scaling properties, but they are limited to representing linear statistical regularities and assume that the distribution of the data is stationary. For many natural, complex signals, the latent variables often exhibit residual dependencies as well as nonstationary statistics. Here we present a hierarchical Bayesian model that is able to capture higher-order nonlinear structure and represent nonstationary data distributions. The model is a generalization of ICA in which the basis function coefficients are no longer assumed to be independent; instead, the dependencies in their magnitudes are captured by a set of density components. Each density component describes a common pattern of deviation from the marginal density of the pattern ensemble; in different combinations, they can describe nonstationary distributions. Adapting the model to image or audio data yields a nonlinear, distributed code for higher-order statistical regularities that reflect more abstract, invariant properties of the signal. | |||||
BibTeX:
@article{karklin_hierarchical_2005, author = {Yan Karklin and Michael S Lewicki}, title = {A hierarchical Bayesian model for learning nonlinear statistical regularities in nonstationary natural signals}, journal = {Neural Computation}, year = {2005}, volume = {17}, number = {2}, pages = {397--423}, note = {PMID: 15720773}, url = {http://www.ncbi.nlm.nih.gov/pubmed/15720773}, doi = {http://dx.doi.org/10.1162/0899766053011474} } |
|||||
Karklin, Y. & Lewicki, M.S. | Learning higher-order structures in natural images | 2003 | Network (Bristol, England) Vol. 14(3), pp. 483-99 |
article | URL |
Abstract: The theoretical principles that underlie the representation and computation of higher-order structure in natural images are poorly understood. Recently, there has been considerable interest in using information theoretic techniques, such as independent component analysis, to derive representations for natural images that are optimal in the sense of coding efficiency. Although these approaches have been successful in explaining properties of neural representations in the early visual pathway and visual cortex, because they are based on a linear model, the types of image structure that can be represented are very limited. Here, we present a hierarchical probabilistic model for learning higher-order statistical regularities in natural images. This non-linear model learns an efficient code that describes variations in the underlying probabilistic density. When applied to natural images the algorithm yields coarse-coded, sparse-distributed representations of abstract image properties such as object location, scale and texture. This model offers a novel description of higher-order image structure and could provide theoretical insight into the response properties and computational functions of lower level cortical visual areas. | |||||
BibTeX:
@article{karklin_learning_2003, author = {Yan Karklin and Michael S Lewicki}, title = {Learning higher-order structures in natural images}, journal = {Network (Bristol, England)}, year = {2003}, volume = {14}, number = {3}, pages = {483--99}, note = {PMID: 12938768}, url = {http://www.ncbi.nlm.nih.gov/pubmed/12938768} } |
|||||
Kay, K.N., Naselaris, T., Prenger, R.J. & Gallant, J.L. | Identifying natural images from human brain activity [BibTeX] |
2008 | Nature Vol. 452(7185), pp. 352-355 |
article | DOI URL |
BibTeX:
@article{kay_identifying_2008, author = {Kendrick N. Kay and Thomas Naselaris and Ryan J. Prenger and Jack L. Gallant}, title = {Identifying natural images from human brain activity}, journal = {Nature}, year = {2008}, volume = {452}, number = {7185}, pages = {352--355}, url = {http://dx.doi.org/10.1038/nature06713}, doi = {http://dx.doi.org/10.1038/nature06713} } |
|||||
Kayser, C., Körding, K.P. & König, P. | Processing of complex stimuli and natural scenes in the visual cortex | 2004 | Current Opinion in Neurobiology Vol. 14(4), pp. 468-73 |
article | DOI URL |
Abstract: A major part of vision research builds on the assumption that processing of visual stimuli can be understood on the basis of knowledge about the processing of simplified, artificial stimuli. Recent experimental advances, however, show that a combination of responses to simplified stimuli does not adequately describe responses to natural visual scenes. The system's performance exceeds the performance predicted from understanding its basic constituents. This highlights the fact that the visual system is specifically adapted to the properties of its everyday input and can only fully be understood when probed with naturalistic stimuli. | |||||
BibTeX:
@article{kayser_processing_2004, author = {Christoph Kayser and Konrad P Körding and Peter König}, title = {Processing of complex stimuli and natural scenes in the visual cortex}, journal = {Current Opinion in Neurobiology}, year = {2004}, volume = {14}, number = {4}, pages = {468--73}, note = {PMID: 15302353}, url = {http://www.ncbi.nlm.nih.gov/pubmed/15302353}, doi = {http://dx.doi.org/10.1016/j.conb.2004.06.002} } |
|||||
Körding, K.P., Kayser, C., Einhäuser, W. & König, P. | How Are Complex Cell Properties Adapted to the Statistics of Natural Stimuli? | 2004 | J Neurophysiol Vol. 91(1), pp. 206-212 |
article | DOI URL |
Abstract: Sensory areas should be adapted to the properties of their natural stimuli. What are the underlying rules that match the properties of complex cells in primary visual cortex to their natural stimuli? To address this issue, we sampled movies from a camera carried by a freely moving cat, capturing the dynamics of image motion as the animal explores an outdoor environment. We use these movie sequences as input to simulated neurons. Following the intuition that many meaningful high-level variables, e.g., identities of visible objects, do not change rapidly in natural visual stimuli, we adapt the neurons to exhibit firing rates that are stable over time. We find that simulated neurons, which have optimally stable activity, display many properties that are observed for cortical complex cells. Their response is invariant with respect to stimulus translation and reversal of contrast polarity. Furthermore, spatial frequency selectivity and the aspect ratio of the receptive field quantitatively match the experimentally observed characteristics of complex cells. Hence, the population of complex cells in the primary visual cortex can be described as forming an optimally stable representation of natural stimuli. | |||||
BibTeX:
@article{kording_are_2004, author = {Konrad P. Kording and Christoph Kayser and Wolfgang Einhauser and Peter Konig}, title = {How Are Complex Cell Properties Adapted to the Statistics of Natural Stimuli?}, journal = {J Neurophysiol}, year = {2004}, volume = {91}, number = {1}, pages = {206--212}, url = {http://jn.physiology.org/cgi/content/abstract/91/1/206}, doi = {http://dx.doi.org/10.1152/jn.00149.2003} } |
|||||
Koroutchev, K. & Dorronsoro, J. | A New Information Measure for Natural Images | 2003 | Artificial Neural Nets Problem Solving Methods, pp. 1052 | inbook | URL |
Abstract: Although natural images are a very small subset of all images, the direct computation of their block densities is not possible. On the other hand, the success of some image processing methods, most particularly, fractal compression, indicates that they somehow are able to capture at least part of the natural image statistics. In this work we shall show how a concrete procedure, hash based fractal image compression, can be used to derive quite precise mean-and-variance normalized block statistics. We shall use them to define an image entropy measure and an image representation and discuss their relationship with other widely used image information measures. | |||||
BibTeX:
@inbook{koroutchev_new_2003, author = {Kostadin Koroutchev and José Dorronsoro}, title = {A New Information Measure for Natural Images}, booktitle = {Artificial Neural Nets Problem Solving Methods}, publisher = {Springer}, year = {2003}, pages = {1052}, url = {http://dx.doi.org/10.1007/3-540-44869-1_66} } |
|||||
Koroutchev, K. & Dorronsoro, J.R. | Factorization of Natural 4 × 4 Patch Distributions | 2004 | Statistical Methods in Video Processing, pp. 165-174 | inbook | URL |
Abstract: The lack of sufficient machine readable images makes impossible the direct computation of natural image 4 × 4 block statistics and one has to resort to indirect approximated methods to reduce their domain space. A natural approach to this is to collect statistics over compressed images; if the reconstruction quality is good enough, these statistics will be sufficiently representative. However, a requirement for easier statistics collection is that the method used provides a uniform representation of the compression information across all patches, something for which codebook techniques are well suited. We shall follow this approach here, using a fractal compression–inspired quantization scheme to approximate a given patch B by a triplet (D_B, μ_B, σ_B) with σ_B the patch’s contrast, μ_B its brightness and D_B a codebook approximation to the mean–variance normalization (B - μ_B)/σ_B of B. The resulting reduction of the domain space makes feasible the computation of entropy and mutual information estimates that, in turn, suggest a factorization of the approximation of p(B) ≃ p(D_B, μ_B, σ_B) as p(D_B) p(μ_B) p(σ_B) Φ(‖∇B‖), with Φ being a high contrast correction. | |||||
BibTeX:
@inbook{koroutchev_factorization_2004, author = {Kostadin Koroutchev and José R. Dorronsoro}, title = {Factorization of Natural 4 × 4 Patch Distributions}, booktitle = {Statistical Methods in Video Processing}, publisher = {Springer}, year = {2004}, pages = {165--174}, url = {http://www.springerlink.com/content/xeac54w0bpu1127x} } |
|||||
Koroutchev, K. & Dorronsoro, J.R. | Statistical Structure of Natural 4 × 4 Image Patches | 2004 | Structural, Syntactic, and Statistical Pattern Recognition, pp. 452-460 | inbook | URL |
Abstract: The direct computation of natural image block statistics is unfeasible due to the huge domain space. In this paper we shall propose a procedure to collect block statistics on compressed versions of natural 4 × 4 patches. If the reconstructed blocks are close enough to the original ones, these statistics can clearly be quite representative of the true natural patch statistics. We shall work with a fractal image compression–inspired codebook scheme, in which we will compute for each block B its contrast σ, brightness μ and a normalized codebook approximation D^B of (B - μ)/σ. Entropy and mutual information estimates suggest a first order approximation p(B) ≃ p(D^B) p(μ) p(σ) of the probability p(B) of a given natural block, while a more precise approximation can be written as p(B) ≃ p(D^B) p(μ) p(σ) Φ(‖∇B‖). We shall also study the structure of p(σ) and p(D), the more relevant probability components. The first one presents an exponential behavior for non flat patches, while p(D) behaves uniformly with respect to volume in patch space. | |||||
BibTeX:
@inbook{koroutchev_statistical_2004, author = {Kostadin Koroutchev and José R. Dorronsoro}, title = {Statistical Structure of Natural 4 × 4 Image Patches}, booktitle = {Structural, Syntactic, and Statistical Pattern Recognition}, publisher = {Springer}, year = {2004}, pages = {452--460}, url = {http://www.springerlink.com/content/mwb8333ydehx5l2v} } |
|||||
Koroutchev, K. & Dorronsoro, J.R. | Hash-Like Fractal Image Compression with Linear Execution Time | 2003 | Pattern Recognition and Image Analysis, pp. 395-402 | inbook | URL |
Abstract: The main computational cost in Fractal Image Analysis (FIC) comes from the required range-domain full block comparisons. In this work we propose a new algorithm for this comparison, in which actual full block comparison is preceded by a very fast hash-like search of those domains close to a given range block, resulting in a performance linear with respect to the number of pixels. Once the algorithm is detailed, its results will be compared against other state-of-the-art methods in FIC. | |||||
BibTeX:
@inbook{koroutchev_hash--like_2003, author = {Kostadin Koroutchev and José R. Dorronsoro}, title = {Hash--Like Fractal Image Compression with Linear Execution Time}, booktitle = {Pattern Recognition and Image Analysis}, publisher = {Springer}, year = {2003}, pages = {395--402}, url = {http://www.springerlink.com/content/dtm09nq7b3wh5xbd} } |
|||||
Langer, M.S. | Large-scale failures of f(-alpha) scaling in natural image spectra | 2000 | Journal of the Optical Society of America. A, Optics, Image Science, and Vision Vol. 17(1), pp. 28-33 |
article | URL |
Abstract: Several studies have demonstrated that the power spectra of natural image ensembles scale as f(-alpha). A stronger claim that has been made is that the power spectra of single natural images typically also scale as f(-alpha). Results are presented that challenge this latter claim. The results are based on a method for estimating large-scale structure in single images that compares aliasing artifacts produced by image windows of different shape. Failures of f(-alpha) scaling are found at large scales in many natural images. These failures cannot be accounted for by f(-alpha) scaling models such as a linear superposition model or a model based on two-dimensional occlusions in the image plane. The results imply that claims about f(-alpha) scaling in single natural images have been exaggerated. The results also offer insight into why such failures of f(-alpha) scaling occur. | |||||
BibTeX:
@article{langer_large-scale_2000, author = {M S Langer}, title = {Large-scale failures of f(-alpha) scaling in natural image spectra}, journal = {Journal of the Optical Society of America. A, Optics, Image Science, and Vision}, year = {2000}, volume = {17}, number = {1}, pages = {28--33}, note = {PMID: 10641835}, url = {http://www.ncbi.nlm.nih.gov/pubmed/10641835} } |
|||||
Laughlin, S. | A simple coding procedure enhances a neuron's information capacity. | 1981 | Z Naturforsch [C] Vol. 36(9-10), pp. 910―912 |
article | URL |
Abstract: The contrast-response function of a class of first order interneurons in the fly's compound eye approximates to the cumulative probability distribution of contrast levels in natural scenes. Elementary information theory shows that this matching enables the neurons to encode contrast fluctuations most efficiently. | |||||
BibTeX:
@article{laughlin_simple_1981, author = {S. Laughlin}, title = {A simple coding procedure enhances a neuron's information capacity.}, journal = {Z Naturforsch [C]}, year = {1981}, volume = {36}, number = {9-10}, pages = {910―912}, url = {http://redwood.berkeley.edu/w/images/2/2f/04-laughlin-zn-1981.pdf} } |
|||||
Lee, A.B., Huang, J. & Mumford, D. | Random collage model for natural images [BibTeX] |
2000 | Int’l J. of Computer Vision | article | URL |
BibTeX:
@article{lee_random_2000, author = {Ann B Lee and Jinggang Huang and David Mumford}, title = {Random collage model for natural images}, journal = {Int’l J. of Computer Vision}, year = {2000}, url = {http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29.3052} } |
|||||
Lee, A.B., Mumford, D. & Huang, J. | Occlusion Models for Natural Images: A Statistical Study of a Scale-Invariant Dead Leaves Model [BibTeX] |
2001 | International Journal of Computer Vision Vol. 41(1), pp. 35―59 |
article | URL |
BibTeX:
@article{lee_occlusion_2001, author = {A. B Lee and D. Mumford and J. Huang}, title = {Occlusion Models for Natural Images: A Statistical Study of a Scale-Invariant Dead Leaves Model}, journal = {International Journal of Computer Vision}, year = {2001}, volume = {41}, number = {1}, pages = {35―59}, url = {http://www.springerlink.com/content/q554t2v606p21649/fulltext.pdf} } |
|||||
Lee, A.B., Pedersen, K.S. & Mumford, D. | The complex statistics of high contrast patches in natural images [BibTeX] |
2001 | IEEE Workshop on Statistical and Computational Theories of Vision, Vancouver, Canada | inproceedings | |
BibTeX:
@inproceedings{lee_complex_2001, author = {A. B. Lee and K. S. Pedersen and D. Mumford}, title = {The complex statistics of high contrast patches in natural images}, booktitle = {IEEE Workshop on Statistical and Computational Theories of Vision, Vancouver, Canada}, year = {2001} } |
|||||
Lee, T.-W., Wachtler, T. & Sejnowski, T.J. | Color opponency is an efficient representation of spectral properties in natural scenes | 2002 | Vision Research Vol. 42(17), pp. 2095-103 |
article | URL |
Abstract: The human visual system encodes the chromatic signals conveyed by the three types of retinal cone photoreceptors in an opponent fashion. This opponency is thought to reduce redundant information by decorrelating the photoreceptor signals. Correlations in the receptor signals are caused by the substantial overlap of the spectral sensitivities of the receptors, but it is not clear to what extent the properties of natural spectra contribute to the correlations. To investigate the influences of natural spectra and photoreceptor spectral sensitivities, we attempted to find linear codes with minimal redundancy for trichromatic images assuming human cone spectral sensitivities, or hypothetical non-overlapping cone sensitivities, respectively. The resulting properties of basis functions are similar in both cases. They are non-orthogonal, show strong opponency along an achromatic direction (luminance edges) and along chromatic directions, and they achieve a highly efficient encoding of natural chromatic signals. Thus, color opponency arises for the encoding of human cone signals, i.e. with strongly overlapping spectral sensitivities, but also under the assumption of non-overlapping spectral sensitivities. Our results suggest that color opponency may in part be a result of the properties of natural spectra and not solely a consequence of the cone spectral sensitivities. | |||||
BibTeX:
@article{lee_color_2002, author = {Te-Won Lee and Thomas Wachtler and Terrence J Sejnowski}, title = {Color opponency is an efficient representation of spectral properties in natural scenes}, journal = {Vision Research}, year = {2002}, volume = {42}, number = {17}, pages = {2095--103}, note = {PMID: 12169429}, url = {http://www.ncbi.nlm.nih.gov/pubmed/12169429} } |
|||||
Long, F., Yang, Z. & Purves, D. | Spectral statistics in natural scenes predict hue, saturation, and brightness | 2006 | Proceedings of the National Academy of Sciences of the United States of America Vol. 103(15), pp. 6013-8 |
article | DOI URL |
Abstract: The perceptual color qualities of hue, saturation, and brightness do not correspond in any simple way to the physical characteristics of retinal stimuli, a fact that poses a major obstacle for any explanation of color vision. Here we test the hypothesis that these basic color attributes are determined by the statistical covariations in the spectral stimuli that humans have always experienced in typical visual environments. Using a database of 1,600 natural images, we analyzed the joint probability distributions of the physical variables most relevant to each of these perceptual qualities. The cumulative density functions derived from these distributions predict the major colorimetric functions that have been reported in psychophysical experiments over the last century. | |||||
BibTeX:
@article{long_spectral_2006, author = {Fuhui Long and Zhiyong Yang and Dale Purves}, title = {Spectral statistics in natural scenes predict hue, saturation, and brightness}, journal = {Proceedings of the National Academy of Sciences of the United States of America}, year = {2006}, volume = {103}, number = {15}, pages = {6013--8}, note = {PMID: 16595630}, url = {http://www.ncbi.nlm.nih.gov/pubmed/16595630}, doi = {http://dx.doi.org/10.1073/pnas.0600890103} } |
|||||
Mante, V., Frazor, R.A., Bonin, V., Geisler, W.S. & Carandini, M. | Independence of luminance and contrast in natural scenes and in the early visual system | 2005 | Nature Neuroscience Vol. 8(12), pp. 1690-7 |
article | DOI URL |
Abstract: The early visual system is endowed with adaptive mechanisms that rapidly adjust gain and integration time based on the local luminance (mean intensity) and contrast (standard deviation of intensity relative to the mean). Here we show that these mechanisms are matched to the statistics of the environment. First, we measured the joint distribution of luminance and contrast in patches selected from natural images and found that luminance and contrast were statistically independent of each other. This independence did not hold for artificial images with matched spectral characteristics. Second, we characterized the effects of the adaptive mechanisms in lateral geniculate nucleus (LGN), the direct recipient of retinal outputs. We found that luminance gain control had the same effect at all contrasts and that contrast gain control had the same effect at all mean luminances. Thus, the adaptive mechanisms for luminance and contrast operate independently, reflecting the very independence encountered in natural images. | |||||
BibTeX:
@article{mante_independence_2005, author = {Valerio Mante and Robert A Frazor and Vincent Bonin and Wilson S Geisler and Matteo Carandini}, title = {Independence of luminance and contrast in natural scenes and in the early visual system}, journal = {Nature Neuroscience}, year = {2005}, volume = {8}, number = {12}, pages = {1690--7}, note = {PMID: 16286933}, url = {http://www.ncbi.nlm.nih.gov/pubmed/16286933}, doi = {http://dx.doi.org/10.1038/nn1556} } |
|||||
Marr, D. | Vision : a computational investigation into the human representation and processing of visual information [BibTeX] |
1982 | book | ||
BibTeX:
@book{marr_vision_1982, author = {David Marr}, title = {Vision : a computational investigation into the human representation and processing of visual information}, publisher = {W.H. Freeman}, year = {1982} } |
|||||
Miyawaki, Y., Uchida, H., Yamashita, O., Sato, M., Morito, Y., Tanabe, H.C., Sadato, N. & Kamitani, Y. | Visual Image Reconstruction from Human Brain Activity using a Combination of Multiscale Local Image Decoders | 2008 | Neuron Vol. 60(5), pp. 915-929 |
article | DOI URL |
Abstract: Perceptual experience consists of an enormous number of possible states. Previous fMRI studies have predicted a perceptual state by classifying brain activity into prespecified categories. Constraint-free visual image reconstruction is more challenging, as it is impractical to specify brain activity for all possible images. In this study, we reconstructed visual images by combining local image bases of multiple scales, whose contrasts were independently decoded from fMRI activity by automatically selecting relevant voxels and exploiting their correlated patterns. Binary-contrast, 10 × 10-patch images (2^100 possible states) were accurately reconstructed without any image prior on a single trial or volume basis by measuring brain activity only for several hundred random images. Reconstruction was also used to identify the presented image among millions of candidates. The results suggest that our approach provides an effective means to read out complex perceptual states from brain activity while discovering information representation in multivoxel patterns. | |||||
BibTeX:
@article{miyawaki_visual_2008, author = {Yoichi Miyawaki and Hajime Uchida and Okito Yamashita and Masa-aki Sato and Yusuke Morito and Hiroki C. Tanabe and Norihiro Sadato and Yukiyasu Kamitani}, title = {Visual Image Reconstruction from Human Brain Activity using a Combination of Multiscale Local Image Decoders}, journal = {Neuron}, year = {2008}, volume = {60}, number = {5}, pages = {915--929}, url = {http://www.sciencedirect.com/science/article/B6WSS-4V4113M-P/2/7090c83d0a4ceb1d68dd47806653ec43}, doi = {http://dx.doi.org/10.1016/j.neuron.2008.11.004} } |
|||||
Motoyoshi, I., Nishida, S., Sharan, L. & Adelson, E.H. | Image statistics and the perception of surface qualities [BibTeX] |
2007 | Nature Vol. advance online publication |
article | DOI |
BibTeX:
@article{motoyoshi_image_2007, author = {Isamu Motoyoshi and Shin'ya Nishida and Lavanya Sharan and Edward H Adelson}, title = {Image statistics and the perception of surface qualities}, journal = {Nature}, year = {2007}, volume = {advance online publication}, doi = {http://dx.doi.org/10.1038/nature05724} } |
|||||
Mumford, D. & Gidas, B. | Stochastic models for generic images [BibTeX] |
2001 | Quarterly of Applied Mathematics Vol. 59(1), pp. 85―111 |
article | URL |
BibTeX:
@article{mumford_stochastic_2001, author = {D. Mumford and B. Gidas}, title = {Stochastic models for generic images}, journal = {Quarterly of Applied Mathematics}, year = {2001}, volume = {59}, number = {1}, pages = {85―111}, url = {http://www.dam.brown.edu/people/mumford/Papers/Generic5.pdf} } |
|||||
Nevado, A., Turiel, A. & Parga, N. | Scene dependence of the non-Gaussian scaling properties of natural images | 2000 | Network (Bristol, England) Vol. 11(2), pp. 131-52 |
article | URL |
Abstract: We report results on the scaling properties of changes in contrast of natural images in different visual environments. This study confirms the existence, in a vast class of images, of a multiplicative process relating the variations in contrast seen at two different scales, as was found in Turiel et al (Turiel A, Mato G, Parga N and Nadal J-P 1998 Self-Similarity Properties of Natural Images: Proc. NIPS'97 (Cambridge, MA: MIT Press), Turiel A, Mato G, Parga N and Nadal J-P 1998 Phys. Rev. Lett. 80 1098-101). But it also shows that the scaling exponents are not universal: even if most images follow the same type of statistics, they do it with different values of the distribution parameters. Motivated by these results, we also present the analysis of a generative model of images that reproduces those properties and that has the correct power spectrum. Possible implications for visual processing are also discussed. | |||||
BibTeX:
@article{nevado_scene_2000, author = {A Nevado and A Turiel and N Parga}, title = {Scene dependence of the non-Gaussian scaling properties of natural images}, journal = {Network (Bristol, England)}, year = {2000}, volume = {11}, number = {2}, pages = {131--52}, note = {PMID: 10880003}, url = {http://www.ncbi.nlm.nih.gov/pubmed/10880003} } |
|||||
Olshausen, B.A. | Learning sparse, overcomplete representations of time-varying natural images | 2003 | Image Processing, 2003. ICIP 2003. Proceedings. 2003 International Conference on Vol. 1, pp. I-41--I-44 |
inproceedings | DOI |
Abstract: I show how to adapt an overcomplete dictionary of space-time functions so as to represent time-varying natural images with maximum sparsity. The basis functions are considered as part of a probabilistic model of image sequences, with a sparse prior imposed over the coefficients. Learning is accomplished by maximizing the log-likelihood of the model, using natural movies as training data. The basis functions that emerge are space-time inseparable functions that resemble the motion-selective receptive fields of simple-cells in mammalian visual cortex. When the coefficients are computed via matching-pursuit in space and time, one obtains a punctuate, spike-like representation of continuous time-varying images. It is suggested that such a coding scheme may be at work in the visual cortex. | |||||
BibTeX:
@inproceedings{olshausen_learning_2003, author = {B. A Olshausen}, title = {Learning sparse, overcomplete representations of time-varying natural images}, booktitle = {Image Processing, 2003. ICIP 2003. Proceedings. 2003 International Conference on}, year = {2003}, volume = {1}, pages = {I-41--I-44}, doi = {{10.1109/ICIP.2003.1246893}} } |
|||||
Olshausen, B.A. & Field, D.J. | Sparse coding of sensory inputs. | 2004 | Curr Opin Neurobiol Vol. 14(4), pp. 481―487 |
article | DOI URL |
Abstract: Several theoretical, computational, and experimental studies suggest that neurons encode sensory information using a small number of active neurons at any given point in time. This strategy, referred to as 'sparse coding', could possibly confer several advantages. First, it allows for increased storage capacity in associative memories; second, it makes the structure in natural signals explicit; third, it represents complex data in a way that is easier to read out at subsequent levels of processing; and fourth, it saves energy. Recent physiological recordings from sensory neurons have indicated that sparse coding could be a ubiquitous strategy employed in several different modalities across different organisms. | |||||
BibTeX:
@article{olshausen_sparse_2004, author = {Bruno A Olshausen and David J Field}, title = {Sparse coding of sensory inputs.}, journal = {Curr Opin Neurobiol}, year = {2004}, volume = {14}, number = {4}, pages = {481―487}, url = {http://dx.doi.org/10.1016/j.conb.2004.07.007}, doi = {http://dx.doi.org/10.1016/j.conb.2004.07.007} } |
|||||
Olshausen, B.A. & Field, D.J. | Sparse coding with an overcomplete basis set: a strategy employed by V1? | 1997 | Vision Res Vol. 37(23), pp. 3311―3325 |
article | DOI URL |
Abstract: The spatial receptive fields of simple cells in mammalian striate cortex have been reasonably well described physiologically and can be characterized as being localized, oriented, and bandpass, comparable with the basis functions of wavelet transforms. Previously, we have shown that these receptive field properties may be accounted for in terms of a strategy for producing a sparse distribution of output activity in response to natural images. Here, in addition to describing this work in a more expansive fashion, we examine the neurobiological implications of sparse coding. Of particular interest is the case when the code is overcomplete―i.e., when the number of code elements is greater than the effective dimensionality of the input space. Because the basis functions are non-orthogonal and not linearly independent of each other, sparsifying the code will recruit only those basis functions necessary for representing a given input, and so the input-output function will deviate from being purely linear. These deviations from linearity provide a potential explanation for the weak forms of non-linearity observed in the response properties of cortical simple cells, and they further make predictions about the expected interactions among units in response to naturalistic stimuli. | |||||
BibTeX:
@article{olshausen_sparse_1997, author = {B. A. Olshausen and D. J. Field}, title = {Sparse coding with an overcomplete basis set: a strategy employed by V1?}, journal = {Vision Res}, year = {1997}, volume = {37}, number = {23}, pages = {3311―3325}, url = {http://dx.doi.org/10.1016/S0042-6989(97)00169-7}, doi = {{10.1016/S0042-6989(97)00169-7}} } |
|||||
Olshausen, B.A. & Field, D.J. | Emergence of simple-cell receptive field properties by learning a sparse code for natural images. | 1996 | Nature Vol. 381(6583), pp. 607―609 |
article | DOI URL |
Abstract: The receptive fields of simple cells in mammalian primary visual cortex can be characterized as being spatially localized, oriented and bandpass (selective to structure at different spatial scales), comparable to the basis functions of wavelet transforms. One approach to understanding such response properties of visual neurons has been to consider their relationship to the statistical structure of natural images in terms of efficient coding. Along these lines, a number of studies have attempted to train unsupervised learning algorithms on natural images in the hope of developing receptive fields with similar properties, but none has succeeded in producing a full set that spans the image space and contains all three of the above properties. Here we investigate the proposal that a coding strategy that maximizes sparseness is sufficient to account for these properties. We show that a learning algorithm that attempts to find sparse linear codes for natural scenes will develop a complete family of localized, oriented, bandpass receptive fields, similar to those found in the primary visual cortex. The resulting sparse image code provides a more efficient representation for later stages of processing because it possesses a higher degree of statistical independence among its outputs. | |||||
BibTeX:
@article{olshausen_emergence_1996, author = {B. A. Olshausen and D. J. Field}, title = {Emergence of simple-cell receptive field properties by learning a sparse code for natural images.}, journal = {Nature}, year = {1996}, volume = {381}, number = {6583}, pages = {607―609}, url = {http://dx.doi.org/10.1038/381607a0}, doi = {http://dx.doi.org/10.1038/381607a0} } |
|||||
Portilla, J. & Simoncelli, E.P. | A Parametric Texture Model Based on Joint Statistics of Complex Wavelet Coefficients [BibTeX] |
2000 | International Journal of Computer Vision Vol. 40(1), pp. 49―70 |
article | URL |
BibTeX:
@article{portilla_parametric_2000, author = {J. Portilla and E. P Simoncelli}, title = {A Parametric Texture Model Based on Joint Statistics of Complex Wavelet Coefficients}, journal = {International Journal of Computer Vision}, year = {2000}, volume = {40}, number = {1}, pages = {49―70}, url = {http://www.springerlink.com/content/r244h74572250895/fulltext.pdf} } |
|||||
Párraga, C.A., Brelstaff, G., Troscianko, T. & Moorehead, I.R. | Color and luminance information in natural scenes | 1998 | Journal of the Optical Society of America A Vol. 15(3), pp. 563-569 |
article | DOI URL |
Abstract: The spatial filtering applied by the human visual system appears to be low pass for chromatic stimuli and band pass for luminance stimuli. Here we explore whether this observed difference in contrast sensitivity reflects a real difference in the components of chrominance and luminance in natural scenes. For this purpose a digital set of 29 hyperspectral images of natural scenes was acquired and its spatial frequency content analyzed in terms of chrominance and luminance defined according to existing models of the human cone responses and visual signal processing. The statistical 1/f amplitude spatial-frequency distribution is confirmed for a variety of chromatic conditions across the visible spectrum. Our analysis suggests that natural scenes are relatively rich in high-spatial-frequency chrominance information that does not appear to be transmitted by the human visual system. This result is unlikely to have arisen from errors in the original measurements. Several reasons may combine to explain a failure to transmit high-spatial-frequency chrominance: (a) its minor importance for primate visual tasks, (b) its removal by filtering applied to compensate for chromatic aberration of the eye’s optics, and (c) a biological bottleneck blocking its transmission. In addition, we graphically compare the ratios of luminance to chrominance measured by our hyperspectral camera and those measured psychophysically over an equivalent spatial-frequency range. | |||||
BibTeX:
@article{prraga_color_1998, author = {C. A. Párraga and G. Brelstaff and T. Troscianko and I. R. Moorehead}, title = {Color and luminance information in natural scenes}, journal = {Journal of the Optical Society of America A}, year = {1998}, volume = {15}, number = {3}, pages = {563--569}, url = {http://josaa.osa.org/abstract.cfm?URI=josaa-15-3-563}, doi = {{10.1364/JOSAA.15.000563}} } |
|||||
Rao, R.P. & Ballard, D.H. | Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. | 1999 | Nat Neurosci Vol. 2(1), pp. 79―87 |
article | DOI URL |
Abstract: We describe a model of visual processing in which feedback connections from a higher- to a lower-order visual cortical area carry predictions of lower-level neural activities, whereas the feedforward connections carry the residual errors between the predictions and the actual lower-level activities. When exposed to natural images, a hierarchical network of model neurons implementing such a model developed simple-cell-like receptive fields. A subset of neurons responsible for carrying the residual errors showed endstopping and other extra-classical receptive-field effects. These results suggest that rather than being exclusively feedforward phenomena, nonclassical surround effects in the visual cortex may also result from cortico-cortical feedback as a consequence of the visual system using an efficient hierarchical strategy for encoding natural images. | |||||
BibTeX:
@article{rao_predictive_1999, author = {R. P. Rao and D. H. Ballard}, title = {Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects.}, journal = {Nat Neurosci}, year = {1999}, volume = {2}, number = {1}, pages = {79―87}, url = {http://dx.doi.org/10.1038/4580}, doi = {http://dx.doi.org/10.1038/4580} } |
|||||
Redies, C. | A universal model of esthetic perception based on the sensory coding of natural stimuli | 2007 | Spatial Vision Vol. 21(1-2), pp. 97-117 |
article | DOI URL |
Abstract: Philosophers have pointed out that there is a close relation between the esthetics of art and the beauty of natural scenes. Supporting this similarity at the experimental level, we have recently shown that visual art and natural scenes share fractal-like, scale-invariant statistical properties. Moreover, evidence from neurophysiological experiments shows that the visual system uses an efficient (sparse) code to process optimally the statistical properties of natural stimuli. In the present work, a hypothetical model of esthetic perception is described that combines both lines of evidence. Specifically, it is proposed that an artist creates a work of art so that it induces a specific resonant state in the visual system. This resonant state is thought to be based on the adaptation of the visual system to natural scenes. The proposed model is universal and predicts that all human beings share the same general concept of esthetic judgment. The model implies that esthetic perception, like the coding of natural stimuli, depends on stimulus form rather than content, depends on higher-order statistics of the stimuli, and is non-intuitive to cognitive introspection. The model accommodates the central tenet of neuroesthetic theory that esthetic perception reflects fundamental functional properties of the nervous system. | |||||
BibTeX:
@article{redies_universal_2007, author = {Christoph Redies}, title = {A universal model of esthetic perception based on the sensory coding of natural stimuli}, journal = {Spatial Vision}, year = {2007}, volume = {21}, number = {1-2}, pages = {97--117}, note = {PMID: 18073053}, url = {http://www.ncbi.nlm.nih.gov/pubmed/18073053}, doi = {http://dx.doi.org/10.1163/156856807782753886} } |
|||||
Redies, C., Hasenstein, J. & Denzler, J. | Fractal-like image statistics in visual art: similarity to natural scenes | 2007 | Spatial Vision Vol. 21(1-2), pp. 137-48 |
article | DOI URL |
Abstract: Both natural scenes and visual art are often perceived as esthetically pleasing. It is therefore conceivable that the two types of visual stimuli share statistical properties. For example, natural scenes display a Fourier power spectrum that tends to fall with spatial frequency according to a power-law. This result indicates that natural scenes have fractal-like, scale-invariant properties. In the present study, we asked whether visual art displays similar statistical properties by measuring their Fourier power spectra. Our analysis was restricted to graphic art from the Western hemisphere. For comparison, we also analyzed images, which generally display relatively low or no esthetic quality (household and laboratory objects, parts of plants, and scientific illustrations). Graphic art, but not the other image categories, resembles natural scenes in showing fractal-like, scale-invariant statistics. This property is universal in our sample of graphic art; it is independent of cultural variables, such as century and country of origin, techniques used or subject matter. We speculate that both graphic art and natural scenes share statistical properties because visual art is adapted to the structure of the visual system which, in turn, is adapted to process optimally the image statistics of natural scenes. | |||||
BibTeX:
@article{redies_fractal-like_2007, author = {Christoph Redies and Jens Hasenstein and Joachim Denzler}, title = {Fractal-like image statistics in visual art: similarity to natural scenes}, journal = {Spatial Vision}, year = {2007}, volume = {21}, number = {1-2}, pages = {137--48}, note = {PMID: 18073055}, url = {http://www.ncbi.nlm.nih.gov/pubmed/18073055}, doi = {http://dx.doi.org/10.1163/156856807782753921} } |
|||||
Redies, C., Hänisch, J., Blickhan, M. & Denzler, J. | Artists portray human faces with the Fourier statistics of complex natural scenes | 2007 | Network (Bristol, England) Vol. 18(3), pp. 235-48 |
article | DOI URL |
Abstract: When artists portray human faces, they generally endow their portraits with properties that render the faces esthetically more pleasing. To obtain insight into the changes introduced by artists, we compared Fourier power spectra in photographs of faces and in portraits by artists. Our analysis was restricted to a large set of monochrome or lightly colored portraits from various Western cultures and revealed a paradoxical result. Although face photographs are not scale-invariant, artists draw human faces with statistical properties that deviate from the face photographs and approximate the scale-invariant, fractal-like properties of complex natural scenes. This result cannot be explained by systematic differences in the complexity of patterns surrounding the faces or by reproduction artifacts. In particular, a moderate change in gamma gradation has little influence on the results. Moreover, the scale-invariant rendering of faces in artists' portraits was found to be independent of cultural variables, such as century of origin or artistic techniques. We suggest that artists have implicit knowledge of image statistics and prefer natural scene statistics (or some other rules associated with them) in their creations. Fractal-like statistics have been demonstrated previously in other forms of visual art and may be a general attribute of esthetic visual stimuli. | |||||
BibTeX:
@article{redies_artists_2007, author = {Christoph Redies and Jan Hänisch and Marko Blickhan and Joachim Denzler}, title = {Artists portray human faces with the Fourier statistics of complex natural scenes}, journal = {Network (Bristol, England)}, year = {2007}, volume = {18}, number = {3}, pages = {235--48}, note = {PMID: 17852751}, url = {http://www.ncbi.nlm.nih.gov/pubmed/17852751}, doi = {http://dx.doi.org/10.1080/09548980701574496} } |
|||||
Redlich, A.N. | Redundancy reduction as a strategy for unsupervised learning [BibTeX] |
1993 | Neural Computation Vol. 5(2), pp. 289―304 |
article | URL |
BibTeX:
@article{redlich_redundancy_1993, author = {A. N Redlich}, title = {Redundancy reduction as a strategy for unsupervised learning}, journal = {Neural Computation}, year = {1993}, volume = {5}, number = {2}, pages = {289―304}, url = {http://redwood.berkeley.edu/w/images/5/5a/03-redlich-nc-1993.pdf} } |
|||||
Ruderman, D. | The statistics of natural images | 1994 | Network: Computation in Neural Systems Vol. 5, pp. 517-548 |
article | DOI URL |
Abstract: Recently there has been a resurgence of interest in the properties of natural images. Their statistics are important not only in image compression but also for the study of sensory processing in biology, which can be viewed as satisfying certain 'design criteria'. This review summarizes previous work on image statistics and presents our own data. Perhaps the most notable property of natural images is an invariance to scale. We present data to support this claim as well as evidence for a hierarchical invariance in natural scenes. These symmetries provide a powerful description of natural images as they greatly restrict the class of allowed distributions. | |||||
BibTeX:
@article{ruderman_statistics_1994-1, author = {Daniel Ruderman}, title = {The statistics of natural images}, journal = {Network: Computation in Neural Systems}, year = {1994}, volume = {5}, pages = {517--548}, url = {http://www.ingentaconnect.com/content/tandf/network/1994/00000005/00000004/art00006}, doi = {{10.1088/0954-898X/5/4/006}} } |
|||||
Ruderman, D.L. | Origins of scaling in natural images. | 1997 | Vision Res Vol. 37(23), pp. 3385―3398 |
article | DOI URL |
Abstract: One of the most robust qualities of our visual world is the scale invariance of natural images. Not only has scaling been found in different visual environments, but the phenomenon also appears to be calibration-independent. This paper proposes a simple property of natural images which explains this robustness: they are collages of regions corresponding to statistically independent "objects". Evidence is provided for these objects having a power-law distribution of sizes within images, from which follows scaling in natural images. It is commonly suggested that scaling instead results from edges, each with power spectrum 1/k^2. This hypothesis is refuted by example. | |||||
BibTeX:
@article{ruderman_origins_1997, author = {D. L. Ruderman}, title = {Origins of scaling in natural images.}, journal = {Vision Res}, year = {1997}, volume = {37}, number = {23}, pages = {3385―3398}, url = {http://redwood.berkeley.edu/bruno/npb261b/ruderman97.pdf}, doi = {{http://dx.doi.org/10.1016/S0042-6989(97)00008-4}} } |
|||||
Ruderman, D.L., Cronin, T.W. & Chiao, C.C. | Statistics of cone responses to natural images: implications for visual coding [BibTeX] |
1998 | Journal of the Optical Society of America A Vol. 15(8), pp. 2036―2045 |
article | URL |
BibTeX:
@article{ruderman_statistics_1998, author = {D. L Ruderman and T. W Cronin and C. C Chiao}, title = {Statistics of cone responses to natural images: implications for visual coding}, journal = {Journal of the Optical Society of America A}, year = {1998}, volume = {15}, number = {8}, pages = {2036―2045}, url = {http://redwood.berkeley.edu/w/images/b/b9/17-ruderman-josa-1998.pdf} } |
|||||
Ruderman, DL. & Bialek, W. | Statistics of natural images: Scaling in the woods. [BibTeX] |
1994 | Physical review letters Vol. 73(6), pp. 814-817 |
article | URL |
BibTeX:
@article{ruderman_statistics_1994, author = {DL Ruderman and W Bialek}, title = {Statistics of natural images: Scaling in the woods.}, journal = {Physical review letters}, year = {1994}, volume = {73}, number = {6}, pages = {814--817}, url = {http://view.ncbi.nlm.nih.gov/pubmed/10057546} } |
|||||
Schreiber, E. & Griffiths, T.L. | Subjective randomness and natural scene statistics. [BibTeX] |
2007 | Proceedings of the Twenty-Ninth Annual Conference of the Cognitive Science Society | article | |
BibTeX:
@article{schreiber_subjective_2007, author = {E. Schreiber and T. L. Griffiths}, title = {Subjective randomness and natural scene statistics.}, journal = {Proceedings of the Twenty-Ninth Annual Conference of the Cognitive Science Society}, year = {2007} } |
|||||
Schwartz, O. & Simoncelli, E.P. | Natural signal statistics and sensory gain control. | 2001 | Nat Neurosci Vol. 4(8), pp. 819―825 |
article | DOI URL |
Abstract: We describe a form of nonlinear decomposition that is well-suited for efficient encoding of natural signals. Signals are initially decomposed using a bank of linear filters. Each filter response is then rectified and divided by a weighted sum of rectified responses of neighboring filters. We show that this decomposition, with parameters optimized for the statistics of a generic ensemble of natural images or sounds, provides a good characterization of the nonlinear response properties of typical neurons in primary visual cortex or auditory nerve, respectively. These results suggest that nonlinear response properties of sensory neurons are not an accident of biological implementation, but have an important functional role. | |||||
BibTeX:
@article{schwartz_natural_2001, author = {O. Schwartz and E. P. Simoncelli}, title = {Natural signal statistics and sensory gain control.}, journal = {Nat Neurosci}, year = {2001}, volume = {4}, number = {8}, pages = {819―825}, url = {http://dx.doi.org/10.1038/90526}, doi = {http://dx.doi.org/10.1038/90526} } |
|||||
Sharpee, T., Rust, N.C. & Bialek, W. | Analyzing neural responses to natural signals: maximally informative dimensions. | 2004 | Neural Comput Vol. 16(2), pp. 223―250 |
article | DOI URL |
Abstract: We propose a method that allows for a rigorous statistical analysis of neural responses to natural stimuli that are nongaussian and exhibit strong correlations. We have in mind a model in which neurons are selective for a small number of stimulus dimensions out of a high-dimensional stimulus space, but within this subspace the responses can be arbitrarily nonlinear. Existing analysis methods are based on correlation functions between stimuli and responses, but these methods are guaranteed to work only in the case of gaussian stimulus ensembles. As an alternative to correlation functions, we maximize the mutual information between the neural responses and projections of the stimulus onto low-dimensional subspaces. The procedure can be done iteratively by increasing the dimensionality of this subspace. Those dimensions that allow the recovery of all of the information between spikes and the full unprojected stimuli describe the relevant subspace. If the dimensionality of the relevant subspace indeed is small, it becomes feasible to map the neuron's input-output function even under fully natural stimulus conditions. These ideas are illustrated in simulations on model visual and auditory neurons responding to natural scenes and sounds, respectively. | |||||
BibTeX:
@article{sharpee_analyzing_2004, author = {Tatyana Sharpee and Nicole C Rust and William Bialek}, title = {Analyzing neural responses to natural signals: maximally informative dimensions.}, journal = {Neural Comput}, year = {2004}, volume = {16}, number = {2}, pages = {223―250}, url = {http://dx.doi.org/10.1162/089976604322742010}, doi = {http://dx.doi.org/10.1162/089976604322742010} } |
|||||
Sharpee, T.O., Miller, K.D. & Stryker, M.P. | On the Importance of Static Nonlinearity in Estimating Spatiotemporal Neural Filters With Natural Stimuli | 2008 | J Neurophysiol Vol. 99(5), pp. 2496-2509 |
article | DOI URL |
Abstract: Understanding neural responses with natural stimuli has increasingly become an essential part of characterizing neural coding. Neural responses are commonly characterized by a linear-nonlinear (LN) model, in which the output of a linear filter applied to the stimulus is transformed by a static nonlinearity to determine neural response. To estimate the linear filter in the LN model, studies of responses to natural stimuli commonly use methods that are unbiased only for a linear model (in which there is no static nonlinearity): spike-triggered averages with correction for stimulus power spectrum, with or without regularization. Although these methods work well for artificial stimuli, such as Gaussian white noise, we show here that they estimate neural filters of LN models from responses to natural stimuli much more poorly. We studied simple cells in cat primary visual cortex. We demonstrate that the filters computed by directly taking the nonlinearity into account have better predictive power and depend less on the stimulus than those computed under the linear model. With noise stimuli, filters computed using the linear and LN models were similar, as predicted theoretically. With natural stimuli, filters of the two models can differ profoundly. Noise and natural stimulus filters differed significantly in spatial properties, but these differences were exaggerated when filters were computed using the linear rather than the LN model. Although regularization of filters computed under the linear model improved their predictive power, it also led to systematic distortions of their spatial frequency profiles, especially at low spatial and temporal frequencies. | |||||
BibTeX:
@article{sharpee_importance_2008, author = {Tatyana O. Sharpee and Kenneth D. Miller and Michael P. Stryker}, title = {On the Importance of Static Nonlinearity in Estimating Spatiotemporal Neural Filters With Natural Stimuli}, journal = {J Neurophysiol}, year = {2008}, volume = {99}, number = {5}, pages = {2496--2509}, url = {http://jn.physiology.org/cgi/content/abstract/99/5/2496}, doi = {http://dx.doi.org/10.1152/jn.01397.2007} } |
|||||
Sigman, M., Cecchi, GA., Gilbert, CD. & Magnasco, MO. | On a common circle: natural scenes and Gestalt rules. | 2001 | Proc Natl Acad Sci U S A Vol. 98(4), pp. 1935-1940 |
article | URL |
Abstract: To understand how the human visual system analyzes images, it is essential to know the structure of the visual environment. In particular, natural images display consistent statistical properties that distinguish them from random luminance distributions. We have studied the geometric regularities of oriented elements (edges or line segments) present in an ensemble of visual scenes, asking how much information the presence of a segment in a particular location of the visual scene carries about the presence of a second segment at different relative positions and orientations. We observed strong long-range correlations in the distribution of oriented segments that extend over the whole visual field. We further show that a very simple geometric rule, cocircularity, predicts the arrangement of segments in natural scenes, and that different geometrical arrangements show relevant differences in their scaling properties. Our results show similarities to geometric features of previous physiological and psychophysical studies. We discuss the implications of these findings for theories of early vision. | |||||
BibTeX:
@article{sigman_common_2001, author = {M Sigman and GA Cecchi and CD Gilbert and MO Magnasco}, title = {On a common circle: natural scenes and Gestalt rules.}, journal = {Proc Natl Acad Sci U S A}, year = {2001}, volume = {98}, number = {4}, pages = {1935--1940}, url = {http://dx.doi.org/10.1073/pnas.031571498} } |
|||||
Simoncelli, E.P. | Vision and the statistics of the visual environment | 2003 | Current Opinion in Neurobiology Vol. 13(2), pp. 144-9 |
article | URL |
Abstract: It is widely believed that visual systems are optimized for the visual properties of the environment inhabited by the organism. A specific instance of this principle is known as the Efficient Coding Hypothesis, which holds that the purpose of early visual processing is to produce an efficient representation of the incoming visual signal. The theory provides a quantitative link between the statistical properties of the world and the structure of the visual system. As such, specific instances of this theory have been tested experimentally, and have been used to motivate and constrain models for early visual processing. | |||||
BibTeX:
@article{simoncelli_vision_2003, author = {Eero P Simoncelli}, title = {Vision and the statistics of the visual environment}, journal = {Current Opinion in Neurobiology}, year = {2003}, volume = {13}, number = {2}, pages = {144--9}, note = {PMID: 12744966}, url = {http://www.ncbi.nlm.nih.gov/pubmed/12744966} } |
|||||
Simoncelli, E.P. & Freeman, W.T. | The steerable pyramid: A flexible architecture for multi-scale derivative computation [BibTeX] |
1995 | Proceedings of the IEEE International Conference on Image Processing Vol. 3, pp. 444--447 |
article | URL |
BibTeX:
@article{simoncelli_steerable_1995, author = {Eero P Simoncelli and William T Freeman}, title = {The steerable pyramid: A flexible architecture for multi-scale derivative computation}, journal = {Proceedings of the IEEE International Conference on Image Processing}, year = {1995}, volume = {3}, pages = {444--447}, url = {http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.2.7126} } |
|||||
Simoncelli, E.P. & Olshausen, B.A. | Natural image statistics and neural representation | 2001 | Annual Review of Neuroscience Vol. 24, pp. 1193-216 |
article | DOI URL |
Abstract: It has long been assumed that sensory neurons are adapted, through both evolutionary and developmental processes, to the statistical properties of the signals to which they are exposed. Attneave (1954) and Barlow (1961) proposed that information theory could provide a link between environmental statistics and neural responses through the concept of coding efficiency. Recent developments in statistical modeling, along with powerful computational tools, have enabled researchers to study more sophisticated statistical models for visual images, to validate these models empirically against large sets of data, and to begin experimentally testing the efficient coding hypothesis for both individual neurons and populations of neurons. | |||||
BibTeX:
@article{simoncelli_natural_2001, author = {E P Simoncelli and B A Olshausen}, title = {Natural image statistics and neural representation}, journal = {Annual Review of Neuroscience}, year = {2001}, volume = {24}, pages = {1193--216}, note = {PMID: 11520932}, url = {http://www.ncbi.nlm.nih.gov/pubmed/11520932}, doi = {http://dx.doi.org/10.1146/annurev.neuro.24.1.1193} } |
|||||
Srinivasan, M.V., Laughlin, S.B. & Dubs, A. | Predictive Coding: A Fresh View of Inhibition in the Retina [BibTeX] |
1982 | Proceedings of the Royal Society of London. Series B, Biological Sciences Vol. 216(1205), pp. 427―459 |
article | URL |
BibTeX:
@article{srinivasan_predictive_1982, author = {M. V. Srinivasan and S. B. Laughlin and A. Dubs}, title = {Predictive Coding: A Fresh View of Inhibition in the Retina}, journal = {Proceedings of the Royal Society of London. Series B, Biological Sciences}, year = {1982}, volume = {216}, number = {1205}, pages = {427--459}, url = {http://redwood.berkeley.edu/w/images/f/f7/05-srinivasan-prsl-1982.pdf} } |
|||||
Srivastava, A., Lee, A.B., Simoncelli, E.P. & Zhu, S.C. | On Advances in Statistical Modeling of Natural Images [BibTeX] |
2003 | Journal of Mathematical Imaging and Vision Vol. 18(1), pp. 17―33 |
article | URL |
BibTeX:
@article{srivastava_advances_2003, author = {A. Srivastava and A. B. Lee and E. P. Simoncelli and S. C Zhu}, title = {On Advances in Statistical Modeling of Natural Images}, journal = {Journal of Mathematical Imaging and Vision}, year = {2003}, volume = {18}, number = {1}, pages = {17--33}, url = {http://www.springerlink.com/content/jt354188q4685l29/fulltext.pdf} } |
|||||
Taylor, R.P., Micolich, A.P. & Jonas, D. | Fractal Analysis: Revisiting Pollock's drip paintings (Reply) [BibTeX] |
2006 | Nature Vol. 444(7119), pp. E10-E11 |
article | DOI URL |
BibTeX:
@article{taylor_fractal_2006, author = {R. P. Taylor and A. P. Micolich and D. Jonas}, title = {Fractal Analysis: Revisiting Pollock's drip paintings (Reply)}, journal = {Nature}, year = {2006}, volume = {444}, number = {7119}, pages = {E10--E11}, url = {http://dx.doi.org/10.1038/nature05399}, doi = {http://dx.doi.org/10.1038/nature05399} } |
|||||
Taylor, R.P., Micolich, A.P. & Jonas, D. | Fractal analysis of Pollock's drip paintings [BibTeX] |
1999 | Nature Vol. 399(6735), pp. 422 |
article | DOI URL |
BibTeX:
@article{taylor_fractal_1999, author = {Richard P. Taylor and Adam P. Micolich and David Jonas}, title = {Fractal analysis of Pollock's drip paintings}, journal = {Nature}, year = {1999}, volume = {399}, number = {6735}, pages = {422}, url = {http://dx.doi.org/10.1038/20833}, doi = {http://dx.doi.org/10.1038/20833} } |
|||||
Taylor, R.P. | Order in Pollock's chaos [BibTeX] |
2002 | Scientific American Vol. 287(6), pp. 84-89 |
article | |
BibTeX:
@article{taylor_order_2002, author = {R. P. Taylor}, title = {Order in Pollock's chaos}, journal = {Scientific American}, year = {2002}, volume = {287}, number = {6}, pages = {84--89} } |
|||||
Thomson, M.G. | Visual coding and the phase structure of natural scenes | 1999 | Network (Bristol, England) Vol. 10(2), pp. 123-32 |
article | URL |
Abstract: Although it is now well known that natural images display consistent statistical properties which distinguish them from random luminance distributions, this ecological approach to vision has so far concentrated on those second-order image statistics which are quantified by image power spectra, and it appears to be the image phase spectra which carry the majority of the image-intrinsic information. The present work describes how conventional nth-order statistics can be modified so that they are sensitive to image phase structure only. The modified measures are applied to an ensemble of natural images, and the results show that natural images do have consistent higher-order statistical properties which distinguish them from random-phase images with the same power spectra. An interpretation of this finding in terms of higher-order spectra suggests that these consistent properties arise from the ubiquity of edge structures in natural images, and raises the possibility that the properties of ideal relative-phase-sensitive mechanisms could be determined directly from analyses of the higher-order structure of natural scenes. | |||||
BibTeX:
@article{thomson_visual_1999, author = {M G Thomson}, title = {Visual coding and the phase structure of natural scenes}, journal = {Network (Bristol, England)}, year = {1999}, volume = {10}, number = {2}, pages = {123--32}, note = {PMID: 10378188}, url = {http://www.ncbi.nlm.nih.gov/pubmed/10378188} } |
|||||
Thomson, M.G., Foster, D.H. & Summers, R.J. | Human sensitivity to phase perturbations in natural images: a statistical framework | 2000 | Perception Vol. 29(9), pp. 1057-69 |
article | URL |
Abstract: Fourier-phase information is important in determining the appearance of natural scenes, but the structure of natural-image phase spectra is highly complex and difficult to relate directly to human perceptual processes. This problem is addressed by extending previous investigations of human visual sensitivity to the randomisation and quantisation of Fourier phase in natural images. The salience of the image changes induced by these physical processes is shown to depend critically on the nature of the original phase spectrum of each image, and the processes of randomisation and quantisation are shown to be perceptually equivalent provided that they shift image phase components by the same average amount. These results are explained by assuming that the visual system is sensitive to those phase-domain image changes which also alter certain global higher-order image statistics. This assumption may be used to place constraints on the likely nature of cortical processing: mechanisms which correlate the outputs of a bank of relative-phase-sensitive units are found to be consistent with the patterns of sensitivity reported here. | |||||
BibTeX:
@article{thomson_human_2000, author = {M G Thomson and D H Foster and R J Summers}, title = {Human sensitivity to phase perturbations in natural images: a statistical framework}, journal = {Perception}, year = {2000}, volume = {29}, number = {9}, pages = {1057--69}, note = {PMID: 11144819}, url = {http://www.ncbi.nlm.nih.gov/pubmed/11144819} } |
|||||
Tolhurst, D.J. & Tadmor, Y. | Discrimination of spectrally blended natural images: optimisation of the human visual system for encoding natural images | 2000 | Perception Vol. 29(9), pp. 1087-100 |
article | URL |
Abstract: We have developed a protocol for testing experimentally the hypothesis that the human visual system is optimised for making visual discriminations amongst natural scenes. Visual stimuli were made by gradual blending of the Fourier spectra of digitised photographs of natural scenes. The statistics of the stimuli were made unnatural to varying degrees by changing the overall slopes of the amplitude spectra of the stimuli. Thresholds were measured for discriminating small amounts of spectral blending at different spectral slopes. We found that thresholds were lowest when the spectral slope was natural; thresholds were increased when the slopes were either shallower or steeper than natural. A number of spurious cues were considered, such as differences in mean luminance or overall spectral power or contrast between test and reference stimuli. Control experiments were performed to remove such spurious cues, and the discrimination thresholds were still lowest for stimuli that were most natural. Thus, these experiments do provide experimental support for the idea that human vision and the human visual system are optimised for processing natural visual information. | |||||
BibTeX:
@article{tolhurst_discrimination_2000, author = {D J Tolhurst and Y Tadmor}, title = {Discrimination of spectrally blended natural images: optimisation of the human visual system for encoding natural images}, journal = {Perception}, year = {2000}, volume = {29}, number = {9}, pages = {1087--100}, note = {PMID: 11144821}, url = {http://www.ncbi.nlm.nih.gov/pubmed/11144821} } |
|||||
Torralba, A., Fergus, R. & Freeman, W.T. | 80 million tiny images: a large data set for nonparametric object and scene recognition | 2008 | IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 30(11), pp. 1958-70 |
article | DOI URL |
Abstract: With the advent of the Internet, billions of images are now freely available online and constitute a dense sampling of the visual world. Using a variety of non-parametric methods, we explore this world with the aid of a large dataset of 79,302,017 images collected from the Internet. Motivated by psychophysical results showing the remarkable tolerance of the human visual system to degradations in image resolution, the images in the dataset are stored as 32 x 32 color images. Each image is loosely labeled with one of the 75,062 non-abstract nouns in English, as listed in the Wordnet lexical database. Hence the image database gives a comprehensive coverage of all object categories and scenes. The semantic information from Wordnet can be used in conjunction with nearest-neighbor methods to perform object classification over a range of semantic levels minimizing the effects of labeling noise. For certain classes that are particularly prevalent in the dataset, such as people, we are able to demonstrate a recognition performance comparable to class-specific Viola-Jones style detectors. | |||||
BibTeX:
@article{torralba_80_2008, author = {Antonio Torralba and Rob Fergus and William T Freeman}, title = {80 million tiny images: a large data set for nonparametric object and scene recognition}, journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence}, year = {2008}, volume = {30}, number = {11}, pages = {1958--70}, note = {PMID: 18787244}, url = {http://www.ncbi.nlm.nih.gov/pubmed/18787244}, doi = {{10.1109/TPAMI.2008.128}} } |
|||||
Torralba, A. & Oliva, A. | Statistics of natural image categories | 2003 | Network (Bristol, England) Vol. 14(3), pp. 391-412 |
article | URL |
Abstract: In this paper we study the statistical properties of natural images belonging to different categories and their relevance for scene and object categorization tasks. We discuss how second-order statistics are correlated with image categories, scene scale and objects. We propose how scene categorization could be computed in a feedforward manner in order to provide top-down and contextual information very early in the visual processing chain. Results show how visual categorization based directly on low-level features, without grouping or segmentation stages, can benefit object localization and identification. We show how simple image statistics can be used to predict the presence and absence of objects in the scene before exploring the image. | |||||
BibTeX:
@article{torralba_statistics_2003, author = {Antonio Torralba and Aude Oliva}, title = {Statistics of natural image categories}, journal = {Network (Bristol, England)}, year = {2003}, volume = {14}, number = {3}, pages = {391--412}, note = {PMID: 12938764}, url = {http://www.ncbi.nlm.nih.gov/pubmed/12938764} } |
|||||
Touryan, J., Felsen, G. & Dan, Y. | Spatial structure of complex cell receptive fields measured with natural images. | 2005 | Neuron Vol. 45(5), pp. 781―791 |
article | DOI URL |
Abstract: Neuronal receptive fields (RFs) play crucial roles in visual processing. While the linear RFs of early neurons have been well studied, RFs of cortical complex cells are nonlinear and therefore difficult to characterize, especially in the context of natural stimuli. In this study, we used a nonlinear technique to compute the RFs of complex cells from their responses to natural images. We found that each RF is well described by a small number of subunits, which are oriented, localized, and bandpass. These subunits contribute to neuronal responses in a contrast-dependent, polarity-invariant manner, and they can largely predict the orientation and spatial frequency tuning of the cell. Although the RF structures measured with natural images were similar to those measured with random stimuli, natural images were more effective for driving complex cells, thus facilitating rapid identification of the subunits. The subunit RF model provides a useful basis for understanding cortical processing of natural stimuli. | |||||
BibTeX:
@article{touryan_spatial_2005, author = {Jon Touryan and Gidon Felsen and Yang Dan}, title = {Spatial structure of complex cell receptive fields measured with natural images.}, journal = {Neuron}, year = {2005}, volume = {45}, number = {5}, pages = {781--791}, url = {http://dx.doi.org/10.1016/j.neuron.2005.01.029}, doi = {http://dx.doi.org/10.1016/j.neuron.2005.01.029} } |
|||||
Turiel, A. & Parga, N. | Role of statistical symmetries in sensory coding: an optimal scale invariant code for vision | 2004 | Journal of Physiology, Paris Vol. 97(4-6), pp. 491-502 |
article | DOI URL |
Abstract: The visual system is the most studied sensory pathway, which is partly because visual stimuli have rather intuitive properties. There are reasons to think that the underlying principle ruling coding, however, is the same for vision and any other type of sensory signal, namely the code has to satisfy some notion of optimality--understood as minimum redundancy or as maximum transmitted information. Given the huge variability of natural stimuli, it would seem that attaining an optimal code is almost impossible; however, regularities and symmetries in the stimuli can be used to simplify the task: symmetries allow predicting one part of a stimulus from another, that is, they imply a structured type of redundancy. Optimal coding can only be achieved once the intrinsic symmetries of natural scenes are understood and used to the best performance of the neural encoder. In this paper, we review the concepts of optimal coding and discuss the known redundancies and symmetries that visual scenes have. We discuss in depth the only approach which implements the three of them known so far: translational invariance, scale invariance and multiscaling. Not surprisingly, the resulting code possesses features observed in real visual systems in mammals. | |||||
BibTeX:
@article{turiel_role_2004, author = {Antonio Turiel and Néstor Parga}, title = {Role of statistical symmetries in sensory coding: an optimal scale invariant code for vision}, journal = {Journal of Physiology, Paris}, year = {2004}, volume = {97}, number = {4-6}, pages = {491--502}, note = {PMID: 15242659}, url = {http://www.ncbi.nlm.nih.gov/pubmed/15242659}, doi = {http://dx.doi.org/10.1016/j.jphysparis.2004.01.007} } |
|||||
Turiel, A., Parga, N., Ruderman, D.L. & Cronin, T.W. | Multiscaling and information content of natural color images | 2000 | Physical Review E Vol. 62(1), pp. 1138 |
article | DOI URL |
Abstract: Naive scale invariance is not a true property of natural images. Natural monochrome images possess a much richer geometrical structure, which is particularly well described in terms of multiscaling relations. This means that the pixels of a given image can be decomposed into sets, the fractal components of the image, with well-defined scaling exponents [Turiel and Parga, Neural Comput. 12, 763 (2000)]. Here it is shown that hyperspectral representations of natural scenes also exhibit multiscaling properties, observing the same kind of behavior. A precise measure of the informational relevance of the fractal components is also given, and it is shown that there are important differences between the intrinsically redundant red-green-blue system and the decorrelated one defined in Ruderman, Cronin, and Chiao [J. Opt. Soc. Am. A 15, 2036 (1998)]. | |||||
BibTeX:
@article{turiel_multiscaling_2000, author = {Antonio Turiel and Néstor Parga and Daniel L. Ruderman and Thomas W. Cronin}, title = {Multiscaling and information content of natural color images}, journal = {Physical Review E}, year = {2000}, volume = {62}, number = {1}, pages = {1138}, url = {http://link.aps.org/abstract/PRE/v62/p1138}, doi = {{10.1103/PhysRevE.62.1138}} } |
|||||
Wachtler, T., Lee, T.W. & Sejnowski, T.J. | Chromatic structure of natural scenes. | 2001 | J Opt Soc Am A Opt Image Sci Vis Vol. 18(1), pp. 65―77 |
article | URL |
Abstract: We applied independent component analysis (ICA) to hyperspectral images in order to learn an efficient representation of color in natural scenes. In the spectra of single pixels, the algorithm found basis functions that had broadband spectra and basis functions that were similar to natural reflectance spectra. When applied to small image patches, the algorithm found some basis functions that were achromatic and others with overall chromatic variation along lines in color space, indicating color opponency. The directions of opponency were not strictly orthogonal. Comparison with principal-component analysis on the basis of statistical measures such as average mutual information, kurtosis, and entropy, shows that the ICA transformation results in much sparser coefficients and gives higher coding efficiency. Our findings suggest that nonorthogonal opponent encoding of photoreceptor signals leads to higher coding efficiency and that ICA may be used to reveal the underlying statistical properties of color information in natural scenes. | |||||
BibTeX:
@article{wachtler_chromatic_2001, author = {T. Wachtler and T. W. Lee and T. J. Sejnowski}, title = {Chromatic structure of natural scenes.}, journal = {J Opt Soc Am A Opt Image Sci Vis}, year = {2001}, volume = {18}, number = {1}, pages = {65--77}, url = {http://redwood.berkeley.edu/w/images/5/59/18-wachtler-josa-2001.pdf} } |
|||||
Wainwright, M.J. & Simoncelli, E.P. | Scale mixtures of Gaussians and the statistics of natural images [BibTeX] |
2000 | , pp. 855―861 | techreport | URL |
BibTeX:
@techreport{wainwright_scale_2000, author = {M. J Wainwright and E. P Simoncelli}, title = {Scale mixtures of Gaussians and the statistics of natural images}, year = {2000}, pages = {855--861}, url = {http://www.cns.nyu.edu/ftp/eero/wainwright99b.ps.gz} } |
|||||
Wainwright, M.J., Simoncelli, E.P. & Willsky, A.S. | Random cascades on wavelet trees and their use in analyzing and modeling natural images [BibTeX] |
2001 | Applied and Computational Harmonic Analysis Vol. 11, pp. 89--123 |
article | DOI URL |
BibTeX:
@article{wainwright_random_2001, author = {Martin J Wainwright and Eero P Simoncelli and Alan S Willsky}, title = {Random cascades on wavelet trees and their use in analyzing and modeling natural images}, journal = {Applied and Computational Harmonic Analysis}, year = {2001}, volume = {11}, pages = {89--123}, url = {http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.20.7029}, doi = {http://dx.doi.org/10.1.1.20.7029} } |
|||||
Watson, A.B. | Digital Images and Human Vision [BibTeX] |
1993 | , pp. 224 | book | |
BibTeX:
@book{watson_digital_1993, author = {Andrew B Watson}, title = {Digital Images and Human Vision}, publisher = {MIT Press}, year = {1993}, pages = {224} } |
|||||
Yang, Z. & Purves, D. | The statistical structure of natural light patterns determines perceived light intensity | 2004 | Proceedings of the National Academy of Sciences of the United States of America Vol. 101(23), pp. 8745-50 |
article | DOI URL |
Abstract: The same target luminance in different contexts can elicit markedly different perceptions of brightness, a fact that has long puzzled vision scientists. Here we test the proposal that the visual system encodes not luminance as such but rather the statistical relationship of a particular luminance to all possible luminance values experienced in natural contexts during evolution. This statistical conception of vision was validated by using a database of natural scenes in which we could determine the probability distribution functions of co-occurring target and contextual luminance values. The distribution functions obtained in this way predict target brightness in response to a variety of challenging stimuli, thus explaining these otherwise puzzling percepts. That brightness is determined by the statistics of natural light patterns implies that the relevant neural circuitry is specifically organized to generate these probabilistic responses. | |||||
BibTeX:
@article{yang_statistical_2004, author = {Zhiyong Yang and Dale Purves}, title = {The statistical structure of natural light patterns determines perceived light intensity}, journal = {Proceedings of the National Academy of Sciences of the United States of America}, year = {2004}, volume = {101}, number = {23}, pages = {8745--50}, note = {PMID: 15152077}, url = {http://www.ncbi.nlm.nih.gov/pubmed/15152077}, doi = {http://dx.doi.org/10.1073/pnas.0402192101} } |
|||||
Yao, H., Shi, L., Han, F., Gao, H. & Dan, Y. | Rapid learning in cortical coding of visual scenes. | 2007 | Nat Neurosci | article | DOI URL |
Abstract: Experience-dependent plasticity in adult visual cortex is believed to have important roles in visual coding and perceptual learning. Here we show that repeated stimulation with movies of natural scenes induces a rapid improvement in response reliability in cat visual cortex, whereas stimulation with white noise or flashed bar stimuli does not. The improved reliability can be accounted for by a selective increase in spiking evoked by preferred stimuli, and the magnitude of improvement depends on the sparseness of the response. The increase in reliability persists for at least several minutes in the absence of further movie stimulation. During this period, spontaneous spiking activity shows detectable reverberation of the movie-evoked responses. Thus, repeated exposure to natural stimuli not only induces a rapid improvement in cortical response reliability, but also leaves a 'memory trace' in subsequent spontaneous activity. | |||||
BibTeX:
@article{yao_rapid_2007, author = {Haishan Yao and Lei Shi and Feng Han and Hongfeng Gao and Yang Dan}, title = {Rapid learning in cortical coding of visual scenes.}, journal = {Nat Neurosci}, year = {2007}, url = {http://dx.doi.org/10.1038/nn1895}, doi = {http://dx.doi.org/10.1038/nn1895} } |
Created by JabRef on 22/01/2009.