CVMP 2009
A multimodal approach to perceptual tone mapping
urn:nbn:de:0009635145
Abstract
We present an improvement of TSTM, a recently proposed tone mapping operator for High Dynamic Range (HDR) images, based on a multimodal analysis. One of the key features of TSTM is a suitable implementation of the Naka-Rushton equation that mimics the visual adaptation performed by the human visual system coherently with Weber-Fechner's law of contrast perception. In the present paper we use the Gaussian Mixture Model (GMM) to detect the modes of the log-scale luminance histogram of a given HDR image, and we then use the information provided by the GMM to devise a Naka-Rushton equation for each mode. Finally, we select the parameters so as to merge those equations into a continuous function. We provide and discuss tests showing how the new method improves the performance of TSTM, as well as comparisons with state-of-the-art methods.
Keywords: High Dynamic Range Images, Tone Mapping, Naka-Rushton Formula, Weber-Fechner Contrast, Gaussian Mixture Model
SWD: High Dynamic Range, Tone Mapping, Computergraphik
In daylight the Human Visual System (HVS) works best in terms of color vision and perception of details. The amount of light reaching our retinas can span many orders of magnitude, from the scotopic lower bound 10^{-2} indoors to the glare upper bound 10^{9} outdoors with the brightest sunlight [ FPSG96 ], but the photoreceptor neurons in the retina, rods and cones, produce electrical outputs which span only two orders of magnitude [ SEC84 ] (p. 326).
Therefore, the HVS cannot operate over the entire range of physical radiances simultaneously. Rather, it adapts to an average intensity and handles a smaller magnitude interval, through a process called visual adaptation. In photography and film production we are faced with the same situation: most cameras (both photo and video cameras) take Low Dynamic Range (LDR) pictures, spanning only two orders of magnitude, so some sort of adaptation mechanism is required. In film production, the equivalent of the visual adaptation of the HVS is achieved by flooding the scene with more light, as the great director Sidney Lumet so clearly explains in [ Lum95 ] (p. 83):
If you've ever passed a movie company shooting on the streets, you may have seen an enormous lamp pouring its light onto an actor's face. We call it an arc or a brute, and it gives off the equivalent of 12,000 watts. Your reaction has probably been: What's the matter with these people? The sun's shining brightly and they're adding that big light so that the actor is practically squinting. Well, film is limited in many ways. It's a chemical process, and one of its limitations is the amount of contrast it can take. It can adjust to a lot of light or a little bit of light. But it can't take a lot of light and a little bit of light in the same frame. It's a poorer version of your own eyesight. I'm sure you've seen a person standing against a window with a bright, sunny day outside. The person becomes silhouetted against the sky. We can't make out his features. Those arc lamps correct the "balance" between the light on the actor's face and the bright sky. If we didn't use them, his face would go completely black.
Therefore, the use of movie cameras capable of capturing High Dynamic Range (HDR) images would greatly simplify the process of shooting outdoors: fewer artificial lights to transport and set up, less time spent, less hassle for the actors. Such cameras are becoming more popular, but we are still faced with the problem that most displays are LDR, so an HDR to LDR conversion must be performed in order to screen the picture. When this conversion tries to emulate as closely as possible the contrast and color sensation of the real-world scene, i.e. to achieve an image that looks natural (as in our case, as opposed to maximizing the visible details even if the resulting image appears artificial), it is called 'Tone Mapping' (TM) or 'Tone Reproduction' (TR) [ LRP97 ].
An excellent survey of the many TM methods proposed up to 2005 can be found in [ RWPD05 ]. Among the more recent works we would like to mention [ RSSF02, RD05, TAMS08 ], which use a perception-based approach involving the Naka-Rushton equation [ NCY79 ]; [ KMS05 ], which uses the anchoring theory of visual perception [ GKB99 ]; and [ LFUS06 ], where the authors propose an interactive method that allows the user to create better subjective results.
In this paper we propose a perception-based approach for TM which extends the method introduced in [ FPBC09b ]. Given that the goal is to obtain LDR pictures that appear natural, it seems reasonable to try to mimic basic features of the HVS: in our case, we try to emulate visual adaptation and spatially local contrast enhancement. Our contribution is a TM method which compares well in terms of image quality with the state of the art, is able to deal with images whose luminance histogram has modes which are far apart, and is fast. It improves on the method introduced in [ FPBC09b ], which was not capable of dealing well with images with a multimodal histogram. It is presented for still images, but in the final section we suggest how it could be extended to motion pictures.
This paper is organized as follows. In section 2 we review the technique proposed in [ FPBC09b, FPBC09a ] and discuss its limitations. Section 3 introduces our method and explains how to overcome most of the problems encountered in [ FPBC09b, FPBC09a ]. Section 4 presents some results of our algorithm as well as comparisons with state-of-the-art methods. Finally, section 5 presents some conclusions and possibilities for future research.
In this section we present the fundamental concepts of visual adaptation and contrast perception. These concepts will allow us to discuss a modification of the Naka-Rushton equation that is vital for the construction of our tone mapping operator. These conclusions were already presented in [ FPBC09b, FPBC09a ]. We will also present the advantages and drawbacks of the method and, in the next section, we will propose an improvement based on Gaussian Mixture Models (GMM).
Let us begin by recalling how the retina responds to light stimuli. The range of radiances over which the HVS can operate is very large: from 10^{-6} cd/m^{2} (scotopic limit) to 10^{6} cd/m^{2} (glare limit) [ Pra07 ]. The automatic process that allows the HVS to operate over such a huge range is called visual adaptation [ SEC84 ].
It is important to stress that the HVS cannot operate over its entire range simultaneously. Rather, it adapts to an average intensity and handles a smaller magnitude interval. There is no complete agreement in the literature about the precise value of this range, which can vary from two ([ SEC84 ] (p. 326)) to four ([ KS08 ] (p. 670)) orders of magnitude.
Neuroscience experiments show that visual adaptation occurs mainly in the retina. The experiments measuring this behavior used very simple, non-natural images: brief pulses of light with intensity ℐ were superimposed on a uniform background. When a photoreceptor absorbs ℐ, the electric potential of its membrane changes according to the empirical law known as the Naka-Rushton equation [ SEC84 ]:
r(ℐ) = ΔV(ℐ)/ΔV_{max} = ℐ/(ℐ + ℐ_{s}),   (1)
where ℐ_{s} is the light level at which the photoreceptor response is half maximal, called the semisaturation level, which is usually associated with the level of adaptation. The change of electric potential ΔV(ℐ) is the photoreceptor's physiological response to ℐ, which generates an electric current that propagates towards the brain. Finally, ΔV_{max} is the highest difference of potential that can be generated. The graph of the function r(ℐ) is depicted in Fig. 1(a).
Figure 1. (a) Graph of the Naka-Rushton equation r(ℐ) vs. ℐ, for a fixed ℐ_{s}. (b) Increment threshold versus intensity in log-log scale. The dots represent experimental data taken from [Shapley and Enroth-Cugell 1984] (p. 291). The curve that interpolates the data was obtained using a function with the same structure as in eq.(3). (c) Inverse of the derivative of the Naka-Rushton function in log-log scale in arbitrary units.
Let us notice that, since ℐ and ℐ_{s} are light levels, they are positive and therefore the right hand side of the Naka-Rushton formula (1) belongs to [0,1], independently of the range of the light stimuli.
The Naka-Rushton equation describes the behavior of the HVS at any specific adaptation level.
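As a minimal illustration of this response curve, the following Python sketch evaluates the normalized Naka-Rushton response at a few multiples of the semisaturation level. The function name and the value chosen for the semisaturation level are hypothetical and serve only as an example.

```python
def naka_rushton(intensity, semi_saturation):
    """Normalized response r = I / (I + I_s), for I >= 0 and I_s > 0."""
    if intensity < 0 or semi_saturation <= 0:
        raise ValueError("intensity must be >= 0 and semi_saturation > 0")
    return intensity / (intensity + semi_saturation)


# Hypothetical semisaturation level, chosen only for illustration.
I_S = 100.0
scales = (0.01, 0.1, 1.0, 10.0, 100.0)
responses = [naka_rushton(scale * I_S, I_S) for scale in scales]
```

The response is exactly one half when the stimulus equals the semisaturation level, and it stays below 1 for any finite stimulus.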
In the mid-nineteenth century, the German physician E. H. Weber conducted the first psychophysical experiments using hand-held weights, discovering that the perception of weight followed a ratio law. Regarding visual perception, these experiments were carried out with a setup similar to the one described above (flashes of light on a uniform background), but instead of measuring the electric response inside the retina a phenomenological approach was applied: the subject was asked when the difference between the background light ℐ and the superimposed light ℐ + Δℐ was noticeable. The minimum difference Δℐ which the subject is able to perceive is called the Just Noticeable Difference (JND). Weber found that the ratio between the JND and the background intensity ℐ is constant for a wide range of values of ℐ, which is expressed in what is known as Weber's Law:
Δℐ/ℐ = k,   (2)
where k > 0 is a perceptual constant called the Weber fraction.
In log-log units, the relationship between Δℐ and ℐ is linear, with slope 1: log(Δℐ) = log(ℐ) + log(k). However, Weber's Law does not hold for low intensity values, where the slope tends to zero instead of one. To account for this, Weber's colleague G. Fechner introduced the concept of 'dark light', thus modifying Weber's Law as follows:
Δℐ/(ℐ + m) = k,   (3)
where m > 0 is a quantity often interpreted as internal noise in the visual mechanism, e.g. quoting [ CW03 ] (p. 859) "an intrinsic activity [...] within the receptor systems that combines with the excitation produced by the background to raise the threshold". This last equation is commonly called Weber-Fechner's Law. Now, when ℐ >> m the slope of log(JND) as a function of log(ℐ) is 1; but when ℐ << m the slope is close to zero and the curve matches the experimental data also at low intensity values. See Fig. 1(b).
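The two regimes of Weber-Fechner's Law can be checked numerically. The sketch below assumes the JND has the form k(ℐ + m) and estimates the slope of log(JND) versus log(ℐ) by finite differences; the values of k and m are arbitrary illustrative choices, not measured constants.

```python
import math


def jnd(intensity, k, m):
    """Just Noticeable Difference under Weber-Fechner's Law: k * (I + m)."""
    return k * (intensity + m)


def loglog_slope(intensity, k, m, eps=1e-3):
    """Finite-difference slope of log(JND) with respect to log(I)."""
    lo, hi = intensity * (1.0 - eps), intensity * (1.0 + eps)
    rise = math.log(jnd(hi, k, m)) - math.log(jnd(lo, k, m))
    run = math.log(hi) - math.log(lo)
    return rise / run


# Arbitrary illustrative constants (assumptions, not experimental values).
K, M = 0.02, 1.0
slope_bright = loglog_slope(1e4 * M, K, M)   # intensity far above m
slope_dark = loglog_slope(1e-4 * M, K, M)    # intensity far below m
```

Analytically the slope equals ℐ/(ℐ + m), so it approaches 1 for bright backgrounds and 0 for dark ones.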
We now follow the approach in [ WS82 ] (p. 490) and postulate that the JND can be used as a 'sensation magnitude' because it corresponds to (minimum) equal increments of sensation along the whole photopic range. This allows us to rewrite Weber-Fechner's Law in the following way:
Δs = k Δℐ/(ℐ + m),   (4)
where Δs is the increment in the sensation-magnitude function s(ℐ), also called "perceived brightness".
We would now like to underline the identification between the electric response of visual neurons in the retina and the perceived brightness function by presenting the following qualitative argument. With easy manipulations of eq.(4) and by taking infinitesimal differences, we can write:
(ds/dℐ)^{-1} = (ℐ + m)/k.   (5)
On the other hand, if we plot the graph of (dr/dℐ)^{-1}, where r(ℐ) is defined in eq.(1), we find the curve represented in Fig. 1(c). It can be noticed that there is a very good qualitative match between the curve related to the Weber-Fechner behavior and the one related to the Naka-Rushton equation.
So, from now on, we will identify the output of the Naka-Rushton equation r(ℐ) with the perceived brightness s(ℐ) described by the Weber-Fechner law.
Even though this idea has not been explicitly stated yet in the tone mapping literature, all the TM works that use the Naka-Rushton equation ([ PTYG00, RD05, TAMS08 ]) are implicitly using the assumption just described.
As we have just discussed, the Weber-Fechner law and the Naka-Rushton equation refer to the same natural process: brightness perception. But they describe different aspects of the problem: on one hand the Weber-Fechner law defines the perception of contrast, and on the other hand the Naka-Rushton equation describes the process of adapting to the average light value of the scene and properly compressing the whole radiance range into [0,1]. In order to combine these two descriptions, we are now going to show that the Naka-Rushton function r(ℐ) must be modified to reproduce the correct (Weber-Fechner) perception of contrast.
Given that we can identify s(ℐ) and r(ℐ), let us rewrite eq.(4) by substituting function s with function r and taking again infinitesimal differences:
dr/dℐ = k/(ℐ + m).   (6)
This equation gives us the condition that the Naka-Rushton function r(ℐ) must satisfy in order to reproduce the correct perceived contrast. However, let us notice that r(ℐ) is expressed by formula (1); thus, performing the derivative, we have:
dr/dℐ = ℐ_{s}/(ℐ + ℐ_{s})^{2}.   (7)
Comparing eq.(6) and eq.(7) we can see that the right hand sides do not coincide.
This implies that the Naka-Rushton equation does not follow the Weber-Fechner law unless ℐ_{s} is modified [ SEC84 ] from a constant to a function of ℐ.
As remarked in [ SEC84 ], we can reach the same conclusion by analyzing the behavior of the Naka-Rushton equation. In fact, if we set a constant value for ℐ_{s}, the function will map to 1 all light levels significantly bigger than ℐ_{s}, an effect called 'saturation catastrophe'.
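The saturation effect is easy to reproduce numerically. In the following sketch, which uses an arbitrary constant semisaturation level chosen for illustration, three stimuli separated by two orders of magnitude are mapped to outputs that differ by less than one thousandth of the output range.

```python
def naka_rushton(intensity, semi_saturation):
    """Naka-Rushton response with a constant semisaturation level."""
    return intensity / (intensity + semi_saturation)


# Arbitrary constant semisaturation level (illustrative assumption).
I_S = 1.0
bright_levels = [1e3, 1e4, 1e5]
outputs = [naka_rushton(level, I_S) for level in bright_levels]

# Output spread covered by stimuli spanning two orders of magnitude.
spread = max(outputs) - min(outputs)
```

With the output squeezed into such a narrow band near 1, contrast between bright regions is no longer reproducible, which is why the semisaturation level has to depend on the stimulus.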
We will show in the next section, after introducing a proper nomenclature for HDR images, that if we substitute ℐ_{s} with a suitable function ℐ_{s}(ℐ), we will avoid this problem. We will refer to the modified Naka-Rushton equation as:
r(ℐ) = ℐ/(ℐ + ℐ_{s}(ℐ)).   (8)
In this section we will apply the concepts already presented to digital images. For the sake of simplicity, we will first consider the luminance image and then extend the method to the full color image. We use the simplest, equally weighted, luminance since the results that we are going to present are practically invariant with respect to the many luminance definitions available in the literature.
Let us introduce the notation that will be used throughout the paper. Let I be the radiance map representing the input HDR image and Ω its spatial domain: Ω = {1,...,W} × {1,...,H} ⊂ ℤ^{2}, where W,H ≥ 1 are integers corresponding to the image width and height, respectively. We denote with I_{c} the generic value of the scalar chromatic components of I, c ∈ {R,G,B}, with x = (x_{1},x_{2}) ∈ Ω the spatial position of an arbitrary pixel in the image, and with I_{c}(x) the intensity value in the pixel x of the c channel. Finally, we denote with λ(x) the luminance of the pixel x ∈ Ω, i.e. λ(x) = (1/3)[I_{R}(x) + I_{G}(x) + I_{B}(x)], and with λ a generic luminance value, i.e. λ ∈ [λ_{min},λ_{max}] ⊂ (0,+∞), where λ_{min} and λ_{max} are the extreme luminance values. In order to avoid singularities at λ = 0 we add a value of 10^{-12} to the whole luminance image.
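The luminance definition above can be sketched in a few lines of Python. The sketch assumes the equally weighted luminance is the plain average of the three channels (the 1/3 normalization is an assumption) and represents the image as a nested list of (R, G, B) tuples, a layout chosen only for this illustration.

```python
# Offset added to every luminance value to avoid a singularity at zero.
EPSILON = 1e-12


def luminance(pixel):
    """Equally weighted luminance of one (R, G, B) pixel, plus the offset."""
    r, g, b = pixel
    return (r + g + b) / 3.0 + EPSILON


def luminance_image(image):
    """Luminance plane of an image stored as rows of (R, G, B) tuples."""
    return [[luminance(pixel) for pixel in row] for row in image]


# Tiny hypothetical 1 x 2 HDR image, including a pure black pixel.
example = [[(0.0, 0.0, 0.0), (3.0, 6.0, 9.0)]]
lum = luminance_image(example)
lam_min = min(min(row) for row in lum)
lam_max = max(max(row) for row in lum)
```

Thanks to the offset, even a black pixel has a strictly positive luminance, so logarithms of the luminance are always defined.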
The translation of equation (8) to the digital world is:
r(λ) = λ/(λ + f_{μ}(λ)),   (9)
where we have maintained the same symbol r to avoid a cumbersome notation. Note that we are assuming that the semisaturation constant translates to the value μ, which we leave unspecified for now.
In [ FPBC09b, FPBC09a ], the authors proposed to determine f_{μ} uniquely by requiring the function r(λ) in eq.(9) to satisfy Weber-Fechner's law of contrast perception. This requirement can be formalized through the following differential equation [ FPBC09b, FPBC09a ]:
d/dλ [λ/(λ + f_{μ}(λ))] = k/(λ + m).   (10)
By integrating both members with respect to the variable λ, one obtains:
f_{μ}(λ) = λ [1/(k log(λ + m) + C) - 1],   (11)
C being an integration constant. Introducing this expression of f_{μ}(λ) in eq.(9), one finds that the expression of the generalized Naka-Rushton formula coherent with Weber-Fechner's contrast perception is:
r(λ) = k log(λ + m) + C.   (12)
The triplet of parameters C, k and m can be determined by imposing some general conditions that were discussed in [ FPBC09b, FPBC09a ]. Their analytical expressions are:
C = -k log(λ_{min} + m),   (13)
k = 1/log((λ_{max} + m)/(λ_{min} + m)),   (14)
m = (μ^{2} - λ_{min}λ_{max})/(λ_{min} + λ_{max} - 2μ).   (15)
With these parameters, the function f_{μ} is well defined and nonnegative within the range [λ_{min},λ_{max}] and, by substituting the explicit value of C, we get the expression:
f_{μ}(λ) = λ [1/(k log((λ + m)/(λ_{min} + m))) - 1],   (16)
with k and m defined as in eqs. (14), (15), respectively.
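The following Python sketch shows one way to compute the three parameters and evaluate the resulting response. It assumes that the general conditions are r(λ_{min}) = 0, r(λ_{max}) = 1 and r(μ) = 1/2 (consistent with μ acting as the semisaturation value), and that the response has the logarithmic form r(λ) = k log(λ + m) + C; the function names and the sample luminance range are hypothetical.

```python
import math


def tstm_parameters(lam_min, lam_max, mu):
    """Solve r(lam_min)=0, r(lam_max)=1, r(mu)=1/2 for r = k*log(lam+m) + C.

    These three conditions are an assumption made for this sketch.
    """
    m = (mu * mu - lam_min * lam_max) / (lam_min + lam_max - 2.0 * mu)
    k = 1.0 / math.log((lam_max + m) / (lam_min + m))
    c = -k * math.log(lam_min + m)
    return c, k, m


def response(lam, c, k, m):
    """Weber-Fechner-coherent response r(lam) = k*log(lam + m) + C."""
    return k * math.log(lam + m) + c


def f_mu(lam, c, k, m):
    """Luminance-dependent semisaturation; requires lam > lam_min."""
    return lam * (1.0 / response(lam, c, k, m) - 1.0)


# Hypothetical luminance range spanning five orders of magnitude.
LAM_MIN, LAM_MAX, MU = 1e-2, 1e3, 10.0
C, K, M = tstm_parameters(LAM_MIN, LAM_MAX, MU)
```

Under these assumptions the quotient λ/(λ + f_{μ}(λ)) reproduces the logarithmic response exactly, so either form can be used in an implementation; the logarithmic one avoids the division by zero that f_{μ} would incur at λ = λ_{min}.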
Let us now extend the Naka-Rushton equation to a full color image. In [ FPBC09b ], the authors observed that the Naka-Rushton implementation on color images that gives the best results in terms of color rendition is the following:
r(I_{c}(x)) = I_{c}(x)/(I_{c}(x) + f_{μ}(λ(x))),   c ∈ {R,G,B},   (17)
with the same choice of the parameters appearing in f_{μ} as above. In this paper, we will follow this choice. Note that we are applying the function r(I_{c}(x)) independently to each R, G, B color channel.
This transformation constitutes the first step of the Two Stage Tone Mapper (TSTM). Let us now review the second stage of TSTM: enhancement of spatially local contrast. The phenomenological characteristics of the HVS have been used in [ PAPBC09 ] to build a variational energy functional E(I) whose minimization gives rise to an explicit algorithm that balances two opposite mechanisms: one provides a spatially local contrast enhancement and the other ensures that the dispersion of intensity values does not depart too much from that of the input image. This step, apart from improving detail visibility, partially removes a possible color cast.
The empirical tests performed by using eq. (17) have shown that TSTM performs very well on HDR images whose range is up to 5 orders of magnitude and whose histogram is not sharply multimodal.
Moreover, its output strongly depends on the choice of μ. In [ FPBC09b ], this value has been represented by a convex linear combination in the logarithmic domain between the arithmetic μ_{a} and geometric μ_{g} luminance averages: μ(ρ) = μ_{a}^{ρ} μ_{g}^{1-ρ}, where ρ ∈ [0,1]. The best results were achieved using values of ρ that vary between 0.7 and 1, depending on the particular image. The effect of varying ρ is a modification in the overall brightness of the output: the bigger the value of ρ, the darker the output.
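As a small illustration of this choice of μ, the sketch below computes the arithmetic and geometric luminance averages and combines them with the weight ρ. The function name and the sample luminance values are hypothetical.

```python
import math


def semisaturation(luminances, rho):
    """mu(rho) = mu_a**rho * mu_g**(1 - rho) for strictly positive luminances."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    n = len(luminances)
    mu_a = sum(luminances) / n                                     # arithmetic mean
    mu_g = math.exp(sum(math.log(v) for v in luminances) / n)      # geometric mean
    return mu_a ** rho * mu_g ** (1.0 - rho)


# Hypothetical luminance samples spanning two orders of magnitude.
samples = [1.0, 100.0]
mu_geometric = semisaturation(samples, 0.0)
mu_mixed = semisaturation(samples, 0.7)
mu_arithmetic = semisaturation(samples, 1.0)
```

Since the geometric mean never exceeds the arithmetic mean, μ(ρ) grows with ρ, which is consistent with larger values of ρ producing darker outputs.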
To bypass the problem of HDR images whose range extends beyond 5 orders of magnitude, in [ FPBC09b ] the authors proposed to reduce the image range. However, this does not solve the problem: all the information contained in the clipped luminance regions, below the new value of λ_{min} and above the new value of λ_{max}, is lost.
Finally, the presence of sharply separated 'modes' in the histogram can result in an incorrect rendition of contrast because, in that situation, the value of μ can fall into a poorly populated region of the histogram, resulting in the under- or overexposure of some image areas.
In order to overcome the problems related to the possible presence of sharp modes in the HDR image histogram, we propose here an extension of the first step of TSTM based on a multimodal approach.
The main idea of our method is to divide the whole luminance range into smaller intervals, apply eq.(12) over each interval and merge the results in such a way that the global transformation on λ is continuous. The main advantage of this approach is that, while preserving Weber-Fechner's contrast globally in the image, it correctly tone-maps the details in all the areas of the histogram. Moreover, outliers or luminance values far away in the histogram will not influence the current interval.
Once again we will introduce the method for the luminance plane and then extend it to a color image. We will start by explaining how to divide the luminance range into intervals, then how to process each of them and, finally, we will show how to link the different intervals in order to process the complete luminance range.
We propose to divide the log-luminance histogram into intervals using the Gaussian Mixture Model (GMM) method [ Bis06 ]. Understanding the GMM as a density estimator, computing it over the histogram tells us where the modes are located. In the present paper, we have chosen to compute the GMM over the histogram in the log scale because this gives more robust results.
The result of the GMM is a set of N Gaussians defined by their mean values μ_{j} and standard deviations σ_{j}, j = 1,...,N, in the log-domain [1].
For the jth Gaussian, the values μ_{j} - 2σ_{j} and μ_{j} + 2σ_{j} can be considered as the extrema of its area of influence, since the area under the Gaussian between these extrema is approximately 95.4% of the total area.
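To make the mode detection step concrete, here is a minimal sketch of a one-dimensional Gaussian mixture fitted with plain expectation-maximization on synthetic log-luminance samples. It is a simplified stand-in, not the implementation used for the results in this paper: the quantile-based initialization, the fixed number of iterations and the synthetic data are all assumptions made for the example.

```python
import math
import random


def fit_gmm_1d(samples, n_components, n_iter=100):
    """Fit a 1-D Gaussian mixture with plain EM; returns (weights, means, stds)."""
    data = sorted(samples)
    n = len(data)
    # Initialise the means at evenly spaced quantiles of the data.
    means = [data[int((j + 0.5) * n / n_components)] for j in range(n_components)]
    stds = [(data[-1] - data[0]) / (2.0 * n_components)] * n_components
    weights = [1.0 / n_components] * n_components
    norm = math.sqrt(2.0 * math.pi)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            dens = [w * math.exp(-0.5 * ((x - mu) / s) ** 2) / (s * norm)
                    for w, mu, s in zip(weights, means, stds)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and standard deviations.
        for j in range(n_components):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            stds[j] = max(math.sqrt(var), 1e-6)
    return weights, means, stds


def influence_interval(mean, std):
    """Extrema of the area of influence: mean -/+ two standard deviations."""
    return mean - 2.0 * std, mean + 2.0 * std


# Synthetic bimodal log-luminance data (hypothetical, for illustration only).
random.seed(0)
log_lum = ([random.gauss(-4.0, 0.5) for _ in range(500)]
           + [random.gauss(3.0, 0.5) for _ in range(500)])
weights, means, stds = fit_gmm_1d(log_lum, 2)
intervals = [influence_interval(mu, s) for mu, s in zip(means, stds)]
```

In this synthetic case the two areas of influence do not overlap and leave a gap between them, which is exactly the kind of situation that motivates the definition of the subinterval limits discussed next.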
We notice that, on the one hand, the support of the N Gaussians may not cover the entire log-luminance range of the image because of isolated pixels; on the other hand, Gaussians may overlap. In order to overcome these problems, we define the limits of the N subintervals in the linear luminance range as follows:
note that we are forcing the intervals to be within the range defined by the contiguous Gaussian means.
Let us now define the jth interval as [λ^{j}_{min},λ^{j}_{max}] and the normalized Naka-Rushton formula over it as:
where r(λ) is defined as in eq.(12), with the following parameters:
Thus, instead of having only one m and k for the whole image, we have one for each interval.
Note that we are forcing m_{j} to be positive. A value of m_{j} less than zero may correspond to a convex r_{j} , a condition that we want to avoid because the HVS response to light stimuli [ SEC84 ] is described by a concave function.
Now that we have defined the normalized Naka-Rushton formula over each interval, we will show how to link all of them and construct a continuous function over the entire range that we express as:
where the function χ_{j}(λ) is the characteristic function of the jth subinterval:
χ_{j}(λ) = 1 if λ ∈ [λ^{j}_{min},λ^{j}_{max}], and χ_{j}(λ) = 0 otherwise.   (22)
The values h_{j} and C_{j} allow us to stick together the different Naka-Rushton equations and set their output within the range [0,1]. Let us start with the scaling factor h_{j}. This value defines the height of the jth Naka-Rushton function within the final range r_{G}(λ_{max}) - r_{G}(λ_{min}). In the λ domain the jth Naka-Rushton formula will be applied over the luminance values in the jth Gaussian domain; thus, in the perceptual brightness domain, these values will be mapped to a corresponding interval. The length of the jth interval in the perceptual brightness domain is then:
By normalizing, we obtain the final expression of h_{j} :
where m is computed with global values as in eq. (15) with μ = μ_{a}. This expression guarantees that each jth subrange is mapped into a subrange coherently with Weber-Fechner's law, eq.(4). Note that we are computing the h_{j} values over the extrema of the Gaussians instead of the extrema of the intervals. Some HDR images present outliers produced by numerical errors while creating the HDR image, i.e. small numbers of pixels with values far away from the main mass of the histogram. If the number of outliers is small, the GMM algorithm does not model their group of values, and they will not be taken into account in the computation of h_{j}. This is important because we define h_{j} as a distance between extrema, so a single outlier could otherwise modify the final value of h_{j}. The main advantage of taking the extrema of the Gaussians is therefore that the effect of the outliers is minimized.
Finally, we can stick together all the scaled Naka-Rushton functions by defining C_{j} as:
The extension of this method to color images is exactly the same as for the TSTM method. We use eq. (17) but we substitute f_{μ}(λ) with:
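The merging step for the luminance plane can be illustrated with the following simplified Python sketch. It is not the exact construction of this section: it assumes contiguous intervals given by a list of breakpoints, a logarithmic normalized response on each interval, heights h_{j} proportional to the logarithmic length of each interval computed with a global m (taken here over the interval extrema rather than the Gaussian extrema), and offsets C_{j} equal to the sum of the preceding heights. All names and numeric values are hypothetical.

```python
import math


def normalized_response(lam, lo, hi, m):
    """Logarithmic response rescaled so that lo maps to 0 and hi maps to 1."""
    return math.log((lam + m) / (lo + m)) / math.log((hi + m) / (lo + m))


def merge_responses(edges, m_local, m_global):
    """Build a continuous global response from per-interval responses.

    edges: increasing luminance breakpoints [e_0, ..., e_N] (assumed contiguous).
    m_local: one m value per interval; m_global: m used for the heights.
    """
    pairs = list(zip(edges[:-1], edges[1:]))
    lengths = [math.log((hi + m_global) / (lo + m_global)) for lo, hi in pairs]
    total = sum(lengths)
    heights = [length / total for length in lengths]                # assumed h_j
    offsets = [sum(heights[:j]) for j in range(len(heights))]       # assumed C_j

    def r_global(lam):
        for j, (lo, hi) in enumerate(pairs):
            if lo <= lam <= hi:
                return heights[j] * normalized_response(lam, lo, hi, m_local[j]) + offsets[j]
        raise ValueError("luminance outside the covered range")

    return r_global


# Hypothetical breakpoints and per-interval parameters.
EDGES = [1e-2, 1.0, 1e2, 1e4]
r_g = merge_responses(EDGES, m_local=[0.05, 2.0, 50.0], m_global=1.0)
```

With offsets chosen as cumulative heights, the value reached at the right end of one interval coincides with the value at the left end of the next, so the global function is continuous and increasing even though each interval uses its own m.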
Comparing tone mapping results is very difficult given the perceptual nature of the problem. Here we adopt a perceptual approach, that is, the subjective judgment of the user, which has been the standard in the tone mapping community in recent years. We would nevertheless like to point out that interesting work is being produced on this topic: Drago et al. [ DMAC03 ], Yoshida et al. [ YBMS05 ], Kuang et al. [ KYL07 ], Ledda et al. [ LCTS05 ] and Čadík et al. [ CWNA08 ] all ran psychological experiments and selected the best tone mapping operators (TMOs) according to the users' answers. Recently, Aydin et al. [ AMMS08 ] have proposed a measure that outputs a numerical error by comparing the tone-mapped image with the original HDR image, allowing a more objective and reproducible way of judging the quality of a TMO.
In this paper we leave the numerical evaluation of our TMO for further research. We will show some results obtained by the multimodal TSTM method and compare them, both to TSTM and to some methods of the state of the art in Tone Mapping.
Let us start by comparing TSTM and multimodal TSTM. Multimodal TSTM was first introduced in order to obtain a higher amount of detail in areas misrepresented by μ(ρ) in the TSTM method. The effect of computing eq.(16) over subintervals of the luminance range is a greater accuracy in the representation of all image areas, as can be seen in the first row of fig. 4. Note how multimodal TSTM renders the stained-glass window well without diminishing the overall contrast of the image.
The overall brightness of the TSTM output images depended on two factors: the user-dependent parameter ρ and the proper location of the final value of μ. By locating the modes automatically, the user-dependent parameter is no longer required and the control of the overall brightness depends mostly on the h_{j} values. The consequence can be observed in the image 'Cars' in fig. 4: while TSTM produces good results contrast-wise, the overall brightness of the image is low for a midday scene. The reason for this improvement is that the modes obtained with GMM represent the mass of the histogram better (see the histogram in fig. 2(b)), thus the Naka-Rushton functions are more precisely located.
Figure 2. (a) In red we have the multimodal Naka-Rushton function with its three μ(j) as red circles, in blue the Naka-Rushton function obtained with TSTM and its μ(ρ) value. (b) Histogram of the 'Cars' image with the modes obtained with GMM in red and the μ(ρ) with ρ = 0.5 for the TSTM algorithm in blue. Note how the modes represent the mass of the histogram better.
As stated in section 2.4, when the value of μ(ρ) misrepresents the mass of the histogram, TSTM produces output images with unbalanced contrast between different areas of the histogram, i.e. while some areas present a great amount of detail, others tend to be flat. An example can be seen in the third row of fig. 4. The TSTM result gives a high amount of detail in the brighter areas (the sky), leaving the darker areas too bright to reproduce the perception of a shadowy scene. Multimodal TSTM, however, balances the contrast better over the whole image, obtaining more contrast in the darker areas.
Figure 3. Result obtained by tonemapping with the TSTM algorithm (right) and the multimodal TSTM algorithm (left) the synthetic image 'Bathroom', courtesy of Greg Ward.
On the other hand, multimodal TSTM seems to give unnatural results for synthetic HDR images. The reason could be that the method assumes features of a natural scene that may not be fulfilled by the synthetic image (see fig. 3).
Figure 4. Results obtained by tonemapping with (a) the TSTM algorithm and (b) the multimodal TSTM algorithm of the images: 'Nave' (first row) courtesy of Paul Debevec [ DM97 ], 'Cars'(second row) and 'GroveC' (third row), courtesy of Paul Debevec [ DM97 ].
Let us now discuss the results obtained by multimodal TSTM in comparison with three state-of-the-art methods: [ DD02 ], [ RWPD05 ] and [ MMS06 ]. From a color reproduction point of view, multimodal TSTM produces natural colors in dark and bright areas without over- or undersaturating tones, as can be seen in the sky of the 'Office' image or in the stained glass of the 'Memorial' and 'Desk' images, see fig. 5, 6 and 7, respectively.
Figure 5. Results of 'Office' produced by the methods based on the papers of (a) Durand et al. (b) Reinhard et al. (c) Mantiuk et al. (d) multimodal TSTM.
Figure 6. Results of the 'Memorial'(courtesy of Paul Debevec [ DM97 ]) produced by the methods based on the papers of (a) Durand et al. (b) Reinhard et al. (c) Mantiuk et al. (d) multimodal TSTM.
Figure 7. Results of the 'Desk' (courtesy of Industrial Light & Magic, all rights reserved. [2] ) produced by the methods based on the papers of (a) Durand et al. (b) Reinhard et al. (c) Mantiuk et al. (d) multimodal TSTM.
Taking into consideration overall contrast, multimodal TSTM reproduces details in bright and dark areas while maintaining overall contrast in the image (see the dark areas of 'Memorial' and 'Desk'). Therefore, in the authors' opinion, multimodal TSTM compares well to the state of the art.
We have proposed a multimodal extension of TSTM [ FPBC09b, FPBC09a ], a recent tone mapping operator for HDR images inspired by two sequential stages of the HVS: visual adaptation and local contrast enhancement. The first step is implemented through a suitable modification of the classical Naka-Rushton equation which combines range compression and global rendition of contrast following Weber-Fechner's law. The second step is performed through a variational algorithm for contrast enhancement which is also able to reduce the effect of a possible color cast. The multimodal extension proposed in this paper only affects the first step, while the second is kept unchanged. Our proposal is to use the GMM to approximate the modes of the logarithmic histogram through a collection of Gaussians and to use this information to implement, for each mode, a Naka-Rushton function that distributes the tone values better than a global transformation. Finally, all these restricted Naka-Rushton functions are merged together.
Our tests have shown that this multimodal extension indeed corresponds to a better rendition of contrast with respect to the global Naka-Rushton transformation proposed in [ FPBC09b ]. Besides this improvement, with the new method we no longer need to restrict the range of HDR images to 5 orders of magnitude, while in [ FPBC09b ] this was essential in order to avoid a weak rendition of global contrast.
We are currently working on the extension to motion pictures of the technique presented in this paper. As an initial approach we would like to use several consecutive frames, corresponding to the same shot, to build the function in eq. (21), and then apply our TM operator, using this function, to all the frames in the shot. It is expected that, if there are no sudden and abrupt changes of luminance in the sequence, the output will not show mapping artifacts.
As a theoretical drawback of our model, we point out that the way in which we build the domain and codomain of the Naka-Rushton functions corresponding to each Gaussian is not perfectly coherent in the case of overlapping Gaussians. As future work, it would be interesting to find a smooth way to overcome this problem.
The authors would like to thank L. Sánchez for his photograph, and F. Durand, E. Reinhard, and G. Ward for providing the implementations of the state-of-the-art methods. M. Bertalmío and V. Caselles acknowledge partial support by PNPGC project, reference MTM200614836. V. Caselles also wants to acknowledge the "ICREA Acadèmia" prize for excellence in research funded by the Generalitat de Catalunya. E. Provenzi acknowledges the Ramón y Cajal fellowship by Ministerio de Ciencia y Tecnología de España.
[AMMS08] Dynamic range independent image quality assessment, ACM SIGGRAPH 2008 papers, 2008, Los Angeles, California, pp. 1—10.
[Bis06] Pattern Recognition and Machine Learning (Information Science and Statistics), Springer, 2006, isbn 0387310738.
[CW03] L. M. Chalupa and J. S. Werner, The visual neuroscience, MIT Press, 2003, isbn 0262033089.
[CWNA08] Evaluation of HDR tone mapping methods using essential perceptual attributes, Computers & Graphics, (2008), no. 3, 330–349, issn 0097-8493.
[DD02] Fast bilateral filtering for the display of high-dynamic-range images, SIGGRAPH 2002, Proceedings of the 29th annual conference on Computer graphics and interactive techniques, 2002, pp. 257–266, isbn 1581135211.
[DM97] Recovering high dynamic range radiance maps from photographs, Proceedings of the 24th annual conference on Computer graphics and interactive techniques, ACM Press/Addison-Wesley Publishing Co., 1997, pp. 369–378, isbn 0897918967.
[DMAC03] Adaptive logarithmic mapping for displaying high contrast scenes, Computer Graphics Forum, (2003), 419–426, issn 1467-8659.
[FPBC09a] An analysis of visual adaptation and contrast perception for a tone mapping operator, IEEE Transactions on Pattern Analysis and Machine Intelligence, (2009), no. 10, 2002–2012, issn 0162-8828.
[FPBC09b] TSTM: A two-stage tone mapper combining visual adaptation and local contrast enhancement, 2009, IMA Preprint, http://www.ima.umn.edu/preprints/may2009/may2009.html, Last visited May 24th, 2012.
[FPSG96] A Model of Visual Adaptation for Realistic Image Synthesis, Proceedings of SIGGRAPH 96, Computer Graphics Proceedings, Addison Wesley, 1996, pp. 249–258, isbn 0897917464.
[GKB99] An anchoring theory of lightness perception, Psychol Rev, (1999), no. 4, 795–834, issn 0033-295X.
[KMS05] Lightness Perception in Tone Reproduction for High Dynamic Range Images, Computer Graphics Forum, (2005), no. 3, 635–645, issn 1467-8659.
[KS08] Mathematical Physiology, Springer, 2008, isbn 9780387094199.
[KYL07] Evaluating HDR rendering algorithms, ACM Trans. Appl. Percept., (2007), no. 2, 9, issn 1544-3558.
[LCTS05] Evaluation of Tone Mapping Operators using a high dynamic range display, Proceedings ACM Transactions on Graphics, (2005), no. 3, 640–648, issn 0730-0301.
[LFUS06] Interactive local adjustment of tonal values, SIGGRAPH '06: ACM SIGGRAPH 2006 Papers, Boston, Massachusetts, ACM, New York, NY, USA, 2006, pp. 646–653, isbn 1595933646.
[LRP97] A visibility matching tone reproduction operator for high dynamic range scenes, IEEE Transactions on Visualization and Computer Graphics, (1997), no. 4, 291–306, issn 1077-2626.
[Lum95] Making movies, Alfred A. Knopf, 1995, isbn 0679437096.
[MMS06] A Perceptual Framework for Contrast Processing of High Dynamic Range Images, ACM Transactions on Applied Perception (TAP), (2006), no. 3, 286–308, issn 1544-3558.
[NCY79] Adaptation in catfish retina, Journal of Neurophysiology, (1979), no. 2, 441–454, issn 0022-3077.
[PAPBC09] A perceptually inspired variational framework for color enhancement, IEEE Transactions on Pattern Analysis and Machine Intelligence, (2009), no. 3, 458–474.
[Pra07] Digital Image Processing: PIKS Scientific inside, J. Wiley & Sons, 2007, 4. ed., newly updated and rev. ed., isbn 0471767778.
[PTYG00] Time-Dependent Visual Adaptation For Fast Realistic Image Display, Proceedings of SIGGRAPH, 2000, pp. 47–54, isbn 1581132085.
[RD05] Dynamic Range Reduction Inspired by Photoreceptor Physiology, IEEE Trans. on Visualization and Computer Graphics, (2005), no. 1, 13–24, issn 1077-2626.
[RSSF02] Photographic tone reproduction for digital images, ACM Trans. Graph., (2002), no. 3, 267–276, issn 0730-0301.
[RWPD05] High Dynamic Range Imaging, Acquisition, Display, And Image-Based Lighting, Morgan Kaufmann Ed., 2005, isbn 0125852630.
[SEC84] Visual adaptation and retinal gain controls, Progress in Retinal Research, (1984), 263–346, issn 0278-4327.
[TAMS08] Digital camera workflow for high dynamic range images using a model of retinal processing, Proceedings SPIE, LCAV, 2008, issn 0277-786X.
[WS82] Color science: Concepts and methods, quantitative data and formulae, John Wiley & Sons, 1982, isbn 0471021067.
[YBMS05] Perceptual evaluation of tone mapping operators with real-world scenes, Human Vision and Electronic Imaging X, SPIE, 2005, pp. 192–203, isbn 9780819456397.
[1] N, the total number of modes, depends on the orders of magnitude of the image, we have taken .
[2] Copyright (c) 2004, Industrial Light & Magic, a division of Lucasfilm Entertainment Company Ltd. Portions contributed and copyright held by others as indicated. All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

Neither the name of Industrial Light & Magic nor the names of any other contributors to this software may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
License
Any party may pass on this Work by electronic means and make it available for download under the terms and conditions of the Digital Peer Publishing License. The text of the license may be accessed and retrieved at http://www.dipp.nrw.de/lizenzen/dppl/dppl/DPPL_v2_en_062004.html.
Recommended citation
Sira Ferradans, Marcelo Bertalmio, Edoardo Provenzi, and Vincent Caselles, A multimodal approach to perceptual tone mapping. JVRB - Journal of Virtual Reality and Broadcasting, 9(2012), no. 7. (urn:nbn:de:0009635145)
Please provide the exact URL and date of your last visit when citing this article.