Still Photos With Sound


By Michelle I. Seelig

WJMCR 23 (August 2010)

Introduction|Background|Media Entertainment|Research Questions|Procedures|Findings|Discussion and Conclusion


Whether for a news website or for personal use, the Web allows for the opportunity to tell stories in unique and compelling ways unimaginable in print, such as photo slideshows, photo galleries, flash videos, QuickTime movies, etc. Just as static images may be entertaining or enjoyable to viewers, it is possible that if sound is added to the static image, individuals will find this viewing experience entertaining and enjoyable. Therefore, this research employed a web survey to collect data regarding the entertainment and enjoyment value of viewing still images with sound. The web survey included a series of photographs with environmental sound presented as QuickTime movies embedded in pages of the survey to measure if there is a positive affect upon viewing still images with sound. Participants were recruited through visual communication and technology listservs. This exploratory research indicates that participants are receptive to the entertainment aspect that sound adds to viewing still images.

Introduction

Whether for a news website or for personal use, the Web allows for the opportunity to tell stories in unique and compelling ways unimaginable in print, such as photo slideshows, photo galleries, flash videos, QuickTime movies, etc. Many websites let users organize images on their home computer and upload them to the Web; clean up, color correct, and retouch images; add effects; create web photo albums that can be shared with others; produce slideshows with or without sound; and order high-quality prints sent directly to the home.1 There are also sophisticated websites that provide users the ability to combine personal still images with music to create high-energy music videos or slideshows.2 “Consumers are learning how to use these different media technologies to bring the flow of media more fully under their control and to interact with other users.”3 In this regard, consumers are producers using the same technology that is available to the news media. Mostly though, it is the younger generation that is quicker to adapt to innovative modes of communication and to use the latest technological devices.4

Yet despite this change in media consumption, mainstream audience research lacks attention to the gratifications sought by Internet users, particularly from an entertainment point of view.5 According to Bosshart and Macconi, “ . . . entertainment can be described as experienced as a reception phenomenon”.6 In this ‘new’ media culture, there is the potential for the combination of still images with sound (i.e., environmental sound or narrative) as a viable presentation technique to engage the viewer in a sensory modality beyond the traditional experience of viewing a static image. Just as static images may be entertaining or enjoyable to viewers, when sound is added to the static image, it is possible that individuals will find this viewing experience entertaining and enjoyable too. For this reason, this research is a pilot study of techno-savvy web users exposed to the alternative visual storytelling technique of still images with sound. Specifically, this research is an exploratory investigation of how sound interacts with still images and fulfills the entertainment function that the convergence of media provides the viewer.

Background

Recent technological innovations have sparked a number of research streams, such as the design of websites, users’ interaction with websites, user satisfaction with technology for work or learning purposes, the impact of technology in routinizing work, the effects of playing video games, and which technologies are better suited to execute certain tasks. Yet little research has specifically addressed how technology adds to our viewing pleasure, and theoretically driven research is lacking.

Patwardhan examined Internet user involvement and satisfaction with online activities in both India and the United States.8 According to Patwardhan, satisfaction is a natural outcome of media use and can be viewed as a general feeling of contentment resulting from frequent use of a particular medium. Also in 2004, Lowrey examined how hypermedia impacts non-linear web stories from the user’s perspective. Findings revealed that non-linearity had a positive effect on the degree of perceived control of the reading experience. Yet at the time of his research, Lowrey said that readers had not entirely grasped the relationship between hypermedia and the non-linear story format. According to Lowrey, “the real issue may not be the structural characteristics of the non-linear story itself, but rather the degree to which readers can make sense of story content when it is viewed from within a structure that is less familiar.”10

In 2007, Huang found rich media adoption trickling down from top media markets to lower markets.11 Huang’s analysis also revealed that many newspapers joined forces with local television stations to provide content beyond their respective platforms, and others have started to produce their own rich media to accompany news content. However, Huang found that newspaper websites still treat rich media as a supplement to text-based content, and producers of these sites continue to follow the media logic of their print counterparts. “In other words, technology is facilitating the transformation of news presentations on the Internet but is still subject to the boundary set by the social-cultural tradition of news production.”12 He predicts this will change in the years to come as producers of content become versed in producing rich media content; still, “much work remains to be done to provide viewers a truly pleasant converged media experience on the Internet.”13

While it stands to reason that improvements in technology are a way to expedite user satisfaction with traditional media’s presence on the web, what remains to be uncovered is how best to incorporate emerging technologies to create a unique and compelling story. Anyone with a computer, digital camera, photo editing software and Internet access can distribute and share content in today’s media environment. So, it is reasonable to believe that more emphasis will be placed on producing a multimodal product that is both enjoyable and informative given that the convergence of media and technology is visually driven, especially if how a media message is crafted and presented can lead to enjoyment.14

Media Entertainment

Technological innovation challenges the paradigm of visual storytelling by shifting the storytelling experience of just text and still image on the printed page to still image and sound on a web page or other multimodal entity. While photography presented a static image expressive within a 2-D sensory experience, it is possible that on the web that sensory experience is enhanced with sound. The end result integrates both the static nature of still photos and sound to create a depth of sensory experience that results in a “new” multimodal storytelling presentation to tell stories.15

In the old media paradigm, Vorderer points out that consumers were presented with “text and pictures in books and magazines and on radio and TV stations in a way that allowed audience members to attend to a presentation.”16 However, technological innovation has shifted how consumers attend to their media content from a receptive audience to active participants engaging with multimedia content. In this media paradigm, individual media consumers and their interaction with multimedia offerings now replace the mass audience.

Klimmt and Vorderer offer several possible theories to better understand the effects of new media on the audience and these theories relate to various states of media entertainment such as fascination, delight, enjoyment, and astonishment.17 Even though past research examined how the audience used media to fulfill needs and entertainment is considered one of those functions that the media performs, most of the research focused on program enjoyment, viewing of scary movies, humorous commercials, watching sports, playing video games or the viewing of pornography, as well as mood management, stimulation and arousal.18 There are a few studies that have explored how newer technologies provide entertainment or enjoyment for media seekers.19 According to Vorderer, “It is indeed often expected that the audience will accept and even seek out new forms of media use if they can receive entertainment in the process. More than anything else, this expectation is based on the observation that it is entertainment that has grown and diversified…”20

Cupchik and Kemp make a case that “the entire organization of information on “the Net” is orchestrated to enable you to access this information quickly and efficiently.21 While messages might be packaged to look a little more “aesthetic,” this only means they are made a little more visually salient in order to attract your attention. …Aesthetic artifacts can also be distinguished from everyday information-oriented materials because they combine both subject matter and style. Subject matter is any kind of factually oriented (i.e., semantic) information about the physical or social world. Style has to do with the way that the physical/sensory qualities (i.e., syntactic information) of a message are organized and affect sensory experience.”22 In this regard then, both content and the style of that content are important to the user and add to the total viewing experience. Whether viewing media products (i.e., movies or websites) or works of art (i.e., opera or novels) “the same principles that are applied to other kinds of entertainment or cultural artifacts.”23

The web provides a change from the old media paradigm to a “new” process of bringing together a variety of media forms that may vary in linearity/nonlinearity and activate different senses when using the media. It is possible then, that “media settings designed for working, learning, or other non-entertainment purposes may be very enjoyable to use.”24 Therefore, it is the contention of this investigation that entertainment and enjoyment are a reason users engage in the creation and distribution of such content.

Research Questions

Entertainment research is not new and has been subject to a variety of interpretations.25 Bosshart and Macconi defined entertainment as the gratification sought by users of most media. “After all is said and done, entertainment is pleasure. It means experiencing pleasure by witnessing or being exposed to something.”26 Entertainment is a receptive function by the individual experiencing it either for psychological relaxation (restful, distracting), change and diversion, stimulation (interesting, exciting, thrilling), fun (amusing, funny), atmosphere (beautiful, pleasant, comfortable) and joy (happy, cheerful). According to Bosshart and Macconi, semantic differentiation provides a profile of what encapsulates entertainment along three factors: assessment (good, pleasant, agreeable, beautiful, and enjoyable), potential (light, restful, easy, not demanding, and not compulsory), and activity (stimulating, dynamic, alive, exciting, thrilling, spontaneous, and fast); and, the opposite of entertainment is boredom (5). It is plausible then, that entertainment taps into the enjoyment either of the senses, mood management, wit and knowledge, or feelings, and is a function of a pleasant experience or pleasure-seeking/viewing behavior.

Entertainment is also a construct often associated with enjoyment and suggests that the reader or viewer has a generally positive disposition toward the media content. While liking is often used to evaluate a program or character, enjoyment is appropriate to capture “the more experiential nature of the viewing dynamic. That is, whereas liking reflects reactions (cognitive, affective, or both) to a media message, enjoyment can reflect reaction to both the message as well as the fuller media experience, including situational and contextual elements.”27 Thus, entertainment encompasses both the evaluative and experiential components of the media experience. Therefore, this study extends the line of entertainment research to include whether or not still images with sound fulfill the entertainment function and whether individuals find the viewing experience enjoyable.

RQ1: Does viewing images with sound have a positive effect on viewing experience?

Media psychologists have examined traditional media effects on the audience and parallels may be drawn to emerging technologies to better understand the effects of media reception such as information processing; cognitive, affective and behavioral effects of media use; as well as motivational determinants of media use.28 “A number of scholars argue that research which focuses on distinctions among technologies rather than on distinctions about the experiences that technologies create has prevented treating the emergent concepts of hypermedia as qualities of communication.”29 Barbatsis defined mediated communication as a social construction of the real world experienced through a mediated entity. The media create a presence for the viewer so as to transport the viewer, albeit for a brief period in time, through the story as if mentally part of the mediated experience. “Hypermedia immerses its participants in its simulated environments by its particular aesthetic potentials for clarifying and intensifying sense perceptions.”30 Barbatsis proposed that properties of hypermedia and the mediated experience are merging together in the online world to create a new form of transportation.

Green and Brock “conceptualize transportation into a narrative world as a distinct mental process, an integrative melding of attention, imagery, and feelings.”31 According to Green, Brock and Kaufman, transportation theory “contributes to the conceptual understanding of enjoyment by helping to specify mechanisms underlying enjoyment, including (a) the phenomenological experience of enjoyment through immersion in a narrative world, (b) enjoyment through beneficial consequences of media exposure, and (c) the circumstances under which enjoyment is enhanced or reduced.”32 Green et al. suggest that transportation is a key component of the narrative world and is useful in aiding our understanding of how an individual enjoys the media experience. Individuals find their media experience enjoyable because “it takes individuals away from their mundane reality and into a story world.”33 In this regard, media are able to transport the viewer into a narrative world and thus create an enjoyable media experience.

Transportation has been measured using a 15-item scale that taps into cognitive, affective, and imagery involvement with the media experience.34 Although transportation typically investigates attitude change, it is possible that the experiential aspect of immersion created by the narrative world of static images with sound generates a theoretical link between transportation and enjoyment as opposed to attitude change.35 “In some cases, individuals may enjoy a media experience because they feel it has given them new knowledge or enriched their lives in some way, such as providing greater insight into an historical event or a philosophical problem.”36 Nabi and Krcmar suggest that it is possible to enhance our understanding of media enjoyment by employing the transportation scale with a single-item measure where the viewer rates how much he or she enjoyed or liked the media content; or, the viewer answers several questions about enjoyment of the media content (e.g., enjoyable, entertaining, likeable).37 Although transportation typically investigates attitude change, it is possible that the experiential aspect of immersion created by the narrative world of static images with sound generates a theoretical link between transportation and enjoyment as opposed to attitude change.38

According to Martinec and Salway, consumers traditionally view still images with text as a multimodal experience.39 Bolls reminds us that radio is referred to as theater of the mind.40 Similar to reading books, listening to the radio helps the audience member evoke a picture in their imagination. Much in the same way color changed the way we looked at photography, sound is also changing the way we look at still images. Therefore, it is possible that one’s presence in the narrative world is associated with how much a person enjoys the media presented to them. To the extent that participants are absorbed by the presentation of still images with sound, this rich media experience will engage the viewer in a sensory modality beyond just viewing a static image, and will lead to greater transportation and enjoyment. Accordingly, the following research question is posed:

RQ2: Is transportation related to the enjoyment of viewing still images with sound?

Procedures

To carry out this unconventional look at storytelling, this pilot study employed a web survey to collect data regarding the entertainment and enjoyment value of viewing still images with sound. Following a similar practice employed by Huang and Marsiglio,41 participants were recruited through multiple listservs related to photography, new technology and visual communication for professionals, academics and students through such communication organizations as the Association for Education in Journalism and Mass Communication, the National Communication Association, and the National Press Photographers Association, as well as students majoring in communication enrolled at a private southeastern university. Participants had to be members of the above-mentioned listservs. The first email was posted on the listservs at the end of November 2006, one month prior to winter break at colleges and universities. A follow-up posting was sent in February 2007 after colleges and universities returned for the spring semester. The survey took approximately 15 minutes to complete. All information remained anonymous and participation in the survey was strictly voluntary. The survey link was open 24 hours a day for both time periods.

A web survey42 was constructed to collect the test data regarding viewing experience of photos with sound. An email message was sent to the above-mentioned listservs asking for participation and providing the link to the web survey for those interested in completing the questionnaire. The survey started on the home page with an introduction to the purpose of the survey and informed participants that a broadband connection, QuickTime, and audio capabilities were needed for completion of the web survey. Those participants willing to take the survey clicked the hyperlink that took them to the survey.

A series of seven photographs with environmental sound was presented as QuickTime movies embedded in pages of the survey. No text or narration was provided. Photographs used in this survey had no motion or animation; they were just single images with environmental sound. All images and sound were gathered by the author and prepared as QuickTime files. Photos consisted of everyday subjects such as:

  • a photograph of Fort Lauderdale beach with birds flying and waves crashing at the shore, with sound of waves crashing and birds flying;
  • a photograph of a trolley in San Francisco with sound of the trolley;
  • a photograph of the Golden Gate Bridge with cars passing, with sound of cars passing by;
  • a photograph of people in a restaurant (people in the image are blurred so that faces are not distinguishable), accompanied by sound of people in a restaurant;
  • a photograph of people walking around a park at dusk (people in the image are not facing the camera and the image is too dark to see faces), with sound of people talking and walking;
  • a photograph of Chinatown in San Francisco with sound from Chinatown;
  • a photograph of waves crashing at Fisherman’s Wharf, San Francisco, with sound of waves crashing.

Participants were asked to indicate their general impression of the photo using such items as: good, adds information, entertaining, and helpful; items were measured using a Likert-type scale (1 = very strongly disagree to 7 = very strongly agree).

After viewing all seven photos, participants completed a post-exposure instrument. Post-exposure items were adapted from previous entertainment research and consisted of 10 items on which individuals rated how much they enjoyed or liked the overall media content they had just experienced (e.g., enjoyable, distracting, useful).43 Previous entertainment studies have also used semantic differential scales for post-exposure items. However, due to the nature of the web survey software, each item was placed on a seven-point Likert-type scale (1 = very strongly disagree to 7 = very strongly agree), which, according to Mendelson and Thorson, “provides for a wider variation in responses.”44 Participants also completed a modified version of Green and Brock’s transportation scale using a seven-point Likert-type scale (1 = very strongly disagree to 7 = very strongly agree).45 Only general transportation scale items were included as each photograph varied in content. The scale was defined by items such as, “While viewing the images, I could easily picture the events in it taking place,” and “While viewing the photos with sound I had a vivid image of the content.”

Findings

A convenience sample of participants was recruited through multiple listservs related to photography, new technology and visual communication for professionals, academics and students through such communication organizations as the Association for Education in Journalism and Mass Communication, the National Communication Association and the National Press Photographers Association, as well as students majoring in communication enrolled at a private southeastern university. Accurate, up-to-date data regarding membership in the listservs, as well as cross-membership, were not available; therefore, it is estimated that more than 900 members of the aforementioned listservs received an invitation to participate in this survey. In total, 219 submissions were received. However, the survey software showed that only 154 respondents completed the web survey, and only completed surveys are included in this analysis, a response rate of 17%. Demographic information revealed that respondents represented a broad spectrum. Among the survey respondents, approximately 54.7% were male and 45.3% were female. As for education, 3.3% completed high school, 21.4% reported some college, 2.7% were college graduates, 31.3% completed their master’s degree, and 41.3% earned a doctoral degree. The age of respondents also varied; 28.3% were 18 to 25 years of age, 19.3% were 26 to 34 years of age, 19.3% were 35 to 44 years of age, 19.3% were 45 to 54 years of age, and 13.8% were 55 years old and higher.
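As a quick check on the arithmetic, the reported response rate can be recomputed from the counts given above. This is only a sketch: the denominator of 900 is the paper's own lower-bound estimate of invitations received, and the "completion rate" among submissions is an additional figure computed here for illustration, not one reported in the study.

```python
# Counts taken from the text; 900 is the paper's estimate ("more than 900"),
# so the true response rate may be somewhat lower than computed here.
invited_estimate = 900
submissions = 219
completed = 154


def rate(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


response_rate = rate(completed, invited_estimate)   # completed / invited
completion_rate = rate(completed, submissions)      # completed / submitted (illustrative)

print(f"Response rate:   {response_rate:.1f}%")
print(f"Completion rate: {completion_rate:.1f}%")
```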

Research Question 1 sought to determine if viewing images with sound has a positive effect on viewing experience. Participants rated their general impression of each image with sound immediately after viewing. Items are summarized in a scale designed to measure the overall positive effect of viewing still images with sound (see Table 1).

Table 1. Impression of Still Images With Sound
Good              80.5%
Adds Information  81.9%

The Cronbach’s alpha coefficient for this measure was .88.46 Further, more than half of the respondents (51.9%) indicated if given the choice, they would prefer viewing still photos with sound, and just less than a quarter of respondents (24.0%) said given the choice, they would not prefer viewing still photos with sound. Thus, in answer to research question one, the majority of participants appeared positive toward viewing still images with sound.
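For readers unfamiliar with the reliability coefficient cited above, the following is a minimal sketch of how Cronbach's alpha is conventionally computed: alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The ratings below are hypothetical seven-point responses invented for illustration; they are not the study's data, and the study's own software and procedure are not described in the text.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    items = list(zip(*rows))                      # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])  # variance of summed scale scores
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical 7-point ratings on four impression items (not the study's data).
hypothetical_ratings = [
    [6, 5, 6, 5],
    [4, 4, 5, 4],
    [7, 6, 7, 6],
    [3, 3, 2, 3],
    [5, 5, 5, 4],
]

alpha = cronbach_alpha(hypothetical_ratings)
print(f"alpha = {alpha:.2f}")
```

Values closer to 1 indicate that the items vary together and can reasonably be summed into a single scale.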

Research Question 2 examined the relationship between transportation and enjoyment of viewing still images with sound. In order to demonstrate validity and reliability of the transportation measures adapted from Green and Brock,47 a confirmatory factor analysis was conducted on the seven transportation items and accounted for 71.70% of the variance (see Table 2).

Table 2. Transportation Scale Items
Item                                                                              Factor Loading  M
While viewing the images, I could easily picture the events in it taking place.   .636            2.78
I was mentally involved with the photos with sound while viewing them.            .776            2.86
While viewing the photos with sound I had a vivid image of the content.           .829            3.01
While viewing the photos with sound I became immersed in the content.             .789            3.46
Viewing photos with sound helped reinforce the context of the story.              .719            3.15
I wanted to know more about the story after viewing the photo with sound.         .613            3.78
Do you think sound helped add reality to the context of the photo?                .657            3.06

The dimensionality of the transportation items was examined following Hunter and Gerbing’s48 recommended procedures for confirmatory factor analysis. Hunter and Gerbing argued that a valid unidimensional scale would consist of items that (1) contribute equally to the total score (internal consistency) and (2) contribute equally to the prediction of criterion variables (parallelism). The application of the recommended procedure resulted in a single set of items meeting the criteria of internal consistency and parallelism. The solution exhibits a simple structure with a clear solution, each item dependent on only one factor, transportation. Individual variables were explained fairly well (with communalities ranging from .613 to .829), and the scale achieved an acceptable Cronbach’s alpha of .95. Evidence suggests that this instrument reasonably explains transportation in the context of viewing still images with sound. The relationship between transportation and post-exposure enjoyment items was tested using Pearson’s bivariate correlation, which indicated a correlation (r = .677; p < .001) between transportation and post-exposure enjoyment items. Transportation and impression items were also tested using Pearson’s bivariate correlation, which showed a somewhat moderate correlation (r = .497; p < .001) between transportation and impression of viewing still images with sound. Thus, these findings show that transportation is an indicator of an individual’s enjoyment of viewing still images with sound.
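The correlations reported above can be illustrated with a short sketch of Pearson's r. The paired scores below are hypothetical and serve only to show the computation; the final lines square the two coefficients reported in the text to express them as shared variance, a reading that is added here for illustration and is not a figure the study itself reports.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical per-respondent scale means (not the study's data).
transportation = [2.1, 3.4, 4.0, 2.8, 5.2, 3.9]
enjoyment = [2.5, 3.0, 4.4, 3.1, 4.9, 3.2]
r_example = pearson_r(transportation, enjoyment)
print(f"r (hypothetical data) = {r_example:.3f}")

# Shared variance implied by the coefficients reported in the text.
r_enjoyment, r_impression = 0.677, 0.497
shared_enjoyment = r_enjoyment ** 2
shared_impression = r_impression ** 2
print(f"r^2 = {shared_enjoyment:.3f} (enjoyment), {shared_impression:.3f} (impression)")
```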

Discussion and Conclusion

The present study is an exploratory investigation of the recent cultural trend of viewing still images with sound. The evidence at hand indicates that the multimodal experience of still images with sound was received positively and that, for the most part, participants enjoyed the overall experience of viewing still images with sound. As suggested by Green et al., enjoyment of the media experience is further explained by transportation into the narrative world through the sensory experience of viewing still images with sound. “The presence of rich detail leads to greater transportation and enjoyment, perhaps because details allow individuals to form more vivid mental images.”49 Therefore, still images with sound create a rich media experience, as opposed to just viewing a static image, and lead to greater transportation and enjoyment. Taken as a whole, the results of this research indicate that as new media technologies continue to enter the marketplace, audiences are receptive and adapting to changes in presentation techniques associated with storytelling.

This change in perception from Lowrey’s research suggests that users of the Web have become familiar with the hypermedia culture of today and recent trends in visual storytelling. According to Barbatsis, “Coming to terms with a new medium of expression typically raises questions for communication scholars about how the experiential environments it creates are at once distinct from and similar to other forms of mediated communication.”50 McLuhan’s idea that the “content” found in traditional media now “is made strong and intense just because” it plays out on the Web,51 occurs because it is the convergence of media that allows for a new medium to be born. While McLuhan articulated hot and cool media associated with different sensory modalities, it might very well be that only by exploring sensory participation through the lens of media entertainment will we fully comprehend the significance of his writings. “The perception of reality now depends upon the structure of information. The form of each medium is associated with a different arrangement, or ratio, among the senses, which creates new forms of awareness. These perceptual transformations, the new ways of experiencing that each medium creates, occur in the user regardless of the program content.”52 Technology is growing smarter and empowering publishing, television, and music audiences with more choices to view and interact with media content. For this reason, the entertainment concept is a useful indicator of audience consumption and potential profit regarding the unconventional methods media employ to tell unique and compelling stories unimaginable via traditional media.

The evidence provided here is informative; however, this research is limited in scope because it lacks a comparison between photos with sound and photos without sound. Although the findings do indicate that viewing still photos with sound is entertaining, we cannot say how much entertainment value sound added to viewing still photos. Future research ought to consider the effectiveness of adding sound through controlled measures and possibly the addition of text and/or narration with still images to further our understanding of the effects of rich media tools as part of the storytelling experience. These findings also need to be interpreted with caution due to this study’s lack of a representative sample and of a comparison condition. This research used a convenience sample that is not truly representative of Web users. The sample was selected to represent a grouping of individuals that could readily understand visual communication and emerging technologies. It is highly likely that the participants of this survey are more familiar with these technologies than the average consumer and that this skewed the results to some degree. For these reasons, research should expand to general users of the Web to see if the entertainment concept is pulling users to websites that provide such dynamic content.

Further research might also explore the cognitive capabilities and psychological characteristics of individuals as predictors of media use and enjoyment of specific media, or whether participants who express a stronger need to keep up with the latest emerging media are more receptive toward these multimodal storytelling products. It would also be interesting to see whether viewing still images with sound aids the recall of information and helps reinforce the context of the story. Such additions would allow future studies to make more of a contribution and to yield more clearly interpretable and conclusive findings. Therefore, with the refinement and further development of these emerging technologies, research should continue to track audiences to see if these new techniques for visual storytelling are enjoyable and add to the viewing experience.

This work represents an initial attempt to advance our understanding of the multimodal experience of viewing still images with sound. Media technologies will continue to emerge, as well as offer users of technology new outlets of expression; therefore, research in this realm should continue to address the affective components of media entertainment. In summary, it appears that the entertainment construct, specifically enjoyment, is a useful indicator in determining why users engage in this media experience. Although the media industry is still working out how best to incorporate current cultural changes to visual storytelling, audiences are not only receptive to these new storytelling techniques; they also find them enjoyable and for the most part entertaining. What this pilot study does provide is a starting point for building a more detailed and theoretical discussion as to why individuals are receptive to these visual storytelling techniques.

Michelle I. Seelig is an associate professor of communication at the University of Miami. This study was funded, in part, by a grant from the School of Communication, University of Miami.
