This question was posted by John earlier and I think it is an interesting one, so I re-post it here to get a separate discussion going.
Let's take the example of pology.com. There I found two different kinds of postings: 1) photo essays and 2) articles. The photo essays basically consist of photos only, no text. The articles, on the other hand, consist of a lot of text and a few photos which, however, do not correspond to the text. This is what I consider problematic from an anthropological perspective: to have no text to go with the photos and/or to have a dissonance between what is written in the text and what is visible in the photos. In German the latter would be called a "Text-Bild-Schere" (lit. text-picture-scissors) because text and picture diverge like the blades of a pair of scissors. For a photoethnography, I think, we need both: text and photos. But most importantly, we need them to talk to each other. This is different from saying the text should “explain” the photograph. If a photograph is good, it doesn’t need explanation. However, it should always be contextualized. What do the others think?
To kick off the conversation: I have encountered similar problems when writing for the photoethnography blog of the project I am working in (www.kalaureiainthepresent.org; I blog there as 'pwork'). The main problem, I think, is this: a photograph by itself, especially when it has to do with something material and not a 'situation', cannot usually convey the richness of ethnographic detail one demands of an ethnography. The question then arises whether one starts 'filling in' the gaps of the picture by providing information. To me, this is certainly useful, but it sometimes makes for bad prose and, at the end of the day, it does not really complement the photograph. Additionally, it might 'tie' the picture to the text and degrade the image to an illustration of what is written. Something that helped me think about this was Sergei Eisenstein's 'The Film Sense' and his theory of montage. Eisenstein said, if I read him correctly, that when two elements are juxtaposed (in his case two scenes) they create a new 'gestalt', something more than their sum, and that's the bottom line of montage. So instead of thinking of image versus text in an either/or fashion, perhaps we could start thinking of text and image as elements of a montage that create a new thing, which is called... I dunno what. I think this is what John means when he says that text and photos 'should talk to each other'. I found out that the blog format can actually be used for such exercises, when the picture is posted on a page and the text in a comment. This way, I think, the two are separated technically (meaning that one can focus on the picture first and then read the comment if she so wishes), but also interconnected. I have also done this because only some of the pictures there are mine, and most were taken by our collaborator Fotis Ifantidis, so there was the issue of letting his work speak for itself while still providing the ethnographic context for the interested reader.
I will stop here. I apologise for the shameless plug, but I think it is better if we all speak of our practical experiences with photoethnographic projects to keep the conversation going.
I am not sure in which ways a photograph can communicate things. I mean, we take too much for granted the visual 'truth' of the photograph as evidence of something that 'really' happened. I think this was partly a reaction, at least in the past decade, to what was perceived as anthropology's 'textuality'. People needed to break out of that, and with good reason. However, a picture is much more than a symbol or a sign. It has a number of layers, as Ryan pointed out so well, and so it has a million possible visual referents. Think of the example Clifford Geertz gave in 'Thick Description' of the wink of an eye. Geertz borrowed the example from Gilbert Ryle to show that a wink can mean a number of different things, and that its meaning can be clear only within a specific context. I am therefore not sure whether an image can really stand *for* something in a context-free environment. Additionally, the picture itself *is* something, meaning it is a material artefact, and people should refer to the conditions of its creation (e.g. how did the photographer end up there? did people consent to her taking their pictures? do people know these are going to be published? and so on). Now, I am not saying that all this should end up in a caption, because that would make boring reading at best. I am just trying to say that maybe we are putting too much weight on the truth of the picture without considering how this truth may be plural rather than singular.