Visualizing author Marcel Proust's words: Fine Art Photography vs. Generative AI

Visual 1st Perspectives

January 17, 2023

By Alexis Gerard, photographer and co-host of Visual 1st

In July of 2022 my partner Annabelle Matter and I traveled to the French region of Normandy to shoot, on location, a project we’d been researching for months: translating into photographs a dozen paragraphs from a text by French author Marcel Proust (of “Remembrance of Things Past” fame), who is renowned for his ability to convey vivid visual and emotional impressions to his readers through highly detailed and insightful descriptions. The concept was to start from the mental images and emotions we personally experienced through reading Proust’s text, and to render them as closely as possible through real-world photographs.

The text we selected was a milestone article entitled “Impressions de Route en Automobile”, widely considered to have been a prototype for “Remembrance of Things Past”. Published in November of 1907, it narrates a part-factual, part-imagined road trip in a chauffeur/mechanic-driven car between the small Normandy towns of Cabourg and Lisieux. From the standpoint of logistics and creative endeavor, the locations it describes fall into three categories: real places that still exist and can be visited, real places that existed at the time of Proust’s writing but have since been lost to war or urban development, and imaginary places cobbled together by the writer from a variety of impressions - some of which could be traced to their origins through extensive research. Each of these location types required different strategies and approaches, but in all cases our goal was to be true to our personal experience of Proust’s text.

The resulting photo essay was published in the Figaro Magazine in November of 2022, right at the time when the text-to-image AI phenomenon exploded onto the scene. At the suggestion of Visual 1st Chair Hans Hartman, we extended the project by “feeding” the same textual descriptions that inspired our photographs to an AI image generator. We chose DALL·E 2 for its ability to work directly with the original French text. Since the objective was to obtain a comparison rather than to achieve a predetermined result, the relevant paragraphs were entered into the engine as-is, with only the addition of the terms “realistic color photograph” to guide the rendering style.

When:

January 19

8:00 am - 9:30 am Pacific Time

(17:00 - 18:30 CET)

Generative AI: From technology Showcases to real-world Apps

Ticket sales end EOD tomorrow, Jan. 18!

What: Live demos, discussions, virtual

Expected 100+ Attendees: Startup founders and corporate executives in the photo & video ecosystem, as well as AI-specialized solution developers.

Presenters:

Sofiia Shvets, CEO & Co-Founder, Claid

Ofir Bibi, VP Research, Lightricks

Champ Bennett, CEO & Co-Founder, Capsule

Lusine Harutyunyan, VP of Product, Picsart

Servi Pieters, Founder, myprint.ai

Dmitry Shironosov, CEO, Everypixel

Lisha Li, CEO, Rosebud AI

Yair Adato, CEO & Founder, BRIA

Buy your ticket now! More info.

Roughly speaking, about 80% of the results could be described as somewhat close, 10% well off base, and 10% quite good. Strangely, while DALL·E 2 always complied with “realistic” and “photograph”, it had more trouble with “color” and consistently offered several renditions in black and white. The versions selected for the five comparisons below are the ones that came closest to our own images. No enhancements were made to the content generated by the program.

1) In this paragraph Proust describes fleeting views of “old and wobbly” traditional Normandy houses that he glimpses from his moving car. He goes into some detail about the rose bushes and trees that grow alongside them.

Alexis and Annabelle

DALL·E 2

Surprisingly, DALL·E 2’s flowers are not roses but gladiolas. The trees, which feature prominently in Proust’s description, are a very minor part of this image. The architectural style of the house, however, is fairly convincing.

2) In this paragraph Proust describes the moment when the plain surrounding the city of Caen becomes visible to him from the road.

Alexis and Annabelle

DALL·E 2

In this case, no matter how many variations were asked for, the AI insisted on generating an aerial view rather than a ground-level view.

3) In this paragraph Proust describes houses at a location in the town of Lisieux that feature a particular architectural style and are decorated with carvings of saints or demons.

Alexis and Annabelle

DALL·E 2

The AI gets the architectural style absolutely spot on, but it misses the figurative carvings altogether, possibly because its training set didn’t contain these types of images.

4) Here Proust describes a section of the porch of a 13th-century cathedral in Lisieux, which features a series of closely spaced stone columns. He visits at night, and his driver uses the headlights of the car to sweep across the porch so that Proust can see the carvings.

Alexis and Annabelle

DALL·E 2

The AI did a remarkable job of understanding the subject matter and lighting. However, since the text doesn’t specify a number of columns, DALL·E 2 is at a disadvantage as it doesn’t have our ability to go on site.

5) Here Proust engages in a cultural digression referencing the author La Fontaine as well as various painters. Rather than an actual scene, he describes an archetype of “voyager”, or traveler, who can be imagined riding a horse at sunset on a beach.

Alexis and Annabelle

DALL·E 2

Here the previous situation is flipped: While DALL·E 2 doesn’t have the advantage of being on location in Normandy, it does have access to plenty of images of beaches, sunsets and riders - whereas, at the time we were on location, no equestrians appeared at sunset.

On reviewing the results, Hans Hartman offered the following thoughts:

“If DALLE were human, according to Maslow, she’d master the creation of visuals from clearcut descriptions first and foremost before tackling the higher echelons of multi-faceted literary passages and turning these into esthetic pictures. But DALLE is not human, and is apparently firing on all cylinders, tackling multi-faceted and multi-interpretable literary texts as well as descriptions of well-defined objects.

Despite also producing a fair amount of unsuccessful visuals, DALLE proved at times to be remarkably capable of creating images that represented challenging descriptions, such as a particular architectural style, landscape composition, a scene with car lights sweeping across a porch, and a beach scene featuring a galloping horse.

Surprisingly, DALLE stumbled in some instances of well-defined descriptions – rendering gladiolas instead of roses, and showing an aerial rather than a ground-level view.

Going forward, it’s safe to assume we’ll be seeing fewer and fewer of these types of mistakes (plus a variety of other misinterpretations and nonsensical imagery) and the battlefield will center around the artistic eye of serious photographers versus that of advanced generative AI solutions. That battle has already started, as Alexis and Annabelle demonstrated. Place your bets!”

About the photographers/authors:

Annabelle Matter and Alexis Gerard are a fine art and travel photographer couple who favor cultural subjects, and whose visual style is both contemporary and rooted in history. Their images bear witness to the memories that places hold, the spirit that remains from the fleeting presence of people and events over the course of time. Their projects have included retelling the life of Napoleon through on-site photographs in the book "Napoleon l'Esprit des Lieux", translating Marcel Proust’s prose into photos for the Figaro Magazine, and contributing to numerous publications including GEO and Le Point. They are represented by AKG Images and Farmboy Fine Arts.

To contact Annabelle and Alexis please use the form at https://www.annabellematteralexisgerard.photography.

And one more thing...

data.ai. State of Mobile 2023. data.ai's new mobile app market analysis report contains a wealth of data pertaining to the mobile app market. You can download it for free here (registration required). It contains too much info to summarize in today's edition, but I'd like to highlight one particularly interesting demographics slide, which indicates the monthly active usage of different app categories for different consumer age categories.

From: State of Mobile 2023. (I added indicators for the video and photo editing app categories)

A much higher percentage of 18- to 24-year-old consumers use video editing apps than is the case for older consumers. Where is the video editing app that successfully attracts anyone over 24?

For photo editing apps these differences between age groups are less pronounced – except for consumers aged 45 or older, of whom only a small percentage actively use photo editing apps. That percentage is even lower for their use of video editing apps. [What kinds of apps do they use? You guessed it: Grocery delivery and telehealth apps :-)]

In sum: We're still waiting for the killer [really: no pun intended] photo or video editing app that successfully addresses the needs of 45+ smartphone users; the use of video editing apps among consumers aged 25-45 is still not nearly at the level of their use of photo editing apps – it's the under-25 generation that has fully embraced mobile video editing.

Best,

Hans Hartman

Visual 1st | Suite 48 Analytics