A Dove, a Statesman, and Four Invisible Facts
In the mid‑1950s, India’s pioneering photo‑journalist Homai Vyarawalla captured Prime Minister Jawaharlal Nehru releasing a white dove at New Delhi’s National Stadium—a gesture meant to symbolize the young republic’s yearning for peace. Without Vyarawalla’s handwritten caption—“Jawaharlal Nehru releasing a dove, National Stadium, New Delhi, c. 1955”—the frame could be read as almost anything: a politician pointing skyward, a bird caught mid‑flight, even a staged performance. Those four bits of invisible context—Who (Nehru), What (releasing a dove), Where (National Stadium, New Delhi), and When (mid‑1950s)—instantly transform an eye‑catching photograph into a historical document.

Nehru releasing a dove at a public function at the National Stadium in New Delhi
See the image: Wikimedia Commons
Importance of Context: A Guide to Creating Perfect Image Titles and Descriptions with Phototager AI
In today's world of AI-powered search, images are increasingly surfaced and categorized by the words that surround them. But while artificial intelligence can detect shapes and colors, it struggles to grasp the deeper story within a frame—its culture, context, or emotional meaning.
This article is specifically designed for stock photographers and videographers, helping them craft meaningful, accurate, and culturally aware titles by providing essential “invisible” context: Who is in the picture, What they are doing, Where it takes place, and When it occurs. By supplying these key details, creators can guide AI systems to generate titles that go beyond appearance to convey real understanding.
We’ll break down each of the 4 Ws and explore real-world examples of context input in action.
To get a meaningful, accurate, and culturally aware title, you must provide the AI with the "invisible" context. The best way to do this is by clearly defining Who, What, Where, and When.
1. WHO: The People in the Picture
This is often the most crucial context for a title, because an AI cannot reliably guess age, relationships, or specific heritage from pixels alone.
Specify the Age Group
Avoid generic terms like "man" or "child." Providing an age group adds clarity and detail.
Instead of: "a child," "an adult"
Use terms like: "a toddler," "a young child," "a teenager," "a young adult," "a middle‑aged adult," "a senior" or "an elderly person."
Specify the Ethnicity, Nationality, or Heritage
Visual identification of ethnicity is unreliable and prone to stereotypes. You must state it directly. This is especially important for groups that may not be widely represented in training data or are visually indistinguishable to an outsider.
Examples of specific heritage an AI cannot guess:
- A person of Sámi heritage (indigenous to northern Europe)
- A woman of Basque origin (from the border of Spain and France)
- A family of Filipino‑Mestizo heritage
- A group of Kurdish individuals
- A man of Roma ethnicity
- Children of Māori descent (from New Zealand)
Example of Context Input:
"The image features two senior women of Japanese descent."
"A middle‑aged man of Peruvian Quechua heritage."
2. WHAT: The Specific Action
What are the subjects doing, and are they using any significant objects? The more specific the action, the more informative the title.
Instead of: "dancing"
Say: "dancing the Kolo, a traditional Serbian folk dance."
Instead of: "making food"
Say: "preparing potica, a traditional Slovenian nut roll."
Instead of: "playing music"
Say: "playing the kantele, a traditional Finnish stringed instrument."
Example of Context Input:
"They are performing a traditional tea ceremony."
"The person is harvesting saffron crocuses from a field."
3. WHERE: The Precise Location
Ground the image in a specific place. This adds a layer of authenticity and information that an AI usually cannot infer from the image alone.
Instead of: "in a city"
Say: "at the Kalemegdan Fortress in Belgrade, Serbia."
Instead of: "in the mountains"
Say: "in Triglav National Park, Slovenia, near Lake Bohinj."
Instead of: "at a market"
Say: "at the Central Market in Ljubljana, designed by Jože Plečnik."
Example of Context Input:
"The location is the ancient city of Petra in Jordan."
"They are inside the Postojna Cave in Slovenia."
4. WHEN: The Event or Time
Context about the timing or the specific event transforms a simple description into a story.
Instead of: "in the evening"
Say: "during the golden hour just before sunset."
Instead of: "at a party"
Say: "during the annual Kurentovanje festival in Ptuj, Slovenia."
Instead of: "in winter"
Say: "on New Year's Day 2025."
Example of Context Input:
"The event is the opening ceremony of a local film festival."
"This was taken during the autumn grape harvest (trgatev)."
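One way to keep the four Ws consistent across a shoot is to store them as a small record and join them into a single context sentence before pasting it into a tool. The sketch below is only an illustration: the `ImageContext` class, its field names, and the "Who: ... What: ..." output format are hypothetical and are not part of PhotoTager.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ImageContext:
    """Hypothetical container for the four Ws (not a PhotoTager type)."""
    who: Optional[str] = None
    what: Optional[str] = None
    where: Optional[str] = None
    when: Optional[str] = None

    def _fields(self):
        # Fixed order so the context always reads Who, What, Where, When.
        return [("Who", self.who), ("What", self.what),
                ("Where", self.where), ("When", self.when)]

    def to_text(self) -> str:
        """Join whichever Ws were supplied into one labeled context string."""
        return " ".join(
            f"{label}: {value.strip().rstrip('.')}."
            for label, value in self._fields()
            if value and value.strip()
        )

    def missing(self) -> List[str]:
        """List the Ws that were left empty, as a reminder before tagging."""
        return [label for label, value in self._fields()
                if not (value and value.strip())]


context = ImageContext(
    who="two senior women of Japanese descent",
    what="performing a traditional tea ceremony",
)
print(context.to_text())
print("Still missing:", context.missing())
```

Leaving a W empty is fine, since not every image needs all four; the `missing()` helper simply makes the gap visible so it is a choice rather than an oversight.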
Putting It All Together: When the File Name Is the Context
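Sometimes the only context you supply is the file name. Even that single string can carry one, or occasionally two, of the four Ws that the AI would never guess from pixels alone. Below are four examples drawn from real file names: each notes which Ws the file name supplied and the richer title and description it enabled.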
Supplies → WHERE
Resulting AI Title: “Lake Bled with its island church and distant Bled Castle, city of Bled, Slovenia.”

Description: A tranquil scene of Lake Bled in Slovenia featuring a charming church on Bled Island surrounded by mountains and reflecting waters under soft clouds.
Supplies → WHO and WHAT
Resulting AI Title: “Diverse Group of Women Enjoying a Bridal Shower Celebration”

Description: A diverse group of women joyfully celebrates a bridal shower in a bright venue with floral decorations, balloons, and light refreshments, creating a cheerful and heartwarming atmosphere.
Supplies → WHAT and WHERE
Resulting AI Title: “Traditional Lakhon Khol Dance Performance with Dancers in Ornate Costumes in Cambodia”

Description: Two male performers in ornate traditional Khmer attire showcase Cambodia's Lakhon Khol dance, featuring intricate movements, masks, and dramatic cultural storytelling on a vibrant stage.
Supplies → WHAT and broad WHERE
Resulting AI Title: “Traditional Japanese Wagashi Tasting and Tea Ceremony in Elegant Setting”

Description: A serene Japanese tea ceremony set with handcrafted wooden tools, ornate teapot, and traditional cups. Perfect for savoring Wagashi desserts in a calm, cultural ambiance.
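To make the file-name idea concrete, here is a minimal sketch of how a descriptive file name could be turned into a plain context phrase, and how a camera-default name could be flagged as carrying no context. The function names and the sample file name are hypothetical; this is not how PhotoTager reads file names internally, only one plausible way to prepare them.

```python
import re
from pathlib import PurePath


def context_from_filename(filename: str) -> str:
    """Turn a descriptive file name into a plain-text context phrase."""
    stem = PurePath(filename).stem             # drop any folder and the extension
    words = re.sub(r"[-_.]+", " ", stem)       # hyphens, underscores, dots become spaces
    return re.sub(r"\s+", " ", words).strip()  # collapse repeated whitespace


def looks_like_camera_default(filename: str) -> bool:
    """Heuristic: names such as DSC_8459.NEF carry no usable context."""
    stem = PurePath(filename).stem
    # Two to four letters, an optional separator, then a run of digits.
    return re.fullmatch(r"[A-Za-z]{2,4}[-_]?\d{3,}", stem) is not None


# Hypothetical file name, used only for illustration.
print(context_from_filename("lakhon-khol-dance_cambodia.jpg"))
print(looks_like_camera_default("DSC_8459.NEF"))
```

The heuristic is deliberately simple and will miss some camera naming schemes; its only purpose is to catch files that should be renamed before tagging.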
Your Folders Are Your Superpower: A Two-for-One Solution
We've seen how crucial context is for a single image, but how do you apply that detail across an entire shoot without tedious work? We suggest a simple but powerful best practice: organize your projects into descriptively named folders.
Think "Chef Luka Plating Session" or "Kurentovanje Parade 2024." This habit solves two major problems at once. First, your own photo library becomes instantly organized and searchable: no more hunting for DSC_8459.NEF. You can find exactly what you need, when you need it.
Second, PhotoTager uses that folder name as the perfect context. It can then generate accurate, consistent titles and keywords for up to 100 images at once inside that folder. You solve your organization problem and your metadata problem in one go, simply by naming your folders well.
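The folder habit can be sketched in a few lines: walk a project directory, treat each subfolder's name as the shared context, and hand over its images in groups of at most 100. Everything here is an assumption made for illustration, including the `batches_by_folder` name, the list of image extensions, and the choice to split larger folders into several batches; the article only states the 100-image figure, not how PhotoTager handles folders beyond it.

```python
from pathlib import Path
from typing import Iterator, List, Tuple

BATCH_LIMIT = 100  # the "up to 100 images at once" figure from the article
# Assumed set of extensions; adjust to the formats you actually shoot.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff", ".nef"}


def batches_by_folder(
    root: Path, limit: int = BATCH_LIMIT
) -> Iterator[Tuple[str, List[Path]]]:
    """Yield (folder name used as context, batch of image paths) pairs."""
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        images = sorted(
            f for f in folder.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
        # Split large folders so no batch exceeds the limit.
        for start in range(0, len(images), limit):
            yield folder.name, images[start:start + limit]
```

Called on a directory that contains a folder named "Kurentovanje Parade 2024" with 150 photos, this would yield two batches for that folder, one of 100 images and one of 50, both paired with the same folder name as context.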
Coming Up Next: A Practical Guide to Folder-Based Tagging
In our next article, we’ll walk you through the step-by-step process of using your existing folder structure to generate rich metadata at scale. We'll show you how this simple method solves two problems at once: making your own files instantly searchable while boosting their visibility for clients, editors, and search engines.
Read Part 2: Turn Your Folders into Smart Context: A Practical Guide