Commons talk:Categories

From Wikimedia Commons, the free media repository
This is the talk page for discussing improvements to Commons:Categories.
Archives: 1, 2, 3, 4, 5, 6

Merging categories with identical scopes

A dispute has arisen over whether categories for the two Boeing aircraft construction number systems should be merged or left apart. For context, Boeing uses two separate construction number systems to designate its aircraft. As such, each Boeing airliner is given both a unique "manufacturer serial number" (msn) and "line number" (ln). Currently, there are separate categories for msn and ln (for example, Category:Boeing 747-8 (msn 37075) and Category:Boeing 747-8 (ln 1449) both refer to the same individual aircraft). Unlike an aircraft's registration, which can change multiple times throughout the aircraft's service life, an msn/ln pairing never changes from the moment the aircraft is built to the moment it is scrapped or otherwise destroyed. In a sense, an msn/ln pairing is the absolute identifier for an individual Boeing airliner. As such, an aircraft's "msn" and "ln" categories should always be populated with identical subcategories and pages, with identical sorting. It is my understanding that this is a textbook example of COM:OVERCAT and that the categories should be merged. If I am mistaken, please let me know.

It should be noted that there are many thousands of "msn" and "ln" categories just like the above example, so if this is indeed OVERCAT, then it will be a huge effort to clean it up manually. Perhaps it could be a task for a bot.

Pinging Ardfern, who has been on the other side of this dispute. - ZLEA T\C 03:16, 1 November 2025 (UTC)

I think it's fine the way it is; the existence of these additional categories does no harm. Perhaps it would be good to link them with {{See also cat}} though. - Jmabel ! talk 20:55, 21 November 2025 (UTC)

Recommending populating categories at creation

Regarding the creation of categories, this page currently has:

To create a new category:
[…]
2. Find images (or a gallery or other pages) which should be put in the new category. Edit this page, and at the end insert the new category reference. e.g. Category:Titles. Save the edited page. The new category appears as a red link at the bottom of the page.

So a user who adds just one image to a category and then creates it, leaving it near-empty, would be following this policy perfectly.

I think that creating categories containing only one file or very few files when there are more (or only a tiny fraction of the files that belong in them) is not beneficial overall. Such categories give people who open them via Commons search, Web search, file cats, or category-subcategory browsing a wrong impression of what's on Commons relating to the subject, aren't useful, and are basically misleading.

Thus, I suggest that text be added to the section Creating a new category recommending that people also do a thorough search to find and add files in the scope of the new category before or directly after creating it. It could also be worth considering making some effort to populate a category a requirement, to reduce the number of excessively incomplete categories being created and to get more users to properly add files to new categories.

Currently, if a category has not yet been created, rather than having been created with just very few files, then another user who creates the category will usually do a thorough search to check whether files are missing. This is a much rarer practice for categories that already exist, as people assume the person who created it and other visitors likely already did so (not always of course, just more often/usually).

The Creating a new category section could also explain techniques for finding relevant files, such as using search term + deepcategory:"a parent category that contains relevant files" in the search, and/or mention the tools Cat-a-lot and HotCat, which can make populating a category much easier, quicker, and more accessible than the old-fashioned plain wikitext editing for adding categories, which is barely used anymore but is what this subsection describes. Ideas and suggestions for the text could be added to this thread here. Prototyperspective (talk) 17:37, 21 November 2025 (UTC)
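As a rough illustration of the search technique mentioned above, a query of that shape could be assembled as follows. This is only a sketch: the helper name `build_search_query` is made up for the example, and it assumes the `deepcategory:` keyword takes the category name without the "Category:" prefix. The category names are borrowed from examples elsewhere in this thread.

```python
def build_search_query(term: str, parent_category: str) -> str:
    """Combine a free-text search term with a deepcategory: filter.

    Hypothetical helper for illustration; it only builds the text that
    would be typed into the Commons search box.
    """
    name = parent_category.strip()
    # Assumption: the keyword expects the bare category name, so an
    # optional "Category:" prefix is dropped.
    prefix = "Category:"
    if name.startswith(prefix):
        name = name[len(prefix):].strip()
    # Quote the category name so that spaces and commas stay together.
    return f'{term.strip()} deepcategory:"{name}"'


if __name__ == "__main__":
    print(build_search_query("Howell Street", "Category:Streets in Seattle"))
```

The resulting string can be pasted into the search box; files found this way could then be added with Cat-a-lot or HotCat.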

I would agree that should be recommended, but it should be understood more as a "best practice" than as a command. I usually try to do that, but I can think of times I've skipped it, especially when introducing an intermediate category in the hierarchy to make sure a "leaf" category I'm adding gets an ancestor structure similar to other parallel categories. For example, if someone were introducing Category:John T. Williams memorial pedestrian crossing and Category:Howell Street, Seattle didn't already exist, so they had to create it, I wouldn't necessarily consider them obligated to see if they could further populate Category:Howell Street, Seattle. Great if they could do that, but far from required. - Jmabel ! talk 21:03, 21 November 2025 (UTC)
One could also add it like that and then maybe think about whether to phrase it less like a mere recommendation. One could also name exceptions or broad principles/types of categories where this doesn't make sense.
"if someone were introducing Category:John T. Williams memorial pedestrian crossing and Category:Howell Street, Seattle didn't already exist, so they had to create it, I wouldn't necessarily consider them obligated to see if they could further populate Category:Howell Street, Seattle." The better course of action would be to just add the category as a redcat if they don't populate it, along with the closest existing category such as Category:Public art in Seattle. Somebody who sees the redcat, via the other categories set on the page or otherwise, can create the category if they also populate it. If your goal is to create a certain category, there is no need and no justification for creating misleading, incredibly incomplete intermediary categories just because they fit on the category one wanted to create. The effect here is not that the user has to populate a street category, but that they don't create the empty street category and instead leave it, e.g., to users motivated and skilled to do so, or to users who frequently or routinely create street categories and know what to do and how. Creating empty or very incomplete categories, at least at this point, is unconstructive.
Also, I'd like to add that the guidance in that section currently describes things as if the HotCat gadget were not a default-enabled gadget. Prototyperspective (talk) 15:42, 24 November 2025 (UTC)
@Prototyperspective: I'm having trouble following at least some of that response. "the better course of action would be to just add the category as a redcat": not sure which category you mean. If Category:Howell Street, Seattle, I disagree. It's much more likely to get populated if it is visible when coming down the hierarchy from Category:Streets in Seattle. - Jmabel ! talk 19:53, 24 November 2025 (UTC)
Yes, Howell Street, Seattle which you said you didn't populate. I outlined earlier and multiple times why it's problematic if categories are heavily incomplete. I could expand on that but I'm not sure if it was unclear or if you have anything to address those things. Such categories would be more likely to get populated if you leave creating them to those that do substantially populate them at creation. There are lots of fine-resolution subcategories around all the relevant topics already. Maybe it would be more likely to get a file here and there but it's not much of a help if it has just 2 or 4% of files. The course of action proposed here is to also populate the new subcategory you think should be set on a category you're also creating. The second best option would be to just use the closest category and leave creating the category to somebody who will populate the new category. That small category already has
  • Pedestrian crossings in Seattle
  • Boren Avenue, Seattle
  • Deer in art
  • Public art in Seattle
  • Monuments and memorials in Seattle
  • Denny Triangle, Seattle, Washington
  • Decorated pedestrian crossings
  • White road markings in the United States
plus a good substitute for the category Howell Street, Seattle, which is one (or several) of the parent categories now set on it. Again, all is fine if you put a sizable fraction of the files that belong in it into it, but if not, how could people even tell that it's a stub category with <1% of files? It's not useful, and it obstructs populating categories, as that is usually done when creating a new category. Prototyperspective (talk) 23:25, 24 November 2025 (UTC)
No, I did not say I did not populate Category:Howell Street, Seattle. Please re-read my remark, which was in the subjunctive and referred to a hypothetical situation for a hypothetical user. - Jmabel ! talk 01:17, 25 November 2025 (UTC)
You're right on that, sorry. I'm addressing the hypothetical then which you argued by/for, not what was actually done. Prototyperspective (talk) 15:33, 25 November 2025 (UTC)
(In fact, we seem to have few pictures along that street; if we have more, there is no indication in their respective descriptions. I still think it is an appropriate parent for a crosswalk category, even if it makes for a barely-populated category.) - Jmabel ! talk 01:23, 25 November 2025 (UTC)
"we seem to have few pictures along that street; if we have more, there is no indication in their respective descriptions" Such cases would be totally fine under what was proposed, even if it's put into stronger language than a mere loose recommendation.
  • It's about some minimum level of checking whether there are files that belong in the cat and adding them (e.g. searching for the name of the category and then adding the relevant ones from those search results),
  • or about the fraction of files added to the cat compared to all files on Commons that belong in it,
not about the total number of files in the cat (see "when there are more" in the original post). Prototyperspective (talk) 15:37, 25 November 2025 (UTC)
In practice, I believe I usually do a fair job of populating categories I create, but I don't think that is incumbent on everyone who creates a category. - Jmabel ! talk 01:26, 25 November 2025 (UTC)

AI-assisted diffusion

This image should be in Category:Abkhazia/Cities in Abkhazia/New Athos/Iverian Mountain rather than in Category:Abkhazia directly

There are thousands of categories requiring diffusion. Many editors, myself included, have diffused thousands and thousands of images during their time on Commons, but it looks like the backlog keeps growing. For the most part it's drudgery, even though sometimes you learn something new or go down various rabbit holes trying to locate some ruins in the Caucasus.

This problem is related to but different from the problem of non-categorised media. We've been using bots to find categories for newly uploaded images lacking categories for 10+ years.

Would you use a user script that analyses images in a category with hundreds of images and suggests how to categorise them properly? Assume it's not perfect but has decent performance (e.g., it suggests categories for 80% of images, and 80% of suggestions are correct). Alaexis (talk) 11:47, 6 February 2026 (UTC)
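To make the example figures concrete, here is a small back-of-the-envelope calculation. It is only a sketch and assumes one suggestion per image, so that "80% of suggestions are correct" applies per image; the function name and the category size of 1,000 images are invented for the illustration.

```python
def expected_outcomes(n_images: int, coverage: float, precision: float) -> dict:
    """Estimate how many images end up in each outcome bucket.

    coverage:  share of images that receive a suggestion (0.8 in the example)
    precision: share of suggestions that are correct (0.8 in the example)
    Assumption: exactly one suggestion per covered image.
    """
    suggested = n_images * coverage
    correct = suggested * precision
    return {
        "correct": correct,
        "incorrect": suggested - correct,
        "no_suggestion": n_images - suggested,
    }


if __name__ == "__main__":
    # Hypothetical category holding 1,000 undiffused images.
    for outcome, count in expected_outcomes(1000, 0.8, 0.8).items():
        print(f"{outcome}: {count:.0f}")
```

Under these assumptions, roughly 640 of 1,000 images would get a correct suggestion, about 160 would get a wrong one that a human has to catch, and about 200 would still need to be diffused manually.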

I would honestly love to see this, as someone who's made bulk category edits myself. A model parsing the image and relevant metadata could definitely be tested on a sample of images, and we can see how accurate the suggestions are and whether it is worth pursuing this further. The two risks I am keeping in mind are hallucinating non-existent categories and hallucinating details about the image that are not present (for example, giving a specific species ID for an organism that can't be identified visually down to that level). Chaotic Enby (talk) 12:10, 6 February 2026 (UTC)
Are you thinking of developing such a tool or something similar, or is this only about hypotheticals?
Furthermore, please see Commons:Bots/Work requests/Archive 18#Auto-addition of inferrable categories. Prototyperspective (talk) 21:56, 7 February 2026 (UTC)
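On the first risk raised above (suggestions pointing at categories that do not exist), one possible guard is to check every suggested name against a list of categories known to exist before showing it to the user. The following is only a sketch of that idea: `split_suggestions` is a made-up name, the set of existing categories is hard-coded here whereas a real script would have to fetch it from the wiki, and "Ruins on Iverian Mountain" is an invented name standing in for a hallucinated category.

```python
def split_suggestions(suggested: list[str], existing: set[str]) -> tuple[list[str], list[str]]:
    """Separate suggested category names into known and unknown ones.

    Hypothetical helper: `existing` stands in for whatever source of
    truth a real script would use to confirm that a category exists.
    """
    known = [name for name in suggested if name in existing]
    unknown = [name for name in suggested if name not in existing]
    return known, unknown


if __name__ == "__main__":
    # Category names taken from the image caption in this section.
    existing = {"Abkhazia", "Cities in Abkhazia", "New Athos", "Iverian Mountain"}
    # The second suggestion is invented to play the role of a hallucination.
    suggested = ["Iverian Mountain", "Ruins on Iverian Mountain"]
    known, unknown = split_suggestions(suggested, existing)
    print("offer to user:", known)
    print("discard or flag:", unknown)
```

This only addresses non-existent categories. It does nothing about the second risk, a model asserting details that are not visible in the image, which would still need human review.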