The Theory of Image Indexing to Tagging in Flickr

Statement of the problem

Data warehouse on the web: huge opportunity, huge challenge

The internet is a remarkable tool for making content available to everyone. Because the reach of any website is (nearly) worldwide, the idea of building online collections of all kinds took hold very quickly and very effectively.

But every strength has its weakness: while the amount of data that could be collected was huge, indexing that data so that the database would be of use to other users became the number one problem. In fact, the very fact that other users will use the data is already part of the problem.

Trying to find a specific Word document on somebody else’s computer or in their mailbox quickly shows how difficult it is to understand how someone classifies a document if you do not share that person’s references. A practical example makes the problem clearer. If you look at how each employee in an office files the documents received by e-mail, no two of them classify in the same way, although the documents are the same and relate to topics that are already labelled by subject (such as “HR”, “finance”, “project A”, etc.). In fact, each person uses a set of contextual references as personal as fingerprints to classify data. The conclusion is that even for a simple Word document, produced within a business structure that should give every employee the same set of references, each person associates it with different subjects according to his or her own associations, which depend on feelings toward the subject, previous experience, etc.

So, for a defined batch of data, the classification will be different according to the use you would like to make of it. But then, the problem you could be facing is that once you have classified your data in a certain way which is suitable according to your objectives, you might want to use the same data again, but for a different purpose…which might mean a totally different classification…and that you have to do the classification job all over again.

Therefore, if you imagine that the contributors to a data warehouse are millions of people of every origin, culture, age and profession, that the users are just as varied, and that they all want to use the collectively gathered data for different purposes, then you start to see the challenge faced by websites such as Flickr.

Classification challenges specific to images:

Let’s try to move into images and examine the controversy on the Rorschach inkblot test. The idea is to present each patient a set of inkblots which do not represent anything defined and therefore have no defined meaning. Each patient then is asked to explain what the image represents. The possibilities of answering are just as infinite as the imagination of the patient is. The assumption of the theory is that patients with similar intellectual patterns would react in similar ways to the inkblots. Apart from the controversy of validity of the test, the natural conclusion to the use of inkblots to diagnose intellectual patterns is that what an image evokes is a reflection of each individual’s intellectual context.

A very simple example is to imagine a picture of a car in front of a pyramid in Egypt. If you are sensitive to environmental issues, you might look at it and think that the car represents oil consumption, and the pyramid in the desert the future of death and drought that awaits us if we do not stop overconsuming. But if you are sensitive to the progress of civilization, the picture might evoke the fantastic evolution that has taken place over the last five thousand years of human history: from pyramids to cars. A politically minded viewer might think about Egyptian society at the time of the pharaohs in comparison with 21st-century society, or contrast tradition with modernity, or, if the car is an American one, think about economic relations between the US and Egypt, etc. In fact, the number of issues that a picture like this could evoke is virtually infinite, and so is the number of words that could be associated with the picture if it were to be classified.

How does a human brain “read” an image? Stephen Palmer wrote a fascinating book on human visual perception[i]. Understanding the mental process of extracting information from images means drawing on the same cross-disciplinary data as any other branch of cognitive science, from fields as different as (among others) computer science, neuroscience, psychology, linguistics, sociology, anthropology and education. Basically, his approach is to consider vision as “a kind of computation” occurring in the eyes and brain through complex neural information processing.

Further work on computational vision (as part of research on artificial intelligence[ii]) showed that images have both semantic and visual content. Through this computation, the human brain is able to browse through a large set of images with a specific aim defined at the start (such as searching for a specific image, for images on a specific subject, or for images matching an aesthetic criterion like colour). Some experiments analysed the broad semantic categories driving visual perception, human perception of similarities between images, and how the human brain organises images into clusters and names each cluster. The basic results show that image selection is a highly focused process which extracts from images only the interesting information and filters out the rest. For example, the selection criterion can be a dominant colour, or a pre-defined category like “bird pictures”, although the interpretation of words can differ from one individual to another: can a plane be considered a bird (which means that a “bird” is anything which flies) or not (which means that a “bird” must be something living)? These questions lead us to the last challenge of image classification.

Classification challenges specific to the intended use of the images

The main issue classification intends to deal with is that any object which is not referenced in a way where you can easily retrieve it is…lost. Therefore, there is a significant efficiency challenge in finding the best classification technique adapted to images.

One of the main riches of the internet is its protean nature. The ways its features can be used are as unlimited as imagination itself, and the more information can be searched, the more possibilities there are to combine it into a new use.

The classical idea of a “tree” classification in which the classification would be decided by the author of the picture (or the person who uploaded it on the website) would be of little use because there would only be very little chance that his classification would be useful to another purpose, or even that the picture would have the same meaning to another user.

As David Weinberger[iii] pointed out, there seems to be a new pattern in classification. After experts were urged to structure the knowledge of every field by creating tree classifications, the early users of websites such as Flickr are now forming a new “tagging movement” that prefers to build its own classification, even if it is not as perfect as one built by professional taxonomists. The lack of perfection is seen as more than compensated for by the fact that the classification is better adapted to the purpose of each user.

The idea of tagging is that classification no longer depends on a team of dedicated taxonomists whose function is to decide how to organise clusters and how to name them, and who are in charge of keeping the classification up to date as the database evolves, with the huge risk of developing a costly and inefficient tool. Tagging is a much more flexible solution in which users themselves do the work. It is therefore the use and the users that shape the classification, in a form that is (1) not predictable and (2) shaped by users’ needs, and it therefore remains up to date. Basically, the idea of tagging is to forget about exhaustively listing all information on a subject and to enter the information age, where the challenge is less about gathering huge numbers of references, most of which are time consuming, unnecessary or even redundant, than about selecting enough relevant information from the flood of “noise” data.

Tagging systems improvements

Tagging is basically a concept used for creating labels for online content or images. If you are a registered user on a site like flickr.com, you can upload your photos on the site and also define labels for the pictures that are useful in the future. On subsequent visits, it is possible to search the site using the provided labels. It is also possible to provide labels or tags for the photos which are uploaded by others on the site. All the images tagged by a particular user are available at a single location for the convenience of all users.
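
The labelling and searching described above can be illustrated with a minimal sketch of an in-memory tag index. This is only an assumption about how such a feature could be modelled, not a description of Flickr's implementation: the class name TagIndex, its methods and the photo identifiers are all hypothetical.

```python
from collections import defaultdict


class TagIndex:
    """Toy in-memory tag index (hypothetical; not Flickr's actual code)."""

    def __init__(self):
        # tag -> set of photo ids carrying that tag
        self._photos_by_tag = defaultdict(set)
        # user -> set of photo ids that user has tagged
        self._photos_by_user = defaultdict(set)

    def add_tag(self, user, photo_id, tag):
        """Record that `user` labelled `photo_id` with `tag`."""
        normalized = tag.strip().lower()
        self._photos_by_tag[normalized].add(photo_id)
        self._photos_by_user[user].add(photo_id)

    def search(self, *tags):
        """Return the photo ids that carry every one of the given tags."""
        if not tags:
            return set()
        matches = [self._photos_by_tag.get(t.strip().lower(), set()) for t in tags]
        return set.intersection(*matches)

    def tagged_by(self, user):
        """Return all photo ids tagged by one user (the 'single location')."""
        return set(self._photos_by_user.get(user, set()))
```

Because any user may call add_tag on any photo, the same structure covers both tagging your own uploads and tagging photos uploaded by others.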

The main objective of tagging systems is to help users to apply tags or labels to digital objects with free form keywords according to their choice. On the basis of these tags, it is possible to share these annotations and images with other users with similar interests. On all of these sites, tags are directly assigned by users who are members of that site and these tags are visible to the users for further exploration and searching. This instant visibility is useful for the motivation and feedback mechanism for the users.

Tagging systems are still an innovation, and therefore lack standardization. Each tagging system, depending on the items tagged (images, websites, etc.), has difficulty ensuring the quality of the tags provided and the quality of the retrievals later made from these tags. Behind this lies the recurring problem of processing the enormous amount of information collected on the web so that queries return relevant results, free of “noise” and spam. To this end, various collaborative tagging theories and tools are used.

For image storage and retrieval, the most common method used is clustering. Using this method, the images are organized in groups according to their rank and further into the classes according to their features. If there are certain images which belong to the same date or category, then these are part of a single class.

The classes which have features similar to one another are grouped together to create clusters and each such cluster is stored with an identification symbol to distinguish it from all other clusters[iv]. When a user wants to look for a particular image, they simply must specify the category to which that picture belongs. The query based on category is entered into the system to identify all the clusters which are available in that category. According to different functions applied, it is also possible to find the distance between the query and the exact cluster to which the image belongs. The effect is to group together visually similar images in the results.
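
The cluster lookup described above can be sketched as follows. The text does not say which features or which distance function are used, so this sketch assumes that each image is represented by a numeric feature vector (for example average colour values), that a cluster is summarised by the mean of its vectors, and that distance is Euclidean. The function names and sample data are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equally sized feature vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two feature vectors of the same length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_cluster(query, clusters):
    """Return the id of the cluster whose centroid is closest to the query."""
    return min(clusters, key=lambda cid: euclidean(query, centroid(clusters[cid])))


# Hypothetical clusters: identifier -> feature vectors (average red, green, blue)
clusters = {
    "sunsets": [[0.9, 0.4, 0.1], [0.8, 0.5, 0.2]],
    "forests": [[0.1, 0.7, 0.2], [0.2, 0.8, 0.1]],
}
```

Returning the images of the nearest cluster is what groups visually similar images together in the results.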

The methods utilized in indexing image storage allow users to distinguish between object, concept and symbol. The indexing text is chosen either according to nature of the objects or through the concept on which the images are based. Tagging is different from indexing in the sense that it is based on the needs and interests of users for an object or a concept. It is more useful for the end user because different people can be interested in different aspects of the images and if tagging is used they can store the images according to their own classification and needs. Tags can also be extracted from the resource or assigned by an indexer if not defined by the user uploading the image.

Objective of the research work:

In this research work, my main objective is to study how existing theories of indexing images apply to Flickr.

My work will consist of two phases: (I) a phase of data gathering, and then (II) a phase of data analysis.

The data gathering phase will consist of researching (A) existing theories of indexing images and (B) existing websites that gather images, together with their indexing methods.

The data analysis phase will consist of (A) comparing and contrasting the different methods of indexing set out in theory and displayed on the web, thereby assessing their strengths and weaknesses as tagging systems, and (B) comparing available theories with the methods and techniques used by flickr.com.

The final objective of my research is to assess whether the new features displayed by flickr.com deserve the interest of major e-players like yahoo.com because they are truly innovative and are starting a small revolution in our indexing patterns, or whether the method is only a passing trend before a return to traditional tree indexing modes.

Image classification theories

The fundamental evolution in the use of the web is the appearance of what some authors have called the “semantic web”.[v] The idea of a web based on words rather than on a pre-set logic that has to be learned is a logical development of all the concepts aimed at making the web easier for non-professional users.

The first sign of this evolution toward a user-directed space was the web search engine, which allows users to search the web with a query in their own language. Search engines responded to the fact that the structure of the web was not suited to manually supervised directories. The problem now is that word queries addressed to search engines only reach metadata that is still created by webmasters, which makes the latest development, the application of tagging techniques, particularly important. The main challenges tagging theories have to face are selecting appropriate and relevant items on the basis of tags, avoiding noise and spam, and bridging the semantic gap.

Evolution from taxonomic model (folder indexing) to folksonomic model (tagging systems)

Taxonomic models

Taxonomy is the science of classifying objects in a hierarchical system[vi]. Taxonomic indexing models are the traditional way of classifying images. There are several reasons for this.

The first reason is that classifying was intended for scientific use, and that therefore, the idea was to create objective and universal classifications which were disembedded from any social relations[vii]. The idea behind this was for all scientists to use the same words and the same classification principles for each category in order to be able to work on the same basis.

The second reason is that before digital imaging existed, pictures were printed on paper. Their use was therefore basically divided between professional use and personal use, and they were not as easy to manipulate as today’s digitized material.

Personal use was limited to basically creating photo albums on subjects as holidays or family events for example. These albums would only be seen by family members and kept as a trace of family history. Once they would have been created, they would not need to be reorganised later. Due to the limited amount of documents to be managed, there was no real need to implement document management rules.

Professional use, such as the photo database of a photo agency, required the work of professional taxonomists who would decide how to index pictures according to the sole use that would be made of them: being sold to customers wanting to illustrate an article.

Of course, these two examples are not the only ones, but they illustrate quite well the fact that taxonomy was suited to the era of printed images: a folder hierarchy provided what Barreau & Nardi (1995)[viii] described as a simple information model that is easy to navigate.

The third reason for the supremacy of taxonomy was cultural. Russel (2005) clearly stated that our Cartesian way of thinking leads us to believe that the only way to solve a problem is a hierarchical top-down approach[ix]. Therefore, in order to classify documents correctly, there is a need for a sort of “super user”, an expert in classification who is the only one to have a global view of the folder classification. This choice gives the responsibility for classification to one person (or one group of people). This way of thinking belongs to a hierarchical culture in which the top-down approach seems to be the only one available and in which the objective is to obtain, when needed, exhaustive image retrieval.

Therefore, the available categories are chosen according to that expert’s hierarchy of values and semantic environment. The idea of a closed environment is, however, valid when the use of the images is linear and predictable. The main problem with this choice is that it leads to never-ending negotiation over the title each category should be given. And, once again, the critical points are the volume of documents to be processed and the cost of such categorizing.

Today’s websites organize millions of documents. To be usable, a closed taxonomy would need thousands of categories, which would probably make image retrieval completely useless unless you had a complete view of the folder tree in order to see which keywords were available.

But culture has changed under the influence of globalization and the ever-increasing use of the internet. In a world culture, the choice of predefined keywords has become increasingly difficult: potential users come from all countries and all cultures, and their semantic world is simply too wide to be captured by one deciding entity. Several research projects have already addressed the problem of cross-language image retrieval[x]. In these research papers, the idea was to determine how efficiently different software responded when given the same query on a specific subject.

The concept of a tree structure to automatically arrange a set of documents based upon a certain set of indexes has been used for image retrieval in most of the sites. The images are provided captions while storing and then some common terms are extracted from the image caption to create an index to make the images searchable and more easily explored. This is the extension of the structure used for storing documents and their links on the sites. If the images are stored after indexing on the site then it is more convenient for the users to search the images according to their requirements.

A tree or folder structure is at the base of most image storage systems. Ultimately, an image is categorized based on a single concept or classification, although this is often invisible to the end user. This is simply the nature of a filing system, whether it is electronic or not.

Tagging provides a way of moving from the card catalog into the age of web-based communities, where classification authority is given to users through tags and algorithms are used not to perform automated classification but to analyse users’ contributions.

From taxonomy to folksonomy

The main problem with taxonomy is that our civilization has shifted to a civilization of information, internet and globalization. Users’ needs have shifted from finding information on a subject to finding the right information and separating it from the non relevant “noise”, and then also sharing it with others.

Another key element is the appearance of digital cameras. Their use together with internet image repositories led to an explosion in image production, and the number of images available online became unmanageable on the basis of taxonomy.

The idea behind taxonomy is to leave the responsibility for choosing categories to someone who is not the user. But this is only possible with a limited amount of material, as paying someone to index millions of images would be far too time consuming. Researchers therefore tried to find ways to automate the description of a picture, which would lead to automated labelling. However, even where computerized image recognition was possible, the conclusion was that the description of an image is not equal to the sum of its elements[xi].

In fact, the question of automated labelling revealed that image retrieval depends on the interpretation made of the image and the words associated with it. Different users use different words to describe the same image (the “vocabulary problem”[xii]), and therefore effective automation was not possible.

Tagging appeared as a shift in the taxonomy concept which is usually known as folksonomy. A tag is a relevant keyword associated to a piece of information. Tagging enables users to associate their own keyword to an internet resource without having to choose from a controlled vocabulary[xiii]. This new way of creating metadata for the user has become increasingly popular because it enables users to rely on other users when it comes to assessing the quality or the relevance of an information item. The idea behind this is that users do not rely anymore on the expertise and authority (Russel:2005) of a “librarian”, but prefer to rely on other users.

Folksonomy theories

Folksonomy, a contraction of the words “folks” and “taxonomy” attributed to Thomas Vander Wal, allows people to tag objects, essentially web objects, using their own semantic field. The advantage of this classification is that it allows them to retrieve information easily, as it is organized following their own logic. Folksonomy is divided into broad Folksonomy and narrow Folksonomy.

Broad Folksonomy is when an author creates a web object and makes it available to the rest of the community. Members of the community who are interested in the object tag it in their own terms, and other members can from then on find the object through these tags. Broad Folksonomy makes it possible to see the dominant tagging trends for one object. The benefits of this method can only be attained if a wide range of users tag the same item.
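
The way a broad Folksonomy reveals the dominant tags on one object can be sketched with a simple count. This is a minimal illustration under stated assumptions: the function name dominant_tags is hypothetical, tags are compared after lowercasing, and each user is counted at most once per tag so that one enthusiastic tagger cannot create a trend alone.

```python
from collections import Counter


def dominant_tags(taggings, top_n=3):
    """Return the most common tags for ONE object.

    `taggings` is an iterable of (user, tag) pairs. Duplicate pairs are
    ignored, so the count of a tag equals the number of distinct users
    who applied it.
    """
    unique_pairs = {(user, tag.strip().lower()) for user, tag in taggings}
    counts = Counter(tag for _user, tag in unique_pairs)
    return counts.most_common(top_n)
```

With only one or two taggers every tag has a count of one and no trend emerges, which is why the method depends on a wide range of users tagging the same item.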

Narrow Folksonomy, the model represented by Flickr, makes it possible to tag objects whose retrieval is not easy. In this model, only a few users (usually just the author) tag the object, which enables other users to get back to the item. The objective of narrow Folksonomy is not to identify dominant semantic concepts but to associate a tag with an object which is not easy to define. These tags are used to group images into relevant categories in order to facilitate searches[xiv].

The intrinsic value of Folksonomy is not in dispute, but there is a debate about the relevance and efficiency of models like Flickr which do not use broad Folksonomy, as they do not take into account the authority dimension of communities, which is defined in contextual authority tagging.

Contextual authority tagging is a branch of Folksonomy that tries to conceptualize the cognitive authority dimension present in every community.

The legitimacy of other users’ authority in tagging was defined by Russel in 2005 as “how people trust one another’s opinions and thoughts”. In fact, authority can be divided into administrative authority and cognitive authority[xv]. Administrative authority means acknowledging that someone is allowed to tell others what to do, while cognitive authority means granting someone authority of knowledge in a specific area. Tags grant cognitive authority, as they show who is an authority in a specific field and not who is “in authority”. The main idea behind this is that, in order to separate relevant information from the usual “noise”, internet users would rather rely on someone who is also a user and should therefore share the same point of view.

Our usual way of assessing the value of information is determined by the authority we grant the author on the specific subject. Muller[xvi] described authority relationships as mainly built through community members’ contributions to the work of the community. Each member therefore has a reputation and can be trusted to give relevant information in specific areas of knowledge, while being perceived as having no authority in other areas. This does not mean that they have no knowledge in those areas, only that they have not had the opportunity to demonstrate it through their contribution to the community. Tag analysis algorithms attempt to translate this concept of reputation into a calculable value. The consequence is that this process of aggregation entails a considerable loss of information, but the loss is deemed to be compensated by the relevance of the retained selection.

The idea behind being able to value authority on a specific subject is to be able to select within the community the experts on a specific topic (who can vary from one subject to another), and not have to rely on the same ones for all subjects. These models only work for very large communities, which is precisely what web-based image repositories are.

The idea behind contextual authority tagging is widely used on websites where users rate other users on specific criteria. For example, on eBay, sellers are given a satisfaction rating by their customers. This is obviously not a very relevant method for image repositories, as it would be rather difficult to rate the ability of a user to tag a picture. Therefore, a theory which is applicable to images is social tagging.

Social tagging, also called collaborative tagging, is the process of freely assigning keywords to individual and shared content in order to facilitate later retrieval (Golder and Huberman, 2006[xvii]). The problem with social tagging, and with user-generated description in general, is that the semantic relation between words and the related item is imperfect. It would therefore probably be useful to combine social tagging either with contextual authority recognition or with broad Folksonomy principles. The advantage of social tagging is that it allows end users to use their own vocabulary. Social tagging of images has not been widely studied in the literature[xviii]. The general problem of accessing images remains largely unsolved due to the inadequacy of current indexing tools. Traditional indexing tools are irrelevant, due to the lack of consensus on the taxonomy to be adopted[xix]. The main challenge for researchers at the moment is to bridge the gap between the index terms assigned by professional indexers and the semantic field used by end users[xx].

The idea behind social tagging is that, as professionals are not able to find indexes that suit user’s requirements, it would be a better idea to organize a democratic consensus between users in order to choose the index categories (which is the basis of Folksonomy). One of the many benefits of social tagging is that it spreads the indexing workload between all users, which is a real economy.

Recent debates on tag aggregation and semantics[xxi] focused on the quality of the metadata produced and on possible trade-offs between Folksonomy and ad-hoc ontologies[xxii]. Other contributions call for a domestication of social tagging[xxiii]. In their opinion, one of the main problems with social tagging is that the same word can have different meanings for different people (the vocabulary gap), or that different words can be used for the same meaning (synonyms). As a solution, they suggest introducing an analysis of the social network in order to define semantic similarities. This would establish a link between users, tags and resources.
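
One simple way to illustrate how links between users, tags and resources can expose semantic similarity is to compare the sets of resources that two tags have been applied to. The sketch below uses the Jaccard index for that comparison; this measure is an assumption chosen for illustration and is not claimed to be the method proposed in the cited contributions. The function names and the (user, tag, resource) triple format are hypothetical.

```python
from collections import defaultdict


def resources_by_tag(triples):
    """Group (user, tag, resource) triples into tag -> set of resources."""
    index = defaultdict(set)
    for _user, tag, resource in triples:
        index[tag].add(resource)
    return index


def jaccard(a, b):
    """Share of elements common to two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def tag_similarity(triples, tag_a, tag_b):
    """Similarity of two tags based on the resources they were applied to."""
    index = resources_by_tag(triples)
    return jaccard(index.get(tag_a, set()), index.get(tag_b, set()))
```

Two tags that different users keep applying to the same pictures, such as two spellings of a city name, score close to 1 and can be treated as likely synonyms, while unrelated tags score close to 0.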

Quintarelli[xxiv] (2005) suggests that the main strength of collaborative tagging is that it may be the only way of cataloguing significant amount of web content in a timely and cost efficient manner. The main idea behind this is to allocate responsibility for classification hierarchy to a democratic and neutral mechanism.

Image repository: Flickr.com

Image repository websites are common on the web, but as most of them are very basic and classic tools, they have not attracted much attention. Sites like www.shutterfly.com basically provide a facility to edit photo albums online and share them with friends or an authorized group, but they are not massive image tagging systems. Considering that the objective of this research is to compare Flickr.com with current image tagging theories, this research will focus on this website only.

Why restrict the study to flickr.com? Perhaps because in 2007 flickr.com ranked 32nd in total web traffic worldwide (40th a year earlier), which attests to its success and its attractiveness to users.[xxv] The growing success of a website like flickr.com rests on its innovative concept of mass social collaborative tagging. The concept is not only about sharing your own resources with a close circle of friends and relatives; it is about sharing by default your works of art, or your pride at being a new father, with the community. Of course, it is possible to restrict access to a designated group, but that is not the default setting.

Website Description:

What is Flickr?

There are many sites like Flickr which help people organize images and photographs, allowing them to be stored and shared by family, friends, or the general public.

These photographs can be arranged and structured by assigning tags to them when they are uploaded to the site. These tags are also useful in sorting the photo collection according to their index text and values. If these pictures are stored in a folder-based system, which uses a tree-like structure, then it is not possible to sort these photographs in groups according to different conditions.

Flickr.com screen capture

The tags are used by the people so they can place images in multiple groups; these groups can be used by everyone to perform a variety of tasks. On some sites, the links to the images are stored as bookmarks and tags are used for these bookmarks. This method allows the users to categorize based on the tags and their classification is done using shared spaces for groups of users to explore and create new groups of tags.

Part of the popularity of sites like Flickr comes from the easy accessibility of the images stored there. Even without an account, a user can browse the site and find a variety of images suited to their tastes. They can search by keyword, or they can browse by tags. Flickr maintains a list of the tags most frequently used by those looking for a particular kind of image. The more popular tags get a larger font, setting them apart from the rest. The site even breaks the list down over the last day and the last week.
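
The popularity-based font sizing described above can be sketched as a linear scaling, assuming each tag comes with a usage count. The function name font_sizes and the pixel bounds are hypothetical, and the actual scaling used by Flickr is not known from the text.

```python
def font_sizes(tag_counts, min_px=12, max_px=36):
    """Map each tag to a font size that grows linearly with its count.

    The least used tag gets `min_px`, the most used tag gets `max_px`.
    """
    if not tag_counts:
        return {}
    low = min(tag_counts.values())
    high = max(tag_counts.values())
    if high == low:
        # All tags are equally popular: nothing to set apart.
        return {tag: min_px for tag in tag_counts}
    scale = (max_px - min_px) / (high - low)
    return {
        tag: round(min_px + (count - low) * scale)
        for tag, count in tag_counts.items()
    }
```

Breaking the list down by day or by week then only requires calling the function with counts restricted to that period.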

Beyond tagging, Flickr also supports clusters of tags. For example, under the tag “culture,” Flickr suggests several clusters, each of which includes several other tags. One of the suggested clusters groups images tagged with people, travel and museum as well as culture. This concept presents additional challenges in indexing.

Both the clustering concept and the ranking of tags by popularity present indexing challenges. Users expect speed and efficiency, which tagging and clustering help to deliver from their point of view. Tracking the popularity of an image, or worse a group of images, can be a little trickier but is necessary for the satisfaction of the end user.

How to use flickr?

The first use of flickr.com is sharing pictures with a defined group of friends or family. You are allowed a certain amount of disk space per week and may upload as many pictures as you want within this limit. Once your photos are uploaded, you are asked to give them a title, then to tag them, and you may also add a description explaining the photo. After that, you may create sets or batches of photos according to your needs and give them all the same tag at once. Each set or batch can also share access rights. The question of who accesses which photos can also be approached the other way round, as flickr lets you create or join groups that share photos on a specific subject or that do an activity together. Each group has a common pool of photos. Photos can be placed on a world map at the location where they were taken. All access to photos is controlled, including permission to tag.

These features are only some of those available to a user who wishes to upload photos.

A user who only wishes to look at other people’s pictures can do that too, tag photos where authorized, and so on.

The sharing comes when you can see the activity which has taken place on your photos, such as other users’ comments or tags. You may create artists’ groups who want to share their best pictures, or even invent games like hunting for a particular photo on the website.

It would be too long to describe all the possibilities of a website like flickr. The richness of possibilities and the flexibility of the tool is what makes its value and probably its success, but there is probably still room for improvement on the basis of existing theories.

Comparison between Flickr tagging method and existing theories:

Why compare flickr.com with existing theories? Flickr.com was not born from research; it was born from a creative and successful accident. Basically, its creators were working on a computer game when they decided that it would be fun to have a tool which enabled players to take snapshots, make them available to other players and leave room for players to tag the snapshots. They ended up having more fun tagging photos than playing the game. The consequence of this accidental creation is that the reality and the success have to be explained after the fact. Flickr.com was not preceded by research, development or market studies. Theory had to analyse the picture tagging phenomenon, systematize it, and try to propose enhancements based on its conclusions.

Flickr can be put neither in the category of taxonomy nor in the category of folksonomy. It is a mixture of both.

When you upload a photo on flickr, you are asked to provide a title, a tag and maybe a comment. Then you may add the location where the picture was taken, and even the date. Theory describes these features as multifaceted tagging: the information is entered by the user, but a facilitation tool directs his input so that the resulting information is easy to analyse. For example, if you take a photo of a goat in the Alps, you might enter your landowner’s name as the title, then tag it “goat, alps, landowner”, and add a comment like “my landowner refused to repair my door”. The location will be the Alps and the date the day the picture was taken. Flickr is therefore providing a collaborative tagging tool in the sense that, when you tag, you do not need to repeat the title or the location, which have their own input fields. You will therefore tag with other, related terms.
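
The goat example can be written down as a small data structure to show why separate input fields make the information easier to analyse. The sketch below is an assumption for illustration only: the class PhotoRecord, the function find_photos, the landowner's name, the dates and the second photo are all hypothetical and do not reflect Flickr's internal data model.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Set


@dataclass
class PhotoRecord:
    """One uploaded photo, with each facet stored in its own field."""
    title: str
    tags: Set[str] = field(default_factory=set)
    description: str = ""
    location: Optional[str] = None
    taken_on: Optional[date] = None


def find_photos(photos: List[PhotoRecord],
                tag: Optional[str] = None,
                location: Optional[str] = None) -> List[PhotoRecord]:
    """Return the photos matching every facet that was given."""
    results = []
    for photo in photos:
        if tag is not None and tag not in photo.tags:
            continue
        if location is not None and photo.location != location:
            continue
        results.append(photo)
    return results


# Hypothetical records: the names and dates are invented.
goat = PhotoRecord(
    title="Mr. Dupont",
    tags={"goat", "alps", "landowner"},
    description="my landowner refused to repair my door",
    location="Alps",
    taken_on=date(2007, 8, 1),
)
pyramid = PhotoRecord(
    title="Pyramid and car",
    tags={"pyramid", "car"},
    location="Egypt",
    taken_on=date(2007, 3, 15),
)
photos = [goat, pyramid]

print([p.title for p in find_photos(photos, tag="goat", location="Alps")])
```

Because location and date sit in their own fields, a query can filter on them exactly, while the free-form tags remain available for everything the fixed fields do not capture.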

Flickr has some folder-like classification, as it provides information such as location (you can then retrieve photos from Africa or Germany only) or date. But it also has tagging features which open onto clusters so that, if a query does not give you satisfactory results, you can broaden your search in related directions.

Therefore, as clusters are created on a statistical basis, there is also a relation to folksonomy theory. Clusters are categories automatically created by an algorithm on the basis of users’ tagging. They are not a single word naming a fixed category; they define categories on the basis of occurrences entered by multiple users, which is the definition of folksonomy.

So the answer to whether flickr pertains to tagging or to taxonomy is that it definitely pertains to tagging, but that, in order to rationalize tagging and to attain a certain folksonomy, the site features some objective, pre-defined entry fields for collaborative tagging.

Between the two extremes represented by taxonomy and tagging, there are multiple ways of mixing both, such as a multifaceted collaborative tool which asks users tagging an item to also enter objective data such as date, place, etc.

Theory divides folksonomy into narrow and broad folksonomy. The objective of broad folksonomy is to gather an enormous amount of tagging information and process it through an algorithm which isolates the most used terms and uses them as categories. Narrow folksonomy is mainly used to identify each document when an enormous volume is handled. The idea behind this is that it would not be possible to do so if users didn’t tag their pictures.
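
The broad folksonomy step of isolating the most used terms can be illustrated with a short Python sketch. It rests on assumptions that are not taken from the theory itself: the function name top_terms is hypothetical, the sample data is invented, and counting each term once per user (rather than once per tagged photo) is merely one plausible design choice.

```python
from collections import Counter


def top_terms(taggings, n=3):
    """Return the n terms used by the largest number of distinct users.

    taggings is an iterable of (user, resource, tag) triples.
    """
    # Count each (user, tag) pair once, so that a single prolific user
    # cannot push a term to the top alone. Tags are compared in lower case.
    distinct = {(user, tag.lower()) for user, _resource, tag in taggings}
    counts = Counter(tag for _user, tag in distinct)
    # Sort by descending count, then alphabetically to break ties.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _count in ranked[:n]]


# Invented sample data, for illustration only.
taggings = [
    ("ann", "p1", "Travel"),
    ("bob", "p2", "travel"),
    ("cat", "p3", "travel"),
    ("ann", "p4", "baby"),
    ("bob", "p5", "baby"),
    ("ann", "p6", "travel"),
    ("cat", "p7", "rugby"),
]

print(top_terms(taggings, n=2))
```

The terms that come out on top are the candidates that a broad folksonomy would treat as categories; rerunning the count on newer data lets those categories change over time.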

The question that arises is why not apply a broad folksonomy algorithm to the already existing volume of documents and set defined categories into which later users could sort their items. This would obviously make the tool too rigid. In Flickr, clusters may change from time to time, and collaborative tools provide ever-changing suggestions of the most used terms on the site. This helps the classification to remain lively. If users of Flickr were interested in travel photos in 2006, they might in 2007 be more interested in showing their newborn babies, or Rugby World Cup photos, or whatever becomes fashionable at the time. To keep the website alive, which is key to maintaining and even increasing traffic, which in turn is crucial for a website living from advertising, it has to be flexible enough to bring users back several times a week to see newly uploaded items, new search suggestions, or even new sources of amusement like games. Internet users come back frequently to a site either if they have a use for it (hence the various tasks which can be performed on the site) or if they expect it to provide new ideas for exploration or newly uploaded photos. On flickr, newly uploaded photos can be selected on the front page, and the most used search terms are provided as well.

The idea of featuring the latest/most demanded requests is very interesting on the side of social authority. Social pressure is inherent to humanity: whenever we see a certain amount of persons doing something, we acknowledge that it is the right and interesting thing to do: it becomes fashionable.

Therefore, it is very clever of flickr to show the most used terms on the front page, as it will convince some users to search in that direction too and to create communities on specific subjects which will be seen as fashionable.

Of course, if the system is manipulated, it could allow an ill-intentioned user to attract the attention of other users for no reason or for the wrong reasons.

Contextual authority tagging is a concept which has little place within flickr, maybe because there is no actual need for a cognitive authority on a photo repository. The underlying idea of cognitive authority is to ensure that the quality of the tags is maintained. For example, if tags are assessing online career advice websites, it is important to know that the tagging author is working in the field, and is known for having competence in career advice. A 13-year-old who has never worked would most certainly not be deemed an authority in this field. Whereas if the tag relates to a gaming website, the same 13-year-old could be the person to trust for advice.

As described above in the problems linked to describing photos in words, there is basically no such thing as cognitive authority for pictures, as tagging is more about feelings and what the picture inspires in the tagger.

Studies trying to implement automated tagging have concluded that automated tagging cannot achieve the richness of semantic attribution. The idea is that even if the computer is able to recognize shapes and colours, and even to deduce a certain number of things from what it recognizes, artificial intelligence does not yet have the imagination or creativity which would enable it to tag contextually a picture representing different items. In the introductory example of the pyramid with a car in front, the automated tagging tool might recognize the Giza pyramid and even the brand and colour of the car or the time of day, but it will not be able to give it a symbolic sense such as confronting the ancient past with modernity. Therefore, if flickr were to grant cognitive authority to a tagger, it would have to measure his creativity and how other users appreciate it. Whereas popularity can be measured, creativity, imagination and contextual tagging still escape algorithms.

Therefore, the only thing which might be a solution in this context is to tag the tagger. On flickr, this would mean asking users to assess the quality of the tags created by other taggers. While it might prove useful in certain contexts, it would probably be of little use overall, considering the highly subjective nature of such an assessment. The tagging of the tagger would probably not be of more use than the tag itself. Therefore, there is no need on flickr to evaluate someone’s cognitive authority.

In fact, there is one use for assessing taggers: avoiding spam. But this kind of assessment would be limited to a negative one when spamming is discovered. For example, if you were to search for Egypt and reach completely unrelated, business-like photos, you would probably be disappointed, as these unwanted items would waste time and spoil the efficiency of the search. Therefore, there should be a tool to signal to the site administrator that you have discovered a spammer.

Tagging on flickr uses the open vocabulary associated with collaborative tagging applications (Weiss, 2005)[xxvi]. Users may use their own words in their own language, and describe their photos just the way they want.

One could object that the variety of existing words should degrade the quality of retrieval, but chosen tags have been shown to follow distribution laws and, on a massive scale, to converge on defined terms[xxvii].

Automated tag clustering uses this fact to try to avoid the problem of the vocabulary used by taggers. This pitfall is unavoidable in a context of free word tagging since, in language, it is natural to use a certain lexical field in relation to the context. Basically, words do not stand alone; they refer to one another to produce a meaning which is different from simply adding together the meanings of the separate words. Moreover, a single word can be used with different meanings when referring to different things. Therefore, Begelman et al (2006)[xxviii] stated that automated tag clustering had to account for lexical relations in order to support search in tag space when “many people of various background collaborate”. Their idea is that tag clusters in Flickr are even superior to a closed category since, when a user is searching with a specific vocabulary, it may be useful to him to look at the list of related terms provided by the cluster. If he does not find the documents he needs under the terms he selected as relevant for his search, the cluster might provide him with related terms which are the ones used by the majority of users for the same concept. By using them, the search becomes more efficient.

Another way of using the theory of related lexical fields is to browse a tag cloud of more or less popular tags. The cloud is a simple visual representation: the more a term has been used, the bigger it appears on the page. A tag cloud therefore provides much useful information to users: trends in tagging, words used for specific concepts, or simply suggestions for new exploration. Though it is a great idea, Begelman’s assessment is that the cloud provides too much information without enough structure, and that it is therefore hard to make efficient use of it. One or two popular tags dominate the cloud and most other terms are related to these two. In fact, tag clusters seem to come in handier as they already come with a structure. In order to obtain these clusters, flickr uses common statistical tools which measure which terms occur together most. Instead of creating a tag cloud, it creates many tag aggregations which supposedly refer to the same sort of subject.
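
Measuring which terms occur together most can be illustrated with a simple co-occurrence count. The sketch below is a deliberately minimal stand-in, not the clustering algorithm of Begelman et al. and not Flickr's own method, neither of which is detailed here: the function names cooccurrence and related_tags, the min_count threshold and the sample photos are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(photos_tags):
    """Count how many photos carry each unordered pair of tags."""
    pair_counts = Counter()
    for tags in photos_tags:
        # Sorting makes ("culture", "travel") and ("travel", "culture")
        # the same key; set() ignores a tag repeated on one photo.
        for a, b in combinations(sorted(set(tags)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def related_tags(seed, photos_tags, min_count=2):
    """Tags that appear with `seed` on at least `min_count` photos."""
    related = Counter()
    for (a, b), n in cooccurrence(photos_tags).items():
        if n < min_count:
            continue
        if a == seed:
            related[b] = n
        elif b == seed:
            related[a] = n
    # Strongest association first, alphabetical order on ties.
    return sorted(related, key=lambda tag: (-related[tag], tag))


# Invented sample: each inner list is the tag set of one photo.
photos = [
    ["culture", "museum", "travel"],
    ["culture", "museum", "people"],
    ["culture", "travel"],
    ["culture", "people", "travel"],
    ["goat", "alps"],
    ["culture", "goat"],
]

print(related_tags("culture", photos))
```

With this invented sample, the tags found around “culture” are travel, museum and people, while the single co-occurrence with “goat” falls below the threshold and is treated as noise. A real clustering method would go one step further and split the related tags into several separate groups.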

Flickr.com efficiency valuation

Evaluation tool

In a paper trying to classify theories on tagging, Cameron Marlow[xxix] designed an interesting tool for evaluating tagging systems.

It first evaluated system design and attributes (objective efficiency) and then tried to measure users’ incentives (satisfaction). Design and attributes were evaluated on the following criteria:

Tagging rights/suppression rights: the system’s restrictions on group tagging, which range from self-tagging only (the author is the only one allowed to tag) to free-for-all tagging (any user is allowed to tag any resource). This is very important in terms of the quality of the metadata produced: a tag on a picture can vary widely depending on whether it comes from a family member, a friend, a complete stranger or the author. Obviously, the more widely the tagging rights are opened, the wider the range of tags and the lower the quality.
Tagging support (mechanism of tag entry): this covers blind tagging (the tagger cannot see previous tags on the same document), viewable tagging (the tagger can see previous tags), and suggestive tagging (the system suggests possible tags on various bases). The first idea behind this is either to try to broaden the range of tags or to try to converge more quickly on a folksonomy. Statistical research showed that the more tagging is supported, the more likely it is that the highest-ranking tag will be chosen by the user, as the human brain has a natural tendency to accept as correct the term which has been chosen by a majority (social proof phenomenon)[xxx]. Assisted tagging may also influence the tagger because it takes less effort to adopt an already existing term than to create a new one[xxxi]. The second idea behind assisting users when they are tagging is to compensate for the drawbacks of free tagging compared with faceted, expert-driven description. Basically, a faceted description of an image gives the date, place, author’s name, country of the subject, etc., which are objective information on the item. Gathering this information is intended to avoid chaos and having the user “muddled in a hodgepodge”[xxxii].
Aggregation (clustering) mode: this ranges from no aggregation (allowing all the semantic problems to arise) to clustering of the type used in Flickr, which enables statistical analysis of tag use.
Resource connectivity: this measures whether tags are the only link between resources or whether they are connected through other criteria (like assigning photos to groups in Flickr). Connectivity enhances retrieval relevance when resources are connected to an assigned context.
Social connectivity: this measures whether users are connected by something other than tags. This criterion may imply the definition of a local folksonomy.

Users’ incentives are influenced by a system’s design: some users tag for themselves only, while others appreciate the social component of tagging. The motivations underlying tagging therefore vary from person to person and from system to system.

Motivations to tag can be social and/or organizational, and vary on the basis of the users and intended use. Most users will, in fact, act upon different motivations at the same time:

Future retrieval: for example, tagging newborn baby pictures “Marie-Caroline” on Flickr so that friends can easily find them online.
Contribution and sharing: adding tags to conceptual clusters in order to help future users, whether or not they are related to the tagger.
Attract attention: users might be tempted to contribute massively in order to influence the content of “tag clouds”, the extreme case being undesirable tag spamming.
Play and competition: groups enter sites like Flickr with a search mission and tag their findings accordingly.
Self presentation: users might leave a profile online in order to mark specific resources.
Opinion expression: some sites allow users to express their views on the document.

This group of criteria is a basis for constructing an evaluation tool to assess the efficiency of Flickr, but it is not very developed: it lends itself to a qualitative assessment and is difficult to turn into an algorithm that would enable an assessment based on figures.

Another assessment tool was proposed by the Yahoo! Inc team of Zhichen C et al (2006), who wanted more particularly to assess the capacity of a collaborative tagging search tool to separate relevant information from noise and spam. The criteria they discussed and retained were multi-faceted tagging (which would enhance retrieval capacity), least effort (cost efficiency) and high popularity (which would be a criterion of tag quality, an indication that the tag is not spam, and an assurance that the tag can be re-used easily). Their research showed that the algorithm was giving the intended results.

In order to improve the quality of the tagged resources, they also suggested implementing collaborative tagging and context-generated tag proposals, and attributing a calculated popularity rating to each user in order to assess the quality of his tags.

The idea of tagging people has been largely developed in the literature[xxxiii] with a view to keeping track of employee context in big corporations (meaning where they are based, which project they are working on, etc.). This idea can also help improve tag analysis, as a tag’s author can himself be tagged by other users who assess the quality of his tags.

Assessing the quality of a tag is, in itself, a challenge. Whereas tag taxonomy was discussed by Golder and Huberman (2005), criteria for tag quality in order to implement collaborative tagging were a new field and should be retained as an assessment criterion.

In our opinion, automated tag clustering should also be part of an assessment tool as in a free tagging context, collaborative tagging tools are fundamental in order to ensure tag quality.

Assessing efficiency of flickr.com in terms of tagging tool

[i] Palmer, S (1999): Vision Science: Photons to Phenomenology, MIT Press, Cambridge MA

[ii] See works of the Berkeley Computer Vision Group, http://www.eecs.berkeley.edu/Research/Projects/CS/vision/vision_group.html

[iii] Weinberger, D (May 13, 2005): “Why Tagging Matters”, Harvard Berkman Center for the Internet and Society,

[iv] Begelman, G et al (2004) Automated Tag Clustering: Improving search and exploration in the tag space, Israel Institute of Technology, Computer Science Department.

[v] Zhichen C et al (2006) Towards the Semantic web: collaborative tag suggestion, Yahoo! Inc

[vi] http://en.wiktionary.org/wiki/taxonomy

[vii] http://en.wikipedia.org/wiki/Taxonomy

[viii] Barreau D & Nardi B.A., 1995. Finding and Reminding: File organisation from the desktop, SIGCHI Bulletin, 27(3) 39-42

[ix] Terrel, R, 2005. Contextual Authority Tagging: Cognitive Authority Through Folksonomy, INLS 302, Gollop, 7-8

[x] Pasley, R.C., “Use of image in understanding of documents in cross language information retrieval”, MCSI paper, University of Sheffield, September 2003.

[xi] Romanan M et al., “Image indexing using the general theory of moments”, Technical University of Cluj-Napoca, 2002

[xii] Furnas et Al. “The vocabulary problem in human communication system”, Commun, ACM 30, 11(1987)

[xiii] Marlow, C et al, Position paper, tagging, taxonomy, Flickr, Yahoo Research Berkeley

[xiv] Vanderval, T, http://www.personalinfocloud.com/2005/02/explaining_and_.html

[xv] Wilson, P (1983). Second Hand Knowledge:An inquiry into cognitive authority, Greenwood Press, Westport, CT

[xvi] Muller, N (2003), The role of Authority in the Governance of Knowledge communities, Danish Research Unit for Industrial Dynamics

[xvii] Golder, S.A. and Huberman, B.A. (2006) Usage Patterns of Collaborative Tagging Systems, Journal of Information Science, 32, 198-208

[xviii] Rorissa, A (2007). Comparative study of Flickr tags and index terms in general image collection, University of Sheffield, Department of Information Studies

[xix] Chen, H, and Rasmussen, E.M.(1999), Intellectual Access to images, Library Trends

[xx] Trant, J (2006), Exploring the potential for social tagging and folksonomy in art museums: Proof of concept, New Review of Hypermedia and Multimedia, 12, 83-105

[xxi] Mathes, A.(2004), Folksonomy:Cooperative classification and Communication through shared metadata. UIC Technical Report

[xxii] Shirky, C Ontology is overrated, http://shirky.com/witings/ontology_overrated.html, and Merholz, P. Clay’s Shirky’s viewpoints are overrated, http://www.peterme.com,/archives/000558.html

[xxiii] Walker, J (2005), Feral hypertext: when hypertext literature escapes control, Hypertext 05, ACM Press, New York

[xxiv] Quintarelli, E (2005) Folksonomies:Power to the people, Retrieved August 2006 from the International society for Knowledge Organization, Italy-UniMIB meeting archive at http://www.iskoi.org/doc/folksonomies.htm

[xxv] Alexa.com, Top 500 websites, retrieved august 26th 2007

[xxvi] Weiss, A (2005) The Power of collective intelligence, netWorker 9(3), 16-23

[xxvii] Voss, J (2006), Collaborative Thesaurus Tagging the Wikipedia way, retrieved October 23, 2006 from the Collaborative Web Tagging Workshop Archive available at http://www.arxiv.org/abs/cs/0604036

[xxviii] Begelman, G et al (2006) Automated Tag Clustering: Improving search and exploration in the tag space, Israel Institute of Technology, Computer Science Department.

[xxix] Marlow, C et al, Position paper, tagging, taxonomy, Flickr, Yahoo Research Berkeley

[xxx] Cialdini, R.B. (1993), Influence: Science and Practice (3rd Edition), New York, HarperCollins College Publishers

[xxxi] Pirolli, P and Card, S (1995), Information Foraging in Information Access Environments, Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, May 7-11, pp. 51-58, New York, ACM Press

[xxxii] “OSAF wiki.Journal.HierarchyVersusFacets.” http://wiki.osafoundation.org/bin/view/Journal/HierarchyVersusFacetsVersus Tags?skin=pring, 2005.

[xxxiii] Farrell, S and Lau, T (2006) Fringe Contacts: People-tagging for the Enterprise, IBM Almaden Research Center
