Wikipedia:Tamil Wikipedia: A Case Study

Tamil Wikipedia: A case study

L.BalaSundaraRaman
(accepted for presentation at Wikimania 2009)

Acknowledgment

The author thanks Natkeeran, Ravishankar, Selvakumar, Karthickbala, and fellow Tamil Wikipedia editors for feedback and other inputs.

Abstract

This case study of Tamil Wikipedia identifies three distinct growth phases since its launch in late 2003. It describes several distinct characteristics of the Wikipedia and its editors, discusses outreach efforts and sibling projects, and outlines challenges and future plans.

Introduction

Tamil is a classical language with a rich literary tradition spanning millennia, spoken by more than 78 million people across the world. Significant acclaimed Tamil literature has existed for over two thousand years.[1] The Tamil language Wikipedia has 18,021 articles at the time of writing, a number of them of good quality. This case study attempts to characterise the Tamil Wikipedia, its editorial team, growth trends, challenges faced, and plans to take it to the next important stage.

History

Palm leaf manuscript from Tolkāppiyam—the earliest of extant works in Tamil, which described the grammar of written Tamil.

Extant Tamil literature consists of works on poetics, philosophy, ethics, grammar, etc. Notable among the early Tamil encyclopaedias were Abidhaanakosam,[2] written by Muthuthambiyaar and published in Jaffna in 1902, and Abidhaana Chindhaamani,[3] a 1050-page work that took Singaravelanar 42 years of determined effort and was published in Chennai in 1910. Later, an 18-volume encyclopaedia on science and a 15-volume work on humanities were published by the Thanjavur Tamil University,[4] in an intended series of 20 and 15 volumes respectively. The first comprehensive modern encyclopaedia was published from 1954 to 1968 as a 10-volume set.[5] It was a collaborative effort by scholars, philanthropists and the Government of Tamil Nadu. More recently, in 2007, a collection of 28,000 articles from the concise edition of Encyclopaedia Britannica was translated and published in Tamil by Vikatan Publishers.[6]

Mayooranathan, Tamil Wikipedia visionary, is an architect living in the United Arab Emirates.

Tamil Wikipedia was started on September 30, 2003 by an anonymous person who posted on the Main Page a link to their Yahoo! Group and the text manitha maembaadu (மனித மேம்பாடு), fittingly a phrase that means human development.[7] However, for several weeks after that, the site had an all-English interface and little activity. Mayooranathan, in response to a request posted on a mailing list, completed 95% of the localisation between November 4 and November 22, 2003, making some anonymous edits alongside. On November 12, 2003, Amala Singh from the United Kingdom wrote the first article in Tamil, though with an English title, Shirin Ebadi.[8] Mayooranathan, the earliest editor who continues to edit actively, has written more than 2760 articles and kept the project alive during an intervening period when practically nobody else was editing. Around five active editors, including the author, joined the project in the second half of 2004. Some occasional editors went on to become regular editors, and the wiki started growing steadily. Bugs were reported to get the interface fixed, policies partially derived from the English Wikipedia were initiated, and editors started to specialise in tasks like stub sorting, creating templates, copyediting, wikifying, translation, and original writing. Even at this early stage, the Tamil Wikipedia had a global editorial team representing almost every continent.

After a period of steep linear growth from a low base in several quantitative metrics, the Tamil Wikipedia moved, around April 2007, to slower linear growth on a higher base. This period, however, also showed perceptibly super-linear growth in aspects of article quality such as length, standard of prose, image use, and inline citation usage. Late 2008 to early 2009 was characterised by a near-constant number of active and very active editors, a steady influx of new and occasional editors, a healthy, enthusiastic and continuity-preserving churn, and, above all, optimism for a promising future.

Data up to Jan 2009 (18,027 articles as of April 30)

Growth

The three distinct phases noted in the History section are shown in the accompanying chart. The number of very active Wikipedians (not in chart) has also grown well. With the recent workshops and the planned events, we hope to hit a hockey-stick growth phase in the second half of 2009.

The premise behind the hope is the following: linear growth in the number of active editors results in super-linear growth in the number of articles, because each editor keeps adding to a cumulative total. Other metrics, such as article length, might improve at an even greater rate. Given this, if the number of active users increases super-linearly thanks to the recent outreach efforts and the consequent mainstream media attention, content growth will move to a much higher plane.
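
The premise can be illustrated with a minimal sketch. It assumes, purely for illustration, that every active editor adds the same fixed number of articles each month. The function name, the editor counts and the per-editor rate are hypothetical and are not taken from Tamil Wikipedia statistics.

```python
def cumulative_articles(editors_per_month, articles_per_editor):
    """Return the running article total after each month.

    Assumption (not from the case study): every active editor adds
    the same number of new articles in every month.
    """
    totals = []
    running_total = 0.0
    for editors in editors_per_month:
        running_total += editors * articles_per_editor
        totals.append(running_total)
    return totals


# Hypothetical figures: 10 active editors in the first month, 2 more each month.
editors = [10 + 2 * month for month in range(12)]
articles = cumulative_articles(editors, articles_per_editor=5.0)

# Month-on-month gains keep rising, so the total grows faster than linearly
# even though the number of editors grows only linearly.
monthly_gains = [later - earlier for earlier, later in zip(articles, articles[1:])]
print(articles)
print(monthly_gains)
```

Under these assumptions the article total grows quadratically: the monthly gain itself rises by a constant amount each month. A super-linear rise in editors would make the curve steeper still.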

Editor profiles/demographics

Nirojan, from Canada, one of the youngest editors, wrote more than two thousand articles on Tamil films, ancient Tamil kings, theatre and drama.
Prof. VK, the most senior editor, has so far written 188 articles on Mathematics, Astronomy and Philosophy, contributing from the US and India.

Tamil Wikipedia has had a diverse set of editors from the beginning. Editors come from various disciplines such as Architecture, Biotechnology, Economics, Electronics, Information Technology, Mathematics, Music, and Social Welfare. They also come from various professions: engineers, scientists, academics, students, administrators, self-employed people, etc. Editors are aged between 15[9] and 85 years, with a non-uniform distribution in between that is, notably, not a power law. Educational qualifications and income levels also vary across the spectrum.

More information about the profiles of editors as well as visitors to Tamil Wikipedia will emerge when the results of the UNU-MERIT survey[10] are published. Available monitoring tools indicate approximately 60,000 page requests each day.

827 editors have made just one edit and 4 editors have made more than 10,000 edits each. However, this distribution of edits per editor is flattening over time. It is worth noting that many newcomers have made significant contributions in a shorter time than the established users did.

Distinct characteristics

  • General cordiality and assumption of good faith among regular editors
  • Quality focus from early on[11] (concern[12] about article diversity when Ganeshbot, a bot similar to Rambot of the English Wikipedia, was proposed)
  • Early emphasis on citing sources[13][14]
  • Individual editors writing full-length articles later copyedited by others
  • Specialist roles chosen by editors even when only a handful of editors were actively editing
  • 'In the news' and 'Selected anniversaries' sections meticulously updated, almost on a daily basis, by a dedicated user[15]
  • Several topics, in diverse areas, are being covered for the first time in Tamil. Tamil Wikipedia editors endeavour to keep knowledge current by writing articles on emerging topics in science, technology, politics, etc. As is customary, especially in agglutinative languages, suitable terminology is coined as needed from existing words and roots.
  • In English Wikipedia, the primary and nearly the sole motivation for editors is to document and spread knowledge; English as a medium is incidental. In the case of Tamil Wikipedia, however, most editors view their work as a way to spread precious knowledge in Tamil. Many editors are motivated by the chance to enrich the modern Tamil corpus by adding quality content in Tamil.
Chandravathanaa, from Germany, is among the few female editors in Tamil Wikipedia
Ravi, a longtime editor, at the Chennai workshop
The top four editors have made over 46% of the edits. One goal of the outreach programmes is to make the tail of less frequent contributors long and diverse enough to account for a significant share of the contributions.

Challenges

  • Low internet penetration among the majority of the population
  • Low awareness about Tamil typing tools
  • Low awareness about Tamil Wikipedia
  • Less than 2% of editors are female
  • Disconnect between skilled writers and internet access
  • Critical mass of tech-savvy editors who can fix interface issues not yet reached

Outreach

Except for a small initiative to display Wikipedia badges on blogs in late 2004, and one instance of media outreach, there had been no planned activities to bring more readers and editors to Tamil Wikipedia. Since the beginning of 2009, however, Wikipedians have organised three workshops[16] in which participants were introduced to the Tamil Wikipedia, told about its philosophy and usefulness, and tutored in Tamil typing and basic editing. Half a dozen introductory talks were delivered at meetups of other groups. These events have been conducted in colleges, including the prestigious Indian Institute of Science,[17] workplaces,[18] and special interest clubs. The workshops and talks have had a good impact, bringing in new active editors from various backgrounds.

Based on the feedback from each workshop, the following has been observed:

  • Tutor-learner ratio should be around 1:5 for useful practical training. Having multiple tutors handling different aspects of editing is helpful.
  • A classroom is good; a computer science lab environment is better.
  • Asking an uninitiated person from the audience to come forward and edit is a good approach: it convinces others of the ease of use and gives the tutors feedback about the difficulties faced by new editors.
  • If a remote editor leaves a message of appreciation at the new user's talk page as soon as they make the first trial edit, it encourages them a lot.
  • Articles to cite as examples should be picked based on audience composition.
  • Emailing all those who attended, thanking them as well as inviting them to edit, leads to more conversions.
  • In the Indian Wikipedia context, the first session after introduction should be about typing in the Indian language concerned.


The agenda of a typical workshop is as follows:

  • Introductions by the host and the Tamil Wikipedia member who acted as an interface with the host
  • A short presentation on what Wikipedia is, its history, philosophies, software, etc.
  • A tutorial on Tamil typing tools
  • Tea break
  • Tutorial on editing, carried out by a volunteer from the audience. The newcomer picks the topic and content.
  • Q & A session

Sibling projects

Other Tamil Wiki projects are Wiktionary, Wikinews, Wikisource, Wikibooks, and Wikiquote. Of these, Tamil Wiktionary is the one project that has matured and grown well. Seeded mainly by a bot[19] that added entries from technical dictionaries, the Tamil Wiktionary reached more than 100,000 entries and was featured on the main Wiktionary page for some time. It has attracted more editors since then and, at this stage, its sustenance and future growth are assured. Tamil, with a long and rich literary tradition, has numerous public domain works available, so there is ample scope for Wikisource to grow. The other Tamil Wiki projects are still in the bootstrapping stage, and there is also some new-found interest in starting a Wikispecies project in Tamil.

Future plans

Language      Official count  > 200 char  Mean bytes  Length > 0.5K  Length > 2K  Size   Words    Images
Tamil         16 k            16 k        1619        81%            21%          74 MB  3.0 M    3.0 k
Bengali       19 k            12 k        1113        49%            11%          61 MB  3.1 M    8.5 k
Marathi       21 k            6.4 k       623         20%            5%           44 MB  1.8 M    0.769 k
Telugu        42 k            13 k        578         16%            5%           64 MB  3.0 M    2.6 k
Hindi         24 k            14 k        1128        35%            11%          76 MB  4.6 M    1.4 k
Malayalam     8.3 k           7.8 k       2425        78%            30%          58 MB  2.1 M    5.4 k
Kannada       6.1 k           5.3 k       1282        53%            14%          23 MB  0.965 M  0.211 k
Tamil's rank  5               1           2           1              2            2      3        3
Table showing comparison of top Indian language Wikipedias (as of Nov 2008)

The Tamil and Malayalam Wikipedias top the quality metrics, such as mean article size and the share of longer articles. Tamil Wikipedians monitor changes in these metrics regularly.
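
The ranks in the last row of the table can be re-derived from the figures above it. The following is a minimal sketch, assuming that a higher value is better in every column and that tied languages share the better rank; the variable and function names are illustrative only.

```python
# Figures transcribed from the comparison table (as of Nov 2008).
# Column order: official count, > 200 char, mean bytes, length 0.5K (%),
# length 2K (%), size (MB), words, images.
STATS = {
    "Tamil":     [16_000, 16_000, 1619, 81, 21, 74, 3_000_000, 3000],
    "Bengali":   [19_000, 12_000, 1113, 49, 11, 61, 3_100_000, 8500],
    "Marathi":   [21_000, 6_400, 623, 20, 5, 44, 1_800_000, 769],
    "Telugu":    [42_000, 13_000, 578, 16, 5, 64, 3_000_000, 2600],
    "Hindi":     [24_000, 14_000, 1128, 35, 11, 76, 4_600_000, 1400],
    "Malayalam": [8_300, 7_800, 2425, 78, 30, 58, 2_100_000, 5400],
    "Kannada":   [6_100, 5_300, 1282, 53, 14, 23, 965_000, 211],
}


def ranks_for(language, stats):
    """Rank `language` in each column; rank 1 is the highest value.

    Assumption: ties share the better rank, so the rank is one plus the
    number of languages with a strictly higher value in that column.
    """
    own = stats[language]
    return [
        1 + sum(1 for other in stats.values() if other[i] > own[i])
        for i in range(len(own))
    ]


tamil_ranks = ranks_for("Tamil", STATS)
print(tamil_ranks)
```

Under these assumptions the result agrees with the "Tamil's rank" row of the table. Because the published figures are rounded, some apparent ties (for example, the word counts for Tamil and Telugu) may not be ties in the underlying data.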

The plans for the coming period include:

  • firming up policies and guidelines
  • media outreach
  • bringing out an offline collection of wiki articles
    • The 28,000 articles in the Tamil edition of the concise Britannica, currently sold in the market, are of stub quality. A collection of 5,000 selected articles from Tamil Wikipedia, published after manual review, is therefore likely to find a number of takers. In fact, a collection of wildlife articles for school children and an assorted collection[20] of good articles given to scientific research students have been well received.
  • liaising with the Indian Wikimedia Chapter being formed and other bodies
  • conducting article-writing contests, local conferences, etc.

Conclusion

A case study of Tamil Wikipedia has revealed three distinct growth phases so far. Important characteristics of the editors as well as of the wiki itself have been described. The main problems standing in the way of its growth have been identified, and future plans have been outlined. Conducting similar studies on other language Wikipedias that are in a similar phase of growth could reveal commonalities as well as distinct characteristics.

Notes

  1. Kamil V. Zvelebil (1992). Companion Studies to the History of Tamil Literature. BRILL Academic. p. 12. ISBN 9004093656. "p12 - ...the most acceptable periodisation which has so far been suggested for the development of Tamil writing seems to me to be that of A Chidambaranatha Chettiar (1907–1967): 1. Sangam Literature - 200BC to AD 200; 2. Post Sangam literature - AD 200 - AD 600; 3. Early Medieval literature - AD 600 to AD 1200; 4. Later Medieval literature - AD 1200 to AD 1800; 5. Pre-Modern literature - AD 1800 to 1900..."
  2. Abidhaanakosam in the Noolaham archive
  3. Author Jeyamohan on Abidhaana Chindhaamani
  4. http://www.tamiluniversity.ac.in/english/links/encyclopaedia.html
  5. Ma. Po. Sivagnanam (1978). The history of Tamil Development after (Indian) independence. Chennai: Poongodi Publications.
  6. "Karunanidhi releases Encyclopaedia Brittanica in Tamil". The Hindu. 2007-04-29. http://www.hindu.com/2007/04/29/stories/2007042902840300.htm. Retrieved 2009-05.
  7. http://ta.wikipedia.org/w/index.php?title=முதற்_பக்கம்&diff=prev&oldid=5
  8. The article titled in English was moved to the Tamil title, and the redirect page was subsequently deleted. It has been recently restored for the record.
  9. Karthikeyan, a school student from Singapore, wrote several articles on herbs from this user account and anonymously prior to that.
  10. Möller, Erik (2008-10-24). "Multilingual Wikipedia Survey Launched". Wikimedia Foundation. Retrieved 2009-04-16.
  11. Tamil Wikipedia quality monitor
  12. "Wikipedia discussion prior to bot approval". Retrieved 2009-04-16.
  13. Citation guidelines
  14. "Articles using "Cite journal" template". Retrieved 2009-04-16.
  15. Kanags maintains these two sections
  16. Homepage for workshops
  17. Details of the workshop held at the IISc
  18. "Wikipedia Academy in Bangalore". My Bangalore. 2009-02-05. http://mybangalore.com/article/wikipedia-academy-in-bangalore.html. Retrieved 2009-04-25.
  19. SundarBot project page
  20. Booklet given to participants of the workshop held at the Indian Institute of Science

References

  • "Wikipedia Statistics: Tamil". Wikimedia. Retrieved 2009-04-15.
  • Ramaswamy, Sumathi (1998). Passions of the Tongue: language devotion in Tamil India 1891–1970. Delhi: Munshiram. ISBN 81-215-0851-7.