TritaH (Scholar HIndex Batch Calculator)  Luca Boscolo
TritaH (Scholar HIndex Batch Calculator) is software designed by Luca Boscolo that automatically calculates the h-index and other parameters using the Google Scholar database.
The software is written in C# and VB.NET on the .NET Framework. It runs on a Windows web server and saves its data in a SQL Server database.
DISCLAIMER
Many people have contacted me to say that TritaH does not capture all their publications, so I must stress that TritaH displays only the results that the Google Scholar database returns for a specific search. Some publications may not be captured by the Google Scholar search and are therefore not displayed by TritaH. There are several reasons for this. I think the most common is that some publications are found only when the search uses the initial of the first name, and that kind of search usually also introduces many homonyms. For this reason I have created the My Citations service, which uses Google Scholar MyCitations to let authors select their own publications.
The Method
30 JUNE 2012  PLEASE NOTE: the method has been modified and now produces the same results as Harzing's Publish or Perish.
Here is a comparison with the other major software (01/07/2012):
Author | Boscolo's TritaH Scholar HIndex Batch Calculator | Harzing's Publish or Perish | Ianni's Scholar HIndex Calculator (quoted author name) | TIS
CM Croce | HIndex: 149, Papers: 1000, Citations: 89613, Years: 42 | HIndex: 149, Papers: 1000, Citations: 89558, Years: 42 | HIndex: >100, Papers: >100, Citations: 50160, Years: ?? | HIndex: 149
Napoleone Ferrara | HIndex: 118, Papers: 568, Citations: 80561, Years: 31 | HIndex: 118, Papers: 568, Citations: 80561, Years: 31 | HIndex: >100, Papers: >100, Citations: 72217, Years: ?? | HIndex: 132
Alberto Mantovani | HIndex: 116, Papers: 1000, Citations: 53187, Years: 38 | HIndex: 116, Papers: 1000, Citations: 53187, Years: 54 (the publications end in 1975 with the exception of one in 1959 with zero cites. TritaH does not pick it up automatically) | HIndex: >100, Papers: >100, Citations: 34452, Years: ?? | HIndex: 130
Giorgio Trinchieri | HIndex: 109, Papers: 384, Citations: 44153, Years: 38 | HIndex: 109, Papers: 384, Citations: 44153, Years: 38 | HIndex: >100, Papers: >100, Citations: 36022, Years: ?? | HIndex: 123
Ettore Appella | HIndex: 103, Papers: 507, Citations: 41543, Years: 52 | HIndex: 103, Papers: 507, Citations: 41543, Years: 52 | HIndex: >100, Papers: >100, Citations: 30153, Years: ?? | HIndex: 115
Giuseppe Remuzzi | HIndex: 108, Papers: 883, Citations: 44946, Years: 36 | HIndex: 108, Papers: 883, Citations: 44946, Years: 113 (the publication list shows a publication in 1900 with 0 cites, TritaH ignores this automatically) | HIndex: >100, Papers: >100, Citations: 28434, Years: ?? | HIndex: 114
Tomaso Poggio | HIndex: 96, Papers: 694, Citations: 50167, Years: 42 | HIndex: 96, Papers: 694, Citations: 50167, Years: 42 | HIndex: 96, Papers: ??, Citations: 43753, Years: ?? | HIndex: 104
Dario Alessi | HIndex: 85, Papers: 266, Citations: 34084, Years: 22 | HIndex: 85, Papers: 266, Citations: 34084, Years: 22 | HIndex: 85, Papers: ??, Citations: 32256, Years: ?? | HIndex: 94
Piero Anversa | HIndex: 87, Papers: 266, Citations: 34084, Years: 22 | HIndex: 87, Papers: 266, Citations: 34084, Years: 22 | HIndex: 87, Papers: ??, Citations: 31480, Years: ?? | HIndex: 94
Luigi Tavazzi | HIndex: 66, Papers: 351, Citations: 22031, Years: 35 | HIndex: 66, Papers: 351, Citations: 22031, Years: 35 | HIndex: 64, Papers: ??, Citations: 20143, Years: ?? | HIndex: 66

As this comparison shows, Boscolo's Scholar HIndex Batch Calculator and Harzing's Publish or Perish provide the same results, while Ianni's Scholar HIndex Calculator is limited to the first 100 publications and runs only as a Mozilla Firefox add-on. To run Publish or Perish you need to install the software on your computer, and after about 100 queries Google blocks it. Scholar HIndex Batch Calculator needs no installation, since it runs on a website, and there are no limits on its usage, but you have to provide an email address in order to get the results. Boscolo's TritaH is also more accurate than Harzing's PoP when calculating the academic age.
This software automates the operations that a user would perform to calculate the h-index using the Google Scholar database: it connects to the Google Scholar website, enters the name and surname, filters by area if required, and retrieves the list of publications. It then downloads this list into a database together with the related information, that is:
 complete list of authors,
 number of citations,
 publication link,
 publication title,
 publication year,
 publisher,
 journal.
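As a rough illustration of the record listed above and of the plain h-index computed from it, here is a minimal Python sketch. The class name `Publication`, its field names and the sample data are hypothetical; they are not TritaH's actual database schema, and TritaH itself is written in C# and VB.NET.

```python
from dataclasses import dataclass

@dataclass
class Publication:
    """One downloaded record; field names are illustrative only."""
    title: str
    authors: list[str]
    cites: int
    year: int
    link: str = ""
    publisher: str = ""
    journal: str = ""

def h_index(publications):
    """Largest h such that h publications have at least h citations each."""
    counts = sorted((p.cites for p in publications), reverse=True)
    h = 0
    for rank, cites in enumerate(counts, start=1):
        if cites < rank:
            break
        h = rank
    return h

# Made-up sample data, not taken from Google Scholar.
sample = [
    Publication("Paper A", ["Rossi M", "Bianchi L"], cites=10, year=2005),
    Publication("Paper B", ["Rossi M"], cites=8, year=2007),
    Publication("Paper C", ["Bianchi L", "Rossi M"], cites=5, year=2009),
    Publication("Paper D", ["Rossi M"], cites=4, year=2010),
    Publication("Paper E", ["Rossi M"], cites=3, year=2011),
]
print(h_index(sample))  # 4: four papers have at least 4 citations each
```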
TritaH (Scholar HIndex Batch Calculator) can download into a database all the publication data for all Italian academics (about 57,000). The information about the Italian academics was downloaded from the Cineca (MIUR) website. For each academic, the software calculates the following parameters, suggested by ANVUR [1] (National Agency for the Evaluation of Universities and Research Institutes) for the evaluation of Italian academics:
 h-index.
 HNormage: an h-index normalised by age. When calculating the h-index, the number of citations of each publication is divided by its age, where the age depends on the publication year as follows: if the year is 2012 the age is 1, if the year is 2011 the age is 2, and so on.
 HAutoSpecial: an h-index normalised by author. When calculating the h-index, the number of citations is multiplied by a rate: the rate is 1 if the author is in first or last place in the authors list, 0.5 if the author is in neither first nor last place, and 0 if the author does not appear in the authors list. Valid only if you tick the option 'Complete Authors List'.
 Total number of citations: the sum of the citations of all the publications downloaded for the selected academic.
 Papers over the last 10 years: the total number of publications over the last 10 years. For example, in 2012 it is the total number of publications from 2002 to 2012.
 HIF Index.
 Total number of publications.
 Academic Age = [(2012 - First_Publication_Year) + 1]. The data are sometimes wrong, for example a publication year of 1792; to overcome this problem, only publications with a year greater than 1960 are considered.
 HIndex Normalised by Academic Age: the h-index divided by the Academic Age.
 Total Number of Citations Normalised by Academic Age: the total number of citations divided by the Academic Age.
 HcIndex: the contemporary h-index [2], basis of the HNormage [3], in which the citations are weighted by a factor of 4 to favour more recent publications.
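The formula defining the weighted score S(i,t) appears to be missing from this page. Assuming the usual definition of the contemporary h-index (Sidiropoulos, Katsaros and Manolopoulos) with the factor of 4 mentioned above as \gamma, it would read as follows. The exponent \delta = 1 and the reading of t as the current year (2012 on this page) are assumptions, not statements from the page:

```latex
S(i,t) = \gamma \cdot \frac{C(i,t)}{(t - t_i + 1)^{\delta}}, \qquad \gamma = 4, \quad \delta = 1
```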
where:
 C(i,t) is the number of citations of publication i in year t;
 t_i is the publication year of publication i.
The HcIndex is the h-index of the publications computed on the calculated S(i,t) values instead of the real citation counts.
[The calculations are done by rounding up to the next higher integer.]
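The HcIndex calculation can be sketched in Python as below. This is an illustration under stated assumptions, not TritaH's code: the score is taken to be S(i,t) = 4 * C(i,t) / (t - t_i + 1), the exponent of 1 and the current year of 2012 are assumed, and the rounding up is applied to each score. The function names and the sample data are hypothetical.

```python
import math

GAMMA = 4  # weighting factor stated in the text
DELTA = 1  # exponent: an assumption (the usual default for this index)

def contemporary_score(cites, pub_year, current_year=2012):
    """Assumed S(i,t) = GAMMA * cites / (t - t_i + 1)**DELTA, rounded up."""
    age = current_year - pub_year + 1
    return math.ceil(GAMMA * cites / age ** DELTA)

def hc_index(papers, current_year=2012):
    """h-index computed on the weighted scores instead of raw citations.

    `papers` is a list of (cites, pub_year) pairs.
    """
    scores = sorted(
        (contemporary_score(c, y, current_year) for c, y in papers),
        reverse=True,
    )
    h = 0
    for rank, score in enumerate(scores, start=1):
        if score < rank:
            break
        h = rank
    return h

# Made-up data: two old, highly cited papers and three recent ones.
papers = [(40, 1993), (30, 1998), (12, 2009), (6, 2011), (2, 2012)]
print(hc_index(papers))  # 5: the weighted scores are 8, 8, 12, 12, 8
```

With these numbers the plain h-index would be 4, because the 2012 paper has only 2 citations; the weighting lifts its score to 8.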
Notes on HcIndex
A disadvantage of the h-index is that it cannot decline. That means that academics who “retire” after 10-20 active years of publishing maintain their high h-index even if they never publish another paper. In order to address this issue, the contemporary h-index has been proposed. The contemporary h-index adds an age-related weighting to each cited article, giving (by default; this depends on the parametrization) less weight to older articles. For junior academics the contemporary h-index is generally close to their regular h-index as most of the papers included in their h-index will be recent. For more established academics there can be a substantial difference between the two indices, indicating that most of the papers included in their h-index have been published some time ago. As such the contemporary h-index often provides a slightly fairer comparison between junior and senior academics than the regular h-index.
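The age- and author-normalised parameters from the list above (HNormage, HAutoSpecial and Academic Age) can be sketched in the same way. This is a minimal Python illustration, not TritaH's implementation: the dictionary layout, the function names and the exact-string author matching are assumptions, and the weighted values are left unrounded because the page does not say whether rounding applies to these indices.

```python
CURRENT_YEAR = 2012  # reference year used on this page

def h_index(scores):
    """Largest h such that h scores are at least h."""
    ranked = sorted(scores, reverse=True)
    h = 0
    for rank, score in enumerate(ranked, start=1):
        if score < rank:
            break
        h = rank
    return h

def age(year):
    """Publication age: 2012 -> 1, 2011 -> 2, and so on."""
    return CURRENT_YEAR - year + 1

def h_norm_age(papers):
    """HNormage: citations divided by publication age before ranking."""
    return h_index(p["cites"] / age(p["year"]) for p in papers)

def author_rate(author, authors):
    """1 for first or last author, 0.5 for a middle author, 0 if absent."""
    if author not in authors:
        return 0.0
    if authors[0] == author or authors[-1] == author:
        return 1.0
    return 0.5

def h_auto_special(papers, author):
    """HAutoSpecial: citations multiplied by the author rate before ranking."""
    return h_index(p["cites"] * author_rate(author, p["authors"]) for p in papers)

def academic_age(papers, min_year=1960):
    """(2012 - first publication year) + 1, using only years after min_year."""
    years = [p["year"] for p in papers if p["year"] > min_year]
    return CURRENT_YEAR - min(years) + 1 if years else 0

# Made-up data; the 1792 record imitates the wrong years mentioned above.
papers = [
    {"cites": 30, "year": 2003, "authors": ["Rossi M", "Bianchi L", "Verdi G"]},
    {"cites": 12, "year": 2009, "authors": ["Bianchi L", "Rossi M", "Verdi G"]},
    {"cites": 6, "year": 2011, "authors": ["Verdi G", "Rossi M"]},
    {"cites": 0, "year": 1792, "authors": ["Rossi M"]},
]
print(h_norm_age(papers))                 # 3
print(h_auto_special(papers, "Rossi M"))  # 3
print(academic_age(papers))               # 10
```

The h-index and citation total normalised by Academic Age then follow by dividing each value by `academic_age(papers)`.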
What is a Minor Citation?
A Minor Citation is when the scientist is cited in the text of the publication but does not appear in the authors list. TritaH searches the Google Scholar database with the option "with the exact phrase"; it includes patents and does NOT include Minor Citations. It also filters by area depending on the academic's SSD, as follows:
SSD = FILTER BY AREA
AGR = Biology, Chemistry, Social Sciences (and Engineering)
BIO = Biology, Medicine; for BIO/10 or BIO/14: Biology, Medicine and Chemistry
CHIM = Chemistry and Physics
FIS = Physics
GEO = Physics, Chemistry, Biology
ICAR = Engineering and Social/Arts and Humanities (SAH)
INF = Engineering/Maths/Computer Sciences
ING-IND = Engineering (+ Biology only for /05)
ING-INF = Engineering (+ Biology for /06)
IUS = SAH
L-ANT = SAH
L-ART = SAH
L-FIL-LET = SAH
L-LIN = SAH
L-OR = SAH
MAT = Engineering and Physics
M-DEA = SAH
MED = Medicine and Biology
M-EDF = Medicine, Biology and SAH
M-FIL = SAH
M-GGR = SAH, Business and Biology
M-PED = SAH
M-PSI = Medicine, Biology and SAH
M-STO = SAH
SECS-P = Business and SAH
SECS-S = Business, Engineering and SAH
SPS = SAH
VET = Medicine and Biology
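The SSD-to-area mapping above behaves like a lookup table with a few sector-specific exceptions. A minimal Python sketch, covering only some of the entries, could look like this. The names `AREA_FILTERS`, `EXTRA_AREAS` and `areas_for` are hypothetical, and how TritaH actually stores the mapping is not stated on this page.

```python
# Partial copy of the mapping above; the other SSDs follow the same pattern.
AREA_FILTERS = {
    "BIO": ["Biology", "Medicine"],
    "CHIM": ["Chemistry", "Physics"],
    "FIS": ["Physics"],
    "GEO": ["Physics", "Chemistry", "Biology"],
    "MAT": ["Engineering", "Physics"],
    "MED": ["Medicine", "Biology"],
    "VET": ["Medicine", "Biology"],
    "IUS": ["SAH"],
}

# Sector-specific additions: BIO/10 and BIO/14 also search Chemistry.
EXTRA_AREAS = {"BIO/10": ["Chemistry"], "BIO/14": ["Chemistry"]}

def areas_for(ssd):
    """Return the filter areas for an SSD code such as 'BIO/10' or 'FIS/01'."""
    code = ssd.strip().upper()
    family = code.split("/")[0]
    areas = list(AREA_FILTERS.get(family, []))  # copy, so the table is not mutated
    areas += EXTRA_AREAS.get(code, [])
    return areas

print(areas_for("BIO/10"))  # ['Biology', 'Medicine', 'Chemistry']
print(areas_for("FIS/01"))  # ['Physics']
```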
The search is normally done by first name and surname, apart from CHIM and ING, where it is done by the initial of the first name and the surname.
Many Italian academics have two or more first names, and searching with all of them has proved completely wrong in many cases. To reduce this kind of error, TritaH considers only the first name. The surname is taken as it is, even if it consists of two or more surnames.
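The name rules above can be sketched as a small Python helper. The function name `search_name`, the quoted output format and the assumption of a non-empty first name are illustrative only; the exact query string TritaH sends to Google Scholar is not documented here.

```python
INITIAL_ONLY = ("CHIM", "ING")  # SSD families searched by first-name initial

def search_name(first_names, surname, ssd):
    """Build the exact-phrase author string for a search.

    Only the first given name is kept; the surname is used as it is,
    even when it is made of several words.
    """
    first = first_names.split()[0]
    if ssd.strip().upper().startswith(INITIAL_ONLY):
        first = first[0]  # initial only for CHIM and ING
    return f'"{first} {surname}"'

print(search_name("Gian Carlo", "De Luca", "MED/04"))  # "Gian De Luca"
print(search_name("Maria Grazia", "Rossi", "CHIM/03"))  # "M Rossi"
```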
Calculation of the errors
The main areas of errors are homonyms and double names.
Homonyms: the software cannot distinguish publications by different authors who share the same first name and surname, or the same first-name initial and surname. To reduce this kind of error, the search is filtered by area.
Double names: to reduce this error, the search considers only the first name.
For the few SSDs examined so far, such as BIO, MED and M-PSI, the error has been calculated to be less than 1 point.
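To see which academics are exposed to the homonym problem, one could group them by the key that the search actually uses. The Python sketch below is a hypothetical helper, not part of TritaH, and the names in it are made up; it shows how searching by initial widens the groups.

```python
from collections import defaultdict

def homonym_groups(academics, by_initial=False):
    """Group (first_name, surname) pairs that share the same search key.

    Only groups with more than one member are returned.
    """
    groups = defaultdict(list)
    for first_name, surname in academics:
        first = first_name.split()[0].upper()  # only the first given name
        key = (first[0] if by_initial else first, surname.upper())
        groups[key].append((first_name, surname))
    return {key: members for key, members in groups.items() if len(members) > 1}

academics = [
    ("Mario", "Rossi"),
    ("Marco", "Rossi"),
    ("Mario Luigi", "Rossi"),
    ("Anna", "Bianchi"),
]
print(homonym_groups(academics))                   # one group: MARIO ROSSI (2 people)
print(homonym_groups(academics, by_initial=True))  # one group: M ROSSI (3 people)
```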
Top 100 Homonym Italian Academics List, downloaded from the Cineca(MIUR) web site (Nov 2010)
Surname(COUNT),ROSSI(214),RUSSO(141),FERRARI(104),ROMANO(92),BIANCHI(88),RICCI(79),CONTI(68),COLOMBO(66),GIORDANO(63),BRUNO(62),GRECO(60),MARINO(60),ESPOSITO(59),RIZZO(58),COSTA(57),DE LUCA(52),GALLO(49),LOMBARDI(48),MARCHETTI(47),RINALDI(47),MANCINI(47),NERI(46),MARINI(46),LONGO(44),BARBIERI(43),FONTANA(42),CARUSO(42),MARTINELLI(41),LOMBARDO(41),GRASSI(41),MORETTI(40),LEONE(40),GALLI(40),FERRETTI(39),D ANGELO(39),VILLA(38),SANTORO(38),DE ROSA(37),MONTI(37),CONTE(36),MARIANI(36),FERRARA(36),PINTO(35),PALUMBO(35),ROMEO(35),DE ANGELIS(35),GENTILE(35),MONTANARI(35),GRASSO(34),BARONE(34),FABBRI(34),LEONARDI(34),SERRA(34),VALENTINI(34),SANTINI(33),MESSINA(33),RIZZI(33),POLI(33),PELLEGRINI(32),MARTINI(32),BIANCO(32),CARBONE(32),VITALE(32),COPPOLA(32),MORELLI(31),BIONDI(31),FERRANTE(31),GATTI(31),DE SANTIS(31),D ALESSANDRO(31),PARISI(30),PIAZZA(30),SALERNO(30),MONACO(29),VENTURA(29),BERTI(29),AMATO(29),CAPUTO(29),ANTONELLI(28),PUGLIESE(28),GIULIANI(28),SILVESTRI(28),NEGRI(28),MOTTA(28),MARINELLI(27),D AGOSTINO(27),CATALANO(27),VILLANI(27),FRANCO(27),ORLANDI(27),VALENTE(27),BERNARDI(27),PALMIERI(27),MAGGI(26),BRUNI(26),MOLINARI(26),CASTELLI(26),ANGELINI(26),SANNA(26),VALENTI(26)
Conclusions
This is ONLY an automatic calculation. Although it provides good statistical results when considering an entire area of scientists, human intervention is highly recommended when evaluating single individuals.
Online version
There is an online version of the TritaH (Scholar HIndex Batch Calculator)
References
Google Scholar database is a freely accessible web search engine that indexes the full text of scholarly literature across an array of publishing formats and disciplines.
Scholar HIndex Calculator is a Firefox add-on, developed by G.B. IANNI, that displays, on top of Google Scholar result pages, the corresponding h-index, g-index, e-index and other measures of impact for the submitted query.
Publish or Perish is a software program, designed by Anne-Wil Harzing, that retrieves and analyzes academic citations. It uses Google Scholar to obtain the raw citations.