Analyze your Zotero database with R

Why would anyone want to analyze their literature database? There are several good reasons. You may be interested if you have ever asked yourself one of the following questions:

  • Which journals should your local librarian add to the university bookshelves?
  • To which journal should you send your next ground-breaking manuscript?
  • Which journals are most interesting to you and deserve an e-mail alert?

In all these cases, you want data-driven recommendations and answers. Here is how you can get them with just a few lines of R. Before starting your R console, you only have to export your library, a folder, or selected entries from Zotero to a CSV file.

library(plyr)
setwd("c:/Dropbox/workspace/Bibliothek")

ebf.bib <- read.csv("ebf-jp.csv", encoding = "UTF-8") # my ebf-jp library

# count the journal articles added after 2014-08-03, per journal
added <- as.Date(substring(ebf.bib$Date.Added, 1, 10)) # keep only the date part
recent <- ebf.bib$Item.Type == "journalArticle" & added > as.Date("2014-08-03")
ebf.jour.freq <- count(ebf.bib[which(recent), ], "Publication.Title")

# select only journals with more than 10 entries
ebf.jour <- subset(ebf.jour.freq, freq > 10)
# sort by frequency, most frequent journal first
arrange(ebf.jour, desc(freq))

In case you are interested in what the output looks like, here are the results of my last year of reading:

  Publication.Title                           freq
1 Intelligence                                  37
2 Journal of Educational Psychology             27
3 Learning and Individual Differences           27
4 Zeitschrift für Pädagogische Psychologie      19
5 International Journal of Science Education    12
6 Educational Psychologist                      11
7 Learning and Instruction                      11
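
If you want to try the idea before exporting your own library, here is a minimal sketch in base R (no plyr needed). It assumes the export uses the column names Item.Type, Publication.Title and Date.Added, as in the script above. The data frame toy.bib and its rows are made up purely for illustration and are not part of my library.

```r
# Toy stand-in for a Zotero CSV export; all rows are invented for illustration.
toy.bib <- data.frame(
  Item.Type = c("journalArticle", "journalArticle", "book",
                "journalArticle", "journalArticle"),
  Publication.Title = c("Intelligence", "Intelligence", "",
                        "Learning and Instruction", "Intelligence"),
  Date.Added = c("2014-09-01 10:15:00", "2015-01-20 08:30:00",
                 "2014-10-05 12:00:00", "2014-11-11 09:45:00",
                 "2013-05-02 14:00:00"),
  stringsAsFactors = FALSE
)

# Keep only journal articles added after the cutoff date.
added <- as.Date(substring(toy.bib$Date.Added, 1, 10))
keep <- toy.bib$Item.Type == "journalArticle" & added > as.Date("2014-08-03")
recent <- toy.bib[which(keep), ]

# Count entries per journal and sort by descending frequency.
toy.freq <- as.data.frame(table(Publication.Title = recent$Publication.Title),
                          responseName = "freq", stringsAsFactors = FALSE)
toy.freq <- toy.freq[order(-toy.freq$freq), ]
print(toy.freq, row.names = FALSE)
```

To run it on a real export, replace toy.bib with the result of read.csv() and add a threshold such as freq > 10 to keep only the journals you read most.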

If you have any recommendations or examples, then drop me a line.