Commit 394e8b41 authored by Karthik Ram

Started working on new branch

- Still missing print methods
- Need to fix the organization of return objects
- But otherwise this is a much better version than the old rAltmetric.
parent 37afc1de
Package: rAltmetric
Type: Package
Title: Retrieves Altmetrics Data For Any Published Paper From Altmetric.com
Version: 0.6
Date: 2014-12-02
Author: Karthik Ram [aut, cre]
Maintainer: Karthik Ram <karthik.ram@gmail.com>
Description: Programmatic interface to the Altmetric.com API.
URL: https://github.com/ropensci/rAltmetric
BugReports: https://github.com/ropensci/rAltmetric/issues/
License: CC0
Imports:
plyr,
RCurl,
reshape2,
png,
ggplot2 (>= 0.9.2.1),
RJSONIO
Suggests:
testthat
Package: rAltmetric2
Title: Provides a programmatic interface to the Altmetric.com API for article-level metrics.
Version: 0.1.0.9000
Authors@R: 'Karthik Ram <karthik.ram@gmail.com> [aut, cre]'
Description: The package allows for retrieval of metrics on various
Depends: R (>= 3.1.1)
License: MIT + file LICENSE
LazyData: true
Imports:
assertthat,
httr,
jsonlite
YEAR: 2015
COPYRIGHT HOLDER: Karthik Ram
\ No newline at end of file
# Generated by roxygen2 (4.0.2): do not edit by hand
# Generated by roxygen2 (4.1.0): do not edit by hand
S3method(plot,altmetric)
S3method(print,altmetric)
export(altmetric_data)
export(altmetrics)
export(return_provider)
import(ggplot2)
importFrom(RCurl,getCurlHandle)
importFrom(RCurl,getURL)
importFrom(RCurl,getURLContent)
importFrom(RJSONIO,fromJSON)
importFrom(plyr,compact)
importFrom(plyr,rbind.fill)
importFrom(plyr,unrowname)
importFrom(png,readPNG)
importFrom(reshape2,melt)
export(citations)
export(metrics)
import(httr)
importFrom(assertthat,assert_that)
rAltmetric 0.6
------------------
* Several bug fixes and documentation updates
rAltmetric 0.5
------------------
* Removed hard dependencies and moved them to imports
* Minor bug fixes
rAltmetric 0.3
------------------
* Initial release to CRAN
#' Retrieves most popular articles over a defined time period
#'
#' The time period can be any of the following: '1d', '2d', '3d', '4d', '5d', '6d', '1w', '1m', '3m', '6m', '1y'
#' @param day The time period over which metrics are required. Allowed options include '1d', '2d', '3d', '4d', '5d', '6d', '1w', '1m', '3m', '6m', '1y'.
#' @param page Page number
#' @param num_results Max number of results per page. Cannot exceed \code{100}.
#' @param cited_in One or more comma delimited options from: facebook, blogs, linkedin, video, pinterest, gplus, twitter, reddit, news, f1000, rh, qna, forum, peerreview
#' @param doi_prefix A DOI prefix (the bit before the first slash, e.g. 10.1038)
#' @param nlmid Comma delimited list of journal NLM IDs. Include only articles from journals with the supplied NLM journal IDs (only journals indexed in PubMed have NLM IDs).
#' @param subject Comma delimited list of slugified journal subjects. Include only articles from journals matching any of the supplied NLM subject ontology term(s).
#' @param foptions A list of additional arguments for \code{httr}. There is no reason to use this argument except for debugging purposes.
#' @import httr
#' @importFrom assertthat assert_that
#' @export
#' @examples
#' citations(day = '1d')
#' # Only Facebook mentions
#' fb_1week <- citations('1w', cited_in = "facebook")
citations <- function(day, page = NULL, num_results = 100, cited_in = NULL, doi_prefix = NULL, nlmid = NULL, subject = NULL, foptions = list()) {
assert_that(!is.null(day))
possible <- c('1d', '2d', '3d', '4d', '5d', '6d', '1w', '1m', '3m', '6m', '1y')
assert_that(day %in% possible)
# 100 is the max results one can get per page
assert_that(num_results <= 100)
args <- as.list( altmetric_compact(c(page = page, num_results = num_results, cited_in = cited_in, doi_prefix = doi_prefix, nlmid = nlmid, subject = subject)))
citations_url <- paste0(api_url(), "citations/", day, "/")
data <- GET(citations_url, query = args, foptions)
warn_for_status(data)
results <- content(data, as = 'text')
res <- jsonlite::fromJSON(results, flatten = TRUE)
res$query <- data.frame(t(unlist(res$query)))
class(res) <- "altmetric"
res
}
\ No newline at end of file
#' Grab altmetric data on any paper
#'
#' This function will retrieve data from Altmetric.com on any paper with an appropriate object identifier. Acceptable identifiers include dois, arXiv ids, pubmed ids and altmetric ids.
#' @param oid Any general object identifier, as long as the prefix is "doi", "pmid", "arXiv", or "id".
#' @param id The Altmetric \code{id} of a paper. If specifying directly, the "id" prefix is not necessary.
#' @param doi The \code{doi} of a paper. If specifying directly, the "doi" prefix is not necessary.
#' @param pmid The \code{pmid} of a paper. If specifying directly, the "pmid" prefix is not necessary.
#' @param arXiv The \code{arXiv} ID of a paper. If specifying directly, the "arXiv" prefix is not necessary.
#' @param apikey An API key obtained from altmetric. The key for this application is '37c9ae22b7979124ea650f3412255bf9' and you are free to use it for academic non-commercial research. But if you start seeing rate limits, please contact support at altmetric.com to get your own.
#' Metrics
#'
#' The function returns detailed metrics. For more information on all the fields returned by the function, see the API documentation: (\url{http://api.altmetric.com/docs/call_citations.html}). If you get your own key, you can save it in your \code{.Rprofile} as \code{options(altmetricKey="YOUR_KEY")}.
#' @param curl passes on curl handle in a vectorized operation
#' @param ... additional parameters
#' @importFrom RCurl getURL getCurlHandle
#' @importFrom RJSONIO fromJSON
#' @importFrom plyr compact unrowname
#' Returns metrics for any standard identifier such as doi, arxiv, pmid, id, or ads.
#' @param identifier The identifier passed in the format "identifier/id".
#' @export
#' @return \code{list}
#' @examples \dontrun{
#' altmetrics(doi ='10.1890/ES11-00339.1')
#' # Or specify the doi with its prefix
#' altmetrics('doi/10.1890/ES11-00339.1')
#' altmetrics(doi ='10.1038/480426a')
#' # Or specify the doi with its prefix
#' You can do the same for other providers such as pmid, id, and arxiv
#' altmetrics('doi/10.1038/480426a')
#' foo <- metrics(identifier = "arxiv/1108.2455")
#' # This is a failed example
#' foo <- metrics(identifier = "arxiv/1108.24553")
#' # Now for a PMID
#' pm <- metrics(identifier = "pmid/21148220")
#'}
altmetrics <- function(oid = NULL, id = NULL, doi = NULL, pmid = NULL, arXiv = NULL, apikey = getOption('altmetricKey'), curl = getCurlHandle(), ...) {
if(is.null(apikey))
apikey <- '37c9ae22b7979124ea650f3412255bf9'
acceptable_identifiers <- c("doi", "arXiv", "id", "pmid")
# If you start hitting rate limits, email support@altmetric.com
# to get your own key.
# If no object identifiers were specified, throw an error
if(is.null(oid) && is.null(id) && is.null(doi) && is.null(pmid) && is.null(arXiv))
stop("No valid identifier found. See ?altmetrics for more help", call.=FALSE)
# If an altmetric id is not prefixed by "id", add it in.
if(!is.null(id)) {
prefix <- as.list((strsplit(id,'/'))[[1]])[[1]]
if(prefix != "id")
id <- paste0("id", "/", id)
}
# If a doi is not prefixed by "doi", add it in.
if(!is.null(doi)) {
prefix <- as.list((strsplit(doi,'/'))[[1]])[[1]]
if(prefix != "doi")
doi <- paste0("doi", "/", doi)
}
# If an arXiv id is not prefixed by "arXiv", add it in.
if(!is.null(arXiv)) {
prefix <- as.list((strsplit(arXiv,':|/'))[[1]])[[1]]
arXiv <- paste0("arXiv", "/", as.list((strsplit(arXiv,':|/'))[[1]])[[2]])
}
# If a pubmed id is not prefixed by "pmid", add it in.
if(!is.null(pmid)) {
prefix <- as.list((strsplit(pmid,'/'))[[1]])[[1]]
if(prefix != "pmid")
pmid <- paste0("pmid", "/", pmid)
}
# Remove the identifiers that weren't specified
identifiers <- compact(list(oid, id, doi, pmid, arXiv))
# If user specifies more than one at once, then throw an error
# Users should use lapply(object_list, altmetrics)
# to process multiple objects.
if(length(identifiers) > 1)
stop("Function can only take one object at a time. Use lapply with a list to process multiple objects", call.=FALSE)
if(!is.null(identifiers)) {
ids <- identifiers[[1]]
# Fix arXiv
test <- strsplit(ids, ":")
if(length(test[[1]]) == 2) {
ids <- paste0(as.list(strsplit(ids, ":")[[1]])[[1]], "/", as.list(strsplit(ids, ":")[[1]])[[2]])
}
supplied_id <- as.character(as.list((strsplit(ids,'/'))[[1]])[[1]])
# message(sprintf("%s", supplied_id))
if(!(supplied_id %in% acceptable_identifiers))
stop("Unknown identifier. Please use doi, pmid, arXiv or id (for altmetric id).", call. = FALSE)
}
# message(sprintf("%s", identifiers[[1]]))
url <- "http://api.altmetric.com/v1/"
url <- paste0(url, ids, "?key=", apikey)
metrics <- getURL(url, curl = curl)
if(metrics == "Not Found") {
message(sprintf("No metrics found on %s", identifiers[[1]]))
return(NULL)
} else {
res <- fromJSON(metrics)
class(res) <- "altmetric"
return(res)
}
metrics <- function(identifier) {
assert_that(!is.null(identifier))
# Take the prefix before the first slash; dirname() would return "doi/10.1038" for a DOI
type <- sub("/.*$", "", identifier)
possible_types <- c("doi", "id", "arxiv", "pmid", "ads", "uri")
assert_that(type %in% possible_types)
metrics_url <- paste0(api_url(), dirname(identifier), "/", basename(identifier))
# message(sprintf("Calling %s \n", metrics_url))
data <- GET(metrics_url)
if(data$status_code == 404) {
message("No metrics found")
NULL
} else {
# warn_for_status(data)
results <- content(data, as = 'text')
# Needs more clean up here
res <- jsonlite::fromJSON(results, flatten = TRUE)
res
}
}
# Make it possible to do either doi or a full list. In case of full list, the first part must be the name of the identifier.
#' Returns a data frame of metrics for a paper
#'
#' @param alt_obj altmetrics object
#' @importFrom plyr rbind.fill
#' @export
#' @return data.frame
#' @examples \dontrun{
#' altmetric_data(altmetrics(doi='10.1038/489201a'))
#'}
altmetric_data <- function(alt_obj) {
value <- NA
if (is(alt_obj, "altmetric")) {
# Pull our readers and cohorts before squashing the list
reader <- alt_obj$readers
cohort <- alt_obj$cohorts
alt_obj <- alt_obj[-which(names(alt_obj)=="readers")]
alt_obj <- alt_obj[-which(names(alt_obj)=="cohorts")]
# Remove TQ if it exists
if("tq" %in% names(alt_obj)) {
alt_obj <- alt_obj[-which(names(alt_obj)=="tq")]
}
# Basic stats
alt_obj$issns <- paste0(alt_obj$issns, collapse = '', sep = ",")
basic_stuff <- data.frame(matrix(rep(NA, 10), nrow =1))
names(basic_stuff) <- names(alt_obj)[1:10]
basic_data <- data.frame(t(unlist(alt_obj[1:10])))
# basics <- rbind.fill(basic_stuff, basic_data)[-1, ]
basics <- basic_data
# Readers to data.frame
# readers <- unrowname(t(data.frame(reader)))
readers <- t(data.frame(reader))
cohorts <- data.frame("pub" = NA, "sci" = NA, "com" = NA, "doc" = NA)
cohorts2 <- data.frame(unrowname(t(data.frame(cohort))))
cohorts <- rbind.fill(cohorts, cohorts2)[-1,]
# Context (hard to standardize so excluding this for now)
# context <- ldply(alt_obj$context, function(foo) t(data.frame(foo)))
# names(context) <- c("type", "count", "rank", "pct")
# more_metrics <- dcast(melt(context, id.var="type"), 1~variable+type)[, -1]
# # 1, 18
# Counts to data.frame
stats_base <- data.frame("cited_by_gplus_count" =NA, "cited_by_fbwalls_count" =NA,"cited_by_posts_count" =NA, "cited_by_tweeters_count" =NA, "cited_by_accounts_count" =NA, "cited_by_feeds_count" =NA, "cited_by_rdts_count" =NA, "cited_by_msm_count" =NA, "cited_by_delicious_count" =NA, "cited_by_forum_count" = NA, "cited_by_qs_count" = NA, "cited_by_rh_count" = NA)
# stats <- melt(alt_obj[grep("^cited", names(alt_obj))])
stats <- data.frame(alt_obj[grep("^cited", names(alt_obj))])
# stats <- dcast(stats, 1~value+L1)[, -1]
# names(stats) <- gsub("^[0-9]_","", names(stats))
stats <- rbind.fill(stats_base, stats)[-1,]
# 1, 11
alt_obj$subjects <- paste0(alt_obj$subjects, collapse='', sep=",")
alt_obj$scopus_subjects <- paste0(alt_obj$scopus_subjects, collapse='', sep=", ")
if(length(alt_obj$added_on) ==0 || is.null(alt_obj$added_on)) {
alt_obj$added_on <- NA
}
if(length(alt_obj$published_on) ==0 || is.null(alt_obj$published_on)) {
alt_obj$published_on <- NA
}
# Removing more_metrics for the time being
# return(data.frame(basic_stuff, stats, score = alt_obj$score, readers, url = alt_obj$url, added_on = alt_obj$added_on, published_on = alt_obj$published_on, subjects = alt_obj$subjects, scopus_subjects = alt_obj$scopus_subjects, last_updated = alt_obj$last_updated, readers_count = alt_obj$readers_count, more_metrics, details_url = alt_obj$details_url))
return(data.frame(basics, stats, score = alt_obj$score, readers, url = alt_obj$url,
added_on = alt_obj$added_on, published_on = alt_obj$published_on,
subjects = alt_obj$subjects, scopus_subjects = alt_obj$scopus_subjects,
last_updated = alt_obj$last_updated, readers_count = alt_obj$readers_count,
details_url = alt_obj$details_url))
}
}
#' Returns cleaner metric source names (mostly for internal use)
#'
#' @param provider the data provider
#' @export
#' @keywords internal
#' @examples \dontrun{
#' return_provider('cited_by_gplus_count')
#'}
return_provider <- function(provider) {
services <- data.frame(type = c("cited_by_gplus_count", "cited_by_fbwalls_count", "cited_by_posts_count", "cited_by_tweeters_count", "cited_by_accounts_count", "cited_by_feeds_count", "cited_by_rdts_count", "cited_by_msm_count", "cited_by_delicious_count", "cited_by_forum_count", "cited_by_qs_count"), names = c("Google+", "Facebook", "Cited", "Tweets", "Accounts", "Feeds", "Reddit", "MSM", "Delicious", "Forums", "QS"))
return(services$names[which(services$type == provider)])
}
\ No newline at end of file
#' Print a summary for an altmetric object
#' @export
#' @param x An object of class \code{altmetric}
#' @param ... additional arguments
print.altmetric <- function(x, ...) {
value <- NA
string <- "Altmetrics on: \"%s\" with altmetric_id: %s published in %s."
vals <- c(x$title, x$altmetric_id, x$journal)
cat(do.call(sprintf, as.list(c(string, vals))))
cat("\n")
stats <- melt(x[grep("^cited", names(x))])
stats$names <- unname(sapply(stats$L1, return_provider))
stats$names <- factor(stats$names, levels = stats$names[rev(order(stats$value))])
print( data.frame(provider = stats$names, count = stats$value))
}
#' Plots metrics for an altmetric object
#'
#' @export
#' @param x An object of class \code{altmetric}
#' @import ggplot2
#' @importFrom reshape2 melt
#' @importFrom png readPNG
#' @importFrom RCurl getURLContent
#' @param ... additional arguments
plot.altmetric <- function(x, ...) {
value <- NA
# just to trick check()
if (!is(x, "altmetric"))
stop("Not an altmetric object")
stats <- melt(x[grep("^cited", names(x))])
stats$names <- unname(sapply(stats$L1, return_provider))
stats$names <- factor(stats$names, levels = stats$names[rev(order(stats$value))])
stats <- stats[-(which(stats$L1=="cited_by_accounts_count")),]
# Grab the donut image
donut <- readPNG(getURLContent(x$images[[2]]))
# Now return a pretty plot
ggplot(stats, aes(names, value)) + geom_point(size = 4, colour = 'steelblue') + ggtitle(x$title) + xlab("Provider") + ylab("Hits") + theme(panel.background = element_blank(), panel.grid.major = element_blank(),
panel.grid.minor = element_blank(), panel.border = element_blank(),
axis.line = element_line(colour = "black")) + annotation_raster(donut, xmin = dim(stats)[1]-1, xmax = dim(stats)[1], ymin = max(stats$value)-(.2*max(stats$value)), ymax = max(stats$value), interpolate = T) + theme(title = element_text(family = "Helvetica", colour = "#0680b0", face="bold"), axis.text = element_text(family="Courier", colour = "#3f3f3f"), axis.title = element_text(colour="#3f3f3f"))
}
#' Returns base API url
#'
#' Returns the base API url for version 1 of altmetric
#' @noRd
api_url <- function() {
"http://api.altmetric.com/v1/"
}
#' @noRd
altmetric_compact <- function(l) Filter(Negate(is.null), l)
Linux build: [![Build Status](https://travis-ci.org/ropensci/rAltmetric.svg?branch=master)](https://travis-ci.org/ropensci/rAltmetric)
Windows build: [![Build status](https://ci.appveyor.com/api/projects/status/x6x8d21rmcsv2ybt)](https://ci.appveyor.com/project/karthik/raltmetric)
# Altmetric2
![altmetric.com](https://raw.github.com/ropensci/rAltmetric/master/altmetric_logo_title.png)
# rAltmetric
This package provides a way to programmatically retrieve altmetric data from [altmetric.com](http://altmetric.com) for any publication with an appropriate identifier. The package has two major functions: `altmetrics()` downloads metrics and `altmetric_data()` extracts the data into a `data.frame`. It also includes generic S3 methods to plot and print metrics for any altmetric object.
Questions, feature requests and issues should go [here](https://github.com/ropensci/rAltmetric/issues/). General comments to [karthik.ram@gmail.com](mailto:karthik.ram@gmail.com).
# Installing the package
A stable version is available from CRAN. To install it:
```coffee
install.packages('rAltmetric')
```
## Development version
```coffee
# If you don't already have the devtools library, first run
install.packages('devtools')
# then install the package
library(devtools)
install_github('ropensci/rAltmetric')
```
# Quick Tutorial
## Obtaining metrics
There was a recent paper by [Acuna et al](http://www.nature.com/nature/journal/v489/n7415/full/489201a.html) that received a lot of attention on Twitter. What was the impact of that paper?
```coffee
library(rAltmetric)
acuna <- altmetrics(doi = '10.1038/489201a')
> acuna
Altmetrics on: "Future impact: Predicting scientific success" with doi 10.1038/489201a (altmetric_id: 942310) published in Nature.
provider count
1 Feeds 9
2 Google+ 1
3 Cited 174
4 Tweets 157
5 Accounts 167
```
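Identifiers passed with a prefix, such as `doi/10.1038/489201a`, need to be split at the first slash only, because a DOI contains a slash of its own. A minimal sketch of that split follows; `split_identifier()` is a hypothetical helper written for illustration and is not part of the package.

```r
# Hypothetical helper (not part of rAltmetric): split "type/value" at the
# first slash only, so DOIs such as "10.1038/489201a" stay intact.
split_identifier <- function(identifier) {
  stopifnot(is.character(identifier), length(identifier) == 1)
  pos <- regexpr("/", identifier, fixed = TRUE)
  if (pos < 1) stop("Identifier must look like 'type/value'", call. = FALSE)
  list(
    type  = substr(identifier, 1, pos - 1),
    value = substr(identifier, pos + 1, nchar(identifier))
  )
}

parts <- split_identifier("doi/10.1038/489201a")
parts$type   # the prefix before the first slash
parts$value  # everything after it, including the DOI's own slash
```

Splitting on the first slash, rather than on the last path component, keeps the check against the list of accepted prefixes simple.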
## Data
To obtain the metrics in tabular form for further processing, run any object of class `altmetric` through `altmetric_data()` to get data that can easily be written to disk as a spreadsheet.
```coffee
> altmetric_data(acuna)
title
1 Future impact: Predicting scientific success
doi nlmid altmetric_jid issns
1 10.1038/489201a 0410462 4f6fa50a3cf058f610003160 0028-0836
journal altmetric_id schema is_oa cited_by_feeds_count
1 Nature 942310 1.5.4 FALSE 173
cited_by_gplus_count cited_by_posts_count
1 173 173
cited_by_tweeters_count cited_by_accounts_count score
1 156 166 184.598
mendeley connotea citeulike pub sci com doc
1 0 0 11 62 84 6 8
url
1 http://www.nature.com/nature/journal/v489/n7415/full/489201a.html
added_on published_on subjects scopus_subjects
1 1347471425 1347404400 science General
last_updated readers_count X1 count_all count_journal
1 1348828350 11 1 754555 13972
count_similar_age_1m count_similar_age_3m
1 22408 56213
count_similar_age_journal_1m count_similar_age_journal_3m
1 508 1035
rank_all rank_journal rank_similar_age_1m
1 754043 13759 22339
rank_similar_age_3m rank_similar_age_journal_1m
1 56074 459
rank_similar_age_journal_3m pct_all pct_journal
1 947 99.93 98.48
pct_similar_age_1m pct_similar_age_3m
1 99.69 99.75
pct_similar_age_journal_1m pct_similar_age_journal_3m
1 90.35 91.50
details_url
1 http://www.altmetric.com/details.php?citation_id=942310
```
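The single-row layout comes from flattening the nested list returned by the API. The core idea can be sketched offline; the `mock` list below is a made-up stand-in, not real Altmetric output, and the real function does more cleanup than this.

```r
# Illustrative stand-in for a parsed API response (values are made up).
mock <- list(
  title = "Example paper",
  doi = "10.1000/example",
  cited_by_tweeters_count = 12,
  cited_by_feeds_count = 3,
  readers = list(mendeley = 5, citeulike = 1)
)

# unlist() flattens nested entries into names like "readers.mendeley";
# t() turns the named vector into a single row.
flat <- data.frame(t(unlist(mock)), stringsAsFactors = FALSE)

# unlist() coerces everything to character, so convert counts back.
flat$cited_by_tweeters_count <- as.numeric(flat$cited_by_tweeters_count)

names(flat)
nrow(flat)
```

Because `unlist()` returns a character vector when any element is a string, numeric columns have to be converted explicitly before analysis.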
You can save these data into a clean spreadsheet format:
```coffee
acuna_data <- altmetric_data(acuna)
write.csv(acuna_data, file = 'acuna_altmetrics.csv')
```
## Visualization
For any altmetric object you can quickly plot the stats with a generic `plot` function. The plot overlays the [altmetric badge and the score](http://api.altmetric.com/embeds.html) on the top right corner. If you prefer a customized plot, create your own with the raw data generated from `altmetric_data()`.
```coffee
> plot(acuna)
```
![stats for Acuna's paper](https://raw.github.com/ropensci/rAltmetric/master/acuna.png)
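For a customized plot, the `cited_by_*` counts can be reshaped into a named vector and handed to any plotting function. A minimal sketch with base graphics follows; the counts are copied from the printed summary of `acuna` earlier in this README, and the label cleanup is only one possible naming choice.

```r
# Counts taken from the printed summary of `acuna` earlier in this README.
counts <- c(
  cited_by_feeds_count = 9,
  cited_by_gplus_count = 1,
  cited_by_posts_count = 174,
  cited_by_tweeters_count = 157
)

# Strip the common prefix and suffix to get short labels, then sort descending.
names(counts) <- sub("^cited_by_(.*)_count$", "\\1", names(counts))
counts <- sort(counts, decreasing = TRUE)

barplot(counts, ylab = "Hits",
        main = "Future impact: Predicting scientific success")
```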
# Gathering metrics for many DOIs
For a real world use-case, one might want to get metrics on multiple publications. If so, just read them from a spreadsheet and `llply` through them like the example below.
```coffee
# Be sure to update the path if the example csv is not in your working dir
doi_data <- read.csv('dois.csv', header = TRUE)
> doi_data
doi
1 10.1038/nature09210
2 10.1126/science.1187820
3 10.1016/j.tree.2011.01.009
4 10.1086/664183
library(plyr)
# First, let's retrieve the metrics.
raw_metrics <- llply(doi_data$doi, function(x) altmetrics(doi = x), .progress = 'text')
# Now let's pull the data together.
metric_data <- ldply(raw_metrics, altmetric_data)
# Finally we save this to a spreadsheet for further analysis/visualization.
write.csv(metric_data, file = "metric_data.csv")
```
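Some DOIs have no metrics, in which case `altmetrics()` returns `NULL`. The same loop can be sketched in base R, dropping those entries before binding the rows. To keep the example runnable offline, `fetch_one()` is a hypothetical stand-in for the real API call, and its `score` column is a placeholder value.

```r
# Stand-in for altmetrics(doi = x) so the sketch runs without network access.
# It returns NULL for one DOI to mimic a paper with no metrics.
fetch_one <- function(doi) {
  if (doi == "10.1086/664183") return(NULL)
  data.frame(doi = doi, score = nchar(doi), stringsAsFactors = FALSE)
}

dois <- c("10.1038/nature09210", "10.1126/science.1187820", "10.1086/664183")

raw <- lapply(dois, fetch_one)
# Drop papers for which nothing was found, then stack the remaining rows.
found <- Filter(Negate(is.null), raw)
metric_data <- do.call(rbind, found)
metric_data
```

In real use, replace `fetch_one` with a function that calls `altmetrics()` and then `altmetric_data()` on the result.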
## Further reading
* [Metrics: Do metrics matter?](http://www.nature.com/news/2010/100616/full/465860a.html)
* [The altmetrics manifesto](http://altmetrics.org/manifesto/)
To cite package ‘rAltmetric’ in publications use:
```coffee
Karthik Ram (2012). rAltmetric: Retrieves altmerics data for any
published paper from altmetrics.com. R package version 0.3.
http://CRAN.R-project.org/package=rAltmetric
A BibTeX entry for LaTeX users is
@Manual{,
title = {rAltmetric: Retrieves altmerics data for any published paper from
altmetrics.com},
author = {Karthik Ram},
year = {2012},
note = {R package version 0.3},
url = {http://CRAN.R-project.org/package=rAltmetric},
}
```
[![](http://ropensci.org/public_images/github_footer.png)](http://ropensci.org)
This is the reworked version of rAltmetric (temp repo will get merged into rAltmetric shortly)
\ No newline at end of file
init:
ps: |
$ErrorActionPreference = "Stop"
Invoke-WebRequest http://raw.github.com/krlmlr/r-appveyor/master/scripts/appveyor-tool.ps1 -OutFile "..\appveyor-tool.ps1"
Import-Module '..\appveyor-tool.ps1'
install:
ps: Bootstrap
build_script:
- travis-tool.sh install_deps
test_script:
- travis-tool.sh run_tests
on_failure:
- travis-tool.sh dump_logs
notifications:
- provider: Slack
auth_token:
secure: Da1t3XpyPl/xPpY98Ay83DJc6J0p8jsdu5GB3dMVBypGXGSss6D/4rzHdCmN+xNu
channel: builds
Fixed title case and actually ignored cran-comments.md per Prof. Ripley's request.
Updated the package by request of Kurt Hornik's recent email. I excluded the extra top level files in the .Rbuildignore file and also updated the documentation and the roxygen tags.
% Generated by roxygen2 (4.0.2): do not edit by hand
\name{altmetric_data}
\alias{altmetric_data}