The selectr package for R parses CSS3 Selectors and translates them to XPath 1.0 expressions. It is an R port of the cssselect package for Python. Development occurs on GitHub in the selectr repository.
The main purpose of this package is to make working with (X)HTML and XML documents easier within R. The XML and xml2 packages are typically used when working with these documents, but they can only select content based on XPath expressions. Because selectr translates CSS selectors to XPath, the more familiar CSS selector syntax can be used instead.
Download
selectr is available on CRAN, so it can be installed by running the following command:
install.packages("selectr")
Usage
The most basic and flexible function provided by selectr is css_to_xpath(). It simply translates a vector of CSS Selectors to their equivalent XPath expressions.
> library(selectr)
> css_to_xpath("div > a")
[1] "descendant-or-self::div/a"
> css_to_xpath("div:nth-child(2) > a")
[1] "descendant-or-self::div[count(preceding-sibling::*) = 1]/a"
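Because css_to_xpath() accepts a vector of selectors, several selectors can be translated at once and the results passed to any function that only understands XPath. The following is a minimal sketch, assuming the xml2 package is installed; the document and the variable names are made up for illustration.

```r
library(selectr)
library(xml2)

# A small in-memory document, invented for this illustration.
doc <- read_xml(
  "<root><div><a>one</a></div><div><a>two</a><p><a>three</a></p></div></root>"
)

# css_to_xpath() is vectorised: it returns one XPath expression per selector.
selectors <- c("div > a", "p a")
xpaths <- css_to_xpath(selectors)

# Each translated expression can be handed to an XPath-only function,
# here xml2::xml_find_all().
direct_links <- xml_text(xml_find_all(doc, xpaths[1]))
nested_links <- xml_text(xml_find_all(doc, xpaths[2]))

print(xpaths)
print(direct_links)
print(nested_links)
```

Here the child combinator in "div > a" should select only the links directly inside a div, while "p a" should select the link nested anywhere inside the paragraph.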
A common task is to search for matching nodes within a document. selectr makes this task easier by mimicking the behaviour of DOM methods present in JavaScript (querySelector() and querySelectorAll()).
> library(XML)
> fileName <- system.file("exampleData", "test.xml", package="XML")
> mydoc <- xmlParse(fileName)
> querySelector(mydoc, "a")
<a>
<!-- A comment -->
<b>
%extEnt;
</b>
</a>
> querySelectorAll(mydoc, "code")
[[1]]
<code>
xmlTreeParse("test.xml", replaceEntities = TRUE)
</code>
[[2]]
<code>
xmlTreeParse("test.xml")
</code>
attr(,"class")
[1] "XMLNodeSet"
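The same two functions can also be tried on a document parsed with xml2 rather than XML. The sketch below assumes that xml2 is installed and that the installed version of selectr supports xml2 documents; the document itself is invented for illustration.

```r
library(selectr)
library(xml2)

# A small in-memory document, invented for this illustration.
doc <- read_xml(
  "<doc><code>first</code><section><code>second</code></section></doc>"
)

# querySelector() returns only the first matching node.
first_code <- querySelector(doc, "code")

# querySelectorAll() returns every matching node.
all_code <- querySelectorAll(doc, "code")

# When nothing matches, querySelector() returns NULL.
missing <- querySelector(doc, "table")

print(xml_text(first_code))
print(xml_text(all_code))
print(is.null(missing))
```

Checking the result of querySelector() with is.null() before using it is a simple way to handle selectors that match nothing.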
Further Information
A technical report describes this package in more detail and includes examples of how to use selectr.
The package documentation also goes into more detail on the usage of each particular function.