The aim of this work for the Google Summer of Code was to improve upon the grImport package for the R Project. grImport is a package for importing PostScript images into R, where they are intended to be used within plots created with R graphics. The original goal was to accurately recreate the state flags of the United States of America. The set of flags is already hosted online by Toby Dylan Hocking (at http://sugiyama-www.cs.titech.ac.jp/~toby/flags/index.html) and serves as a good set of test cases to see where grImport needs improvement.
One example of a state flag that needs improvement is that of Colorado. This is demonstrated in Figure 1 where the original flag is shown on the left, and grImport's attempt at importing and rendering the image is shown on the right.
What Figure 1 shows us is that grImport is not parsing rectangles correctly. More specifically, grImport does not understand PostScript rectangle operators such as rectstroke, and consequently it cannot draw them. To correct this rather obvious flaw, support for these operators was added to the grImport package, which now renders many flags correctly without any further changes necessary. The corrected Colorado state flag is shown below in Figure 2.
Figure 2 shows us that this simple fix does indeed correct the rendering of the state flag of Colorado. In fact, it greatly improves the appearance of most state flags. However, limitations in both the PostScript image format and R graphics mean that it is simply not possible for all of the state flags to be imported by grImport. An example of this is PostScript's lack of support for semi-transparency when painting within graphics objects, e.g. a path or a rectangle. Additionally, R graphics does not support clipping to paths, only to horizontally and vertically aligned rectangles. These are fundamental limitations that cannot be worked around by grImport, and they were the primary motivation for the development of the grImport2 package.
To begin with, let us demonstrate an example where the grImport2 package offers an improvement. This is shown in Figure 3, where the flag of Hawaii renders with a few errors in grImport. The key problem with grImport's rendering is that clipping is not applied to either the horizontal stripes or the boundaries of the Union Jack. grImport does not perform clipping because PostScript clips to a path, which is not possible with the R graphics engine. However, because grImport2 can work out the bounding box of a clipping path, it can clip to that bounding box instead. The result is that the flag of Hawaii now renders correctly with grImport2, where it could not do so with the existing grImport package.
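The idea of clipping to the bounding box of a clipping path can be sketched with grid alone. This is a minimal illustration under stated assumptions, not the code grImport2 uses: the triangular path and the names `clip_x`, `clip_y`, `bbox_vp` and `draw_clipped()` are all hypothetical.

```r
library(grid)

# Hypothetical clipping path: a triangle in normalised (0 to 1) coordinates.
clip_x <- c(0.2, 0.8, 0.5)
clip_y <- c(0.2, 0.2, 0.9)

# Bounding box of the clipping path.
bbox <- list(x = min(clip_x), y = min(clip_y),
             width = diff(range(clip_x)), height = diff(range(clip_y)))

# A viewport covering the bounding box, with rectangular clipping turned on.
bbox_vp <- viewport(x = unit(bbox$x, "npc"), y = unit(bbox$y, "npc"),
                    width = unit(bbox$width, "npc"),
                    height = unit(bbox$height, "npc"),
                    just = c("left", "bottom"), clip = "on")

draw_clipped <- function() {
  grid.newpage()
  pushViewport(bbox_vp)
  # This circle overflows the viewport, so it is cut off at the bounding box.
  grid.circle(x = 0.5, y = 0.5, r = 0.7, gp = gpar(fill = "navy"))
  popViewport()
}

if (interactive()) draw_clipped()
```

The drawing is cut off at the rectangle rather than at the triangle itself, which is why this approach works for axis-aligned clipping regions but only approximates diagonal ones.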
Despite the aforementioned improvements, there are some features of images that cannot be rendered in R graphics at all. We have already mentioned clipping to a path, but another example is gradient fills. A state flag that uses this feature extensively is the state flag of Kansas, where many of the central features of the flag are filled using linear gradients. This graphical feature is not currently supported by the R graphics engine, but it is supported by the gridSVG package. The gridSVG package enables us to use features of SVG within R, the key limitations being that the image must be a grid plot, and that the resulting image must be an SVG image generated by gridSVG.
Figure 4 demonstrates that if we support linear gradients, as is the case when we render with gridSVG, the state flag of Kansas can render correctly.
To summarise, while it is possible to further improve grImport's understanding of PostScript, further development of grImport is fundamentally limited by the features of both PostScript and the R graphics engine. These limitations are not present when using SVG as a source image format instead of PostScript. These are the primary motivations for the development of a new package, grImport2, rather than further extending the grImport package.
grImport2 targets the SVG image format, which contains a superset of the functionality of the R graphics engine. One of the key advantages of SVG is that it is a grammar of XML, which means that it is straightforward to parse with the XML package. For grImport2, the use of the SVG format is similar to grImport's use of its custom RGML image format, which is also a grammar of XML. SVG and RGML are similar because both XML grammars attempt to describe how to draw an image. The key difference is that grImport2 can import the SVG image immediately, whereas grImport must first create the RGML description, which can then be imported. Another key benefit of targeting the SVG image format is that we not only have complete access to the features of the R graphics engine, but also to the features of SVG when the gridSVG package is used.
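To illustrate why an XML grammar is convenient to work with, the following sketch parses a tiny hand-written SVG fragment with the XML package. The fragment and the variable names are hypothetical, and this is not a description of how grImport2 itself walks an SVG document; the code is skipped if the XML package is not installed.

```r
# A small, hand-written SVG image (hypothetical example content).
svg_text <- '<svg xmlns="http://www.w3.org/2000/svg" width="60pt" height="40pt">
  <g>
    <rect x="0" y="0" width="60" height="40" style="fill:rgb(0%,0%,50%)"/>
    <path d="M 10 10 L 50 10 L 30 30 Z" style="fill:rgb(100%,100%,100%)"/>
  </g>
</svg>'

if (requireNamespace("XML", quietly = TRUE)) {
  doc <- XML::xmlParse(svg_text, asText = TRUE)
  root <- XML::xmlRoot(doc)

  # The single <g> element groups the drawing operations.
  group <- XML::xmlChildren(root)[["g"]]

  # Names of the child elements, ignoring any whitespace text nodes.
  element_names <- setdiff(names(XML::xmlChildren(group)), "text")

  # The path data describes the outline to draw.
  path_data <- XML::xmlGetAttr(XML::xmlChildren(group)[["path"]], "d")
}
```

Each element maps naturally onto a drawing operation, which is the property that makes SVG (and RGML) easy to turn into R graphics objects.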
It may indeed be the case that PDF supports all of the required graphical features, but it would still require the PDF images to be parsed in some way. It is this parsing step that is trivial with the SVG image format (with the XML package) but far more difficult with PDF. In order to parse PDF, we would need to write a fairly large amount of code, especially considering that most existing PDF libraries do not expose the graphical components of a PDF document. They instead choose to provide only a high-level rendering function. Furthermore, even if PDF were more graphically sophisticated, SVG is sufficiently capable of drawing plots that are visually identical to PDF.
A complication with SVG is that, unlike PostScript and PDF, it does not require the use of a physical coordinate system (e.g. points). This, among other things (such as attribute precedence), complicates the process of importing an SVG image. However, if we restrict the SVG we support to the SVG that is generated by the Cairo graphics library, then we know that the SVG images that we import are guaranteed to belong to a common coordinate system and that other features of the image will be flattened, making it easier to parse. In particular, this means that grImport2 does not need to parse all of the SVG image format, only the subset of SVG that Cairo generates. Furthermore, because Cairo is a supported graphics renderer for many open source libraries that import vector images, we know that it is possible for many input formats to be mapped to the Cairo subset of SVG.
We have established that the task of importing vector images for grImport2 is greatly simplified when grImport2 imports only the subset of SVG that the Cairo graphics library draws. Additionally, it is not necessary to target one specific image format; instead we can leverage existing libraries that already handle the specific image formats. However, grImport2's responsibility is importing graphics, not dealing with image formats. The grConvert package has been created to produce compliant vector images for importing with either grImport or grImport2.
The grConvert package takes advantage of the fact that there are several mature, performant, and easy-to-use libraries for parsing popular vector image formats. Another key advantage of these libraries is that they support rendering by the Cairo graphics library. This means that we can take any PostScript, PDF, or SVG image and render it to any image format that Cairo supports. grConvert only ever needs to use Cairo to render to PostScript for grImport, or to SVG for grImport2.
The interface provided by grConvert is minimal as there is only one function, convertPicture(). This function takes an input vector image in one of the supported formats (PostScript, PDF or SVG) and exports either PostScript for grImport, or SVG for grImport2. By default, the types of the input and output images are inferred from their file extensions to make most conversions as simple as possible.
As an example of convertPicture(), consider the cases shown in Figure 5, where we convert various input images to SVG, and also to PostScript.
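A conversion of this kind might look like the sketch below. The two-argument call (input file, then output file) is an assumption about the convertPicture() signature, the file names are hypothetical, and `convert_if_possible()` is an illustrative wrapper that simply skips the conversion when grConvert or the input file is unavailable.

```r
# Hypothetical wrapper: convert a vector image with grConvert, if possible.
convert_if_possible <- function(input, output) {
  if (!requireNamespace("grConvert", quietly = TRUE) || !file.exists(input)) {
    message("Skipping conversion of ", input)
    return(invisible(FALSE))
  }
  # Assumed call: input file first, output file second; the formats are
  # inferred from the file extensions.
  grConvert::convertPicture(input, output)
  invisible(TRUE)
}

convert_if_possible("flag.pdf", "flag.svg")  # PDF to SVG, for grImport2
convert_if_possible("flag.svg", "flag.ps")   # SVG to PostScript, for grImport
```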
This simple interface allows us to create supported input images for grImport and grImport2. For grImport, this was a task previously left as an exercise for the user, but it is now much more convenient with the grConvert package. The convenience of converting between vector image formats in R reduces the time and effort needed to produce valid images for grImport or grImport2. Furthermore, in the case of grImport2, it also guarantees that the SVG produced contains only the restricted SVG that grImport2 targets.
At this point in the technical report, we have demonstrated some of the capabilities of grImport2, but have not yet shown how this is performed. There are only two steps necessary to draw images with grImport2: reading and drawing. Reading a picture is performed using the readPicture() function. It expects a filename or remote location that points to an image, which is consumed to create a Picture object. This object contains all of the information necessary to reproduce the imported image in R. Once the image has been read in using readPicture(), we can reproduce the image within R. An example of this is shown in Figure 6, where we have a flag of New Zealand, which will be converted by grConvert into Cairo SVG, and then read into R.
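The reading step can be sketched as follows. The file name "nz-cairo.svg" is hypothetical and is assumed to be a Cairo SVG image already produced by grConvert; `read_flag()` is an illustrative wrapper that returns NULL when grImport2 or the file is missing.

```r
# Hypothetical wrapper around readPicture() that fails gracefully.
read_flag <- function(svg_file) {
  if (!requireNamespace("grImport2", quietly = TRUE) || !file.exists(svg_file)) {
    message("grImport2 or ", svg_file, " is not available")
    return(NULL)
  }
  # readPicture() consumes the Cairo SVG file and returns a Picture object.
  grImport2::readPicture(svg_file)
}

nz <- read_flag("nz-cairo.svg")
```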
There are two high-level functions we can use to draw this Picture object. We will first consider grid.picture(), a function intended for use with the grid graphics system. It takes a Picture object and enables it to be drawn within any grid viewport. For example, we can draw it to the entire page.
One problem that we can observe from the rendered image in Figure 7 is that the lines in the Union Jack are not clipped at all. To correct this problem, grImport2 can clip to the bounding box of any clipping paths in the image. This is demonstrated in Figure 8.
Clipping to the bounding box of any clipping paths greatly improves the appearance of the flag, at least for this simple image. The rendering is not yet perfect, because the diagonal lines in the Union Jack are not clipped properly. However, we have reached the limitations of R graphics, as it does not support clipping to arbitrary paths. We can leverage the gridSVG package to clip to these paths when it exports an SVG image. This is illustrated in Figure 9.
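The three rendering modes discussed here (no clipping, clipping to bounding boxes, and clipping to paths via gridSVG) could be selected as in the sketch below. The `ext` argument and its values are an assumption about the grid.picture() interface and may differ between versions of grImport2; `draw_flag()` and the output file name are hypothetical.

```r
# Hypothetical helper: draw a Picture object with a chosen clipping mode.
draw_flag <- function(picture, mode = c("none", "clipbbox", "gridSVG")) {
  mode <- match.arg(mode)
  if (is.null(picture) || !requireNamespace("grImport2", quietly = TRUE)) {
    message("Nothing to draw")
    return(invisible(FALSE))
  }
  grid::grid.newpage()
  # Assumed argument: `ext` chooses how clipping paths are handled.
  grImport2::grid.picture(picture, ext = mode)
  if (mode == "gridSVG" && requireNamespace("gridSVG", quietly = TRUE)) {
    # Path clipping only appears in the SVG file that gridSVG exports.
    gridSVG::grid.export("flag-out.svg")
  }
  invisible(TRUE)
}
```

Under these assumptions, `draw_flag(pic, "clipbbox")` would draw with rectangular clipping on the current device, while `draw_flag(pic, "gridSVG")` would additionally write an SVG file in which the clipping paths are honoured.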
The use of clipping paths instead of clipping rectangles improves the appearance of our flag so that it now renders identically to the source image.
We have focused on clipping, but there are further graphical features that we can use. For example, the central features of the state flag of Kansas contain linear gradients. Without gridSVG these gradients are not shown at all; instead the shapes are drawn using the current fill colour. This means that the use of gridSVG is beneficial and, in cases like these, essential for accurate reproduction of imported graphics.
The second high-level function that grImport2 provides for working with Picture objects is grid.symbols(). It is used for drawing several copies of a picture at a set of (x, y) locations, with specified sizes. For statistical graphics purposes, the most obvious use for this function is to create custom plotting symbols.
To demonstrate how this may be used, we will use a custom panel function for the lattice package. The panel function will give us some (x, y) locations to plot at and will also give a random size for each of the pieces of the plot.
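A panel function along these lines is sketched below. The arguments passed to grid.symbols() (`size` and `default.units`) are assumptions about its interface, and `make_symbol_panel()` is a hypothetical helper; when no picture or no grImport2 is available it falls back to ordinary grid.points().

```r
library(grid)

# Hypothetical factory: build a lattice panel function that draws `picture`
# at each (x, y) location with a random size.
make_symbol_panel <- function(picture) {
  function(x, y, ...) {
    sizes <- unit(runif(length(x), min = 5, max = 15), "mm")
    if (!is.null(picture) && requireNamespace("grImport2", quietly = TRUE)) {
      # Assumed arguments: locations in native units, one size per symbol.
      grImport2::grid.symbols(picture, x, y, size = sizes,
                              default.units = "native")
    } else {
      # Fallback so that the sketch still draws something.
      grid.points(x, y, size = sizes, default.units = "native")
    }
  }
}
```

Such a panel function could then be supplied to a lattice call, for example `lattice::xyplot(dist ~ speed, data = cars, panel = make_symbol_panel(pic))`, where `pic` is a previously imported Picture object.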
Figure 11 shows that we have an easy way of using custom data symbols in grid plots if we use grid.symbols() instead of the grid.points() function. One drawback, however, is that grid.symbols() can only draw one Picture object repeatedly. Drawing multiple custom symbols requires multiple calls to grid.symbols().
One of the things you may want to do is change the appearance of a picture once it has been imported. There are two ways to do this, but we will first discuss the most convenient method; the second will be discussed later in this technical report.
The most convenient way to modify images imported by grImport2 is to change their graphical parameters, for example the colours that are used within a flag. Both the grid.picture() and grid.symbols() functions have an optional argument, gpFUN. This argument should be a function that takes a grid gpar object and returns a modified version of that object. This is a much simpler approach than writing custom methods for Picture and related objects (as is the case with grImport). Primarily this is because the task of creating graphics objects should be undertaken by grImport2, whereas the graphical parameters of pictures should be configurable by the user.
As we established earlier, the graphical parameters can be modified by supplying a function that takes a gpar object and returns a gpar object. We will construct a function that desaturates all of the colours in a gpar object, and apply that to our flag of New Zealand.
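One way such a function could be written is sketched below. The helper names `desaturate_colour()` and `desaturate_gpar()` are hypothetical, and the luminance weights are an assumed choice of greyscale formula rather than the one used in the original report.

```r
library(grid)

# Convert one or more colours to grey, keeping any alpha channel.
desaturate_colour <- function(col) {
  rgba <- grDevices::col2rgb(col, alpha = TRUE)
  # Weighted sum of the channels (ITU-R BT.601 weights, an assumed choice).
  lum <- round(0.299 * rgba["red", ] +
               0.587 * rgba["green", ] +
               0.114 * rgba["blue", ])
  grDevices::rgb(lum, lum, lum, rgba["alpha", ], maxColorValue = 255)
}

# Takes a gpar object and returns a gpar object with grey colours.
desaturate_gpar <- function(gp) {
  if (!is.null(gp$col)) gp$col <- desaturate_colour(gp$col)
  if (!is.null(gp$fill)) gp$fill <- desaturate_colour(gp$fill)
  gp
}

grey_gp <- desaturate_gpar(gpar(col = "white", fill = "red", lwd = 2))
```

A function of this shape is what would be passed as the gpFUN argument, for example `grid.picture(pic, gpFUN = desaturate_gpar)` for an imported Picture object `pic`. Parameters other than the colours, such as the line width above, pass through unchanged.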
Figure 12 shows how the gpFUN argument can be used to modify the appearance of an imported image. The flag of New Zealand now appears in greyscale colours. Not only is it possible to change the colours of an image, but it is also possible to manipulate the image by showing only the strokes in the image.
Many of the examples used so far appear relatively simple, perhaps with the exception of the state flag of Kansas (Figures 4 and 10). However, because the software libraries that grConvert relies upon are sufficiently mature, the input images for grImport2 can be just as sophisticated as the images from which they were sourced. For example, a type of picture that typically contains a lot of content is a page in a PDF document. We will demonstrate that grImport2 can render complex pictures using a page from the grImport article published in the Journal of Statistical Software. Figure 13 also illustrates the use of the argument that selects the page that will be converted from the source document to PostScript or SVG.
When rendering to the PDF graphics device, one can zoom in and observe that the image produced by grImport2 is visually indistinguishable from the original page in the PDF document. The main reason why the content of the page is reproduced with a high degree of accuracy is that the Cairo graphics library translates text into an equivalent representation using SVG paths. A consequence of this translation is that grImport2 cannot import text as text; only the shapes are imported.
Much of the content of this technical report has discussed the motivation for and the benefits of using grImport2, but this does not necessarily mean that grImport should be discarded. There are features that grImport has that grImport2 lacks. One of these features is the ability to import text as text. This means that it is possible to import a PostScript image containing text, then modify not only the text itself, but also its appearance (e.g. by changing its font). grImport2 can only import text as a path. This approach is problematic because paths are more difficult to modify and therefore generally support modification only to their graphical parameters.
Another advantage of grImport is that the R objects it creates to represent the imported image are far simpler, consisting only of paths. This means that it is far simpler to modify an image if we know how to modify a path, rather than dealing with the multitude of groups and other types of objects that grImport2 creates. To demonstrate this, first consider the simple subset operation to show only a specific set of content in Figure 14.
The subsetting operation shown in Figure 14 simply reduces the image to show only paths 3 to 16, 18 and 27. Because everything drawn in grImport is a path, and the imported picture is simply a series of path drawing operations, the subsetting operation amounts to keeping only some of the paths.
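A subsetting operation of this kind might be written as in the following sketch. It assumes that grImport Picture objects can be subset with `[`, and that an RGML file has already been created for the image; `show_subset()` and any file name passed to it are hypothetical, and the function does nothing if grImport or the file is unavailable.

```r
# Hypothetical helper: read an RGML file with grImport and draw only the
# selected paths.
show_subset <- function(rgml_file, keep = c(3:16, 18, 27)) {
  if (!requireNamespace("grImport", quietly = TRUE) || !file.exists(rgml_file)) {
    message("grImport or ", rgml_file, " is not available")
    return(invisible(NULL))
  }
  picture <- grImport::readPicture(rgml_file)
  # Assumed: `[` keeps only the chosen paths of the picture.
  kept <- picture[keep]
  grImport::grid.picture(kept)
  invisible(kept)
}
```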
Performing similar manipulations of an imported picture using grImport2 is a more complicated process. Firstly, as is the case with grImport, we do not have any convenient way of knowing, for example, which paths in the flag of New Zealand represent the stars or the Union Jack. A second complication is that the Picture object has a structure that closely matches that of the source Cairo SVG image. This means that there are often trees of graphical content objects. grImport does not suffer from this complication as it manages a series of drawing operations, rather than any trees of graphical content.
Despite these complications, and after some trial and error, we can subset or add content to a Picture object. Figure 15 shows an example where the Union Jack on the flag of New Zealand is removed using subsetting operations.
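The kind of tree manipulation involved can be illustrated with a mock structure. The nested list below merely stands in for a picture containing a single group with several children; it is not a real Picture object, and the labels are hypothetical (imported pictures carry no such labels, which is why trial and error is needed to find the right indices).

```r
# Mock stand-in for an imported picture: one group holding several children.
mock_picture <- list(content = list(
  list(type = "group", content = list(
    list(type = "rect", label = "background"),
    list(type = "path", label = "union-jack-1"),
    list(type = "path", label = "union-jack-2"),
    list(type = "path", label = "star")
  ))
))

# Drop children of the first (and only) group by index.
drop_children <- function(picture, indices) {
  group <- picture$content[[1]]
  group$content <- group$content[-indices]
  picture$content[[1]] <- group
  picture
}

trimmed <- drop_children(mock_picture, 2:3)
remaining <- vapply(trimmed$content[[1]]$content,
                    function(child) child$label, character(1))
```

After the call, `remaining` holds the labels of the children that are left, here the background and the star.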
In general, when modifying Picture objects, it may be useful to know that they closely match the structure of the SVG image that they were sourced from. For example, when modifying the flag in Figure 15, we knew that the image was composed of a single PictureGroup object, which contains many children. With this information at hand, it is reasonably simple (albeit by trial and error) to work out which children of the PictureGroup are responsible for drawing the Union Jack. However, it is clear that grImport does make this type of task much simpler.
Another advantage of grImport is that it should be expected to work with any PostScript image, whereas grImport2 is dependent not only on SVG, but also on the specific subset of SVG that Cairo generates. grImport2 relies heavily on the structure of Cairo's SVG output, so any change to Cairo could cause grImport2 to import the image incorrectly, or perhaps fail altogether. This is an important dependency to note, but the Cairo SVG surface appears to be quite stable. It is unlikely to change in such a drastic manner that it would invalidate much of the parsing behaviour of grImport2.
In summary, if the goal is to accurately reproduce images in R then grImport2 is a more complete solution. However, grImport is simpler and its Picture objects can be manipulated more easily; if that is of greater importance then grImport may be a preferable solution. The cost of using grImport instead of grImport2 is that reproduction of images is unlikely to be as accurate.
We have demonstrated work performed as part of the Google Summer of Code for the R Project. This work resulted in minor changes to the grImport package, often leading to significant improvements in rendering. The grImport2 package was also created in order to work around limitations in the PostScript image format, in addition to limitations in R graphics. It is able to reproduce imported images with high fidelity, and when used with the gridSVG package, images can be reproduced more faithfully than R graphics alone is capable of drawing.
In addition to developing improved tools for importing vector graphics, the grConvert package has been developed. It provides a convenient R interface for converting popular vector image formats into a format suitable for use in both grImport and grImport2.
The work demonstrated in this technical report was undertaken as part of the Google Summer of Code 2013. Google funded the work entirely and without their support the development of this software would not have occurred.
The original versions of the flags shown in this technical report are freely available and have been sourced from the Wikimedia Commons. Thanks to the authors of these flags for providing such good examples for test cases.
This document is licensed under a Creative Commons Attribution 3.0 New Zealand License.
The code is freely available under the GPL. The described functionality of grConvert, grImport and grImport2 is present in the latest revisions of the packages on R-Forge.
In addition to the sources for the technical report and the packages described in the technical report, there is also a web page that shows and describes the improvements made to the rendering of the state flags of the USA. The web page is available at: https://dl.dropboxusercontent.com/u/54315147/import/state-table.html, with the source code for generating the web page available on GitHub at: https://github.com/sjp/grImport-flag-tests.