What are the basic knowledge points of Rcpp in R language

2025-07-16 Update From: SLTechnology News&Howtos (Shulou.com), 06/02 Report

This article explains the basic knowledge points of Rcpp in the R language in detail. It is shared here as a practical reference.

1. Related configuration and description

When Dirk Eddelbuettel's book Seamless R and C++ Integration with Rcpp was published in 2013, the Rcpp Attributes feature had not yet been accepted on CRAN, so writing and calling Rcpp functions was cumbersome at that time. Rcpp Attributes (2016) greatly simplifies this process ("provides an even more direct connection between C++ and R"), keeps support for inline functions, and provides the sourceCpp function for loading external .cpp files. In other words, we can store a C++ function in a .cpp file and load it from an R script with sourceCpp, just as we would load an R file with source.

For example, to call functions defined in a file named test.cpp from an R script, we can do the following:

    library(Rcpp)
    Sys.setenv("PKG_CXXFLAGS" = "-std=c++11")
    sourceCpp("test.cpp")

The second line sets the compiler flags so that the file is compiled using the C++11 standard.

In the test.cpp file, include the header Rcpp.h and place each function that should be exported to R directly after a // [[Rcpp::export]] comment. If an exported function needs to call other C++ functions, define those helper functions before the // [[Rcpp::export]] comment.

    #include <Rcpp.h>
    using namespace Rcpp;

    // [[Rcpp::export]]

For linear algebra, Rcpp provides RcppArmadillo and RcppEigen. To use one of these packages, declare the dependency at the top of the source file, for example // [[Rcpp::depends(RcppArmadillo)]], and include the relevant header files:

    // [[Rcpp::depends(RcppArmadillo)]]
    #include <RcppArmadillo.h>
    #include <Rcpp.h>
    using namespace Rcpp;
    using namespace arma;

    // [[Rcpp::export]]

The basic knowledge of C++ can be found here.

2. Commonly used data types

Keyword: Description
int / double / bool / String / auto: integer / numeric / Boolean / character string / automatic type deduction (C++11)
IntegerVector: integer vector
NumericVector: numeric vector (element type is double)
ComplexVector: complex vector (the author is not sure about this one)
LogicalVector: logical vector. A logical value in R can take three values (TRUE, FALSE, NA), while a C++ bool has only two (true or false). If an NA from R is converted to a bool in C++, the result is true.
CharacterVector: character vector
ExpressionVector: vector of expression type
RawVector: vector of type raw
IntegerMatrix: integer matrix
NumericMatrix: numeric matrix (element type is double)
LogicalMatrix: logical matrix
CharacterMatrix: character matrix
List (aka GenericVector): list; similar to a list in R, its elements can be of any data type
DataFrame: data frame; inside Rcpp a data frame is actually implemented through List
Function: function type
Environment: environment type; can be used to reference functions in the R environment and functions in other R packages, and to manipulate variables in the R environment
RObject: a type that can be recognized by R
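The LogicalVector entry can be illustrated without Rcpp. The sketch below assumes R's internal encoding, in which a logical value is stored as an int and NA is the smallest int; the names kNaLogical, to_bool, and is_na_logical are hypothetical and chosen only for this illustration.

```cpp
#include <climits>

// Assumption: mirrors R's encoding, where a logical is stored as an int
// and NA is represented by INT_MIN. Hypothetical name, not an Rcpp symbol.
const int kNaLogical = INT_MIN;

// Converting the stored int to a C++ bool: any nonzero value becomes true,
// so NA silently turns into true.
bool to_bool(int r_logical) {
  return r_logical != 0;
}

// Check for NA before converting so that the third state is not lost.
bool is_na_logical(int r_logical) {
  return r_logical == kNaLogical;
}
```

In real Rcpp code the same guard would typically be a comparison against NA_LOGICAL before the value is treated as a bool.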

Note:

Some R objects can be converted to Rcpp objects with as<T>(Some_RObject). For example:

Fit a linear model in R (the result is a List) and pass it into the C++ function.

In R:

    > mod = lm(Y ~ X)

In C++:

    NumericVector resid = as<NumericVector>(mod["residuals"]);
    NumericVector fitted = as<NumericVector>(mod["fitted.values"]);

A NumericVector can be converted to a std::vector with as<std::vector<double> >(Some_RcppVector). For example:

    std::vector<double> vec;
    vec = as<std::vector<double> >(x);

Inside a function, wrap() converts a std::vector to a NumericVector. For example:

    arma::vec long_vec;
    std::vector<double> long_vec2 = arma::conv_to<std::vector<double> >::from(long_vec);
    NumericVector output = wrap(long_vec2);

When the function returns, you can use wrap () to convert the C++ STL type to an R-recognized type. For examples, see the input and output examples section below.

With the exception of Environment (and possibly Function; the author is not sure), most of the data types above can be used directly as function return types and are converted to R objects automatically.

Arithmetic and relational operators: +, -, *, /, ++, --, pow(x, p), <, <=, >, >=, ==, !=. Logical operators: &&, ||, !.

3. Creating common data types

    // 1. Vector
    NumericVector V1(n);  // creates a default-initialized numeric vector V1 of length n
    NumericVector V2 = NumericVector::create(1, 2, 3);  // creates a numeric vector V2 initialized with the three numbers 1, 2 and 3
    LogicalVector V3 = LogicalVector::create(true, false, NA_LOGICAL);  // creates a logical vector V3; converted to an R object it contains TRUE, FALSE and NA

    // 2. Matrix
    NumericMatrix M1(nrow, ncol);  // creates a default-initialized nrow x ncol numeric matrix

    // 3. Multidimensional array
    NumericVector out = NumericVector(Dimension(2, 2, 3));  // creates a multidimensional array (the author is not sure what it is useful for)

    // 4. List
    NumericMatrix y1(2, 2);
    NumericVector y2(5);
    List L = List::create(Named("y1") = y1, Named("y2") = y2);

    // 5. DataFrame
    NumericVector a = NumericVector::create(1, 2, 3);
    CharacterVector b = CharacterVector::create("a", "b", "c");
    std::vector<std::string> c(3);
    c[0] = "A"; c[1] = "B"; c[2] = "C";
    DataFrame DF = DataFrame::create(Named("col1") = a, Named("col2") = b, Named("col3") = c);

4. Element access for common data types

Element access: Description
[n]: for a vector type or a list, accesses the n-th element. For a matrix type, the columns are first concatenated one after another into a single long column vector, and the n-th element of that vector is accessed. Unlike R, n starts at 0.
(i, j): for a matrix type, accesses element (i, j). Unlike R, i and j start at 0. Unlike vectors, parentheses are used here.
List["name1"] / DataFrame["name2"]: accesses the element named name1 in a List / the column named name2 in a DataFrame.

5. Member functions

Member function: Description
X.size(): returns the length of X; applies to vectors and matrices (a matrix is treated as a vector first)
X.push_back(a): adds a to the end of X; for vectors
X.push_front(b): adds b to the beginning of X; for vectors
X.ncol(): returns the number of columns of X
X.nrow(): returns the number of rows of X

6. Syntactic sugar

6.1. Arithmetic and logical operators

+, -, *, /, pow(x, p), <, <=, >, >=, ==, !=, !

All the above operators can be vectorized.

6.2. Common functions

is_na()

Produces a logical sugar expression of the same length. Each element of the result expression evaluates to TRUE if the corresponding input is a missing value, or FALSE otherwise.

seq_len()

seq_len(10) generates an integer vector from 1 to 10 (note: not from 0 to 9), which is very useful in conjunction with sapply() and lapply().

pmin(a, b) and pmax(a, b)

a and b are two vectors. pmin() (or pmax()) compares the i-th elements of a and b and returns the smaller (larger) one.
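To make the element-wise behaviour concrete, here is a minimal sketch in plain C++ on std::vector rather than on Rcpp sugar expressions. The names pmin_sketch and pmax_sketch are hypothetical, and the sketch assumes both inputs have the same length.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Element-wise minimum of two equally long vectors, written out by hand.
// Hypothetical helper; Rcpp's pmin() operates on sugar expressions instead.
std::vector<double> pmin_sketch(const std::vector<double>& a,
                                const std::vector<double>& b) {
  std::vector<double> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    out[i] = std::min(a[i], b[i]);
  }
  return out;
}

// Element-wise maximum, the counterpart of pmax().
std::vector<double> pmax_sketch(const std::vector<double>& a,
                                const std::vector<double>& b) {
  std::vector<double> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    out[i] = std::max(a[i], b[i]);
  }
  return out;
}
```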

ifelse()

ifelse(x > y, x + y, x - y) means: where x > y is true, do the addition; otherwise do the subtraction.
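The same per-element selection can be written as an explicit loop. This is only a sketch in plain C++ under the assumption that x and y have equal length; ifelse_sketch is a hypothetical name, and NA handling (which the sugar version has to deal with) is ignored here.

```cpp
#include <cstddef>
#include <vector>

// For each position i: x[i] + y[i] where x[i] > y[i], otherwise x[i] - y[i].
// Hypothetical helper illustrating what the vectorized ifelse() computes.
std::vector<double> ifelse_sketch(const std::vector<double>& x,
                                  const std::vector<double>& y) {
  std::vector<double> out(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    out[i] = (x[i] > y[i]) ? x[i] + y[i] : x[i] - y[i];
  }
  return out;
}
```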

sapply()

sapply applies a C++ function to each element of the given expression to create a new expression. The type of the resulting expression is deduced by the compiler from the result type of the function.

The function can be a free C++ function, such as the overload generated by the template function below:

    template <typename T>
    T square(const T& x) {
      return x * x;
    }
    sapply(seq_len(10), square<int>);

Alternatively, the function can be a functor whose type has a nested type called result_type:

    // std::unary_function comes from <functional>; it was removed in C++17,
    // so this form needs an older standard such as C++11.
    template <typename T>
    struct square : std::unary_function<T, T> {
      T operator()(const T& x) {
        return x * x;
      }
    };
    sapply(seq_len(10), square<int>());

lapply()

lapply is similar to sapply except that the result is always a list expression (an expression of type VECSXP).

sign()

Other functions

Mathematical functions: abs(), acos(), asin(), atan(), beta(), ceil(), ceiling(), choose(), cos(), cosh(), digamma(), exp(), expm1(), factorial(), floor(), gamma(), lbeta(), lchoose(), lfactorial(), lgamma(), log(), log10(), log1p(), pentagamma(), psigamma(), round(), signif(), sin(), sinh(), sqrt(), tan(), tanh(), tetragamma(), trigamma(), trunc().

Summary functions: mean(), min(), max(), sum(), sd(), and (for vectors) var()

Functions that return a vector: cumsum(), diff(), pmin(), and pmax()

Lookup functions: match(), self_match(), which_max(), which_min()

Duplicate value handling functions: duplicated(), unique()

7. STL

Rcpp code can use the data structures and algorithms of the C++ Standard Template Library (STL), as well as those provided by Boost.

7.1. Iterator

Here is just an example, see C++ Primer for details, or here.

    #include <Rcpp.h>
    using namespace Rcpp;

    // [[Rcpp::export]]
    double sum3(NumericVector x) {
      double total = 0;
      NumericVector::iterator it;
      for (it = x.begin(); it != x.end(); ++it) {
        total += *it;  // dereference the iterator to read the element
      }
      return total;
    }

7.2. Algorithms

The <algorithm> header provides many algorithms that work together with iterators, as you can see here.
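As a small illustration of an iterator-based algorithm, the sketch below sums a vector with std::accumulate, which lives in <numeric> rather than <algorithm>. It uses std::vector so that it compiles without Rcpp; sum_accumulate is a hypothetical name. With a NumericVector the call would look the same, because it also offers begin() and end().

```cpp
#include <numeric>
#include <vector>

// Sum all elements between begin() and end().
// The third argument is the starting value; writing 0.0 (a double) makes
// the accumulation happen in double rather than int.
double sum_accumulate(const std::vector<double>& x) {
  return std::accumulate(x.begin(), x.end(), 0.0);
}
```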

For example, we could write a basic Rcpp version of findInterval() that takes two arguments, a vector of values and a vector of breaks, and locates the bin that each x falls into.

    #include <algorithm>
    #include <Rcpp.h>
    using namespace Rcpp;

    // [[Rcpp::export]]
    IntegerVector findInterval2(NumericVector x, NumericVector breaks) {
      IntegerVector out(x.size());
      NumericVector::iterator it, pos;
      IntegerVector::iterator out_it;

      for (it = x.begin(), out_it = out.begin(); it != x.end(); ++it, ++out_it) {
        // first break that is strictly greater than the current value
        pos = std::upper_bound(breaks.begin(), breaks.end(), *it);
        // its distance from the start of breaks is the bin index
        *out_it = std::distance(breaks.begin(), pos);
      }
      return out;
    }

7.3. Data structures

The data structures provided by STL are also available, and Rcpp knows how to transform the data structures of STL into those of R, so you can return them directly from functions without having to convert them yourself.

Please refer to here for details.

7.3.1. Vectors

For more information, see here

Create

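Vectors are templated, so the element type is given when one is declared: std::vector<T>, for example std::vector<int> or std::vector<double>.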

Element access

Access elements using standard [] symbols

Element increase

Add elements using .push_back().

Storage space allocation

If the length of the vector is known in advance, sufficient storage space can be allocated with .reserve().
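The operations above can be sketched together in a few lines of plain C++. The function name first_squares is hypothetical and serves only as an illustration.

```cpp
#include <cstddef>
#include <vector>

// Collect the squares 0, 1, 4, 9, ... of the first n non-negative integers.
std::vector<int> first_squares(std::size_t n) {
  std::vector<int> out;  // create an empty vector of int
  out.reserve(n);        // allocate room for n elements up front
  for (std::size_t i = 0; i < n; ++i) {
    out.push_back(static_cast<int>(i * i));  // append without reallocating
  }
  return out;
}
```

Note that reserve() changes only the capacity: size() stays 0 until elements are pushed, so out[i] must not be used before the corresponding push_back().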

Example:

The following code implements run length encoding (rle()). It produces two vectors of output: a vector of values, and a vector lengths giving how many times each element is repeated. It works by looping through the input vector x and comparing each value to the previous one: if it's the same, it increments the last value in lengths; if it's different, it adds the value to the end of values and sets the corresponding length to 1.

    #include <Rcpp.h>
    #include <vector>
    using namespace Rcpp;

    // [[Rcpp::export]]
    List rleC(NumericVector x) {
      std::vector<int> lengths;
      std::vector<double> values;

      // Initialise first value
      int i = 0;
      double prev = x[0];
      values.push_back(prev);
      lengths.push_back(1);

      NumericVector::iterator it;
      for (it = x.begin() + 1; it != x.end(); ++it) {
        if (prev == *it) {
          lengths[i]++;  // same value: extend the current run
        } else {
          values.push_back(*it);  // new value: start a new run
          lengths.push_back(1);
          i++;
          prev = *it;
        }
      }
      return List::create(_["lengths"] = lengths, _["values"] = values);
    }

7.3.2. Sets

See links 1, 2 and 3.

The STL set std::set does not allow repeated elements, while std::multiset does. Sets are useful for detecting duplicates and for finding the distinct elements (as in unique, duplicated, or %in%).

Ordered set: std::set and std::multiset.

Unordered set: std::unordered_set

Generally speaking, unordered sets are faster because they use a hash table instead of a tree.

As with vectors, the element type is given when a set is declared: std::unordered_set<T>, for example std::unordered_set<int>.
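As an illustration of duplicate detection with a set, here is a minimal sketch of a duplicated()-style function in plain C++ (C++11). It works on std::vector<int> instead of an Rcpp vector so that it compiles standalone; duplicated_sketch is a hypothetical name.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// For each element, report whether the same value appeared earlier in x.
std::vector<bool> duplicated_sketch(const std::vector<int>& x) {
  std::unordered_set<int> seen;
  std::vector<bool> out(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    // insert() returns a pair; .second is false when the value
    // was already in the set, i.e. when x[i] is a duplicate.
    out[i] = !seen.insert(x[i]).second;
  }
  return out;
}
```

For the input 1, 2, 1, 3, 2 the result is false, false, true, false, true.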

7.3.3. Maps

Maps are closely related to table() and match().

Ordered map: std::map

Unordered map: std::unordered_map

Since maps have a value and a key, you need to specify both types when initialising a map:

std::map<K, V> or std::unordered_map<K, V>, for example std::map<double, int>.
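The connection to table() can be sketched as follows: a map from each distinct value to the number of times it occurs. This is plain C++ (C++11) on std::vector<int>, and table_sketch is a hypothetical name.

```cpp
#include <map>
#include <vector>

// Count how often each value occurs. The key type is the element type,
// the mapped type is the count.
std::map<int, int> table_sketch(const std::vector<int>& x) {
  std::map<int, int> counts;
  for (int v : x) {
    // operator[] inserts a zero-initialised count for a new key,
    // so the increment works for first and repeated occurrences alike.
    ++counts[v];
  }
  return counts;
}
```

Because std::map is ordered, iterating over the result visits the keys in increasing order; with std::unordered_map the order would be unspecified.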

8. Interaction with R environment

Through Environment, Rcpp can access the variables and loaded functions in the current R global environment (Global Environment) and modify the variables there. We can also get functions from other R packages through Environment and use them in Rcpp.

Get functions in other R packages

    Rcpp::Environment stats("package:stats");
    Rcpp::Function rnorm = stats["rnorm"];
    return rnorm(10, Rcpp::Named("sd", 100.0));

Get the variables in the R global environment and make changes

Suppose that there is a numeric vector x in the global environment of R, and we want to change its value in Rcpp.

    Rcpp::Environment global = Rcpp::Environment::global_env();  // get the global environment and assign it to the Environment variable global
    Rcpp::NumericVector tmp = global["x"];  // get x
    tmp = Rcpp::pow(tmp, 2);  // square each element
    global["x"] = tmp;  // assign the new value to x in the global environment

Get the loaded functions in the R global environment

Suppose there is an R function funR in the global environment, which is defined as:

Xsancc (1 funR 2);
