String Manipulation in R

A data scientist loves numbers, but most human-generated data is a combination of characters. Let me give you a small example:

Price of Camera1= $5000

Price of Camera2= 5,000

Price of Camera3= 5000

To you and me, all three values are the same, and their average is 5000. To a computer, however, the first two values are strings, while the third is an integer. The computer cannot calculate the average without type conversions.
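As a rough illustration of the problem, the sketch below stores the three camera prices as they might arrive from a file and converts them to numbers before averaging. The variable names are hypothetical, and the use of gsub() to strip the currency symbol and the comma is an assumption made for illustration, not something shown in this post.

```r
# The three prices from the example, stored as text (hypothetical variable name).
# In a single vector R keeps them all as character, including the plain 5000.
prices <- c("$5000", "5,000", "5000")

# Calling mean() on character data does not work: R returns NA with a warning.
# mean(prices)

# Assumption for illustration: drop every character that is not a digit
# or a decimal point, then convert the cleaned text to numeric.
cleaned <- as.numeric(gsub("[^0-9.]", "", prices))

# Now the average can be computed.
average_price <- mean(cleaned)
print(average_price)
```

After the conversion all three values are numeric, so mean() behaves as a human reader would expect.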





This is where string manipulation comes in handy.

A few examples are:

1. String length: nchar(string_name)

2. Conversion to lower or upper case: tolower(string_name), toupper(string_name)

3. Breaking a string at a pivot: strsplit(string_name, split_char)

4. Concatenating strings: paste(string1, string2, ..., stringn)


A more C-like form of string concatenation is also available: sprintf("%s%s%d", string1, string2, integer1)

5. Extracting a substring: substr(string_name, start=(included), stop=(included))

6. Converting a string to an integer and vice versa: as.<datatype>(variable), for example as.integer(variable) or as.character(variable)
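The functions listed above can be tried out with a short sketch like the following. The sample string and variable names are hypothetical and chosen only for illustration; all of the functions used are part of base R.

```r
# Hypothetical sample string used throughout
camera <- "Canon EOS"

# 1. String length (the space counts as a character)
len <- nchar(camera)

# 2. Case conversion
lower <- tolower(camera)
upper <- toupper(camera)

# 3. Split at a pivot; strsplit() returns a list, so take the first element
parts <- strsplit(camera, " ")[[1]]

# 4. Concatenation; paste() inserts a space by default, paste0() inserts nothing
joined <- paste("Canon", "EOS")
joined_tight <- paste0("Canon", "EOS")

# C-style formatting: two strings followed by an integer
formatted <- sprintf("%s%s%d", "Canon", "EOS", 5000L)

# 5. Substring; both start and stop positions are included
brand <- substr(camera, start = 1, stop = 5)

# 6. Type conversion in both directions
price_int <- as.integer("5000")
price_chr <- as.character(5000L)

print(list(len, lower, upper, parts, joined, joined_tight,
           formatted, brand, price_int, price_chr))
```

Note that strsplit() returns a list with one element per input string, which is why the sketch indexes it with [[1]].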


The above code is available at:

String matching and replacement is another important area; it will be discussed in a subsequent blog post.

