
How to divide a vector into multiple groups using regex?

julie asked 1 week ago

I tried to adapt this solution, which groups a vector by regular expressions, to multiple groups, but it fails and I can’t figure out what I’m doing wrong. Another solution didn’t help me either.

x1 <- gsub(paste0("(^a?A?pr)|(^a?A?ug)|(d?D?ec)"),
           "\\1 \\2 \\3", x)
> unique(x1)
[1] "  dec" "Apr  " " aug " "apr  " "  Dec" " Aug "

I expected three unique groups as I have defined them in the gsub, i.e. just something like "dec Dec", "aug Aug", "apr Apr".
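
One possible reading of the output above, as a minimal sketch: each alternative in the pattern is its own capture group, and a group that did not take part in the match is replaced by an empty string. The replacement therefore keeps the matched text as it was and only pads it with the spaces from the replacement template, so nothing is merged. The vector `x_small` is a hypothetical three-element sample, not part of the original data.

```r
# Hypothetical small sample, one spelling per month
x_small <- c("dec", "Apr", "aug")

# Same pattern and replacement as above: only one of the three groups
# matches per element, the other two are substituted as "".
res <- gsub("(^a?A?pr)|(^a?A?ug)|(d?D?ec)", "\\1 \\2 \\3", x_small)
print(res)
# "  dec" "Apr  " " aug "
```

The position of the text between the spaces shows which group matched, but upper and lower case stay distinct, which is why `unique()` returns six values instead of three.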

With more than 9 groups it’s even worse.

y1 <- gsub(paste0("(^a?A?pr)|(^a?A?ug)|(d?D?ec)|(^f?F?eb)|(^j?J?an)|(^j?J?ul)|", 
                  "(^j?J?un)|(^m?M?ar)|(^m?M?ay)|(^n?|N?ov)|(^o?O?ct)|(^s?S?ep)"),
           "\\1 \\2 \\3 \\4 \\5 \\6 \\7 \\8 \\9 \\10 \\11 \\12", y)
> unique(y1)
 [1] "         0 1 2"             "      jun   0 1 2"         
 [3] "     jul    0 1 2"          " Aug        0 1 2"         
 [5] "     Jul    0 1 2"          "   feb      0 1 2"         
 [7] "      Jun   0 1 2"          "       Mar  0 1 2"         
 [9] "    jan     0 1 2"          "Apr         Apr0 Apr1 Apr2"
[11] "  dec       0 1 2"          "   Feb      0 1 2"         
[13] "  Dec       0 1 2"          "apr         apr0 apr1 apr2"
[15] " aug        0 1 2" 
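
The trailing `0 1 2` in this output is consistent with how `gsub` handles replacements: only the backreferences `\\1` to `\\9` are available, so `\\10` is read as `\\1` followed by a literal `0`. A small sketch with a hypothetical ten-group pattern (not taken from the data above) illustrates this:

```r
# Ten capture groups, one per letter; the replacement asks for "\\10".
# gsub() reads this as group 1 ("a") followed by the character "0",
# not as group 10 ("j").
res10 <- gsub("(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)", "\\10", "abcdefghij")
print(res10)
# "a0"
```

This also explains entries such as `Apr0 Apr1 Apr2`: for an April value, group 1 is non-empty and gets repeated in front of the literal digits.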

As the final result I aim for a factor with one level per type, whatever its spelling (in this example, one group per month name, not case-sensitive).
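
To make the target concrete, here is one way the desired result could be produced. It is a sketch only, not the approach from the solutions mentioned above: `patterns`, `group_of` and `x_small` are hypothetical names introduced for illustration, and the three patterns are assumptions that cover only the months in `x`.

```r
# One named regex per group; the name becomes the factor level.
patterns <- c(apr = "^apr", aug = "^aug", dec = "dec")

# Assign every element the name of the pattern it matches, ignoring case.
# Elements that match no pattern stay NA.
group_of <- function(v, patterns) {
  out <- rep(NA_character_, length(v))
  for (nm in names(patterns)) {
    out[grepl(patterns[[nm]], v, ignore.case = TRUE)] <- nm
  }
  factor(out, levels = names(patterns))
}

x_small <- c("dec", "Apr", "Dec", "aug", "apr", "Aug")
g <- group_of(x_small, patterns)
print(levels(g))
# "apr" "aug" "dec"
print(as.integer(g))
# 3 1 3 2 1 2
```

Because the matching uses `grepl()` with `ignore.case = TRUE` instead of backreferences, the number of groups is not limited to nine.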

Data

x <- c("dec", "Apr", "dec", "aug", "dec", "dec", "Apr", "apr", "apr", 
"dec", "Dec", "Aug", "Aug", "Apr", "Aug", "Apr", "aug", "Apr", 
"apr", "Apr", "dec", "aug", "aug", "aug", "aug", "apr", "dec", 
"Aug", "dec", "dec", "Dec", "Dec", "Apr", "Apr", "dec", "dec", 
"Dec", "dec", "apr", "Apr", "Apr", "dec", "apr", "apr", "apr", 
"apr", "Aug", "apr", "dec", "dec")

y <- c("Oct", "jun", "oct", "jul", "Aug", "jul", "Sep", "Jul", "feb", 
"feb", "Jun", "Mar", "jan", "Apr", "jul", "oct", "Jun", "jan", 
"Jun", "Oct", "Jul", "dec", "Jun", "Sep", "Feb", "Nov", "Feb", 
"dec", "Apr", "Dec", "jan", "Aug", "Feb", "apr", "Sep", "Nov", 
"aug", "oct", "Jun", "jul", "Apr", "Jun", "Apr", "Dec", "Jun", 
"Jul", "Aug", "Aug", "Jul", "sep")
1 Answer
Best Answer
Matthias answered 1 week ago