Generation of groups based on dataframe variables

I have a data frame that describes 6 measurements made in 9145 people:

str(Clinical)

'data.frame': 9435 obs. of 6 variables:
 $ cel_name: chr "ZAC002050.CEL" "ZAC001287.CEL" ...
 $ gs_p    : num 4 3 3 3 5 5 3 3 3 4 ...
 $ gs_s    : num 3 4 4 5 5 3 3 4 4 3 ...
 $ stadium : chr "pT3a" "pT2" "pT2" "pT2" "NA" ...
 $ avi     : num 0 0 1 0 2 0 0 2 NA 0 ...
 $ rpi     : num 1 0 0 0 0 1 0 0 1 1 ...

I also have a named numeric vector, Exprs, holding one value of a numeric variable measured in each of the same 9145 people:

ZAC001287.CEL ZAC005151.CEL
       0.2095        0.3153
ZAC002050.CEL ZAC007164.CEL
     -0.04300        0.5331

………truncated………

Please note that the values in Clinical$cel_name correspond to the names of the Exprs vector, though they are not in the same order.
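Because the two objects are ordered differently, the expression values have to be lined up with the rows of Clinical before any grouping. A minimal sketch of one way to do that with match(), using toy stand-ins for the real objects: the rpi values here are invented for illustration, and the new column name expr is hypothetical.

```r
# Toy stand-ins for the real objects; rpi values are made up for illustration.
Clinical <- data.frame(
  cel_name = c("ZAC002050.CEL", "ZAC001287.CEL", "ZAC005151.CEL"),
  rpi      = c(1, 0, 0),
  stringsAsFactors = FALSE
)
Exprs <- c(ZAC001287.CEL = 0.2095,
           ZAC005151.CEL = 0.3153,
           ZAC002050.CEL = -0.043)

# match() gives, for each row of Clinical, the position of its cel_name
# in names(Exprs); unmatched names would give NA.
idx <- match(Clinical$cel_name, names(Exprs))

# Store the aligned values as a new column ("expr" is a hypothetical name).
Clinical$expr <- unname(Exprs[idx])

print(Clinical)
```

Any cel_name with no counterpart in names(Exprs) ends up with NA in expr, so it is worth checking anyNA(idx) before going further.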

Now I need to first split the values of the Exprs vector into two groups, those with rpi = 0 and those with rpi = 1. Then, for each variable except cel_name, I need to form subgroups, calculate the mean of each, and compare them with t.test (Student's t-test). The subgroups would be:

1) rpi=0,gs_p=3 vs rpi=1,gs_p=3

2) rpi=0,gs_p=4 vs rpi=1,gs_p=4

3) rpi=0,gs_p=5 vs rpi=1,gs_p=5

4) rpi=0,gs_s=3 vs rpi=1,gs_s=3

5) rpi=0,gs_s=4 vs rpi=1,gs_s=4

6) rpi=0,gs_s=5 vs rpi=1,gs_s=5
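The comparisons listed above could be sketched roughly as follows. This is only an illustration on simulated data, not a tested solution for the real data set: the helper compare_by_rpi and the column expr are hypothetical names, and the simulated Clinical and Exprs merely mimic the structure described in the question.

```r
set.seed(1)
n <- 120

# Simulated stand-in for Clinical (only the columns needed here).
Clinical <- data.frame(
  cel_name = sprintf("S%03d.CEL", seq_len(n)),
  gs_p     = rep(c(3, 4, 5), length.out = n),
  gs_s     = rep(c(3, 4, 5), each = n / 3),
  rpi      = rep(c(0, 1), each = 3, length.out = n),
  stringsAsFactors = FALSE
)

# Simulated named vector whose names are in a different order than Clinical.
Exprs <- setNames(rnorm(n), sample(Clinical$cel_name))

# Align by name, so the row order of Clinical does not matter.
Clinical$expr <- unname(Exprs[Clinical$cel_name])

# For one grouping variable, compare rpi = 0 vs rpi = 1 within each level.
compare_by_rpi <- function(data, variable) {
  lvls <- sort(unique(na.omit(data[[variable]])))
  rows <- lapply(lvls, function(lv) {
    keep <- !is.na(data[[variable]]) & data[[variable]] == lv & !is.na(data$rpi)
    sub  <- data[keep, ]
    x0 <- sub$expr[sub$rpi == 0]
    x1 <- sub$expr[sub$rpi == 1]
    # t.test() runs Welch's test by default; pass var.equal = TRUE
    # for the classic Student version.
    tt <- t.test(x0, x1)
    data.frame(variable  = variable,
               level     = lv,
               n_rpi0    = length(x0),
               n_rpi1    = length(x1),
               mean_rpi0 = mean(x0, na.rm = TRUE),
               mean_rpi1 = mean(x1, na.rm = TRUE),
               p_value   = tt$p.value)
  })
  do.call(rbind, rows)
}

results <- do.call(rbind, lapply(c("gs_p", "gs_s"), compare_by_rpi, data = Clinical))
print(results)
```

Each row of results corresponds to one of the numbered comparisons. Note that t.test() stops with an error if either subgroup has fewer than two observations, so sparse levels in the real data would need to be skipped or handled separately.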