permsep                package:permax                R Documentation

_P_e_r_m_u_t_a_t_i_o_n _a_n_a_l_y_s_i_s _f_o_r _c_o_m_p_l_e_t_e _s_e_p_a_r_a_t_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     Given two groups of samples of high dimensional attribute vectors
     (eg DNA array expression levels), determines the number of
     attributes which completely separate the two groups (all values in
     one group strictly larger than in the other), and a permutation
     p-value for this quantity.

_U_s_a_g_e:

     permsep(data, ig1, nperm=0, ig2, WHseed=NULL)

_A_r_g_u_m_e_n_t_s:

    data: Data matrix or data frame.  Each case is a column, and each
          row is an attribute (the opposite of the standard
          configuration). 

     ig1: The columns of data corresponding to group 1 

   nperm: The number of random permutations to use in computing the
          p-values. The default is to use the entire permutation
          distribution, which is only feasible if the sample sizes are
          fairly small 

     ig2: The columns of data corresponding to group 2.  The default is
          to include all columns not in ig1 in group 2.  When both ig1
          and ig2 are given, columns not in either are excluded. 

  WHseed: Initial random number seed (a vector of 3 integers).  If
          missing, an initial seed is generated from the runif()
          function.  Not needed if all permutations are calculated.

          Prints a vector giving the # genes with complete separation
          (all in one group  larger than all in the other, the
          proportion of permutations with this many or more genes with
          complete separation (p-value) ('permutation' actually means a
          distinct rearrangement of columns into 2 groups), the 
          average number of genes per permutation with complete
          separation,  and the proportion of permutations with any
          genes with complete separation.  

          The value returned is a list with components `ics' = a vector
          indicating (with 1) which rows of data have complete
          separation, and `dtcs' = a vector containing the printed
          output.

          Also, if `nperm'>0, then the output includes attributes
          `seed.start' giving the initial random number seed, and
          `seed.end' giving the value of the seed at the end.  These
          can be accessed with the `attributes'  and `attr' functions.

_D_e_t_a_i_l_s:

     For each gene there will be 0, 1 or 2 rearrangements with complete
     separation, depending on the number of unique values and the sizes
     of the two groups.  Adding these numbers over genes and dividing
     by the number of rearrangements gives the average number per
     permutation.  The value returned averages only over the
     rearrangements actually used, though.

     It is strongly recommended that different seeds be used for
     different runs, and ideally the final seed from one run,
     attr(output,'seed.end'), would be used as the initial seed in the
     next run.

_S_e_e _A_l_s_o:

     permax

_E_x_a_m_p_l_e_s:

        ngenes <- 1000
        m1 <- rnorm(ngenes,4,1)
        m2 <- rnorm(ngenes,4,1)
         exp1 <- cbind(matrix(exp(rnorm(ngenes*5,m1,1)),nrow=ngenes),
                    matrix(exp(rnorm(ngenes*10,m2,1)),nrow=ngenes))
        exp1[exp1<20] <- 20
        sub <- exp1>20 & exp1<150
        exp1[sub] <- ifelse(runif(length(sub[sub]))<.5,20,exp1[sub])
        dimnames(exp1) <- list(paste('x',format(1:ngenes,justify='l'),sep=''),
                          paste('sample',format(1:ncol(exp1),justify='l'),sep=''))
        dimnames(exp1) <- list(paste('x',1:ngenes,sep=''),
                          paste('sample',1:ncol(exp1),sep=''))
        exp1 <- round(exp1)

      uuu <- permsep(exp1,1:5)

